## LATEX-L@LISTSERV.UNI-HEIDELBERG.DE


Subject: Re: Multilingual Encodings Summary 2.2
From: Hans Aberg
Reply To: Mailing list for the LaTeX3 project
Date: Sat, 19 May 2001 20:22:36 +0200

At 16:43 +0200 2001/05/19, Lars Hellström wrote:
>>The reason one is getting stuck with it is for backwards compatibility, and
>
>Indeed. \epsilon and \varepsilon could probably not be identified earlier
>than in LaTeX3.

I am not sure what you mean here: The two types of epsilon date back a
long time. I am not sure exactly how far, but perhaps back to the thirties
of the last century.

For a long time, mathematicians refused to use LaTeX because it was not
capable of producing the output required in math.

I think (but Frank or somebody will know this better) that one reason for
creating the LaTeX3 project was to ensure that mathematicians could use
LaTeX to produce the output they want.

>>further there is no guarantee that mathematicians will use the symbols the
>>way you dictate.
>
>You mean saying \in for the set membership relation rather than \epsilon?
>\epsilon is just plain wrong (and has always been so) since it generates an
>Ord math atom, not a Rel math atom as a relation command should.

The main point is that to some mathematicians, using one of the epsilon
variations has been right at least in the past.

As for TeX, if it is the typesetting of binary relations you have in mind,
that can be fixed, I recall.
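
A minimal sketch of such a fix, assuming plain LaTeX2e; the macro name
\epsrel is made up for illustration and is not a standard command:

```latex
\documentclass{article}
% \epsrel is a hypothetical name: the epsilon glyph wrapped as a Rel atom.
\newcommand{\epsrel}{\mathrel{\epsilon}}
\begin{document}
$x \epsilon A$   % Ord atom: set tight, no relation spacing
$x \epsrel A$    % Rel atom: spaced like a relation
$x \in A$        % the standard membership relation, for comparison
\end{document}
```

\mathrel makes its argument a Rel atom, so TeX surrounds it with the same
spacing it uses around \in.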

And if things cannot be fixed in TeX by some general mechanism, one can
always use kerning in the particular formulas in order to fix up the look.

>>Later, one would expect LaTeX, or whatever scientific typesetting system,
>>to be capable of supporting them all without restrictions, plus admitting
>>future additions.
>
>Yes, but not necessarily supporting them by default. There is an important
>difference between the default set-up making \epsilon and \varepsilon
>different, and providing a mechanism that makes it easy to (on a per
>document basis) add such a distinction. What is provided by the default
>set-up becomes the minimal core which _all_ set-ups must provide.

The problem is that you want to impose a default restriction that cannot be
motivated by any knowledge of actual usage: \epsilon and \varepsilon look
sufficiently different that they could be used side by side in the same
formula, and they may already have been.
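
As an illustration of the two set-ups under discussion (a sketch only,
assuming standard LaTeX2e, where the two commands select different glyphs
by default):

```latex
\documentclass{article}
% Uncommenting the next line identifies the two commands,
% for this document only.
% \let\epsilon\varepsilon
\begin{document}
% Both forms side by side in one formula, as the default set-up allows:
% \varepsilon as a small positive number, \epsilon as a membership sign.
$\varepsilon > 0, \quad x \mathrel{\epsilon} A$
\end{document}
```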

> The
>larger you make this core, the bigger the effort needed to support it will
>be, and the alternatives to the default will be correspondingly fewer. It's
>easy to request that all fonts provide everything that is in Unicode if you
>anyway would never help with providing anything.

In this case I think it is clear that every font used with Unicode that
supplies one of the epsilon types will supply the other, because I recall
they were fitted into the same group of 1024 math character symbols.

So there is no gain in trying to restrict what already is present in
Unicode and TeX.

>>I have seen examples of both types of epsilon being used to denote set
>>membership,
>
>No doubt due to "limitations in past typesetting".

Whatever; the main thing is that they now are present as different
characters and may have already been used as such, because it is perfectly
legal. And you cannot know for sure that they were never used side by side
in the same manuscript before the advent of TeX.

>>and I have seen examples of both types of epsilon being used as
>>a small number > 0. You could probably add a whole range of characters
>>moving from \varepsilon to \epsilon to \in for set membership.
>
>That's where I suspect you get it all wrong.

Please do not be so rude in your formulations as the Cambridge wannabe
geniuses. :-)

> You're talking about a whole
>range of _glyphs_, in appearance similar to anything between the
>\varepsilon and the \in of Computer Modern, but they're all the same
>semantic atom (i.e., character) and thus shouldn't have distinct internal
>representations in LaTeX.

All those variations derive originally, I surmise, from the same glyph in
the Greek language, but they have since diverged. It is the person who
writes the math paper in question who decides what the correct semantic
interpretation is, not you, and there is nothing you can do about that.

The \in is also originally an epsilon and nothing else.

> That at least part of that range of glyphs may
>also be used to represent another character (the greek letter small
>epsilon) which should have its own internal representation is another
>matter.

Right. It is very difficult to tell how those characters evolve and to
impose restrictions onto that evolution.

If, when all this has been done, somebody comes up with evidence of a new
variation that must be added in order to get the math papers right, then
that variation should be added as well.

>>Knuth, being wise, realized how disparate the use of the symbols is in
>>math, and introduced a macro symbols system so that anyone can define them
>>as they please:
>
>The point is that the macro system Knuth created has no internal
>representation for characters, neither in text nor math---instead it is
>based on the user specifying what glyph (or combination of glyphs) is
>desired. LaTeX, by contrast, has an internal representation for characters
>as of version 2e, but still uses the Knuthian glyph selection commands in
>math. What I argue is that by version 3 of LaTeX there should be an
>internal math character representation as well.

I think that over the past years, there have been several ideas for
providing a better math representation on different levels of abstraction,
but the difficulty is always how mathematicians use them according to their
own objectives. What is a must in some areas is totally unacceptable in
others.

For example, a few years ago there was a discussion about an engineering
standard for how tensors should be typeset, which would be totally
unacceptable in a paper on differential geometry.

Therefore, I do not think that there has been any viable proposal along
such lines.

The best one can hope for, I think, is to provide optional packages that
people may decide to use if they so want on top of the regular LaTeX model.

>>Further, if you want to make it impossible to use \varepsilon and \epsilon
>>side by side in the same document, you will have to make sure that in all
>>of the world literature in the past up till now it has never been used that
>>way, because that is how the requirements of Unicode were set up.
>
>I'm not saying that it should be completely impossible to use them side by
>side (even though I would question any attempts to do so), but they
>shouldn't be provided as distinct characters in the default set-up.

I think it would be unwise to impose any kind of restrictions on the math
characters in the default settings: If they appear as distinct entities,
one is free to use them as such.

And since mathematicians always seem to invent new notation, these
characters will probably be used in new, unexpected ways.

>>As for the math characters, I do not see there is any point in trying to
>>impose equivalences, because of the way they may be used in math, and it
>>is just unnecessary additional work in implementation.
>
>It is very little additional work in the implementation of LaTeX (adding an
>OCP which normalizes the input somewhat further than what Unicode prescribes
>will do), but it saves much (largely unnecessary) work in the
>implementation of fonts for LaTeX, and thereby it facilitates the creation
>of new fonts.

You will have to check with the font experts how they think that the future
fonts will be developed.

But I think that one possibility is that font developers merely take a
Unicode chunk and develop the characters in it. That would mean that the
two epsilon variations will always be developed together, because they
both appear in the 0x1D700 - 0x1D7FF group.

Then, if LaTeX is based on a TeX that is based on 32-bit padded characters
with Unicode in the bottom, it will have to follow that.

(The Omega draft did not explicitly say whether it uses 16-bit or 32-bit
Unicode characters, but I figure that perhaps it is only using 16-bit
Unicode characters. Then the two epsilon variations fall outside this
range. If that is what is causing the complication, I figure it would be
best to first make an Omega that is based on 32-bit characters or whatever.)
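
A quick check of the range argument, using only the group boundary quoted
above: the lowest code point of that group already exceeds the largest
value a 16-bit character can hold.

```latex
\[
  \mathtt{0x1D700} = 1\cdot 16^{4} + 13\cdot 16^{3} + 7\cdot 16^{2}
                   = 65536 + 53248 + 1792 = 120576
                   > 65535 = 2^{16} - 1 .
\]
```

Since 120576 is far below 2^32, a 32-bit character representation would
accommodate the whole group.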

Hans Aberg