At 14:28 +0100 2001/02/13, Marcel Oliver wrote:
>2.1. Problems with Current TeX:
>It has been remarked that TeX does not really have an "internal
>representation". Rather, TeX keeps text as a string of ASCII
>characters that are re-parsed through the one-and-only TeX parser
>whenever something is to be done with it. (TeX gurus: is this
>simplistic statement essentially correct???)
I am not a TeX guru, but I get the impression that TeX's input looks like this:
<string of TeX tokens> <not yet gulped up ASCII (or 8-bit)>
The token buffer is normally empty, but sometimes a macro
may insert a string of tokens (perhaps a macro expansion can be viewed as
though the body is first inserted into this buffer before being evaluated).
The <not yet gulped up ASCII (or 8-bit)> buffer is read and converted into
tokens as needed. TeX does not backtrack.
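
To make that picture concrete, here is a toy sketch in Python. It is only a
model of the impression described above, not TeX's actual implementation:
the names ToyInput, pending, next_token and drain are made up for the
illustration, macros take no arguments, and category codes and the skipping
of spaces after control words are ignored.

```python
from collections import deque


class ToyInput:
    """Toy model: a buffer of pending tokens in front of unread characters."""

    def __init__(self, text, macros=None):
        self.pending = deque()      # <string of TeX tokens>, normally empty
        self.text = text            # <not yet gulped up ASCII (or 8-bit)>
        self.pos = 0                # read position; it only ever moves forward
        self.macros = macros or {}  # hypothetical: name -> list of body tokens

    def _scan(self):
        """Convert just enough unread characters into one token."""
        if self.pos >= len(self.text):
            return None
        ch = self.text[self.pos]
        self.pos += 1
        if ch != "\\":
            return ch               # an ordinary character token
        start = self.pos
        while self.pos < len(self.text) and self.text[self.pos].isalpha():
            self.pos += 1           # control word: backslash plus letters
        if self.pos == start and self.pos < len(self.text):
            self.pos += 1           # control symbol: backslash plus one non-letter
        return "\\" + self.text[start:self.pos]

    def _raw(self):
        """Pending tokens are consumed before any new characters are read."""
        return self.pending.popleft() if self.pending else self._scan()

    def next_token(self):
        tok = self._raw()
        while tok is not None and tok in self.macros:
            # Expansion: push the macro body onto the front of the buffer.
            self.pending.extendleft(reversed(self.macros[tok]))
            tok = self._raw()
        return tok


def drain(inp):
    """Collect all remaining tokens, fully expanded."""
    out = []
    while (tok := inp.next_token()) is not None:
        out.append(tok)
    return out
```

For example, drain(ToyInput(r"a\foo b", {r"\foo": ["x", "y"]})) yields the
tokens a, x, y, a space, and b: the body of \foo passes through the token
buffer, and the characters after it are not touched until they are needed.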
>- Hyphenation patterns are specified in terms of the output encoding.
> This means that every character appearing in the hyphenation rules
> must have a physical slot in the selected font. However, logically
> hyphenation should not depend on output encoding, and one should be
> able to mix fonts with different output encodings without losing
> correct hyphenation.
I get the impression that this results from TeX's inability to
create suitable objects: if TeX were able to first create objects of type
"word", to which other operations such as hyphenation are then applied,
problems of this kind would go away.
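
A rough sketch of what I mean, in Python. Everything in it is an assumption
made for illustration: the Word class, the EXCEPTIONS table (a stand-in for
real hyphenation patterns) and the two encodings ENC_A and ENC_B are all
invented, and this is not how TeX does it. The point is only that break
points are decided on the logical characters, and the font slots are looked
up afterwards.

```python
import string
from dataclasses import dataclass

# Hypothetical hyphenation table, keyed by logical characters (not font slots).
EXCEPTIONS = {"hyphenation": "hy-phen-ation"}

# Two made-up output encodings: logical character -> slot number in a font.
ENC_A = {ch: ord(ch) for ch in string.ascii_lowercase}
ENC_B = {ch: i for i, ch in enumerate(string.ascii_lowercase, start=1)}


@dataclass
class Word:
    chars: str  # logical characters, independent of any font

    def break_points(self):
        """Indices into chars before which a hyphen may be inserted."""
        pattern = EXCEPTIONS.get(self.chars, self.chars)
        points, i = [], 0
        for ch in pattern:
            if ch == "-":
                points.append(i)  # a break is allowed before character i
            else:
                i += 1
        return points


def typeset(word, encoding):
    """Look up font slots only after hyphenation has been decided."""
    slots = [encoding[ch] for ch in word.chars]
    return slots, word.break_points()
```

With this arrangement typeset(Word("hyphenation"), ENC_A) and
typeset(Word("hyphenation"), ENC_B) return different slot numbers but the
same break points, so mixing fonts with different encodings would not
disturb hyphenation.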
Let's hear from some TeX gurus how TeX really works.
>- Unicode is currently receiving a lot of attention and publicity. So
> it may be advantageous to ride that wave, in particular as it seems
> technically sound.
The new MacOS X (which is based on Mach 3 and 4.4BSD), currently in beta
and due for regular release at the end of this coming March,
evidently supports Unicode fully. The main point is that if personal
computers now finally support Unicode, Unicode will soon become ubiquitous.