At 15:18 +0200 2001/05/13, Lars Hellström wrote:
>>> This is why current LaTeX converts everything to
>>>LICR before it is written to the .aux file: the elements of the input
>>>encoding (as Frank called them above) do not have a single well-defined
>>>meaning. What has been discussed is that one might use some form of
>>>Unicode (most likely UTF-8) in these files instead.
>>Forget everything about variable sized characters as far as the extension
>>of TeX goes, and hook onto translators outside that recognize other
>>formats. Variable sized characters just complicate programming.
>Well, the \InputTranslation and \OutputTranslation primitives of Omega
>already provide that functionality, so there is no need to deal with
>variable-sized characters in the TeX programming. The problem is that one
>might want to employ additional sets of translations (which would then act
>on streams of equally-sized characters) between those extremes of the
>program, but Omega doesn't provide for this.
I am not sure what you mean here: UTF-8 is variable sized.
I suggested that every file not using a 32-bit character type be
accompanied by an additional ASCII file, identified by some kind of file
name ending, that holds information about the encoding. (For example, if the
file "<name>" is not 32-bit, there is also an ASCII file named "<name>.encoding".)
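To make the idea concrete, here is a rough sketch of such an external reader, written in Python purely for illustration. It is not anything Omega or TeX provides. The function name read_code_points, the default_encoding parameter, and the assumption that the ".encoding" file contains nothing but a codec name are all my own inventions for the example.

```python
import os

def read_code_points(path, default_encoding="utf-32"):
    """Return the contents of `path` as a list of integer code points.

    Hypothetical convention: if an ASCII file named `path + ".encoding"`
    exists, it holds the name of the encoding used by `path`.
    Otherwise the file is assumed to already be 32-bit (UTF-32).
    """
    sidecar = path + ".encoding"
    if os.path.exists(sidecar):
        # The sidecar is plain ASCII and contains only the codec name.
        with open(sidecar, "r", encoding="ascii") as f:
            encoding = f.read().strip()
    else:
        encoding = default_encoding

    with open(path, "rb") as f:
        raw = f.read()

    # After decoding, every character is one fixed-size code point,
    # regardless of how many bytes it occupied in the file.
    return [ord(ch) for ch in raw.decode(encoding)]
```

The consumer of read_code_points only ever sees a sequence of equally-sized values; whether the file on disk was UTF-8, Latin-1 or something else is decided entirely by the sidecar file.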
This way, one can provide as many IO code converters as one bothers to
write, without the extended TeX ever knowing anything about it. (If Omega
uses C++ for IO, one can use something called a codecvt. Or use pipes,