Re: Multilingual Encodings Summary

Date: Wed, 14 Feb 2001 09:41:51 +0100

Hello,

On Tue, 13 Feb 2001, Frank Mittelbach wrote:

> > - On all major platforms, support for editing and displaying UTF8
> >   exists and either is currently moving into mass deployment. Major
> >   programming languages have UTF8 libraries, so the basic
> >   infrastructure for UTF8 is or will be in place shortly.
>
> remains to be seen. in the long term most likely yes, but how many of the
> people on this list can easily (in their favorite editing system) edit or
> generate a utf8 encoded file? hands up?

The standard encoding of BeOS is UTF8. I don't know whether the number of
TeX installations under BeOS exceeds, say, 100, though.

I don't think that Omega or NTS will replace TeX anytime soon, so here are
some rough ideas on how to implement Unicode support in TeX:

(a) Internally, Unicode characters can be encoded as command sequences of
    the form \<hex code>, e.g., "A" would become "\0041".

(b) Each font would define these sequences appropriately, e.g.,
    "\def\0041{A}". Characters not included in the font would raise an
    error message.

(c) To convert the input file to the internal representation, one could
    write a preprocessor in TeX which is invoked by the \documentclass
    command. That's IMHO the easiest way, and I don't think the runtime
    penalty would be that great. The preprocessor should leave command
    sequences and braces alone, e.g., "\begin{bar}" would become
    "\begin{\0062\0061\0072}".

The only problem I see with this approach is \catcode changes.

Any thoughts?

Achim
-- 
Achim Blumensath
Mathematische Grundlagen der Informatik
www-mgi.informatik.rwth-aachen.de/~blume
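
The conversion in step (c) can be sketched roughly as follows. This is only an illustration in Python, not the TeX-level preprocessor the message proposes. The names PASS_THROUGH, encode_char and preprocess are hypothetical. Leaving whitespace untouched and using upper-case hex digits are assumptions that the message does not settle. The sketch also assumes default catcodes, so it runs straight into the \catcode problem mentioned above.

```python
import re

# Tokens copied through unchanged: control words (\begin), control symbols
# (\%), braces, and runs of whitespace. Passing whitespace through is an
# assumption of this sketch; the message only mentions command sequences
# and braces. Default catcodes are assumed throughout.
PASS_THROUGH = re.compile(r"\\[A-Za-z]+|\\.|[{}]|\s+", re.DOTALL)


def encode_char(ch: str) -> str:
    """Map one character to the proposed \\XXXX form (at least 4 hex digits)."""
    return "\\%04X" % ord(ch)


def preprocess(source: str) -> str:
    """Rewrite ordinary characters as \\XXXX, leaving TeX syntax alone."""
    out = []
    pos = 0
    for match in PASS_THROUGH.finditer(source):
        # Encode the ordinary characters before this pass-through token.
        out.extend(encode_char(c) for c in source[pos:match.start()])
        out.append(match.group(0))
        pos = match.end()
    # Encode whatever follows the last pass-through token.
    out.extend(encode_char(c) for c in source[pos:])
    return "".join(out)


print(preprocess(r"\begin{bar}"))  # prints \begin{\0062\0061\0072}
```

Note that a sequence such as \0041 is not a single control word under TeX's default catcodes (digits are not letters), so a real implementation would need \csname or catcode changes to define and call it.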