Laurent Siebenmann
Sat, 10 Feb 2001
Marcel, Frank

 M> - Hyphenation tables should really be Unicode (so possibly UTF8
 >   encoded).  They are logically neither input nor output encoding
 >   related, and should work regardless whether either refers to a
 >   castrated font set.

I would add that they are in a similar sense not even TeX related.

 F> In other words you have a hyphenation file, say for German, which
 > can be used with T1 encoding but also with OT1 encoding by simply
 > removing all patterns which contain references to umlauts or sharp
 > s.

It seems to be common practice (for French, German...) even to use
the *same hyphenation trie* for both OT1 and T1.  This is something
of a lucky accident, but I think it works well (no damage at all to
T1 hyphenation, no big damage to OT1).
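
For comparison, the filtering Frank describes could be sketched
roughly as follows, in Python. This is only an illustration: the set
of excluded characters and the sample patterns are my own
assumptions, not taken from any real pattern file.

```python
# Characters assumed to be missing from OT1 (umlauts and sharp s).
# This set is illustrative only, not a complete OT1/T1 comparison.
NON_OT1 = set("äöüß")

def ot1_subset(patterns):
    """Keep only the patterns that contain none of the excluded characters."""
    return [p for p in patterns if not (set(p) & NON_OT1)]

# Hypothetical fragments in Liang's pattern notation (made up for the example).
sample = ["1ba", "ä1s", "s2ch", "1ße", "n1de"]
print(ot1_subset(sample))
```

Sharing one trie, as described above, skips this step entirely: the
patterns with umlauts simply never match in OT1 text.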

 > writing in a cyrillic languages where i want to use both in text as
 > in math cyrillic letters

I suspect this need is rare, since Russian math is 99% Latin
and Greek and international math symbols. So typing $\Sh$
for a rare Shafarevich symbol is pretty adequate notation.
Russians always have the full US ASCII keyboard.

More problematic is the situation for ordinary text in
Serbian and Macedonian.  There are official (heavily
accented) Latin and (mostly unaccented) Cyrillic versions
even of the same text; and I expect both typographies often
appear in the same document. There is not much problem
typing such a document in Microsoft Word or any similar
word processor, and extracting the typescript as 8-bit text,
for TeXing. But moving the 8-bit text to Linux or Mac with
its two meshed and system-dependent 8-bit encodings must be

ASCII is the royal road to portability.
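
To make the point concrete, here is a small Python sketch that
decodes the same bytes under three 8-bit Cyrillic code pages. The
choice of code pages (one each for Windows, Unix and classic Mac)
and the helper name are my assumptions for the illustration.

```python
# Code pages assumed for illustration: Windows, Unix (KOI8-R), classic Mac.
CODECS = ("cp1251", "koi8_r", "mac_cyrillic")

def decode_everywhere(data: bytes) -> dict:
    """Decode the same bytes under each of the assumed code pages."""
    return {name: data.decode(name) for name in CODECS}

eight_bit = bytes([0xD8])   # a single byte above the ASCII range
seven_bit = b"\\Sh"         # a pure ASCII TeX control sequence

print(decode_everywhere(eight_bit))   # a different letter under each code page
print(decode_everywhere(seven_bit))   # the same text under every code page
```

The 8-bit byte changes meaning from system to system; the ASCII
control sequence does not.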


                          Laurent S