On Mon, 16 Jun 1997, Vladimir Volovich wrote:
> > My goal is to extend this---there are
> > so many language specific solutions which can't be made more general (e.g.
> > Japanese TeX). I don't like this, and I will always try to find a
> > multilingual solution. For example, you can embed Babel seamlessly into
> > the CJK package (provided you have only 7bit input encoding for non-CJK
> > languages: this will be done by the Mule-CJK interface).
> I have not seen MULE yet (I have only xemacs-19.14).
AFAIK the latest version of xemacs has Mule built in. You can also get the
(original) GNU emacs 20 beta versions from
etlport.etl.go.jp/pub/mule/.notready or a mirror site (e.g.
ftp.lrz-muenchen.de/pub/culture/east-asia), which also include Mule.
> > > Moreover, do the _one_ default lccode+uccode settings conform to _all_
> > > languages (which are already TeXized)?
> > Of course not. But it is better to try to follow this standard if
> > possible. And it *is* possible for Russian to use these settings.
> Well, I agree. But
> * Until the new T2-based LH fonts are widely available,
> LCY will be the main encoding.
Sigh. Using the old encodings will always be provisional. Of course, you
have a vital interest that Russian works, but I want to see a general
solution as soon as possible to avoid (presumably already existing)
> * Again, when we use T2-based fonts, Russian letters will have to
> be active chars (to translate from the input encoding
> into TeX's internal encoding---T2), and this has some disadvantages.
You can't have everything :-)
> * Because of the `strange' (non-monomorphic ;-)) lccode and uccode values for
> characters "19, "1a, "9d and "9e in the T1 encoding (this fact is mentioned on
> the T2 home page), these positions will probably not be used in the T2 encoding :-(
> Maybe it would be possible to change the lccode and uccode values for these
> characters in T1?
T1 itself does not define any lccode or uccode values. The LaTeX team sets
them in the LaTeX core. We should ask them what they think about that.
-> mixing Turkish and English within T1 in the same paragraph seems
impossible to me (due to the dotted and undotted I and i). One language
will always lose the battle on hyphenation.
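
To make the conflict concrete, here is a small plain TeX sketch of the
case-code pairs involved. The slot numbers ("19 for dotless i, "9D for
capital I with dot) are those of the T1 layout; the Turkish-style values
are only an assumption for illustration and are not taken from any
existing package.

```tex
% Plain TeX sketch; run with `tex' or `pdftex'.
% T1 slots: "19 = dotless i, "9D = capital I with dot above.

% English-style pairing (the plain TeX default for ASCII letters):
%   I <-> i
\message{English: lccode of I = \the\lccode`\I}

% Illustrative Turkish-style pairing (an assumption for this sketch):
%   I <-> dotless i ("19),  dotted I ("9D) <-> i
\begingroup
  \lccode`\I="19  \uccode"19=`\I
  \uccode`\i="9D  \lccode"9D=`\i
  \message{Turkish: lccode of I = \the\lccode`\I}
\endgroup
\bye
```

As far as I understand, TeX hyphenates with the lccode values that are in
force at the end of the paragraph, so only one of the two pairings can
apply to a given paragraph.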
> > > 2) use TeX Code Pages (TCP). This has advantage because it is universal,
> > > and lets us preserve catcodes of translated characters.
> > > But TCP are not supported by all TeX implementations.
> > only emTeX, AFAIK. Non-portable.
> And do e-TeX or Omega plan to implement this?
> Why have other TeX implementations not implemented it? It does not seem
> to be too hard. :-(
Do it! Contribute a change file to web2c! Omega does not need this---it
uses a completely different approach: it uses so-called Omega Translation
Processes (OTPs), which are finite automata that translate input encodings,
apply correct case conversion, etc.
> > > * letters are now `not letters' -- so it becomes impossible
> > > to define and use macros with names consisting of letters of this language.
> > this is bad anyway, since TeX does not have a mechanism for separating
> > control characters from normal TeX (Omega can do this...)
> Could you explain what you mean?
> I mean that when Russian letters have the same input encoding and internal TeX encoding
> (the case of LCY), then these characters will have catcode 11 (letter).
> So it will be possible to use macros with Russian names.
> This is impossible if Russian characters are active.
Again, think of multilingualism! Your approach will not work if one input
encoding is not sufficient. Using only ASCII seems to be the best
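
The two approaches under discussion can be sketched in plain TeX as
follows. The slot numbers are assumptions chosen for illustration: "C1 is
Cyrillic small a in KOI8-R, and "E0 stands in for whatever position the
internal font encoding assigns to that letter.

```tex
% Plain TeX sketch; the internal slot "E0 is a placeholder.

% (a) Input encoding = internal encoding: the byte is a letter
%     (catcode 11), so it may appear in control sequence names.
\catcode"E0=11

% (b) Input encoding differs from internal encoding: the input
%     byte is made active and expands to the internal slot.
\catcode"C1=\active
\def^^c1{\char"E0 }
\bye
```

With (b), the byte "C1 can no longer be part of a control sequence name,
which is the drawback you mention; with (a), only a single 8-bit input
encoding can be served.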
> > Have you ever seen a log file of my CJK package for Chinese? :-)
> No. The only thing from the CJK package I have tried to play with was the ttf2pk package,
> which failed to work on my computer. ;-(
Failed? I assume that you've tried to use non-CJK fonts... This will work
soon after I've changed the font engine to FreeType. Expect something in
the next few months (I hope that I can show something at TUG 97).
> I can explain why log files become unreadable.
> To be readable, the encoding of Russian characters in log files
> must be the same as the input encoding (8 bit!).
> But when TeX reads a Russian character (with code >= 0x80) in some
> external encoding (say, koi-8) and translates this character to
> the corresponding place in the T2 table (therefore the character should be active),
> then all Russian letters which go to log files (and to the screen)
> in case of TeX errors will be in TeX's internal encoding (T2), but *not*
> in the external encoding (koi-8). Am I right?
> So, the log (and the screen) will be unreadable.
You are absolutely right. Log messages will always be cryptic :-)
> > > BTW, do you have the new alpha version of LH fonts (with T2 encoding)?
> > Oh! This is good news! Where can I get them?
> They are not yet widely available. I hope that I'll get them soon,
> and I'll inform you about this.
Thanks in advance.