## LATEX-L@LISTSERV.UNI-HEIDELBERG.DE


Re: Multilingual TeX --- and a successor to TeX

Vladimir Volovich, Sat, 14 Jun 1997 22:32:05 +0400

```
Hello,

Frank Mittelbach wrote:

> > This is my first post to this mailing list, so I'm sorry.
>
> why being sorry for that?

Maybe I'll say something that is off-topic, or so... :-)

> you pick up on a discussion that happened
> not too long ago on this list and with a new topic

Oh, that's very good.

[...]

> therefore, to support fonts with different encodings within the same
> paragraph (you don't even need different languages since in tex
> hyphenation is tied to the font encoding so you need different
> patterns for different fonts even for the same language) we have to
> enforce a single lccode table.
>
> that's an unfortunate fact of life.

But consider, e.g., the Russian language. It is unlike most European
languages in that all Russian letters have codes above 128 (except
in the LWN encoding, which is of course not a `native' encoding).
Because of this, it is possible to `combine' a European language
(most often English) with Russian into *one* `combined' language:
a single \language0 with hyphenation patterns loaded for both English
and Russian. This works because the codes of the English letters do not
overlap with the codes of the Russian ones. With this setup there is no
need to switch languages, and hyphenation is correct for both English
and Russian text. Moreover, if a third language is needed in the same
document, it can easily be added using the Babel system.

Next, when somebody uses Russian in a TeX document, very often the
other language(s) in the same document have all their letter codes
below 128. So why not allow Babel to change lccodes for Russian in
this case?
Changing the lccodes and uccodes for Russian is very useful, because
the most common Russian encodings `conflict' with the `built-in'
lccode and uccode settings in LaTeX. :-(

> now that doesn't help very much, i agree. so the question we have to
> ask is what can be done about it
>
> a) just support a single lccode table
>
> b) don't care about rubbish hyphenation when languages are mixed
> and allow for several hyphenation tables
>
> c) allow to change lccode tables only between paragraphs and disable
> hyphenation within language fragments in a paragraph that do not have
> the right (meaning: current) lccode table

Oh, this will be *very* useful...

> d) hope for a successor to TeX to fix it

And this is the ideal...

> right now we basically have situation a) which means LaTeX does not
> support changing the lccode table. this does not mean that packages
> can't do it but any such code is likely to break at some time in the
> future and we, from the LaTeX3 project don't feel able to support
> problems with a LaTeX system that does use a modified table. in
> addition documents written for such a system will produce strange
> results on others, ie we can't have portability there and definitely
> not real multi-lingual.

IMHO, if we have, e.g., a Russian-English document, the results will be
the same on all TeX systems (because English does not use characters
with codes above 128). Strange results arise in this case only if one
uses some third language that also makes significant use of the upper
half of the 8-bit code table...

> b) means putting lccode changes into the framework of Babel as it is
> now. that is sometimes done right now and Babel supports setting and
> resetting things when entering or leaving a language environment so if
> you are happy with those poor results when mixing language that sort
> of works

BTW, I'm now trying to improve support for the Russian LCY encoding in
babel.
The variant that uses \babel@savevariable inside \addto\extrasrussian to
save the lccode, uccode, sfcode and mathcode values of all 66 Russian
letters (33 upper case + 33 lower case) works, but it is very slow:
I get significant delays on my P100 with this variant. This is strange,
because if I simply set these lccode, uccode, sfcode and mathcode
values right after \usepackage{babel} (which is wrong, because it
sets them globally and without proper restoration), it works very
quickly.

> c) is the situation where i think we can get to as long as we use TeX
> as a basis and it is the scheme i intend to adopt for the new language
> interface for LaTeX for which the conceptual work is mostly done and
> trial implementation is done in parts. this would give a clean
> interface and only minor inconvenience, ie, if languages are mixed
> within a paragraph then LaTeX will not hyphenate part of it
> (warning you about it) and you have to put in explicit hyphens there.
> but it would not produce rubbish in this case, eg something you do
> only notice after publication.

OK, I'll wait for this nice feature. Will it appear in LaTeX2e soon?

> > Here I see another problem. Consider some multilingual
> > phrase and let's assume that we need to do
> > \uppercase{} or \lowercase{} of this phrase. If the languages contained
> > in this phrase have conflicting \lccode and \uccode
> > values, then the result of \uppercase{} will be incorrect.
> > However, some time ago I solved this problem by redefining
> > TeX's builtins \uppercase and \lowercase so that
> > these new macros split the multilingual phrase into
> > one-language pieces and do \uppercase of these pieces,
> > and then merge them again.
>
> again a deficiency of TeX that really isn't solvable generally, yes
> you can do some hacks by trying to write your own interpreter within
> TeX but it is so easy to make it fall over, just think about the user
> putting something in a macro to save typing, so how do you find those
> hidden language changes?

I will simply \edef the argument of \uppercase before processing it,
so that all macros are `expanded'. I'm sending these macros to this
list in a separate letter.

> reliably this can right now only be solved by using uppercase fonts in
> such places.

Or by using Omega... Unfortunately, I haven't yet tried Omega. I have
only read something about e-TeX, and it is nice that the work on
improving TeX is going on. But I'd like it to go more quickly and in a
more coordinated way... :-)

With best regards,
                   Vladimir.
```
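
The two ideas discussed in the message, changing the case tables locally and expanding the argument of \uppercase before applying it, can be sketched in plain TeX roughly as follows. This is only an illustration under stated assumptions, not the macros announced in the message: the names `\ruc`, `\rlc`, `\setrussiancodes` and `\expandedupper` are hypothetical, and the code layout (upper case at 128..159 with lower case exactly 32 positions higher) is a simplification that real Russian encodings may not follow.

```tex
% Sketch only. ASSUMPTION: upper-case letters occupy codes 128..159 and
% each lower-case partner sits 32 positions higher (160..191).
% Real encodings may be laid out differently; adjust the ranges.
\newcount\ruc  % hypothetical scratch counter: code of the upper-case letter
\newcount\rlc  % hypothetical scratch counter: code of the lower-case letter

% Assign \lccode and \uccode for the whole assumed block.
% The assignments are local, so a surrounding group undoes them.
\def\setrussiancodes{%
  \ruc=128
  \loop
    \rlc=\ruc \advance\rlc by 32
    \lccode\ruc=\rlc  \uccode\ruc=\ruc   % entries for the upper-case letter
    \lccode\rlc=\rlc  \uccode\rlc=\ruc   % entries for the lower-case letter
  \ifnum\ruc<159
    \advance\ruc by 1
  \repeat}

% Fully expand the argument first, so that text hidden inside macros
% is visible to \uppercase, then apply \uppercase to the result.
\def\expandedupper#1{%
  \edef\uctemp{#1}%
  \expandafter\uppercase\expandafter{\uctemp}}

% Possible use (\sometext is a hypothetical macro holding the phrase):
%   \begingroup
%     \setrussiancodes
%     \expandedupper{\sometext}
%   \endgroup   % earlier \lccode/\uccode values are restored here
```

Because the assignments are local, leaving the group restores the previous tables, which is the restoration that setting the codes right after \usepackage{babel} lacks. Two caveats apply: the bare \edef will fail on arguments containing commands that do not survive full expansion, and, as the quoted discussion points out, a local change does not give each fragment of a mixed paragraph its own hyphenation behaviour, since one lccode table still has to serve the whole paragraph.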