LATEX-L Archives

Mailing list for the LaTeX3 project


Frank Mittelbach <[log in to unmask]>
Thu, 15 Feb 2001 08:44:11 +0100

 > >  > - Hyphenation patterns are specified in terms of the output encoding.
 > >  >   This means that every character appearing in the hyphenation rules
 > >  >   must have a physical slot in the selected font.
 > >
 > > only in the internal storage format for patterns used within TeX. On the
 > > abstract level this is not at all true even though the source format of
 > > existing patterns tends to be written in this form as well.
 > Frank also made an earlier comment in this regard, which, while true,
 > is of marginal relevance to the discussion of multi-lingual/-encoded
 > documents.

Marcel's summary was trying to put forward technical points so that they can
be weighed against each other. I was simply trying to get them technically
right where I considered them wrong. But I disagree with you when you say that
this has nothing to do with the discussion.

 > Yes, it is nice if hyphenation pattern input files use symbolic
 > representations of characters (\ss) rather than hex code values
 > so they can be used for different font encodings.  But the font
 > encoding must be selected when the format is generated!  This
 > doesn't help the user who wants to use various font encodings,
 > and it certainly does not facilitate multi-encoded documents.

You are right that the hyphenation patterns have to be selected at format
generation time, so you cannot extend that set for a single document.
However, with present-day TeX implementations there is typically enough room
to store multiple pattern sets, which means that for typical usage at a site
you can combine all the patterns needed (for several languages and several
font encodings).

Don't forget that in many cases the largest pattern set can in fact serve
several font encodings, provided those encodings use the same slot positions
for the character set of the corresponding language.

 > For hyphenation purposes, multiple encodings must be treated as
 > multiple languages.

Technically you are right, but with your statement you are in fact pointing at
the basic error Don made with TeX 3.x: calling something \newlanguage and
\language when it should have been called something very different
(e.g. pointer-to-hyphenation-patterns-related-to-some-output-encoding).
A lot of the problems result from using the (TeXnical) term.

So no: it is not that multiple encodings have to be treated as multiple
languages. Rather, within one language you need to record, for each font
encoding used, which of the
pointer-to-hyphenation-patterns-related-to-some-output-encoding's you have to
apply when typesetting in that encoding.
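
As a rough sketch of that bookkeeping (Python, purely illustrative: the table
contents, the index numbers, the NO_HYPHENATION value and the function name
are all assumptions, not anything TeX or babel provides), the idea is a lookup
keyed on language and font encoding rather than on language alone:

```python
# Hypothetical sketch: every name and number here is illustrative only.
NO_HYPHENATION = 255  # assumed index of an empty pattern set

# (language, font encoding) -> index of a pattern set loaded into the format
PATTERN_POINTER = {
    ("german", "OT1"): 1,
    ("german", "T1"): 1,    # two encodings may share one pattern set
    ("russian", "T2A"): 2,
}


def pattern_pointer(language, encoding, strict=False):
    """Return the pattern-set index to use for a language in an encoding."""
    try:
        return PATTERN_POINTER[(language, encoding)]
    except KeyError:
        if strict:
            # One possible policy: ask the user for a different encoding.
            raise LookupError(
                f"no hyphenation patterns for {language} in {encoding}"
            )
        # The other possible policy: typeset without hyphenation.
        return NO_HYPHENATION
```

With such a table the choice of pattern set follows the pair, so switching the
font encoding inside one language simply selects a different (or the same)
index.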

And given that you (these days) can store a suitable number of such
pointer-to-hyphenation-patterns-related-to-some-output-encoding's, you can
typeset multiscript documents for a number of combinations of scripts with a
single format. Clearly you have a limit, so if you want to be able to typeset
in too many combinations you need a number of formats, but in practice that's
not a real issue.

However, to be able to automatically generate those internal
pointer-to-hyphenation-patterns-related-to-some-output-encoding's you need the
hyphenation patterns externally stored in something which is independent of
the output encoding.
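
A minimal sketch of what such a generation step could look like (Python, for
illustration only: the encodings ENC-A, ENC-B and ENC-C, their slot numbers,
the pattern representation and the function name are invented here and do not
describe any real font encoding or pattern file format):

```python
# Hypothetical sketch: encodings and slot numbers are made up.
SLOTS = {
    "ENC-A": {"a": 0x61, "s": 0x73, "ss": 0x19},
    "ENC-B": {"a": 0x61, "s": 0x73, "ss": 0xFF},
    "ENC-C": {"a": 0x61, "s": 0x73},  # has no slot for "ss"
}


def resolve_set(patterns, encoding):
    """Resolve encoding-independent patterns for one font encoding.

    Each pattern is a pair (names, weights): symbolic character names plus
    the inter-letter weights. Returns a list of (slots, weights) pairs, or
    None if some character has no slot in the chosen encoding.
    """
    table = SLOTS[encoding]
    resolved = []
    for names, weights in patterns:
        if any(name not in table for name in names):
            return None  # this encoding cannot express the pattern set
        resolved.append((tuple(table[name] for name in names), weights))
    return resolved


# Two tiny symbolic pattern sets (weights are arbitrary).
plain = [(("a", "s"), (0, 1, 0))]
with_ss = [(("s", "ss"), (0, 1, 0))]

# Same slots for every character used: one internal set can serve both.
shared = resolve_set(plain, "ENC-A") == resolve_set(plain, "ENC-B")
# Different slot for "ss": each encoding needs its own internal set.
separate = resolve_set(with_ss, "ENC-A") != resolve_set(with_ss, "ENC-B")
```

Comparing the resolved results is also a cheap way to find out when two
encodings can share a single internal pattern set for a given language.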

 > This again points to the need for babel to
 > specify the desired/required font (encoding) when it selects
 > a language.

Yes. If we take the approach outlined with my xnfss code, which enables the
specification and use of multiple encodings per language, then such a list
should be attached to each language. You would then associate with each such
encoding per language a suitable
pointer-to-hyphenation-patterns-related-to-some-output-encoding (in a number
of cases it could be the same one, e.g. OT1 and T1 for German would share the
same) and if you don't have an appropriate
pointer-to-hyphenation-patterns-related-to-some-output-encoding you could
select the "no-hyphenation" one (or raise an error and ask for a different

 > >  >   However, logically
 > >  >   hyphenation should not depend on output encoding, and one should be
 > >  >   able to mix fonts with different output encodings without losing
 > >  >   correct hyphenation.
 > >
 > > yes, and it is possible without technical problems (in theory)
 > Possible in TeX as it stands???  Only by loading the patterns
 > resolved for each encoding.

Of course, but the resolution process would be happening absolutely
automatically in the background, which is why I say possible without technical
problems. And I said "in theory" because that requires hyphenation patterns
externally stored in a way that lets me (for any font encoding) generate from
them the appropriate internal
pointer-to-hyphenation-patterns-related-to-some-output-encoding form without

 > Or does "in theory" really mean
 > what it says -- not in practice.


 > >  > - It is rather hard to make a new font available under LaTeX.
 > >  >   Essentially one must create a virtual font which has all the
 > >  >   character slots in the places where hyphenation expects them to be.
 > >
 > > wrong.
 > Wrong...I guess.   Maybe Frank runs LaTeX on initex and has patched
 > fontenc.sty to load patterns for whatever font encoding is requested.

Unfortunately that is no longer possible with TeX, as the hyphenation tree
is compacted and doesn't allow additions after its first use (i.e. after a
paragraph has started).