Frank Mittelbach writes:
> > - Hyphenation patterns are specified in terms of the output encoding.
> > This means that every character appearing in the hyphenation rules
> > must have a physical slot in the selected font.
> only in the internal storage format for patterns used within TeX. On the
> abstract level this is not at all true even though the source format of
> existing patterns tends to be written in this form as well.
Frank also made an earlier comment in this regard, which, while true,
is of marginal relevance to the discussion of multi-lingual/-encoded
documents.
Yes, it is nice if hyphenation pattern input files use symbolic
representations of characters (\ss) rather than hex code values
so they can be used for different font encodings. But the font
encoding must be selected when the format is generated! This
doesn't help the user who wants to use various font encodings,
and it certainly does not facilitate multi-encoded documents.
For hyphenation purposes, multiple encodings must be treated as
multiple languages. This again points to the need for babel to
specify the desired/required font (encoding) when it selects
a language.
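A rough sketch of tying a font encoding to a language at selection time, using babel's \addto and \extras<language> hooks. The pairing of german with T1 and english with OT1 is an assumption made up for this example, and it only keeps the font encoding in step with the language; it does not change which patterns the format has loaded.

```tex
\documentclass{article}
\usepackage[OT1,T1]{fontenc}        % load both encodings; T1 becomes the default
\usepackage[german,english]{babel}  % english (last option) is the main language

% Assumed pairing, for illustration only: each language switch also
% switches the font encoding.
\addto\extrasgerman{\fontencoding{T1}\selectfont}
\addto\extrasenglish{\fontencoding{OT1}\selectfont}

\begin{document}
English text set in OT1.

\selectlanguage{german}
Stra\ss e, now set in T1.
\end{document}
```

Whether hyphenation then comes out right still depends on the patterns in the format having been resolved for the encoding that each language is paired with.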
> > However, logically
> > hyphenation should not depend on output encoding, and one should be
> > able to mix fonts with different output encodings without losing
> > correct hyphenation.
> yes, and it is possible without technical problems (in theory)
Possible in TeX as it stands??? Only by loading the patterns
resolved for each encoding. Or does "in theory" really mean
what it says -- not in practice.
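A minimal sketch of what "loading the patterns resolved for each encoding" could look like. It assumes an iniTeX run with plain TeX's category codes already in place. The language numbers 1 and 2 are hypothetical, and the single pattern is invented for illustration rather than taken from any real pattern file.

```tex
% Sketch only: \patterns is accepted by iniTeX alone, while a format is built.
% The same symbolic pattern text is read twice, once per font encoding.

\begingroup                 % "German for OT1 fonts"
  \language=1               % hypothetical language number
  \catcode"19=12            % make sure slot "19 is an ordinary character
  \lccode"19="19            % and that it counts as a letter in patterns
  \def\ss{^^Y}              % ^^Y is character code "19, the OT1 sharp s
  \patterns{a1\ss e}        % invented pattern; \ss is expanded here
\endgroup

\begingroup                 % "German for T1 fonts"
  \language=2               % hypothetical language number
  \catcode"FF=12
  \lccode"FF="FF
  \def\ss{^^ff}             % character code "FF, the T1 sharp s
  \patterns{a1\ss e}
\endgroup
```

A document would then have to set \language to 1 whenever an OT1 font is in force and to 2 for a T1 font, with a matching nonzero \lccode for the sharp s slot, which is the sense in which the two encodings behave as two languages.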
> > - It is rather hard to make a new font available under LaTeX.
> > Essentially one must create a virtual font which has all the
> > character slots in the places where hyphenation expects them to be.
Wrong...I guess. Maybe Frank runs LaTeX on initex and has patched
fontenc.sty to load patterns for whatever font encoding is requested.
This does bring us to the point about "internal representation".
TeX has different levels of internals, and at the level where it builds
a horizontal list (as opposed to the higher level of the macro definitions)
the character tokens must map directly to the corresponding font.
Some can call this a lack of distinct internal representation.
Others can say the relevant representation is in the macros (as
with inputenc). Still others can say TeX's internal representation
is independent because of virtual fonts.
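The two levels can be seen side by side in a small LaTeX file. This is only an illustration, assuming the standard OT1 and T1 encodings: at the macro level the name \ss stays the same, while the character that reaches the horizontal list is a different slot number in each font.

```tex
\documentclass{article}
\usepackage[OT1,T1]{fontenc}
\begin{document}
% Macro level: the same symbolic name in both groups.
% Horizontal-list level: a different slot in each encoding.
{\fontencoding{OT1}\selectfont \ss\ and \char"19 \ are the same glyph in OT1.}

{\fontencoding{T1}\selectfont \ss\ and \char"FF \ are the same glyph in T1.}
\end{document}
```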
> > - TeX diagnostic messages output the "internal representation", which
> > can quickly become unreadable for scripts that are not essentially
> > ASCII.
> which diagnostics we are talking about here? some of them are in the font
> encoding (which is not the LICR at all)
I agree. This is not the issue. In fact, it is only an issue for
system configuration! Most TeX implementations now allow messages
to be printed without conversion to ^^ format.
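A small plain TeX test file shows the difference. The file name is arbitrary, and the option mentioned afterwards is specific to web2c-based implementations, so treat it as an assumption about the installation rather than a general TeX feature.

```tex
% msgtest.tex (hypothetical file name), to be run with plain TeX
\message{Stra^^dfe}   % ^^df is character code 223
\bye
```

A TeX that treats only printable ASCII as printable echoes the character in ^^ notation in the terminal and log output. On a web2c-based system, running the same file as "tex -8bit msgtest" (or with a suitable -translate-file) should print the byte itself instead.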
Donald Arseneau