LATEX-L Archives

Mailing list for the LaTeX3 project

LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

Subject:	Re: default inputenc/fontenc tight to language
From:	Frank Mittelbach
Reply To:	Mailing list for the LaTeX3 project
Date:	Fri, 2 Feb 2001 22:05:59 +0100

Chris wrote:

 > > a bit inconsistent that, isn't it?
 >
 > Not really: since input encoding really does mean just that.

i meant it is inconsistent that we got input encodings right but not font
encodings (or rather, font encodings as well, but we missed out an important
extra bit)

 > Once the text is `inside LaTeX' the input encoding is irrelevant: that
 > is the beauty and strength of the LaTeX text character model.

yes it is :-)

so inputencodings are fine.

but the problem that i was trying to point at is this:

 assuming we have a bit of text in the internal LaTeX representation, eg this:

   Trank der G\"otter \M{d} Trank der ...

 then there is no way for LaTeX without further help to determine the best
 font encoding to typeset this in.

 why is this so?

 - one first would need to analyse the whole text to find out which collection
   of glyphs is needed (that would result in a number of possible encodings,
   but it also might result in the need for more than one encoding)

 - but which of the possible encodings to use can depend on factors like
   do i have the desired fonts in this encoding or only in others ...
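
The first step above can be sketched roughly as follows. This is only an illustration in Python, not LaTeX code. The coverage table is a hypothetical, heavily simplified stand-in for real encoding definitions (in particular, it assumes that T4 lacks \"o), and the names COVERAGE and candidate_encodings are invented for this sketch.

```python
# Hypothetical, heavily simplified coverage table: which glyphs each
# font encoding provides. Real encodings cover far more than this, and
# the assumption that T4 lacks \"o is made only for this sketch.
PLAIN_LETTERS = set("TrankdeGt")
COVERAGE = {
    "OT1": PLAIN_LETTERS | {r'\"o'},    # \"o only as a built-up accent
    "T1":  PLAIN_LETTERS | {r'\"o'},    # \"o as a real glyph
    "T4":  PLAIN_LETTERS | {r'\M{d}'},
}


def candidate_encodings(tokens, coverage=COVERAGE):
    """Return the encodings that provide every glyph in tokens."""
    needed = set(tokens)
    return sorted(enc for enc, glyphs in coverage.items() if needed <= glyphs)


# One word on its own has several candidates ...
print(candidate_encodings(["G", r'\"o', "t", "t", "e", "r"]))
# ... but the whole text has none: more than one encoding is needed.
print(candidate_encodings(["G", r'\"o', "t", "e", "r", r'\M{d}']))
```

Note that the set of needed glyphs is only known once the whole text has been seen, and that the sketch says nothing about the second step, ie which of the candidates actually has usable fonts.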

anyway, already the first analysis is a problem inside TeX because TeX works
sequentially, so you would need to implement a multi-pass system, learning about
all the snippets of text as you go along and then reusing that information on
later passes. looks like a nightmare to me.

so if TeX can't do it automatically, we have to tell it what to use, and with
NFSS2 that means telling it which font encodings to use at each such point. And
this is bad because users shouldn't be forced to bother about this font being
only available in encoding A and that one in B and ...

Karsten pointed to some undocumented alpha code, autofe.sty, which attempts to
provide a solution for the problem. But this really is intended for a
different environment where you can change font encodings (or change them more
easily) as you go along.

so back to the strange text above and think about how some algorithm (like
autofe) would work on finding the right encodings. assuming we start in OT1

 Trank der G   % no problem up to this point

 \"o           %* ahh, now this is in OT1 but it would be far better to use T1
               % now. but switching would be bad as well since we are in the
               % middle of a word ...
 tter          % so we are now either in T1 or OT1 depending on the decision
               % above

 \M{d}         % but this strange beast only exists in T4 so we have to switch

 Trank der     %* so what do we use now for this?
               %  T4 does contain those letters. do we carry on?

whatever happens at the points marked * the typeset result would be a mess.
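
A minimal sketch of such a single-pass walk, again in Python and only under stated assumptions: the table PREFERRED, the list PLAIN and the function greedy_encodings are hypothetical, and the policy (jump to the best encoding for a special glyph, otherwise carry on in the current one) is just one of the possible choices at the points marked *. It is not a description of what autofe.sty does.

```python
# Hypothetical data for this sketch only: encodings able to typeset
# each special token, best choice first.
PREFERRED = {
    r'\"o':   ["T1", "OT1"],   # real glyph in T1, built-up accent in OT1
    r'\M{d}': ["T4"],          # only available in T4
}
PLAIN = ["OT1", "T1", "T4"]    # plain letters exist in all three


def greedy_encodings(tokens, start="OT1"):
    """Walk the tokens once and record the encoding used for each."""
    current = start
    result = []
    for tok in tokens:
        options = PREFERRED.get(tok, PLAIN)
        if tok in PREFERRED:
            current = options[0]   # jump to the best encoding for this glyph
        elif current not in options:
            current = options[0]   # forced switch for plain text
        result.append((tok, current))
    return result


text = ["Trank", "der", "G", r'\"o', "tter", r'\M{d}', "Trank", "der"]
for tok, enc in greedy_encodings(text):
    print(f"{enc:4} {tok}")
```

Under this policy the word containing \"o is split between OT1 and T1, and the second "Trank der" comes out in T4 while the first one was set in OT1.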

when we write

\fontencoding{FOO}\selectfont

we tell the system that we want it to select a font with the current
characteristics (ie family, shape, ...) in a very specific encoding, but all we
actually should say is "the following text is in a certain glyph
collection, ie contains certain glyphs".

we unfortunately can't express the latter so we are forced to do the former.

with moving arguments, eg a section head, this becomes a real problem. if the
section head is, say, in Russian (as in Denis' example) we have to somehow state
that the glyph collection for typesetting is one with cyrillic characters.

since we have no concept for this we can only express that it should be in the
encoding T2A or X2 or whatever, which is (technically) fine for the heading
itself being typeset. but passing the information about the FONT encoding to,
say, the toc is wrong, since the toc might be typeset with different fonts or
different sizes for which we do not have T2A fonts but only X2 fonts.

this is i think a longer example of what Chris wrote:

 > > but would it help if the language has a tie
 > > to the [font] encoding?
 >
 > Whether the `intended font encoding' should be part of a moving
 > argument leads to an important question.
 >
 > Note the word `intended': will it always be the case that text from a
 > moving argument should be turned into glyphs using the same font encoding
 > as was used for the original text?

no it need not, it only needs the same glyph collection.

so we would do better by tying "glyph collections" to languages and letting the
system worry about which actual font encoding to use given other constraints
during the typesetting process.

this is the kind of extension NFSS2 would need in my opinion.
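
One way to picture such an extension, as a rough Python sketch rather than a proposal for actual NFSS code: every table and name below (LANGUAGE_COLLECTION, COLLECTION_ENCODINGS, AVAILABLE, resolve_encoding, the font labels) is hypothetical, and the cyrillic encodings T2A and X2 are used only as examples.

```python
# All tables here are hypothetical illustrations.
# A language is tied to a glyph collection, not to a font encoding.
LANGUAGE_COLLECTION = {"russian": "cyrillic", "german": "latin"}

# Encodings that provide each collection, in order of preference.
COLLECTION_ENCODINGS = {"cyrillic": ["T2A", "X2"], "latin": ["T1", "OT1"]}

# Encodings for which fonts exist in each typesetting context.
AVAILABLE = {
    "heading-font": {"T2A", "T1"},
    "toc-font": {"X2", "OT1"},
}


def resolve_encoding(language, font):
    """Pick a font encoding only at typesetting time, per context."""
    collection = LANGUAGE_COLLECTION[language]
    for enc in COLLECTION_ENCODINGS[collection]:
        if enc in AVAILABLE[font]:
            return enc
    raise LookupError(f"no {collection} encoding available for {font}")


# The same Russian heading text resolves differently in the two contexts.
print(resolve_encoding("russian", "heading-font"))
print(resolve_encoding("russian", "toc-font"))
```

The moving argument would then carry only the language (and thereby the glyph collection), and the heading and the toc would each resolve it against the fonts they actually have.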

frank
