LATEX-L Archives

Mailing list for the LaTeX3 project


Frank Mittelbach <[log in to unmask]>
Reply To:
Mailing list for the LaTeX3 project <[log in to unmask]>
Tue, 30 Jan 2001 21:22:21 +0100

 > I hope then that by default there are supported names like
 > \textgreater \textasciitilde, etc. for all 33 non-alphanumeric
 > printable ascii characters including 0x20 that work properly in
 > various LaTeX contexts.

Lars already answered that: even with OT1 all those chars are already there
(most of them at least). the main reason for a switch from OT1 to something
else is not extra chars but proper hyphenation when typesetting languages other
than English (for those who don't use MLTeX).

 > My point of view is that of one writing a formatter from an XML
 > document type to LaTeX source.  Of course, David Carlisle's xmltex,

you should perhaps not translate that to an 8bit input code page but to the
latex internal character representation which is 7bit
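
a minimal sketch of what that 7bit internal form looks like for a few Latin-1
characters. the sample words are made up for illustration, and textcomp is
loaded on the assumption that the kernel in use does not already provide
\textonehalf:

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{textcomp} % assumption: needed for \textonehalf on older kernels
\begin{document}
% everything below is plain 7bit ASCII, so no inputenc is required
Stra\ss e, M\"adchen, \textonehalf\ cup, a\textasciitilde b,
1 \textless\ 2 \textgreater\ 0
\end{document}
```

a formatter that emits this form does not depend on any input code page.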

you might want to have a look at the old talk of 1995 i gave in Brno about the
relationship between input/internal/output encodings in latex and what role
inputenc and fontenc play therein, you find it at in
the papers section.

 > In the general context of formatting from XML to LaTeX source, though
 > not so much in my specific context, nor in the context of authors coming
 > from a LaTeX or TeX background, I am concerned about what happens with
 > 8 bit characters in the range 0xA0 - 0xFF from the various ISO 8 bit
 > character sets.

well, if you translate XML to latex you have control over that range and you
can map it to LaTeX's internal form depending on the source input encoding of
your XML file. alternatively you could let latex do the mapping if the XML
source input encoding is one that is recognised by inputenc (or, if not, by
providing an inputenc mapping for that codepage).
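
a rough sketch of such a mapping file, assuming a hypothetical code page
called "mycp". the file name, the date string and the three slots chosen are
illustrative only; \DeclareInputText is the declaration the standard inputenc
definition files use:

```latex
% mycp.def: hypothetical input encoding file, found by inputenc via its name
\ProvidesFile{mycp.def}[2001/01/30 hypothetical code page (illustration)]
\DeclareInputText{223}{\ss}          % byte 0xDF maps to the internal form \ss
\DeclareInputText{228}{\"a}          % byte 0xE4 maps to the internal form \"a
\DeclareInputText{189}{\textonehalf} % byte 0xBD maps to a text object
\endinput
```

a document would then select it with \usepackage[mycp]{inputenc}, the same
way it selects latin1.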

 > By default with T1, I believe, the input encoding for these characters
 > matches the "cork" encoding.  But when inputenc is set to something

T1 is a font encoding, not an input encoding. there is no inputenc method in
LaTeX that supports raw 8bit to be passed straight from input to output (well,
there is in the sense that you can use vanilla LaTeX without any inputenc,
but this is really only there for compatibility and not officially supported).
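
to make the distinction concrete, a minimal preamble in which the two
encodings are selected independently (latin1 is only an example choice here):

```latex
\documentclass{article}
\usepackage[latin1]{inputenc} % input side: how 8bit bytes in the file are read
\usepackage[T1]{fontenc}      % output side: which font encoding (Cork) is used
\begin{document}
% the input encoding does not have to match the font encoding
Stra\ss e
\end{document}
```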

 > with a standard public name -- for example an 8 bit name that would be
 > recognized by one of James Clark's XML parsers "xp" or "SP" I think it
 > highly desirable that the typeset appearance of the characters match
 > what *should* be the screen appearance in a web browser when the
 > character set is properly specified.
 > In particular under such an encoding absent an explicit author
 > indication for math there should be no math.  For example, the
 > miniature "1/2" at data point 0xBD in ISO-8859-1 (Latin 1) should
 > *not* be regarded as math unless an author should choose for some
 > reason I do not anticipate to place it inside math.

i agree, and in some sense i'm quite happy that inputenc still says beta
because i have for years been against having the inputenc files map to anything
other than text objects. in other words, i would want to have the 10 or so odd
mappings in the various inputenc defs that do map to math be replaced by

i'm currently trying to document the internal representation of LaTeX
including inputenc and the like and the current status is impossible to

 > (Probably, however, the present inputenc name "latin1" needs to remain
 > as it is for backward compatibility.)

well, probably, but then how many people would have known (and used the fact)
that current inputenc latin1 actually has

\DeclareInputText{189}{\textonehalf} % so that gives an error if placed in
                                     % math



would that also make an uproar on ctt? i.e., changing the inputencs to be text
objects by default
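
a sketch of the kind of change this would mean inside an inputenc definition
file. slot 177 and the macros \pm and \textpm are picked for illustration and
are not quoted from this message; \textpm is assumed to come from textcomp:

```latex
% math-style mapping: the input byte becomes a math object
\DeclareInputMath{177}{\pm}
% text-style mapping: the same byte becomes a text object instead
\DeclareInputText{177}{\textpm}
```

only one of the two declarations would be present for a given slot in a real
file; they are shown together here for comparison.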

comments anybody? (Mr. from the grave?)