LATEX-L Archives

Mailing list for the LaTeX3 project


Frank Mittelbach <[log in to unmask]>
Reply To:
Mailing list for the LaTeX3 project <[log in to unmask]>
Tue, 17 Mar 2009 23:49:31 +0100
Manuel Pégourié-Gonnard writes:
 > James Cloos a écrit :
 > > As for utf-8 or other, it may be useful to default to the character set
 > > specified for the current $LOCALE.  Maybe. :-/
 > > 
 > Please don't do anything in the compilation of the document depend on the
 > locale! It would completely ruin portability of the source files.

Perhaps. It might be a straight path into long-term disaster. On the other
hand, the whole area is a disaster in the first place. When we started out with
inputenc in 2e I also thought that it was really good to keep the encoding with
the file (which you do by stating \usepackage[latin1]{inputenc} and the like),
and that worked fairly well for a while. But then OSes started to convert on
the fly, so that through cut-and-paste, sometimes even on the same machine, an
old latin1 file got translated into something else (except for the string
specifying the encoding inside)... so ... not easy really.
 > A file must be assumed to be either utf-8 (auxiliary file written by
 > XeTeX/LuaTeX) or in the encoding declared as the option of inputenc. Exactly
 > what xetex-inputenc and luatex-inputenc do.
 > The difficult problem is to guess when a file is an auxiliary file. I suppose
 > the heuristics for doing so will improve when the solution gets tested.

How much guessing is really needed? Are you targeting an existing 2e
environment unchanged, or are you intending to design an interface that is
robust if used? Or something in between?

A couple of thoughts off the top of my head:

 - new solution, i.e. not for 2e as such: design a proper interface for handling
   internal auxiliary file reading and writing. That would then have hooks to
   maintain the encoding. We certainly have to do something along those lines
   for expl3.

 - partial 2e solution: use \@input as the proposed way to read internal files
   back in (as suggested by Will) and handle those correctly. Boo at those
   packages that use \input rather than \@input for their internal files (which
   is already wrong in 2e proper); ask them to change, or ignore them.

 - possible 2e solution: steal \openout to always write
   to the top of each such file; fix the cases where this is not appropriate
   in 2e, such as filecontents env ... and wait for the packages to blow up
   and fix those (probably only a few if any)
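
As a rough sketch of the second idea (not part of the proposal above, and
assuming pdfTeX with inputenc already loaded): a package reads its internal
file through \@input, and the kernel's \@input is wrapped so that the file is
read as UTF-8 and the previous encoding is restored afterwards. The names
\saved@input and \read@internal@file and the file extension .xyz are
hypothetical; \inputencodingname is the macro in which inputenc records the
current encoding.

```latex
\makeatletter
% Keep the kernel definition under a (hypothetical) private name.
\let\saved@input\@input

% Wrapped \@input: capture the current encoding name by expanding
% \inputencodingname first, so nested calls each restore their own value.
\renewcommand*\@input[1]{%
  \expandafter\read@internal@file\expandafter{\inputencodingname}{#1}}

% #1 = encoding to restore, #2 = file name.
% Assumption: internal files were written as UTF-8.
\newcommand*\read@internal@file[2]{%
  \inputencoding{utf8}%
  \saved@input{#2}%
  \inputencoding{#1}}

% A package then reads its own internal file like this:
\@input{\jobname.xyz}
\makeatother
```

Packages that call \input directly would bypass such a wrapper, which is why
they would have to change.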


PS: interestingly enough, in 2e on top of a normal TeX engine that problem was
properly solved, as we ensured that internally written files were always
written in LICR, which is Unicode in 7-bit, so it always came back properly.
That was at the cost of translating everything into LICR on input (with active
chars), but that was necessary anyway because of the different 8-bit encodings
around.
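
To illustrate the LICR round trip, here is a minimal sketch, assuming pdfTeX,
a source file saved in latin1, and the 2e behaviour described above: the 8-bit
character is mapped to its 7-bit LICR form on input, and that form is what
ends up in the internally written file.

```latex
% Save this file in latin1, as declared by the inputenc option.
\documentclass{article}
\usepackage[latin1]{inputenc}
\begin{document}
\tableofcontents
% The 8-bit character is active and stands for the LICR form \'e,
% so the entry written to the .toc file contains Caf\'e in 7-bit
% ASCII rather than the raw latin1 byte.
\section{Café}
\end{document}
```

Because the .toc file holds only 7-bit text, it reads back the same way
whatever 8-bit encoding the main file declares.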