Frank Mittelbach wrote:
> perhaps. it might be a straight path into long-term disaster. On the other
> hand the whole area is a disaster in the first place. When we started out with
> inputenc in 2e I also thought that it is really good to keep the encoding with
> the file (which you do by stating \usepackage[latin1]{inputenc} and the like)
> and that worked for a while fairly well. But then OSes started to convert on
> the fly so by cut-n-paste sometimes even on the same machine an old latin1 got
> translated into something else (except for the string specifying the encoding
> inside)... so ... not easy really

You're right. I still think stating the encoding inside the file is the saner
approach, even if it also has its drawbacks. (Btw, when posting on
newsgroups/mailing-lists I always take care to use [ascii]{inputenc} in order to
avoid such copy-paste problems.)
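
A minimal sketch of what such a posting-safe example might look like (the
document body is invented for illustration). With the ascii option on an 8-bit
engine, inputenc accepts only 7-bit input, so accented letters are typed as
ordinary accent commands and nothing is left for a mail client or clipboard to
re-encode:

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[ascii]{inputenc} % any byte above 127 in the source raises an error
\begin{document}
% Accented letters are written as 7-bit accent commands (illustrative text):
\'Elie, na\"{\i}ve, stra\ss{}e.
\end{document}
```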

> how much guessing is really needed? Are you targeting an existing 2e env
> unchanged or are you intending to design an interface that is robust if used?
> Or something in between?
> 
I think both are interesting, but concerning the present discussion (more
precisely, the packages Will mentions), I guess the target is more an existing
as-unchanged-as-possible 2e env. The idea is to make the transition between
8-bit TeX engines and Unicode TeX engines as easy as possible for the user.

>  - possible 2e solution: steal \openout to always write
>    \InternallyWrittenFileHookToHandleWhatWeNeedToHandle
>    to the top of each such file; fix the cases where this is not appropriate
>    in 2e, such as filecontents env ... and wait for the packages to blow up
>    and fix those (probably only a few if any)
> 
Probably an interesting approach.
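
To make the idea concrete, here is a rough sketch under loud assumptions. It
does not actually steal \openout (doing that robustly for all calling
conventions is the hard part); instead it uses a hypothetical wrapper,
\HookedOpenOut, that opens a stream and immediately writes the hook as the
first line. The hook name is the one from the quoted message; its empty
definition, the wrapper, and the stream name \demoout are placeholders of mine:

```latex
\documentclass{article}
% Placeholder definition: a real hook would set up whatever is needed
% (catcodes, encoding, ...) before the rest of the file is read back.
\providecommand\InternallyWrittenFileHookToHandleWhatWeNeedToHandle{}
% Hypothetical wrapper: #1 = output stream, #2 = file name.
\newcommand\HookedOpenOut[2]{%
  \immediate\openout#1=#2\relax
  % \string keeps the hook unexpanded, so its name lands in the file.
  \immediate\write#1{\string\InternallyWrittenFileHookToHandleWhatWeNeedToHandle}%
}
\newwrite\demoout
\begin{document}
\HookedOpenOut\demoout{\jobname.demo}
\immediate\write\demoout{Text written internally.}
\immediate\closeout\demoout
\input{\jobname.demo} % the hook on the first line runs before the text
\end{document}
```

A real implementation would also have to cope with non-immediate \openout and
with files such as those produced by the filecontents environment, where a hook
on the first line is not appropriate.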

> ps interestingly enough, in 2e on top of a normal TeX engine that problem was
> properly solved as we ensured that internally written files were always
> written in LICR which is unicode in 7bit so it was always coming back
> properly. That was at the cost of translating everything into LICR on input
> (with active chars) but that was necessary anyway because of the different
> 8bit encodings around.

At some point, Élie Roux tried to reproduce this approach for lua-inputenc
(converting from the input in fake utf-8, then loading the normal inputenc,
which works correctly and translates things to LICR on output). But it breaks
whenever a non-ASCII character nevertheless gets written to a file directly
(by VerbatimOut, for example), and I'm afraid nothing can be done about that.

Manuel.