LATEX-L Archives

Mailing list for the LaTeX3 project

LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

Subject:
From: Will Robertson
Reply To: Mailing list for the LaTeX3 project
Date: Wed, 18 Mar 2009 13:37:04 +1030

On 18/03/2009, at 9:19 AM, Frank Mittelbach wrote:

> Manuel Pégourié-Gonnard writes:
>> James Cloos wrote:
>>> As for utf-8 or other, it may be useful to default to the character set
>>> specified for the current $LOCALE.  Maybe. :-/
>>>
>> Please don't make anything in the compilation of the document depend on the
>> locale! It would completely ruin portability of the source files.
>
> Perhaps. It might be a straight path into long-term disaster. On the other
> hand the whole area is a disaster in the first place. When we started out with
> inputenc in 2e I also thought that it is really good to keep the encoding with
> the file (which you do by stating \usepackage[latin1]{inputenc} and the like)
> and that worked fairly well for a while. But then OSes started to convert on
> the fly, so by cut-n-paste, sometimes even on the same machine, an old latin1
> file got translated into something else (except for the string specifying the
> encoding inside)... so ... not easy really

Yep, agreed that dealing with encodings is annoying :)


>> A file must be assumed to be either utf-8 (auxiliary file written by
>> XeTeX/LuaTeX) or in the encoding declared as the option of inputenc. Exactly
>> what xetex-inputenc and luatex-inputenc do.
>>
>> The difficult problem is to guess when a file is an auxiliary file. I suppose
>> the heuristics for doing so will improve when the solution gets tested.
>
> How much guessing is really needed? Are you targeting an existing 2e env
> unchanged or are you intending to design an interface that is robust if used?
> Or something in between?

Almost entirely the first.

Neither package needs to guess anything; the problem is that there's  
just no way to know if \input refers to a generated file or a user file.

The XeTeX solution simply patches \@input. The LuaTeX solution does  
something similar and allows customisation so that certain files or  
file extensions can be treated as if they were \@input rather than  
\input.
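
As a rough illustration of the \@input approach (a sketch only, not the actual xetex-inputenc code), one could wrap \@input so that files read through it are opened as UTF-8. It assumes XeLaTeX and relies on the XeTeX primitive \XeTeXdefaultencoding, which applies to files opened after it is set. The names \demo@saved@input and \demo@docencoding are made up for this example, and nested reads are not handled carefully:

```latex
\makeatletter
% Hypothetical: the encoding the user declared for their own source files.
\newcommand*\demo@docencoding{iso-8859-1}
% Keep the kernel definition of \@input so the wrapper can call it.
\let\demo@saved@input\@input
\renewcommand*\@input[1]{%
  % Files opened from here on are decoded as UTF-8 ...
  \XeTeXdefaultencoding "utf-8"\relax
  \demo@saved@input{#1}%
  % ... and once that file has been read, return to the declared encoding.
  \XeTeXdefaultencoding "\demo@docencoding"\relax
}
\makeatother
```

Files pulled in with plain \input never pass through this wrapper, which is exactly why a generated file read via \input ends up decoded with the user's encoding.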

> - new solution, i.e. not for 2e as such: design a proper interface for handling
>   internal auxiliary file reading and writing. That would then have hooks to
>   maintain encoding. We certainly have to do something along those lines for expl3

Yep.

> - partial 2e solution: use \@input as a proposed way to read internal files
>   back in (as suggested by Will) and handle those correctly. booh at those
>   packages that don't use \@input but \input for their internal files (which
>   is already wrong in 2e proper) and ask them to change or ignore them.

Yep. I hadn't thought of it before, but we could add a note to the
documentation explicitly discussing this behaviour. Using \@input for
internally-generated files is implicit in what it does, but nowhere
(that I know of) states it plainly.
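
For package authors, the convention amounts to something like the following minimal sketch. The file extension .demo and the stream name \demo@file are hypothetical; the pattern (read back with \@input, then reopen for writing) mirrors what the kernel does for table-of-contents files:

```latex
\makeatletter
% Hypothetical package-internal file \jobname.demo.
\newwrite\demo@file
\AtBeginDocument{%
  % Read the data from the previous run. If the file does not exist,
  % \@input only reports "No file ..." instead of raising an error.
  \@input{\jobname.demo}%
  % Only now reopen the file for writing (this empties it),
  % and only if auxiliary files are enabled (no \nofiles).
  \if@filesw
    \immediate\openout\demo@file=\jobname.demo\relax
    \immediate\write\demo@file{\string\relax}%
  \fi
}
\makeatother
```

Because the read goes through \@input rather than \input, an encoding-aware wrapper around \@input can treat the file as generated without any guessing.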

Note that even the kernel uses \input on the .aux file somewhere :)

> - possible 2e solution: steal \openout to always write
>   \InternallyWrittenFileHookToHandleWhatWeNeedToHandle
>   to the top of each such file; fix the cases where this is not appropriate
>   in 2e, such as filecontents env ... and wait for the packages to blow up
>   and fix those (probably only a few if any)

Nice idea, probably will work; but the return on investment is too low  
(for me at least). I expect non-UTF8 input in Xe(La)TeX documents to  
be hardly ever used. And we can always foist off the responsibility on  
the packages that don't work because of \input v. \@input.
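
To make the \openout idea concrete, here is what the result would look like if done by hand for a single file. This is only a sketch: it does not patch \openout itself. The hook name is Frank's placeholder, while its definition, the stream \demo@hookfile and the extension .hook are assumptions for illustration. It assumes XeLaTeX, where \XeTeXinputencoding changes how the remainder of the file currently being read is decoded:

```latex
\makeatletter
% The hook name is the placeholder from the proposal; this definition
% is an assumption. It switches decoding for the rest of the current file.
\providecommand*\InternallyWrittenFileHookToHandleWhatWeNeedToHandle{%
  \XeTeXinputencoding "utf-8"\relax
}
\newwrite\demo@hookfile
\immediate\openout\demo@hookfile=\jobname.hook\relax
% First line of the generated file: the hook call on a line of its own.
\immediate\write\demo@hookfile{%
  \string\InternallyWrittenFileHookToHandleWhatWeNeedToHandle}
\makeatother
```

With the hook on the first line, the file announces its own encoding, so it would be decoded correctly whether it is read back with \input or \@input.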

* * *

So, assuming we want to do something about the whole situation (I hope  
so), how open are you to the idea of adding branching to inputenc to  
load packages that aren't under the LaTeX team's control? I'm more
than happy to print a big warning telling users what's going on.

Thanks for the comments,
Will
