On Wed, Feb 10, 2010 at 11:30:30AM +0000, Joseph Wright wrote:

> On 10/02/2010 10:09, Heiko Oberdiek wrote:
> > * Encoding conversions, see package `stringenc'.
> >   Application: PDF (outlines and other text fields).
>
> At present we seem to have stayed away from encodings. My own
> preference is to leave things to LaTeX2e when working as a package
> and to use the "native" encoding only for the format (with UTF-8
> engines available this seems sensible to me).

There are encoding issues independent of TeX input. For example,
outlines can be given as strings in PDFDocEncoding or in Unicode
(UTF-16). With hyperref's option pdfencoding=auto, the string is
first encoded as Unicode; then PDFDocEncoding is tried and used if
the string fits into it.

Also, `encoding' can be understood in a broader sense that includes
hex strings, strings in ASCII85, quoted strings in C/Lua/..., ...

> > * Matching (replacing) using regular expressions,
> >   see \pdfmatch and LuaTeX.
> >   Matching is useful for extracting pieces of information or
> >   validating option values, ...
> >   Unhappily \pdfmatch still has the status "experimental",
> >   and its regular expression language differs from Lua's.
>
> I think we'll be staying away from this. XeTeX has no equivalent of
> \pdfmatch, and as you say the LuaTeX version works differently from
> the pdfTeX one.

The authors/maintainers of XeTeX/pdfTeX/LuaTeX could agree on one
version, supported by all these engines.

> [At present, we only *require* e-TeX in any case,
> although an engine with \(pdf)strcmp available is very useful.]

If an expandable version is not required, then \pdfstrcmp can be
implemented in virgin TeX. If the input of \pdfstrcmp consists of
`other' and perhaps `space' tokens, then even an expandable version
can be simulated. The main problem, AFAIK, is converting a general
token list to its string representation in an expandable way (that
is, without having \edef).

Yours sincerely
  Heiko <[log in to unmask]>
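P.S. To illustrate the pdfencoding=auto strategy described above, here is a
minimal Python sketch. This is not hyperref's actual code: the function name
pdf_text_string is my own, and the PDFDocEncoding test is simplified to
printable ASCII (the real encoding is a Latin-1 variant with a few
differences).

```python
def pdf_text_string(s: str) -> bytes:
    """Return bytes for a PDF text string (e.g. an outline entry).

    Sketch of the "try the one-byte encoding, else fall back to
    Unicode" idea: if every character is printable ASCII, the string
    fits the simplified PDFDocEncoding subset used here; otherwise
    emit UTF-16BE with the leading byte order mark FE FF, which is
    how PDF marks a Unicode text string.
    """
    if all(0x20 <= ord(c) <= 0x7E for c in s):
        return s.encode("ascii")                 # fits the one-byte encoding
    return b"\xfe\xff" + s.encode("utf-16-be")   # Unicode (UTF-16BE) with BOM
```

A plain title like "Hello" stays a one-byte string, while a title containing
non-ASCII characters comes out as a BOM-prefixed UTF-16BE string.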
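P.P.S. The result of \pdfstrcmp is easy to state even though the expandable
TeX-side implementation is the hard part; a small Python model of just the
comparison semantics (my own sketch, not engine code):

```python
def pdfstrcmp(a: str, b: str) -> int:
    # Model of the \pdfstrcmp result: -1, 0, or 1 depending on the
    # byte-wise (lexicographic) order of the two expanded strings.
    # The TeX-side difficulty -- turning a general token list into its
    # string form without \edef -- is deliberately not modelled here.
    ab, bb = a.encode("utf-8"), b.encode("utf-8")
    return (ab > bb) - (ab < bb)
```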