LATEX-L Archives

Mailing list for the LaTeX3 project

Date: Thu, 10 Jan 2019 17:01:31 +0100
From: Frank Mittelbach

On 09.01.19 at 22:03, Kelly Smith wrote:
> Excuse my naïveté, as there are probably important advantages to the
> text command approach that I’ve completely overlooked.

The original purpose of the LICR approach (LaTeX Internal Character 
Representation) was/is twofold:

  - allow for a safe round trip through the LaTeX processing workflow, 
e.g., writing to an aux file and reading it back

  - support different input and output (font) encodings

As for the first reason: these days, with Unicode being essentially 
standard at the OS level, the round-trip question is no longer really an 
issue, and one can assume that native Unicode characters will survive 
this trip unharmed.

As for supporting different encodings:

  - input encoding is by default Unicode, but there are others, and those 
then need to map the input to some LICR. However, they could in theory 
map to Unicode code points as the LICR, either as native chars like "ä" 
or as a command representing the code point, \ucchar{<number>} 
for example.

  - output/font encodings are somewhat different, as there are (and 
probably always will be) output formats that are restricted and 
support only a subset of characters. For example, fonts with limited 
character sets or PDF bookmarks or ...
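
For the input side (the first point above), the existing mapping from 
input characters to an LICR can be sketched as follows. This is only an 
illustration and assumes pdfLaTeX with UTF-8 input; the kernel already 
contains this particular declaration, so repeating it here changes 
nothing and just makes the mechanism visible:

```latex
% Assumes pdfLaTeX (UTF-8 is the default input encoding since 2018).
\documentclass{article}
\usepackage[T1]{fontenc}

% Map the input character U+00E4 to the LICR \"a.
% The kernel already declares this; it is repeated only for illustration.
\DeclareUnicodeCharacter{00E4}{\"a}

\begin{document}
ä and \"a end up as the same LICR.
\end{document}
```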

In any such case it is important to be able to swap different 
definitions for the LICRs in and out, which is only possible if one has a 
handle to do so. That is not the case for simple characters of 
catcode 11 or 12; it is only the case if the input is mapped to a 
command of some shape or form.
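
As a minimal sketch of what such a handle buys you: a command can carry 
one definition per font encoding plus a fallback. The name \mysharps 
below is made up for this illustration (it is not a kernel command); the 
declaration commands themselves are the standard ones from the kernel's 
font encoding interface:

```latex
\documentclass{article}
\usepackage[OT1,T1]{fontenc}  % T1 is the document default (last option)

% \mysharps is a hypothetical command, invented for this example.
\DeclareTextCommandDefault{\mysharps}{ss} % fallback for any other encoding
\DeclareTextSymbol{\mysharps}{T1}{255}    % T1 fonts have a sharp s in slot 255

\begin{document}
Stra\mysharps e                                  % T1: uses the font's glyph

{\fontencoding{OT1}\selectfont Stra\mysharps e}  % OT1: uses the "ss" fallback
\end{document}
```

A plain character of catcode 11 or 12 offers no comparable hook: it 
simply selects whatever sits in that slot of the current font.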