LATEX-L Archives

Mailing list for the LaTeX3 project


From: Frank Mittelbach
Reply-To: Mailing list for the LaTeX3 project
Date: Mon, 6 Mar 2006 09:18:26 +0100

 > On Sunday, 5 March 2006, at 21:30, Frank Mittelbach wrote:
 > >
 > > that is not to say that  the line
 > >
 > >>>>  \DeclareUnicodeCharacter{02C6}{\textasciicircum}
 > >
 > > is probably wrong; it should most likely be
 > >
 > >   \DeclareUnicodeCharacter{005E}{\textasciicircum}
 > >
 > > and several others have similar defects.  would be good if that got 
 > > checked.
 > Is that even a legal definition? U+005E (^) is, as was mentioned 
 > earlier in this thread, syntax in LaTeX, so you can't inputenc map it 
 > to something. Or are you thinking about some attempt at supporting 
 > verbatim input?

of course it is a legal definition. the fact that ^ is syntax when typed as
input doesn't mean that the abstract character can't have an LICR, just as $
is syntax but \textdollar is the LICR denoting the dollar character.

the only thing not possible here is to use such characters as simple LICR
objects (i.e., ones denoted directly by the 7-bit ASCII character itself)
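
To make the distinction concrete, here is a minimal sketch, assuming pdfLaTeX with the T1 font encoding (the document body is only an illustration, not taken from the thread). It shows $ and ^ acting as syntax when typed directly, while the abstract characters are reached through their LICRs:

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
% Typed directly, $ and ^ are TeX syntax (math shift and superscript):
$x^2$

% The abstract characters themselves are denoted by their LICRs:
Price: 5\textdollar, circumflex: \textasciicircum
\end{document}
```

The first line typesets a formula; the second prints a literal dollar sign and a literal circumflex.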

 > > so the right thing is not to use fi at all here but would be to have a
 > > generic method to denote subword boundaries or whatever to allow the
 > > formatter not to use the ligature. TeX's method would be
 > > \textcompwordmark ... but unicode never thought that such encoding of
 > > logical information is the task of the standard.
 > Actually, U+200C (ZERO WIDTH NON-JOINER) seems to me a perfect match to 
 > \textcompwordmark, and I've entered it as such in my "Draft 
 > specification for the T1 encoding".

yes, probably, I was too lazy to hunt for that char last night, but I
remembered dimly having seen one for that case.
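
As an illustration of the correspondence discussed here, the following is a minimal sketch, assuming pdfLaTeX with inputenc's utf8 option and the T1 encoding. The word "shelfful" is just an example where the ff ligature is unwanted, and the explicit declaration is written out for illustration (a given LaTeX release may or may not already provide this mapping):

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
% Illustrative mapping: U+200C (ZERO WIDTH NON-JOINER) -> \textcompwordmark
\DeclareUnicodeCharacter{200C}{\textcompwordmark}
\begin{document}
shelfful                      % the font's ff ligature is used

shelf\textcompwordmark ful    % ligature suppressed at the word boundary
\end{document}
```

With the declaration in place, a U+200C typed between "shelf" and "ful" in a UTF-8 source file should behave like the second line.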

but my main point was that ligatures in UC are useless and are not a universal
concept that can be applied to something like this

and as a follow-up: there isn't a 1-1 relationship between UC and LICRs. some
things cannot be properly represented as LICRs directly (e.g., combining
chars, which if used can be handled only after preprocessing), and in the
opposite direction LICRs can cover more than UC supports (e.g., \"t would be a
perfect LICR). in other words, the sets of chars covered by the two overlap a
lot but are not fully compatible.