LATEX-L Archives

Mailing list for the LaTeX3 project

LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

Subject:
From:
Lars Hellström <[log in to unmask]>
Reply To:
Mailing list for the LaTeX3 project <[log in to unmask]>
Date:
Mon, 6 Mar 2006 08:49:32 +0100
On Sunday, 5 March 2006, at 21:30, Frank Mittelbach wrote:
>
> that is not to say that the line
>
>>>> \DeclareUnicodeCharacter{02C6}{\textasciicircum}
>
> is probably wrong it should be most likely
>
> \DeclareUnicodeCharacter{005E}{\textasciicircum}
>
> and several others have similar defects. would be good if that got
> checked.

Is that even a legal definition? U+005E (^) is, as was mentioned
earlier in this thread, syntax in LaTeX, so you can't have inputenc
map it to something. Or are you thinking of some attempt at supporting
verbatim input?

>> Example: Assuming there is a word "deaffish" and the
>> author does not want a ligature ffi spanning both word parts.
>> Therefore, having a good editor, he uses the Unicode sequence
>> U+0066 U+FB01 to specify the correct and desired ligature.
>> Using the latter case of \DeclareUnicodeCharacter{FB01}
>> TeX would get "ffi" and then form the wrong ligature.
>
> wrong example in my opinion. as Lars said: fi or ffi ligature ended up
> in unicode as legacy codes because they were in legacy 8-bit encodings.
> million other ligatures are not available as "chars" because UC like
> most other standards are heavily influenced by what is right for
> certain countries but not others. using "fi" in this way is like using
> tables in html to position elements on the page, ie it works for that
> example but ...
>
> so the right thing is not to use fi at all here but would be to use a
> generic method to denote subword boundaries or whatever to allow the
> formatter not to use the ligature. TeX's method would be
> \textcompwordmark ... but unicode never thought that such encoding of
> logical information is the task of the standard.

Actually, U+200C (ZERO WIDTH NON-JOINER) seems to me a perfect match to
\textcompwordmark, and I've entered it as such in my "Draft
specification for the T1 encoding".
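
To make the proposed correspondence concrete, here is a minimal sketch
of such a mapping. It assumes the T1 font encoding and inputenc's utf8
option; the declaration is an illustration of the idea, not a quotation
from the draft specification or from any shipped definition file.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}

% Assumption for illustration: treat U+200C (ZERO WIDTH NON-JOINER)
% in the input as \textcompwordmark.
\DeclareUnicodeCharacter{200C}{\textcompwordmark}

\begin{document}
% Written with the macro here because a literal U+200C is invisible
% in most editors; with the declaration above, typing U+200C between
% the two f's should be equivalent.
deaf\textcompwordmark fish
\end{document}
```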

More pragmatically, one may of course write "deaf\-fish" to not only
escape the ligature, but also point out the proper point of hyphenation.
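
A small sketch of that pragmatic variant, for comparison with plain
input. Note that an explicit \- also restricts hyphenation of the word
to the marked position.

```latex
\documentclass{article}
\begin{document}
% Plain input: TeX is free to form a ligature across the word parts.
deaffish

% Discretionary hyphen: no ligature is formed across \-, and the only
% permitted break is deaf-fish.
deaf\-fish
\end{document}
```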

Lars Hellström
