LISTSERV - LATEX-L Archives - LISTSERV.UNI-HEIDELBERG.DE

LATEX-L Archives

Mailing list for the LaTeX3 project

LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

Re: latex/3480: Support for UTF-8 missing in inputenc.sty

Frank Mittelbach <[log in to unmask]>

Sat, 1 Feb 2003 00:59:38 +0100

Roozbeh Pournader writes:

 > > what we try is to provide a utf8 input encoding, how likely is it that some
 > > editor or application generates that Adobe thing? not very i would guess (at
 > > least not now) therefore i would not assign anything.
 >
 > Something that may happen:
 >
 > 1. A TeX document typeset with a PS Type 1 font will have the dotlessj
 > somewhere. After being converted to PDF, you will have the glyph in a PDF
 > document. Adobe tools see a 'dotlessj' there.
 >
 > 2. Someone copies and pastes it from Acrobat Reader into a document using
 > an editor that supports Adobe private use characters. He sees a dotlessj
 > there.

which "some" editor is that? i'm not saying it is not possible, i'm just
saying that as long as something is a) not very likely b) potentially
controversial we should not, as a first step, make a fixed assignment ...
 >
 > 3. The output is fed back into LaTeX.

not a problem, what would happen is that we get that char U+F6BE
and would say, sorry, nothing set up for this. Then all it needs is

\DeclareUnicodeChar{F6BE}{\j}  % already forgotten what today's syntax is :-)

in the preamble of the document and off we go. 'course if that becomes the
standard situation we might as well put it in, right now i would leave it open
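
As a sketch of that preamble fix: the declaration command that LaTeX ended up
shipping is named \DeclareUnicodeCharacter. The example below assumes pdfLaTeX
with UTF-8 input (inputenc's utf8 option, which has been the default since the
2018 LaTeX release) and a font encoding that provides \j. The document body is
made up purely for illustration.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc} % needed on older releases, harmless on recent ones

% Map the Adobe private-use code point U+F6BE (dotlessj) to the LICR object \j.
% Assumption: pdfLaTeX; this is not the mechanism used by XeLaTeX or LuaLaTeX.
\DeclareUnicodeCharacter{F6BE}{\j}

\begin{document}
% ^^ef^^9a^^be are the three UTF-8 bytes of U+F6BE, written in TeX's
% ^^ notation so that this example stays pure ASCII.
a^^ef^^9a^^beb
\end{document}
```

Without the declaration, the same input is rejected as a Unicode character
that has not been set up, which is the "sorry, nothing set up for this" case.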

 > Unicode doesn't distinguish that much between text and math characters. It
 > says somewhere that you may use a math character as a bullet or something.
 > I guess the best way to implement this is if you saw the character in text
 > mode it is \textasteriskcentered and if you saw it in math mode it is '*'.

that's not the way it works in TeX, is it? at the time the input encoding is
translated to LICR (the LaTeX internal character representation), the decision
between "text" and "math" has not yet been made.  the naming conventions for
the LICR objects are a bit dubious here as they often say "\text..." but that
is the major goal for them, ie to make the LICR objects work in text and with
different font encodings.

note that any LICR object, say, \"a is first of all only an abstract name for
the character umlaut-a. it is not the instruction "put an accent on a", nor is
\textsterling the pound glyph; it is the abstract name for the pound character.
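
A small illustration of that abstraction, as a sketch only: the same two LICR
objects are typeset under two font encodings without any change to the source.
The choice of OT1 and T1 is an assumption made for the example.

```latex
\documentclass{article}
\usepackage[OT1,T1]{fontenc} % load both encodings; T1 becomes the default
\begin{document}
% Same LICR objects, two font encodings: the source does not change,
% only the way each encoding realises the abstract character.
{\fontencoding{OT1}\selectfont \"a \textsterling}\par % OT1: \"a built with the \accent primitive
{\fontencoding{T1}\selectfont  \"a \textsterling}     % T1: \"a is a single glyph in the font
\end{document}
```

In both lines \"a names the character umlaut-a; whether an accent is actually
placed over an "a" is a decision of the font encoding, not of the name.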

technically, all the (text-)font-encoding commands (and the majority of LICR
objects are font-encoding commands) only work in TeX text and not in TeX math
today, which is why naming them \text... was useful at one stage.

the inpmath proposal adds a new dimension to that by basically allowing one to
define a mapping from LICR to math chars/commands/constructs.

if i were to start afresh then the LICR objects should probably get names which
are a bit more neutral, eg \LICR... but then this isn't the way it developed
so we are more or less stuck with the current set of names.

it may well be that U+2217 should be translated to \textasteriskcentered
when inpmath (or rather its successor implementation) is incorporated but as
long as this isn't the case i would not map something that is only likely to
come up in the middle of a math formula to something that LaTeX is going to
choke on if surrounded by $...$
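
For a single document that really does need U+2217 in both text and math, a
local preamble definition is one possible workaround. This is only a sketch
and not the inpmath mechanism: \LICRasterisk is a hypothetical name invented
for the example, and pdfLaTeX with UTF-8 input is assumed.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}

% \LICRasterisk is a hypothetical name, not an existing LaTeX command.
% It defers the text/math decision until the character is typeset.
\DeclareRobustCommand{\LICRasterisk}{%
  \ifmmode *\else \textasteriskcentered \fi}
\DeclareUnicodeCharacter{2217}{\LICRasterisk}

\begin{document}
% ^^e2^^88^^97 are the three UTF-8 bytes of U+2217 in TeX's ^^ notation.
text: a ^^e2^^88^^97 b \par
math: $a ^^e2^^88^^97 b$
\end{document}
```

A mode test of this kind is fragile in some places (for example at the start
of an alignment cell, where \ifmmode can be evaluated too early), which is one
reason to keep it local to a document rather than make it a fixed assignment.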

 > Anyway, what is the usage of \textasteriskcentered? I may be able to
 > follow it up with Unicode guys and see if we need a character for that.

the only common usage in LaTeX (i think) is as a bullet for some itemize level

good night
frank
