## LATEX-L@LISTSERV.UNI-HEIDELBERG.DE


 Subject: Re: XML, UTF-8 and TeX engines
 From: Torsten Bronger
 Reply To: Mailing list for the LaTeX3 project
 Date: Fri, 18 Jul 2003 20:52:09 +0200
 Content-Type: text/plain

Halloechen!

>
> [...]
>
>>> As LaTeX is evolving it will be possible for gellmu's "alpha"
>>> (an empty element marked up in Gellmu source as \alpha) to be
>>> formatted in LaTeX as (math) \alpha when recursively inside a
>>> math element and not inside either of gellmu's "mbox" or "text",
>>> while outside of math "alpha" could easily be morphed to a
>>> suitable unicode point.
>>
>> So you distinguish between both cases within your Gellmu tools?
>> Okay, we have to, I do so, too; but actually I think that this is
>> something that the typesetter should provide.  So, an \alpha in
>> math mode should be cmmi, and in text mode it must be part of a
>> Greek word.
>
> One way or another there should be a distinction.
>
> But I want gellmu article to be able to reach xhtml+mathml and for
> this I want to have a source markup way of identifying math
> symbols.

Granted, but eventually it's MathML and then a following processor
must cope with a Unicode alpha.  And either it's something like my
Unicode --> LaTeX filter program, or it's the typesetter itself.  I
prefer the latter strongly, because all other variants I've seen so
far looked like kludges more or less.
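
To make the filter variant concrete, here is a minimal sketch of a
context-aware Unicode --> LaTeX pass over an XML tree.  It is not the
actual filter program mentioned above.  The element names "math",
"mbox" and "text" are borrowed from the discussion; the two lookup
tables and the use of \textalpha for text-mode Greek (which needs a
package such as textgreek) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup tables: one Unicode character, two LaTeX spellings.
# Braces keep the macro from swallowing a following letter.
MATH_MACROS = {"\u03b1": r"{\alpha}"}
TEXT_MACROS = {"\u03b1": r"{\textalpha}"}  # assumes a text-Greek package

# Elements that switch back to text mode even when nested inside math.
TEXT_CONTAINERS = {"mbox", "text"}


def convert_text(s, in_math):
    """Replace known Unicode characters according to the current mode."""
    table = MATH_MACROS if in_math else TEXT_MACROS
    return "".join(table.get(ch, ch) for ch in s)


def to_latex(elem, in_math=False):
    """Walk the tree recursively, tracking whether we are in math mode."""
    if elem.tag == "math":
        in_math = True
    elif elem.tag in TEXT_CONTAINERS:
        in_math = False

    parts = [convert_text(elem.text or "", in_math)]
    for child in elem:
        parts.append(to_latex(child, in_math))
        # The tail follows the child but belongs to this element's mode.
        parts.append(convert_text(child.tail or "", in_math))
    body = "".join(parts)

    if elem.tag == "math":
        return "$" + body + "$"
    if elem.tag == "mbox":
        return r"\mbox{" + body + "}"
    return body


doc = ET.fromstring(
    "<p>\u03b1-decay: <math>\u03b1 + <mbox>\u03b1</mbox></math></p>"
)
print(to_latex(doc))
```

The same U+03B1 comes out as a math macro directly inside "math" and
as a text macro everywhere else, including inside an "mbox" nested in
math.  This is exactly the bookkeeping that a typesetter with native
Unicode support would make unnecessary.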

> For that purpose it is convenient for me to hold on to <alpha/>
> (the xml form of \alpha) until the end of any pipeline.  Beyond
> that I think it an inefficient use of xml structure to look
> individually at items of cdata.

I think so, too, however db2latex and the MathML-->XSLT-->LaTeX
project (sorry, don't know its Sourceforge name at the moment) do
something like that apparently.

> So my formatter is willing to think about how to handle <alpha/>
> but not about how to handle α (which will be understood only as
> the unicode object that it is and which, therefore, should not be
> found loose inside math).

But then your formatter stops when having reached XML, or it starts
with a format that has similar limitations as LaTeX.

> (The last sentence is supposed to have a single U+03B1 that is
> UTF-8 encoded; I don't know what will happen in the mail.)

It arrived in one piece (but not as UTF-8).
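
For reference, a short sketch of how a single U+03B1 looks on the wire
in different encodings.  The thread does not say which charset the
mail was actually transcoded to; ISO-8859-7 is used here purely as an
assumed example of a legacy single-byte Greek charset.

```python
alpha = "\u03b1"  # GREEK SMALL LETTER ALPHA

# UTF-8 needs two bytes for this code point.
utf8_bytes = alpha.encode("utf-8")
print(utf8_bytes.hex())   # ceb1

# ISO-8859-7 (assumed here for illustration) needs one byte.
greek_bytes = alpha.encode("iso8859_7")
print(greek_bytes.hex())  # e1

# A reader that wrongly interprets that single byte as Latin-1
# shows a different character: U+00E1, a with acute accent.
misread = greek_bytes.decode("latin-1")
print(f"U+{ord(misread):04X}")  # U+00E1

# Decoding with the matching charset restores the original character.
assert greek_bytes.decode("iso8859_7") == alpha
```

So a message can keep its alpha intact without being UTF-8 at all, as
long as sender and reader agree on the charset.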

Tschoe,
Torsten.

--
Torsten Bronger, aquisgrana, europa vetus