> but not reasonable -- unless the
> processor, like David Carlisle's xmltex, is a TeX thing -- for it to
> know that a particular character must have \ensuremath applied.

That isn't clear. A Unicode text processor is supposed to know an awful
lot about each character. It has to "know" that combining characters
combine, and is supposed to know the default writing direction of every
character, and various other properties. The property of being a math
character is really just one of these. In fact it _is_ one of those;
see

http://www.unicode.org/Public/UNIDATA/UnicodeData.html

Informative Categories

Abbr.  Description
Lm     Letter, Modifier
Lo     Letter, Other
Pc     Punctuation, Connector
Pd     Punctuation, Dash
Ps     Punctuation, Open
Pe     Punctuation, Close
Pi     Punctuation, Initial quote (may behave like Ps or Pe depending on usage)
Pf     Punctuation, Final quote (may behave like Ps or Pe depending on usage)
Po     Punctuation, Other
Sm     Symbol, Math
^^^^^^^^^^^^^^^^^^^
Sc     Symbol, Currency
Sk     Symbol, Modifier
So     Symbol, Other

One of the problems xmltex has is that it _doesn't_ know this stuff (and
doesn't combine combining characters, for example).

Unicode as currently devised hasn't got 2^32 characters, just 17 planes
of 2^16, but even so, that's probably enough. But whether the internal
canonical form is a Unicode number or a LaTeX-style 7-bit string \'e,
the issues of mapping between input encodings and this internal form,
and from there to font encodings, are probably about the same.

David
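
For illustration only: the character properties mentioned above (general category, writing direction, combining behaviour) can be looked up with Python's standard `unicodedata` module. This is a minimal sketch, not anything xmltex does; the helper names `is_math_symbol` and `char_properties` are invented for the example, and treating category Sm as "needs math mode" is an assumption rather than a rule from the Unicode data.

```python
import unicodedata


def is_math_symbol(ch: str) -> bool:
    """Return True if the general category of ch is Sm (Symbol, Math)."""
    return unicodedata.category(ch) == "Sm"


def char_properties(ch: str) -> dict:
    """Collect a few of the properties a Unicode-aware processor would consult."""
    return {
        "category": unicodedata.category(ch),    # e.g. 'Sm', 'Sc', 'Mn'
        "bidi": unicodedata.bidirectional(ch),   # default writing direction class
        "combining": unicodedata.combining(ch),  # canonical combining class, 0 if none
    }


# Hypothetical policy: a processor might wrap Sm characters in math mode.
for ch in "+$\u2211":
    print(repr(ch), char_properties(ch), is_math_symbol(ch))

# Combining characters combine: 'e' followed by U+0301 composes to U+00E9.
composed = unicodedata.normalize("NFC", "e\u0301")

# 17 planes of 2^16 code points each.
total_code_points = 17 * 2**16
```

The Sm category is only a starting point: characters such as the hyphen-minus (category Pd) are routinely used in math without being classed as math symbols, so a real processor would need more than this one property.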