On 14.01.19 at 07:15, Kelly Smith wrote:
>> of course this is way before the advent of unicode or xetex/luatex ...
>
> As for XeTeX and LuaTeX, it appears that UTF-8 text doesn’t get converted
> to an internal representation, according to some quick experiments I tried.

In some sense I would love to make that happen (as there are also
advantages to it) but it would come at a high cost:

 - loss of speed (somewhat)

 - impossibility of using utf8 chars in csnames, e.g. in luatex you can do

     \newcommand\Füße{}

   as long as the chars are of \catcode 11; once they are active they
   can't be used in this way. While I personally wouldn't mind, I can
   understand that, depending on your language, you may well want that
   to be possible. (A complete example file is in the PS below.)

The second point really means that it is a no-go.

> I guess that it would be impractical to set up automatic conversions
> for such a large character set.

There is that.

> And, as you said, since the UTF-8 will
> survive reading and writing to and from files, an internal representation
> isn’t critical.

The downside is that you simply don't know whether the character will
typeset correctly in a given situation: that approach assumes that
everything is unicode and that there is something at the other end when
you ask the font to render the utf8 char. As that is not necessarily
true, you may end up with tofu.

In other words, a certain step backwards from the situation we had in
that respect in the 8-bit world.

> When running on XeTeX or LuaTeX, is there an interface for getting the
> Unicode character(s) represented by a text command? (e.g. \"{a} -> ä,
> useful for string comparison). I suppose one could easily achieve that by
> locally redefining all the text commands?

Not necessary, you just need to process the definitions in the right
state, e.g.

  \typeout{ä \"w}

  \makeatletter
  \protected@edef\foo{ä \"w}
  \show\foo

will give you this as output:

  ä \"w
  > \foo=macro:
  ->ä \"w.
  l.14 \show\foo

(A complete file to try this yourself is in the PPS below.)

cheers
frank
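
PS: a minimal sketch of the csname point above, assuming a current
LuaLaTeX (under pdfLaTeX the umlaut characters are active, so the
\newcommand would fail there); the file name and the macro body "feet"
are just made up for the example:

  % fuesse.tex -- compile with lualatex
  \documentclass{article}
  % ü and ß have \catcode 11 (letter) in LuaTeX, so they are
  % allowed inside a control sequence name
  \newcommand\Füße{feet}
  \begin{document}
  \Füße
  \end{document}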
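
PPS: and a complete file around the \protected@edef example, so you can
reproduce the terminal/.log output quoted above (the exact "l.14" line
number will of course differ in your own file):

  % compile with lualatex or xelatex
  \documentclass{article}
  \makeatletter
  % writes the argument to the terminal and the .log file
  \typeout{ä \"w}
  % expand the material with \protect in its non-typesetting state,
  % then inspect the resulting token list
  \protected@edef\foo{ä \"w}
  \show\foo
  \makeatother
  \begin{document}
  text
  \end{document}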