Tue, 6 Feb 2001 11:09:10 -0500
Just out of curiosity, I'm wondering what those here think about
Unicode and, in particular:
1. Is its concept of character -- basically an unsigned 32-bit
integer -- durable for, say, the next 100 years?
(As I read the discussion here, I think not.)
2. Do we think that 2^32 is a wise upper bound?
(This question vanishes if we think that representing
characters as integers, rather than as more complicated data
structures, is inadequate.)
Unicode is directly relevant to the future of LaTeX to the extent that
LaTeX is to be robust for formatting XML document types, because
normal document content can consist of arbitrary sequences of Unicode
characters. XML systems are designed to make decisions only where
markup occurs. It is reasonable for an XML processor writing in a
typesetting language to know the markup ancestry of a character, e.g.,
whether it is within a math zone, but not reasonable -- unless the
processor, like David Carlisle's xmltex, is a TeX thing -- for it to
know that a particular character must have \ensuremath applied.
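
To make "markup ancestry" concrete, here is a minimal sketch in Python
using the standard xml.etree.ElementTree module. The element name
"math" and the helper chars_with_context are assumptions made up for
illustration, not part of any real document type or of xmltex. It only
shows that a generic processor can tag each character with whether it
sits inside a math zone, without knowing anything about the character
itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical element name that marks a math zone in the document type.
MATH_TAGS = {"math"}

def chars_with_context(elem, in_math=False):
    """Yield (character, in_math) pairs in document order."""
    inside = in_math or elem.tag in MATH_TAGS
    # Text directly after elem's start tag.
    for ch in elem.text or "":
        yield ch, inside
    for child in elem:
        yield from chars_with_context(child, inside)
        # Tail text follows the child's end tag, so it belongs to elem.
        for ch in child.tail or "":
            yield ch, inside

doc = ET.fromstring("<p>a<math>x</math>b</p>")
pairs = list(chars_with_context(doc))
```

The decision is driven entirely by the element stack; nothing here
inspects which character it is, which is the point of the paragraph
above.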
I note that in GNU Emacs these days characters can have property lists.
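
As a rough illustration of the contrast in question 2, and not a
description of how Emacs implements it, a character can be modelled
either as a bare integer or as an integer carrying a property list.
The class name RichChar and the property key "math" below are
hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class RichChar:
    """A code point plus an open-ended property list (hypothetical)."""
    code: int
    props: dict = field(default_factory=dict)

# The same code point as a bare integer and as a richer structure.
plain_x = 0x78
math_x = RichChar(0x78, {"math": True})
```

Both values name the same code point; only the second can record that
this particular occurrence belongs to a math zone.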
Thanks for your thoughts.