David --

> Incidentally one reason why xmltex can not support utf16 is that
> TeX buffers to ^J (or ^M) and throws away any bytes with value 32 that
> occur at the end of this buffer, which might just be half of a 16bit
> quantity that you'd rather keep. there's no way to control this

UTF-8 has the virtue that, looking at a single byte, you know what kind
of creature it is: an ASCII character, the lead byte of a multibyte
sequence, or a continuation byte.  I see this as a very desirable
property for a universal encoding system.  (Local coding systems are
another matter.)
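
To make that concrete, here is a rough Python sketch of the test.  The
function name and the labels it returns are made up for illustration,
and it looks only at the bit pattern, so it does not reject lead bytes
that can only begin overlong forms (such as 0xC0).

```python
def classify_utf8_byte(b: int) -> str:
    """Classify one byte value (0..255) by its role in a UTF-8 stream."""
    if not 0 <= b <= 0xFF:
        raise ValueError("not a byte value")
    if b < 0x80:
        return "ascii"         # 0xxxxxxx: a complete one-byte character
    if b < 0xC0:
        return "continuation"  # 10xxxxxx: never starts a character
    if b < 0xE0:
        return "lead-2"        # 110xxxxx: starts a 2-byte sequence
    if b < 0xF0:
        return "lead-3"        # 1110xxxx: starts a 3-byte sequence
    if b < 0xF8:
        return "lead-4"        # 11110xxx: starts a 4-byte sequence
    return "other"             # 11111xxx: not a 1- to 4-byte pattern
```

In particular a byte with value 32 in UTF-8 is always a real space.  In
UTF-16 it may be half of something else: U+2020 (dagger) is the two
bytes 0x20 0x20.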

So provide a front-end filter that converts UTF-16 to a better
multibyte format, such as UTF-8, before TeX ever reads the file.
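
Such a filter could be as small as the Python sketch below.  This is
only an illustration, not part of xmltex: the function name and chunk
size are my own choices, and it assumes the input starts with a
byte-order mark (Python's incremental "utf-16" decoder refuses a stream
without one).

```python
import codecs


def utf16_to_utf8(src, dst, chunk_size=65536):
    """Copy bytes from src to dst, re-encoding UTF-16 as UTF-8.

    src and dst are binary file objects.  The incremental decoder holds
    back any partial code unit or unpaired high surrogate at the end of
    a chunk, so chunk boundaries may fall anywhere.
    """
    decoder = codecs.getincrementaldecoder("utf-16")()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(decoder.decode(chunk).encode("utf-8"))
    # Flush; raises UnicodeDecodeError if the input ended mid-character.
    dst.write(decoder.decode(b"", final=True).encode("utf-8"))


# As a command-line filter one would call:
#   utf16_to_utf8(sys.stdin.buffer, sys.stdout.buffer)
```

TeX then sees only UTF-8, where a trailing byte of value 32 really is a
space.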

                                      -- Bill