> Don't we need support for UCS-2 encoded files?

UCS-2 encoded TeX code is not feasible, since TeX assumes, for example, that \ is a single byte, not a sequence of bytes.

> And what about Unicode surrogates (i.e., characters that consist of
> four octets)?

In UTF-8 these are not needed: UTF-8 can encode the whole Unicode range directly. Think of the surrogates as a relic.

DniQ.
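
To make the byte-level point concrete, here is a minimal Python sketch. It is only an illustration and has nothing to do with TeX's internals. The sample character U+1D11E and the variable names are my own choices, and UTF-16 (big-endian) stands in for UCS-2, which agrees with it for characters in the Basic Multilingual Plane such as the backslash.

```python
# Illustration only: compare the encoded bytes of "\" and of a character
# outside the Basic Multilingual Plane (BMP).
backslash = "\\"
clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, chosen arbitrarily as a non-BMP example

# The backslash is one byte in UTF-8 but two bytes in UTF-16/UCS-2.
utf8_backslash = backslash.encode("utf-8")
utf16_backslash = backslash.encode("utf-16-be")

# The non-BMP character is encoded directly in UTF-8 (four bytes),
# while UTF-16 has to fall back on a surrogate pair (two 16-bit units).
utf8_clef = clef.encode("utf-8")
utf16_clef = clef.encode("utf-16-be")

print(utf8_backslash.hex(), utf16_backslash.hex())  # 5c 005c
print(utf8_clef.hex(), utf16_clef.hex())            # f09d849e d834dd1e
```

The leading 00 byte in the two-byte form of the backslash is what a byte-oriented reader would trip over, and D834 DD1E is the surrogate pair that UTF-8 never needs.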