There is an interesting "UTF-8 and Unicode FAQ" at
It looks as though the two strong Unicode encoding candidates for TeX/LaTeX are
UTF-8 and UTF-32. UTF-32 might be the better one to use internally, for
performance and programming convenience: every character takes a fixed four
bytes. UTF-8 is better to work with externally, since it is an extension of
ASCII. So one might make a TeX version that, say, uses UTF-32 internally and
accepts both UTF-8 and UTF-32 externally.
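
To make the idea concrete, here is a rough sketch in Python of the split between the external and internal forms. It is not how any actual TeX engine does it; the function names to_internal and to_external are made up for illustration, and a plain list of integers stands in for a UTF-32 buffer.

```python
# Sketch only: UTF-8 on the outside, one integer per character on the inside.
# to_internal / to_external are hypothetical names, not part of any TeX engine.

def to_internal(data: bytes, encoding: str = "utf-8") -> list:
    """Decode external bytes into a list of code points (the UTF-32 view)."""
    return [ord(ch) for ch in data.decode(encoding)]

def to_external(code_points, encoding: str = "utf-8") -> bytes:
    """Encode the internal code points back into an external byte stream."""
    return "".join(chr(cp) for cp in code_points).encode(encoding)

# "naive" with a diaeresis on the i, a space, and a Greek capital omega.
source = "na\u00efve \u03a9".encode("utf-8")
buf = to_internal(source)

# Externally the text is variable width; internally it is one slot per character.
print(len(source))                           # bytes in the UTF-8 file
print(len(buf))                              # characters in the internal buffer
print(hex(buf[2]), hex(buf[6]))              # direct indexing by character position
print(len(to_external(buf, "utf-32-le")))    # four bytes per character in UTF-32
```

The point of the internal form is that buf[i] is always the i-th character, with no scanning for multi-byte sequences, while plain ASCII input files remain valid UTF-8 unchanged.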