David Carlisle wrote --

> But whether the internal
> canonical form is a unicode number or a latex style 7bit string \'e
> the issues of mapping between input encodings and this internal form,
> and from there to font encodings, are probably about the same.

And I just want to agree with this.

In fact I would go a lot further and say that the problems raised in
this discussion, such as the two below, are the same whatever
system you use to do quality typesetting:

  what is a character?

  what is the relationship between character strings and relatively positioned
  glyphs on a surface?

Thus LaTeX's choice of using 7-bit strings internally is also
a mere detail.

And these problems do not go away just because you use larger integers
to represent text streams.
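
A small illustration of that last point, as a sketch only: the choice of
Python and its standard unicodedata module is an assumption made for the
example, not something from the discussion. The same visible e-acute can
be stored as one code point or as two, so "what is a character?" still has
to be answered even when the text stream is made of larger integers.

```python
import unicodedata

# Two representations of the same visible e-acute:
# a single precomposed code point, or a base letter plus a combining accent.
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT


def describe(s):
    """Return the Unicode character names of the code points in s."""
    return [unicodedata.name(c) for c in s]


# The two strings differ as sequences of integers ...
differ_as_code_points = composed != decomposed

# ... and only compare equal after an explicit normalization step.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

if __name__ == "__main__":
    print(len(composed), describe(composed))
    print(len(decomposed), describe(decomposed))
    print(differ_as_code_points, same_after_nfc)
```

The decomposed form is loosely analogous to the \'e notation: an accent
instruction attached to a base letter. Either way a mapping to one agreed
internal form is needed before anything can be matched against a font
encoding.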
