Re: accents and inputenc
Heiko Oberdiek
Mon, 5 Jul 2004 15:30:06 +0200

On Mon, Jul 05, 2004 at 07:31:34AM +0200, Werner LEMBERG wrote:

> [LaTeX 2e 2003/12/01]
>
> Is the following a known limitation or a bug?  And if it is a
> limitation, where is it documented?
>
>   \documentclass{article}
>
>   \usepackage[latin3]{inputenc}
>
>   \begin{document}
>   \tableofcontents
>   \section{\'^^b9}
>   \end{document}
>
> ^^b9 is the dotless i in latin 3 -- in the TOC, the accent is
> formatted incorrectly.  BTW, it doesn't matter whether OT1 or T1 is
> used.

Package inputenc translates the input characters that it controls
into TeX code. ^^b9 becomes:

  \show^^b9
  ->\IeC {\i }

That is four tokens instead of the single ^^b9 token. This goes
into the .aux and .toc files:

  \contentsline {section}{\numberline {1}\'\IeC {\i }}{1}

The purpose of \IeC is to ensure that spaces after the character
are detected correctly:

  ^^b9 foobar      --> space between
  \i foobar        --> no space
  \IeC{\i} foobar  --> space between

An unbraced macro argument is a single token, so when the .toc file
is read, \' takes only \IeC as its argument rather than the whole
character. Because of the four tokens you therefore need braces
around such characters:

  \section{\'{^^b9}}

Of course it is possible to change the behaviour of inputenc:
the translation into TeX code could be deferred in protecting
environments, so that the 8-bit character itself goes into the
.aux and .toc files:

  \contentsline {section}{\numberline {1}\'^^b9}{1}

The disadvantage of this approach is that the \section command and
\tableofcontents are processed at different times, perhaps with
different input encodings. The wrong input encoding could then apply
to the section title in the table of contents, so changes of the
input encoding would have to be recorded in the .toc file, too.

Yours sincerely
  Heiko
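
For reference, here is the test file from the question with the braces workaround applied, as a minimal sketch. It assumes the same setup as the original report (article class, inputenc with the latin3 option) and changes only the \section line; the comments are added for illustration.

```latex
\documentclass{article}

\usepackage[latin3]{inputenc}

\begin{document}
\tableofcontents
% Braces make the whole inputenc expansion of ^^b9 (\IeC {\i })
% the argument of \', both here and when the .toc file is read.
\section{\'{^^b9}}
\end{document}
```

A second LaTeX run is needed before the corrected entry shows up in the table of contents, since the .toc file is written on one run and read on the next.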