At 18:43 +0000 2001/02/13, David Carlisle wrote:
>> What happens if a command in the middle of a line changes the catcodes
>It makes no difference: the notion of a line for the input buffer is hardwired
>into the implementation; it is not changeable via TeX commands and does
>not depend on catcodes or the value of \endlinechar.
>> or contains a macro that expands to a \input <filename>?
>The rest of the line of the original file sits buffered in one of those
>input streams until the input file finishes.
>Incidentally, one reason why xmltex cannot support UTF-16 is that
>TeX buffers to ^J (or ^M) and throws away any bytes with value 32 that
>occur at the end of this buffer, which might just be half of a 16-bit
>quantity that you'd rather keep. There's no way to control this
>behaviour from within TeX.
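
To make the UTF-16 point concrete, here is a minimal Python sketch. It assumes a simplified model of the reader David describes: the byte stream is split at ^J and every trailing byte with value 32 is dropped from each line. The name tex_read_lines is hypothetical, and a real TeX implementation differs in detail (for instance in how it treats ^M).

```python
def tex_read_lines(data: bytes) -> list[bytes]:
    """Hypothetical model of TeX's line reader (an assumption, not TeX's code).

    Splits the raw bytes at ^J (0x0A) and strips trailing bytes of value 32
    from each resulting line buffer, with no knowledge of the encoding.
    """
    return [raw.rstrip(b" ") for raw in data.split(b"\n")]


# "x" followed by U+2020 (dagger) and a newline, in little-endian UTF-16.
# The dagger encodes as the two bytes 0x20 0x20.
encoded = "x\u2020\n".encode("utf-16-le")

lines = tex_read_lines(encoded)
first_line = lines[0]        # what survives of the first line
leftover = lines[1]          # high byte of the 16-bit newline starts the next line
```

In this model both bytes of the dagger look like trailing spaces, so the first buffer keeps only the two bytes of "x". The zero byte that completes the 16-bit newline ends up at the start of the following line, so the rest of the file is misaligned as well.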
So TeX is a lot less sophisticated than it appears at first sight.
>> How can this be true?
>By magic, or the will of Knuth, or something.
Well, it's not magic, so it must be the other then.
>At 14:44 -0500 2001/02/13, Michael John Downes wrote:
>>Sorry, I didn't use the terminology very well. TeX input first goes into
>>a string buffer, one line at a time. This string buffer is the only
>>place where TeX deals with ASCII chars as input; all other "input
>>streams" are streams of tokens. Tokenization occurs by scanning
>>substrings from this string buffer and adding the corresponding token to
>>the current input stream (which if we call it a "buffer", is a different
>>buffer, not the one that contains simple 8-bit characters as first read
>>from a file).
>>If you get an error "TeX capacity exceeded: buffer size" it means
>>that a line of the input file was too long to be read into the string
TeX really is a program from another age...
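
The two levels Michael describes can be sketched roughly as follows. This is a toy model in Python under loud assumptions: read_line, tokens, and the 80-character limit are hypothetical, and every character becomes one token, which real TeX does not do for control sequences. It only illustrates that the line is fixed when it enters the string buffer, while catcodes are consulted later, as each character is scanned.

```python
from typing import Iterator

LETTER, OTHER = 11, 12  # the usual TeX category codes for letters and "other"


def read_line(line: str, buf_size: int = 80) -> list[str]:
    """Copy one input line into a fixed-size character buffer (toy model)."""
    if len(line) > buf_size:
        raise OverflowError("TeX capacity exceeded: buffer size")
    return list(line.rstrip(" "))  # trailing spaces are dropped on input


def tokens(buffer: list[str], catcodes: dict[str, int]) -> Iterator[tuple[str, int]]:
    """Scan the buffer lazily; the catcode is looked up at scan time."""
    for ch in buffer:
        yield (ch, catcodes.get(ch, OTHER))


catcodes = {"a": LETTER}
buf = read_line("a@a@")           # the whole line is buffered up front
stream = tokens(buf, catcodes)

first = [next(stream), next(stream)]
catcodes["@"] = LETTER            # stands in for a catcode change made mid-line
rest = list(stream)               # the rest of the same buffered line
```

The second "@" is tokenized as a letter even though it was read into the buffer before the catcode changed, whereas the line boundary was decided in read_line and is unaffected by anything that happens during scanning.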
* Hans Aberg
* Home Page: <http://www.matematik.su.se/~haberg/>
* AMS member listing: <http://www.ams.org/cml/>