LATEX-L Archives

Mailing list for the LaTeX3 project

LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

Sender: Mailing list for the LaTeX3 project <[log in to unmask]>

Subject: Re: Side remarks about TeX input sequence

From: Hans Aberg <[log in to unmask]>

Date: Wed, 14 Feb 2001 00:42:35 +0100

At 18:43 +0000 2001/02/13, David Carlisle wrote:
>> What happens if a command in the middle of a line changes the catcodes
>
>It makes no difference: the notion of a line for the input buffer is
>hardwired into the implementation. It is not changeable via TeX commands
>and does not depend on catcodes or the value of \endlinechar.
>
>>  or contains a macro that expands to a \input <filename>?
>
>The rest of the line of the original file sits buffered in one of those
>input streams until the input file finishes.

OK.
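
The buffering behaviour described above can be illustrated with a toy model. This is only a sketch under loud assumptions: "tokens" are whitespace-separated words, files are a hypothetical dict of line lists, and `read_words` is an invented name. It is not how TeX scans \input, but it shows the rest of a line waiting until the nested file finishes.

```python
def read_words(files, start):
    """Toy model of nested input: a stack of (line iterator, rest of current line)."""
    # Each stack entry holds the remaining lines of a file and the
    # not-yet-consumed words of the line currently buffered for it.
    stack = [[iter(files[start]), []]]
    out = []
    while stack:
        lines, pending = stack[-1]
        if not pending:
            line = next(lines, None)
            if line is None:
                stack.pop()          # file finished: resume the outer file
                continue
            pending.extend(line.split())
            continue
        word = pending.pop(0)
        if word == "\\input" and pending:
            name = pending.pop(0)
            # The outer line's remaining words stay in `pending` while
            # the nested file is read from the top of the stack.
            stack.append([iter(files[name]), []])
        else:
            out.append(word)
    return out


# Hypothetical files: "b" follows \input on the same line of "main".
files = {"main": ["a \\input sub b", "c"], "sub": ["x", "y"]}
order = read_words(files, "main")
```

In this model the word after `\input sub` on the same line is only delivered once `sub` has been read to its end.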

>Incidentally, one reason why xmltex cannot support UTF-16 is that
>TeX buffers to ^J (or ^M) and throws away any bytes with value 32 that
>occur at the end of this buffer, which might just be half of a 16-bit
>quantity that you'd rather keep. There's no way to control this
>behaviour from within TeX.

So TeX is a lot less sophisticated than it appears at first sight.
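
To make the UTF-16 problem concrete, here is a small sketch that imitates only the one behaviour mentioned above (dropping trailing bytes of value 32 from each line). The function name `tex_style_lines` is invented for the illustration, and splitting on the ^J byte alone is a simplification.

```python
def tex_style_lines(data: bytes) -> list[bytes]:
    """Split on the ^J byte and drop trailing 0x20 bytes from each line.

    A toy imitation of the line buffering described in the thread,
    not TeX's actual input routine.
    """
    return [line.rstrip(b" ") for line in data.split(b"\n")]


# U+0120 is encoded in UTF-16BE as the two bytes 0x01 0x20, so its
# low byte looks exactly like an ASCII space at the end of the line.
encoded = "a\u0120".encode("utf-16-be")
lines = tex_style_lines(encoded)

# The line now has an odd number of bytes: half of a 16-bit unit is gone.
damaged = lines[0]
```

Decoding `damaged` as UTF-16 fails because the final code unit is truncated, which is the loss the quoted remark warns about.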

>> How can this be true?
>
>By magic, or the will of Knuth, or something.

Well, it's not magic, so it must be the other then.

>At 14:44 -0500 2001/02/13, Michael John Downes wrote:
>>Sorry, I didn't use the terminology very well. TeX input first goes into
>>a string buffer, one line at a time. This string buffer is the only
>>place where TeX deals with ASCII chars as input; all other "input
>>streams" are streams of tokens. Tokenization occurs by scanning
>>substrings from this string buffer and adding the corresponding token to
>>the current input stream (which if we call it a "buffer", is a different
>>buffer, not the one that contains simple 8-bit characters as first read
>>from a file).
>>
>>If you get an error "TeX capacity exceeded: buffer size" it means
>>that a line of the input file was too long to be read into the string
>>buffer.

TeX really is a program from another age...
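
The two stages described in the quoted message (a fixed-size character buffer filled one line at a time, then tokenization from that buffer) can be sketched roughly as follows. Everything here is an assumption made for illustration: the names `fill_buffer`, `tokenize` and `BufferSizeExceeded` are invented, catcodes are ignored, and spaces after control words are not skipped as real TeX would do.

```python
class BufferSizeExceeded(Exception):
    """Stands in for the 'TeX capacity exceeded: buffer size' error."""


def fill_buffer(line: str, buf_size: int, endlinechar: str = "\r") -> str:
    """Stage 1: copy one input line into a fixed-size character buffer."""
    text = line.rstrip("\n").rstrip(" ")   # trailing spaces are discarded
    if len(text) >= buf_size:
        raise BufferSizeExceeded(f"line of {len(text)} chars, buffer {buf_size}")
    return text + endlinechar


def tokenize(buffer: str) -> list[tuple[str, str]]:
    """Stage 2: scan the buffer into a list of (kind, value) tokens."""
    tokens = []
    i = 0
    while i < len(buffer):
        ch = buffer[i]
        if ch == "\\" and i + 1 < len(buffer):
            j = i + 1
            if buffer[j].isalpha():
                # Control word: backslash followed by a run of letters.
                while j < len(buffer) and buffer[j].isalpha():
                    j += 1
            else:
                # Control symbol: backslash followed by one non-letter.
                j += 1
            tokens.append(("cs", buffer[i + 1:j]))
            i = j
        else:
            tokens.append(("char", ch))
            i += 1
    return tokens


buf = fill_buffer("\\hbox{a}   \n", buf_size=80, endlinechar="")
toks = tokenize(buf)
```

Only `fill_buffer` ever sees raw characters; everything after `tokenize` works on tokens, which is the separation the message describes.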

  Hans Aberg
                  * Email: Hans Aberg <mailto:[log in to unmask]>
                  * Home Page: <http://www.matematik.su.se/~haberg/>
                  * AMS member listing: <http://www.ams.org/cml/>
