LATEX-L Archives

Mailing list for the LaTeX3 project


Mailing list for the LaTeX3 project <[log in to unmask]>
"Randolph J. Herber" <[log in to unmask]>
Mon, 3 Mar 1997 11:41:34 -0600
The following header lines are retained to effect attribution:
|Date: Mon, 03 Mar 1997 12:14:03 +0100
|From: [log in to unmask] (Hans Aberg)
|Subject: Re: Shortref mechanism
|To: Mailing list for the LaTeX3 project <[log in to unmask]>
|Cc: "Randolph J. Herber" <[log in to unmask]>

|"Randolph J. Herber" <[log in to unmask]> writes:

|>|  I cannot follow the details in your reasoning, but I can note that with
|>|deterministic parsing, the method generally used in LaTeX, conditional
|> ^^^^^^^^^^^^^ ^^^^^^^                                      ^^^^^^^^^^^
|>|parsing has such limits.
|> ^^^^^^^
|>|  But with non-deterministic parsing more general things can be done:
|>            ^^^^^^^^^^^^^^^^^
|>|  For example, I just made a definition command that can produce commands
|>|having optional arguments; in this general approach, I had to switch from
|>|LaTeX style deterministic parsing to non-deterministic parsing.

|>To someone who has written several small compilers and has studied automata
|>theory at the doctorate level, your word choice as highlighted above is
|>quite jarring.  By using a power automaton, a non-deterministic automaton
|>can be reduced to a deterministic automaton.  Therefore, one does not gain
|>any power of expression by using a non-deterministic automaton, rather one
|>only gains compaction of the description.
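
The reduction mentioned above is commonly known as the subset (power set) construction. A minimal sketch in Python, assuming an NFA without epsilon moves; the function names and the example automaton are illustrative only and do not come from this thread:

```python
from collections import deque


def nfa_to_dfa(nfa, start, accepting):
    """Subset construction for an NFA without epsilon moves.

    nfa maps (state, symbol) -> set of next states.
    Each resulting DFA state is a frozenset of NFA states.
    """
    alphabet = sorted({sym for (_, sym) in nfa})
    dfa_start = frozenset([start])
    dfa = {}
    dfa_accepting = set()
    seen = {dfa_start}
    queue = deque([dfa_start])
    while queue:
        current = queue.popleft()
        # A subset accepts if it contains at least one accepting NFA state.
        if current & accepting:
            dfa_accepting.add(current)
        for sym in alphabet:
            target = frozenset(
                nxt for state in current for nxt in nfa.get((state, sym), ())
            )
            dfa[(current, sym)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa, dfa_start, dfa_accepting


def dfa_accepts(dfa, dfa_start, dfa_accepting, word):
    """Run the DFA: exactly one successor per symbol, no guessing."""
    state = dfa_start
    for sym in word:
        state = dfa[(state, sym)]
    return state in dfa_accepting


# Hypothetical NFA over {a, b}: accepts words whose second-to-last symbol is 'a'.
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "a"): {2},
    (1, "b"): {2},
}
DFA, DFA_START, DFA_ACCEPTING = nfa_to_dfa(NFA, start=0, accepting={2})
```

The deterministic automaton recognizes the same words as the non-deterministic one; what changes is the size of the description, since each DFA state stands for a set of NFA states.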

        Your following comments are not pertinent to my comments above.

        You may feel that you are making up nomenclature to describe your
        proposed algorithms.  But, in fact, you are using nomenclature
        with an already assigned meaning in the field of computer language
        processing.

        Furthermore, you ``added insult to injury'' by deleting my
        provision of the proper nomenclature from the field of computer
        language processing that does pertain to your proposed changes
        to TeX's handling of its input, to wit:

                I believe that what you intended is the distinction of
                context free and context sensitive languages.  From
                what I have read in the TeX book, the tokenizer of TeX
                is context sensitive with a single character look-ahead
                and the TeX language based on the recognized tokens is
                context free.

                It is a significant change in the behavior of the TeX
                language to change it from being context free to being
                context sensitive.  But, it may be a necessary change.
                Most modern computer languages are context sensitive
                with a single token look ahead.  A few look ahead two
                tokens in some situations.  I imagine that some look
                ahead three tokens.  Parser generators for single token
                look ahead are readily available.

                What you are proposing is a change from zero token look
                ahead to one token look ahead.
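
A one-token look ahead of the kind described above can be sketched in Python for a command that takes an optional argument. This is only an illustration under simplifying assumptions (single-character tokens, spaces dropped); `tokenize`, `read_group`, and `parse_command` are hypothetical names, not TeX or LaTeX primitives:

```python
def tokenize(text):
    """Split text into single-character tokens, dropping spaces (a simplification)."""
    return [ch for ch in text if not ch.isspace()]


def read_group(tokens, pos, open_tok, close_tok):
    """Read a balanced group that starts at tokens[pos].

    Returns (contents as a string, position just after the group).
    """
    if pos >= len(tokens) or tokens[pos] != open_tok:
        raise ValueError(f"expected {open_tok!r} at position {pos}")
    depth = 1
    i = pos + 1
    while i < len(tokens):
        if tokens[i] == open_tok:
            depth += 1
        elif tokens[i] == close_tok:
            depth -= 1
            if depth == 0:
                return "".join(tokens[pos + 1:i]), i + 1
        i += 1
    raise ValueError("unbalanced group")


def parse_command(tokens, pos, default="none"):
    """Parse `[optional]{mandatory}`; the optional part may be absent."""
    optional = default
    # One token of look ahead: inspect the next token without consuming it.
    if pos < len(tokens) and tokens[pos] == "[":
        optional, pos = read_group(tokens, pos, "[", "]")
    mandatory, pos = read_group(tokens, pos, "{", "}")
    return optional, mandatory, pos
```

For example, `parse_command(tokenize("[a]{body}"), 0)` takes the optional branch, while `parse_command(tokenize("{body}"), 0)` falls back to the default; the choice is made by looking at exactly one upcoming token.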

        Please, would you use the proper nomenclature?

        The pairing from your improper nomenclature to what I believe
        is the pertinent nomenclature for what you are attempting to
        discuss is:

                deterministic           ==>             context free
                non-deterministic       ==>             context sensitive

        Computer languages have both structure, i.e. syntax, and meaning,
        i.e. semantics.  Computer languages that are studied for their
        syntactical properties might not have associated semantics.  All
        others do have semantics.  Many computer languages are context
        free in their syntax even though they have semantics and therefore
        context among the language elements because of those semantics.

        I believe that TeX (with the exception of the one character
        look ahead in its tokenizer which is used to locate the
        termination of tokens) is context free __in its syntax.__
        This does not mean that TeX does not have semantics nor does
        it mean that these semantic elements do not have context
        among the various semantic elements.

        I believe that Frank Mittelbach's point and position (not
        ``problem,'' as you say) is that changing TeX from a context
        free to a context sensitive syntax (grammar, if you wish)
        is too large a change to be considered.

|  This reasoning would be true for any sufficiently general-purpose
|language, but TeX is not such a language (or it is unknown if it is).

        My observations above pertain to all languages which have syntax.
        TeX is a language which has syntax.  Therefore, it is such a language.

|  The second thing is that, even though something may be theoretically
|possible, it may be practically impossible, because you simply do not have
|time both to do that implementation and to pay your bills.

        This is Frank Mittelbach's point as I understand it.

|  The third thing one must consider, is that a computer language is not
|only used to manipulate logical data, but logical data that has a semantic
|interpretation attached to it. Any logical transformation must keep track of
|that semantic interpretation, and this is related to the practicality
|question, I guess.

        At the syntax level of language processing, the semantics do
        not pertain.  At the semantics level of language processing,
        semantics is the entire purpose of the processing.  Any
        compiler or interpreter that does any semantics processing
        must handle the semantics processing that is dictated by the
        specifications indicated by the semantics associated with
        the syntactical elements.

|  With TeX the problem is this:

|  You have a variable #1 equal to some parameter text, say ##1##2.
|When #1 picks up an argument, in the first pass, an argument of the form
|    {section}{theorem}
|will be transformed into
|    sectiontheorem,
|so, when writing a deterministic parser by hand, one puts back the argument
|to the next command as {##1}{##2}, say, if you want to put it back at all. Now,
|working in this generality, there is no obvious way of transforming
|    #1 --> #1_new
|by a command doing
|    ##1##2 --> {##1}{##2}
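
The loss of the braces described above can be imitated outside TeX. The following Python sketch is a simplified model, not TeX's actual argument scanner: it assumes no control sequences and no skipped spaces, and `grab_undelimited_args` is a hypothetical helper name. It shows that once the outer braces are stripped and the pieces are concatenated, the argument boundaries cannot be recovered, whereas re-bracing each piece preserves them:

```python
def grab_undelimited_args(text, count):
    """Grab `count` undelimited arguments from the front of `text`.

    A braced argument loses one outer pair of braces; an unbraced argument
    is a single character.  Returns (list of arguments, remaining text).
    """
    args = []
    pos = 0
    for _ in range(count):
        if pos >= len(text):
            raise ValueError("ran out of input")
        if text[pos] == "{":
            depth = 1
            end = pos + 1
            while end < len(text) and depth:
                if text[end] == "{":
                    depth += 1
                elif text[end] == "}":
                    depth -= 1
                end += 1
            if depth:
                raise ValueError("unbalanced braces")
            args.append(text[pos + 1:end - 1])  # strip the outer braces
            pos = end
        else:
            args.append(text[pos])
            pos += 1
    return args, text[pos:]


# First pass: the braces are removed while grabbing.
first_pass, _ = grab_undelimited_args("{section}{theorem}", 2)

# Passing the pieces on without braces loses the boundaries ...
flattened = "".join(first_pass)
second_pass_flat, leftover = grab_undelimited_args(flattened, 2)

# ... while re-bracing each piece keeps them intact.
rebraced = "".join("{" + arg + "}" for arg in first_pass)
second_pass_braced, _ = grab_undelimited_args(rebraced, 2)
```

Re-bracing works here only because the number of arguments (two) is known in advance, which is the generality problem the paragraph above points at.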

        I have written several compilers and know how to process
        context sensitive grammars.

|  By reverting to non-deterministic parsing, one can get around this
|problem, by first picking up some text that surely contains the original
|##1##2, and then sending this original text to the next command, instead of
|the text partially parsed by #1 (which may be corrupted).

        This passing along, unchanged, of those syntactical elements which
        have been determined by the grammar to belong to following
        elements is part of the processing that occurs in a context
        sensitive parser.

|  But this does not solve Frank Mittelbach's problem, as he pointed out.

        Unless you consider Frank Mittelbach's lack of interest in
        redesigning or reimplementing TeX's syntax processing, or a
        lack of resources to do so, to be ``Frank Mittelbach's
        problem,'' Frank Mittelbach does not have a problem here.  I do
        not have a problem here; computer languages are a major portion
        of my education and work.

|>I believe that what you intended is the distinction of context free and
|>context sensitive languages.  From what I have read in the TeX book, the
|>tokenizer of TeX is context sensitive with a single character look-ahead
|>and the TeX language based on the recognized tokens is context free.

|  TeX is highly context sensitive, and this is much of the point with TeX:
|Each environment or grouping has its own set of local variables, which can
|be used to change the context rather radically. This is unrelated to the
|stuff I discussed above.

        Please read my comments above.  There is a major, significant
        difference between syntax and semantics.  I do not deny that
        TeX is quite sensitive to the semantic context of the material
        it processes.  It would not be useful if it were not so.  This
        does not prevent TeX from having a context free grammar.

|  Hans Aberg

        Deciding whether TeX should have a context free or a context
        sensitive grammar is an appropriate topic for this forum.

        Since context sensitive grammars tend to be more complex and
        to use more computer resources to process, I believe the
        TeX developers will not change the grammar of TeX in such a
        way as to make TeX's grammar context sensitive.

Randolph J. Herber, [log in to unmask], +1 630 840 2966,
CD/OSS/CDF CDF-PK-149O Mail Stop 234
Fermilab, Kirk & Pine Rds., P.O. Box 500, Batavia, IL 60190-0500.
(Speaking for myself and not for US, US DOE, FNAL nor URA.)
(Product, trade, or service marks herein belong to their respective owners.)
N 41 50 26.3 W 88 14 54.4 and altitude 700' approximately, WGS84 datum.