LATEX-L Archives

Mailing list for the LaTeX3 project


Options: Use Classic View

Use Proportional Font
Show HTML Part by Default
Show All Mail Headers

Topic: [<< First] [< Prev] [Next >] [Last >>]

Print Reply
Bruno Le Floch <[log in to unmask]>
Thu, 13 Oct 2011 14:34:45 -0400
text/plain (58 lines)
>> Do you have any idea on the natural resulting data structure? It's not
>> clear how trees should be implemented in TeX.
> I would suggest just return a token sequence, into which the TeX commands
> the user specified has been inserted to reflect the aspects of the structure
> that he's interested in. A silly example might be something like
> \begin{Sentence}\begin{NounPhrase}My
> \begin{Noun}hovercraft\end{Noun}\end{NounPhrase}
> \begin{VerbPhrase}\begin{Verb}is\end{Verb} full
> \begin{PrepositionalPhrase}of
> \begin{Noun}eels\end{Noun}\end{PrepositionalPhrase}\end{VerbPhrase}.\end{Sentence}
> It seems to be a fairly standard extension that something inside braces in
> an PEG means "When passing through here upon matching, then insert this
> piece of material." An awkward point is however that you'd often want to
> insert an opening brace at one point and a closing brace a bit later -- for
> example to make that \Noun{hovercraft} rather than the less flexible
> \begin{Noun}hovercraft\end{Noun} -- which is not obvious how to express.

My main problem was the insertion of { and }. We cannot do that
directly, of course, and need to walk an extra mile. A reasonable but
ugly approach is to insert " { \iffalse } \fi " instead of " { " and "
\iffalse { \fi } " instead of " } ", also putting every other piece of
material within "\unexpanded", then once the token list is built,
\edef the whole thing to get braces right. Of course, there needs to
be a check (before \edef-ing) that the result will be brace balanced.

I will probably implement PEGs at some point, but it is unlikely that
I do it before I finish rewriting l3fp (so count a few months).

> I'd be inclined to consider a grammar with a left recursion loop a
> programming error, just like it is for ordinary TeX macros. There is a
> transformation of PEGs that turn left-recursion into runtime match failures
> rather than infinite loops, but I don't know how complicated it is, and I
> suspect it's mainly interesting when the PEG effectively gets compiled to
> machine code. LaTeX already runs inside the TeX virtual machine, so an
> infinite loop is not that fatal.

Detecting left-recursion should be easy, and then I'll just raise an
error, and possibly try to behave in a sensible way (not loop
eternally). The problem is not that different from epsilon-transitions
in an NFA running in a loop.

> And exchanging the sets of initial and final states. Yes, that's all there
> is to it. (I assume you don't worry about submatch capturing there anyway.)
>> but the
>> representation I have makes that non-trivial).
> It is often there that the crux lies.

Indeed, I don't need to keep track of submatches. Some more thoughts
about that this weekend.