LATEX-L Archives

Mailing list for the LaTeX3 project

LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

Subject:
From:
Joseph Wright <[log in to unmask]>
Reply To:
Mailing list for the LaTeX3 project <[log in to unmask]>
Date:
Thu, 20 Aug 2009 08:50:32 +0100
Hello Lars,

Thanks for taking the time to look at all of this: I really appreciate it.

Lars Hellström wrote:
> Have begun review, but only gotten about a third of the way. Some
> remarks so far:
> 
> 1. Is it possible to use a space as delimiter of an argument (perhaps
> most interesting for u arguments)? I believe I spotted some places in
> the d->D conversion that would gobble a space as <token>.

I tend to find it best to test things, even though I had a feeling I
knew the answers here.  The u argument works with spaces:

\ExplSyntaxOn
\DeclareDocumentCommand \foo { u{~stop~} } { (#1) }
\ExplSyntaxOff
\foo word stop more

results in "(word)more".

The D specifiers, on the other hand, do not work using a space as one
delimiter. That is not affected by the shorthand: if you try something
like { D[~{default} } or { D[{~}{default} } all hell breaks loose.  That
is basically what I expected, although I guess it should be documented.
Do we really want to support something like { d~~ } ?

> 2. I believe \xparse_prepare_next:w should be listed as a variable (or
> maybe "variable function"), since it is getting redefined rather
> frequently and seems to be keeping track of the state of the argspec
> parser automaton.

I've tried to improve the documentation here a bit.
\xparse_prepare_next:w is not a variable, as it contains things to be
executed. So it has to be a function! I have tried to explain what it is
grabbing. You often see this with "next" functions, so this is just one
case of something that is bound to come up again.

> 3. Though provided in .dtx format, I find the implementation section
> somewhat illiterate (i.e., not up to literate programming standards).
> Some concrete examples:

No-one ever said I was any good at literate programming :-)

>  * \l_xparse_processor_int and \l_xparse_processor_use_int are
> documented as "For keeping a count of post-processors and then using
> them." Well, I could guess about as much from the names alone. What
> would be more interesting to see spelt out is /how/ these keeps a track
> of post-processors; what does actual values of these variables mean? At
> what stage in the process are they used?

I've tried to improve the documentation on this (whether I have, I leave
to others). I've also altered the name \l_xparse_processor_int to
\l_xparse_processor_total_int. Trying to explain what has to happen here
is not easy, at least for me.  At the parsing stage, the processor
functions are in front of the parser in reverse order. So as each one is
found, it is saved and \l_xparse_processor_total_int is increased. Then
the argument is grabbed, and the processors are used. This has to start
at 1, so \l_xparse_processor_use_int is used to count up until it is
equal to \l_xparse_processor_total_int, at which point the processing is
done.
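
To make the counting concrete, here is a minimal sketch of the scheme
with made-up names (everything starting \demo_ or \l_demo_ is
hypothetical, and this is not the xparse code). It assumes a current
expl3, and it does not try to model the reverse order in which the
processors are found: it only shows one counter recording how many
processors were saved and a second one counting up from 1 while they are
applied.

```latex
\documentclass{article}
\usepackage{xparse}% loads expl3
\ExplSyntaxOn
% All \demo_... and \l_demo_... names are invented for this sketch.
\int_new:N \l_demo_processor_total_int % how many processors were saved
\int_new:N \l_demo_processor_use_int   % number of the next one to apply
\tl_new:N  \l_demo_arg_tl              % the grabbed argument

% Save a processor (a function with no arguments that edits
% \l_demo_arg_tl) under the next free number.
\cs_new_protected:Npn \demo_save_processor:N #1
  {
    \int_incr:N \l_demo_processor_total_int
    \cs_set_eq:cN
      { demo_processor_ \int_use:N \l_demo_processor_total_int : } #1
  }

% Apply processors 1, 2, ... and stop once the use counter has
% passed the total.
\cs_new_protected:Npn \demo_process:n #1
  {
    \tl_set:Nn \l_demo_arg_tl {#1}
    \int_set:Nn \l_demo_processor_use_int { 1 }
    \int_until_do:nNnn
      { \l_demo_processor_use_int } > { \l_demo_processor_total_int }
      {
        \use:c { demo_processor_ \int_use:N \l_demo_processor_use_int : }
        \int_incr:N \l_demo_processor_use_int
      }
    \tl_use:N \l_demo_arg_tl
  }

% Two toy processors.
\cs_new_protected:Npn \demo_parens:
  {
    \tl_put_left:Nn  \l_demo_arg_tl { ( }
    \tl_put_right:Nn \l_demo_arg_tl { ) }
  }
\cs_new_protected:Npn \demo_brackets:
  {
    \tl_put_left:Nn  \l_demo_arg_tl { [ }
    \tl_put_right:Nn \l_demo_arg_tl { ] }
  }

\demo_save_processor:N \demo_parens:   % saved as number 1
\demo_save_processor:N \demo_brackets: % saved as number 2
\cs_new_eq:NN \DemoProcess \demo_process:n
\ExplSyntaxOff
\begin{document}
\DemoProcess{x}% typesets [(x)]
\end{document}
```

With no processors saved the total stays at 0, the test 1 > 0 is true
straight away, and the argument is used unchanged.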

>  * The overall structure of the code starts with lots of little things
> (macros concerned with parsing specific argspecs) and ends (I'm
> guessing, from xparse-alt) with putting them together. The more literate
> approach would be to start with the big picture -- either from the "what
> goes on at run-time" or "what goes on at define-time" point of view --
> so that one knows what the little things will fit into when one gets to
> them.

This reflects how I think, I suspect. I like to start at the low level
and work up, hence variables come first (after the lead-off), then
internal functions, then user functions. I've divided internal functions
into what seem to me to be logical "blocks", then I do everything
alphabetically. So I can quickly find a function if I know its name. (I
never read code from start to end, or even in typeset form. I always
read it in my editor, find a function, read it, then find the next
function, etc. So for me alphabetical is best.) In my defence, xparse
was in roughly the same order before any changes were made by me.

> Joseph Wright skrev:
>> Lars Hellström wrote:
>>> Haven't had time to review the code yet, I've only just downloaded it.
>>> Will get back to you when I have had a look at it. (I'm especially
>>> curious to how on earth you managed to do optional arguments at
>>> expand-time.)
>>
>> That is done basically following etextools. For each argument, grab #1
> 
> Aha, the optimistic approach of hoping everything can be processed as
> undelimited macro arguments. That I can believe to be doable.

If there is another (better) way, do tell. As I said, I've just taken
what is known elsewhere and tried to fit it together with the rest of
xparse.
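
For anyone following along, the general idea can be sketched like this.
It is an illustration only, with hypothetical \demo_ names, assuming a
current expl3, and is neither the etextools nor the xparse code: grab
the next item as an undelimited argument, compare it with "[", and
either hand over to a delimited grabber or treat what was grabbed as the
mandatory argument.

```latex
\documentclass{article}
\usepackage{xparse}% loads expl3
\ExplSyntaxOn
% Hypothetical names; a sketch of the idea only.
% Grab the next item as an undelimited argument and compare it with "[".
\cs_new:Npn \demo_cmd:n #1
  {
    \str_if_eq:nnTF {#1} { [ }
      { \demo_cmd_opt:w }                     % "[" seen: read up to "]"
      { \demo_cmd_final:nn { default } {#1} } % no "[": #1 is mandatory
  }
\cs_new:Npn \demo_cmd_opt:w #1 ] #2 { \demo_cmd_final:nn {#1} {#2} }
\cs_new:Npn \demo_cmd_final:nn #1#2 { (#1)(#2) }
\cs_new_eq:NN \DemoCmd \demo_cmd:n
\ExplSyntaxOff
\begin{document}
% Everything happens by expansion, so it works inside \edef.
\edef\withopt{\DemoCmd[opt]{arg}}% becomes (opt)(arg)
\edef\noopt{\DemoCmd{arg}}%        becomes (default)(arg)
\texttt{\meaning\withopt} and \texttt{\meaning\noopt}
\end{document}
```

The optimism shows in the corner cases: a mandatory argument given as
{[} has its braces stripped when grabbed and is then mistaken for the
start of an optional argument.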

>> I'd missed this before as I'd not run makeindex on xparse. I've
>> corrected my batch file so this happens!
> 
> A similar issue: Today I pdflatex'ed source3.tex (rev. 1464), and it
> goes into an infinite loop at \end{document} (\seq_if_in:NnT being one
> thingie involved).

All seems okay to me on the SVN.

> So you store it away in a macro... Yes, one probably has to do something
> like that to handle this situation. Any particular reason
> \xparse_grab_m_aux:n includes an "m" in the name, though? If you're
> constantly redefining it, then why not use the same scratch control
> sequence for all grabbers that require something of this sort...

Not really.  I've altered this so that there is one function
\xparse_grab_aux:w which is defined by whatever grabber is currently
working.
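
The pattern, as a minimal sketch with hypothetical \demo_ names (not the
actual xparse grabbers, and assuming a current expl3): each grabber
defines the same scratch function with whatever delimiters it needs,
then calls it.

```latex
\documentclass{article}
\usepackage{xparse}% loads expl3
\ExplSyntaxOn
% Hypothetical names; only the pattern follows the text above.
\tl_new:N \l_demo_args_tl

% Each grabber (re)defines the one shared scratch function with its
% own delimiters and then calls it.
\cs_new_protected:Npn \demo_grab_bracket:w
  {
    \cs_set_protected:Npn \demo_grab_aux:w [ ##1 ]
      { \tl_put_right:Nn \l_demo_args_tl { {##1} } }
    \demo_grab_aux:w
  }
\cs_new_protected:Npn \demo_grab_paren:w
  {
    \cs_set_protected:Npn \demo_grab_aux:w ( ##1 )
      { \tl_put_right:Nn \l_demo_args_tl { {##1} } }
    \demo_grab_aux:w
  }

\cs_new_eq:NN \DemoBracket \demo_grab_bracket:w
\cs_new_eq:NN \DemoParen   \demo_grab_paren:w
\cs_new:Npn \DemoArgs { \tl_to_str:N \l_demo_args_tl }
\ExplSyntaxOff
\begin{document}
\DemoBracket[one]\DemoParen(two)%
\texttt{\DemoArgs}% typesets {one}{two}
\end{document}
```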

>>> 2. What happens (for the various argument types) when TeX fails to grab
>>> an argument? (Failures happen for long arguments when encountering a
>>> mismatched right brace, and for short argument also when encountering a
>>> \par token.) To keep in line with TeX, everything up to the ungrabbable
>>> token should disappear, but an uncareful implementation could leave
>>> other tokens in place as well, which would subject the user to
>>> mysterious additional errors.
>>
>> The basic idea of xparse is not mine: ask the more senior team members!
> 
> This is not about the ideas, but whether the implementation of these
> ideas behaves well under stress.

What I meant was that the code is essentially unchanged from what
already existed: I've tidied a bit, but the basic ideas are unchanged.

> Leaving something odd in a toks register shouldn't be a problem. The
> problem is leaving odd tokens in the input.
> 
>> then get back to business with grabbing any more arguments.
> 
> This I find implausible. Getting back to business after \futurelet'ing
> and not finding what was expected, yes, that's no problem, but getting
> back to business after failing to expand a macro is hard; everything
> from the macro to the token where TeX gave up will be gone (but that is
> usually the sane thing to do).

I realise I got this wrong. If you construct a short test file:

\documentclass{article}
\usepackage{xparse}
\DeclareDocumentCommand\foo{om}{}
\begin{document}
\foo [ % oops

More text
\end{document}

the argument runs away, of course, but only to the end of the paragraph.
At that stage, TeX abandons \xparse_grab_aux:w, which was defined by
\xparse_grab_D:w as

\cs_set:Npn \xparse_grab_aux:w [ #1 ] {
  \xparse_add_arg:n { {#1} }
  \xparse_grab_m_1:w \toks_use:N \l_xparse_args_toks
}

and so the other grabbers never get executed and "More text" is left
alone. That seems to be acceptable to me. Of course, if the arguments
are long then everything else gets gobbled, but there is not much that
can be done about that. The key point is that at the grab-an-argument
stage all of the rest of the grabbing business is inside
\xparse_grab_aux:w, with no other tokens about. So if \xparse_grab_aux:w
goes wrong, the rest of the grabbing is also terminated and TeX should
recover.
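
The same behaviour can be seen without xparse at all. The following is a
sketch with hypothetical names (\demo, \demo@grab, \demo@rest), using
plain \def so that the grabber is certainly short. Compiling it is meant
to stop with "Paragraph ended before \demo@grab was complete"; on
continuing, \demo@rest is never run, because it only exists inside the
body of the abandoned macro, and "More text" is typeset as normal.

```latex
\documentclass{article}
\begin{document}
\makeatletter
% Hypothetical names. \demo@grab is short (no \long), so its
% argument may not contain \par.
\def\demo@grab[#1]{(#1)\demo@rest}
% Stands in for "the rest of the grabbing business".
\def\demo@rest{ and the rest of the grabbing}
\newcommand*\demo{\demo@grab}
\makeatother

\demo [ % oops: no closing bracket

More text
\end{document}
```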
-- 
Joseph Wright
