On Fri, Aug 3, 2012 at 10:52 AM, Bruno Le Floch wrote:
> I promised to go back to you earlier but didn't, sorry about that.
> I'm replying to two emails in one, and the result is somewhat long,
> hopefully helpful.

I don't think it would have been nearly as helpful if it were any
shorter; thank you *very* much for this explanation! It's taken me a
few days to work through your code and build it into something I
almost understand.

I'm restarting the package with your comments in mind, especially the
l3docstrip bits. Meanwhile I'm experimenting with a standalone
version; see <https://github.com/jcsalomon/xpeek/blob/do-over/xpeek.tex>

>> The direction I’m considering is to read ahead, consuming tokens. Each
>> token read is added to a save-list and compared to the ignore-list. If
>> it’s on the ignore-list, continue; otherwise put the save-list back on
>> the input stream and stop.
>> Does this sound reasonable so far?
> Somewhat reasonable, yes.  I'm not sure what the best approach is.
> You need to collect the tokens in your ignore list, and you then need
> to perform an action depending on the next token. It is possible to
> define \xpeek_collect_do:nn, whose first argument is a list of tokens
> to ignore, whose second argument is some operation to perform, which
> will receive as an argument the tokens:
>     \xpeek_collect_do:nn { abc } { \foo \bar } caada
> =>
>     \foo \bar { caa } da

I've combined some of the parts you gave me (per your suggestion) into
something like this:

\tl_const:Nn \g_xpeek_ignorelist_tl  { .,;:!? }

\DeclareDocumentCommand \nextnonpunct {}
  {
    \xpeek_collect_do:nn \g_xpeek_ignorelist_tl
      { `\l_peek_token' \use:n }
  }

\cs_new_protected:Npn \xpeek_collect_do:nn #1#2
  { \__xpeek_collect_do:nnnn { #1 } { #2 } { } { } }

\cs_new_protected:Npn \__xpeek_collect_do:nnnn #1#2#3#4
  {
    \xpeek_after:nw  % peek at the next token; this sets \l_peek_token
      {
        \__xpeek_if_in:NNTF #1 \l_peek_token
          { \__xpeek_collect_do:nnnn {#1} {#2} { #3#4 } }
          { #2 { #3#4 } }
      }
  }
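
The membership test \__xpeek_if_in:NNTF is not shown above. A minimal
sketch of what it might look like, assuming the ignore list holds only
single character tokens and that comparing by meaning is acceptable;
the flag \l__xpeek_found_bool is a hypothetical name introduced for
this illustration:

\bool_new:N \l__xpeek_found_bool

% #1 = token list variable holding the ignore list
% #2 = token to test (here \l_peek_token)
% #3 = code to run on a match, #4 = code to run otherwise
\cs_new_protected:Npn \__xpeek_if_in:NNTF #1#2#3#4
  {
    \bool_set_false:N \l__xpeek_found_bool
    \tl_map_inline:Nn #1
      {
        % \l_peek_token is an implicit token, so compare meanings
        \token_if_eq_meaning:NNT #2 ##1
          {
            \bool_set_true:N \l__xpeek_found_bool
            \tl_map_break:
          }
      }
    % the chosen branch comes last so it can read further input
    \bool_if:NTF \l__xpeek_found_bool {#3} {#4}
  }

Since \tl_map_inline:Nn skips spaces, a test like this could not be
used to ignore space tokens.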

>>           {
>>             \tl_put_right:N? \l_jcs_ignored_tokens_tl
>>               { something involving \l_peek_token }
>>             keep looking, probably by recursing
> Yes, that's roughly what I'm doing.  I'm storing the tokens as macro
> arguments #3 and #4 of \@@_collect_do:nnnn, but that's not very
> sensible, storing in a token list is better.

Am I correct in understanding that #3 grows to become the list of
ignored tokens, and #4 is a list with the single next non-ignored
token? In other words, given

    The next non-punctuation mark is \nextnonpunct.,;:!xyz

eventually, this becomes

    `\l_peek_token' \use:n {.,;:!} {x}yz

with \l_peek_token pointing to the 'x'.

If that's correct, this would explicitly build up a named token-list:

\tl_new:N \l_xpeek_ignored_tokens_tl

\cs_new_protected:Npn \xpeek_collect_do:nn #1#2
  {
    \tl_clear:N \l_xpeek_ignored_tokens_tl
    \__xpeek_collect_do:nnn { #1 } { #2 } { }
  }

\cs_new_protected:Npn \__xpeek_collect_do:nnn #1#2#3
  {
    \xpeek_after:nw  % peek at the next token; this sets \l_peek_token
      {
        \__xpeek_if_in:NNTF #1 \l_peek_token
          {
            \tl_put_right:Nn \l_xpeek_ignored_tokens_tl {#3}
            \__xpeek_collect_do:nnn {#1} {#2}
          }
          { #2 { \l_xpeek_ignored_tokens_tl #3 } }
      }
  }

In a simple test, this seems to work, but I'm out of my depth enough
not to be sure whether the braces are needed in that last line, or
whether a \use:n might not be needed under some circumstances.

> At least for now, I think the \xpeek_collect_do:nn code I give above
> is (up to a few improvements) a reasonable approach to practical
> situations where someone wants to look ahead in the input stream.  So
> I'd say, provide \xpeek_collect_do:nn or a similar functionality as a
> public code-level function in your xpeek package.

Makes sense.

>> After some experimentation, it seems that the \peek_* family of
>> functions don't work well inside l3prg conditionals; source3.pdf seems
>> to bear this out in the justification for \__peek_def:nnnn.
> Indeed: consider
<example & explanation snipped>

Wow. I don't think I quite appreciated how many layers it takes to
make a macro expansion language behave like a structured one!

>> Is it reasonable to use \__peek_def:nnnn to generate something like
>> \peek_unconditional:TF? (The false-code branch should never execute, I
>> expect.)
> Definitely not.  \__peek_def:nnnn is internal, and may change at a
> whim.  We have been careful to mark internal functions as such, and
> make no guarantee whatsoever that they will remain.  The function you
> want is \peek_after:nw (see l3kernel-extras), and for now, you can use
> your own copy
>     \tl_new:N \l__xpeek_code_tl
>     \cs_new_protected:Npn \xpeek_after:nw #1
>       {
>         \tl_set:Nn \l__xpeek_code_tl {#1}
>         \peek_after:Nw \l__xpeek_code_tl
>       }

I'll include it for now, then move to \peek_after:nw when that version
becomes available in TeX Live. If it gets dropped later, I'll recover
\xpeek_after:nw from the repository history.
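
One way the switch-over might be handled without editing the package
later is to test for the kernel function at load time. This is only a
sketch, and it assumes that \peek_after:nw, once released, takes the
same single braced argument as the copy above:

\tl_new:N \l__xpeek_code_tl

\cs_if_exist:NTF \peek_after:nw
  {
    % kernel version available: make the package name an alias
    \cs_new_eq:NN \xpeek_after:nw \peek_after:nw
  }
  {
    % otherwise fall back to the local copy
    \cs_new_protected:Npn \xpeek_after:nw #1
      {
        \tl_set:Nn \l__xpeek_code_tl {#1}
        \peek_after:Nw \l__xpeek_code_tl
      }
  }

Either way the rest of the package only ever calls \xpeek_after:nw.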

>> Actually, it's \peek_unconditional_remove:T I think I need.
> I don't think you need that one since the token should be kept
> somewhere.  The copy \l_peek_token is not appropriate, since that
> control sequence will later be changed to the next token in the input
> stream.  Think of \l_peek_token as a pointer (that's almost not a
> lie), which TeX can unfortunately not dereference.

That's a helpful image. TeX might be easier to learn if I were versed
in Lisp, but I'm a C programmer at heart. "Pointer" has a nice,
friendly sound to it. :)

>> Does this sound like the correct path to head down?
> Yes.

Thank you for all your help.