LISTSERV - LATEX-L Archives - LISTSERV.UNI-HEIDELBERG.DE

Mailing list for the LaTeX3 project

LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

Date: Mon, 6 Aug 2012 20:29:02 -0400

Subject: Re: Peek ahead for next token not in token-list

From: "Joel C. Salomon"

On Fri, Aug 3, 2012 at 10:52 AM, Bruno Le Floch wrote:
> I promised to go back to you earlier but didn't, sorry about that.
> I'm replying to two emails in one, and the result is somewhat long,
> hopefully helpful.

I don't think it would have been nearly as helpful if it were any
shorter; thank you *very* much for this explanation! It's taken me a
few days to work through your code and build it into something I
almost understand.

I'm restarting the package with your comments in mind, especially the
l3docstrip bits. Meanwhile I'm experimenting with a standalone
version; see <https://github.com/jcsalomon/xpeek/blob/do-over/xpeek.tex>

>> The direction I’m considering is to read ahead, consuming tokens. Each
>> token read is added to a save-list and compared to the ignore-list. If
>> it’s on the ignore-list, continue; otherwise put the save-list back on
>> the input stream and stop.
>>
>> Does this sound reasonable so far?
>
> Somewhat reasonable, yes.  I'm not sure what the best approach is.
> You need to collect the tokens in your ignore list, and you then need
> to perform an action depending on the next token. It is possible to
> define \xpeek_collect_do:nn, whose first argument is a list of tokens
> to ignore, whose second argument is some operation to perform, which
> will receive as an argument the tokens:
>
>     \xpeek_collect_do:nn { abc } { \foo \bar } caada
>
> =>
>
>     \foo \bar { caa } da

I've combined some of the parts you gave me (per your suggestion) into
something like this:

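% Constant token list of punctuation to skip (c_ prefix, since it is
% declared with \tl_const:Nn).
\tl_const:Nn \c_xpeek_ignorelist_tl  { .,;:!? }

\DeclareDocumentCommand \nextnonpunct {}
  {
    \xpeek_collect_do:nn \c_xpeek_ignorelist_tl
      { `\l_peek_token' \use:n }
  }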

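% Public entry point: #1 = tokens to ignore, #2 = code to run afterwards.
\cs_new_protected:Npn \xpeek_collect_do:nn #1#2
  { \__xpeek_collect_do:nnnn { #1 } { #2 } { } { } }

% #3#4 = tokens collected so far.  The recursive call supplies only three
% arguments, so its #4 is grabbed from the input stream.
\cs_new_protected:Npn \__xpeek_collect_do:nnnn #1#2#3#4
  {
    \peek_after:nw
      {
        \__xpeek_if_in:NNTF #1 \l_peek_token
          {
            \__xpeek_collect_do:nnnn {#1} {#2} { #3#4 }
          }
          {
            #2 { #3#4 }
          }
      }
  }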
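
The helper \__xpeek_if_in:NNTF is used above but not defined in this
message.  A minimal sketch of what such a test could look like, assuming
it receives a token list variable and the peeked token and compares
character codes; the name \l__xpeek_found_bool and the whole definition
are illustrative assumptions, not the code from the thread:

\bool_new:N \l__xpeek_found_bool

% Hypothetical: #1 = tl variable of tokens to ignore, #2 = peeked token,
% #3 = true code, #4 = false code.
\cs_new_protected:Npn \__xpeek_if_in:NNTF #1#2#3#4
  {
    \bool_set_false:N \l__xpeek_found_bool
    \tl_map_inline:Nn #1
      {
        % Compare character codes only, ignoring category codes.
        \token_if_eq_charcode:NNT #2 ##1
          {
            \bool_set_true:N \l__xpeek_found_bool
            \tl_map_break:
          }
      }
    \bool_if:NTF \l__xpeek_found_bool {#3} {#4}
  }

Comparing character codes rather than meanings matters in this sketch:
under \ExplSyntaxOn the colon in the ignore list is read as a letter,
while a colon in document text normally has category code 12, so a
meaning test would not match it.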

>>           {
>>             \tl_put_right:N? \l_jcs_ignored_tokens_tl
>>               { something involving \l_peek_token }
>>             keep looking, probably by recursing
>
> Yes, that's roughly what I'm doing.  I'm storing the tokens as macro
> arguments #3 and #4 of \@@_collect_do:nnnn, but that's not very
> sensible; storing them in a token list is better.

Am I correct in understanding that #3 grows to become the list of
ignored tokens, and #4 is a list with the single next non-ignored
token? In other words, given

    The next non-punctuation mark is \nextnonpunct.,;:!xyz

eventually, this becomes

    `\l_peek_token' \use:n {.,;:!} {x}yz

with \l_peek_token pointing to the 'x'.

If that's correct, this would explicitly build up a named token-list:

% Accumulates the ignored tokens across the recursive calls.
\tl_new:N \l_xpeek_ignored_tokens_tl

\cs_new_protected:Npn \xpeek_collect_do:nn #1#2
  {
    \tl_clear:N \l_xpeek_ignored_tokens_tl
    \__xpeek_collect_do:nnn { #1 } { #2 } { }
  }

% #3 = token grabbed from the input stream on the previous step
% (empty on the first call).
\cs_new_protected:Npn \__xpeek_collect_do:nnn #1#2#3
  {
    \xpeek_after:nw
      {
        \__xpeek_if_in:NNTF #1 \l_peek_token
          {
            \tl_put_right:Nn \l_xpeek_ignored_tokens_tl {#3}
            \__xpeek_collect_do:nnn {#1} {#2}
          }
          {
            #2 { \l_xpeek_ignored_tokens_tl #3 }
          }
      }
  }

In a simple test this seems to work, but I'm far enough out of my depth
that I can't tell whether the braces in that last line are needed, or
whether a \use:n might be needed in some circumstances.

> At least for now, I think the \xpeek_collect_do:nn code I give above
> is (up to a few improvements) a reasonable approach to practical
> situations where someone wants to look ahead in the input stream.  So
> I'd say, provide \xpeek_collect_do:nn or a similar functionality as a
> public code-level function in your xpeek package.

Makes sense.

>> After some experimentation, it seems that the \peek_* family of
>> functions don't work well inside l3prg conditionals; source3.pdf seems
>> to bear this out in the justification for \__peek_def:nnnn.
>
> Indeed: consider
<example & explanation snipped>

Wow. I don't think I quite appreciated how many layers it takes to
make a macro expansion language behave like a structured one!

>> Is it reasonable to use \__peek_def:nnnn to generate something like
>> \peek_unconditional:TF? (The false-code branch should never execute, I
>> expect.)
>
> Definitely not.  \__peek_def:nnnn is internal, and may change at a
> whim.  We have been careful to mark internal functions as such, and
> make no guarantee whatsoever that they will remain.  The function you
> want is \peek_after:nw (see l3kernel-extras), and for now, you can use
> your own copy
>
>     \tl_new:N \l__xpeek_code_tl
>     \cs_new_protected:Npn \xpeek_after:nw #1
>       {
>         \tl_set:Nn \l__xpeek_code_tl {#1}
>         \peek_after:Nw \l__xpeek_code_tl
>       }

I'll include it for now, then move to \peek_after:nw when that version
becomes available in TeX Live. If it gets dropped later, I'll recover
\xpeek_after:nw from the repository history.
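
A small usage sketch of the \xpeek_after:nw copy quoted above, assuming
the definition is used as given; the command name \isbangnext and the
printed text are made up for illustration:

\documentclass{article}
\usepackage{expl3,xparse}
\ExplSyntaxOn
\tl_new:N \l__xpeek_code_tl
\cs_new_protected:Npn \xpeek_after:nw #1
  {
    \tl_set:Nn \l__xpeek_code_tl {#1}
    \peek_after:Nw \l__xpeek_code_tl
  }
% Hypothetical test command: reports whether the next token is "!".
% The peeked token is not removed from the input stream.
\DeclareDocumentCommand \isbangnext { }
  {
    \xpeek_after:nw
      {
        \token_if_eq_charcode:NNTF \l_peek_token !
          { [bang~follows] } { [no~bang] }
      }
  }
\ExplSyntaxOff
\begin{document}
\isbangnext! and \isbangnext?
\end{document}

Because the code is stored in a token list variable and run only after
\peek_after:Nw has set \l_peek_token, the conditional sees the token
that follows the command in the document; the "!" and "?" themselves
should still be typeset after the bracketed text.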

>> Actually, it's \peek_unconditional_remove:T I think I need.
>
> I don't think you need that one since the token should be kept
> somewhere.  The copy \l_peek_token is not appropriate, since that
> control sequence will later be changed to the next token in the input
> stream.  Think of \l_peek_token as a pointer (that's almost not a
> lie), which TeX can unfortunately not dereference.

That's a helpful image. TeX might be easier to learn if I were versed
in Lisp, but I'm a C programmer at heart. "Pointer" has a nice,
friendly sound to it. :)
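
A minimal sketch of the pointer image, assuming only the documented
\peek_after:Nw and \token_to_meaning:N; the names \showpeek and
\__xpeekdemo_show: are hypothetical:

\documentclass{article}
\usepackage{expl3,xparse}
\ExplSyntaxOn
% Print the meaning of the peeked token; the token itself is left
% in the input stream and is typeset afterwards as usual.
\cs_new_protected:Npn \__xpeekdemo_show:
  { [ \token_to_meaning:N \l_peek_token ] ~ }
\DeclareDocumentCommand \showpeek { }
  { \peek_after:Nw \__xpeekdemo_show: }
\ExplSyntaxOff
\begin{document}
\showpeek a \showpeek .
\end{document}

The first call should report "the letter a" and the second "the
character .": \l_peek_token records what the next token means, and each
new peek overwrites it, which is why a collected token has to be stored
somewhere else if it is needed later.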

>> Does this sound like the correct path to head down?
>
> Yes.

Thank you for all your help.

--Joel
