LATEX-L Archives

Mailing list for the LaTeX3 project


Bruno Le Floch <[log in to unmask]>
Reply To:
Mailing list for the LaTeX3 project <[log in to unmask]>
Tue, 16 Jul 2013 13:25:35 -0400
Hello Michiel,

It is indeed useful to have unique csnames generated automatically.
However, the need is sufficiently rare that it does not warrant adding
a new variant letter: this is expensive, since each function would
need a new variant, hence a new entry in TeX's hash table, and this
adds up pretty quickly to a very large number of csnames spent,
slowing TeX down.
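
To make the cost concrete, here is a minimal sketch of what generating
variants does.  The function \demo_use:Nn is made up for illustration
and is not part of any package.

    \ExplSyntaxOn
    % Hypothetical base function: apply a function to a braced argument.
    \cs_new:Npn \demo_use:Nn #1#2 { #1 {#2} }
    % Each requested variant is defined as a separate macro, hence takes
    % a separate entry in TeX's hash table: here \demo_use:cn and
    % \demo_use:Nx are created in addition to \demo_use:Nn.
    \cs_generate_variant:Nn \demo_use:Nn { cn , Nx }
    \ExplSyntaxOff

A new variant letter would add entries of this kind for every function
that ends up needing it.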

Furthermore, none of the variant types perform any assignment (with
the exception of 'x', which only modifies an internal LaTeX3 variable,
and within a group), hence we can rest safely assured that using
\exp_args:... (or the \cs_generate_variant mechanism) to expand an
argument will always yield the same result (barring two exceptions:
random numbers generated with \pdfuniformdeviate and the like, and
pretty elaborate code involving \csname and \ifcsname, as is done in
l3fp).  The proposed 'U' argument type would be altering the variable
appearing as the corresponding argument, which is quite contrary to
anything that any other argument type does.

> \tl_new:N \l__something_map_csname_tl
> \cs_new_protected:Nn \something_map_inline:Nn {
>     \cs_set:Upn \l__something_map_csname_tl ##1##2 {#2}
>     \something_map_function:Nc #1 \l__something_map_csname_tl
> }

I find it much more natural to provide a function which creates a
unique name and stores it in its argument, along the lines of

    \cs_new_protected:Nn \something_map_inline:Nn {
        \unique_csname:N \l__something_map_csname_tl
        \cs_set:cpn { \l__something_map_csname_tl } ##1##2 {#2}
        \something_map_function:Nc #1 { \l__something_map_csname_tl }
    }

This approach is good for cases where one wants a unique name which is
used over a long span of time.  I will focus on the case of defining a
map_inline function, but I am interested to see other situations where
such unique names are useful.  In the case of map_inline, and whenever
there is a need for short-lived unique identifiers, \unique_csname:N
will lead to memory leaks.  Indeed, \unique_csname:N has no way to
know when a given unique name is not used anymore.  This means that
every time the map_inline function is called, a new unique csname is
used.  TeX will run out of hash table space relatively fast if the
loop is called a few thousand times in the document (which, at least
for kernel loops, is not unreasonable).  Even worse, if we generate
names simply by incrementing a counter, given the algorithm used by
TeX to hash csnames, all of the unique names will hash to one of 30 or
so different values, leading to a vast number of hash collisions,
hence a slowdown.  I don't know how bad it is, but I believe it is not
to be neglected before adding such a \unique_csname:N function.
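
For reference, a naive \unique_csname:N of the kind discussed here could
look as follows.  This is only a sketch: the counter \g__unique_int and
the name pattern "__unique_<number>:w" are assumptions made for
illustration, not an existing interface.

    \ExplSyntaxOn
    % Hypothetical global counter, incremented once per requested name.
    \int_new:N \g__unique_int
    % Store a fresh name such as "__unique_17:w" in the tl variable #1.
    \cs_new_protected:Npn \unique_csname:N #1
      {
        \int_gincr:N \g__unique_int
        \tl_set:Nx #1 { __unique_ \int_use:N \g__unique_int :w }
      }
    \ExplSyntaxOff

Since the counter only ever goes up, every call ends up spending one
more csname once the name is used with \cs_set:cpn, which is exactly
the leak described above.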

The kernel mapping functions do it differently, by incrementing a
counter before constructing a csname and decrementing it after the
loop (actually, the kernel is not quite consistent there, but it works
and I don't yet have a very clean solution).  This could be wrapped
into something like

    \cs_new_protected:Nn \something_map_inline:Nn {
        \unique_csname:N \l__something_map_csname_tl
        \cs_set:cpn { \l__something_map_csname_tl } ##1##2 {#2}
        \something_map_function:Nc #1 { \l__something_map_csname_tl }
        \unique_free:V \l__something_map_csname_tl
    }

where we explicitly free, in the eyes of the \unique_... functions,
the unique identifier provided by \unique_csname:N.  This does not
leak memory anymore.  Alternatives which require less typing (and
don't require defining a tl variable \l__something_map_csname_tl) are

    \cs_new_protected:Nn \something_map_inline:Nn {
        \unique_csname:n {
            \cs_set:cpn {##1} ####1####2 {#2}
            \something_map_function:Nc #1 {##1}
        }
    }

or

    \cs_new_protected:Nn \something_map_inline:Nn {
        \unique_csname:n {
            \cs_set:cpn { \l_unique_tl } ##1##2 {#2}
            \something_map_function:Nc #1 { \l_unique_tl }
        }
    }

The first one is a bit confusing since # does not need doubling, but
## does become ####.  The second one is a bit risky since one can only
use \l_unique_tl before any other \unique_csname:n is called: here it
is safe because \cs_set:cpn does not use \unique_csname:n.
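
A possible implementation of the \l_unique_tl flavour, as a sketch only
(the counter \g__unique_depth_int and the name pattern are assumptions
for illustration), shows where the risk comes from:

    \ExplSyntaxOn
    % Hypothetical nesting counter: incremented on entry, decremented on exit.
    \int_new:N \g__unique_depth_int
    \tl_new:N \l_unique_tl
    \cs_new_protected:Npn \unique_csname:n #1
      {
        \int_gincr:N \g__unique_depth_int
        % The name depends only on the current nesting depth, so it is
        % reused once the surrounding call has finished.
        \tl_set:Nx \l_unique_tl
          { __unique_ \int_use:N \g__unique_depth_int :w }
        #1
        % Reclaim the name: these tokens sit after the user code.
        \int_gdecr:N \g__unique_depth_int
      }
    \ExplSyntaxOff

If the code #1 itself calls \unique_csname:n, then \l_unique_tl is
overwritten with the inner name and is not restored afterwards, so it
must be used before any such nested call.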

However, in the context of defining mapping functions, none of the
above are good enough.  None of them let us provide
\something_map_break:n {<foobar>} that safely breaks out of both
map_function:NN and of map_inline:Nn without leaving any token after
<foobar> in the input stream.  Indeed, the map_break:n breaks out of
the map_function, but there remain tokens left by \unique_csname:n to
reclaim the looping function name as free.  This is why the kernel
map_inline functions do not call the corresponding map_function.  The two user
functions are typically implemented in terms of a common auxiliary,
with a sentinel (\__prg_break_point:Nn IIRC) whose n argument stands
for some code which cleans up the loop (and reclaims the used unique
name) and is inserted before the argument <foobar> of map_break:n.  I
don't know how to give this a nice wrapper.
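
To illustrate the break point mechanism, here is a minimal sketch
modelled on how the kernel mapping functions are written.  Several
things in it are assumptions: it uses \prg_break_point:Nn and
\prg_map_break:Nn, which as far as I know are the public names of
these functions in more recent l3kernel releases; it pretends that a
"something" is stored as a plain token list so that
\tl_map_function:NN can stand in for the common auxiliary; and the
inline code takes one argument rather than two.

    \ExplSyntaxOn
    % Hypothetical nesting counter used to build the loop function name.
    \int_new:N \g__something_map_int
    \cs_new_protected:Npn \something_map_inline:Nn #1#2
      {
        \int_gincr:N \g__something_map_int
        \cs_gset_protected:cpn
          { __something_map_ \int_use:N \g__something_map_int :w } ##1 {#2}
        \exp_args:NNc \tl_map_function:NN #1
          { __something_map_ \int_use:N \g__something_map_int :w }
        % Sentinel: its n argument is the clean-up code, run both when
        % the loop ends normally and when it is left through a break.
        \prg_break_point:Nn \something_map_break:
          { \int_gdecr:N \g__something_map_int }
      }
    % Breaking skips to the sentinel, runs the clean-up code there, and
    % only then inserts the user's tokens into the input stream.
    \cs_new:Npn \something_map_break:
      { \prg_map_break:Nn \something_map_break: { } }
    \cs_new:Npn \something_map_break:n
      { \prg_map_break:Nn \something_map_break: }
    \ExplSyntaxOff

With this layout, \something_map_break:n {<foobar>} leaves nothing
after <foobar>, because the counter is decremented by the sentinel
rather than by tokens that follow the loop.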

In fact, perhaps the right approach is that I revive some ideas I had
about objects.  Mapping a function then amounts to repeatedly popping
the first item and acting with the function.  I roughly see how to
write a wrapper for that, which would mean that the package writer
would not need to worry about defining the mapping function: rather he
would define a \something_get:N or \something_pop:NN function, and the
framework would do the work of defining map_function and map_inline.
This is a longer term idea.
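
As a very rough sketch of the "map by repeated popping" idea, using a
seq as a stand-in for the object (all \demo_... and \l__demo_... names
are made up for illustration, and the real framework would presumably
look different):

    \ExplSyntaxOn
    \seq_new:N \l__demo_copy_seq
    \tl_new:N \l__demo_item_tl
    % Apply the function #2 to each item of the seq #1 by popping items
    % from a local copy until the copy is empty.
    \cs_new_protected:Npn \demo_map_function_by_pop:NN #1#2
      {
        \seq_set_eq:NN \l__demo_copy_seq #1
        \bool_until_do:nn { \seq_if_empty_p:N \l__demo_copy_seq }
          {
            \seq_pop_left:NN \l__demo_copy_seq \l__demo_item_tl
            \exp_args:NV #2 \l__demo_item_tl
          }
      }
    \ExplSyntaxOff

This version is neither expandable nor nestable, since the copy is held
in a single scratch variable; it only shows that a pop function is
enough to drive the loop.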

In any case, most of the above concerns specifically the application
of your idea to defining map_inline.  If you have more examples where
unique identifiers are useful, it would be very helpful to give them,
so as to either find alternatives or get a better view of what would
be needed.

> (

Can you give a short example of use of that package? I'm curious, and
slightly too lazy to figure out the code. :-)

More on your other point in a separate thread.