LISTSERV - LATEX-L Archives - LISTSERV.UNI-HEIDELBERG.DE

On Feb 6, 1:29 am, Philipp Stephani wrote:
> I can't say a lot here because I'm "tainted" by TeX programming.  I can
> only tell you a bit about my experience with source3.pdf.  The following
> list is in completely random order and contains some of my suggestions
> for expl3 and the LaTeX 3 documentation:

Hello Philipp,

First, thanks very much for taking the time to provide this list of
useful suggestions: it is really helpful. I'll start with a few general
comments before looking at the issues you raise point by point (or at
least a number of them).

As you'll be aware, expl3 has been developed over a long period. That
means that some of the material carries historical "baggage", and that
the documentation is not all written in the same way. Last year's
refactor dealt with a lot of the problems that were present in the
code at the time, but of course there are those that remain. There is
also the fact that getting things written requires that there is a
driving force. So the modules that get written depend on what the team
want/need to do. At least part of this is because it becomes apparent
what is needed when you need it yourself.

There are also time issues. The amount of "man-hour" effort available
is limited, and there are several things to try to do. We could spend
for ever trying to improve expl3, but without other stuff as well this
won't get us very far. It's an iterative process, to some extent.

Another general question is who is best to write the documentation.
Anyone who feels they can contribute can send updates to the team: I'm
sure we'll add useful material to the sources if we have it!

(One or two points re-ordered to help the flow.)

> - First and foremost, expl3 is already very useful and well-documented,
>   at least in comparison with LaTeX 2e.

That is the plan :-)

> - Probably you should give a rigorous definition and many examples for
>   each of the specifiers.  For example, every LaTeX programmer must gain
>   deep knowledge about expansion, and it should be made clear that both
>   macros and token lists can be expanded, while all other data types
>   can't.
>
> - In general, the documentation should be more lengthy and rigorous.
>   Nearly all macro descriptions leave a lot of open questions.  For
>   example, consider \muskip_use:N.  It is clear from the naming scheme
>   that the only argument must be a single token.  The description just
>   says: "This function returns the length value kept in ⟨muskip⟩ in a
>   way suitable for further processing."  Now what does "function" and
>   "return" mean?  What will happen if <muskip> is a character token or a
>   control sequence token that does not refer to a muskip register.  What
>   is the "further processing" the description talks about?  What does
>   "suitable" mean?  Is the macro expandable?—We have to look at the
>   implementation to find out that \muskip_use:N is in fact an alias for
>   \the and thus expands to the string representation of any register.
>   If we would only guess by the name \muskip_use, then we would rather
>   assume that this macro inserts a math glue item into the current list.

I assume that in the first point you mean that a token list can
contain both expanded and unexpanded material, whereas something like
an integer expands whatever you pass it until it finds something
unexpandable and of the correct form.

I think that it's clear that we need a "Programming in LaTeX3" guide
that starts from assuming only a working knowledge of LaTeX as a user,
rather than the current situation where you need to know things like
expansion already. This needs some real thought, and at the moment I'm
not sure who is best placed to do it!

We could probably do with explanation of some of the general
conventions, such as that
\<var>_use functions are accessors, which are replaced expandably by
the content of the variable concerned.
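
As a minimal sketch of that accessor convention (assuming a current
expl3 release; the names \l_my_muskip and \l_my_result_tl are made up
for illustration), the following shows \muskip_use:N being replaced
by the textual value of the register inside an x-type expansion:

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% Hypothetical variable names, used only for this illustration.
\muskip_new:N \l_my_muskip
\muskip_set:Nn \l_my_muskip { 3mu plus 1mu }
\tl_new:N \l_my_result_tl
% Inside x-type expansion the accessor is replaced by the register's
% value as a string of characters; no glue is inserted anywhere.
\tl_set:Nx \l_my_result_tl { \muskip_use:N \l_my_muskip }
% Write the stored characters to the terminal and log.
\iow_term:x { stored:~ \tl_to_str:N \l_my_result_tl }
\ExplSyntaxOff
\begin{document}
\end{document}
```

The point is only that the accessor expands to the content; it does
not typeset anything by itself.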

We did add something to expl3.pdf about the fact that "return" is used
in places as it is convenient, although TeX as an expansion language
does not actually return things. Again, I suspect a proper re-write of
all of the documentation is going to be needed to catch all of these
problems.

> - Over all, the documentation should be made clearer and more verbose:
>   currently I find it easier to look up the definition in many
>   instances.

A lot of the documentation has been written very much ad hoc, as the
code has developed. Coupled to this, the historical issues also
feature (different people, different stages of development, etc.). It
would not be a bad idea to go back over the documentation and try to
get it into some kind of consistent order. However, this may well have
to wait a bit. The current l3doc is an improvement on ltxdoc, but a
better class is likely to be needed in the longer term. How the
documentation is written at the source level is also important. There
is still too much mark up for appearance rather than for logical
structure. The problem here is of course that this will all take time.

> - The documentation sometimes contradicts the implementation: for
>   example, \tl_if_empty:nTF is listed as unexpandable, and \exp_arg:x is
>   documented, but not implemented.

Listing expandable functions was not done to start with, and so I'm
sure this is not the only missing item. (Again, a proper re-write
would probably help here.) On \exp_arg:x, it got taken out of the main
part of expl3 recently (it was supposed to be a wrapper for \expanded,
but as pdfTeX 1.50 does not seem likely as a release we've dropped
it.) I thought we'd removed it from the docs: can you point it out to
me in the current CTAN snapshot?

> - What is the conceptual difference between token lists and token
>   registers?  In what cases should I use which one?  

At the implementation level, a token list is a macro, whereas a token
register is, well, a token register. Most of the time, token lists are
more convenient as they don't need an access function. There are only
a few places where a token register is needed:
  - When the stored material might contain # tokens
  - When you want to have exactly one expansion inside an x scenario
That is of course the same as any other TeX programming, but we should
I guess discuss this somewhere.
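
A small sketch of the second case, written with TeX primitives and
\newtoks rather than any particular expl3 interface (the names
\inner, \viaMacro, \mytoks, \fromMacro and \fromToks are invented
here):

```latex
\documentclass{article}
\newtoks\mytoks                 % hypothetical register name
\def\inner{expanded text}
\def\viaMacro{\inner}           % the same content stored in a macro
\mytoks{\inner}                 % ... and in a token register
% \edef expands a macro all the way down:
\edef\fromMacro{\viaMacro}
% \the on a token register yields its contents, which are then
% not expanded any further inside \edef:
\edef\fromToks{\the\mytoks}
\typeout{\meaning\fromMacro}    % shows the fully expanded text
\typeout{\meaning\fromToks}     % shows \inner left unexpanded
\begin{document}
\end{document}
```

With a macro you get everything expanded; with the register you get
exactly the stored tokens.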

> Same questions for
> integers/numbers and comma lists/sequences.

The num data type is one of the historical things, which I've also
noted on the team list and asked about. I think we sort-of agreed to
remove it but got somewhat stalled: I'll try again.

On comma lists versus sequences, comma list handling is needed as it goes
up to the user level (commas are convenient for lists). On the other
hand, sequences use an internal marker to separate items, so can
contain just about any input without confusion about the list
boundaries. For example, if I do:

\clist_new:N \l_my_clist
\clist_put_right:Nn \l_my_clist { a , b , c }

the clist now contains three entries, which might not have been the
intention. However, with a sequence

\seq_new:N \l_my_seq
\seq_put_right:Nn \l_my_seq { a , b , c }

only puts one item onto the sequence. (That said, I've used comma
lists almost exclusively to date: see the next item on your list.)
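
To make the difference visible, here is a minimal complete file that
counts the items in each case. It is only a sketch and assumes a
current expl3 release in which \clist_count:N, \seq_count:N and
\iow_term:x are available:

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
\clist_new:N \l_my_clist
\seq_new:N \l_my_seq
% The commas split the input into separate clist entries ...
\clist_put_right:Nn \l_my_clist { a , b , c }
% ... whereas the sequence stores the whole argument as one item.
\seq_put_right:Nn \l_my_seq { a , b , c }
\iow_term:x { clist~items:~ \clist_count:N \l_my_clist } % 3
\iow_term:x { seq~items:~ \seq_count:N \l_my_seq }       % 1
\ExplSyntaxOff
\begin{document}
\end{document}
```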

> - Why is there no separate stack/queue data type?

That is what sequence stacks are for: expl3.pdf, p. 12:

"l3seq This implements data-types such as queues and stacks"

This is where I have used the seq data type rather than using comma
lists: when wanting to set up a stack.
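
A short sketch of a sequence used as a stack, assuming the
\seq_push:Nn and \seq_pop:NN functions of a current expl3 release
(the variable names are invented for the example):

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% Hypothetical variable names, used only for this illustration.
\seq_new:N \l_my_stack_seq
\tl_new:N \l_my_top_tl
\seq_push:Nn \l_my_stack_seq { first }
\seq_push:Nn \l_my_stack_seq { second }
% Pop removes the most recently pushed item and stores it in the tl.
\seq_pop:NN \l_my_stack_seq \l_my_top_tl
\iow_term:x { popped:~ \l_my_top_tl } % second
\ExplSyntaxOff
\begin{document}
\end{document}
```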

> - Why are there no expandable stack peek operations?

I'm not sure quite how you'd work this. For xparse, I took some ideas
from etextools for doing an expandable test for optional arguments.
However, this is not really that robust (as I hope the xparse
documentation makes clear), and so is not supposed to be encouraged.
TeX only gives us \futurelet, and that is not expandable.

> - Why only a global undefine macro?
>
> - Sometimes the "new" macros are local (\cs_new), sometimes global
>   (\tl_new).

These two points go together: I'll address the second one first :-)
\tl_new is consistent with all of the other \<var>_new functions, in
that it works globally. (I only realised this a few weeks ago while
testing something out.) So the question is why does \cs_new not work
globally while \cs_gundefine is global. I'm not sure I have a good
answer to this: I'd probably prefer names to be "taken" globally
irrespective of whether they are variables or functions.

> - In general, the behavior of "new" and "set" macros regarding globality
>   and error checking should be consistent.

I'd noticed this too (a couple of days ago, as it happens): we should
look at this.

> - Advanced string functions (e.g. splitting strings by arbitrary
>   delimiters) are missing.
>
> - Perhaps a "string" datatype that only contains category-12 tokens.

We know that :-) At present, there are two pretty obvious gaps in
expl3:

 - strings (something like xstrings or stringstrings or ...)
 - floating points (something like fp or pgfmath or ...)

Both of these are on the "to do" list, but this depends on someone
doing it. I've got reasons for being interested in implementing at
least part of "l3fp" sooner rather than later, but at present I'm not
sure who will take the strings issue (Will Robertson did mention it at
one point, but he's currently rather busy with other things). Probably
some discussion is needed about what to implement, but we certainly
need something.

> - l3io is too low-level.  \io_new should only check whether the control
>   sequence already exists, and the allocation should be done by
>   \io_open.  As long as we have only 16 streams, the allocation should
>   not use the plain TeX allocator, but a "heap allocator" with a list of
>   free streams instead.  Closing a stream should deallocate the stream
>   handle.  This is the normal behavior in all programming languages.  On
>   the contrary, the current implementation forces everybody to
>   preallocate stream handles.

I wrote l3io as we had nothing at all and I needed some functions
available: it's therefore not had too much reviewing just yet. I'm
very much learning these things (I have no formal programming
background) so I make mistakes, I'm afraid, and just went with
essentially a re-code of the latex.ltx material in this area. Your
suggestion is pretty sensible, and as long as the rest of the team
look happy with this I'd hope a re-write can be arranged. Feel free to
contribute more ideas [or even code :-)].
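
To have something concrete to discuss, the free-list idea could be
sketched roughly as below. This is not l3io code: every name in it
(\g_my_free_streams_seq, \my_stream_alloc:N, \my_stream_free:N, the
"mystreams" message) is made up, it assumes a current expl3 release,
and it only does the bookkeeping of stream numbers 0 to 15 without
opening or closing anything:

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% Free list holding the 16 stream numbers 0..15.
\seq_new:N \g_my_free_streams_seq
\int_step_inline:nnn { 0 } { 15 }
  { \seq_gput_right:Nn \g_my_free_streams_seq {#1} }
\msg_new:nnn { mystreams } { none-free }
  { No~free~stream~handles~left. }
% Allocate: take a number off the free list and store it in tl #1.
\cs_new_protected:Npn \my_stream_alloc:N #1
  {
    \seq_gpop:NNF \g_my_free_streams_seq #1
      { \msg_error:nn { mystreams } { none-free } }
  }
% Deallocate: put the number held in tl #1 back on the free list.
\cs_new_protected:Npn \my_stream_free:N #1
  { \seq_gput_left:NV \g_my_free_streams_seq #1 }
\tl_new:N \l_my_handle_tl
\my_stream_alloc:N \l_my_handle_tl % takes stream number 0
\my_stream_free:N \l_my_handle_tl  % returns it to the free list
\ExplSyntaxOff
\begin{document}
\end{document}
```

An "open" function would call the allocator and a "close" function
the deallocator, so nobody would need to preallocate handles.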

> - A file stack should be implemented so that there is always information
>   about the file currently being read.

At some point, yes, but at the moment this is one of the "where to put
it" bits. We'll need some kind of higher-level loading mechanism in
any case, so it might go there. (l3io is always going to be pretty
low-level I suspect.)

> - GetIdInfo requires a certain version control system and should thus be
>   removed sooner or later.

Not sure about this. The Team uses SVN for the code, and I doubt this
will change, and so we need something to turn the $Id data into
something to put into the output. What would you suggest as an
alternative? (\GetIdInfo works with both SVN and CVS: do other version
control systems use the $Id line but with differing syntaxes?)

> - The specifier "d" is used, but not documented.

Should be removed totally: can you point out where this is? (We did
our best but may have missed something.)

> - Sometimes the examples contradict the naming scheme
>   (\g_file_name_stack should be \g_file_name_stack_seq).

Sounds like one of the historical things I mentioned.

> - There are no functions to convert between a bool variable and its
>   string representation.

I wonder where this would be used. Do you mean something like
\bool_display:N (to work a bit like \prop_display:N), or for
typesetting the result (and if so, when would this be used)?

> - expl3 should load the etex package, otherwise you will soon run out of
>   registers.

Done in the SVN.

> - There should exist a message class between "error" and "fatal" that
>   stops reading the current input file, but does not stop the LaTeX run
>   completely.

We recently re-worked some of the message handling stuff, and things
did change a bit here. Apart from fatal errors we moved to messages
sticking purely to that: giving a message. So any file-loading issues
should be addressed separately (indeed, should we even have fatal
messages? We should have thought about this before).

> - There is a macro \int_const, but no corresponding \tl_const.

Again true and a bit odd, and quite easy to fix. We should probably do
this :-)

> - A macro that tests whether a token list contains exactly one token.

I can see where this is probably going, but can you give a scenario to
use this in? (It's always handy to see what people want.)

> - As far as I can see, expl3 only adds names with the new naming scheme
>   and cannot break anything.  So perhaps you might already include
>   expl3 in the kernel at this early stage, and write a document
>   analogous to clsguide.  This could boost LaTeX 3 usage.

Certainly worth discussing further.

So that interested people see this, I've CC'd the LaTeX-L list with
this reply. It's very useful to get this feedback, and if you are
interested it would be nice to get some more input on the
documentation side of things.

I've pointed the team at this thread and hope to start addressing the
points raised. A few might get a bit of internal team discussion
before anything public, but my suggestion would be to tackle them one
at a time on LaTeX-L. I think we can address the code-side questions
quite quickly, even if the documentation side needs more work.
--
Joseph Wright