Hello all,
One of the issues raised recently on c.t.t concerning the
currently available LaTeX3 modules was the lack of "strings"
functionality. Looking on CTAN, I can find a number of packages
providing some string-like functionality:
- substr
- coolstr
- stringstrings
- xstring
(plus some functions in other packages). I'm sure there are also others.
Taking a look through them, I can find some similarities but also a
number of differences. Before trying to create some kind of "l3str"
package, I thought it might be useful to see what the feeling is about
(a) whether we need this at all, (b) what constitutes a string, (c) what
functions are needed, and (d) anything else!
On (a), my feeling is that some tools of this kind are needed, given the
clear desire to have them (see my list above of current packages).
However, perhaps others disagree, as l3tl already provides a number of
useful features.
The first "big" question is what exactly is a string in a TeX context.
If you look at the existing packages, they take differing approaches to
handling items inside what they call strings. For example, some would
consider "ab{cde}f" to be a string of four items: "a", "b", "cde" and
"f", whereas other approaches would remove the "{" and "}" tokens. An
obvious suggestion is that a string is something which has been
\detokenize'd, but then you have to handle things like:
\tl_new:N \l_my_tl
\tl_set:Nn \l_my_tl { abc }
\str_new:N \l_my_str
\str_set:Nn \l_my_str { \l_my_tl def }
Here, do we allow an "x" variant so that \l_my_str ends up as "abcdef"
(I think so, but what do others consider sensible)? (I am of course
imagining the functions used here!)
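To make the question concrete, here is a rough sketch of the two
behaviours using only token list functions (\tl_set:Nx and
\tl_to_str:n) rather than the imagined \str functions. The names
\l_my_n_tl and \l_my_x_tl are invented for this illustration, and this
is only one possible reading of what an "x" variant might do.

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
\tl_new:N \l_my_tl
\tl_set:Nn \l_my_tl { abc }
% Hypothetical variables standing in for the imagined string variables.
\tl_new:N \l_my_n_tl
\tl_new:N \l_my_x_tl
% "n"-like behaviour: detokenize the input as given, so the name of
% \l_my_tl itself ends up in the result, not its content.
\tl_set:Nx \l_my_n_tl { \tl_to_str:n { \l_my_tl def } }
% "x"-like behaviour: expand first, then detokenize the result,
% which leaves the six characters abcdef.
\tl_set:Nx \l_my_x_tl { \l_my_tl def }
\tl_set:Nx \l_my_x_tl { \tl_to_str:N \l_my_x_tl }
% Display both results in the terminal and log.
\tl_show:N \l_my_n_tl
\tl_show:N \l_my_x_tl
\ExplSyntaxOff
\begin{document}
\end{document}
```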
You also have to worry about what happens with special characters (for
example, how do you get % into a string?). If you escape things at the
input stage [say \% => % (catcode 12)] then a simple \detokenize will
not work, as it would leave the backslash of \% in place.
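The difference can be seen in a small sketch that uses existing
functions only. \tl_to_str:n keeps the backslash of \%, whereas
\cs_to_str:N drops the escape character and leaves a lone % of category
code 12. The variable names are hypothetical, and using \cs_to_str:N
here is one known trick rather than a proposed interface.

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% Hypothetical variable names, for illustration only.
\tl_new:N \l_my_escaped_tl
\tl_new:N \l_my_percent_tl
% Plain detokenization: the backslash of \% survives in the result.
\tl_set:Nx \l_my_escaped_tl { \tl_to_str:n { 50 \% } }
% \cs_to_str:N removes the escape character, leaving a catcode-12 %.
\tl_set:Nx \l_my_percent_tl { 50 \cs_to_str:N \% }
% Display both results in the terminal and log.
\tl_show:N \l_my_escaped_tl
\tl_show:N \l_my_percent_tl
\ExplSyntaxOff
\begin{document}
\end{document}
```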
On features, things that seem to be popular:
- Substring functions such as "x characters from one end", "first x
characters", etc.
- Search functions such as "where is string x in string y".
- Case-changing functions.
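For reference, here is how these three kinds of operation look with
tools that exist today: the first three lines use macros from the
xstring package listed above, and the last uses the LaTeX kernel's
\MakeUppercase. This is only meant to illustrate the kind of
functionality under discussion, not to suggest an interface for a
LaTeX3 module.

```latex
\documentclass{article}
\usepackage{xstring}
\begin{document}
% Substring: the first three characters.
\StrLeft{abcdef}{3}\par   % typesets abc
% Substring: the last two characters.
\StrRight{abcdef}{2}\par  % typesets ef
% Search: position of the first occurrence of "cd" in "abcdef".
\StrPos{abcdef}{cd}\par   % typesets 3
% Case changing with the LaTeX kernel command.
\MakeUppercase{abcdef}    % typesets ABCDEF
\end{document}
```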
What is sensible does not necessarily mean everything that is currently
available (as I say, some things are handled nicely in l3tl). What are
the priorities for others? Examples of the contexts in which things might
be used would be helpful, as these may well guide the overall discussion.
--
Joseph Wright