LATEX-L Archives

Mailing list for the LaTeX3 project

LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

Date: Wed, 10 Feb 2010 08:51:04 +0000
From: Joseph Wright
Reply-To: Mailing list for the LaTeX3 project

Hello all,

One of the points raised recently on c.t.t (comp.text.tex) about the
currently available LaTeX3 modules was the lack of "strings"
functionality. Looking on CTAN, I can find a number of packages
providing some string-like functionality:

  - substr
  - coolstr
  - stringstrings
  - xstring

(plus some functions in other packages). I'm sure there are also others.

Taking a look through them, I can find some similarities but also a
number of differences. Before trying to create some kind of "l3str"
package, I thought it might be useful to see what the feeling is about
(a) whether we need this at all, (b) what constitutes a string, (c) what
functions are needed and (d) anything else!

On (a), my feeling is that some kind of tool is needed, given the clear
desire to have one (see my list above of current packages). However,
perhaps others disagree, as l3tl does provide a number of useful features
already.

The first "big" question is what exactly is a string in a TeX context. 
If you look at the existing packages, they take differing approaches to 
handling items inside what they call strings. For example, some would 
consider "ab{cde}f" to be a string of four items: "a", "b", "cde" and 
"f", whereas other approaches would remove the "{" and "}" tokens. An 
obvious suggestion is that a string is something which has been 
\detokenize'd, but then you have to handle things like:

\tl_new:N \l_my_tl
\tl_set:Nn \l_my_tl { abc }
\str_new:N \l_my_str
\str_set:Nn \l_my_str { \l_my_tl def }

Here, do we allow an "x" variant so that \l_my_str ends up as "abcdef" 
(I think so, but what do others consider sensible)? (I am of course 
imagining the functions used here!)
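
As a sketch of the two points above (what counts as an "item", and
whether the argument is expanded before being stored), here is how
things look with the string functions found in present-day expl3. That
module postdates this message, so treat the function names as an
assumption about a reasonably recent LaTeX kernel rather than as
anything settled in this discussion. The variable names
\l_my_n_str and \l_my_x_str are made up for the illustration.

```latex
\documentclass{article}
\usepackage{expl3}
\begin{document}
\ExplSyntaxOn
% Token-list view: a brace group counts as one item (a, b, {cde}, f).
\iow_term:x { \tl_count:n { ab{cde}f } }   % writes 4
% Detokenized view: every character counts, including the two braces.
\iow_term:x { \str_count:n { ab{cde}f } }  % writes 8

\tl_new:N \l_my_tl
\tl_set:Nn \l_my_tl { abc }
\str_new:N \l_my_n_str
\str_new:N \l_my_x_str
% n-type: no expansion, so the name of \l_my_tl is stored as characters.
\str_set:Nn \l_my_n_str { \l_my_tl def }
% x-type: the argument is fully expanded first, then turned into a string.
\str_set:Nx \l_my_x_str { \l_my_tl def }
\iow_term:x { \l_my_n_str }  % the variable name followed by "def"
\iow_term:x { \l_my_x_str }  % writes abcdef
\ExplSyntaxOff
\end{document}
```

The results go to the terminal and log rather than the page, since
typesetting catcode-12 backslashes and underscores depends on the font
encoding.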

You also have to worry about what happens with special characters (for
example, how do you get % into a string?). If you escape things at the
input stage [say \% => % (catcode 12)] then a simple \detokenize will
not work.
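
To make the special-character problem concrete, the following sketch
contrasts plain detokenization of \% with one possible workaround:
building the string under x-type expansion from a constant that holds a
catcode-12 percent sign. It assumes a recent expl3 in which
\c_percent_str and \str_set:Nx exist; the variable name
\l_my_percent_str is invented for the example, and this is only one way
the escaping question could be answered.

```latex
\documentclass{article}
\usepackage{expl3}
\begin{document}
\ExplSyntaxOn
% Plain detokenization keeps the escape: the result is the two
% characters backslash and percent, not a single percent sign.
\iow_term:x { \str_count:n { \% } }  % writes 2

% Workaround sketch: expand a constant holding a catcode-12 percent.
\str_new:N \l_my_percent_str
\str_set:Nx \l_my_percent_str { 100 \c_percent_str }
\iow_term:x { \str_count:N \l_my_percent_str }  % writes 4
\iow_term:x { \l_my_percent_str }
\ExplSyntaxOff
\end{document}
```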

On features, things that seem to be popular:
  - Substring functions such as "x characters from one end", "first x 
characters", etc.
  - Search functions such as "where is string x in string y".
  - Case-changing functions.
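
For a feel of what such functions might look like in use, here is a
minimal sketch using names from the string module in present-day expl3
(again an assumption about a recent kernel, not something that existed
when this was written). Note that the search shown only tests whether
one string occurs in another; it does not report a position.

```latex
\documentclass{article}
\usepackage{expl3}
\begin{document}
\ExplSyntaxOn
% "First x characters": characters 1 to 3.
\iow_term:x { \str_range:nnn { abcdef } { 1 } { 3 } }    % writes abc
% "x characters from one end": negative indices count from the right.
\iow_term:x { \str_range:nnn { abcdef } { -2 } { -1 } }  % writes ef
% Search: is "cd" somewhere inside "abcdef"?
\str_if_in:nnTF { abcdef } { cd }
  { \iow_term:n { found } }
  { \iow_term:n { not~found } }
% Case changing.
\iow_term:x { \str_uppercase:n { abcdef } }              % writes ABCDEF
\ExplSyntaxOff
\end{document}
```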

What is sensible does not necessarily mean everything that is currently
available (as I say, some things are handled nicely in l3tl). What are
the priorities for others? Examples of the contexts in which things might
be used would be helpful, as these may well guide the overall discussion.
--
Joseph Wright
