> I wonder if \regex_set:Nn would be better as \regex_save:Nn. My
> reasoning is that \<var>_set:Nn functions are used with a variable of
> name \l_..._<var>, while here we are (currently) naming as \l_..._tl.

Thinking about this, I feel \regex_const:Nn would make the most sense,
since a regular expression would typically not be a "dynamic" variable.

> I'm not sure about the approach on submatches. You say
>
> % Submatches with numbers higher than $10$ are accessed in the same way,
> % namely |\10|, |\11|, \emph{etc}. To insert in the replacement text
> % a submatch followed by a digit, the digit must be entered using the
> % |\x| escape sequence: for instance, to get the first submatch followed
> % by the digit $7$, use |\1\x37|, because $7$ has character code |37|
> % (in hexadecimal).
>
> I wonder how likely it is that we'll need more than 9 submatches in the
> sort of scenario that l3regex is likely to be applied in. TeX programmers
> are already used to the idea that we have up to 9 numbered parameters,
> so why not limit to nine submatches and avoid the need to use "\x" syntax?

I agree that \x37 is pretty awkward. But unlike macro arguments, the
number of submatches can grow pretty quickly, because every capturing
group counts, whether or not its contents are wanted. One case where
there can be more than 9 submatches is recognizing a date:

    \regex_const:Nn \c_date_regex
      { ((Jan)(uary)?|(Feb)(ruary)?|...) \ (\d\d?) }

Then "\2\4\6\8\10\12\14\16\18\20\22\24" gives the first three letters of
the month that was found, and "\26" is the day of the month.

I guess that would be done better with

    \regex_const:Nn \c_date_regex
      { (Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|...) \ (\d\d?) }

then extracting the first three letters of \1 for the month, and \2 now
holds the day. Note that I had to use non-capturing groups (?: ... ),
otherwise the last submatch would be \14, again too large.

The best for that particular example may be that I finally implement
(?| ... ) groups, namely, non-capturing groups where the submatch number
is reset for each alternative:

    \regex_const:Nn \c_date_regex
      { ( (?|Jan(uary)?|Feb(ruary)?|...) ) \ (\d\d?) }

Then the interesting submatches are \1 and \3.

All in all, I don't know what's best. Perhaps provide a \g{...} (or
whatever is standard) for submatches > 9?

--
Bruno
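
Appended note: the numbering argument above can be checked with any engine that numbers capturing groups by their opening parenthesis. The following is a minimal sketch in Python's `re` module, not in l3regex, so it only illustrates the counting rule and says nothing about the l3regex interface. The assumptions are mine: three months stand in for the elided "..." alternatives, every month suffix is taken to be optional, the helper name `describe` is hypothetical, and `re.VERBOSE` is used so that unescaped spaces in the pattern are ignored while `\ ` matches a literal space. Python's `re` has no (?| ... ) groups, so that variant is not shown.

```python
import re

# Three months stand in for the elided "..." alternatives (an assumption).
# Every group captures: the outer group is 1, then two groups per month.
ALL_CAPTURING = r"((Jan)(uary)?|(Feb)(ruary)?|(Mar)(ch)?) \ (\d\d?)"

# Suffixes use non-capturing groups, so only the month and the day count.
NON_CAPTURING = r"(Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?) \ (\d\d?)"


def describe(pattern, text):
    """Return (number of capturing groups, submatches or None if no match)."""
    # re.VERBOSE ignores unescaped whitespace in the pattern; "\ " is a space.
    compiled = re.compile(pattern, re.VERBOSE)
    match = compiled.fullmatch(text)
    return compiled.groups, (match.groups() if match else None)


if __name__ == "__main__":
    for pattern in (ALL_CAPTURING, NON_CAPTURING):
        count, submatches = describe(pattern, "February 14")
        print(count, submatches)
```

With only three months the all-capturing pattern already has 8 groups and the day sits in the last one, while the non-capturing pattern keeps the day in group 2 however many months are added. The first three letters of the month can then be taken from group 1 by slicing, for example `submatches[0][:3]`.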