Hello all,

A few months ago now we added various expandable case changing functions
to expl3 with clearly 'experimental' status. I've recently had some
useful feedback on aspects of the behaviour and have revised some of the
code. I've now got some more questions, so thought it would be useful to
raise those here. (Note: I've updated the SVN code but this has yet to go
to CTAN. I can arrange a release if people want to test but not grab via
GitHub.)

*Background*

The current implementation has six functions

  \tl_upper_case:n
  \tl_lower_case:n
  \tl_mixed_case:n
  \tl_upper_case:nn
  \tl_lower_case:nn
  \tl_mixed_case:nn

where the two-argument versions deal with language-specific case
changing. The functions are x-type expandable. 'Letters' can be case
changed from the full Unicode range when using XeTeX/LuaTeX and the
mappings do not have to be 1-1 (cf. \uppercase/\lowercase).

There is also \str_fold_case:n which does folding for programmatic
applications. That function has a different set of use cases and is not
considered further here.

*Escaping from case changing*

The current implementation follows a BibTeX-like convention for
preventing case changing: braced content is not changed. In the original
approach there was no mechanism to do case changing inside the argument
to a command as a result. I have now altered this to include a list of
commands where case changing should be applied, so for example it would
be possible to arrange that

  \tl_upper_case:n { Hello~\emph{world} }

will case change the argument to \emph. At present, this functionality
is designed to work with commands taking one argument (i.e. a second or
subsequent argument will be unaffected).

The alternative to such an approach is to case change everything and
provide an escape mechanism (cf. the textcase package and
\NoChangeCase). As a user, I can see advantages to both approaches.

One thing that is not currently covered is dealing automatically with
math mode content.
That is doable but would require some consistent interface. In
particular, while dealing with "$ ... $" and "\( ... \)" is
straightforward (single-token delimiters), it would be more challenging
to cover "\begin{math} ... \end{math}" or similar. Some of this has a
relationship to expandability: see the next area.

*Expandability*

The current implementation is expandable as this allows the 'natural'
usage

  \tl_set:Nx \l_tmpa_tl { \tl_upper_case:n { foo } }
  \tl_show:N \l_tmpa_tl % => "FOO"

Expandability imposes some restrictions on the code and does have a
performance knock-on. The need to deal with changes that are not 1-1 or
have other context-dependence means that the performance aspect is not
so important: a full solution using \uppercase/\lowercase would still
require a mapping or similar to deal with all of the possibilities.

One area that is more tricky in this regard is input which is not fully
expanded. For example

  \def\myname{Joseph Wright}
  \MakeUppercase{Written by \myname}

will yield "WRITTEN BY JOSEPH WRIGHT" as there is an \edef inside the
LaTeX2e command before case changing. In contrast, the expl3 functions
currently do no expansion so

  \tl_upper_case:n { Written~by~\myname }

gives "WRITTEN BY Joseph Wright". Notably, if used in setting a token
list the content would be "WRITTEN BY \myname", i.e. further expansion
is inhibited.

It is not clear to me what the 'expected' outcome might be. It would be
possible to use f-type expansion to deal with stored tokens before case
changing, but for input such as

  \tl_upper_case:n { Written~by \\ Joseph~Wright }

that could break outcomes with LaTeX2e: \\ would be 'lost' and this
could be problematic if the text was used later in, for example, a
center environment.
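
To make the two behaviours above concrete, here is a minimal sketch
rather than a proposal. It assumes the experimental \tl_upper_case:n
described in this mail is available, generates a V-type variant for it,
and uses the scratch variables \l_tmpa_tl and \l_tmpb_tl. The two-step
'expand, then case change' arrangement is only an illustration of the
idea, not something the current code does itself.

```latex
\documentclass{article}
\usepackage{expl3}
\newcommand*\myname{Joseph Wright}
\ExplSyntaxOn
% Assumption: the experimental \tl_upper_case:n from this mail exists.
\cs_generate_variant:Nn \tl_upper_case:n { V }

% (1) No prior expansion: \myname is not a letter, so it is passed
% through untouched. Per the description above, the stored result
% is "WRITTEN BY \myname".
\tl_set:Nx \l_tmpa_tl { \tl_upper_case:n { Written~by~\myname } }
\tl_show:N \l_tmpa_tl

% (2) Expand the input first, then case change the stored text.
% Here the case changer sees the replacement text of \myname
% rather than the macro itself.
\tl_set:Nx \l_tmpb_tl { Written~by~\myname }
\tl_set:Nx \l_tmpa_tl { \tl_upper_case:V \l_tmpb_tl }
\tl_show:N \l_tmpa_tl
\ExplSyntaxOff
\begin{document}
\end{document}
```

The second form relies on plain x-type expansion which, unlike
\protected@edef, does not set up \protect, so it is only reasonable for
simple stored text such as \myname here.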

A non-expandable implementation could use the same logic as
\MakeUppercase but at the cost that case changing for storage would then
need dedicated functions, for example

  \tl_set_upper_case:Nn
  \tl_set_lower_case:Nnn

This loses the 'natural' approach to case changing inside a tl setting
and requires separate 'set a tl with case changing' and 'typeset case
changed text' functions.

*LICR/Non-native input*

The original implementation for the expl3 functions only case changes
letters. Adding an 'escape' to cover e.g. \emph also allows coverage of
things like "\'{e}" and so it was natural to consider LICR input. I have
therefore extended the code to allow coverage of everything handled by
\MakeUppercase when T1/T2A/T2B/T2C/T4/T5/LGR encodings are in use. There
is of course a performance hit, but this should be comparable to that
for processing letters.

That then leaves the question of input outside of the ASCII range when
using pdfTeX. It would I think be possible to do this using an approach
detecting inputenc active chars, but I am reluctant to go this way (in
the longer term it will be increasingly hard to justify using an 8-bit
program as the world standardises on Unicode). With inputenc loaded,
case changing does work if the input goes via LICR:

  \documentclass{article}
  \usepackage[utf8]{inputenc}
  \usepackage{expl3}
  \makeatletter
  \ExplSyntaxOn
  \cs_generate_variant:Nn \tl_upper_case:n { V }
  \cs_new_protected:Npn \MakeExplUpperCase #1
    {
      \group_begin:
        \protected@edef \l_tmpa_tl {#1}
        \tl_upper_case:V \l_tmpa_tl
      \group_end:
    }
  \ExplSyntaxOff
  \makeatother
  \begin{document}
  \MakeExplUpperCase{Héllo}
  \end{document}

Again, this has a link to expandability.

*Naming*

As noted in previous mails on this topic, the naming here (\tl_...) at
least in part reflects the fact this code is difficult to name. Any
better naming schemes welcome!

*Conclusions*

The current code works but there are open questions.
What I am hoping for is feedback on the ideas and in particular what
issues come up with real use cases. Ideas about all or any of the above,
or indeed other aspects, most welcome.
--
Joseph Wright