Hello, and sorry for the long title (useful perhaps for searching purposes later on).

There was recently a question on tex.stackexchange about writing a purely expandable version of LaTeX2e's \MakeUppercase. Joseph Wright and I posted two answers with different interpretations of uppercasing, and he asked me to transfer the discussion to this list. For the code, see http://tex.stackexchange.com/questions/10805/ and in particular our two answers.

His method yields:
    "\Uppercase{Som{e } {te{x}t} with $math$.}" -> "SOMe te{x}t WITH $math$."
Mine yields:
    "\Uppercase{Som{e } {te{x}t} with $math$.}" -> "SOM{E } {TE{X}T} WITH $MATH$."

Two questions:
- What precise behaviour do we want an uppercase function to have? Note that we could even provide hooks to let the user choose. (See near the bottom of this long email.)
- What do you think of the advantages/drawbacks described below?

== Joseph's way: (correct me if I didn't understand your code properly)

- Time: ~50*NL, where L~26 is the number of letters and various accent tokens (\ae, \oe, etc.), and N is the length of the string to be uppercased.
- Number of expansions: O(NL)?
- Braces disappear, and protect their argument against uppercasing.
- Spaces are dropped at the start and end, kept in the middle.
- The stuff between dollars is kept.
- It expands its argument?
- It does not pollute the macro namespace.

It relies on comparing the current token with a, then b, etc., until z, for each token, and replacing it by the uppercase letter. If the token is not found, we keep it. The function that does the replacement looks like

    \prg_case_str:nnn {#1} { { a } { A } { b } { B } ... } {#1}

So it has L lines, and is difficult to patch: a user who wants to add a custom accent with a given uppercase behaviour has to redefine the whole function. That said, I don't understand Joseph's code well enough yet to be sure of this.

== My way:

- Time: ~100*N^2.
- Number of expansions: 2 (thanks to an \ifcsname hack).
- All spaces and braces are kept, but braces don't protect against uppercasing (can be changed).
- Dollars could be taken care of.
- It does not expand the argument at all.
- It pollutes the macro namespace: uses L~26 macros.

It relies on having one macro for each token that should be transformed by the case change. Namely, for uppercase, we would have defined the following case table:

    ...
    \tl_new:cn{UL_table_u_m}{M}
    ...
    \tl_new:cn{UL_table_u_\string\ae}{\AE}
    ...

Then we read the tokens one by one. Say we see "\oe". If \UL_table_u_\oe is defined, then we use it. Otherwise, we leave \oe unchanged.

== Hooks

It should not be too hard to give hooks to the user so that they can
- decide the behaviour of braces
- define some commands that "do things" (e.g. protect their argument against uppercasing)
- others?

== Final comments on namespace pollution

I don't know whether time is an issue here, nor whether having more macros introduces an unacceptable overhead. Several times in the past, when trying to convert one list of tokens to another, I found that putting each token in a \csname construction and defining one macro per token made things very much easier.

Possible issue: after `\let\?=?` and `\escapechar=-1\relax`, one cannot distinguish between `\?` and `?`.

This idea of defining macros rather than comparing with a list of tokens makes the second method easily customizable: the user can define arbitrary "case-change" tables by setting the relevant macros \UC_table_mytable_<token>. That would lead to a "static" variant of \prg_case_str:nnn.

Best regards,
Bruno

@Joseph: were you thinking of the expansion control part?
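
P.S. To make the "one macro per token" idea concrete, here is a minimal sketch in plain LaTeX2e. It is not the code from the tex.stackexchange answer. All the names (\ULdeclare, \ULlookup, \ULmap and the UL@u@ prefix) are made up for this illustration. It only shows the \ifcsname lookup, under strong simplifying assumptions: the input contains no braces, and spaces are silently dropped because the loop grabs undelimited arguments. It also needs one expansion step per token, unlike the two-expansion version discussed above.

```latex
\documentclass{article}
\makeatletter
% All names below are hypothetical, chosen for this sketch only.

% One macro per token: \csname UL@u@<token>\endcsname holds the
% uppercase form of <token>.
\newcommand*\ULdeclare[2]{\@namedef{UL@u@\string#1}{#2}}
\ULdeclare{h}{H}
\ULdeclare{l}{L}
\ULdeclare{o}{O}
\ULdeclare{\oe}{\OE}

% Look up a single token: use its table entry if one exists,
% otherwise leave the token unchanged.
\newcommand*\ULlookup[1]{%
  \ifcsname UL@u@\string#1\endcsname
    \csname UL@u@\string#1\expandafter\endcsname
  \else
    #1%
  \fi}

% Token-by-token loop, stopped by the marker \UL@stop.
% Assumption: no brace groups in the input; spaces are lost.
\def\UL@stop{\UL@stop}% given a meaning so that \ifx can recognise it
\newcommand*\ULmap[1]{\UL@loop#1\UL@stop}
\def\UL@loop#1{%
  \ifx\UL@stop#1%
  \else
    \ULlookup{#1}\expandafter\UL@loop
  \fi}
\makeatother

\begin{document}
% h, \oe, l and o have table entries; "." has none and is kept as is.
\ULmap{h\oe llo.}
\end{document}
```

Adding a custom accent then only takes one more \ULdeclare line, with no need to redefine the lookup function. That is the customizability argument made above. The price is one extra macro in the namespace per table entry.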