Joseph Wright wrote 2014-03-31 17.48:
> On 31/03/2014 13:56, Lars Hellström wrote:
[snip]
>>> - \__harmless_quat_split- (etc) - all missing an arg spec, which
>>> looks w-type to me
>>
>> My thinking was that these are "constants" rather than "functions", and
>> for that reason have no argspec part. Even if one considers them
>> functions, they don't take any arguments, so why would they have a w?
>
> If they are constants then \c__harmless_ ...

Now there's a point I missed!

>>> I'd also suggest the 'sheep and goats' separation of all commands into
>>> fully expandable or \protected.
>>
>> Not sure there could only be those two, but I can certainly change some
>> \newcommand's into \DeclareRobustCommand's.
>
> There are very few commands that don't seem to fall into one of the two
> categories. (I've perhaps got one set for siunitx, but in a very unusual
> use case.) Most commands either:
>
> - Should/can work inside \edef => expandable
> - Don't (assignments, ...) => \protected

Well, I wonder how that would interact with \harmless_secure_expand:w.

> Note this is nothing to do with \DeclareRobustCommand, as those are not
> engine-robust.

For the time being, I still support TeX 3.

>>> (Note: moving the code to expl3 would require quite a bit of work, but
>>> that's a different question.)
>>
>> Beyond the rote substitution of l3names for primitives and 2e core
>> macros, is there something particular you think about? (I suspect the
>> unimperative coding style could be one issue. ;-)
>
> A general impression, not least in that you've coded things by hand that
> would be done using expl3 kernel functions. The other very obvious one
> is that we don't use toks :-)

Not at all?! Or just not the likes of \newtoks?

> A full analysis would take some time!

Understandable.

> I forgot one thing before: as you are using the expl3 namespace in a
> deliberate way, am I OK to add "harmless" to the prefix list with a note
> about the use?

Yes.

>>> More broadly, there are big open questions that
>>> some of this is linked to. One is LICR-type input. As you'll see when we
>>> land our 'new' case changing functions, the team feeling on input
>>> methods for *new* code is to stick close to the engines rather than to
>>> the LaTeX2e approach.
>>
>> You mean a token with character code 229 (U+00E5) is more canonical than
>> \r{a} as encoding of that character? That's actually close to the way
>> harmless would do it (under the \HighCharsUnicode setting). For
>> typesetting, there is conversely \UnicodeCharUsesChar.
>
> In that area, yes. Thinking is currently as follows: for UTF-8 work,
> there are two engines which can do the job. Supporting non-UTF-8 input
> for chars outside of the 8-bit range is something that 'new' code really
> should avoid. That doesn't negate use of inputenc, etc., for 'real
> world' cases now but does suggest that for new code we might want to
> take a different take. For example, the current thinking is that for
> 'text level' case-changing functions we will support only 'engine
> native' input, which implies that at some stage we'd want to as you say
> do mappings for UTF-8 engines going in the \r{a} => U+00E5 direction.
> (At a user level, things like \r{a} for a 'one-off' accent remain useful
> even with a UTF-8 engine.)

As a data-point here, I should perhaps mention that I have for some years
been using the \r and \" accents even when I write Swedish text in LaTeX --
the reason being that I've remapped the ÅÖÄ keys to \{} since the latter
are more frequently needed. And when you don't have a character
conveniently available on the keyboard, it doesn't make that much
difference that UTF-8 can encode it perfectly fine, if you cannot
conveniently type it!

Lars Hellström
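
For readers following the "expandable or \protected" point quoted above, here is a minimal sketch of how the three kinds of definition behave inside a plain \edef. It assumes an e-TeX-capable engine (so it would not run under TeX 3), and the names \demoExpandable, \demoProtected and \demoRobust are hypothetical, chosen only for this illustration; they are not part of harmless or expl3.

```latex
\documentclass{article}
\begin{document}

% Hypothetical macros, for illustration only.
\newcommand*\demoExpandable{abc}               % pure expansion: fine inside \edef
\protected\def\demoProtected{\count255=1\relax}% does an assignment: engine-protected
\DeclareRobustCommand*\demoRobust{\count255=1\relax}% LaTeX2e-robust, not engine-robust

\edef\resultA{\demoExpandable}% fully expanded
\edef\resultB{\demoProtected}%  left unexpanded by the engine
\edef\resultC{\demoRobust}%     relies on \protect, which a plain \edef does not honour

% Show what each \edef actually stored.
\typeout{A: \meaning\resultA}% macro:->abc
\typeout{B: \meaning\resultB}% macro:->\demoProtected
\typeout{C: \meaning\resultC}% inspect the log; the body is typically exposed here

\end{document}
```

Comparing lines B and C in the log shows the difference being discussed: the \protected macro survives a plain \edef by itself, whereas the \DeclareRobustCommand version is only safe in contexts that set up \protect, such as \protected@edef.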