On 31/03/2014 17:38, Lars Hellström wrote:
>>>> I'd also suggest the 'sheep and goats' separation of all commands
>>>> into fully expandable or \protected.
>>>
>>> Not sure there could only be those two, but I can certainly change
>>> some \newcommand's into \DeclareRobustCommand's.
>>
>> There are very few commands that don't seem to fall into one of the
>> two categories. (I've perhaps got one set for siunitx, but in a very
>> unusual use case.) Most commands either:
>>
>> - Should/can work inside \edef => expandable
>> - Don't (assignments, ...) => \protected
>
> Well, I wonder how that would interact with \harmless_secure_expand:w.

I haven't read all of the code: I just did an overview. The observation
of the team to date, which also seems to match the ConTeXt experience,
is that such a division does appear to work. (A rough sketch of the
split appears at the end of this message.)

>> Note this is nothing to do with \DeclareRobustCommand, as those are
>> not engine-robust.
>
> For the time being, I still support TeX 3.

Your decision: you do know e-TeX was finalised 15 years ago ;-)

>> A general impression, not least in that you've coded things by hand
>> that would be done using expl3 kernel functions. The other very
>> obvious one is that we don't use toks :-)
>
> Not at all?! Or just not the likes of \newtoks?

At the interface level, no toks at all (there are a few internal special
cases). The reasoning is that toks are hard to explain: two types of
similar variable (macros and toks), except ... With e-TeX, we can say
that:

- Anything can be stored in a macro with \edef\foo{\unexpanded{<stuff>}}
- Expansion can always be controlled with \unexpanded\expandafter{\macro}

Wrapping that up inside some interfaces, we don't need toks at all; the
token-storage sketch at the end of this message shows both idioms in
action. (Bruno uses toks by number for l3regex within a group structure,
and of course at a base layer there are some raw things to do with
\every... and so on. However, none of this shows up in the interfaces,
which all use macro-based storage of tokens.)

>>> You mean a token with character code 229 (U+00E5) is more canonical
>>> than \r{a} as an encoding of that character? That's actually close
>>> to the way harmless would do it (under the \HighCharsUnicode
>>> setting). For typesetting, there is conversely \UnicodeCharUsesChar.
>>
>> In that area, yes. The current thinking is as follows: for UTF-8
>> work, there are two engines which can do the job. Supporting
>> non-UTF-8 input for chars outside of the 8-bit range is something
>> that 'new' code really should avoid. That doesn't negate the use of
>> inputenc, etc., for 'real world' cases now, but it does suggest that
>> for new code we might want to take a different approach. For example,
>> the current plan is that 'text level' case-changing functions will
>> support only 'engine native' input, which implies that at some stage
>> we'd want, as you say, to do mappings for UTF-8 engines going in the
>> \r{a} => U+00E5 direction. (At a user level, things like \r{a} for a
>> 'one-off' accent remain useful even with a UTF-8 engine.)
>
> As a data point here, I should perhaps mention that I have for some
> years been using the \r and \" accents even when I write Swedish text
> in LaTeX -- the reason being that I've remapped the ÅÖÄ keys to \{}
> since the latter are more frequently needed. And when you don't have a
> character conveniently available on the keyboard, it doesn't make that
> much difference that UTF-8 can encode it perfectly fine, if you cannot
> conveniently type it!

As I said, there is still discussion to be had in this area, and at the
user level things like \r{A} make sense (see the accent sketch below).
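To make the 'sheep and goats' split concrete, here is a minimal sketch
of my own (not kernel code: \sheep, \goat, \tempa and \tempb are names
invented for the illustration), using the e-TeX \protected prefix
directly:

  \documentclass{article}
  \begin{document}
  % Expandable ('sheep'): safe inside \edef, \write and friends.
  \newcommand*{\sheep}{42}
  \edef\tempa{\sheep}% \tempa now holds '42'
  % Performs an assignment ('goat'), so it cannot usefully expand:
  % declared \protected, it passes through \edef untouched.
  \protected\def\goat#1{\def\result{#1}}
  \edef\tempb{\goat{x}}% \tempb now holds the tokens '\goat{x}'
  \tempa
  \end{document}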
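The two \unexpanded idioms in the list above, wrapped into a hand-rolled
token-storage example (\mystore is again an invented name; the kernel
interfaces package this pattern up rather than exposing it):

  \documentclass{article}
  \begin{document}
  % Store tokens without expanding them: \unexpanded shields the
  % argument inside the \edef, so \mystore holds the literal tokens.
  \edef\mystore{\unexpanded{Some \emph{fragile} tokens}}
  % Controlled expansion: expand \mystore exactly once, protect the
  % result, and thus append material without ever expanding \emph.
  \edef\mystore{\unexpanded\expandafter{\mystore}\unexpanded{ and more}}
  \mystore% typesets 'Some fragile tokens and more'
  \end{document}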
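As a rough illustration of that accent-versus-character point (my sketch
only, assuming XeLaTeX or LuaLaTeX with fontspec and a font covering Å;
it says nothing about how the planned case-changing functions will
look):

  \documentclass{article}
  \usepackage{fontspec}
  \begin{document}
  å and \r{a}% the same glyph: literal U+00E5 versus a one-off accent
  % The literal character is 'engine native' input, so even classical
  % case changing can act on it directly in a UTF-8 engine:
  \MakeUppercase{å}
  \end{document}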
However, that convenience is not the same as saying such input should be
handled by e.g. the case-changing functions, which with a UTF-8 engine
can deal with Å itself readily.

--
Joseph Wright