On 20/05/2014 03:09, Will Robertson wrote:
> Dear all,
>
> I'm writing to continue some recent discussions in other forums on
> unicode math and how: * unicode-math.sty current (incorrectly) deals
> with it, and * how we envisage official support for unicode math in
> LaTeX into the future.
>
> * * *
>
> In short, here's the issue as I see it.
>
> LaTeX has a default maths font plus families such as \mathrm,
> \mathit, and \mathbf to choose new alphabets. These maths fonts are
> expressly and strictly only allowed to be text fonts, contrary to
> their name. This allows people to write things like \mathit{Re} for
> Reynolds number and \mathbf{Set} in category theory (thanks David C
> for the example).

I think the "contrary to their name" requires some comment, since it relates
to what these commands do and whether they should be changed.

I think \mathbf selecting a text font looks odd to _you_ because you are approaching it
from the implementation point of view and looking at the math font tables etc.

At the user level the command names have more to do with use than with implementation.
\mathbf selects a bold font in math mode, hence its name.
The exact font it selects is something of an implementation detail, and the fact that it is
technically a text font is no stranger than the fact that, in the default setup,
$ 1 + a $
takes the 1 from a roman text font and the a from a math italic font.
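
As a minimal sketch of that point (plain TeX, assuming an unmodified plain format), the default mathcodes can be printed directly. Both characters are class 7 ("variable family"), with the digit defaulting to family 0 (roman) and the letter to family 1 (math italic):

% plain TeX; run with tex or pdftex
% default for the digit 1 is "7031: class 7, family 0, slot "31
% default for the letter a is "7161: class 7, family 1, slot "61
% \the prints the decimal equivalents, 28721 and 29025
\message{digit 1: \the\mathcode`1, letter a: \the\mathcode`a}
$1 + a$
\bye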

If you look at the plain TeX definitions of \bf and \it, they are two definitions packed into one:

\def\it{\fam\itfam\tenit}

with \fam\itfam just working in math mode and \tenit just working in text mode.

\mathit is just the math part of \it
\textit is just the text part of \it (with an extra \hbox in math mode to ensure you are in text mode.)
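
A rough plain TeX sketch of that split follows. The names \mymathit and \mytextit are made up for illustration and are not the actual LaTeX definitions, which do considerably more:

% plain TeX sketch; \mymathit and \mytextit are hypothetical names
\def\mymathit#1{{\fam\itfam #1}}   % math half of \it: only switches the family
\def\mytextit#1{\ifmmode\hbox{\tenit #1}\else{\tenit #1}\fi} % text half, boxed in math mode

Text: \mytextit{Re}. Math: $\mymathit{Re} = \mytextit{Re}$
\bye

Because letters are class 7, \fam\itfam is enough to pull R and e from the text italic font in math mode, which is why a multi-letter name such as Re is spaced like a word rather than like a product of two variables.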

So I think the \mathxx names have caused more confusion amongst font package authors than they have for users...

>
> The \mathbf command in particular has been abused in physics to
> denote vectors and matrices, such as \mathbf{B} for magnetic field. I
> suspect the situation is similar for sans math, with tensors using
> sans on occasion but no doubt in other contexts used for multi-letter
> identifiers. (Examples more than welcome; in fact, requested.)

I don't think that is abuse, it was the intended use.

Multi-letter identifiers are common in category theory (often with different fonts for different categories).
They are also common where math mode is used in other fields, notably computer science, where
variable names in pseudocode often match the variable names used in the real code, and single letters
are frowned on.
Grepping the tl2014 tree shows things such as

pbsheet/pbsheet.cls:\newcommand{\covf}[1]{\mathbf{Cov}_{#1}}
zed-csp/zed-csp.sty:\def \ELSE {\mathrel{\mathbf{else}}}

> In contrast, Unicode math defines a number of alphabets in a single
> Unicode font, including mathematical italic and bold mathematical
> italic and many more variations. In OpenType maths fonts to date,
> these symbols are all designed as single-letter identifiers and not
> to be used for strings of characters such as "Re" in italic or "Set"
> in bold.
>
> So originally unicode-math simply mapped the unicode alphabets onto
> the LaTeX commands (with nice options and so on for choosing your
> style of bold and ensuring greek "just works"), and while all the
> style options and normalisation were nice and work (IMO) well, the
> choice of overwriting \mathbf and so on has led to obvious problems.
>

This works for the somewhat idiosyncratic character ranges in the math alphabet block,
but it gives you bold Greek but not bold Cyrillic or a bold aleph,
and it gives bold digits 0-9 but not italic digits, etc.
Also, there is a bold roman but not a standard roman alphabet, so you still need the traditional switch for \mathrm.

> * * *
>
> I'd first like to apologise for the inconvenience that unicode-math
> has caused up until this point -- I do hope to "fix" it soon,
> whatever "fix" means now. The rest of this email is largely a summary
> of approach taken by unicode-math and how it can be fixed, expressed
> in plain (Unicode-)TeX (see attached and run it through XeTeX; you'll
> need Linux Libertine installed but feel free to replace it with any
> font with both roman and greek text glyphs).
>
> 0. Colours are used to ensure what you're seeing is correct and not a
> side-effect of the underlying Plain math machinery.
>
> 1. \mathbf and friends go back to simply selecting a text font. Note
> that they still need to remap \mathcode{}s in this case because
> normal unicode math glyphs exist all the way up in Plane 1 where text
> fonts daren't to tread.
>
> 2. Greek input into \mathbf and friends does "work" -- if what you
> were after were Greek letters from a text font.
>
> 3. To get proper bold symbols, including Greek, we'll need a whole
> new set of commands. These will need sensible names of some sort.
> Below I've chosen \symbf, etc., which doesn't look too bad to me.

Or, for font families that have a bold font (as required for \boldmath), you could support
a much larger collection of bold characters with proper math sidebearings
by switching to that family. (This is rather what the bm package does, but it has to take care not
to run out of families, which it would if it simply loaded the normal and bold weights of everything.)
Here you need fewer families (as each holds a lot more symbols) and you have access
to a lot more of them, as you get 256 rather than 16 (I think:-)

>
> 4. If how I've implemented unicode-math looks wrong to you, I'd
> appreciate suggestions :) Switching \mathcode{}s like this isn't
> super fun but * doesn't seem too slow, and * I'm not aware of
> anything else flexible and cross-platform enough that will work
> instead.

I think it is a useful technique to access the math alphabet block if there is no alternative,
but I think it is required a lot less than your message indicates.

One thing I note is that if you just change mathcodes then explicit character tokens are affected
but not \mathchardef tokens or characters accessed by \mathchar.

I note you defined \alpha as

\def\alpha{α} rather than as

\mathchardef\alpha="010B

for this reason. Perhaps that's no bad thing and we should deprecate \mathchardef
as an optimisation not needed this century. It also has the benefit that \alpha works like α
and can be used in text as well as math.
But it is quite a big change (and would have a knock-on effect on packages that set up
new math fonts, which would need clear guidelines on which commands to use).
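
To illustrate the point about mathcodes versus \mathchardef tokens, here is a small plain TeX sketch that runs in any engine. \mya is a made-up name, and the choice of the letter a and of plain's \bffam is only for demonstration:

% plain TeX sketch; \mya is a hypothetical name
\mathchardef\mya="0161      % fixed at definition: class 0, family 1, slot "61 (math italic a)
\count255=\bffam \multiply\count255 by 256 \advance\count255 by `a
$a\ \mya$                   % both come out as math italic a
{\mathcode`a=\count255      % local change: class 0, family \bffam, slot `a
 $a\ \mya$}                 % the character token a is now bold; \mya is unchanged
\bye

The explicit character picks up the new mathcode when it is typeset, while the \mathchardef token carries the code it was given when it was defined, so it is untouched by the remapping.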

It is possible to make \mathchardef etc. work, as in bm, by inspecting each token and modifying the codes in place,
but that requires doing an expansion pass over the whole expression to expose the primitive \mathchardef values,
which is slow and fragile compared to just looping through the mathcodes at the start.
But people do use bm quite a bit and expect \bm{arbitrary-stuff} to make stretchy delimiters, symbols, etc. bold,
which is a kind of bold that cannot be done by pointing \mathbf at a bold text font or by mapping to the bold math alphabet range.

I think when people do not want \mathbf and ask for a bold symbol, they often want something more like \bm
that makes all symbols bold. For various reasons bm doesn't work at all in Unicode engines at the moment:
it doesn't know about the extended mathchar possibilities (easily fixable) and is incompatible with unicode-math
(I was waiting to see what you'd do:-)

David

>
> 5. I'm largely happy to make a breaking change to unicode-math.sty
> itself, but comments on whether an "overwrite" mode which functions
> as the package currently does would be sensible as a long term thing
> would be appreciated. IMO, it doesn't make much sense to have a
> separate text font that is only used for bold math identifiers that
> aren't real single-letter symbols -- in such cases it would surely be
> sensible to use (perhaps a variation on) \textbf.
>
> Best regards, Will
>