On 20/05/2014 03:09, Will Robertson wrote:
> Dear all,
>
> I'm writing to continue some recent discussions in other forums on
> unicode math and how:
> * unicode-math.sty currently (incorrectly) deals with it, and
> * how we envisage official support for unicode math in LaTeX into
>   the future.
>
> * * *
>
> In short, here's the issue as I see it.
>
> LaTeX has a default maths font plus families such as \mathrm,
> \mathit, and \mathbf to choose new alphabets. These maths fonts are
> expressly and strictly only allowed to be text fonts, contrary to
> their name. This allows people to write things like \mathit{Re} for
> Reynolds number and \mathbf{Set} in category theory (thanks David C
> for the example).

I think the "contrary to their name" needs some comment, since it
relates to what these commands do and whether they should be changed.

I think \mathbf selecting a text font looks odd to _you_ because you
are approaching it from the implementation point of view, looking at
the math font tables etc. At the user level, the command names are
more to do with use than implementation: \mathbf selects a bold font
in math mode, hence its name. The exact font it selects is something
of an implementation detail, and the fact that it is technically a
text font is no stranger than the fact that, in the default setup,

   $ 1 + a $

takes the 1 from a roman text font and the a from a math italic font.

If you look at the plain TeX definitions of \bf and \it, they are two
definitions packed into one:

   \def\it{\fam\itfam\tenit}

with \fam\itfam just working in math mode and \tenit just working in
text mode.

   \mathit is just the math part of \it
   \textit is just the text part of \it (with an extra \hbox in math
   mode to ensure you are in text mode).

So I think the \mathxx names have caused more confusion amongst font
package authors than they have for users...
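
As a rough sketch of that split in plain TeX (the names \mymathit and
\mytextit are made up for illustration; they are not the actual LaTeX
definitions), the two halves of \it can be pulled apart like this:

```tex
% Plain TeX; run with tex or pdftex.
% \mymathit and \mytextit are hypothetical names for this sketch only.
\def\mymathit#1{{\fam\itfam #1}}   % math half: change the family inside a group
\def\mytextit#1{\hbox{\tenit #1}}  % text half: the \hbox forces text mode

% Left to right: two math italic variables, the math half of \it,
% the text half of \it used inside math, and plain \it in text.
$Re$ \quad $\mymathit{Re}$ \quad $\mytextit{Re}$ \quad {\it Re}
\bye
```

Plain TeX only sets \textfont\itfam, so this sketch is only meant for
text-style math, not for subscripts.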
>
> The \mathbf command in particular has been abused in physics to
> denote vectors and matrices, such as \mathbf{B} for magnetic field. I
> suspect the situation is similar for sans math, with tensors using
> sans on occasion but no doubt in other contexts used for multi-letter
> identifiers. (Examples more than welcome; in fact, requested.)

I don't think that is abuse; it was the intended use.

Multi-letter identifiers are common in category theory (often with
different fonts for different categories). Another area where they are
common is math mode used in other fields, notably computer science,
where variable names in pseudocode often match the variable names used
in the real code, and single letters are frowned on.

Grepping the tl2014 tree shows things such as

   pbsheet/pbsheet.cls:\newcommand{\covf}[1]{\mathbf{Cov}_{#1}}
   zed-csp/zed-csp.sty:\def \ELSE {\mathrel{\mathbf{else}}}
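
For anyone who wants to see the difference on the page, here is a
minimal LaTeX sketch (my own test file, not taken from either package
above) comparing letters typed directly in math with the same letters
inside \mathit and \mathbf:

```latex
\documentclass{article}
\begin{document}
% Letters typed directly are separate math italic variables: R times e.
$Re = 1000$

% \mathit sets the letters as one word from the text italic font.
$\mathit{Re} = 1000$

% The same idea in bold, as used for category names.
$F \colon \mathbf{Set} \to \mathbf{Set}$
\end{document}
```
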
> In contrast, Unicode math defines a number of alphabets in a single
> Unicode font, including mathematical italic and bold mathematical
> italic and many more variations. In OpenType maths fonts to date,
> these symbols are all designed as single-letter identifiers and not
> to be used for strings of characters such as "Re" in italic or "Set"
> in bold.
>
> So originally unicode-math simply mapped the unicode alphabets onto
> the LaTeX commands (with nice options and so on for choosing your
> style of bold and ensuring greek "just works"), and while all the
> style options and normalisation were nice and work (IMO) well, the
> choice of overwriting \mathbf and so on has led to obvious problems.
>
These work for the somewhat idiosyncratic character ranges in the
math alphabet block, but that gives you bold Greek but not bold
Cyrillic or a bold aleph, and bold digits 0-9 but not italic digits,
etc. Also, there is a bold roman but not a standard roman alphabet
there, so you still need the traditional switch for \mathrm.

> * * *
>
> I'd first like to apologise for the inconvenience that unicode-math
> has caused up until this point -- I do hope to "fix" it soon,
> whatever "fix" means now. The rest of this email is largely a summary
> of approach taken by unicode-math and how it can be fixed, expressed
> in plain (Unicode-)TeX (see attached and run it through XeTeX; you'll
> need Linux Libertine installed but feel free to replace it with any
> font with both roman and greek text glyphs).
>
> 0. Colours are used to ensure what you're seeing is correct and not a
> side-effect of the underlying Plain math machinery.
>
> 1. \mathbf and friends go back to simply selecting a text font. Note
> that they still need to remap \mathcode{}s in this case because
> normal unicode math glyphs exist all the way up in Plane 1 where text
> fonts daren't to tread.
>
> 2. Greek input into \mathbf and friends does "work" -- if what you
> were after were Greek letters from a text font.
>
> 3. To get proper bold symbols, including Greek, we'll need a whole
> new set of commands. These will need sensible names of some sort.
> Below I've chosen \symbf, etc., which doesn't look too bad to me.

Or, for font families that have a bold font (as required for
\boldmath), you could support a much larger collection of bold
characters with proper math sidebearings by switching to that family.
This is rather what the bm package does, but bm has to take care not
to run out of fam, which it would if it simply loaded the normal and
bold weights of everything. Here you need fewer fam (as each holds a
lot more symbols) and you have access to a lot more of them, as you
get 255 rather than 16 (I think :-)
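
To illustrate the mechanism only, here is a sketch for classic 8-bit
LaTeX with Computer Modern, not for a Unicode engine. It loads the bold
symbol font as its own math family and points one symbol at it. The
names boldsymbols and \boldnabla are invented for the example, and a
Unicode font would need the engine's extended primitives instead.

```latex
\documentclass{article}
% Hypothetical names for this sketch: boldsymbols, \boldnabla.
% Load the bold Computer Modern symbol font as a separate math family.
\DeclareSymbolFont{boldsymbols}{OMS}{cmsy}{b}{n}
% Take the glyph from the same slot as \nabla, but in the bold family,
% so it keeps proper math metrics and scales in sub/superscripts.
\DeclareMathSymbol{\boldnabla}{\mathord}{boldsymbols}{"72}
\begin{document}
$\nabla f \quad \boldnabla f \quad x_{\boldnabla}$
\end{document}
```

Each such family uses up one fam, which is why the 16-family limit
matters in the classic engines.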
>
> 4. If how I've implemented unicode-math looks wrong to you, I'd
> appreciate suggestions :) Switching \mathcode{}s like this isn't
> super fun but * doesn't seem too slow, and * I'm not aware of
> anything else flexible and cross-platform enough that will work
> instead.

I think it is a useful technique for accessing the math alphabet block
if there is no alternative, but I think it is required a lot less than
your message indicates.

One thing I note is that if you just change mathcodes, then explicit
character tokens are affected, but not \mathchardef tokens or
characters accessed by \mathchar. I note you defined \alpha as

   \def\alpha{α}

rather than as

   \mathchardef\alpha="010B

for this reason. Perhaps that's no bad thing and we should deprecate
\mathchardef as an optimisation not needed this century; it also has
the benefit that \alpha works like α and can be used in text as well
as math. But it is quite a big change, and it would have a knock-on
effect on packages that set up new math fonts, which would need clear
guidelines on which commands to use...
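
The mathcode versus \mathchardef difference is easy to reproduce. This
is a small sketch for 8-bit LaTeX with the default Computer Modern
setup (family 0 roman, family 1 math italic); \fixeda is a made-up name
and the remapping to roman is just a stand-in for whatever alphabet
switch is wanted:

```latex
\documentclass{article}
\begin{document}
% \fixeda is hypothetical: class 0, family 1 (math italic), slot "61.
\mathchardef\fixeda="0161 %

$a \quad \fixeda$  % both print a math italic a

\begingroup
\mathcode`\a="0061 % remap the explicit character a to family 0 (roman)
$a \quad \fixeda$  % the explicit a is now roman; \fixeda is still italic
\endgroup
\end{document}
```

A definition like \def\alpha{α} goes through the mathcode of the
character each time it is used, which is why it follows the remapping
while the \mathchardef form does not.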

It is possible to make chardef etc. work as in bm, by inspecting each
token and modifying the codes in place, but that requires doing an
expansion pass over the whole expression to expose the primitive
\chardef values, which is slow and fragile compared to just looping
through the mathcodes at the start.

But people do use bm quite a bit and expect \bm{arbitrary-stuff} to
make stretchy delimiters, symbols etc. bold, which is a kind of bold
that cannot be done by pointing \mathbf at a text bold font or by
mapping to the bold math alphabet range.

I think when people do not want \mathbf and ask for a bold symbol,
they often want something more like \bm, which makes all symbols bold.
For various reasons bm doesn't work at all in Unicode engines at the
moment: it doesn't know about the extended mathchar possibilities
(easily fixable) and it is incompatible with unicode-math (I was
waiting to see what you'd do :-)
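
For comparison, a minimal example of the kind of bold people get from
bm today, assuming pdfLaTeX rather than a Unicode engine:

```latex
% Run with pdfLaTeX (8-bit engine).
\documentclass{article}
\usepackage{bm}
\begin{document}
% \mathbf: upright bold letter taken from the text font.
$\mathbf{B}$

% \bm: the bold version of whatever the argument normally produces,
% here a bold math italic letter.
$\bm{B}$

% \bm also works on symbols that \mathbf cannot reach.
$\bm{\alpha}$, $\bm{\nabla}$, $\bm{\leq}$
\end{document}
```
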
David
>
> 5. I'm largely happy to make a breaking change to unicode-math.sty
> itself, but comments on whether an "overwrite" mode which functions
> as the package currently does would be sensible as a long term thing
> would be appreciated. IMO, it doesn't make much sense to have a
> separate text font that is only used for bold math identifiers that
> aren't real single-letter symbols -- in such cases it would surely be
> sensible to use (perhaps a variation on) \textbf.
>
> Best regards, Will
>