LATEX-L Archives

Mailing list for the LaTeX3 project


Mailing list for the LaTeX3 project <[log in to unmask]>
Tue, 17 Mar 2020 09:27:37 +1300
Henri Menke <[log in to unmask]>
On 16/03/20, 17:32, Joseph Wright wrote:
> On 16/03/2020 17:01, Kelly Smith wrote:
> > Hello!
> > 
> > I’ve been thinking: since Lua is already involved in the build process,
> > by way of l3build, wouldn’t it be reasonable to use a Lua script
> > to preprocess Unicode data into forms that are easily consumed by LaTeX
> > during the format-building process?
> > 
> > Warmly,
> > Kelly
> > 
> It depends on the outcome you are after.
> The original loading method for Unicode data in XeTeX was via a Perl script.
> That created a .tex file containing (for example) catcode data. To update
> the Unicode data, one had to run the Perl script, then send the processed
> files to CTAN. There were two issues. First, that meant that any change
> required active work to not only get the data from Unicode but also to
> manipulate it. Second, and more significant, it was *slower* than just
> reading the files in TeX. (This only became apparent when I wrote some test
> parsers.)
> Now, there is more data being loaded today than when I did that work, and
> some of it is in LuaTeX so could be done Lua-only. It's also possible that
> the Perl script was sub-optimal, or that as part of a general 'install'
> function the time would not really show. However, XeTeX needs the data, so
> one is still looking at having to explicitly pre-process in Lua. Moreover,
> most of the time taken for format-building is not about reading Unicode
> data. With LuaTeX, pre-loading expl3 does cut out a slight 'stall' when
> loading everything for case-changing, but having a LuaTeX and a XeTeX path
> separately is not attractive.

Is there any distribution that doesn't have LuaTeX in the default
installation (apart from exotic things like TeX Live infra-only)?  If
not, it would be conceivable to just make LuaTeX a hard requirement and
process the Unicode data on the fly instead of going via CTAN.
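
To make the kind of processing under discussion concrete, here is a
minimal sketch.  It is written in Python rather than Lua purely for
illustration and is not the code used by any format or package.  It
reads lines in the UnicodeData.txt layout and emits \catcode
assignments of the sort the old Perl script wrote into a .tex file.
The names catcode_lines and SAMPLE are hypothetical, and treating every
general category that starts with "L" as catcode 11 is an assumption
made for the example, not a statement of what the real loader does.

```python
# Sample lines in the UnicodeData.txt layout: semicolon-separated fields,
# field 0 = code point (hex), field 1 = name, field 2 = general category.
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
0031;DIGIT ONE;Nd;0;EN;;1;1;1;N;;;;;
03B1;GREEK SMALL LETTER ALPHA;Ll;0;L;;;;;N;;;0391;;0391
"""


def catcode_lines(text):
    """Return one TeX \\catcode assignment per letter found in `text`.

    Assumption for illustration only: a code point counts as a letter
    when its general category starts with "L" (Lu, Ll, Lt, Lm, Lo).
    Range entries (names ending in ", First>" / ", Last>") are not
    expanded here; only the listed code points are handled.
    """
    out = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(";")
        codepoint, category = fields[0], fields[2]
        if category.startswith("L"):
            # "XXXX is TeX's notation for a hexadecimal number.
            out.append('\\catcode"%s=11' % codepoint)
    return out


print("\n".join(catcode_lines(SAMPLE)))
```

Whether such a step runs ahead of time (with the result shipped via
CTAN) or on the fly at format-building time is exactly the trade-off
in question; the transformation itself is the same in both cases.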

> The current set-up means that updating the Unicode files is just a question
> of copy-pasting the raw .txt files into a form that CTAN can accept.
> Pre-digesting still leaves us needing some way to co-ordinate between
> packages (format, luaotfload, expl3, specialist stuff), plus with having to
> do the explicit extraction.
> As format-building is all about saving time for 'normal' runs, I'm not
> seeing there is a massive need to speed up the process. I know there is one
> engine in development that doesn't use format files, so that might be a
> place to consider things, but I think we'd need a strong case to alter the
> approach for XeTeX/LuaTeX (pdfTeX, ...).

Are you referring to JSBox?  I doubt that this will ever be public.

Cheers, Henri

> Joseph