LATEX-L Archives

Mailing list for the LaTeX3 project

LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

Sender: Mailing list for the LaTeX3 project <[log in to unmask]>
Date: Tue, 20 Apr 2021 11:36:57 +0100
Reply-To: Mailing list for the LaTeX3 project <[log in to unmask]>
From: Joseph Wright <[log in to unmask]>
Hello,

> I have started to look at the implementation of the LaTeX required
> primitives supplementary to TeX + e-TeX ones and I have some questions.
> 
> Note: I work with the definitions found in the pdfTeX manual and will
> not look at the code since it is GPL'ed one and I don't want someone
> later to claim that my code shall be released under GPL because my eyes
> have touched GPL code.

If you want to avoid the source of pdfTeX, you are likely best starting 
with descriptions of the 'cross engine' primitives that have been 
implemented by other engines.

I would start with:

- l3names. This makes copies of all primitives from all engines 
supported by expl3 into a single namespace. What's important here is 
that we've worked hard to identify which \pdf... primitives are actually 
PDF-related. Only those that relate to PDFs retain the 'pdf' part of the 
name (*). See lines 605 onward of the current file 
(https://github.com/latex3/latex3/blob/5b6e6f30397930ba96a2b6744171263b370504d5/l3kernel/l3names.dtx#L605)

(*) The 'version' primitives \pdftexbanner, \pdftexrevision and 
\pdftexversion  retain 'pdftex' in their names when saved.

- The XeTeX manual. As XeTeX doesn't directly produce PDF output,
   any primitives it provides are not PDF-specific, naming
   notwithstanding. (Newer additions to XeTeX omit the \pdf... prefix;
   older ones do not.)

- The e-pTeX setup in 
https://www.tug.org/svn/texlive/trunk/Build/source/texk/web2c/eptexdir/pdfutils.ch?view=log, 
which provides the 'utility' primitives. (There is code in this file, 
but the first ~30 lines are purely descriptive.)

> PDF only primitives that I will not implement (correct?):
> 
> \pdfpageheight
> \pdfpagewidth

These work (no error) in DVI mode with pdfTeX, but more importantly in 
XeTeX they write information for xdvipdfmx to the XDV file. I would 
suggest having them at least as no-ops.
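
As a rough sketch of the kind of thing macro code does with these (the 
A4 dimensions are purely for illustration):

     \ifdefined\pdfpagewidth
       % pdfTeX and XeTeX use the \pdf... names
       \pdfpagewidth=210mm
       \pdfpageheight=297mm
     \else\ifdefined\pagewidth
       % LuaTeX uses the unprefixed names
       \pagewidth=210mm
       \pageheight=297mm
     \fi\fi

An engine in which these are settable dimensions that have no other 
effect should be enough for code like this to run without error.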

> Primitives that seem highly linked to PDF and that, to be implemented,
> would need to draw in not ISO C routines but POSIX or MS specific
> because they are linked to filesystems:
> 
> \pdfcreationdate

This one is linked to the PDF, but not the file system: it's what is 
written to the PDF for /CreationDate.

> \pdffilemoddate

This extracts data about a general file from the file system: it's not 
linked to PDF at all. It might be used for example to check on an image 
or data file.
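
To illustrate the difference between the two, both expand to a string in 
the PDF date format (D:YYYYMMDDhhmmss followed by a time zone offset). 
The file name below is only an assumption that the main file is called 
\jobname.tex:

     % Date/time of the current run, as written to /CreationDate
     \message{\pdfcreationdate}
     % Modification date/time of an arbitrary file on disk
     \message{\pdffilemoddate{\jobname.tex}}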

> Primitives that seem highly linked to PDF but that can be implemented
> using ISO C routines:
> 
> \pdffiledump
> \pdfmdfivesum
> \pdffilesize ? (size can be more than integer accepted by TeX)

Again, these are not linked to PDF at all: they all work in DVI mode. 
You could look at the Lua implementation of these used by expl3 - that 
avoids any GPL code. (We want the same *output* as pdfTeX gives, not the 
same implementation.)

I don't think I've ever seen a test of what happens if \pdffilesize is 
used on a truly massive file.
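
A minimal sketch of how these get used, again assuming the main file is 
\jobname.tex:

     % Size of the file in bytes, as a string of digits
     \message{\pdffilesize{\jobname.tex}}
     % MD5 sum of the file content, in uppercase hex
     \message{\pdfmdfivesum file {\jobname.tex}}
     % First four bytes of the file, in uppercase hex
     \message{\pdffiledump offset 0 length 4 {\jobname.tex}}
     % Without the 'file' keyword, the MD5 sum is of the text itself
     \message{\pdfmdfivesum{abc}}

The last line should give 900150983CD24FB0D6963F7D28E17F72, which is a 
handy check for an independent implementation. I believe a missing file 
gives an empty expansion rather than an error for \pdffilesize, but 
that is worth checking against the manual.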

> Primitives whose output seem unclear:
> 
> \pdfnormaldeviate
> \pdfuniformdeviate
> 
> said to generate "an integer" and, afterwards said to expand "to a list
> of tokens". Do they provide, on each call, one integer with the defined
> property or a series of a defined cardinal of such integers or an
> infinite series of such integers until interrupted?

Each use gives one integer. For example, if I do

     \pdfsetrandomseed 1234
     \message{
       \pdfuniformdeviate 1000\space\space
       \pdfuniformdeviate 1000\space\space
       \pdfuniformdeviate 1000\space\space
       \pdfnormaldeviate \space
       \pdfnormaldeviate \space
       \pdfnormaldeviate \space
     }

I get "555 3 641 34071 -99169 33759".

Note that the code for the RNG here comes from MetaPost, and was lightly 
modified to put it into pdfTeX. *Exactly* the same implementation is 
used by XeTeX, pTeX, upTeX and LuaTeX, which means that with the same 
random seed the same series of values is obtained independent of the 
engine used. I would strongly urge you to look at the original MetaPost 
code (I believe not GPL).

> The pdf prefix exists because the primitives were added to TeX in pdfTeX
> but it is unfortunate because it seems to link the primitives to the
> PDF format. I'd like to implement them without the pdf prefix and
> provide simply a file for compatibility with \let statements. Would it
> be OK?

As noted, XeTeX has adopted a 'no \pdf...' approach for newer stuff, 
e.g. \strcmp rather than \pdfstrcmp. That works fine provided the names 
are not too generic.
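
A compatibility file along those lines might look something like this. 
It is only a sketch covering two primitives, and the unprefixed names 
are the ones XeTeX uses:

     % Provide the \pdf... name only where the engine lacks it
     \ifdefined\pdfstrcmp\else
       \ifdefined\strcmp
         \let\pdfstrcmp\strcmp
       \fi
     \fi
     \ifdefined\pdfuniformdeviate\else
       \ifdefined\uniformdeviate
         \let\pdfuniformdeviate\uniformdeviate
       \fi
     \fi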

> \engine: a read-only ASCII token identifying the engine: ex.: e-TeX;

This is a very generic name. Moreover, the established pattern is that 
engines define \<name>version and \<name>revision, which can then be 
used as markers for the specific engine. Note that upTeX doesn't do 
that, and that makes identifying that engine more tricky already.
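
For example, a simple engine test based on those markers could be 
sketched as follows (only three engines are covered here, and real code 
usually needs to be more careful):

     \ifdefined\pdftexversion
       \message{pdfTeX}
     \else\ifdefined\XeTeXversion
       \message{XeTeX}
     \else\ifdefined\luatexversion
       \message{LuaTeX}
     \else
       \message{something else}
     \fi\fi\fi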

> \apimajor: a positive or nul integer identifying a definite list of
> 	primitives provided;
> 
> \apiminor: a positive or nul integer identifying a definite behavior
> 	(nature, parameters, output, errors) of the primitives;
> 
> \apirevision: a positive or nul integer identifying an implementation
> 	of the (major, minor) combination---the major and minor identify a
> 	contract; the revision identifies a modification of the
> 	implementation of the contract including a correction because the
> 	implementation didn't actually provide the contract. I.e.: bug
> 	fixes, optimizations etc. without any contract modification. The
> 	contract is theoretical.

Again, this just feels like \<engine>version/\<engine>revision: see for 
example how older pdftex.def used to check for available features. 
(Standardisation of engine features over recent years means that this is 
generally not needed.)
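
The sort of check I mean looks roughly like this; the threshold is just 
an illustration. \pdftexversion is an integer (140 for pdfTeX 1.40), 
whereas \pdftexrevision expands to a string:

     \ifdefined\pdftexversion
       \ifnum\pdftexversion<140
         \message{pdfTeX older than 1.40}
       \fi
     \fi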

> \outfmtlist: a series of ASCII tokens identifying the output format
> 	supported by the engine. Ex.: DVI1.0 (traditional DVI), PDF1.3 etc.
> 	The default format shall be listed first. (Note: I plan, some day,
> 	to extend DVI.)

pdfTeX already defines \pdfoutput, which is 0 for DVI and 1 for PDF. 
LuaTeX renames that to \outputmode but with the same numerical values. 
The version of PDF written by pdfTeX/LuaTeX is set separately using (in 
pdfTeX) \pdfminorversion/\pdfmajorversion, as this is really a separate 
concept to whether PDF or DVI is in use.
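
A sketch of the usual test for the output mode, using only the two 
names mentioned above:

     \ifdefined\pdfoutput
       % pdfTeX
       \ifnum\pdfoutput>0
         \message{PDF output}
       \else
         \message{DVI output}
       \fi
     \else\ifdefined\outputmode
       % LuaTeX
       \ifnum\outputmode>0
         \message{PDF output}
       \else
         \message{DVI output}
       \fi
     \fi\fi

As far as I know XeTeX defines neither, so it falls through both tests 
here: it always writes XDV.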

There is very little use at the macro level for the DVI level. The PDF 
level does have some impact on output features but in a simple 'Sorry, 
not doable' sense.  Note that XeTeX uses XDV, which is a version of DVI 
dedicated to this engine. It's not necessary to test the DVI version at 
the macro level: what's important is, for example, which method to use 
to include images, and that relies on an engine test.

> \outfmtset: setting the output format, that shall be amongst the formats
> 	supported. If not, it returns an error and set the output format to
> 	the default one. Shall be set before \shipout and errors if used
> 	after output has started.
>
> \outfmt: a token identifying the current output format.

See above: data in the same format as other engines is strongly preferred.

I notice you don't mention a large number of the other utility 
primitives. A full list of those assumed by LaTeX in an engine-neutral 
sense is given in 
https://www.latex-project.org/news/latex2e-news/ltnews31.pdf. See in 
particular that \pdfpage(height|width) are included.

Joseph
