Hello,
> I have started to look at the implementation of the LaTeX required
> primitives supplementary to TeX + e-TeX ones and I have some questions.
>
> Note: I work with the definitions found in the pdfTeX manual and will
> not look at the code since it is GPL'ed one and I don't want someone
> later to claim that my code shall be released under GPL because my eyes
> have touched GPL code.
If you want to avoid the source of pdfTeX, you are likely best starting
with descriptions of the 'cross engine' primitives that have been
implemented by other engines.
I would start with:
- l3names. This makes copies of all primitives from all engines
supported by expl3 into a single namespace. What's important here is
that we've worked hard to identify which \pdf... primitives are actually
PDF-related. Only those that relate to PDFs retain the 'pdf' part of the
name (*). See lines 605 onward of the current file
(https://github.com/latex3/latex3/blob/5b6e6f30397930ba96a2b6744171263b370504d5/l3kernel/l3names.dtx#L605)
(*) The 'version' primitives \pdftexbanner, \pdftexrevision and
\pdftexversion retain 'pdftex' in their names when saved.
- The XeTeX manual. As XeTeX doesn't directly produce PDF output,
any primitives it provides are not PDF-specific, naming
notwithstanding. (Newer additions to XeTeX omit the \pdf... prefix;
older ones do not.)
- The e-pTeX setup in
https://www.tug.org/svn/texlive/trunk/Build/source/texk/web2c/eptexdir/pdfutils.ch?view=log,
which provides the 'utility' primitives. (There is code in this file,
but the first ~30 lines are purely descriptive.)
> PDF only primitives that I will not implement (correct?):
>
> \pdfpageheight
> \pdfpagewidth
These work (no error) in DVI mode with pdfTeX, but more importantly in
XeTeX they write information for xdvipdfmx to the XDV file. I would
suggest having them at least as no-ops.
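As a minimal sketch of why no-ops are enough, this is the kind of
macro-level code that has to keep working. The A4 dimensions are only
an illustration, and the \ifdefined guard assumes the e-TeX extensions
are available:

```tex
% Set the page size only where the engine provides the primitives.
% With pdfTeX in DVI mode the assignments are accepted without error;
% an engine offering them as no-ops would behave the same way here.
\ifdefined\pdfpagewidth
  \pdfpagewidth=210mm
  \pdfpageheight=297mm
\fi
```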
> Primitives that seem highly linked to PDF and that, to be implemented,
> would need to draw in not ISO C routines but POSIX or MS specific
> because they are linked to filesystems:
>
> \pdfcreationdate
This one is linked to the PDF, but not the file system: it's what is
written to the PDF for /CreationDate.
> \pdffilemoddate
This extracts data about a general file from the file system: it's not
linked to PDF at all. It might be used for example to check on an image
or data file.
> Primitives that seem highly linked to PDF but that can be implemented
> using ISO C routines:
>
> \pdffiledump
> \pdfmdfivesum
> \pdffilesize ? (size can be more than integer accepted by TeX)
Again, these are not linked to PDF at all: they all work in DVI mode.
You could look at the Lua implementation of these used by expl3 - that
avoids any GPL code. (We want the same *output* as pdfTeX gives, not the
same implementation.)
I don't think I've ever seen a test of what happens if \pdffilesize is
used on a truly massive file.
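A quick way to see that these are independent of PDF is to run them
with pdfTeX in DVI mode. This is only a sketch: it assumes the job's
source file is found as \jobname.tex, and the offset and length values
are arbitrary choices for illustration.

```tex
\pdfoutput=0 % DVI mode: the file primitives below still work
\message{size: \pdffilesize{\jobname.tex}}
\message{md5: \pdfmdfivesum file {\jobname.tex}}
% First 8 bytes of the file, written as hexadecimal digits
\message{head: \pdffiledump offset 0 length 8 {\jobname.tex}}
\bye
```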
> Primitives whose output seem unclear:
>
> \pdfnormaldeviate
> \pdfuniformdeviate
>
> said to generate "an integer" and, afterwards said to expand "to a list
> of tokens". Do they provide, on each call, one integer with the defined
> property or a series of a defined cardinal of such integers or an
> infinite series of such integers until interrupted?
Each use gives one integer. For example, if I do
\pdfsetrandomseed 1234
\message{
\pdfuniformdeviate 1000\space\space
\pdfuniformdeviate 1000\space\space
\pdfuniformdeviate 1000\space\space
\pdfnormaldeviate \space
\pdfnormaldeviate \space
\pdfnormaldeviate \space
}
I get "555 3 641 34071 -99169 33759".
Note that the code for the RNG here comes from MetaPost, and was lightly
modified to put it into pdfTeX. *Exactly* the same implementation is
used by XeTeX, pTeX, upTeX and LuaTeX, which means that with the same
random seed the same series of values is obtained independent of the
engine used. I would strongly urge you to look at the original MetaPost
code (I believe not GPL).
> The pdf prefix exists because the primitives were added to TeX in pdfTeX
> but it is unfortunate because it seems to link the primitives to the
> PDF format. I'd like to implement them without the pdf prefix and
> provide simply a file for compatibility with \let statements. Would it
> be OK?
As noted, XeTeX has adopted a 'no \pdf...' approach for newer stuff,
e.g. \strcmp rather than \pdfstrcmp. That works fine provided the names
are not too generic.
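A compatibility file along those lines could look like the sketch
below. It assumes your engine provides the prefix-free names \strcmp,
\filesize and \mdfivesum; the last two are my guesses at what you would
call them, and the file name compat.tex is hypothetical.

```tex
% compat.tex (hypothetical): map \pdf... names onto prefix-free ones.
% Each alias is created only if the \pdf... name is missing and the
% prefix-free primitive exists, so loading this is safe in any engine
% with the e-TeX extensions.
\ifdefined\pdfstrcmp\else
  \ifdefined\strcmp \let\pdfstrcmp\strcmp \fi
\fi
\ifdefined\pdffilesize\else
  \ifdefined\filesize \let\pdffilesize\filesize \fi
\fi
\ifdefined\pdfmdfivesum\else
  \ifdefined\mdfivesum \let\pdfmdfivesum\mdfivesum \fi
\fi
```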
> \engine: a read-only ASCII token identifying the engine: ex.: e-TeX;
This is a very generic name. Moreover, the established pattern is that
engines define \<name>version and \<name>revision, which can then be
used as markers for the specific engine. Note that upTeX doesn't do
that, and that makes identifying that engine more tricky already.
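For illustration, a stripped-down version of the usual detection
pattern is below. It is a sketch only: real code checks more engines
and more edge cases, and the messages are placeholders.

```tex
% Identify the engine by which \<name>version primitive is defined.
\ifdefined\XeTeXversion
  \message{Engine: XeTeX}
\else\ifdefined\luatexversion
  \message{Engine: LuaTeX}
\else\ifdefined\pdftexversion
  \message{Engine: pdfTeX}
\else
  \message{Engine: something else}
\fi\fi\fi
```

A new engine that defines its own \<name>version slots into this
pattern with one extra branch.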
> \apimajor: a positive or nul integer identifying a definite list of
> primitives provided;
>
> \apiminor: a positive or nul integer identifying a definite behavior
> (nature, parameters, output, errors) of the primitives;
>
> \apirevision: a positive or nul integer identifying an implementation
> of the (major, minor) combination---the major and minor identify a
> contract; the revision identifies a modification of the
> implementation of the contract including a correction because the
> implementation didn't actually provide the contract. I.e.: bug
> fixes, optimizations etc. without any contract modification. The
> contract is theoretical.
Again, this just feels like \<engine>version/\<engine>revision: see for
example how older pdftex.def used to check for available features.
(Standardisation of engine features over recent years means that this is
generally not needed.)
> \outfmtlist: a series of ASCII tokens identifying the output format
> supported by the engine. Ex.: DVI1.0 (traditional DVI), PDF1.3 etc.
> The default format shall be listed first. (Note: I plan, some day,
> to extend DVI.)
pdfTeX already defines \pdfoutput, which is 0 for DVI and 1 for PDF.
LuaTeX renames that to \outputmode but with the same numerical values.
The version of PDF written by pdfTeX/LuaTeX is set separately using (in
pdfTeX) \pdfminorversion/\pdfmajorversion, as this is really a separate
concept to whether PDF or DVI is in use.
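At the macro level the test then looks something like this sketch,
which assumes e-TeX's \ifdefined and treats an engine with neither
primitive as producing DVI (or XDV):

```tex
% Decide between PDF and DVI output using the existing conventions.
\ifdefined\outputmode
  \ifnum\outputmode>0 \message{PDF output}\else \message{DVI output}\fi
\else
  \ifdefined\pdfoutput
    \ifnum\pdfoutput>0 \message{PDF output}\else \message{DVI output}\fi
  \else
    \message{No output-mode primitive: DVI/XDV assumed}
  \fi
\fi
```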
There is very little use at the macro level for the DVI level. The PDF
level does have some impact on output features but only in a simple
'Sorry, not doable' sense. Note that XeTeX uses XDV, which is a version
of DVI dedicated to this engine. It's not necessary to test the DVI
version at the macro level: what's important is, for example, which
method to use to include images, and that relies on an engine test.
> \outfmtset: setting the output format, that shall be amongst the formats
> supported. If not, it returns an error and set the output format to
> the default one. Shall be set before \shipout and errors if used
> after output has started.
>
> \outfmt: a token identifying the current output format.
See above: data in the same format as other engines is strongly preferred.
I notice you don't mention a large number of the other utility
primitives. A full list of those assumed by LaTeX in an engine-neutral
sense is given in
https://www.latex-project.org/news/latex2e-news/ltnews31.pdf. See in
particular that \pdfpage(height|width) are included.
Joseph