Hello,

I received some questions, so I will try to explain this in a little more detail:

On Tue, May 30, 2006 at 05:56:49PM +0200, Heiko Oberdiek wrote:

> On Thu, Apr 27, 2006 at 08:22:51PM +0200, Heiko Oberdiek wrote:
>
> * I have written a wrapper file that merges all the ltnews??.tex
>   documents: ltnews.pdf contains all LaTeX News issues 1--17.

Now base-tds.zip contains
    doc/latex/base/ltnews.pdf
instead of
    doc/latex/base/ltnews01.pdf
    doc/latex/base/ltnews02.pdf
    ...
    doc/latex/base/ltnews17.pdf

(The source/latex/base directory is not changed,
the files ltnews01.tex, ltnews02.tex, ... remain there.)

> * To get smaller pdf file sizes I have experimented with
>   destinations:
>   * Unused destinations are removed.
>   * The destination names are renamed to shorter names.
>     I am using all bytes (0-255) except for
>     - 0: dangerous for PDF viewers that work with C strings (xpdf?)
>     - 13: carriage return needs quoting, otherwise it would be
>           normalized to 10 (line feed).
>     - 40, 41, 92 (parentheses, backslash): need quoting
>     - 255: avoids unicode marker at string start
>     I would be interested, if there are problems with the links/outlines.
>     (Apart from links that unhappily point to the baseline.)
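
As an illustration only (this is not the script actually used for
latex-tds), the byte selection quoted above could be sketched in Python
roughly as follows. The names EXCLUDED, ALPHABET and short_name are
hypothetical, and the enumeration order (shortest names first, then in
byte order) is an assumption:

```python
# Bytes left out in the list above: NUL, CR, '(', ')', backslash, and 255.
EXCLUDED = {0, 13, 40, 41, 92, 255}

# All remaining byte values, in ascending order.
ALPHABET = bytes(b for b in range(256) if b not in EXCLUDED)


def short_name(index: int) -> bytes:
    """Return the index-th name over ALPHABET, shortest names first.

    Assumed ordering: all 1-byte names, then all 2-byte names, and so on.
    """
    base = len(ALPHABET)
    length = 1
    # Skip past the blocks of shorter names.
    while index >= base ** length:
        index -= base ** length
        length += 1
    # Write the remaining index as a fixed-width number in base len(ALPHABET).
    digits = []
    for _ in range(length):
        index, digit = divmod(index, base)
        digits.append(ALPHABET[digit])
    return bytes(reversed(digits))
```

With six byte values excluded, 250 one-byte names are available before
two-byte names are needed.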

Consider an HTML page foo.html with a lot of internal links:
    <a name="section.1">...</a> or <a href="#codline.127">...</a>
Destinations are used in a similar way in a PDF file.
In addition, the names are stored in a data structure that
functions as a search tree.

The exact spelling of a destination name matters if someone wants
to use the destination externally: http://somewhere/foo.html#section.1
But in the case of the documents that I have generated for latex-tds,
these destination names are (arbitrarily) chosen by hyperref; they are
not a property of the document. You cannot rely on the destination name
for the implementation section being "section.4"; it can be
"section*.27" or "chapter.3" or ...

Thus I assumed that the spelling of the destination names is not
intended for external referencing. This allows two optimizations:
* Unused destinations are removed.
* The destination names must still be unique, but I can choose shorter
  names, e.g. instead of
      section.1, section.23, codline.127   (30 bytes)
  the shortest possible unique names are used:
      A, B, C                                (3 bytes)
  The HTML example from above would be transformed to:
      <a name="A">...</a> or <a href="#C">...</a>
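
The two optimizations can be sketched in Python as follows. This is only
a minimal illustration, not the procedure actually applied to the PDF
files: the function names short_names and optimize are hypothetical,
the names use the letters A-Z to match the example above, and it is
assumed that destination names are unique and that every link points to
an existing destination.

```python
from itertools import count, product
from string import ascii_uppercase


def short_names():
    """Yield A, B, ..., Z, AA, AB, ... (shortest names first)."""
    for length in count(1):
        for letters in product(ascii_uppercase, repeat=length):
            yield "".join(letters)


def optimize(destinations, links):
    """Drop unused destinations and rename the rest to short names.

    destinations: list of unique destination names, in document order
    links:        list of link targets (each must be a known destination)
    Returns (new_destinations, new_links).
    """
    used = set(links)
    names = short_names()
    # Only destinations that some link refers to get a new name.
    mapping = {d: next(names) for d in destinations if d in used}
    return list(mapping.values()), [mapping[target] for target in links]


dests = ["section.1", "section.23", "codline.127"]
links = ["codline.127", "section.1", "section.23"]
new_dests, new_links = optimize(dests, links)
print(new_dests, new_links)
print(sum(map(len, dests)), "->", sum(map(len, new_dests)), "bytes")
```

If no link pointed to section.23, that destination would be dropped and
codline.127 would be renamed to B instead of C.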

Example: the file size of source2e.pdf decreases by around 4.5 %.

Yours sincerely
  Heiko