LATEX-L Archives

Mailing list for the LaTeX3 project

LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

From: Joachim Schrod <[log in to unmask]>
Date: Fri, 18 Jul 2003 03:16:32 +0200
Sender: Mailing list for the LaTeX3 project <[log in to unmask]>
Reply-To: Mailing list for the LaTeX3 project <[log in to unmask]>

>>>>> "BV" == Boris Veytsman <[log in to unmask]> writes:
JS> From: Joachim Schrod <[log in to unmask]>

JS> I still like its markup syntax much more than
JS> XML for several reasons that are really off topic here.

BV> Are you sure about the off topic part? I certainly would like to
BV> know these reasons.

OK, you've got me. The real answer would be a paper on the inability
to do pure semantic markup in most situations, and about tag economy,
i.e., the ratio of markup to text in a document. It would also cover
the necessity either to have full support from editing environments
or to be able to enter and maintain markup manually in "standard"
editors. Perhaps a presentation at a TeX
conference... :-)

But then, I'll try a shorter answer:

 -- Good typesetting needs ``micromarkup'', things that TeX does with
    "~", "\ ", "\,", etc. One can imagine semantic markup for many of
    these items, but the amount of markup definitions and the
    cognitive load to use the correct markup would be too high for
    almost all authors. There are also cases where semantic markup
    gets difficult; DEK's examples of ~ usage in the TeXbook
    illustrate that well. Overall, I like the economy of input here:
    "~", "\,", or "--" are easier to read and don't disturb the
    input as much as &nbsp;, &spatium;, or &ndash;. Just imagine this
    email written with XML entities... ;-) IMHO, the length of a tag
    should be related to its importance: long tags for important
    things, short tags for unimportant but necessary stuff.

 -- Space handling in TeX is more "natural" than in XML. Not in
    macros, mind you, but in document text. As an example, I like to
    be able to use blank lines to separate paragraphs, as you can see
    in this email. This has been a markup tradition for decades, and
    it has proven to be useful. As another example, I also like that
    multiple blanks collapse to one; that they don't drives me mad in
    Word.

 -- I like being able to introduce non-standard TeX
    markup for special situations. E.g., in the TeX Directory
    Standard, we used markup like

      \begin{tdsSummary}
        bibtex/           \BibTeX{} input files
        . bib/            \BibTeX{} databases
        . . base/         base distribution (e.g., \path|xampl.bib|)
        . . misc/         single-file databases
        . <package>/      name of a package
      \end{tdsSummary}

    In the document source, the directory structure is much easier to
    read and to maintain than

      <tdsSummary>
        <entry>
          <directory>bibtex</directory>
          <description>\BibTeX{} input files</description>
        </entry>
       [...]
        <entry>
          <directory><subdir/><variable>package</variable></directory>
          <description>name of a package</description>
        </entry>
      </tdsSummary>

    In the current source, one spots errors immediately (e.g., how
    many <subdir/>s). That would be lost in XML markup. Of course, I'm
    biased since I designed the markup and wrote the macros. :-) SGML
    provided DATATAG for that, but this was thrown out to make parser
    writers' lives easier. Umpf, how many parser writers do we
    have, compared with the number of authors?

 -- TeX math markup is easier to write and to read than MathML.
    Mathematicians can also use its flexibility to introduce arbitrary
    new expressions in their "natural language math".

 -- Editor support for (La)TeX source input is better than for XML.
    Actually, this is a very delicate and difficult topic that would
    need a paper in itself. Please note that this reflects my current
    view on the state of available tools; there's nothing to prevent
    anybody from creating better XML editors -- they've been promised
    for years, but they haven't arrived. Admittedly, there are good
    XML document editors like FrameMaker, but they're not as
    platform-independent as I would need them. (For the record, I
    tried many editors, and currently use psgml-mode in XEmacs. But
    it's not as good as AUC-TeX.)

 -- An often cited reason to use XML markup instead of TeX is the
    better support for validation and transformation of XML documents.
    But IMHO this is overemphasized; it is not needed as often as we
    discuss it. Most XML documents that I've seen do not even conform
    to any schema, so one needs special transformation scripts for
    more document classes than one expects at the start of an XML
    project.

    This is from my practical experience in introducing XML in large
    multinational companies for mission-critical documents. There it
    was very hard to achieve agreement on structures even for formal
    documents like service level agreements -- the ad-hoc markup that
    may be used for informal documents is good for nothing. Hell,
    corporate users don't even use Word document styles when they're
    available and prefer to click on their bold and italics buttons
    or change the type size directly. That's the reality I'm doing
    business in.

    Of course, there are XML validators out there -- one only has to
    fight with the inability to express completely sensible document
    structures in DTDs or schemas. The resulting document structure
    definitions are either very complex or very generic. Style sheets
    for complex schemas are very hard to write; that's one of the
    reasons why we don't have good support for high-quality DocBook
    output. Validation of very generic structures doesn't bring
    enough advantages, because valid documents can still be nonsense.

    Last, but not least: If markup validation is really so important,
    one can and should spend effort to make a TeX validator available.
    There are several TeX parser implementations out there -- I wrote
    one myself in two weeks. (Btw, presented at the TUG conference in
    Santa Barbara, years ago.) They can be utilized with reasonable
    effort.

 -- Actually, IMO the main disadvantage of TeX markup is the shortage
    of skilled people in the job market to implement that markup.
    That makes any manager worth his salary shy away from TeX. For me,
    that's the main reason to use XML: I find more people with the
    needed skills.
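
As an illustration of the micromarkup, space-handling, and math points
above, here is a minimal LaTeX sketch. The sentences and the formula
are invented for this example; they are not taken from the TDS or any
other real document.

      \documentclass{article}
      \begin{document}
      % "~" ties "Chapter" to its number; "--" gives an en dash.
      See Chapter~3, pages 12--15.
      % "\ " gives a normal interword space after the abbreviation;
      % "\," puts a thin space between the number and its unit.
      Smith et al.\ describe this; the margin is 3\,cm wide.

      A blank line started this paragraph, and several     blanks
      in the input are typeset as a single interword space.

      % Display math, written close to how one would read it aloud.
      \[ \frac{a + b}{2} \geq \sqrt{ab} \]
      \end{document}

Written with entities such as &nbsp; and &ndash; and with explicit
paragraph tags, the same text would need noticeably more markup per
word, which is the tag-economy argument made above.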

But it's late and I should stop here. I hope you got an impression of
my viewpoint. As I've written above, a full elaboration is beyond the
scope of this email discussion.

Cheers,
        Joachim

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Joachim Schrod                                  Email: [log in to unmask]
Roedermark, Germany

        ``How do we persuade new users that spreading fonts across the page
        like peanut butter across hot toast is not necessarily the route to
        typographic excellence?''                       -- Peter Flynn
