## LATEX-L@LISTSERV.UNI-HEIDELBERG.DE


Subject: XML vs. (La)TeX markup (was: XML, UTF-8 and TeX engines)
From: Joachim Schrod
Date: Fri, 18 Jul 2003 03:16:32 +0200

>>>>> "BV" == Boris Veytsman writes:

JS> From: Joachim Schrod
JS> I still like its markup syntax much more than
JS> XML for several reasons that are really off topic here.

BV> Are you sure about the off topic part? I certainly would like to
BV> know these reasons.

OK, you've got me.

The real answer would be a paper on the inability to do pure semantic
markup in most situations, and about tag economy, i.e., the ratio of
markup to text in a document. It would also cover the necessity to
have either full support by editing environments, or be able to enter
and maintain markup manually in "standard" editors. Perhaps a
presentation at a TeX conference... :-)

But then, I'll try a shorter answer:

 -- Good typesetting needs "micromarkup", things that TeX does with
    "~", "\ ", "\,", etc. One can imagine semantic markup for many of
    these items, but the number of markup definitions and the
    cognitive load of choosing the correct markup would be too high
    for almost all authors. There are also cases where semantic
    markup gets difficult; DEK's examples of ~ usage in the TeXbook
    illustrate that well. Overall, I like the economy of input here:
    "~", "\,", or "--" are easier to read and don't disturb the
    input as much as &nbsp;, &spatium;, or &ndash;. Just imagine this
    email written with XML entities... ;-) IMHO, the length of a tag
    should be related to its importance: long tags for important
    things, short tags for unimportant but necessary stuff.

 -- Space handling in TeX is more "natural" than in XML. Not in
    macros, mind you, but in document text. As an example, I like to
    be able to use blank lines to separate paragraphs, as you can see
    in this email.
    This has been a markup tradition for decades, and it has proven
    to be useful. As another example, I also like that multiple
    blanks collapse to one; that they don't in Word drives me mad.

 -- I like the possibility to be able to introduce non-standard TeX
    markup for special situations. E.g., in the TeX Directory
    Standard, we used markup like

      \begin{tdsSummary}
        bibtex/          \BibTeX{} input files
        . bib/           \BibTeX{} databases
        . . base/        base distribution (e.g., \path|xampl.bib|)
        . . misc/        single-file databases
        . /              name of a package
      \end{tdsSummary}

    In the document source, the directory structure is much easier to
    read and to maintain than

          bibtex
          \BibTeX{} input files
          [...]
          package
          name of a package

    In the current source, one spots errors immediately (e.g., how
    many s). That would be lost in XML markup. Of course, I'm
    biased since I designed the markup and wrote the macros. :-) SGML
    provided DATATAG for that, but this was thrown out to make the
    parser writers' lives easier. Umpf, how many parser writers do we
    have, compared with the number of authors?

 -- TeX math markup is easier to write and to read than MathML.
    Mathematicians can also use its flexibility to introduce arbitrary
    new expressions in their "natural language math".

 -- Editor support for (La)TeX source input is better than for XML.
    Actually, this is a very delicate and difficult topic that would
    need a paper in itself. Please note that this reflects my current
    view on the state of available tools; there's nothing to prevent
    anybody creating better XML editors -- they have been promised
    for years, but they don't arrive. Actually, there are good XML
    document editors like FrameMaker; but they're not as
    platform-independent as I would need them to be.
    (For the record, I tried many editors, and currently use
    psgml-mode in XEmacs. But it's not as good as AUC-TeX.)

 -- An often cited reason to use XML markup instead of TeX is the
    better support for validation and transformation of XML documents.
    But IMHO this is overemphasized; it is not needed as often as we
    discuss it. Most XML documents that I've seen do not even
    conform to a schema, so one needs special transformation
    scripts for more document classes than one expects at the start
    of an XML project.

    This is from my practical experience in introducing XML in
    large multinational companies for mission-critical documents.
    There it was very hard even to achieve agreement on structures
    for formal documents like service level agreements -- the ad-hoc
    markup that may be used for informal documents is good for
    nothing. Hell, corporate users don't even use Word document
    styles when they're available and prefer to click on their bold
    and italics buttons or change the type size directly. That's the
    reality I'm doing business in.

    Of course, there are XML validators out there -- one only has to
    fight with the inability to express completely sensible document
    structures in DTDs or schemas. The resulting document structure
    definitions are either very complex or very generic. Style sheets
    for complex schemas are very hard to write; e.g., that's one of
    the reasons why we don't have good support for high-quality
    DocBook output. Validation of very generic structures doesn't
    bring enough advantages, because valid documents can still be
    nonsense.

    Last, but not least: If markup validation is really so important,
    one can and should spend effort to make a TeX validator available.
    There are several TeX parser implementations out there -- I wrote
    one myself in two weeks. (Btw, presented at the TUG conference in
    Santa Barbara, years ago.)
    They can be utilized with reasonable effort.

 -- Actually, IMO the main disadvantage of TeX markup is the shortage
    of skilled people in the job market to implement that markup.
    That makes any manager worth his salary shy away from TeX. For me,
    that's the main reason to use XML: I find more people with the
    needed skills.

But it's late and I should stop here. I hope you got an impression of
my viewpoint. As I've written above, a full elaboration is beyond the
scope of this email discussion.

Cheers,
        Joachim

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Joachim Schrod
Roedermark, Germany

        "How do we persuade new users that spreading fonts across the page
        like peanut butter across hot toast is not necessarily the route to
        typographic excellence?" -- Peter Flynn
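
For readers less familiar with the micro-markup mentioned in the message ("~", "\ ", "\,", "--"), the following is a small illustrative sketch of how those short tokens typically appear in LaTeX source. The sentences are invented for illustration and are not taken from the message or from the TeXbook.

```latex
\documentclass{article}
\begin{document}

% "~" is a tie: an interword space at which no line break may occur.
See Chapter~3 and Figure~2, as noted by Donald~E. Knuth.

% "\ " forces a normal interword space after a period that does not
% end a sentence.
The macros (cf.\ the previous section) are short.

% "\," inserts a thin space, here between a number and its unit.
The margin is 25\,mm wide.

% "--" produces an en dash, as used for number ranges.
See pages 12--15.

\end{document}
```

Each token is one or two characters long, so the surrounding sentence stays readable in the source, which is the input economy the message argues for.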