Hans Aberg writes:

> >...
> >I would also like to see somebody translate it to TEI and then compare
> >the HTML and LaTeX formattings obtained chez Rahtz from TEI with the
> >native GELLMU formattings.

Actually, I am more interested in getting a copy for Info trees than in
a TEI copy.  (And there is now an SGML version of Texinfo thanks to
Daniele Giacomini that formats to Texinfo.)  I guess I thought that TEI
fans might bite.  I also believe that a DocBook version would prove
useful inasmuch as DocBook is used by the Linux Documentation Project.

> If you are in the need of various translations, have you tried using Flex
> (lexical analyzer generator) and Bison (parser generator, or
> compiler-compiler), see

Are you saying that it's easier to code translations from XML using lex
and yacc descendants rather than using standard XML tools such as
sgmlspl, jade, or xt?  I find that hard to believe.  (Of course, the
situation before 1996 was different.)

[snip]

> -- I use them together with C++, which is convenient as the latter has
> standard string classes.

Although I've written in C, I've never gotten into C++.  Are there good
regular expression libraries for C++?

> One approach is to parse objects into something like the DOM (Document
> Object Model, http://www.w3.org/), and then onto that hook a program that
> can translate into several different formats.

Of course, sgmlspl, jade, xt, and other standard sgml/xml tools provide
good frameworks for translating into as many different formats as one
likes by writing, respectively, Perl, DSSSL, and XSLT.  (Possibly also
it would be viable to use David Carlisle's xmltex followed by Eitan
Gurari's tex4ht in which case one writes TeX.)  The power of sgmlspl
(though not the speed) can match that of any method except possibly when
one wants to descend into CDATA segments.
But then if one finds oneself tempted^{1} to do that (as one might, for
example, in typesetting with TeX or LaTeX the name of TeX or LaTeX or
even the ASCII character '~' from an XML document type that does not
provide these things as empties^{2}), one should instead customize one's
XML document type.

                                   -- Bill

Notes:

1.  There is one reasonable situation where descent into CDATA *should*
    take place: math mode contents need to be thoroughly parsed in
    translation to MathML from a document type that mathematical authors
    will find tolerable.  But there is no issue of that type in
    connection with

        http://math.albany.edu:8010/glf/lfaq.xml

    although, alas, one will find <tex/>, <latex/>, and <tld/>.  I
    wonder how some of these things would survive a double translation

        gellmu/article ---(hypothetical)---> TEI ----> LaTeX .

2.  The default "article" document type for _regular_ GELLMU provides
    three character names for each of the 33 non-alphanumeric but
    printable ASCII characters.  Each of those is at risk for some
    conceivable translation target.  But an author may simply use one of
    these characters for itself when it is safe for both LaTeX and HTML.
    And, for example, by default the syntactic translator understands
    things like "\$" and "\{".  If the syntactic translator's new
    internal verbatim (which becomes <verblist>, a list-like thing) is
    used (by calling the front gellmu-verblist for gellmu-trans), then
    32 of these 33 names are auto-generated (';' is omitted) from
    literal verbatim.  Something almost identical happens to literal
    inline material like |*~$\| if "manmac" mode is enabled.
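
The point about empties can be illustrated with a minimal sketch.  It is
not GELLMU code and does not use sgmlspl, jade, or xt; it assumes
Python's standard xml.etree.ElementTree parser and a hypothetical <p>
wrapper element.  The element names <tex/>, <latex/>, and <tld/> come
from the message above, while the renderings chosen for them, and the
names EMPTIES, LATEX_SPECIALS, escape_text, and render, are assumptions
made for illustration.  Each target looks the empties up in its own
table, so the translator never has to inspect character data to guess
what the author meant.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical per-target renderings of the three empty elements.
EMPTIES = {
    "latex": {"tex": r"\TeX{}", "latex": r"\LaTeX{}",
              "tld": r"\textasciitilde{}"},
    "html": {"tex": "TeX", "latex": "LaTeX", "tld": "~"},
}

# Characters that are not safe as literal text in LaTeX (assumed set).
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}", "{": r"\{", "}": r"\}", "$": r"\$",
    "&": r"\&", "#": r"\#", "%": r"\%", "_": r"\_",
    "^": r"\textasciicircum{}", "~": r"\textasciitilde{}",
}


def escape_text(text, target):
    """Make plain character data safe for the given target."""
    if target == "latex":
        # Character by character, so nothing is escaped twice.
        return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)
    return escape(text, quote=False)


def render(elem, target):
    """Walk one parsed tree and emit text for the given target."""
    if elem.tag in EMPTIES[target]:
        return EMPTIES[target][elem.tag]
    parts = [escape_text(elem.text or "", target)]
    for child in elem:
        parts.append(render(child, target))
        # Text following a child belongs to the parent's content.
        parts.append(escape_text(child.tail or "", target))
    return "".join(parts)


SAMPLE = "<p>Typeset <latex/> and <tex/>; home is <tld/>user, $5 &amp; up.</p>"

if __name__ == "__main__":
    tree = ET.fromstring(SAMPLE)
    for target in ("latex", "html"):
        print(target + ": " + render(tree, target))
```

The same parsed tree feeds both targets, which is the "parse once,
translate to several formats" arrangement discussed above; adding a
third target under these assumptions means adding one more table and
one more branch in escape_text.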