LISTSERV - LATEX-L Archives - LISTSERV.UNI-HEIDELBERG.DE

LATEX-L Archives

Mailing list for the LaTeX3 project

LATEX-L@LISTSERV.UNI-HEIDELBERG.DE

Subject:	Re: Invitation for discussion: My suggestion for a LaTeX3 syntax
From:	Martin Hensel <[log in to unmask]>
Reply To:	Mailing list for the LaTeX3 project <[log in to unmask]>
Date:	Mon, 7 Jul 2003 16:31:16 +0100

> These paragraphs made quite clear that the author didn't know a
> thing about TeX constraints (and is erroneous about space handling
> in HTML and XML as well). Obviously somebody who is new to
> technical details of existing markup languages.
>
> So the probability to find something worthwile in the rest of the
> text was not high enough to spend the time reading further.

Could you please explain to me where I'm wrong about HTML and XML?

I wrote:
,-----[ syntax.pdf ]-----
| In languages like HTML, XML, and most programming languages spaces
| are treated as following: Line breaks are considered as spaces,
| two or more spaces are considered as a single space.
`-----

The HTML specification says:
,-----[ http://www.w3.org/TR/html401/struct/text.html ]-----
| only the following characters are defined as white space
| characters:
| - ASCII space (&#x0020;)
| - ASCII tab (&#x0009;)
| - ASCII form feed (&#x000C;)
| - Zero-width space (&#x200B;)
| Line breaks are also white space characters.
: ...
| For all HTML elements except PRE, sequences of white space
| separate "words" (we use the term "word" here to mean "sequences
| of non-white space characters"). When formatting text, user agents
| should identify these words and lay them out according to the
| conventions of the particular written language (script) and target
| medium.
: ...
| For example, in Latin scripts, inter-word space is typically
| rendered as an ASCII space (&#x0020;),
`-----
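
As an illustration of the excerpt above (a minimal sketch, not taken from the specification or from any browser), the following Python splits text into "words" on runs of the listed white space characters and joins them with a single ASCII space. The function names html_words and render_inter_word_spaces are hypothetical, and joining with exactly one space is an assumption based on the "typically rendered as an ASCII space" remark for Latin scripts.

```python
import re

# White space characters listed in the HTML 4.01 excerpt:
# space, tab, form feed, zero-width space, plus line breaks (CR, LF).
_WS_RUN = re.compile(r"[ \t\f\u200B\r\n]+")


def html_words(text):
    """Split text into "words": sequences of non-white-space characters."""
    return [w for w in _WS_RUN.split(text) if w]


def render_inter_word_spaces(text):
    """Join the words with one ASCII space each (assumed Latin-script rendering)."""
    return " ".join(html_words(text))


if __name__ == "__main__":
    print(render_inter_word_spaces("two  spaces\nand a\tline\r\nbreak"))
```

Under these assumptions, a line break and a run of several spaces both end up as one inter-word space, which is the behaviour described in syntax.pdf.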

The XML specification says:
,-----[ http://www.w3.org/TR/REC-xml ]-----
| S (white space) consists of one or more space (#x20) characters,
| carriage returns, line feeds, or tabs.
|
| White Space
|    [3]    S    ::=    (#x20 | #x9 | #xD | #xA)+
`-----
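
Production [3] can be written directly as a regular expression. The sketch below is only an illustration: the names XML_S and is_xml_white_space are hypothetical, and the code checks nothing beyond whether a string matches S.

```python
import re

# Production [3] from the XML excerpt: S ::= (#x20 | #x9 | #xD | #xA)+
XML_S = re.compile(r"[\x20\x09\x0D\x0A]+")


def is_xml_white_space(text):
    """Return True if the whole string matches production S."""
    return XML_S.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in [" \t\r\n", "", "\f", "\u200B"]:
        print(repr(sample), is_xml_white_space(sample))
```

Compared with the HTML list quoted earlier, form feed and zero-width space are not part of S, while line breaks (carriage return, line feed) are.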

Martin
