 > > who will? the user groups? for many languages there isn't a user group
 >
 > There are many interested experts around for those languages without a
 > user group. One of the gathering places is the Omega mailing list.

i know that, but that doesn't mean that any of those groups, or the user
groups for that matter, are necessarily qualified to decide a standard.

just to pick a random example, take my varioref package: i have language
support in there from users and this gets changed every now and then because
i get claims that such and such is not the right phrasing. how should i decide
if people from a single country claim the wording doesn't sound correct?

and changing the default midway (as i did in case of varioref several times)
is really bad since it makes old documents invalid. but i had to change
because it turned out that one or the other phrasing was indeed incorrect.

you can argue that a standard defined by those people interested is better
than none. but it is also true that if at all possible you should stick with a
default once decided. so the problem is to find out when you are likely to
have enough data to make a decision.

so to come back to inputencs (which the above really was about):

 - right now LaTeX by default lets 8bit chars pass if inputenc is not
   loaded. this is an unfortunate fact of life: no package can change that,
   only a kernel modification could, and within 2e there will be no such
   kernel modification, so we have to live with that for the moment.

 - but i consider this really problematic because the upper part of 8bit is
   unknown territory and i do not subscribe to Thierry's approach of using
   straight 8bit plus a T1 encoded font and hoping all works out well. it is
   true that for certain languages (including Thierry's and my own) it does
   work if i'm on the right kind of computer, but for others it does not, and
   it certainly wouldn't work if the font encoding mechanisms were extended
   to allow switching encodings according to font availability as suggested.

 - one can summarize the current situation as follows: the kernel defines a
   default which is "pass whatever is coming straight to the font encoding".
   that requires the input encoding used and the font encoding to be the
   same, and it limits the use of fonts very drastically. it is a straight
   extension of what Don did with 7bit, with the slight difference that for
   7bit most keyboard encodings are identical.
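
to make the points above concrete, here is a minimal sketch. the choice of
latin1 is only an assumption for illustration; any encoding known to inputenc
would serve. with the inputenc line the 8bit character is translated into an
encoding-independent accent command; without it, under the kernel behaviour
described above, the byte goes straight to the font and only comes out right
if the file's encoding and the font encoding happen to agree in that slot:

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
% assumption: this file is saved as ISO 8859-1 (latin1).
% inputenc maps each 8bit byte to an encoding-independent command
% (the byte for e-acute becomes \'e), so the result no longer depends
% on the font encoding matching the keyboard encoding.
\usepackage[latin1]{inputenc}
\begin{document}
caf\'e and café should come out the same.
\end{document}
```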

I would propose that a follow up kernel (call it ltx3 or whatever, eg a
consolidated version emerging one day from the currently developed x...
packages) would by default make the upper half an error if no input encoding
is specified. Sorry Thierry :-) but you shouldn't feel that bad about it:
a) i'm known to change my mind and b) processors are so fast these days that
you can really work without problems with something like inputenc; you will
not notice it.

in that case you get access to 8bit characters only by specifying an
input/keyboard encoding, but at the same time you are assured that the
document contains all the necessary information to actually process it
correctly elsewhere, and you do not have the potential problem, reported by
Éric, that users do not notice that half their letters (ie those with accents)
vanished. the letters wouldn't vanish; they would produce error messages.

now to provide default input encodings depending on language would help a
certain number of people to be able to leave out *one* line in the preamble of
the document (and if you are lucky with your choice, the larger part of the
LaTeX users) but at the same time would mean that people who naively just use
any key on their keyboard, but have a keyboard incompatible with the default,
would run into exactly the same problem Éric reported: they would now get
wrong output without noticing. so then, perhaps not Éric but somebody else
would rightly moan about such stupid defaults which make it likely that people
get incorrect documents. so in my opinion there should be no default for input
encodings other than the one which is currently called "ascii" in inputenc and
which makes any 8bit an error.
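
as a small sketch of what that default would amount to, one can already ask
for it explicitly today by loading inputenc with the ascii option (the
document body is of course just made up for illustration):

```latex
\documentclass{article}
% "ascii" declares that only 7bit input is expected: any character
% from the upper half then raises an inputenc error instead of
% silently producing wrong output or vanishing.
\usepackage[ascii]{inputenc}
\begin{document}
na\"ive % fine, this is plain 7bit input
% a raw 8bit character in the body would stop the run with an error
\end{document}
```

a document that needs 8bit input then has to say so by naming its encoding in
place of ascii, which is exactly the one preamble line discussed above.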

the above is only about input encodings; as I said earlier the situation for
output encodings is different: there are already defaults in current Babel,
and in the implementation i'm working on they will get more generalised, trying
to take into account the problems discussed concerning the use or non-use of
certain encodings for certain fonts.

the main problem i see with defaults for output encodings is that for languages
like French or German there isn't really a good default because you will
always have a large user group which is dead against one or the other, eg T1
vs OT1; for other languages it is simpler. however this is more a political
than a technical question, ie who doesn't like THEM the day they make X for
language Y the default ... :-)

frank