[imp] 2.2.4-cvs forces charset encoding in HTTP headers

Alain Fauconnet alain@cscoms.net
Mon, 30 Oct 2000 11:11:15 +0700


On Sun, Oct 29, 2000 at 10:35:11PM -0500, Rich Lafferty wrote:
> On Mon, Oct 30, 2000 at 08:49:56AM +0700, Alain Fauconnet (alain@cscoms.net) wrote:
> > Hello,
> > 
> > We switched from a 2.2.0-pre13 version with security fixes hacked
> > in to the real 2.2.4-cvs last Friday.
> > 
> > I immediately got complaints that people here were unable to view
> > messages with Thai characters. The problem is that 2.2.4-cvs sends
> > a "charset=ISO-8859-1" in the HTTP headers when configured to use
> > US English. This makes Thai characters appear as West European
> > accented characters.
> 
> Well, since Thai isn't English :-), there'd be no way for it to get it
> right in the first place (since a browser won't know how to display
> mixed Thai and US English). If I'm mistaken and one *can* display the
> ASCII character set in the Thai locale,

I'm afraid you are :-). Thai characters occupy only the high portion
(>127) of the character set. The lower part uses the normal ASCII
characters.
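
A minimal Python sketch of that layout, assuming the Thai encoding in
question is TIS-620 (this thread does not name it). The sample word is
only an illustration. It shows that the ASCII part survives a forced
ISO-8859-1 interpretation while the Thai part turns into accented
characters:

```python
# Sample text: ASCII "Hello " followed by a Thai word, written with
# escapes so this snippet stays plain ASCII.
thai = "\u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35"
text = "Hello " + thai

# Encode with TIS-620 (an assumption about the encoding in use here).
raw = text.encode("tis_620")

# The ASCII part maps to byte values below 128, the Thai part above 127.
ascii_bytes = raw[:6]
thai_bytes = raw[6:]

# What a browser shows when the server forces charset=ISO-8859-1.
forced_latin1 = raw.decode("iso-8859-1")

# What it shows when the bytes are read as TIS-620 again.
as_thai = raw.decode("tis_620")

print(all(b < 128 for b in ascii_bytes))   # ASCII stays in the low half
print(all(b > 127 for b in thai_bytes))    # Thai lives in the high half
print(forced_latin1 == text)               # False: Thai part is mangled
print(as_thai == text)                     # True: round trip is clean
```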

> just create a fake Thai locale
> with the English localization data but with the correct character set.

But that is the right approach if what I suggest below really is not
feasible.

>  
> > Could we have something like a "Generic" locale setting that uses US
> > English messages but does not force any charset encoding?
> 
> No, because then the browser decides what locale to use, which is even
> more unreliable. In particular, if you get it to work on a Windows
> machine, then it won't work on a Mac, and vice versa. That's why that
> change (outputting explicit charset headers) was made.

Well... letting end users decide which character encoding to use when
there is no perfect solution doesn't look that bad to me, at least for
encodings that still keep plain old ASCII in the lower (<128) part of
the character map.

Here, people configure their browsers to use the Thai font maps so
that Thai characters are displayed automatically for the upper
character codes. They then rely on servers not forcing a character
map, which seems to be the case for most servers, at least for all the
zillion webmail services they use.

That's why I still think that this "use US English messages, don't
mess with encoding" locale setting would be a good idea.
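
To make the idea concrete, here is a rough Python sketch of the two
options discussed in this thread. It is not IMP code. The locale names
"generic" and "en_TH", the table, and the function are all hypothetical,
and TIS-620 is my assumption for the Thai charset:

```python
# Hypothetical locale table: locale name -> charset to announce, or
# None to announce no charset and leave the choice to the browser.
LOCALE_CHARSETS = {
    "en_US": "ISO-8859-1",   # current behaviour described above
    "en_TH": "TIS-620",      # "fake Thai" locale: English text, Thai charset
    "generic": None,         # English text, no charset forced
}


def content_type_header(locale):
    """Build the Content-Type header line for a locale.

    Locales mapped to None, and locales missing from the table,
    get a header without any charset parameter.
    """
    charset = LOCALE_CHARSETS.get(locale)
    if charset is None:
        return "Content-Type: text/html"
    return "Content-Type: text/html; charset=" + charset


for name in ("en_US", "en_TH", "generic"):
    print(content_type_header(name))
```

With the "generic" entry the header carries no charset at all, so the
encoding configured in the browser applies.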

Greets,
_Alain_
-- 
Alain FAUCONNET
Sr. System Administrator
CS Internet Co. Ltd. (Shin Corp) - Thailand