[imp] 2.2.4-cvs forces charset encoding in HTTP headers

Rich Lafferty rich@horde.org
Sun, 29 Oct 2000 23:36:04 -0500


On Mon, Oct 30, 2000 at 11:11:15AM +0700, Alain Fauconnet (alain@cscoms.net) wrote:
> On Sun, Oct 29, 2000 at 10:35:11PM -0500, Rich Lafferty wrote:
> > 
> > Well, since Thai isn't English :-), there'd be no way for it to get it
> > right in the first place (since a browser won't know how to display
> > mixed Thai and US English). If I'm mistaken and one *can* display the
> > ASCII character set in the Thai locale,
> 
> I'm afraid you are :-), actually the Thai characters occupy only the
> high portion (>127) of the character set. For the lower part, normal
> ASCII characters are used.

Oh, okay. :-) I probably should have figured that out from it being an
8859 subset, but I somehow stored Thai in with CJKV in terms of
complexity.
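
To make the layout concrete, here is a minimal Python sketch, assuming
the Thai encoding under discussion is TIS-620 (the thread does not name
it) and using Python's built-in "tis_620" codec. The sample string is
made up for illustration.

```python
# Sketch only: assumes the Thai charset in question is TIS-620.
# The sample text is hypothetical; Thai letters are written as
# escapes so this file itself stays plain ASCII.
sample = "Inbox: \u0e2a\u0e27\u0e31\u0e2a\u0e14\u0e35"

encoded = sample.encode("tis_620")

# Split the encoded bytes into the low (ASCII) and high (Thai) halves.
low_bytes = bytes(b for b in encoded if b < 128)
high_bytes = bytes(b for b in encoded if b >= 128)

# The ASCII part is byte-for-byte what US-ASCII would produce.
ascii_is_unchanged = low_bytes == "Inbox: ".encode("ascii")

print(low_bytes)
print(high_bytes.hex())
print(ascii_is_unchanged)
```

The English label survives unchanged in the low half, and every Thai
character lands on a single byte above 127.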
 
> > No, because then the browser decides what locale to use, which is even
> > more unreliable. In particular, if you get it to work on a Windows
> > machine, then it won't work on a Mac, and vice versa. That's why that
> > change (outputting explicit charset headers) was made.
> 
> Well... letting end-users eventually decide which character encoding
> they want to use in case there is no perfect solution doesn't look
> that bad to me, at least for encodings which still have the plain old
> ASCII in the lower (<128) part of the character map.

Ah, but the users aren't really the ones deciding; I can't tell my Mac
to use the Windows character set by default. If you think that not
displaying Thai causes support issues, you should hear what happens in
a 50/50 mixed Mac/PC office when fixing for one breaks the other. :-)

> That's why I still think that this "use US english messages, don't mess
> with encoding" locale setting would be a good idea.

The previous configuration, in which it was left up to the browser,
broke in particularly ungraceful ways. That's why it was fixed in
2.2.3. IMP's got a remarkably thorough localization facility, but it
doesn't support more than one language at once, because browsers don't
either. What happens when someone who has their Thai-configured
Netscape and US-English-configured IMP receives a message with two
body parts, one in ISO-8859-2 and the other in EUC-KR? Other than
doing a separate document (read: separate frame) for each part (Ick!),
you have to decide on one character set. We choose to use the
character set of the language that the user asked IMP to use at the
login screen or in preferences.
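
As a rough illustration of that choice, here is a Python sketch; it is
not IMP's code (IMP is written in PHP), and the table LANG_CHARSETS and
the helpers content_type_header and render_part are hypothetical names
invented for this example. It assumes one charset per language, sent
explicitly in the HTTP header, with each body part forced into that
charset.

```python
# Hypothetical language-to-charset table; IMP's real locale data may differ.
LANG_CHARSETS = {
    "en_US": "ISO-8859-1",
    "pl_PL": "ISO-8859-2",
    "ko_KR": "EUC-KR",
    "th_TH": "TIS-620",
}


def content_type_header(lang, default="ISO-8859-1"):
    """Build an explicit Content-Type header for the user's language."""
    charset = LANG_CHARSETS.get(lang, default)
    return "Content-Type: text/html; charset=" + charset


def render_part(raw, part_charset, page_charset):
    """Force one body part into the page's single charset.

    Characters the page charset cannot represent become '?'.
    This lossy step is an assumption of the sketch, not IMP behaviour.
    """
    text = raw.decode(part_charset, errors="replace")
    return text.encode(page_charset, errors="replace")


# A user who picked US English at login gets one charset for the page.
header = content_type_header("en_US")

# A hypothetical ISO-8859-2 body part shown on that ISO-8859-1 page.
part = "\u0141\u00f3d\u017a".encode("iso8859_2")
shown = render_part(part, "ISO-8859-2", "ISO-8859-1")

print(header)
print(shown)
```

Whatever the page charset is, a part written in some other charset can
only be shown to the extent its characters exist in the page's charset,
which is why one charset has to be picked per page.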

On the server side, IMP already allows you to set up a specific locale
for your unique situation, so it's not particularly productive for IMP
to special-case your combination and not the millions of other
multiple-language combinations that users might want without setting
up a locale for them. Of course, the wonders of open source are
such that you can also fix it in the way that you described at your
site -- and it'd be relatively easy to maintain that change across
upgrades, I think, too. 

  -Rich

-- 
------------------------------ Rich Lafferty ---------------------------
 Sysadmin/Programmer, Instructional and Information Technology Services
   Concordia University, Montreal, QC                 (514) 848-7625
------------------------- rich@alcor.concordia.ca ----------------------