[horde] enable UTF-8 then cause these problems

Jan Schneider jan at horde.org
Tue Jan 21 02:46:17 PST 2003


Quoting David Chang <david at thbuo.gov.tw>:

> Quoting Jan Schneider <jan at horde.org>:
> 
> > Quoting David Chang <david at thbuo.gov.tw>:
> >
> > > Hi all:
> > > Today I updated Horde via CVS and made these changes:
> > > 1. Set horde/config/nls.php
> > >    $nls['conf']['enable_utf'] =  true ;
> > > 2. Changed php.ini like this:
> > >    mbstring.language = Chinese
> > >    mbstring.internal_encoding = UTF-8
> > >    mbstring.http_input = auto
> > >    mbstring.http_output = UTF-8
> > >    mbstring.func_overload = 7
> >
> > These ini settings aren't necessary anymore if you enable UTF-8 support.
> >
> > > 3. Restarted Apache + PHP 4.3
> > >
> > > After logging into Horde, some places display Chinese correctly, but
> > > some do not. I thought maybe mbstring is not used in some places, which
> > > causes the problem.
> > > They are:
> > >
> > >     1.imp->option->Personal Information.
> > >       (Original data stored in Mysql with BIG-5 format)
> > >     2.nag/mnemo/kronolith's category name.
> > >       (Original data stored in Mysql with BIG-5 format)
> > >     3.Calendar's name,Notepad's name & Task List's name
> > >       (Original data stored in Mysql with BIG-5 format)
> >
> > These are stored in the preferences. Should be fixed now.
> 
> After updating via CVS this morning:
> 1. The charset problem is fixed, but the original personal information
> was lost.
> 2. Fixed.
> 3. Calendar names, Notepad names and Task List names are still in the
> wrong format.
> 
> Meanwhile, some other problems were found:
> a. The original filter rules in IMP were lost.

All of these problems (1-3 and a) stem from the fact that some data is
stored serialized.

This is a general "problem", so I am asking all developers: as mentioned
earlier, it seems to be safe to store "raw" serialized data in the storage
backends, as PHP doesn't seem to have any problems unserializing even
multibyte strings.

Storing the serialized data unconverted in the backend, rather than
converting it to the backend's internal charset before storing it, has
advantages as well as disadvantages:
Advantage: Any data can be stored, no matter what charset the current user
has.
Disadvantage: The data breaks as soon as the user's charset changes, for
example when the user switches the translation or the admin turns UTF-8
support on.
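
The trade-off above can be sketched in a few lines. This is only an
illustration in Python, not Horde code: the helper names and the sample
calendar name are made up, and the only PHP fact assumed is that serialize()
writes a string as s:<byte length>:"<bytes>";.

```python
def php_serialize_str(value: str, charset: str) -> bytes:
    """Mimic PHP's serialize() for one string: s:<byte length>:"<bytes>";"""
    raw = value.encode(charset)
    return b's:%d:"%s";' % (len(raw), raw)


def php_unserialize_str(blob: bytes, charset: str) -> str:
    """Read the payload back and interpret its bytes in the given charset."""
    head, rest = blob.split(b':"', 1)   # head is b's:<n>'
    length = int(head[2:])              # declared length counts bytes, not characters
    return rest[:length].decode(charset, errors="replace")


name = "行事曆"                                      # hypothetical calendar name
stored = php_serialize_str(name, "big5")             # saved while the user's charset was Big5

same_charset = php_unserialize_str(stored, "big5")   # charset unchanged: round-trips
after_switch = php_unserialize_str(stored, "utf-8")  # charset switched: bytes are misread

print(same_charset == name)   # True
print(after_switch == name)   # False
```

Because the length prefix counts bytes, the unserializing step itself succeeds
in both cases; what goes wrong after the switch is only the interpretation of
the stored bytes.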

What do we want to do?

> b. Some mail subjects and bodies are still displayed in the wrong format
> in IMP, even though they were sent in Chinese via Yahoo webmail. Please see
> http://www.thbuo.gov.tw/~txg16/tmp/920120.jpg ; I have also attached one
> of these mails.

These messages (at least the one you attached) have invalid message headers.
Email headers containing non-7bit characters have to be MIME encoded before
being passed to the MTA. If they aren't, the result is unpredictable. My IMAP
server (Cyrus), for example, replaces all non-7bit characters with "X".
You were able to read these headers in the past because you happened to have
the same charset as the people sending you the messages.
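
For illustration, this is roughly what a correctly MIME encoded header looks
like. The sketch uses Python's standard email.header module rather than
anything from Horde or IMP, and the subject text is a made-up example.

```python
from email.header import Header, decode_header

subject = "測試郵件"  # hypothetical Chinese subject line

# Encode the header as an RFC 2047 "encoded word": the result is 7-bit
# ASCII and carries its own charset label.
encoded = Header(subject, "utf-8").encode()
print(encoded)

# Any receiver can decode it, whatever its own local charset is.
raw, charset = decode_header(encoded)[0]
decoded = raw.decode(charset)
print(decoded == subject)  # True
```

A header sent as raw 8-bit bytes carries no such label, so the receiver can
only guess the charset.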

> c. Chinese folder names do not look correct after switching to UTF-8. Can
> IMP automatically convert folder names to UTF-8? Or does the user have to
> manually rename every folder one by one?

This should work automatically. I didn't have any problems creating folders
with Chinese names here with IMP or with other MUAs and seeing them
displayed correctly in IMP.
I use some new features of the mbstring extension to handle this, but if that
doesn't work (because your PHP is too old), it should fall back to the old
encoding method.

That means if you used to be able to create Chinese folder names, you still
should be. If you never were but want to, you should upgrade your PHP.
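
The encoding in question is not named above. Assuming it is the modified
UTF-7 that IMAP uses for mailbox names (RFC 3501, section 5.1.3), a minimal
encoder can be sketched as follows. This is an illustration in Python, not
the code used in IMP, and the function name is made up.

```python
import base64


def imap_utf7_encode(name: str) -> str:
    """Encode a mailbox name in IMAP's modified UTF-7 (RFC 3501, 5.1.3)."""
    out = []
    pending = []  # run of characters that cannot be sent as plain ASCII

    def flush():
        # Emit the pending run as &<modified base64 of UTF-16BE>-
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            pending.clear()

    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            flush()
            out.append("&-" if ch == "&" else ch)  # a literal "&" becomes "&-"
        else:
            pending.append(ch)
    flush()
    return "".join(out)


# Example mailbox name taken from RFC 3501 itself.
print(imap_utf7_encode("~peter/mail/台北/日本語"))
# ~peter/mail/&U,BTFw-/&ZeVnLIqe-
```

The server only ever sees the ASCII form, so a folder created this way by one
MUA can be displayed correctly by any other client that decodes it.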

Jan.

--
http://www.horde.org - The Horde Project
http://www.ammma.de - discover your knowledge
http://www.tip4all.de - Deine private Tippgemeinschaft

