[i18n] Horde IMP in Japanese

Jan Schneider jan at horde.org
Sun Aug 17 16:48:43 PDT 2003


Quoting Takeshi Morishima <tm at onepost.net>:

>     character encoding to UTF-8. (this is good :-) However, all other
>     translated messages now appear garbled. (*sigh*..)  snapshot:
>     <http://www.ml-search.com/horde/horde/imp-snapshot-20030816.gif>

This should be fixed now.

>     The menu/header parts appeared to be still encoded in Shift JIS.
>     My current thought is that this is because gettext output is still
>     in Shift JIS since the default Windows locale says so.  For some
>     reason, bind_textdomain_codeset() is not available on my Windows
>     version of PHP4, which is probably causing _() output codeset not
>     properly converted to UTF-8.  (bind_textdomain_codeset() not
>     called in setTextdomain() in NLS.php.  No idea how to get around

You will still get the translation's original charset instead of UTF-8 if
the server system doesn't know UTF-8 locales such as ja_JP.UTF-8 (as is the
case on Windows). In this case the results of system calls would still be in
the locale's original charset, and we can't determine which strings to
convert to UTF-8 and which not. Thus we simply fall back to the original
charset (Shift-JIS in the case of Japanese).
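
To illustrate the fallback, here is a minimal sketch in Python (not the
actual NLS.php code). The function name choose_output_charset and the list
of system locales are assumptions made up for this example. It also shows
why mixing the two charsets on one page garbles the output: Shift-JIS bytes
are not valid UTF-8.

```python
from typing import Iterable


def choose_output_charset(language: str, native_charset: str,
                          system_locales: Iterable[str]) -> str:
    """Return UTF-8 only if the system offers a UTF-8 locale for the language.

    Hypothetical helper: otherwise fall back to the translation's own charset.
    """
    wanted = f"{language}.UTF-8".lower()
    has_utf8_locale = any(name.lower() == wanted for name in system_locales)
    return "UTF-8" if has_utf8_locale else native_charset


# Shift-JIS bytes cannot be decoded as UTF-8, so a page that declares UTF-8
# but contains Shift-JIS strings appears garbled.
sjis_bytes = "日本語".encode("shift_jis")
try:
    sjis_bytes.decode("utf-8")
    mixed_ok = True
except UnicodeDecodeError:
    mixed_ok = False
```

With a locale list that lacks ja_JP.UTF-8, the sketch returns the native
charset, which corresponds to the fallback described above.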

>   o For .span) appearing in menu/shortcut, I think the issue is that
>     Shift JIS is processed in two separate characters in string
>     functions like String::length()/substr(), and some non-ascii byte
>     is validated as a valid access key.  With mbstring turned on,
>     probably it works with UTF-8 (which may end up finding no valid
>     access key for multibyte only string), and so with EUC-JP, but not
>     with Shift JIS due to PHP/mbstring internal encoding restriction.
>     (Shift JIS is one of the charsets that mbstring cannot handle as
>     internal encoding.)  Thus if the above bind_textdomain_codeset
>     issue goes away, this issue will probably be masked.

This was a separate issue, but indeed a bug in our code that generates
access keys; it has been fixed now.

>     However, maybe Horde can make this a bit more robust.
>
>     The easiest way would be to automatically turn off access key
>     generation when the user charset is multibyte, but that is not ideal
>     since multibyte environments would never be able to take advantage
>     of access keys.  Another way would be just to add a check whether
>     an access key candidate is a valid, typeable keyboard key in
>     respective modules like getAccessKey().  This still will not
>     provide an access key when the string is entirely multibyte, but some

I changed the code so that only ASCII characters are used for access keys
now. We currently search for keys by blindly looking at the single bytes
of the word and using the first ASCII character we find that hasn't been
used before. Thus you don't see any access keys if you use Japanese with
UTF-8, because all bytes used for Japanese characters in UTF-8 are outside
the ASCII range, while some of the bytes used in Shift-JIS are indeed
inside the ASCII range.
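
The byte-wise search can be sketched roughly like this in Python (an
illustration only, not the Horde implementation; pick_access_key and the
sample label are made up for this example). It shows how a Shift-JIS label
can yield an ASCII "key" from a trail byte while the same label in UTF-8
yields none.

```python
from typing import Optional, Set


def pick_access_key(label: bytes, used: Set[str]) -> Optional[str]:
    """Scan the label byte by byte and return the first unused ASCII
    letter or digit, or None if there is none (hypothetical helper)."""
    for byte in label:
        if byte >= 0x80:
            continue  # not an ASCII byte
        char = chr(byte)
        if char.isalnum() and char.lower() not in used:
            used.add(char.lower())
            return char
    return None


label = "アカウント"
# In Shift-JIS the first character is the byte pair 0x83 0x41, and 0x41 is "A".
key_from_sjis = pick_access_key(label.encode("shift_jis"), set())
# In UTF-8 every byte of these characters is 0x80 or above.
key_from_utf8 = pick_access_key(label.encode("utf-8"), set())
```

Here key_from_sjis is "A" even though the label contains no Latin letter,
and key_from_utf8 is None.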

>     label may.  A better way may be to apply gettext processing _()
>     after getAccessKey() has run, i.e. use the English text when
>     finding an access key, and get the actual translated label afterward.
>     For example, if "Accounts" is to be translated, the label is set to
>     "Accounts" for getAccessKey() purposes and the access key is
>     determined as "A"; then it is translated according to the locale
>     using _() when sending the page back to the user's browser.  (This
>     is a major change though.)

This would definitely be the best solution.
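
A rough sketch of the proposal quoted above, in Python and under stated
assumptions: label_and_key, the translate callback and the tiny catalog
are hypothetical stand-ins for getAccessKey() and _(), not existing Horde
code. The key is picked from the untranslated English string, and the
label is translated only afterwards.

```python
from typing import Callable, Optional, Set, Tuple


def label_and_key(msgid: str, translate: Callable[[str], str],
                  used: Set[str]) -> Tuple[str, Optional[str]]:
    """Pick the access key from the English msgid, then translate the label."""
    key = None
    for char in msgid:
        if char.isascii() and char.isalnum() and char.lower() not in used:
            key = char
            used.add(char.lower())
            break
    return translate(msgid), key


# Stand-in for a gettext catalog; real code would call gettext here.
catalog = {"Accounts": "アカウント"}
used_keys: Set[str] = set()
label, key = label_and_key("Accounts", lambda s: catalog.get(s, s), used_keys)
```

With this order the access key no longer depends on the charset of the
translation, at the price of the key not necessarily appearing in the
translated label.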

>   o One last small note, translation.php causes some errors when it is
>     invoked in a directory where its directory path contains some
>     white spaces.  In my case "c:\Program Files\" in cygwin window.

Some of these should be fixed now, but I probably didn't catch all of them.
If you still get some errors, please post the whole command you entered.
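
For illustration only: errors like these typically come from paths being
passed to a shell unquoted, though that is an assumption about the cause
here. The Python sketch below shows the general idea of quoting every
argument; build_command and the path are hypothetical, and this is not how
translation.php itself is written.

```python
import shlex


def build_command(*parts: str) -> str:
    """Join command parts into one shell command line, quoting each part so
    that whitespace inside a path does not split it into two arguments."""
    return " ".join(shlex.quote(part) for part in parts)


# Hypothetical install path containing a space.
script = "/cygdrive/c/Program Files/horde/po/translation.php"
command = build_command("php", script)
```

Splitting the resulting command line again with shlex.split() gives back
exactly two arguments, with the path intact.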

> Questions:
>
>   o Can somebody tell if my problem with bind_textdomain_codeset with
>     the Windows version of PHP4 (php-4.3.2-installer.exe from php web -
>     libintl-1.dll is dated on Dec 27, 2002) is a local config issue,
>     i.e. does it work for everyone other than me?

This is not a problem; see above.

>   o Does anybody know if there is any way that I can override the
>     default char codeset that gettext()/_() uses with another one
>     without bind_textdomain_codeset().  In my case, default seems to
>     be Shift JIS (likely Windows default) and overriding this charset
>     with UTF-8 will probably fix this issue.

See above.

Jan.

--
http://www.horde.org - The Horde Project
http://www.ammma.de - discover your knowledge
http://www.tip4all.de - Deine private Tippgemeinschaft

