[i18n] Horde IMP in Japanese

Takeshi Morishima tm at onepost.net
Sat Aug 16 19:28:11 PDT 2003


In message "Re: [i18n] Horde IMP in Japanese" on 2003/08/15, Jan
    Schneider writes:
  >
  > >   > > Also, notice those "./span)" garbage in shortcut (one on
  > >   > > the head menu and several of them in IMP head and footer
  > >   > > menu lines.)  These things are something we need to chase
  > >   > > for.
  > >   >
  > >   > These should have been fixed already a few weeks ago.
  > >
  > > OK, will look into it.  Do you know which module/files the changes
  > > were done?  I may need to look into CVS logs/diffs to find out what's
  > > wrong with my CVS sync.  (Or will try to search mailing list archive.)
  > 
  > No, sorry. But I can't reproduce this *with* mbstring/iconv
  > installed, so it might have something to do with these extensions
  > missing.

OK, since my spare time for this week is running out, I will just
summarise what I have found so far.  There are a few questions at the
end of this mail in case someone has an answer; otherwise, I will look
into it when I have time.  (Sorry, this is a rather long message.)

  o Indeed, the mbstring/iconv extensions were off (disabled due to
    another php-ware issue).  With them turned on, the email message
    body part is now displayed properly.  Message text encoded in
    UTF-8 was sent to the browser with MIME content type
    charset=UTF-8, and the browser set its character encoding to
    UTF-8.  (This is good :-)  However, all other translated messages
    now appear garbled.  (*sigh*..)  Snapshot:
    <http://www.ml-search.com/horde/horde/imp-snapshot-20030816.gif>
    The menu/header parts appear to be still encoded in Shift JIS.
    My current thought is that this is because gettext output is still
    in Shift JIS, since the default Windows locale says so.  For some
    reason, bind_textdomain_codeset() is not available in my Windows
    version of PHP4, which is probably why the _() output is not
    converted to UTF-8.  (bind_textdomain_codeset() is not called in
    setTextdomain() in NLS.php.)  I have no idea how to get around
    this, other than finding a version of the gettext extension that
    provides bind_textdomain_codeset().
    (c:/WINDOWS/system32/libintl-1.dll seems to contain the
    bind_textdomain_codeset string, so I am not sure what is wrong
    with it or with my local config.)
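
The mismatch described above (Shift JIS bytes sent on a page declared
as UTF-8) can be reproduced in isolation.  This is only a minimal
Python sketch, not Horde code; the label "メール" and the helper name
looks_like_utf8() are made up for illustration.

```python
# Hypothetical translated label ("Mail"), standing in for gettext output
# produced under a Japanese Windows locale.
label = "メール"
sjis_bytes = label.encode("shift_jis")


def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes form a valid UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The page declares charset=UTF-8, but these bytes are not valid UTF-8.
page_charset_ok = looks_like_utf8(sjis_bytes)

# Roughly what a browser does: undecodable bytes become replacement
# characters, so the label shows up garbled.
as_browser_sees_it = sjis_bytes.decode("utf-8", errors="replace")
```

A check like this on the raw menu bytes would be one way to confirm
that the garbled parts really are Shift JIS rather than some other
encoding.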


  o For the ".span)" appearing in the menu/shortcuts, I think the
    issue is that each Shift JIS character is processed as two
    separate bytes in string functions like String::length()/substr(),
    and some non-ASCII byte is accepted as a valid access key.  With
    mbstring turned on, it probably works with UTF-8 (which may end up
    finding no valid access key for a multibyte-only string), and also
    with EUC-JP, but not with Shift JIS, due to the PHP/mbstring
    internal encoding restriction.  (Shift JIS is one of the charsets
    that mbstring cannot handle as an internal encoding.)  Thus, if
    the above bind_textdomain_codeset() issue goes away, this issue
    will probably be masked.
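
To illustrate why byte-wise processing is riskier with Shift JIS than
with EUC-JP or UTF-8, here is a small Python sketch.  It is not what
Horde does internally; the helper name ascii_range_bytes() is
hypothetical.  It lists the bytes of an encoded string that fall in
the ASCII range, i.e. bytes that a byte-oriented routine could mistake
for ordinary characters.

```python
def ascii_range_bytes(text: str, charset: str) -> list:
    """Return the bytes of the encoded text that fall below 0x80."""
    return [chr(b) for b in text.encode(charset) if b < 0x80]


# Katakana "ア" is the two bytes 0x83 0x41 in Shift JIS; the second byte
# is the same value as ASCII "A".
sjis = ascii_range_bytes("ア", "shift_jis")

# In EUC-JP and UTF-8 every byte of a multibyte character is >= 0x80,
# so no byte can be confused with an ASCII character.
eucjp = ascii_range_bytes("ア", "euc_jp")
utf8 = ascii_range_bytes("ア", "utf-8")
```

So a routine that walks a Shift JIS string byte by byte sees both a
non-ASCII lead byte and, sometimes, a trail byte that looks like a
plain letter, and cutting the string between the two leaves half a
character behind.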

    However, maybe Horde can make this a bit more robust.

    The easiest way would be to automatically turn off access key
    generation when the user's charset is multibyte, but this is not
    ideal, since multibyte environments would never be able to take
    advantage of access keys.  Another way would be to add a check, in
    the respective functions such as getAccessKey(), that an access
    key candidate is a key that can actually be typed on a keyboard.
    This still will not provide an access key when the string is
    entirely multibyte, but some labels may get one.  A better way may
    be to apply gettext processing _() after getAccessKey() has run,
    i.e. use the English text when finding an access key, and get the
    actual translated label afterward.  For example, if "Accounts" is
    to be translated, the label is set to "Accounts" for
    getAccessKey() purposes and the access key is determined as "A";
    the label is then translated according to the locale using _()
    when sending the page back to the user's browser.  (This is a
    major change, though.)
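
The second and third ideas can be sketched together.  The following is
a rough Python illustration only, not the actual getAccessKey()
implementation; pick_access_key(), render_label() and the tiny
translation table are all hypothetical.

```python
import string

# Only characters that can be typed directly are accepted as access keys.
VALID_KEYS = set(string.ascii_letters + string.digits)


def pick_access_key(english_label, used):
    """Pick the first unused ASCII letter or digit of the English label."""
    for ch in english_label:
        key = ch.upper()
        if ch in VALID_KEYS and key not in used:
            used.add(key)
            return key
    return None  # no typable candidate, so no access key


def render_label(english_label, translate, used):
    """Choose the key from the English text, then translate for display."""
    key = pick_access_key(english_label, used)
    return translate(english_label), key


# Hypothetical translation lookup standing in for _().
catalog = {"Accounts": "アカウント", "Address Book": "アドレス帳"}
translate = lambda s: catalog.get(s, s)

used_keys = set()
first = render_label("Accounts", translate, used_keys)
second = render_label("Address Book", translate, used_keys)
```

With this ordering the key never depends on the bytes of the
translated string, so the charset of the translation no longer
matters for key selection.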


  o One last small note: translation.php causes some errors when it is
    invoked in a directory whose path contains white space.  In my
    case, this was "c:\Program Files\" in a cygwin window.
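
One common cause of this kind of error (an assumption here, since the
failing code in translation.php has not been examined) is a path
being pasted unquoted into a shell command line.  A short Python
sketch of the effect, with a made-up tool name and path:

```python
import shlex

# Hypothetical directory and command, for illustration only.
path = "c:/Program Files/horde/po"

naive_cmd = "sometool --dir " + path
quoted_cmd = "sometool --dir " + shlex.quote(path)

# Split the way a POSIX shell would split the command line.
naive_args = shlex.split(naive_cmd)    # the path is broken into two arguments
quoted_args = shlex.split(quoted_cmd)  # the path survives as one argument
```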


Questions:

  o Can somebody tell me whether my problem with
    bind_textdomain_codeset() on the Windows version of PHP4
    (php-4.3.2-installer.exe from the PHP web site; libintl-1.dll is
    dated Dec 27, 2002) is a local config issue?  I.e., does it work
    for everyone other than me?

  o Does anybody know of a way to override the default character
    codeset that gettext()/_() uses with another one, without
    bind_textdomain_codeset()?  In my case, the default seems to be
    Shift JIS (likely the Windows default), and overriding this
    charset with UTF-8 will probably fix this issue.
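
One possible fallback, offered only as an untested idea, would be to
convert the gettext output after the fact instead of asking gettext
for a different codeset.  In PHP that could presumably be done with
iconv() or mb_convert_encoding(); the Python sketch below just shows
the principle, and to_utf8() is a hypothetical helper.

```python
def to_utf8(raw: bytes, source_charset: str = "shift_jis") -> bytes:
    """Re-encode bytes from the locale's charset to UTF-8."""
    return raw.decode(source_charset).encode("utf-8")


# Stand-in for a translated string returned in the locale's charset.
gettext_output = "メール".encode("shift_jis")
converted = to_utf8(gettext_output)
```

This assumes the source charset is known in advance; if the guess is
wrong, the conversion will either fail or produce different garbage.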

Thank you,
Takeshi


