[dev] Auto Detect Language (with patch)

Mike Pelley mike at pelley.com
Mon Sep 29 15:29:14 PDT 2003


I noticed that the current language auto-detect code for the login page of
Horde does not deal with multiple entries correctly.  The list of Accepted
Languages in my browser is:

en-ca, en-gb, en

which is meant to indicate that I'd prefer Canadian English first, British
English second and any English third.

The current autodetect code in NLS.php selects en-us with the above
settings, as opposed to en-gb.  There are two "bugs" related to this.

First, the HTTP_ACCEPT_LANGUAGE string actually looks like this from my
browser:

en-ca,en-gb;q=0.7,en;q=0.3

The current code does not strip out the ;q=0.7 "quality" values, so any
entry that carries one can never match exactly.  In practice that is every
entry after the first, since the first entry usually has no quality value.
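
As a rough illustration of the stripping step (a minimal sketch only;
strip_quality_values() is a hypothetical helper and not part of NLS.php),
the header can be split on commas and each entry cut at the first
semicolon:

```php
<?php
// Hypothetical helper, not part of NLS.php: split an Accept-Language
// header into bare language tags, dropping any ";q=..." parameters.
function strip_quality_values($header)
{
    $tags = array();
    foreach (explode(',', $header) as $entry) {
        // Everything from the first ';' onwards is a parameter such as q=0.7.
        $pos = strpos($entry, ';');
        if ($pos !== false) {
            $entry = substr($entry, 0, $pos);
        }
        $entry = trim($entry);
        if ($entry !== '') {
            $tags[] = $entry;
        }
    }
    return $tags;
}
```

Called with the header above, this should yield the three bare tags
en-ca, en-gb and en, in that order.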

Second, the logic in the current code favours the prefix of the first entry,
in this case en, over an exact match for the second entry, en-gb.  Since en
is translated to en_US, that is what is delivered to my browser.

The attached patch strips the quality values and tries for an exact match
against all entries before falling back to a prefix match.  I've also
removed a superfluous check against the gettext list in _map, since the
Accept-Language header (which requires a dash) cannot match a gettext entry
(which requires an underscore), unless there are some non-standard browsers
out there.
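
The selection order described above can be sketched in isolation as
follows.  This is an illustration under assumptions, not the Horde code:
to_locale() and pick_language() are hypothetical names, and the $aliases
and $available arrays merely stand in for what NLS::_map() and
NLS::isValid() consult.

```php
<?php
// Hypothetical stand-in for NLS::_map(): turn "en-gb" into "en_GB" and
// expand a bare prefix such as "en" through an alias table.
function to_locale($tag, array $aliases)
{
    $tag = strtolower(trim($tag));
    if (strpos($tag, '-') !== false) {
        list($lang, $region) = explode('-', $tag, 2);
        return $lang . '_' . strtoupper($region);
    }
    return isset($aliases[$tag]) ? $aliases[$tag] : $tag;
}

// Try every entry for an exact match first; remember the first usable
// prefix match and fall back to it only if no exact match is found.
function pick_language(array $tags, array $aliases, array $available, $default = 'en_US')
{
    $partial = null;
    foreach ($tags as $tag) {
        $locale = to_locale($tag, $aliases);
        if (in_array($locale, $available, true)) {
            return $locale; // an exact match wins immediately
        }
        if ($partial === null) {
            $prefix = to_locale(substr(trim($tag), 0, 2), $aliases);
            if (in_array($prefix, $available, true)) {
                $partial = $prefix; // best guess so far, keep looking
            }
        }
    }
    return $partial !== null ? $partial : $default;
}
```

With the tags en-ca, en-gb, en, an alias of en to en_US, and en_US and
en_GB both available, this returns en_GB rather than en_US; with en-ca
alone it still falls back to en_US.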

I did not attempt to sort the languages by their quality values, which
would be more strictly correct; browsers appear to send them in order of
priority anyway.  On another note, I did not remove the prefix matching
that _map does, even though it does not line up with RFC 2616: there, a
prefix such as "en" sent by the user matches "en-gb" on the server, but the
server may not shorten the user's "en-ca" to "en".  Looking at Internet
Explorer though, I can see that many users will select "en-ca" or whatever
and not add "en" at the end, so keeping the prefix match is probably for
the best.
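
For reference, sorting by quality value, which the patch does not attempt,
might look roughly like the sketch below.  sort_by_quality() is a
hypothetical name, the closure needs a newer PHP than the one this patch
targets, and a missing q parameter is treated as 1 as RFC 2616 specifies.
Entries with q=0 are simply sorted last here, although the RFC treats them
as not acceptable.

```php
<?php
// Hypothetical helper, not part of NLS.php: return the language tags of
// an Accept-Language header ordered by descending quality value.
function sort_by_quality($header)
{
    $entries = array();
    foreach (explode(',', $header) as $index => $entry) {
        $parts = explode(';', $entry);
        $tag = trim($parts[0]);
        if ($tag === '') {
            continue;
        }
        $q = 1.0; // default quality when no q parameter is given
        for ($i = 1; $i < count($parts); $i++) {
            $param = trim($parts[$i]);
            if (strncasecmp($param, 'q=', 2) === 0) {
                $q = (float) substr($param, 2);
            }
        }
        $entries[] = array('tag' => $tag, 'q' => $q, 'index' => $index);
    }

    // Higher quality first; ties keep the order the browser sent.
    usort($entries, function ($a, $b) {
        if ($a['q'] != $b['q']) {
            return ($a['q'] > $b['q']) ? -1 : 1;
        }
        return $a['index'] - $b['index'];
    });

    $tags = array();
    foreach ($entries as $e) {
        $tags[] = $e['tag'];
    }
    return $tags;
}
```

The resulting list could then be walked in order by the exact-match and
prefix-match logic in the patch.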

Thanks!  Mike.
-------------- next part --------------
Index: NLS.php
===================================================================
RCS file: /usr/cvsroot/horde/horde/lib/NLS.php,v
retrieving revision 1.53
diff -u -r1.53 NLS.php
--- NLS.php	16 Sep 2003 23:06:14 -0000	1.53
+++ NLS.php	29 Sep 2003 20:19:56 -0000
@@ -52,20 +52,30 @@
             /* The browser supplies a list, so return the first valid one. */
             $browser_langs = explode(',', $_SERVER['HTTP_ACCEPT_LANGUAGE']);
             foreach ($browser_langs as $lang) {
+                /* Strip quality value for language */
+                if (($pos = strpos($lang, ';')) !== false) {
+                    $lang = substr($lang, 0, $pos);
+                }
                 $lang = NLS::_map(trim($lang));
                 if (NLS::isValid($lang)) {
                     $language = $lang;
                     break;
-                } elseif (NLS::isValid(NLS::_map(substr($lang, 0, 2)))) {
-                    $language = NLS::_map(substr($lang, 0, 2));
-                    break;
+                }
+                /* In case no full match, save best guess based on prefix */
+                if (!isset($partial_lang) &&
+                    NLS::isValid(NLS::_map(substr($lang, 0, 2)))) {
+                    $partial_lang = NLS::_map(substr($lang, 0, 2));
                 }
             }
         }
 
-        /* No dice auto-detecting, default to US English. */
         if (!isset($language)) {
-            $language = 'en_US';
+            if (isset($partial_lang)) {
+                $language = $partial_lang;
+            } else {
+                /* No dice auto-detecting, default to US English. */
+                $language = 'en_US';
+            }
         }
 
         return basename($language);
@@ -165,12 +175,6 @@
         require_once dirname(__FILE__) . '/String.php';
 
         $aliases = &$GLOBALS['nls']['aliases'];
-
-        // First check if the untranslated language can be found
-        if (array_key_exists($language, $aliases) &&
-            !empty($aliases[$language])) {
-            return $aliases[$language];
-        }
 
         // Translate the $language to get broader matches.
         // (eg. de-DE should match de_DE)

