[Tickets #1609] NEW: Incorrect encoding by MIME::encode() on some
UTF-8 strings
bugs at bugs.horde.org
Tue Mar 22 15:53:48 PST 2005
DO NOT REPLY TO THIS MESSAGE. THIS EMAIL ADDRESS IS NOT MONITORED.
Ticket URL: http://bugs.horde.org/ticket/?id=1609
-----------------------------------------------------------------------
Ticket | 1609
Created By | horde at ndn.no
Summary | Incorrect encoding by MIME::encode() on some UTF-8 strings
Queue | Horde Base
Version | 3.0.3
State | Unconfirmed
Priority | 1. Low
Type | Bug
Owners |
-----------------------------------------------------------------------
horde at ndn.no (2005-03-22 15:53) wrote:
While investigating a problem with the Norwegian character "Å" (capital
"å"), which caused incorrectly encoded headers when sending mail as UTF-8
(but not as ISO-8859-1), I tracked it to line 142 in lib/Horde/MIME.php:
$size = preg_match_all('/([^\s]+)([\s]*)/', $text, $matches,
PREG_SET_ORDER);
In my case, adding the Unicode option (/u) to the regex solved the problem:
$size = preg_match_all('/([^\s]+)([\s]*)/u', $text, $matches,
PREG_SET_ORDER);
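A minimal sketch for comparing the two patterns on a given system. The
helper name split_words() is hypothetical and not part of Horde; only the
two regexes come from this report. Without /u, PCRE matches byte by byte,
and whether a byte such as 0x85 is treated as whitespace appears to depend
on the PCRE character tables and locale in use, so the first half of the
output may or may not show a split on any particular machine.

```php
<?php
// Hypothetical helper (not part of Horde): split $text into
// [full match, token, trailing whitespace] sets, with or without /u.
function split_words($text, $unicode)
{
    $pattern = '/([^\s]+)([\s]*)/' . ($unicode ? 'u' : '');
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);
    return $matches;
}

// "Åretur" written as explicit UTF-8 bytes, so the encoding of this
// source file does not matter.
$subject = "\xC3\x85retur";

foreach (array(false, true) as $unicode) {
    echo $unicode ? "with /u:\n" : "without /u:\n";
    foreach (split_words($subject, $unicode) as $m) {
        // bin2hex() makes a split between 0xC3 and 0x85 visible.
        echo '  token bytes: ', bin2hex($m[1]), "\n";
    }
}
```

With /u the word should come back as a single token
(c3857265747572). If the run without /u prints "c3" and "726574757" style
fragments instead, that system is affected.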
It seems preg_match_all does not always handle multibyte characters (e.g.
Norwegian Å) correctly without the /u modifier. On a system with PHP 4.3.10
and Apache/1.3.33, the bug appeared, as shown by this Amavis alert with
"Åretur" as the subject:
X-Amavis-Alert: BAD HEADER Non-encoded 8-bit data (char 85 hex) in message
header 'Subject'
Subject: Re: =?utf-8?b?ww==?=\205retur\n
A var_dump of $matches would show the mangled first character as the first
entry in the array, with "retur" in the second entry.
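The alert can be checked by hand. The following is only a sketch of that
check, using standard PHP functions: it decodes the base64 payload from the
Subject header and compares it with the stray byte that Amavis printed as
\205 (octal). Taken together, the two bytes are the UTF-8 encoding of "Å",
which suggests the character was split between its first and second byte.

```php
<?php
// Payload of the encoded-word =?utf-8?b?ww==?= from the alert.
$encoded = base64_decode('ww==');

// The raw byte Amavis reported as \205 (octal), i.e. "char 85 hex".
$stray = chr(0205);

echo bin2hex($encoded), ' ', bin2hex($stray), "\n";   // prints "c3 85"

// Joined, the two bytes form the UTF-8 encoding of U+00C5 (Å).
$joined = $encoded . $stray;

// preg_match() with /u returns 1 only when the subject is valid UTF-8,
// so it doubles as a simple validity check here.
var_dump(preg_match('//u', $joined) === 1);    // bool(true)
var_dump(preg_match('//u', $encoded) === 1);   // bool(false): lone lead byte
```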
On another system running PHP 4.3.9 and Apache/1.3.31, the bug did NOT
appear. I'm not sure whether this also affects other character sets, or
whether turning on multibyte character support in PHP would solve the
problem.