[Tickets #1621] non-ASCII 7-bit message headers not RFC2047-encoded
bugs at bugs.horde.org
Fri Mar 25 16:00:54 PST 2005
DO NOT REPLY TO THIS MESSAGE. THIS EMAIL ADDRESS IS NOT MONITORED.
Ticket URL: http://bugs.horde.org/ticket/?id=1621
-----------------------------------------------------------------------
Ticket | 1621
Updated By | windhamg at email.arizona.edu
Summary | non-ASCII 7-bit message headers not RFC2047-encoded
Queue | IMP
Version | HEAD
State | Feedback
Priority | 2. Medium
Type | Bug
Owners | Michael Slusarz
-----------------------------------------------------------------------
windhamg at email.arizona.edu (2005-03-25 16:00) wrote:
Well, I tried the '[^\x00-\x7f]' regex pattern in is8bit(), but no dice. I
may be speaking ignorantly (in fact, it's very likely) but, even though we
are using a multibyte-aware regex function, this character set (ISO-2022-JP)
*is still* a 7-bit character set. How are we going to find byte values in
the range [\x80-\xff] in a 7-bit-byte character set?
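
To make the point above concrete, here is a minimal sketch in Python (not Horde's PHP code; the helper name has_8bit_bytes and the sample strings are made up for illustration). It applies the same '[^\x00-\x7f]' pattern to raw bytes and shows that ISO-2022-JP text never trips it, while ISO-8859-1 text does:

```python
import re

# Same pattern as tried in is8bit(): any byte outside the 7-bit range.
EIGHT_BIT = re.compile(rb"[^\x00-\x7f]")

def has_8bit_bytes(raw: bytes) -> bool:
    """Return True if raw contains at least one byte >= 0x80."""
    return EIGHT_BIT.search(raw) is not None

# Japanese text in ISO-2022-JP: escape sequences plus 7-bit bytes only.
sample = "日本語".encode("iso2022_jp")
# Accented Latin text in ISO-8859-1: contains the byte 0xE9.
latin1 = "café".encode("iso-8859-1")

print(has_8bit_bytes(sample))  # False
print(has_8bit_bytes(latin1))  # True
```

So an 8-bit check alone cannot flag an ISO-2022-JP header, no matter which regex function runs it.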
I'm starting to think this is a lost cause...I placed some diagnostic output
in the String::regexMatch function and see that, even though the $charset
being passed in is "ISO-2022-JP", the resultant mb_regex_encoding() is
"EUC-JP".
IMHO, the root of this problem is that the MIME::encode function claims to
"Encode a string containing non-ASCII characters according to RFC 2047",
while it actually only encodes strings containing 8-bit characters.
Since 7-bit does not always imply ASCII, we need to find a good test of
"ASCII-ness". I can test for ISO-2022-JP using a regex like '\x1b[\(\$]',
but it would be nicer to have a more general test (if one exists) for
non-ASCII 7-bit encodings.
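
One possible shape for such a test, sketched in Python rather than Horde's PHP (the names looks_like_iso2022 and needs_rfc2047 are hypothetical, and this assumes the declared charset of the header is known and supported by the decoder): decode the bytes with the declared charset and check whether any resulting character falls outside ASCII. The escape-sequence regex from above is included for comparison.

```python
import re

# Escape-sequence test from the ticket: ESC followed by "(" or "$".
ISO2022_ESCAPE = re.compile(rb"\x1b[\(\$]")

def looks_like_iso2022(raw: bytes) -> bool:
    """Charset-specific check: does raw contain an ISO-2022 escape?"""
    return ISO2022_ESCAPE.search(raw) is not None

def needs_rfc2047(raw: bytes, charset: str) -> bool:
    """More general check (an assumption, not Horde's method):
    decode with the declared charset and look for non-ASCII characters."""
    try:
        text = raw.decode(charset)
    except UnicodeDecodeError:
        # Bytes invalid for the charset: treat as needing encoding.
        return True
    return any(ord(ch) > 0x7F for ch in text)

jp = "日本語".encode("iso2022_jp")
print(looks_like_iso2022(jp))                           # True
print(needs_rfc2047(jp, "iso2022_jp"))                  # True
print(needs_rfc2047(b"plain subject", "iso2022_jp"))    # False
```

The decode-based check does not depend on byte values at all, so it treats 7-bit and 8-bit charsets the same way; the trade-off is that it needs a working decoder for every charset it may be handed.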