[Tickets #9617] Re: db_migrate and incorrect charset handling
bugs at horde.org
Mon Apr 4 14:24:39 UTC 2011
DO NOT REPLY TO THIS MESSAGE. THIS EMAIL ADDRESS IS NOT MONITORED.
Ticket URL: http://bugs.horde.org/ticket/9617
------------------------------------------------------------------------------
Ticket | 9617
Updated By | leena.heino at uta.fi
Summary | db_migrate and incorrect charset handling
Queue | Horde Framework Packages
Version | Git master
Type | Bug
State | Resolved
Priority | 1. Low
Milestone |
Patch |
Owners | Michael Rubinsky
------------------------------------------------------------------------------
leena.heino at uta.fi (2011-04-04 14:24) wrote:
>> PHP's manual suggests that one should not assume that
>> strtolower()/strtoupper() work correctly with
>> multibyte charsets like UTF-8.
>
> Where does it say that? I don't see any such suggestions in the man pages.
It does not say it in so many words, or at least it says it
ambiguously: "Note that 'alphabetic' is determined by the current
locale"
But if we look at PHP's source code for strtoupper(), it works byte
by byte, so it will not work correctly with UTF-8 encoded strings
that contain non-ASCII characters.
Excerpt from ext/standard/string.c:
char *php_strtoupper(char *s, size_t len)
{
	unsigned char *c, *e;

	c = (unsigned char *)s;
	e = (unsigned char *)c+len;

	while (c < e) {
		*c = toupper(*c);
		c++;
	}
	return s;
}
Non-ASCII characters in UTF-8 are encoded as multiple bytes. Therefore
PHP's strtoupper()/strtolower() will not work correctly on UTF-8
encoded strings that contain non-ASCII characters.
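To illustrate the point, here is a minimal standalone sketch, not code
from PHP or Horde. bytewise_upper() repeats the loop from the excerpt
above; upper_equals() is a hypothetical helper added only for this
demonstration. It assumes the program runs in the default "C" locale
(no setlocale() call), where toupper() only changes the bytes 'a'..'z'.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Same byte-wise loop as php_strtoupper() in the excerpt above. */
char *bytewise_upper(char *s, size_t len)
{
    unsigned char *c = (unsigned char *)s;
    unsigned char *e = c + len;

    while (c < e) {
        *c = (unsigned char)toupper(*c);
        c++;
    }
    return s;
}

/*
 * Hypothetical helper for this demonstration only: copies `in` into a
 * local buffer, upper-cases the copy byte by byte, and returns 1 if the
 * result equals `expected`, 0 otherwise (or if `in` does not fit).
 */
int upper_equals(const char *in, const char *expected)
{
    char buf[64];
    size_t len = strlen(in);

    if (len >= sizeof buf) {
        return 0;
    }
    memcpy(buf, in, len + 1);   /* include the terminating NUL */
    bytewise_upper(buf, len);
    return strcmp(buf, expected) == 0;
}
```

With these helpers, upper_equals("abc", "ABC") returns 1. For the UTF-8
string "\xC3\xA4" "iti" (the letter a-umlaut followed by "iti"), the
result is "\xC3\xA4" "ITI": the ASCII letters are upper-cased, but the
two bytes of the a-umlaut are left as they were, so the string never
becomes "\xC3\x84" "ITI" (capital A-umlaut followed by "ITI"). In a
single-byte locale the individual bytes may additionally be altered,
which is a separate failure mode not shown here.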