[horde] switching to UTF8 database

Tue Feb 7 22:39:19 UTC 2012

Quoting Andrew Morgan <morgan at orst.edu>:

> On Tue, 7 Feb 2012, Jan Schneider wrote:
>
>>
>> Quoting Andrew Morgan <morgan at orst.edu>:
>>
>>> Is this really a problem for latin1 to utf8?  If there are no  
>>> conversions that result in a double-byte character, then I think  
>>> it would be okay.
>>
>> That has nothing to do with the charset though. If your data only
>> contains ASCII data, it doesn't matter which charset you convert
>> to, indeed. As soon as it contains latin1-specific characters, it
>> will break.
>
> Can you explain why?  How does the string length change?  Perhaps an  
> example would help me understand.  :)

latin1, windows-1251, iso-8859-1, etc. are single-byte charsets: each
character is exactly one byte.
UTF-8 is a multi-byte charset with variable character lengths. ASCII
characters in UTF-8 are still single bytes with the same codes as in
US-ASCII, but any non-ASCII characters like äöüáé etc. take at least
two bytes, so a string containing them gets longer in bytes even
though the number of characters stays the same.
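
Since an example was asked for, here is a minimal Python sketch of the
length change. It is only an illustration: the sample strings and the
helper name byte_lengths are made up for this mail, and nothing here
is Horde or database code.

```python
# Compare how many bytes the same text needs in latin1 and in UTF-8.
# The sample strings are illustrative only; escapes are used so the
# example does not depend on the encoding of this message.
samples = [
    "abc",                 # pure ASCII
    "\u00e4\u00f6\u00fc",  # the three umlauts a, o, u with diaeresis
    "M\u00fcller",         # mostly ASCII with one umlaut
]


def byte_lengths(text):
    """Return (characters, latin1 bytes, UTF-8 bytes) for text."""
    return (
        len(text),
        len(text.encode("latin-1")),
        len(text.encode("utf-8")),
    )


for s in samples:
    chars, latin1, utf8 = byte_lengths(s)
    print(f"{s!r}: {chars} chars, {latin1} bytes latin1, {utf8} bytes UTF-8")
```

The ASCII-only string has the same byte length in both charsets, while
every umlaut costs one extra byte in UTF-8.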

Jan.

-- 
Do you need professional PHP or Horde consulting?
http://horde.org/consulting/