[horde] switching to UTF8 database

Vilius Šumskas vilius at lnk.lt
Tue Feb 7 21:22:13 UTC 2012


Sveiki,

Tuesday, February 7, 2012, 11:16:33 PM, you wrote:


> Zitat von Simon Brereton <simon.brereton at buongiorno.com>:

>> On 7 February 2012 15:54, Andrew Morgan <morgan at orst.edu> wrote:
>>> On Tue, 7 Feb 2012, Simon Brereton wrote:
>>>
>>>> 2012/2/7 Vilius ?umskas <vilius at lnk.lt>:
>>>>>
>>>>> Sveiki,
>>>>>
>>>>> Tuesday, February 7, 2012, 8:17:28 PM, you wrote:
>>>>>
>>>>>> On Tue, 7 Feb 2012, Vilius ?umskas wrote:
>>>>>
>>>>>
>>>>>>> Sveiki,
>>>>>>>
>>>>>>> Tuesday, February 7, 2012, 12:34:57 AM, you wrote:
>>>>>>>
>>>>>>>> I'm running Horde v3.3.12 (I know, upgrade to Horde 4 is on my todo
>>>>>>>> list)
>>>>>>>> with MySQL v5.1.  My Horde database is an old install that has been
>>>>>>>> upgraded many times.
>>>>>>>
>>>>>>>
>>>>>>>> The character set in horde/config/conf.php is:
>>>>>>>
>>>>>>>
>>>>>>>> $conf['sql']['charset'] = 'iso-8859-1';
>>>>>>>
>>>>>>>
>>>>>>>> and my collation on the database and tables in MySQL are set to
>>>>>>>> "latin1_swedish_ci".
>>>>>>>
>>>>>>>
>>>>>>>> I'd like to switch to UTF8.  Are these steps sufficient:
>>>>>>>
>>>>>>>
>>>>>>>> 1. Alter MySQL database and table collations to "utf8_unicode_ci"
>>>>>>>
>>>>>>>
>>>>>>>> 2. Set $conf['sql']['charset'] = 'utf-8';
>>>>>>>
>>>>>>>
>>>>>>> It  depends  what data is stored. See my older messages regarding that
>>>>>>> on this mailing list. Or was it on dev at lists.horde.org?
>>>>>
>>>>>
>>>>>> I searched via Google but nothing obvious turned up.  If you have a
>>>>>> timeframe and a listname, I'd be happy to go search the archives.
>>>>>
>>>>>
>>>>> Mainly these treads:
>>>>>
>>>>> http://lists.horde.org/archives/dev/Week-of-Mon-20110718/026367.html
>>>>> http://lists.horde.org/archives/dev/Week-of-Mon-20101213/025567.html
>>>>>
>>>>> You  can  also  search  the  net  for  general tips and tricks how to
>>>>> convert   data/charset   in   MySQL.   This  is really related more to
>>>>> MySQL itself than to Horde.
>>>>
>>>>
>>>> I don't know if this helps - but when I was migrating to Horde4 I
>>>> coupled it with a change in data/charset encoding.  Since I wanted to
>>>> use the H3 db, what I did was:
>>>>
>>>> Convert to UTF8:
>>>> mysql -u root -p -B -N  -u root -p -v --execute="select CONCAT('alter
>>>> table ',TABLE_SCHEMA,'.',TABLE_NAME,' CONVERT to CHARACTER SET uft8
>>>> COLLATE utf8_general_ci;') from information_schema.TABLES WHERE
>>>> TABLE_SCHEMA != 'information_schema';" > alter.sql
>>>
>> I should have warned you that I'm not an expert.  I'm a copy and paste
>> monkey with better than average google skills..
>>
>>> Any reason you are using utf8_general_ci instead of utf8_unicode_ci?  I read
>>> that utf8_unicode_ci is a more accurate collation, but can be slower.
>>
>> There's your answer ;)  I went for the speed option.
>>
>>>> vim alter.sql
>>>> remove any mysql/mysqlbkup table lines
>>>>
>>>> Apply the new Character Set
>>>> mysql -u root -p -v < alter.sql
>>>> /etc/init.d/mysql stop
>>>> /etc/init.d/mysql start
>>>
>>>
>>> Why did you stop/start MySQL?
>>
>> Clear out any cached queries basically.  And because it felt right
>> after making a blanket change like that.  Mysql would have complained
>> like a beast if anything had gone wrong, so it was a chance to look
>> for that.
>>
>>
>>>> Then I installed Horde4 and pointed it to the same DB (actually a
>>>> copy, just in case things went fubar) and then after configuring each
>>>> Horde application (IMP, Turba, etc,) I just ran the update-schema
>>>> action from the Admin panel.
>>>
>>  >
>>> Vilius mentioned some special concerns when converting data between
>>> character sets.  Do those apply if I am converting from iso-8859-1 (latin1)
>>> to UTF8?  I think iso-8859-1 maps directly into UTF8, right?
>>>
>>> We have been running latin1 the whole time, so we shouldn't have any unusual
>>> data stored in the tables right now.
>>>
>>> Can I switch to UTF8 while I'm still on Horde3?  Our Horde4 deployment will
>>> take a while for testing and documentation updates, but I'd like to get this
>>> conversion done first if that is technically possible.
>>
>> I don't see why not.  I'd held off making the change because it was
>> working and there's no need to "break" a working system.  However,
>> since I was moving hardware and versions, I figured, what the helll,
>> in for a penny, in for a pound.  If you're not upgrading just yet, I
>> would do:
>>
>> dump all the databases to an .sql file.
>> move that file offsite.
>> do the UTF-8 change you're comfortable with
>> use mysqladmin to reimport it
>> stop/start mysql and see if it complains.
>> tell H3 to use the "new" db..
>>
>> Someone else can advise if you need to do more than that.

> It's not quite that easy, because that won't work with any serialized
> data that contains non-ascii strings. Those cannot simply be convert  
> because they'd have different string lengths after that. This might  
> affect identity data for example.
> Jan.

Oh,  yes  Jan  is  right. In addition to DB library modifications I've
also  had  to  convert identities with my custom script. Sadly I don't
have it anymore.

-- 
Best regards,
 Vilius



More information about the horde mailing list