[imp] Signature in UTF-8 [solved]

Daniel A. Ramaley daniel.ramaley at DRAKE.EDU
Thu Feb 7 15:56:10 UTC 2008


Thank you for the fast response!

On Thursday 07 February 2008 09:43, Jan Schneider wrote:
>> If the database itself needs to be re-encoded, my thought is to...
>
>This seems to be a PostgreSQL-specific feature, so I can't tell what
>exactly it does, but it sounds reasonable.

If i decide to do this, i'll do lots of testing first. Probably i'll try 
loading the dump into another database first to make sure it loads as 
smoothly as i think it ought to. I have a second Horde installation on 
the same server that i just use for testing; i could even point that 
installation at a load of the converted database and see if everything 
looks good.

>> Is any conversion of the dump file itself beyond changing the CREATE
>> DATABASE encoding necessary? Right now if i perform a dump, the
>> "file" utility reports the result as ASCII text.
>
>If you are absolutely sure that you only have ASCII data, this is
>sufficient. But as soon as you have latin1 aka iso-8859-1 characters
>above 127 inside your data, this no longer works, because those
>characters are multibyte in utf-8. And this breaks at least in those
>cases where we store serialized arrays.
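
To illustrate the point quoted above, here is a minimal sketch of why a byte-for-byte re-encoding can break serialized data. It assumes the serialized arrays use PHP's serialize() format, where a string is stored as s:N:"..."; and N is the length in bytes. The helper php_serialize_str and the sample value are hypothetical, chosen only for the demonstration.

```python
# Sketch: a latin1 character above 127 grows from one byte to two in UTF-8,
# so a byte-length prefix written for latin1 no longer matches afterwards.

def php_serialize_str(raw: bytes) -> bytes:
    """Hypothetical helper: mimic PHP's s:N:"..."; form for one string."""
    return b's:%d:"%s";' % (len(raw), raw)

text = "caf\u00e9"  # "cafe" with an e-acute, a latin1 character above 127

# What would sit in a latin1 database: 4 bytes, so the prefix says 4.
serialized_latin1 = php_serialize_str(text.encode("iso-8859-1"))

# Re-encode the whole value to UTF-8 without touching the prefix.
reencoded = serialized_latin1.decode("iso-8859-1").encode("utf-8")

# Compare the declared length with the real payload length in bytes.
declared_length = int(reencoded.split(b":")[1].decode("ascii"))
payload = reencoded[reencoded.index(b'"') + 1 : reencoded.rindex(b'"')]
actual_length = len(payload)

print(declared_length, actual_length)  # prints "4 5"
```

The prefix still claims 4 bytes while the payload is now 5, which is the kind of mismatch that would make unserializing fail. Pure ASCII data is unaffected, because ASCII bytes are identical in both encodings.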

I'll be sure to do some more rigorous tests on the dump file to verify 
that it only contains ASCII; i think some versions of the "file" 
command only examine the first X characters of the file, for some value 
of X. If the dump is actually ISO-8859-1 i can probably just use iconv 
to switch it to UTF-8.
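
A rough sketch of both steps, for what it is worth: scanning the whole dump rather than trusting a sample of it, and converting it if it turns out to be ISO-8859-1. The function names and the chunk size are made up for this example, and the conversion assumes the dump really is ISO-8859-1 throughout.

```python
import shutil

def first_non_ascii_offset(path, chunk_size=1 << 20):
    """Return the byte offset of the first non-ASCII byte, or None if the
    whole file is ASCII. Reads the file in chunks, start to end."""
    offset = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return None
            try:
                chunk.decode("ascii")
            except UnicodeDecodeError as err:
                # err.start is the index of the offending byte in this chunk
                return offset + err.start
            offset += len(chunk)

def convert_latin1_to_utf8(src_path, dst_path):
    """Re-encode an ISO-8859-1 text file as UTF-8.
    newline="" keeps the original line endings untouched."""
    with open(src_path, "r", encoding="iso-8859-1", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        shutil.copyfileobj(src, dst)
```

If first_non_ascii_offset returns None, the dump is pure ASCII and only the CREATE DATABASE encoding needs to change. Otherwise convert_latin1_to_utf8 does roughly what "iconv -f ISO-8859-1 -t UTF-8" would do. Note that neither approach adjusts the length prefixes inside serialized arrays, so the concern Jan raises above would still need separate handling.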

------------------------------------------------------------------------
Dan Ramaley                            Dial Center 118, Drake University
Network Programmer/Analyst             2407 Carpenter Ave
+1 515 271-4540                        Des Moines IA 50311 USA

