[imp] Signature in UTF-8 [solved]
Daniel A. Ramaley
daniel.ramaley at DRAKE.EDU
Thu Feb 7 15:56:10 UTC 2008
Thank you for the fast response!
On Thursday 07 February 2008 09:43, Jan Schneider wrote:
>> If the database itself needs to be re-encoded, my thought is to...
>
>This seems to be a PostgreSQL-specific feature, so I can't tell what
>exactly it does, but it sounds reasonable.
If I decide to do this, I'll do lots of testing first. I'll probably try
loading the dump into another database first to make sure it loads as
smoothly as I think it ought to. I have a second Horde installation on
the same server that I use only for testing; I could even point that
installation at a load of the converted database and see if everything
looks good.
>> Is any conversion of the dump file itself beyond changing the CREATE
>> DATABASE encoding necessary? Right now if I perform a dump, the
>> "file" utility reports the result as ASCII text.
>
>If you are absolutely sure that you only have ASCII data, this is
>sufficient. But as soon as you have latin1 aka iso-8859-1 characters
>above 127 inside your data, this no longer works, because those
>characters are multibyte in utf-8. And this breaks at least in those
>cases where we store serialized arrays.
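A minimal sketch of the problem Jan describes, assuming the serialized arrays are in PHP's serialize() format, where each string carries a byte-length prefix (s:<length>:"<value>";). The sample text and the helper php_serialize_string are hypothetical and only illustrate the length mismatch; they are not taken from Horde.

```python
# Sketch: Latin-1 characters above 127 take two bytes in UTF-8, so a
# byte-length prefix computed on Latin-1 data is wrong after conversion.

text = "Grüße"  # hypothetical signature text with two non-ASCII characters

latin1_bytes = text.encode("iso-8859-1")  # 5 bytes, one per character
utf8_bytes = text.encode("utf-8")         # 7 bytes, ü and ß take two each


def php_serialize_string(raw: bytes) -> bytes:
    """Mimic PHP serialize() for a single string: s:<byte length>:"<bytes>";"""
    return b's:' + str(len(raw)).encode("ascii") + b':"' + raw + b'";'


# Value as it would be stored while the data is Latin-1.
stored = php_serialize_string(latin1_bytes)

# Naive re-encoding of the stored value: the payload grows, the prefix does not.
converted = stored.decode("iso-8859-1").encode("utf-8")

# What a correct UTF-8 serialization would look like.
correct = php_serialize_string(utf8_bytes)

print(len(latin1_bytes), len(utf8_bytes))
print(converted)
print(correct)
```

The converted value still declares a length of 5 while its payload is 7 bytes long, so anything that trusts the prefix would misread it.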
I'll be sure to do some more rigorous tests on the dump file to verify
that it only contains ASCII; I think some versions of the "file"
command only examine the first X bytes of the file, for some value
of X. If the dump is actually ISO-8859-1 I can probably just use iconv
to switch it to UTF-8.
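
One way to run that more rigorous check is to scan every byte of the dump rather than a prefix. The sketch below is only an illustration: the function names and any file names passed to them are hypothetical, and the conversion step assumes the dump really is ISO-8859-1, doing roughly what iconv does when converting from ISO-8859-1 to UTF-8.

```python
def first_non_ascii(path, chunk_size=1 << 20):
    """Return the byte offset of the first non-ASCII byte, or None if the
    whole file is pure ASCII. Reads the entire file, chunk by chunk."""
    offset = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            if not chunk.isascii():
                # Locate the exact position inside this chunk.
                return offset + next(i for i, b in enumerate(chunk) if b > 0x7F)
            offset += len(chunk)
    return None


def latin1_to_utf8(src, dst):
    """Re-encode src (assumed to be ISO-8859-1) as UTF-8 into dst.
    newline="" keeps the original line endings untouched."""
    with open(src, "r", encoding="iso-8859-1", newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)
```

If first_non_ascii returns None for the dump, changing the CREATE DATABASE encoding should be enough. Otherwise the returned offset shows where to look, and Jan's caveat about serialized arrays still applies to any converted data.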
------------------------------------------------------------------------
Dan Ramaley Dial Center 118, Drake University
Network Programmer/Analyst 2407 Carpenter Ave
+1 515 271-4540 Des Moines IA 50311 USA