[sync] Blackberry/Funambol: Character encoding woes

Ian Turner vectro at vectro.org
Sun Feb 26 22:47:47 UTC 2012


Hello list,

I just set up SyncML support with Funambol for a BlackBerry, and it works very
well. Thanks for your efforts in creating such a wonderful software project. I
am encountering two problems with the SyncML support in Horde; this e-mail
describes the second.

Character encodings do not appear to be respected during the sync. I'm not
sure whether this is a BlackBerry problem, a Funambol problem, an Apache
problem, a PHP problem, or a Horde problem, but I can't find any FAQs about
it, so I thought you, dear list, might know.

Example:
New memo on the BlackBerry containing the non-ASCII string "Ûñíçòðë Té§t™".
This string encoded as UTF-8:
0000000: c39b c3b1 c3ad c3a7 c3b2 c3b0 c3ab 2054  .............. T
0000010: c3a9 c2a7 74e2 84a2                     ....t...
This string encoded as ISO-8859-1 (note that '™' is not available in
ISO-8859-1, so it is missing from this dump):
0000000: dbf1 ede7 f2f0 eb20 54e9 a774            ....... T..t
This string encoded as XML/HTML character entities:
Ûñíçòðë Té§t™
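
For anyone who wants to reproduce these dumps, here is a minimal Python sketch.
Python is used only for illustration and is not part of the Horde stack. The
choice to drop '™' in the ISO-8859-1 case and to write the entities as numeric
character references are assumptions made to match the dumps.

```python
# The test string from the memo; only the standard library is used.
text = "Ûñíçòðë Té§t™"

# UTF-8 can represent every character in the string.
utf8_bytes = text.encode("utf-8")

# '™' (U+2122) has no ISO-8859-1 code point; it is dropped here (assumption)
# so that the result matches the 12-byte dump.
latin1_bytes = text.encode("iso-8859-1", errors="ignore")

# Numeric character references for every non-ASCII character (assumed form).
entities = text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

print(utf8_bytes.hex())
print(latin1_bytes.hex())
print(entities)
```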

When I create a memo containing this string on the BlackBerry, Horde appears
to attempt to commit it to the database as a mix of ISO-8859-1 bytes and HTML
character entities. As this byte sequence is not valid UTF-8 (which is the
database encoding), it is rejected by the PostgreSQL database with the error
"invalid byte sequence for encoding 'UTF8': 0xdbf1"
0000000: dbf1 ede7 f2f0 eb20 54e9 a774 2623 3834  ....... T..t&#84
0000010: 3832 3b0a                                82;.
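
The same byte sequence can be produced outside Horde, which may help narrow
down where it comes from. The sketch below is only a guess at the mechanism
(encoding to ISO-8859-1 with unencodable characters replaced by numeric
character references); it is not taken from Horde's code.

```python
text = "Ûñíçòðë Té§t™"

# Assumed mechanism: encode to ISO-8859-1 and replace anything that does not
# fit ('™' here) with a numeric character reference.
mixed = text.encode("iso-8859-1", errors="xmlcharrefreplace")
print(mixed)

# A UTF-8 consumer (such as a UTF-8 database) cannot accept these bytes:
# 0xdb announces a two-byte sequence, but 0xf1 is not a continuation byte.
try:
    mixed.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError as err:
    valid_utf8 = False
    print("not valid UTF-8:", err.reason, "at byte offset", err.start)
```

Apart from the trailing newline (0x0a), the result is byte-for-byte the
sequence in the dump above.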

There appear to be two problems here: first, Horde is trying to write
ISO-8859-1 encoded text to a UTF-8 database; second, HTML character entities
have somehow made it this far without being decoded (perhaps because '™' is
not available in ISO-8859-1). The original text does appear to have reached
Horde in the correct encoding, since /tmp/sync/data.txt contains the correct
HTML character entities. Based on this file, the problem may lie in the
conversion from text/x-s4j-sifn to text/x-vnote.
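
To make concrete what I would expect that conversion to produce, here is a
rough Python sketch. The helper name to_vnote_body is hypothetical and this is
not how Horde implements it; vNote line folding and escaping of special
characters are left out.

```python
import html
import quopri


def to_vnote_body(sif_body: str) -> str:
    """Hypothetical helper: build a vNote BODY line from a SIF-N body string."""
    # 1. Resolve character entities (e.g. "&#8482;") into Unicode characters.
    unicode_text = html.unescape(sif_body)
    # 2. Encode as UTF-8, the charset that the BODY property declares.
    utf8_bytes = unicode_text.encode("utf-8")
    # 3. Quoted-printable encode the UTF-8 bytes.
    qp = quopri.encodestring(utf8_bytes).decode("ascii")
    return "BODY;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:" + qp


# Assumed input: the memo text written as numeric character references.
line = to_vnote_body(
    "&#219;&#241;&#237;&#231;&#242;&#240;&#235; T&#233;&#167;t&#8482;"
)
print(line)
```

With this order of operations the declared charset and the actual bytes agree,
and no entity text is left in the body.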

Contents of /tmp/sync/data.txt:

Input received from client (text/x-s4j-sifn):
<note>
<Subject>Test Unicode</Subject>
<Body>Ûñíçòðë Té§t™</Body>
<Categories></Categories>
<Color></Color>
<Height></Height>
<Width></Width>
<Left></Left>
<Top></Top>
</note>


Input converted for server (text/x-vnote):
BEGIN:VNOTE
VERSION:1.1
BODY;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:=DB=F1=ED=E7=F2=F0=EB 
T=E9=A7t&#8482\;
SUMMARY:Test Unicode
CATEGORIES:
END:VNOTE
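
The mismatch in the converted vNote can be checked by decoding the BODY value
by hand. A minimal Python sketch follows; the value is copied from the log
above and joined onto one line, and treating "\;" as an escaped semicolon is
my assumption.

```python
import html
import quopri

# BODY value from the converted vNote, joined onto a single line.
logged_body = b"=DB=F1=ED=E7=F2=F0=EB T=E9=A7t&#8482\\;"

raw = quopri.decodestring(logged_body)

# The property declares CHARSET=UTF-8, but the bytes do not decode as UTF-8.
try:
    raw.decode("utf-8")
    declared_charset_ok = True
except UnicodeDecodeError:
    declared_charset_ok = False

# They do decode as ISO-8859-1. After undoing the "\;" escape (assumed) and
# the leftover character entity, the original memo text comes back.
recovered = html.unescape(raw.replace(b"\\;", b";").decode("iso-8859-1"))
print(declared_charset_ok, ascii(recovered))
```

So the body is labelled UTF-8 but actually carries ISO-8859-1 bytes plus one
undecoded entity.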

When synchronizing in the opposite direction, i.e. creating a non-ASCII note
in Horde and then synchronizing to the BlackBerry, things work as expected.

I have attached the full contents of /tmp/sync. Any thoughts you can share on 
this mystery would be much appreciated.

Versions:
Horde 3.3.8 
Notes (mnemo) H3 (2.2.3) 
PHP 5.3.3

Sincerely,

--Ian Turner
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sync.tar.gz
Type: application/x-compressed-tar
Size: 5287 bytes
Desc: not available
URL: <http://lists.horde.org/archives/sync/attachments/20120226/f9741965/attachment.bin>