[imp] problem with attachments in unicode (UTF16)

Andrew Morgan morgan at orst.edu
Thu Mar 20 22:49:30 UTC 2008


On Thu, 20 Mar 2008, Philip Steeman wrote:

> Hello,
> when I send a mail with imp and add an attachment in UTF16, it isn't
> base64 encoded as it should be.
> So it becomes corrupted when opened in Outlook or Thunderbird.
>
> You can find a very short unicode-file here to do a test:
> http://users.khbo.be/steeman/unicode.html
>
> Is it an error in horde/imp, or in php or in ...
>
> PS: I tested it with lots of browsers (in windows and linux) to get the
> same result.
>
> Versions:
> horde 3.1.7
> imp 4.1.6
> php 4.3.10

I tested this with Iceweasel (Firefox) 2.0.0.12 on Debian Unstable as the 
client, and the latest stable releases of Horde and IMP with PHP5 on 
Debian Etch as the server.  The browser says the content-type is 
text/plain when it uploads the attachment.  Here is the exact attachment 
that was sent in the email:

--=_3uemsho7ppkw
Content-Type: text/plain;
         charset=UTF-8;
         name="unicode.txt"
Content-Disposition: attachment;
         filename="unicode.txt"
Content-Transfer-Encoding: quoted-printable

=FF=FEt=00h=00i=00s=00 =00i=00s=00 =00a=00 =00t=00e=00s=00t=00=0D=00
=00i=00n=00 =00U=00T=00F=001=006=00=0D=00
=00=0D=00
=00P=00h=00i=00l=00i=00p=00 =00S=00t=00e=00e=00m=00a=00n=00=0D=00
=00
--=_3uemsho7ppkw--

It used quoted-printable encoding instead of Base64, and it labelled the
part charset=UTF-8 even though the data is UTF-16.  I'm not a
quoted-printable whiz, but it appears that the high-order byte of each
two-byte character gets encoded as =00 (NUL).  When I download this same
attachment using IMP, it is identical to your original unicode.txt file.
However, I suspect Thunderbird and Outlook are not combining the two bytes
of data back together (=FF=FE into FFFE, the byte-order mark) but are
trying to render the NUL characters.
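For anyone who wants to check this outside a mail client, here is a small Python sketch (not anything IMP does itself) that decodes the quoted-printable body shown above and then interprets the bytes both ways. One assumption: the hard line breaks in the body are written as a bare "\n" so that each one decodes to the single 0x0A byte the original file had at that position.

```python
import quopri

# Quoted-printable body of the attachment, copied from the message above.
# Assumption: hard line breaks are joined with "\n" so that each decodes to
# the single 0x0A byte that sat between the =0D=00 and =00 bytes in the file.
qp_body = b"\n".join([
    b"=FF=FEt=00h=00i=00s=00 =00i=00s=00 =00a=00 =00t=00e=00s=00t=00=0D=00",
    b"=00i=00n=00 =00U=00T=00F=001=006=00=0D=00",
    b"=00=0D=00",
    b"=00P=00h=00i=00l=00i=00p=00 =00S=00t=00e=00e=00m=00a=00n=00=0D=00",
    b"=00",
])

# Undo the quoted-printable transfer encoding to get the raw file bytes.
raw = quopri.decodestring(qp_body)

# The first two bytes are the UTF-16 byte-order mark (little-endian).
print(raw[:2].hex())

# Read as UTF-16, the bytes give back the original text.
text = raw.decode("utf-16")
print(repr(text))

# Read as UTF-8 (what the Content-Type header claims), decoding fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print(utf8_ok)
```

So the bytes themselves survive the quoted-printable round trip; it is the charset=UTF-8 label that does not match them.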

Anyways, I'm mostly writing this all down because I was interested enough 
to test it myself.  I have no idea if there is some way to get IMP to use 
the UTF-16 character set instead.
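For comparison, this is roughly what the part would look like if it were Base64 encoded and labelled UTF-16. It is only a sketch using Python's standard email package, not IMP or Horde code, and the sample text and variable names are made up for illustration.

```python
from email import encoders
from email.mime.base import MIMEBase

# Hypothetical sample data: a UTF-16 little-endian file with a leading BOM,
# similar in shape to the unicode.txt test file.
raw = b"\xff\xfe" + "this is a test\r\nin UTF16\r\n".encode("utf-16-le")

# Build a text/plain part that declares the charset the bytes really use.
part = MIMEBase("text", "plain", charset="utf-16", name="unicode.txt")
part.set_payload(raw)

# Base64 the payload and set Content-Transfer-Encoding: base64.
encoders.encode_base64(part)
part.add_header("Content-Disposition", "attachment", filename="unicode.txt")

print(part.as_string())

# Decoding the part again should give back exactly the original bytes.
round_trip = part.get_payload(decode=True)
print(round_trip == raw)
```

Whether IMP can be configured to produce a part like this is a separate question that this sketch does not answer.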

 	Andy

