[imp] Attachment corruption problem

Michael M Slusarz slusarz at horde.org
Wed Apr 26 10:33:40 PDT 2006


Quoting "Daniel A. Ramaley" <daniel.ramaley at DRAKE.EDU>:

> There has been an off-list discussion that should be part of this
> thread. Both parties agree that it should be posted back to the list.
> It follows, with most quoted text removed:
>
>
>
> Date: Tue, 25 Apr 2006 16:54:06 +0200
> From: Andreas Geesen
>
> I experienced bad behaviour when attachments got encoded as
> "quoted-printable". Can you confirm that this is the case with your
> file, too?
> If so take a look at the bytes of the original and the broken file. If
> they differ in the way EOL is used (0x0d 0x0a vs. 0x0a) you have the
> same prob which i had.

[snip]

This is the same discussion that appeared on the bug report and,
unfortunately, it is still incorrect.

As mentioned in the bug report - this is not a Horde/IMP issue.  This
is an issue with quoted-printable not being able to handle binary data
UNLESS IT IS EXPLICITLY TOLD IT IS BEING GIVEN BINARY DATA.  More
important, this issue has *nothing* to do with EOL characters - or,
more correctly, messing with EOL characters is *absolutely* the wrong
way to look at this issue.

Maybe a simple example will be in order.  Say I have the following  
text/plain file:

-----
Line one.CR
Line two.
-----

And I send it as quoted-printable.  It will be sent as the following:

-----
Line one.CRLF
Line two.
-----

As can be seen, pursuant to the RFCs, all end of line characters are
converted to CRLF.  Most important, no matter what OS the message is
read on, that OS can convert the CRLF string to whatever EOL
convention that OS uses - this is part of the decoding of an RFC
message on the receiving end.  So the message appears with the same
line breaks no matter what OS is used to read the message.  What is
important to realize is that the decoded bytes of this text message
*WILL BE DIFFERENT* depending on the OS used.  On unix, the message
will look like the following:

-----
Line one.LF
Line two.
-----

On windows the message will look like the following:

-----
Line one.CRLF
Line two.
-----

As can be seen, the file length of the former file is 19.  The file
length of the latter file is 20.  *Ack*!  What is going on?  The
answer is nothing - as explained several times in the bug reports this
is exactly what the RFCs allow.  Horde/IMP isn't broken.  Since it is
text data, the difference in file sizes doesn't matter, since with
textual data we only care about the *display*.
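
To check the arithmetic above, here is a minimal Python sketch.  The
helper name to_local_eol is made up for illustration and is not part
of Horde/IMP; it only mimics what a receiving system does when it
converts canonical CRLF line breaks to its own convention.

```python
def to_local_eol(canonical: bytes, eol: bytes) -> bytes:
    """Convert canonical CRLF line breaks to the given local EOL."""
    return canonical.replace(b"\r\n", eol)


# The text as it travels in the message: line breaks are CRLF.
canonical = b"Line one.\r\nLine two."

unix_copy = to_local_eol(canonical, b"\n")        # what a unix reader stores
windows_copy = to_local_eol(canonical, b"\r\n")   # what a windows reader stores

print(len(unix_copy), len(windows_copy))
```

The two copies display identically as text but differ by one byte,
which is the 19 versus 20 difference described above.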

But, exactly like the RFCs warn us, the problem occurs when we try to
use quoted-printable to send BINARY data.  Using the same example as
above, let's assume that this message is not text data but is binary
data instead.  Let's assume it is a windows based program that parses
this data, and this program delimits lines by CRLF.  Let's assume
Horde/IMP is running on a UNIX machine.  We go to attach our message
using IMP.  So far so good since the message will be canonicalized
when sending to:

-----
Line one.CRLF
Line two.
-----

Which just fortuitously happens to be in the format we need.  Now  
imagine this message is received on an IMP installation on a UNIX  
machine.  We go to download the file.  The file is downloaded as such:

-----
Line one.LF
Line two.
-----

And, no surprise, the file is in the wrong format.  The windows program
can't read the file.  People incorrectly point the finger at Horde/IMP.
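
The failure can be reproduced in a few lines of Python.  This is only
a sketch: split_records stands in for the hypothetical windows program
that delimits records by CRLF, and the replace() call stands in for
the EOL conversion a unix receiver is allowed to perform on text data.

```python
def split_records(data: bytes) -> list:
    """Stand-in for a parser that delimits records strictly by CRLF."""
    return data.split(b"\r\n")


# The file as attached (binary data that happens to use CRLF delimiters).
attached = b"Line one.\r\nLine two."

# The same file after being treated as text and saved on unix.
downloaded = attached.replace(b"\r\n", b"\n")

print(split_records(attached))    # two records
print(split_records(downloaded))  # one record: the delimiter is gone
```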

So how could this latter situation happen?  Because the file is  
reported to IMP at the time of sending as a text file.  As adequately  
demonstrated above, the RFCs clearly indicate that EOL formatting is  
not guaranteed when using quoted-printable encoding of text data.   
Thus, there is *nothing* broken.  There is either an issue with the  
browser incorrectly identifying the file as text to IMP when  
attaching, or there is an issue with MIME magic detection of the file.

We don't support Q-P encoding of binary data.  It defeats the whole
purpose of Q-P in the first place - Q-P is intended to give a
non-MIME-compliant reader (e.g. a simple mail user agent, or a user
looking at the raw text of the message) a way to understand the gist
of the text message without having to do any further processing.

We are not going to send all messages in base64 since #1 it would
result in *all* messages being approximately 33% larger than they
should be and #2 it does not provide the ability to quickly look at a
mail message without specialized software and still be able to
understand most (if not all) of the message.
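
Both points can be illustrated with Python's standard base64 and
binascii modules.  This is a sketch only: the sample payloads are
made up, and the size figure ignores the line breaks that MIME adds
to base64 output, which push the overhead slightly above one third.

```python
import base64
import binascii

# Point 1: base64 turns every 3 input bytes into 4 output characters.
payload = bytes(300)
encoded = base64.b64encode(payload)
overhead = len(encoded) / len(payload) - 1
print(len(payload), len(encoded), round(overhead * 100))

# Point 2: Q-P keeps mostly-ASCII text readable in the raw message.
text = b"Caf\xe9 au lait"          # "Cafe au lait" with an ISO-8859-1 e-acute
print(binascii.b2a_qp(text))       # only the non-ASCII byte is escaped
print(base64.b64encode(text))      # not readable without decoding
```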

Just FYI, to correctly Q-P binary data, the message above would have  
to be sent as follows:
-----
Line one.=0D=0ALine two.
-----

But if we know the message is binary data, we are just going to base64  
encode it anyway since it is a more efficient way of sending binary  
data (33% more efficient if the entire message is binary data) and if  
a message is binary data, we don't need the feature of being able to  
look at the message (e.g. Q-P) without specialized software since the  
data is going to be indecipherable anyway.

So if binary data is reported as text at the time of attachment then  
there can be no expectation that the message will be transmitted  
through RFC-compliant mail without alteration.  As mentioned  
previously, there may be two reasons why binary data is attached as  
text:
1.) browser reports data as text/*
This is a browser issue.
SOLUTION: Fix your browser.  Or hack Horde/IMP to send all messages in
base64.  But this will become neither an option nor the standard in
our codebase.
2.) MIME magic reports application/octet-stream data as text/*
This may or may not be a Horde issue.  This is only a Horde issue if
our internal MIME magic detection is used.  But this is the *third*
option and is only used if both the PECL fileinfo module is not
installed and the PHP mime_magic extension is not available.  If either
of these modules is used, then the issue is with their MIME magic
algorithms, which are outside our control.
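
The decision chain described above can be sketched roughly as follows.
This is an illustration in Python, not Horde's actual PHP code: every
function name here is hypothetical, the two "unavailable" detectors
simply simulate modules that are not installed, and the naive internal
guess exists only to show how binary data can be misreported as text.

```python
from typing import Callable, Optional, Sequence

Detector = Callable[[bytes], Optional[str]]


def unavailable(data: bytes) -> Optional[str]:
    """Simulates a detection module that is not installed."""
    return None


def naive_internal_guess(data: bytes) -> Optional[str]:
    """Hypothetical last-resort guess: no NUL bytes means text."""
    return "text/plain" if b"\x00" not in data else "application/octet-stream"


def detect_mime_type(data: bytes, detectors: Sequence[Detector]) -> str:
    """Return the answer of the first detector that gives one."""
    for detect in detectors:
        result = detect(data)
        if result is not None:
            return result
    return "application/octet-stream"


def choose_transfer_encoding(mime_type: str) -> str:
    """Text goes out as quoted-printable, everything else as base64."""
    return "quoted-printable" if mime_type.startswith("text/") else "base64"


# Binary data that happens to contain only printable bytes and CRLF.
data = b"Line one.\r\nLine two."
chain = [unavailable, unavailable, naive_internal_guess]
mime_type = detect_mime_type(data, chain)
print(mime_type, choose_transfer_encoding(mime_type))
```

Once the type comes back as text/*, the Q-P path is chosen and EOL
formatting is no longer guaranteed, whichever detector made the call.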

In conclusion, there is absolutely nothing wrong with the way we send  
Q-P data since we only Q-P encode a message if we are dealing with  
text data.  This is why Bug 3565 was correctly marked Bogus.

michael

___________________________________
Michael Slusarz [slusarz at horde.org]

