[Tickets #3565] RESOLVED: Attachment modification (newline structure changes)

bugs at bugs.horde.org
Wed Apr 26 10:35:00 PDT 2006


DO NOT REPLY TO THIS MESSAGE. THIS EMAIL ADDRESS IS NOT MONITORED.

Ticket URL: http://bugs.horde.org/ticket/?id=3565
-----------------------------------------------------------------------
 Ticket             | 3565
 Updated By         | Michael Slusarz <slusarz at horde.org>
 Summary            | Attachment modification (newline structure changes)
 Queue              | IMP
 Version            | HEAD
 State              | Bogus
 Priority           | 2. Medium
 Type               | Bug
 Owners             | Michael Slusarz
-----------------------------------------------------------------------


Michael Slusarz <slusarz at horde.org> (2006-04-26 10:34) wrote:

From imp at lists:

Quoting "Daniel A. Ramaley" <daniel.ramaley at DRAKE.EDU>:

> There has been an off-list discussion that should be part of this
> thread. Both parties agree that it should be posted back to the list.
> It follows, with most quoted text removed:
>
>
>
> Date: Tue, 25 Apr 2006 16:54:06 +0200
> From: Andreas Geesen
>
> I experienced bad behaviour when attachments got encoded as
> "quoted-printable". Can you confirm that this is the case with your
> file, too?
> If so take a look at the bytes of the original and the broken file. If
> they differ in the way EOL is used (0x0d 0x0a vs. 0x0a) you have the
> same prob which i had.

[snip]

This is the same discussion as appeared on the bug report and,
unfortunately, this discussion is still incorrect.

As mentioned in the bug report - this is not a Horde/IMP issue.  This is an
issue with quoted-printable not being able to handle binary data UNLESS IT
IS EXPLICITLY TOLD IT IS BEING GIVEN BINARY DATA.  More important, this issue
has *nothing* to do with EOL characters - or, more correctly, messing with
EOL characters is *absolutely* the wrong way to look at this issue.

Maybe a simple example will be in order.  Say I have the following
text/plain file:

-----
Line one.CR
Line two.
-----

And I send it in quoted-printable.  It will be sent as the following:

-----
Line one.CRLF
Line two.
-----

As can be seen, pursuant to RFCs, all end of line characters are converted
to CRLF.  Most important, no matter what OS the message is read on, that OS
can convert the CRLF string to whatever EOL convention that OS uses - this
is part of the decoding of an RFC message on the receiving end.  So the
message appears with the same line breaks no matter what OS is used to read
the message.  What is important to realize is that this text message *WILL
BE DIFFERENT* depending on the OS used.  On unix, the message will look like
the following:

-----
Line one.LF
Line two.
-----

On windows the message will look like the following:

-----
Line one.CRLF
Line two.
-----

As can be seen, the file length of the former file is 19.  The file length
of the latter file is 20.  *Ack*!  What is going on?  The answer is nothing
- as explained several times in the bug reports this is exactly what the
RFCs allow.  Horde/IMP isn't broken.  Since it is text data, the difference
in file sizes doesn't matter, since with textual data we only care about
the *display*.
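
To make the size difference concrete, here is a minimal Python sketch of the
two conversions described above.  Python is used purely for illustration, and
the helper names canonicalize and to_local are made up for this example; they
are not Horde/IMP functions.

```python
import re

# Matches any of the three common line-ending conventions (CRLF first).
_EOL = re.compile(rb"\r\n|\r|\n")

def canonicalize(text: bytes) -> bytes:
    """Convert every line ending to CRLF, the canonical form for transport."""
    return _EOL.sub(b"\r\n", text)

def to_local(wire: bytes, eol: bytes) -> bytes:
    """Convert canonical CRLF line endings to the reader's local convention."""
    return wire.replace(b"\r\n", eol)

original = b"Line one.\rLine two."    # the CR-delimited file from the example
wire = canonicalize(original)          # what travels inside the message
on_unix = to_local(wire, b"\n")
on_windows = to_local(wire, b"\r\n")

print(len(on_unix), len(on_windows))   # 19 20
```

The two results display identically as text, yet they are different byte
sequences.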

But, exactly like the RFCs warn us, the problem occurs when we try to use
quoted-printable to send BINARY data.  Using the same example as above, let's
assume that this message is not text data but is binary data instead.  Let's
assume it is a windows based program that parses this data, and this program
delimits lines by CRLF.  Let's assume Horde/IMP is running on a UNIX machine.
We go to attach our message using IMP.  So far so good since the message
will be canonicalized when sending to:

-----
Line one.CRLF
Line two.
-----

Which just fortuitously happens to be in the format we need.  Now imagine
this message is received on an IMP installation on a UNIX machine.  We go to
download the file.  The file is downloaded as such:

-----
Line one.LF
Line two.
-----

And, no surprise, the file is in the wrong format.  The windows program can't
read the file.  People incorrectly point the finger at Horde/IMP.
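
The round trip just described can be simulated in a few lines.  This is only
a sketch under the assumptions of the example (text handling on both ends,
UNIX on the receiving side); send_as_text and save_on_unix are hypothetical
names, not real IMP code.

```python
def send_as_text(data: bytes) -> bytes:
    """Canonicalize all line endings to CRLF, as is done for text parts."""
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n")

def save_on_unix(wire: bytes) -> bytes:
    """Decode a text part on a UNIX host: CRLF becomes the local LF."""
    return wire.replace(b"\r\n", b"\n")

# A file whose consumer requires CRLF delimiters, i.e. effectively binary.
attachment = b"Line one.\r\nLine two."

downloaded = save_on_unix(send_as_text(attachment))

print(downloaded == attachment)   # False
print(downloaded)                 # b'Line one.\nLine two.'
```

Nothing in either step is wrong for text; the bytes change only because the
data was declared to be text in the first place.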

So how could this latter situation happen?  Because the file is reported to
IMP at the time of sending as a text file.  As adequately demonstrated
above, the RFCs clearly indicate that EOL formatting is not guaranteed when
using quoted-printable encoding of text data.  Thus, there is *nothing*
broken.  There is either an issue with the browser incorrectly identifying
the file as text to IMP when attaching, or there is an issue with MIME magic
detection of the file.

We don't support Q-P encoding of binary data.  It defeats the whole purpose
of Q-P in the first place - Q-P is intended to provide a non MIME-compliant
reader (e.g. simple mail user agent, a user looking at the raw text of the
message) a way to understand the gist of the text message without having to
do any further processing.

We are not going to send all messages in base64 since #1 it would result in
*all* messages being approximately 33% larger than they should and #2 it
does not provide the ability to quickly look at a mail message without
specialized software and still be able to understand most (if not all) of
the message.
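
Both points can be checked with Python's standard quopri and base64 modules.
The sample text below is an arbitrary stand-in for a mail body, chosen only
for illustration.

```python
import base64
import quopri

# Arbitrary plain ASCII text standing in for a typical message body.
text = ("The quick brown fox jumps over the lazy dog.\n" * 20).encode("ascii")

qp = quopri.encodestring(text)    # stays readable in a raw message view
b64 = base64.b64encode(text)      # opaque without decoding

print(len(text), len(qp), len(b64))
print(b"quick brown fox" in qp)   # True
```

For text like this the base64 output is 4 characters for every 3 input bytes,
before any line wrapping is added, while the Q-P output can still be read by
eye.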

Just FYI, to correctly Q-P binary data, the message above would have to be
sent as follows:
-----
Line one.=0D=0ALine two.
-----
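
For reference, Python's standard binascii module can produce this kind of
binary-safe Q-P when told the input is not text.  This only illustrates the
encoding itself; it says nothing about how Horde/IMP is implemented.

```python
import binascii

data = b"Line one.\r\nLine two."

# istext=False tells the encoder that CR (0x0D) and LF (0x0A) are data,
# not line breaks, so each is escaped instead of emitted as a line ending.
encoded = binascii.b2a_qp(data, istext=False)
decoded = binascii.a2b_qp(encoded)

print(encoded)           # b'Line one.=0D=0ALine two.'
print(decoded == data)   # True
```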

But if we know the message is binary data, we are just going to base64
encode it anyway since it is a more efficient way of sending binary data
(33% more efficient if the entire message is binary data) and if a message
is binary data, we don't need the feature of being able to look at the
message (e.g. Q-P) without specialized software since the data is going to
be indecipherable anyway.
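
A rough size comparison for non-text data, again using only the Python
standard library.  The payload is an arbitrary test pattern covering every
byte value; the exact ratio depends on the data, so treat the numbers as
illustrative only.

```python
import base64
import binascii

# Arbitrary binary test pattern: every byte value, repeated four times.
payload = bytes(range(256)) * 4

b64 = base64.b64encode(payload)
qp = binascii.b2a_qp(payload, istext=False)

print(len(payload), len(b64), len(qp))
print(len(b64) < len(qp))   # True
```

Base64 always costs 4 output characters per 3 input bytes, whereas Q-P spends
3 characters on every byte it has to escape.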

So if binary data is reported as text at the time of attachment then there
can be no expectation that the message will be transmitted through
RFC-compliant mail without alteration.  As mentioned previously, there may
be two reasons why binary data is attached as text:
1.) browser reports data as text/*
This is a browser issue.
SOLUTION: Fix your browser.  Or hack Horde/IMP to send all messages in
base64.  But this will become neither an option nor the standard in our
codebase.
2.) MIME magic reports application/octet-stream data as text/*
This may or may not be a Horde issue.  This is only a Horde issue if our
internal MIME magic detection is used.  But this is the *third* option and
is only used if both the PECL fileinfo module is not installed and the PHP
mime_magic extension is not available.  If either of these modules is used,
then the issue is with their MIME magic algorithms, which are out of our
control.
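
The fallback order and its consequence can be sketched as follows.  Every
name here is hypothetical: the three detector functions are stand-ins for
the fileinfo, mime_magic and internal detection layers, not their real APIs,
and the internal heuristic shown is deliberately naive.

```python
from typing import Callable, Optional, Sequence

# A detector returns a MIME type, or None if it is unavailable.
Detector = Callable[[bytes], Optional[str]]

def detect_type(data: bytes, detectors: Sequence[Detector]) -> str:
    """Return the first type reported by the detectors, tried in order."""
    for detect in detectors:
        mime_type = detect(data)
        if mime_type is not None:
            return mime_type
    return "application/octet-stream"

def choose_encoding(mime_type: str) -> str:
    """Text parts get quoted-printable; everything else gets base64."""
    return "quoted-printable" if mime_type.startswith("text/") else "base64"

def fileinfo_stub(data: bytes) -> Optional[str]:
    return None   # pretend the module is not installed

def mime_magic_stub(data: bytes) -> Optional[str]:
    return None   # pretend the extension is not available

def internal_stub(data: bytes) -> Optional[str]:
    # Naive heuristic: pure ASCII is assumed to be text.
    is_ascii = all(byte < 0x80 for byte in data)
    return "text/plain" if is_ascii else "application/octet-stream"

chain = [fileinfo_stub, mime_magic_stub, internal_stub]
detected = detect_type(b"Line one.\r\nLine two.", chain)
print(detected, choose_encoding(detected))   # text/plain quoted-printable
```

An ASCII file with significant CRLF delimiters looks like text to any such
heuristic, so it is sent as Q-P text no matter which layer made the call.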

In conclusion, there is absolutely nothing wrong with the way we send Q-P
data since we only Q-P encode a message if we are dealing with text data. 
This is why Bug 3565 was correctly marked Bogus.



