[Tickets #648] NEW: MIME.php wrapHeaders corrupting filenames
bugs at bugs.horde.org
Tue Sep 28 21:58:03 PDT 2004
DO NOT REPLY TO THIS MESSAGE. THIS EMAIL ADDRESS IS NOT MONITORED.
Ticket URL: http://bugs.horde.org/ticket/?id=648
-----------------------------------------------------------------------
Ticket | 648
Created By | Michael Slusarz <slusarz at mail.curecanti.org>
Summary | MIME.php wrapHeaders corrupting filenames
Queue | Horde Framework
State | Assigned
Priority | 2. Medium
Type | Bug
Owners | Michael Slusarz
-----------------------------------------------------------------------
Michael Slusarz <slusarz at mail.curecanti.org> (2004-09-28 21:58) wrote:
Under certain circumstances, the following function in the MIME framework
module takes a long filename that contains spaces and replaces one of
those spaces with a tab:
function wrapHeaders($header, $text, $eol = "\r\n")
{
    /* Remove any existing linebreaks. */
    $text = preg_replace("/\r?\n\s?/", ' ', $text);
    /* Wrap the line. */
    $line = wordwrap(rtrim($header) . ': ' . rtrim($text), 75, $eol . "\t");
    /* Make sure there are no empty lines. */
    $line = preg_replace("/" . $eol . "\t\s*" . $eol . "\t/", "/" . $eol . "\t/", $line);
    return substr($line, strlen($header) + 2);
}
Example:
Horde:
Content-Type: application/msword; name="Mid-Pgm Assessment
Form000000000000000.doc"
Content-Disposition: attachment; filename="Mid-Pgm Assessment
Form000000000000000.doc"
Content-Transfer-Encoding: base64
Horde with filename > 78 and no spaces:
Content-Type: application/msword;
name="Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_anothe
r_test_and_this_is_a_third_test_and_just_one_more_for_kicks.doc"
Content-Disposition: attachment;
filename="Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_an
other_test_and_this_is_a_third_test_and_just_one_more_for_kicks.doc"
Content-Transfer-Encoding: base64
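The first Horde case can be reproduced outside of the framework. This is
only a minimal sketch: it isolates the wordwrap() call from the function
above, and the unfolding step is an assumption about what a receiving
mailer does (remove each CRLF that is followed by whitespace and keep that
whitespace).

```php
<?php
// Isolated copy of the wrapping call from wrapHeaders().
$header = 'Content-Type';
$text   = 'application/msword; name="Mid-Pgm Assessment Form000000000000000.doc"';
$eol    = "\r\n";

// wordwrap() replaces the space at the break point with the break string.
$line = wordwrap($header . ': ' . $text, 75, $eol . "\t");

// Assumed receiver behaviour: unfold by removing each CRLF that is
// followed by whitespace, leaving that whitespace in place.
$unfolded = preg_replace("/\r\n(?=[ \t])/", '', $line);

// True if the space inside the quoted filename has become a tab.
var_dump(strpos($unfolded, "Assessment\tForm") !== false);

// A long value without spaces is never broken, because wordwrap() only
// cuts inside a word when its fourth argument is true.
$long = 'name="' . str_repeat('x', 120) . '.doc"';
var_dump(strpos(wordwrap($long, 75, $eol . "\t"), $eol) === false);
```

The second check matches the no-spaces example: the line is only folded at
the space after the semicolon, and the long name itself stays in one piece.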
Here are some examples of how other mailers construct this:
Pine:
Content-Type: APPLICATION/msword; name="Mid-Pgm Assessment
Form000000000000000.doc"
Content-Transfer-Encoding: BASE64
Content-Disposition: attachment; filename="Mid-Pgm Assessment
Form000000000000000.doc"
Pine with a filename > 78:
Content-Type: APPLICATION/msword; name*0="Mid-Pgm Assessment
Form000000000000000 this is a test and this is another test and th";
name*1="is is a third test and just one more for kicks.doc"
Content-Transfer-Encoding: BASE64
Content-Disposition: attachment; filename*0="Mid-Pgm Assessment
Form000000000000000 this is a test and this is another test and th";
filename*1="is is a third test and just one more for kicks.doc"
Pine with a filename > 78 and no spaces:
Content-Type: APPLICATION/msword;
name*0=Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_anoth
er_test_and_;
name*1="this_is_a_third_test_and_just_one_more_for_kicks.doc"
Content-Transfer-Encoding: BASE64
Content-Disposition: attachment;
filename*0=Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_a
nother_test_and_;
filename*1="this_is_a_third_test_and_just_one_more_for_kicks.doc"
Mulberry:
Content-Type: application/msword;
name="Mid-Pgm Assessment Form000000000000000.doc"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="Mid-Pgm Assessment Form000000000000000.doc"; size=25088
Mulberry with a filename > 78:
Content-Type: application/msword;
name="Mid-Pgm Assessment Form000000000000000 this is a test and this is
another test and this is a third test and just one more for kicks.doc"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="Mid-Pgm Assessment Form000000000000000 this is a test and this is
another test and this is a third test and just one more for kicks.doc";
size=24064
Mulberry with a filename > 78 and no spaces:
Content-Type: application/msword;
name="Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_anoth
er_test_and_this_is_a_third_test_and_just_one_more_for_kicks.doc"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
filename="Mid-Pgm_Assessment_Form0000000000000_this_is_a_test_and_this_is_a
nother_test_and_this_is_a_third_test_and_just_one_more_for_kicks.doc";
size=24064
The following patch replaces the tab character with a space. At least
that way no funky character can end up embedded in the quoted string of
the attachment filename, where some mailers cannot make sense of it and
therefore keep it as part of the name. It does not, however, deal with a
long filename made up only of alphanumeric characters:
diff -r1.132 MIME.php
809c809
< $line = wordwrap(rtrim($header) . ': ' . rtrim($text), 75, $eol . "\t");
---
> $line = wordwrap(rtrim($header) . ': ' . rtrim($text), 75, $eol . " ");
812c812
< $line = preg_replace("/" . $eol . "\t\s*" . $eol . "\t/", "/" . $eol . "\t/", $line);
---
> $line = preg_replace("/" . $eol . " \s*" . $eol . " /", "/" . $eol . " /", $line);
The Pine name*<n> notation (the parameter continuation syntax from
RFC 2231) looks like an interesting way to handle this.
From RFC 2822:
There are two limits that this standard places on the number of
characters in a line. Each line of characters MUST be no more than
998 characters, and SHOULD be no more than 78 characters, excluding
the CRLF.
The 998 character limit is due to limitations in many implementations
which send, receive, or store Internet Message Format messages that
simply cannot handle more than 998 characters on a line. Receiving
implementations would do well to handle an arbitrarily large number
of characters in a line for robustness sake. However, there are so
many implementations which (in compliance with the transport
requirements of [RFC2821]) do not accept messages containing more
than 1000 character including the CR and LF per line, it is important
for implementations not to create such messages.
The more conservative 78 character recommendation is to accommodate
the many implementations of user interfaces that display these
messages which may truncate, or disastrously wrap, the display of
more than 78 characters per line, in spite of the fact that such
implementations are non-conformant to the intent of this
specification (and that of [RFC2821] if they actually cause
information to be lost). Again, even though this limitation is put on
messages, it is encumbant upon implementations which display messages
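The two limits quoted above can be checked mechanically. A minimal sketch,
assuming a hypothetical helper checkLineLengths() that is not part of
MIME.php and a header block whose lines are separated by CRLF:

```php
<?php
// Hypothetical helper: classify each line of a header block against the
// two RFC 2822 limits (lengths exclude the trailing CRLF).
function checkLineLengths($headers)
{
    $report = array();
    foreach (explode("\r\n", $headers) as $n => $line) {
        $len = strlen($line);
        if ($len > 998) {
            $report[$n] = 'violates MUST (998)';
        } elseif ($len > 78) {
            $report[$n] = 'exceeds SHOULD (78)';
        }
    }
    return $report;
}

// Made-up sample: a short first line and a 112-character second line.
$headers = "Content-Type: application/msword;\r\n"
         . ' name="' . str_repeat('x', 100) . '.doc"';
print_r(checkLineLengths($headers));
```

Only lines that break one of the limits appear in the report, keyed by
their zero-based line number.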
I think that since the limits are "MUST be no more than 998" and "SHOULD
be no more than 78", the options are as follows:
- use spaces instead of tabs to indent continuation lines on MIME part
  headers
- start a new continuation line each time a semicolon is encountered
  outside of a quoted-string, unless it is the trailing character
- limit each of these lines to 998 or 78 characters:
  - either truncate the value portion of the header attribute so that
    the overall length of the line stays below 998
  - or use the attribute_key*<n> syntax to break up quoted-strings so
    that no line exceeds 78 characters
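The attribute_key*<n> option is the parameter continuation mechanism of
RFC 2231. A minimal sketch of splitting and rejoining one value follows.
The helper names splitParam() and joinParam() are hypothetical, and the
sketch assumes an ASCII value without double quotes and fewer than ten
pieces (non-ASCII values would need the encoded form of RFC 2231, which
is not shown here).

```php
<?php
// Hypothetical helper, not part of MIME.php: split one parameter value
// into numbered continuations (name*0, name*1, ...).
function splitParam($name, $value, $width = 75)
{
    // Room left for the value once the indent, name, '*N="', the closing
    // quote and the ';' are counted (7 characters plus the name).
    $size = max(1, $width - (strlen($name) + 7));
    $parts = array();
    foreach (str_split($value, $size) as $n => $chunk) {
        $parts[] = ' ' . $name . '*' . $n . '="' . $chunk . '"';
    }
    return implode(";\r\n", $parts);
}

// Receiver side: glue the numbered pieces back together in order.
function joinParam($name, $folded)
{
    preg_match_all('/(?<!\w)' . preg_quote($name, '/') . '\*(\d+)="([^"]*)"/',
                   $folded, $m);
    $pieces = array_combine($m[1], $m[2]);
    ksort($pieces, SORT_NUMERIC);
    return implode('', $pieces);
}

$file = 'Mid-Pgm Assessment Form000000000000000 this is a test and this is '
      . 'another test and this is a third test and just one more for kicks.doc';
$folded = splitParam('filename', $file);
echo $folded, "\r\n";
var_dump(joinParam('filename', $folded) === $file);
```

Because the value is cut at fixed offsets instead of at spaces, the
filename survives the round trip unchanged whether or not it contains
spaces.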
I was thinking of replacing the function with something like the
following (this is only a sketch and hasn't been tested):
function wrapHeaders($header, $text, $eol = "\r\n")
{
    /* Remove any existing linebreaks. */
    $text = trim(preg_replace("/\r?\n\s?/", ' ', $text));
    $header = trim($header);
    if ((strlen($header) + 2 + strlen($text)) < 75) {
        return $text;
    }
    /* need a more accurate separator regex here but this is just
       for demonstrative purposes */
    $attrs = array_map('trim', preg_split('/;/', $text, -1,
                                          PREG_SPLIT_NO_EMPTY));
    $lines = array();
    for ($i = 0; $i < count($attrs); $i++) {
        /* the first line has to account for the length of the header
           addition, the others just for a single whitespace indent */
        $prefix = ($i == 0) ? $header . ': ' : ' ';
        $offset = strlen($prefix);
        if (($offset + strlen($attrs[$i])) < 75) {
            $lines[] = $prefix . $attrs[$i];
            continue;
        }
        $attrItems = explode('=', $attrs[$i], 2);
        if (count($attrItems) < 2) {
            /* if the separator isn't found in the attribute then
             * the value should probably not be folded.
             * just make sure it doesn't exceed 995
             */
            $lines[] = $prefix . substr($attrs[$i], 0, 995 - $offset);
            continue;
        }
        $attrName = $attrItems[0];
        $attrValue = trim($attrItems[1], '"');
        /* 6 = '*', one digit, '=', two quotes and the ';' */
        $size = max(1, 75 - ($offset + strlen($attrName) + 6));
        foreach (str_split($attrValue, $size) as $c => $chunk) {
            $lines[] = $prefix . "$attrName*$c=" . '"' . $chunk . '"';
            /* only the very first line carries the header name */
            $prefix = ' ';
        }
    }
    /* like the current function, return the value without the header name */
    return substr(implode(';' . $eol, $lines), strlen($header) + 2);
}
I think there should also be some code in place to deal with displaying
these long filenames at the top of the message in HTML. I think the
anchor text should be truncated to a certain number of characters, and a
title attribute (anchors have no alt attribute) with the full string
should be added.
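A minimal sketch of that idea, assuming a hypothetical helper
attachmentLink() and an arbitrary limit of 40 characters. It counts bytes,
so filenames in a multibyte charset would need the mbstring functions
instead.

```php
<?php
// Hypothetical helper: render an attachment link whose visible text is
// truncated while the title attribute keeps the full filename.
function attachmentLink($url, $filename, $max = 40)
{
    $label = $filename;
    if (strlen($label) > $max) {
        // Leave room for the three dots that mark the truncation.
        $label = substr($label, 0, $max - 3) . '...';
    }
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES) . '" title="'
         . htmlspecialchars($filename, ENT_QUOTES) . '">'
         . htmlspecialchars($label, ENT_QUOTES) . '</a>';
}

// Made-up URL and filename, for illustration only.
echo attachmentLink('view.php?id=1', str_repeat('a', 50) . '.doc'), "\n";
```

Most browsers show the title attribute as a tooltip, so the full name
stays reachable without widening the attachment list.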
Comments?
--
Sam Nicolary