[imp] Bugs or configuration errors? (RFC 822 linebreaks.

Chris Hastie lists@oak-wood.co.uk
Fri, 5 Jul 2002 09:27:20 +0100


On Fri, 5 Jul 2002, Michael M Slusarz <slusarz@bigworm.colorado.edu> 
wrote:
>Quoting Chris Hastie <lists@oak-wood.co.uk>:
>
>| 3)  Already mentioned this morning (well it's morning here) - base64
>|     encoded attachments to messages composed in IMP are reported by my
>|     main MUA (Turnpike) to have invalid characters in the Base64. This
>|     seems to apply to messages that actually get sent out via postfix.
>|     Copies saved by IMP using the IMAP APPEND command are fine. A quick
>|     comparison suggests a problem with line endings, but I'll have to
>|     think of something more elaborate to be sure the OS on the receiving
>|     machine isn't messing about with them.
>
>Here is a (somewhat related) thread on this subject:
>http://www.sumthin.nu/archives/qmail/Mar_2002/msg00029.html
>
>As near as I can tell, this is the situation: despite what it says in the
>thread, headers should be sent with the <CR><LF> break.

I'd certainly agree about that.

>  Reading RFC 822,
>it seems to want to have <CR><LF> at the end of every line also.

Yes, it's fairly clear about that:

|   Messages are divided into lines of characters.  A line is a series of
|   characters that is delimited with the two characters carriage-return
|   and line-feed; that is, the carriage return (CR) character (ASCII
|   value 13) followed immediately by the line feed (LF) character (ASCII
|   value 10).  (The carriage-return/line-feed pair is usually written in
|   this document as "CRLF".)

and

|   - CR and LF MUST only occur together as CRLF; they MUST NOT appear
|     independently in the body.
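
For anyone wanting to check a captured message against that rule, here
is a minimal sketch in Python. The function name is my own invention,
and it assumes you have the raw bytes exactly as they came off the wire,
before any client or OS has had a chance to rewrite the line endings.

```python
import re
from typing import List

# A CR not followed by LF, or an LF not preceded by CR.
_BARE_EOL = re.compile(rb"\r(?!\n)|(?<!\r)\n")

def find_bare_line_endings(raw: bytes) -> List[int]:
    """Return the byte offsets of every CR or LF that is not part of a CRLF pair."""
    return [m.start() for m in _BARE_EOL.finditer(raw)]
```

An empty list means every line break is a proper CRLF. Note that a
CRCRLF sequence is reported once, at the offset of the first (stray) CR.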

> This
>message is sent with <CR><LF> through the Postfix MTA - it _should_ strip
>off the <CR> and leave only the <LF>.  This is good.

OK, I'm lost as to why Postfix should strip off the <CR> and produce a 
message that breaks RFC 2822. Except for local delivery that is - see 
below.
>
>However, the sendmail php library in PEAR doesn't send <CR><LF>, it sends
>only <LF>.  This is bad, I think, and nothing that (I) personally can
>change.  Additionally, postfix does not seem to do a <CR><LF> -> <LF>
>conversion for local message delivery - see
>http://archives.neohapsis.com/archives/postfix/2000-02/0398.html

My reading of that archive message suggests that Postfix /does/ convert
CRLF to LF for local delivery. This seems acceptable on a system that
expects lines to end in LF only. The problem I encountered, however, was
not with local delivery, but with onward delivery via SMTP to my MUA. If
I delivered locally and then collected with POP3, the message was OK.

This whole area is a nightmare which has bitten me several times. I've
gone to great lengths to ensure that scripts generate CRLF pairs for
mails, only to find something else assumes the input to be native LF and
converts it before mailing, leaving me with CRCRLF.
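
To make the CRCRLF trap concrete, here is a small Python sketch. The
function is hypothetical and only stands in for "something else" in the
chain that blindly assumes native LF input; it is not taken from any
real MTA or library.

```python
def naive_lf_to_crlf(data: bytes) -> bytes:
    """Turn every LF into CRLF, assuming (wrongly, here) that the input is LF-only."""
    return data.replace(b"\n", b"\r\n")

# A message that was already carefully built with CRLF line endings.
already_crlf = b"Subject: test\r\n\r\nbody\r\n"

# The LF of each existing CRLF pair gains a second CR in front of it.
doubled = naive_lf_to_crlf(already_crlf)
# doubled == b"Subject: test\r\r\n\r\r\nbody\r\r\n"
```

Fed LF-only input, the same function produces perfectly good CRLF, which
is why this sort of thing goes unnoticed until somebody upstream does
the right thing.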
>
>Also, there is mention that SMTP delivery is different from local MTA
>delivery.
>
Given your comments on the php sendmail library I'm inclined to think 
this:

Using SMTP delivery, IMP is sending CRLF throughout. What seems fairly
clear is that in an SMTP context this is what is expected by pretty much 
everything.

Delivering by piping to the MTA, IMP is sending only LF, or possibly only
what is native on that system, so LF on *nix and CRLF on Windoze. The
behaviour of different MTAs appears to differ here, and I'm guessing
Postfix is passing this unaltered, whilst Sendmail converts to CRLF
before placing it in the SMTP stream. If I'm right about that, it
suggests that changing IMP's behaviour to send CRLF would work with
Postfix, which would pass them unaltered, and with Sendmail, since the
thread you mentioned seems to suggest that's fine. It would break
Qmail, though, which it seems expects native LF and converts each LF to
CRLF: on receiving a CRLF it converts the LF, leaving CRCRLF.

It seems to me that, given RFC 2822's statement that

|   - CR and LF MUST only occur together as CRLF; they MUST NOT appear
|     independently in the body.

the sensible MTA would do something like

CRLF -> LF
CR -> LF
LF -> CRLF

thus ensuring that whether the input started out life on Mac, DOS or
*nix it leaves with only CRLF line endings.
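
Those three steps, applied in that order, can be sketched in Python as
follows. The function name is made up for illustration, and this is only
what I'd consider sensible behaviour, not what any particular MTA
actually does.

```python
def to_crlf(data: bytes) -> bytes:
    """Normalise DOS, old Mac and *nix line endings to CRLF."""
    data = data.replace(b"\r\n", b"\n")  # CRLF -> LF
    data = data.replace(b"\r", b"\n")    # CR   -> LF (any CR left is a lone one)
    return data.replace(b"\n", b"\r\n")  # LF   -> CRLF

mixed = b"unix\nmac\rdos\r\nend"
clean = to_crlf(mixed)
# clean == b"unix\r\nmac\r\ndos\r\nend"
```

The order matters: collapsing everything to LF first means the last
step never sees an existing CRLF, so running the function over its own
output changes nothing. One caveat is that an already doubled CRCRLF
comes out as two line breaks, because the stray CR is treated as an old
Mac line ending.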

>This is a bit of a hornet's nest and will take some time to figure out.

You're not kidding :)

If I get time I'll try and take a closer look, but right now time is in 
short supply.

-- 
Chris Hastie