[Tickets #10148] Re: Incorrect message charset in replies with reply_headers

Tue Jun 7 05:56:57 UTC 2011

Ticket URL: http://bugs.horde.org/ticket/10148
------------------------------------------------------------------------------
  Ticket             | 10148
  Updated By         | Michael Slusarz <slusarz at horde.org>
  Summary            | Incorrect message charset in replies with reply_headers
  Queue              | IMP
  Version            | Git master
  Type               | Bug
  State              | Assigned
  Priority           | 1. Low
  Milestone          |
  Patch              |
  Owners             | Michael Slusarz
------------------------------------------------------------------------------

Michael Slusarz <slusarz at horde.org> (2011-06-07 05:56) wrote:

> If the reply_headers preference is set,  
> IMP_Compose#replyMessageText() is inserting the decoded From: header  
> into the message text. This header might contain non-ASCII  
> characters. When determining the message's charset further down,  
> only the original message's charset is considered though. This is a  
> problem if the original message matches the email charset of the  
> current language (so that we don't use UTF-8 for the reply message),  
> but the From: header cannot be converted to that charset.

I'm thinking we should a) stop auto-determining the charset of the
outgoing reply message based on the original message and b) do away
with the sending_charset preference.  In other words, we should send
everything in UTF-8.  Are there really mail readers out there in 2011
that still don't support UTF-8?

Of the two, I am more inclined to keep a), simply because it gives us
hard information that the sender can read e-mail in that charset (on
at least one of their MUAs).  And people generally aren't doing
something like responding to a Norwegian message in Mandarin Chinese,
so this charset hint is useful in almost all cases.  But the code
would be so much easier to maintain if we just converted everything to
UTF-8 and consistently stuck with that internally.

a) would work better if we had some way of telling that the conversion
from UTF-8 to the other charset was unsuccessful (e.g. a character has
no mapping in the target charset).  But I don't think our conversion
methods give us this kind of feedback, so that doesn't help us.
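
For illustration only, here is a minimal sketch of the kind of feedback
I mean. It is written in Python rather than against our own conversion
methods, and the function name choose_reply_charset is hypothetical.
It assumes the reply text is already available as a Unicode string: a
strict encode raises an error when a character has no mapping in the
preferred charset, and we fall back to UTF-8 in that case.

```python
def choose_reply_charset(text: str, preferred: str) -> str:
    """Return `preferred` if all of `text` can be encoded in it, else 'utf-8'.

    Hypothetical helper, not part of IMP or Horde.
    """
    try:
        # errors="strict" raises UnicodeEncodeError on the first character
        # that has no mapping in the target charset.
        text.encode(preferred, errors="strict")
    except (UnicodeEncodeError, LookupError):
        # LookupError covers a charset name that Python does not know.
        return "utf-8"
    return preferred


if __name__ == "__main__":
    # Body plus an inserted From: line, checked against the original charset.
    print(choose_reply_charset("J\u00f6rg wrote: hello", "iso-8859-1"))
    print(choose_reply_charset("\u0141ukasz wrote: hello", "iso-8859-1"))
```

With a check like this, the charset hint from the original message
would only be used when the whole reply text, including any inserted
headers, actually fits into it.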

Having the user pick the charset is simply out of the question.   
Nobody, outside of maybe programmers and computer scientists, has any  
clue what charsets mean anyway, so giving them an option to change it is
completely pointless.

Thoughts?