[imp] Hint: Whole lines missing in sent mails when using charset iso-8859-15 on older PHP versions

André Lang sierra2 at webrausch.de
Tue Jun 30 21:53:35 UTC 2009


Hi all,

I've encountered a strange problem with a recent Horde installation
running on a server with PHP 5.0.4. I am posting the solution here as a
reference for anyone else having the same problem.

The symptoms are as follows:
When sending mail in IMP, all long lines (above ~80 characters) in the
message are lost. They are missing from the saved sent message as well
as on the recipient side.
This problem only occurs with certain charsets, in my case ISO-8859-15,
while ISO-8859-1 works fine.

I tracked the problem down to a bug in PHP's iconv_substr:
http://bugs.php.net/bug.php?id=37773

When preparing the mail, getMessageBody in compose.php calls
  $textBody->setContents($flowed->toFlowed());
which somehow kills the long lines.

The reason is the rewrap code in Flowed.php, function _reformat, in the
following line:
[if line is short enough]
       } elseif ($m = String::regexMatch($line, array(
               '^(.{' . $min . ',' . $opt . '}) (.*)',
               '^(.{' . $min . ',' . $this->_maxlength . '}) (.*)',
               '^(.{' . $min . ',})? (.*)'), $this->_charset)) {
[wrap lines]

The $m array returned by String::regexMatch is empty in my case, which
means the long line in $line is lost. But why is the result empty?

Looking at String::regexMatch (/lib/Horde/String.php), there is the
following call, which converts the given array of regexps to UTF-8
(which makes no real sense in this case):
  $regex = String::convertCharset($regex, $charset, 'utf-8');
The $regex array is empty afterwards, which means no regexp = no result.
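To illustrate why an empty pattern list loses the line, here is a
minimal sketch. It assumes that String::regexMatch tries the patterns in
order and returns the captures of the first match; first_regex_match()
is a hypothetical stand-in, not the Horde code, and the lengths 2, 5 and
8 are made up in place of $min, $opt and $this->_maxlength.

```php
<?php
// Hypothetical simplification of String::regexMatch(): try each pattern in
// order and return the captures of the first one that matches.
function first_regex_match($text, array $patterns) {
    foreach ($patterns as $pattern) {
        if (preg_match('/' . $pattern . '/', $text, $m)) {
            return $m;
        }
    }
    return array(); // no pattern matched, or there were no patterns at all
}

$line = 'hello world foo bar';
// Same shape as the three patterns in _reformat, with made-up lengths.
$patterns = array('^(.{2,5}) (.*)', '^(.{2,8}) (.*)', '^(.{2,})? (.*)');

// First pattern matches: the line is split into 'hello' and 'world foo bar'.
print_r(first_regex_match($line, $patterns));

// With no patterns the result is empty, so a caller that only keeps the
// captured pieces has nothing left of the line.
print_r(first_regex_match($line, array()));
```

With a working pattern list the line is split at a space; with an empty
list there is simply nothing to return, which matches the lost lines.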

convertCharset converts the array's elements one by one using the
following line:
  $tmp[String::_convertCharset($key, $from, $to)] =
      String::convertCharset($val, $from, $to);
This code is usually fine, but it breaks if the expression in brackets on
the left does not yield a proper index such as 1, 2, ...
Debugging showed that the result of String::_convertCharset($key, $from,
$to) is "1x", "2x", ... so the array does not get filled properly.
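A small sketch of the effect on the array keys. broken_key_convert() and
convert_keys() are hypothetical names, the pattern lengths are made up,
and the values are deliberately left untouched, so this only models what
happens to the keys, not the full Horde behaviour.

```php
<?php
// Hypothetical stand-in for String::_convertCharset() on an affected PHP:
// the trailing 'x' is appended but never cut off again.
function broken_key_convert($key) {
    return $key . 'x';
}

// Mirrors the shape of the per-element loop: every key is run through the
// conversion. Values are passed through unchanged (a simplification).
function convert_keys(array $input) {
    $tmp = array();
    foreach ($input as $key => $val) {
        $tmp[broken_key_convert($key)] = $val;
    }
    return $tmp;
}

$regex = array('^(.{10,66}) (.*)', '^(.{10,78}) (.*)', '^(.{10,})? (.*)');
$converted = convert_keys($regex);

print_r(array_keys($converted));       // string keys with a trailing 'x'
var_dump(isset($converted[0]));        // the numeric index is gone
```

Any code that expects plain numeric indexes no longer finds its entries
once the keys have turned into strings like this.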

So _convertCharset is returning crap in this case. The important lines are:
  $output = @iconv($from, $to . '//TRANSLIT', $input . 'x');
  $output = (isset($php_errormsg)) ? false
      : String::substr($output, 0, -1, $to);

I don't know why, but an 'x' is appended to the string before conversion
and cut off again afterwards. Is this really necessary?
Anyhow, for me the cutoff fails here, so I get a '1x' instead of '1'.

Now a look at String::substr. When iconv_substr support is found in PHP,
the following is done:
  $ret = iconv_substr($string, $start, $length, $charset);
In my case, iconv_substr("0x", 0, -1, "ISO-8859-15") returns an empty
string, which String::substr then replaces with the original, uncut
string. This is why the "x" is still there.

Testing on the command line:
<? print iconv_substr("1x",0,-1,"ISO-8859-15"); ?>
returns "Unknown error (0) in tmp.php on line 2", which is the mentioned 
iconv_substr bug.
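Putting the pieces together, the chain can be replayed without a broken
PHP by simulating the faulty call. This is only a sketch: all function
names are hypothetical, faulty_substr() stands in for iconv_substr on an
affected build, and the fallback to the original string is modelled from
the behaviour described above rather than copied from String.php.

```php
<?php
// Hypothetical stand-in for iconv_substr() on an affected PHP build:
// it yields an empty string instead of the requested substring.
function faulty_substr($string, $start, $length) {
    return '';
}

// Stand-in for a working substring routine (byte-based, fine for ASCII).
function working_substr($string, $start, $length) {
    return substr($string, $start, $length);
}

// Models the described fallback: an empty or false result is treated as a
// failure and the original, uncut string is handed back.
function substr_or_original($string, $start, $length, $impl) {
    $ret = $impl($string, $start, $length);
    return ($ret === '' || $ret === false) ? $string : $ret;
}

// Models the sentinel step: append 'x', then cut the last character off.
// The charset conversion itself is left out because the key is plain ASCII.
function strip_sentinel($input, $impl) {
    return substr_or_original($input . 'x', 0, -1, $impl);
}

echo strip_sentinel('1', 'working_substr'), "\n"; // prints 1
echo strip_sentinel('1', 'faulty_substr'), "\n";  // prints 1x
```

The second call shows how the 'x' survives: the cut fails, the fallback
returns the uncut string, and the caller never notices.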

It only happens with certain charsets and only when the string passed in
is short.
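To check whether a given PHP installation is affected, something like the
following probe might help. It is a sketch only: iconv_substr_ok() is a
hypothetical helper, and the charset list is just an example to extend
with whatever charsets are in use.

```php
<?php
// Returns true when iconv_substr() cuts the trailing character correctly
// for the given charset, false otherwise (including on errors).
function iconv_substr_ok($charset) {
    $ret = @iconv_substr('1x', 0, -1, $charset);
    return $ret === '1';
}

$charsets = array('ISO-8859-1', 'ISO-8859-15', 'UTF-8');
foreach ($charsets as $charset) {
    echo $charset, ': ', iconv_substr_ok($charset) ? 'ok' : 'affected', "\n";
}
```

A charset reported as "affected" would be a candidate for the problem
described in this post.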

As the problem is so hard to track down (it took me 6 hours), I hope
everyone else who runs into something similar finds this post first :)

Regards,
André


