[imp] HTML to plain text formatting

Oscar del Rio delrio at mie.utoronto.ca
Wed Nov 7 17:13:38 UTC 2012


On 11/ 7/12 12:08 AM, Michael M Slusarz wrote:
> Quoting Oscar del Rio <delrio at mie.utoronto.ca>:
>
>> While testing HTML composing of emails I noticed some problems in the 
>> text/plain conversion.
>>
>> The text/html part is as expected:
>> Normal <strong>Strong</strong> Normal <em>Italics</em> Normal 
>> <u>Underline</u> Normal <strike>Strike</strike> Normal
>>
>> But the text/plain part seems to have conversion problems (words are 
>> joined):
>> Normal STRONGNormal /Italics/Normal _Underline_Normal StrikeNormal
>>
>
> I can't reproduce with IMP 6.  And this testcase passes just fine:
>
> http://git.horde.org/horde-git/-/commit/4ec7d98e8fc68bdea1332e0a6391e26d61047ddd 
>
>

I did some debugging and was able to reproduce the problem on demo.horde.org.

The problem seems to be triggered by <br> lines within the paragraph
(e.g. if you do not backspace the blank line that is already inserted 
when the composer window opens).

I was also able to trace it to the ltrim() and rtrim() calls in the function 
_node($doc, $node) in Horde/Text/Filter/Html2text.php.
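
A minimal sketch of how an ltrim() on individual text nodes could join 
words. This is not the actual Html2text.php code: the fragment list and 
the joinFragments() helper are made up for illustration, and they only 
assume that the text after an inline element gets trimmed on its own.

```php
<?php
// Hypothetical model: the pieces of "word1 <strong>word2</strong> word3"
// after the <strong> content has been upper-cased.
$fragments = array('word1 ', 'WORD2', ' word3');

// Hypothetical helper (not part of Horde): join the fragments, optionally
// ltrim()ing every fragment except the first, as a stand-in for trimming
// each text node individually.
function joinFragments(array $fragments, $trimEach)
{
    $out = '';
    foreach ($fragments as $i => $text) {
        if ($trimEach && $i > 0) {
            $text = ltrim($text);
        }
        $out .= $text;
    }
    return $out;
}

echo joinFragments($fragments, false), "\n"; // word1 WORD2 word3
echo joinFragments($fragments, true), "\n";  // word1 WORD2word3
```

With trimming enabled, the space that separated "WORD2" from "word3" 
belongs to the following text node, so ltrim() removes it and the words 
run together.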

Case 1 (bug):
<p><br>
word1 <strong>word2</strong> word3</p>

Output of _node(): "\nword1 WORD2word3"

Case 2 (OK):
<p>word1 <strong>word2</strong> word3</p>

Output: "\nword1 WORD2 word3\n"


Case 3 (bug):
<p><br>
word1 <strong>word2</strong> word3<br>
word4 <strong>word5</strong> word6<br></p>

Output: "\nword1 WORD2word3\nword4 WORD5word6\n"


Case 4 (OK but extra space at start of second line):
<p>word1 <strong>word2</strong> word3<br>
word4 <strong>word5</strong> word6</p>

Output: "\nword1 WORD2 word3\n word4 WORD5 word6\n"
Note the extra space at the start of the second line " word4" which I 
think should have been ltrim()'d.
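
The four outputs above can be checked mechanically for joined words. 
The following is only a sketch: joinedWords() is a hypothetical helper, 
not something in Horde, and it simply flags any whitespace-separated 
token that is not one of the expected words.

```php
<?php
// Hypothetical helper: return the tokens in $output that are not in the
// list of expected words, i.e. words that were run together.
function joinedWords($output, array $expectedWords)
{
    $tokens = preg_split('/\s+/', $output, -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_diff($tokens, $expectedWords));
}

$words = array('word1', 'WORD2', 'word3', 'word4', 'WORD5', 'word6');

// Outputs copied from the four cases above.
$outputs = array(
    1 => "\nword1 WORD2word3",
    2 => "\nword1 WORD2 word3\n",
    3 => "\nword1 WORD2word3\nword4 WORD5word6\n",
    4 => "\nword1 WORD2 word3\n word4 WORD5 word6\n",
);

foreach ($outputs as $case => $output) {
    $bad = joinedWords($output, $words);
    echo "Case $case: ", $bad ? 'joined: ' . implode(', ', $bad) : 'OK', "\n";
}
```

This flags cases 1 and 3 and passes cases 2 and 4. It does not catch 
the extra leading space in case 4, because it splits on whitespace and 
ignores how much of it there is.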



