[imp] HTML to plain text formatting

Michael M Slusarz slusarz at horde.org
Wed Nov 7 22:04:04 UTC 2012


Quoting Oscar del Rio <delrio at mie.utoronto.ca>:

> On 11/ 7/12 12:08 AM, Michael M Slusarz wrote:
>> Quoting Oscar del Rio <delrio at mie.utoronto.ca>:
>>
>>> While testing HTML composing of emails I noticed some problems in  
>>> the text/plain conversion.
>>>
>>> The text/html part is as expected:
>>> Normal <strong>Strong</strong> Normal <em>Italics</em> Normal  
>>> <u>Underline</u> Normal <strike>Strike</strike> Normal
>>>
>>> But the text/plain part seems to have conversion problems (words  
>>> are joined):
>>> Normal STRONGNormal /Italics/Normal _Underline_Normal StrikeNormal
>>>
>>
>> I can't reproduce with IMP 6.  And this test case passes just fine:
>>
>> http://git.horde.org/horde-git/-/commit/4ec7d98e8fc68bdea1332e0a6391e26d61047ddd
>
> I did some debugging and was able to reproduce the problem on demo.horde.org
>
> The problem seems to be triggered by <br> lines within the paragraph
> (e.g. if you do not backspace the blank line that is already  
> inserted when the composer window opens).
>
> I was also able to trace it to ltrim() and rtrim() calls in function  
> _node($doc, $node) of Horde/Text/Filter/Html2text.php
>
> Case 1 (bug):
> <p><br>
> word1 <strong>word2</strong> word3</p>
>
> Output of _node(): "\nword1 WORD2word3"

I can't reproduce this.

> Case 3 (bug):
> <p><br>
> word1 <strong>word2</strong> word3<br>
> word4 <strong>word5</strong> word6<br></p>
>
> Output: "\nword1 WORD2word3\nword4 WORD5word6\n"

Can't reproduce this either.
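
For reference, the reported symptom can be modeled in isolation. The
following is a toy sketch in Python, not the code in
Horde/Text/Filter/Html2text.php: the names render, fragments, and
trim_text_nodes are hypothetical, str.lstrip() stands in for PHP's
ltrim(), and the parsed HTML is written out by hand as a list of
(kind, text) pairs. It assumes the converter collapses whitespace runs
and then left-trims every text node before concatenating.

```python
import re

# Hand-written stand-in for the parsed fragments of Case 1:
# <p><br>\nword1 <strong>word2</strong> word3</p>
CASE_1 = [
    ("br", ""),
    ("text", "\nword1 "),
    ("strong", "word2"),
    ("text", " word3"),
]


def render(fragments, trim_text_nodes):
    """Concatenate (kind, text) fragments into plain text (toy model)."""
    out = []
    for kind, text in fragments:
        if kind == "br":
            out.append("\n")
        elif kind == "strong":
            # Bold text is rendered in upper case, as in the reported output.
            out.append(text.upper())
        else:
            # Collapse whitespace runs the way an HTML renderer would.
            piece = re.sub(r"\s+", " ", text)
            if trim_text_nodes:
                # Assumed behavior: left-trim every text node.
                piece = piece.lstrip()
            out.append(piece)
    return "".join(out)


print(repr(render(CASE_1, trim_text_nodes=True)))
print(repr(render(CASE_1, trim_text_nodes=False)))
```

Under these assumptions the trimmed variant yields "\nword1 WORD2word3",
which has the same shape as the reported output, while the untrimmed
variant yields "\n word1 WORD2 word3", with a stray leading space
instead of joined words.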

> Case 4 (OK but extra space at start of second line):
> <p>word1 <strong>word2</strong> word3<br>
> word4 <strong>word5</strong> word6</p>
>
> Output: "\nword1 WORD2 word3\n word4 WORD5 word6\n"
> Note the extra space at the start of the second line " word4" which  
> I think should have been ltrim()'d.

Spacing is difficult to get right when converting, since <p>, <br>,  
and <div> can each imply different line breaks and whitespace  
depending on context.
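
One possible way to handle both the joined words and the leading space
is to trim only at line boundaries. The sketch below is a Python toy
model under stated assumptions, not what Html2text.php does: the name
render_line_aware and the hand-written (kind, text) fragment lists are
hypothetical, and paragraph-level newlines are left out.

```python
import re

# Case 1: <p><br>\nword1 <strong>word2</strong> word3</p>
CASE_1 = [
    ("br", ""),
    ("text", "\nword1 "),
    ("strong", "word2"),
    ("text", " word3"),
]

# Case 4: <p>word1 <strong>word2</strong> word3<br>
#         word4 <strong>word5</strong> word6</p>
CASE_4 = [
    ("text", "word1 "),
    ("strong", "word2"),
    ("text", " word3"),
    ("br", ""),
    ("text", "\nword4 "),
    ("strong", "word5"),
    ("text", " word6"),
]


def render_line_aware(fragments):
    """Join fragments, dropping leading spaces only at the start of a line."""
    out = ""
    for kind, text in fragments:
        if kind == "br":
            # Drop trailing spaces before the break, then start a new line.
            out = out.rstrip(" ") + "\n"
        elif kind == "strong":
            out += text.upper()
        else:
            # Collapse whitespace runs to a single space.
            piece = re.sub(r"\s+", " ", text)
            # Left-trim only when the output is empty or already ends in
            # whitespace, so spaces between inline elements survive.
            if out == "" or out[-1] in " \n":
                piece = piece.lstrip(" ")
            out += piece
    return out.rstrip(" ")


print(repr(render_line_aware(CASE_1)))
print(repr(render_line_aware(CASE_4)))
```

With these inputs the sketch gives "\nword1 WORD2 word3" for Case 1 and
"word1 WORD2 word3\nword4 WORD5 word6" for Case 4. Whether the same rule
gives sensible results for <div> and nested <p> is not covered here.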

michael

___________________________________
Michael Slusarz [slusarz at horde.org]


