[dev] [framework-patch] clean HTML

Jan Schneider jan at horde.org
Mon Aug 16 14:29:52 PDT 2004


Quoting Francois Marier <francois at nit.ca>:

> On Fri, Aug 06, 2004 at 02:30:46PM +0200, Jan Schneider wrote:
>> Quoting Francois Marier <francois at nit.ca>:
>> >This patch fixes the <script> problem by removing what's between the
>> >two tags.  It also fixes the <style> problem when displaying HTML
>> >inline (in non-inline mode, the <style> tags are preserved).
>>
>> I don't like this approach; the style tag changes are not necessary anyway,
>> and the script regexps are too weak to catch all cases and too strong to
>> catch common cases. This is a cosmetic issue, so we don't want to catch
>> each obfuscated version of script tags here. Just take the style cleanup as
>> an example and model the script regexps after that.
>
> True, if there are obfuscated tags in the HTML, then it's most likely
> spam and then it doesn't matter if it's not displayed correctly.
>
> I've changed the regexp to be as simple as possible in this updated patch.

Much better.

>> >Furthermore, I also added a line that strips out all HTML comments
>> >(including scripts and styles) if we are displaying inline.  Since we
>> >cannot allow either script or styles, there is no point in sending
>> >this data to the browser.
>>
>> I'm not sure if I want to trade the additional page size for the additional
>> CPU cycles, but I may get convinced. At least you shouldn't need to look for
>> whitespace characters; the full stop already matches them. If you intended to
>> catch newlines, use the DOTALL modifier /s instead.
>
> Well, I don't know how much slower it would be, I guess it depends on
> the speed of the network and the CPU of the server, but my guess is
> that if there is a difference then it is pretty small.  I just thought
> it would be best to refrain from sending useless stuff over the wire,
> but feel free to rip this part out of the patch if you don't think
> it's worth the effort.

Convinced. Committed, thanks.
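
For anyone reading along, here is a rough sketch of the DOTALL point from
above. It is written in Python rather than PHP, it is not the committed
patch, and the pattern and function names are made up for illustration.
Python's re.DOTALL flag plays the role of the /s modifier: without it the
dot stops at a newline, so a comment spanning several lines is left in.

```python
import re

# Hypothetical patterns for illustration only; not taken from the patch.
# With DOTALL the dot also matches newlines (the /s modifier in PCRE).
COMMENT_DOTALL = re.compile(r"<!--.*?-->", re.DOTALL)
# Same pattern without DOTALL: the dot matches any character except newline.
COMMENT_NO_DOTALL = re.compile(r"<!--.*?-->")


def strip_comments(html, pattern=COMMENT_DOTALL):
    """Remove every HTML comment that the given pattern matches."""
    return pattern.sub("", html)


sample = "<p>a</p><!-- one line --><p>b</p><!--\nspans\nlines\n--><p>c</p>"

# Both comments are removed when the dot may cross newlines.
print(strip_comments(sample))
# Only the single-line comment is removed without DOTALL.
print(strip_comments(sample, COMMENT_NO_DOTALL))
```

The non-greedy ".*?" keeps each match from running past the first "-->",
so two separate comments are not merged into one match.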

Jan.

--
Do you need professional PHP or Horde consulting?
http://horde.org/consulting.php

