[wicked] Horde :: Wicked ::Patches to solve problems with web robots

kracker thekracker at gmail.com
Tue Dec 14 17:51:00 PST 2004


Greetings Horde : Wicked Fans,

Today I'm resubmitting some notes and patches I've written to deal
with web robots / spiders, which can cause serious problems and data
loss on a publicly available wicked installation. I tested these
patches myself, and they should work for others without issue.

Web robots crawl through a web page and hit every link in the order
the links are presented. This is a problem for wicked because its
unlock link comes before the history link, and the history page
contains links that revert to a previous revision without
confirmation. Robots that crawl through a site can therefore easily /
unknowingly revert large chunks of your content until your wicked
installation is in total disarray.

Here is a patch that swaps the positions of the history button and
the lock button so that robots don't accidentally unlock a page and
then revert the page's revisions through the (secondary) history
links . . . (as described in my previous pages).

diff -a standard.inc.org standard.inc > standard.inc.patch

##########################################

29,41d28
< if ($this->allows(WICKED_MODE_LOCKING)) {
<     separator();
<     if ($this->isLocked()) {
<         echo Horde::widget($this->pageUrl('display.php', 'unlock'),
<                            sprintf(_("Unlock %s"), $this->pageName()),
<                            'widget', '', '', _("Unlock"));
<     } else {
<         echo Horde::widget($this->pageUrl('display.php', 'lock'),
<                            sprintf(_("Lock %s"), $this->pageName()),
<                            'widget', '', '', _("Lock"));
<     }
< }
<
58a46,52
> if ($this->allows(WICKED_MODE_HISTORY)) {
>     separator();
>     echo Horde::widget($this->pageUrl('history.php'),
>                        sprintf(_("History of %s"), $this->pageName()),
>                        'widget', '', '', _("History"));
> }
>
85c79
< if ($this->allows(WICKED_MODE_HISTORY)) {
---
> if ($this->allows(WICKED_MODE_LOCKING)) {
87,89c81,89
<     echo Horde::widget($this->pageUrl('history.php'),
<                        sprintf(_("History of %s"), $this->pageName()),
<                        'widget', '', '', _("History"));
---
>     if ($this->isLocked()) {
>         echo Horde::widget($this->pageUrl('display.php', 'unlock'),
>                            sprintf(_("Unlock %s"), $this->pageName()),
>                            'widget', '', '', _("Unlock"));
>     } else {
>         echo Horde::widget($this->pageUrl('display.php', 'lock'),
>                            sprintf(_("Lock %s"), $this->pageName()),
>                            'widget', '', '', _("Lock"));
>     }

##########################################

It seems that the fastsearch robot, which is still regularly
reverting the pages, can only do so if the pages are left unlocked.

It also seems that the robot hits every link (in order) as it crawls
the site, meaning it hits the unlock link before it hits the history
page (which displays the revert links only if the page is first
unlocked).

I started this email after I wrote and tested a wicked patch that
locks a wiki page immediately after it is saved (after an edit, see
below). With it, wiki pages are always locked by default and must be
unlocked before editing, which reduces the chance of a page being
unlocked and then reverted (only unlocked pages can be reverted).

Below are two patches. The first is for the auto_lock feature. The
second refuses to process wiki pages for the IP address of the
fastsearch.net robot, which is nice because it kills the app if that
robot tries to use it but lets other, less destructive robots continue
to crawl the wiki (I kinda like google cache :) ).

My patches will help prevent the problem, but they are not a guaranteed solution.
- Possible Bug #1: Search robots that come from other IP addresses
may also revert pages.
- Possible Solution to Bug #1: Change the IP address ban code from a
single IP comparison to an array of banned IP addresses that is
looped over and compared one by one.

- Possible Bug #2: Other robots may crawl the links in the order they
are displayed. Such a robot would first follow the unlock link on the
page it is currently on, then follow the history link on the same
page (displayed after the unlock link), and then follow the revert
links on that page's history page (most likely one by one, in order,
until it hits all of them...).
- Possible Solution to Bug #2: Move the unlock link so that it is
displayed after the history link (thus changing the order in which
robots crawl the site).

- Possible Alternate Solution: A JavaScript confirmation popup window
before a revert request is sent to the server, instead of a direct
link with server-side validation, because most web robots cannot deal
with JavaScript events (besides, most users use v6+ browsers w/ JS
anyway).
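
The alternate solution above could be sketched roughly like this. It
is only an illustration, not code from wicked or Horde: the function
name revert_link_with_confirm() and the example URL are made up, and
it assumes PHP 5.2 or later for json_encode(). The link's href does
not point at the revert URL, so a robot that ignores JavaScript has
nothing to follow.

```php
<?php
// Hypothetical helper, not part of Wicked or Horde.
// Builds a revert link whose href is "#"; the browser only navigates
// to the revert URL after the user confirms in a JavaScript popup.
function revert_link_with_confirm($revert_url, $label, $question)
{
    // json_encode() produces safely quoted JavaScript string literals.
    $js = 'if (confirm(' . json_encode($question) . ')) { '
        . 'window.location.href = ' . json_encode($revert_url) . '; } '
        . 'return false;';

    // Escape the script and the label for use inside HTML.
    return '<a href="#" onclick="' . htmlspecialchars($js, ENT_QUOTES) . '">'
        . htmlspecialchars($label, ENT_QUOTES) . '</a>';
}

// Illustrative URL only; the real revert URL in wicked may differ.
echo revert_link_with_confirm(
    '/revert-example.php?page=WikiHome&rev=1',
    'Revert',
    'Really revert this page?'
);
```

A robot that does execute JavaScript could still get through, so this
would complement the lock-by-default patch rather than replace it.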
 
########################################################


file: /home/web/demo/horde/horde/wicked/lib/Page/EditPage.php

diff -a lib/Page/EditPage.php.org lib/Page/EditPage.php > EditPage.patch
109a110,111
>         include_once("auto_lock.php");
>

######################################################

file: /home/web/demo/horde/horde/wicked/auto_lock.php

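<?php
        // Example query string produced below: page=WikiHome&actionID=lock

        // Set to false to disable locking after every save.
        $auto_lock = true;
        // Set to true to print the lock URL and stop instead of redirecting.
        $auto_lock_debug = false;

        if ($auto_lock) {
            // Build the absolute URL that locks the page that was just saved.
            $url = Util::addParameter('display.php', 'page', $page->pageName());
            $url = Util::addParameter($url, 'actionID', 'lock');
            $url = Horde::applicationUrl($url, true);

            if ($auto_lock_debug) {
                die($url);
            }

            // Redirect the browser to the lock action.
            header('Location: ' . $url);
        }
?>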

##########################################################

file: /home/web/demo/horde/horde/wicked/display.php

diff -a display.php.org display.php
10a11,16
> $ip = $_SERVER['REMOTE_ADDR'];
>
> if( $ip == "66.151.181.4" ) {
>    die("Your IP address / class has been banned by the aklug administrator and is not allowed to view this page!<br />\nIf you have encountered this message in error you may wish to inform the administrator at  info at aklug dot org");
> }
>

##########################################################
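
The single-IP comparison in the display.php patch could be turned
into the array of banned addresses suggested under "Possible Solution
to Bug #1". What follows is a minimal sketch and has not been tested
against wicked: is_banned_ip() and $banned_ips are hypothetical names,
and the only address in the list is the one from the patch above.

```php
<?php
// Hypothetical helper, not part of Wicked or Horde.
// Returns true when $ip matches one of the banned addresses.
function is_banned_ip($ip, $banned_ips)
{
    // Loop over the list and compare the addresses one by one.
    foreach ($banned_ips as $banned) {
        if ($ip === $banned) {
            return true;
        }
    }
    return false;
}

// The address from the display.php patch; add further robots here.
$banned_ips = array('66.151.181.4');

// REMOTE_ADDR is not set when the script runs from the command line.
$ip = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '';

if (is_banned_ip($ip, $banned_ips)) {
    die("Your IP address has been banned and is not allowed to view this page!");
}
```

PHP's built-in in_array($ip, $banned_ips, true) would do the same
comparison in one call; the explicit loop just mirrors the
description in the notes.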

Let me know how these suggestions find you and which ones you implement.

Cheers,
Graham Brookins
Brookins Consulting


