[dev] Horde Unified Search

august huber a at pbx.org
Thu May 11 21:06:31 PDT 2006


On 5/11/06, Chuck Hagenbuch <chuck at horde.org> wrote:
>
> > To meet these requirements I propose creating a module which exposes the
> > contents and metadata of all horde objects (calendar items, notes, mail
> > messages) in their raw form to the search spider, but when accessed by an
> > authenticated user, they are redirected to the appropriate location in the
> > associated module.
>
> I don't really follow what you mean about access by an authenticated
> user vs. by the spider.


I mean that you have a path, e.g.:
http://site/horde/search/rewrite.php/username/imp/messagefolder/messageid
which, when accessed by the spider, simply dumps the raw contents of the
message (or calendar event, note, task), along with any corresponding
metadata.
When this same path is accessed by the actual user, it redirects to show
the corresponding message within imp.
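
A minimal sketch of that dispatch, in Python only for illustration (the
real module would presumably live in Horde itself). Everything named here
is an assumption: the SPIDER_AGENTS substrings, the Response type, and the
load_raw / app_url callables are hypothetical stand-ins, and a real
deployment would more likely authenticate the spider than trust its
User-Agent header.

```python
from dataclasses import dataclass, field

# Hypothetical user-agent substrings used to recognise the spider.
SPIDER_AGENTS = ("gsa-crawler", "examplebot")


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    body: str = ""


def is_spider(user_agent: str) -> bool:
    """Return True when the request appears to come from the indexing spider."""
    ua = user_agent.lower()
    return any(agent in ua for agent in SPIDER_AGENTS)


def handle(path_info: str, user_agent: str, load_raw, app_url) -> Response:
    """Serve raw content to the spider and redirect everyone else.

    path_info looks like /username/imp/messagefolder/messageid.
    load_raw(username, app, folder, object_id) returns the raw text;
    app_url(app, folder, object_id) returns the in-application URL.
    Both are supplied by the caller and are assumptions of this sketch.
    """
    parts = [p for p in path_info.split("/") if p]
    if len(parts) != 4:
        return Response(404, {}, "not found")
    username, app, folder, object_id = parts
    if is_spider(user_agent):
        raw = load_raw(username, app, folder, object_id)
        return Response(200, {"Content-Type": "text/plain"}, raw)
    return Response(302, {"Location": app_url(app, folder, object_id)})
```

The same path therefore yields a 200 with the raw message for the spider
and a 302 into imp for a person following a search result.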

To support this, when the spider hits the URL
http://site/horde/search/rewrite.php/username/imp/messagefolder
it will be presented with a list of URLs for all of the messages inside
that folder.
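
As a rough illustration of such a listing page (a sketch in Python, not
Horde code), the function below renders one hyperlink per object in a
folder. BASE reuses the example URL above; the folder_listing name and the
object_ids argument are hypothetical.

```python
from html import escape
from urllib.parse import quote

# Example base URL taken from the path above; adjust for a real site.
BASE = "http://site/horde/search/rewrite.php"


def folder_listing(username: str, app: str, folder: str, object_ids) -> str:
    """Render one hyperlink per object so a generic spider can follow them."""
    # Percent-encode each path component so odd folder names stay one segment.
    prefix = "/".join(quote(part, safe="") for part in (username, app, folder))
    links = []
    for object_id in object_ids:
        name = str(object_id)
        url = f"{BASE}/{prefix}/{quote(name, safe='')}"
        links.append(f'<li><a href="{escape(url)}">{escape(name)}</a></li>')
    return "<ul>\n" + "\n".join(links) + "\n</ul>"
```

Any spider that can follow plain links can then walk from the folder page
down to each message.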

The purpose of this methodology is to expose all horde data in a form
which _ALL_ web spiders understand and will be able to index - a spider-able
tree of data containing internally referencing hyperlinks.


> > The module could expose all the data via path components IE:
> > /horde/search/search.php/username/modulename/objectid
> > so that ACL's can easily be established in the search index via the path
> > name
>
> You have shares to deal with, also, not just usernames...


I would propose exposing shared data to the spider repeatedly, once for each
user, to simplify access control.  The downside is redundant indexing, and
more complications in regard to notifying the indexing agent of changes.
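
To make the cost concrete, here is a small Python sketch of that fan-out.
The spider_paths function and its arguments are hypothetical; the path
layout follows the /username/modulename/objectid form quoted above.

```python
def spider_paths(app: str, object_id: str, owner: str, shared_with) -> list:
    """List every per-user path under which one shared object is exposed."""
    # dict.fromkeys drops duplicate users while preserving order.
    users = list(dict.fromkeys([owner, *shared_with]))
    return [f"/{user}/{app}/{object_id}" for user in users]
```

An object shared with N users is indexed N + 1 times, and any change to it
means notifying the indexing agent about each of those paths.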


> I'm still not sure where you're checking permissions. It sounds like
> you want to leave that to the apps, but then what about exposing
> search results to the user that they can't see?

The filtering is done by path, so that users can only see documents within
'their path'.
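
A sketch of that filter in Python, assuming the search index hands back
plain result URLs. SCRIPT reuses the example path from earlier in this
message; visible_results is a hypothetical name.

```python
from urllib.parse import urlparse

# Example script path from earlier in this message.
SCRIPT = "/horde/search/rewrite.php"


def visible_results(result_urls, username: str) -> list:
    """Keep only hits whose path lies under the given user's subtree."""
    # The trailing slash stops "alice" from also matching "alicia".
    prefix = f"{SCRIPT}/{username}/"
    return [u for u in result_urls if urlparse(u).path.startswith(prefix)]
```

The username used here has to come from the authenticated session, never
from the query itself, or the filter protects nothing.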

I am convinced this is not the most elegant way to accomplish the goal of
'unified search'; however, it appears to be the most feasible approach to
ensure interoperability.

Another option for filtering is by tags or 'metatags'; however, Zend_Search
(loosely Lucene) does not appear to support this, based on a cursory glance.

I am focusing my work around the Google Search Appliance, as most of the
searching I need to do relates to documents accessible via gollem, so I must
confess my bias here.  This does not excuse me from developing a solution
which will be of use to others not willing to drop the cash for a GSA,
however.

Thanks for the comments; they are always appreciated.

-- 

     august huber <a at pbx.org>
     (pbx labs +1.212.464.7306)

