[imp] mailbox listing speed-up

Michael M Slusarz slusarz at horde.org
Fri Jun 13 05:21:10 UTC 2008


Quoting "Ziba Scott" <ziba at umich.edu>:

> Hi imp devs and all,
>
> After our recent horde/imp upgrade at umich, we've seen login times rise
> from ~5-10 seconds to ~10-30 seconds.  The bulk of this time is spent
> building the mailbox list in IMP_Tree (imp/lib/IMAP/Tree.php).  We've
> made some changes which dramatically reduce the IMP_Tree startup time
> and would love some feedback and basic sanity checking before we move
> forward with them.
>
> On an unloaded test server in our environment, imap_getmailboxes and
> imap_list commands can take up to 1.7 seconds each against our Cyrus system.

That seems like an excessively long time to run those commands, even  
on a loaded IMAP server.  Depending on the mailstore, listing folders  
is generally a filesystem event - in Maildir, for example, it simply  
consists of listing the contents of a directory and cross-referencing  
against the list of subscribed folders.  This is an action that should  
take milliseconds at most, not several seconds.  If your IMAP server
is that loaded (or if the filestore I/O is so busy that a single
request takes that long), that is something that needs to be fixed
no matter what is done with IMP_Tree.
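
To illustrate why this should be cheap, here is a minimal Python sketch
of a Maildir++-style listing: one directory read, then an intersection
with the subscription list.  The dot-prefixed folder layout and the
"subscriptions" file name are assumptions for illustration and vary by
server; this is not how any particular IMAP server is implemented.

```python
import os

def list_subscribed_folders(maildir, subscriptions_file="subscriptions"):
    """Return subscribed folder names found under a Maildir++-style root.

    Assumes folders are dot-prefixed subdirectories (".Sent",
    ".Lists.horde") and that the subscriptions file holds one folder
    name per line. Both are assumptions; real servers differ.
    """
    # A single directory read yields folders at every depth, because this
    # layout flattens the hierarchy into names like ".Lists.horde".
    on_disk = set()
    for name in os.listdir(maildir):
        if name.startswith(".") and os.path.isdir(os.path.join(maildir, name)):
            on_disk.add(name[1:])

    # Cross-reference against the subscribed list, if one exists.
    subscribed = set()
    sub_path = os.path.join(maildir, subscriptions_file)
    if os.path.exists(sub_path):
        with open(sub_path) as fh:
            subscribed = {line.strip() for line in fh if line.strip()}

    return sorted(on_disk & subscribed)
```

Nothing here scales with message count, only with the number of
folders, which is why multi-second listings suggest a problem elsewhere.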

> First, IMP_Tree calls imap_getmailboxes for each namespace with a %
> appended to get one level deep of subfolders.
>
> Second, IMP_Tree::_initSubscribed executes imap_list for each namespace
> with an * appended to get all sublevels of folders.
>
> Last, it recursively crawls each folder found during the first set of
> imap_getmailboxes and calls imap_getmailboxes on each folder with %
> again until all branches have been plumbed.
>
> Our improvements are premised on the fact that one level searches (%)
> and searches which return full trees of results (*) actually take the
> same time to return from Cyrus.  So we are bypassing all other searches
> and starting off with one single full (*) search from the root of the
> user's mailbox.
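
For reference, the difference between the two wildcards: in an IMAP
LIST pattern, % matches any run of characters except the hierarchy
delimiter, while * also matches across delimiters.  A small Python
sketch of that rule follows; list_match is a hypothetical helper, and
"." as the delimiter is an assumption (some servers use "/").

```python
import re

def list_match(pattern, mailbox, delimiter="."):
    """Return True if mailbox matches an IMAP LIST-style pattern.

    "%" matches zero or more characters other than the delimiter;
    "*" matches zero or more characters of any kind.
    """
    regex = ""
    for ch in pattern:
        if ch == "*":
            regex += ".*"
        elif ch == "%":
            regex += "[^" + re.escape(delimiter) + "]*"
        else:
            regex += re.escape(ch)
    return re.fullmatch(regex, mailbox) is not None

print(list_match("INBOX.%", "INBOX.Sent"))         # True
print(list_match("INBOX.%", "INBOX.Lists.horde"))  # False
print(list_match("INBOX.*", "INBOX.Lists.horde"))  # True
```

This only models name matching; it does not model the extra
non-selectable parent entries a server may return for % patterns.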

The reason we walk the tree is explained here (RFC 2683 [3.2.1.1]):
http://www.rfc-archive.org/getrfc.php?rfc=2683

There was a time in the past that at least one installation was  
sharing a folder that contained thousands of subfolders.  The intent  
was that, on login, there was no need to scan and process that entire  
folder unless/until the user browsed to that folder.  Hence the many
KB of code in IMP_Tree to walk only as much of the tree as necessary,
and the logic to figure all of this out (guessing children, etc).

However, over the years, this walking has essentially become moot  
because in almost every instance, IMP/DIMP will build the entire tree  
on login.  In IMP, the entire tree is needed to build the sidebar, and  
the entire tree is needed to build the drop-down list.  In DIMP, the  
entire tree is needed to build the list in the sidebar panel  
(theoretically, we could do some fancy AJAX loading every time a
folder is expanded, but the overhead of doing this, coupled with the
user's waiting, eliminates any potential benefit).

Especially now, the whole idea of walking the tree doesn't seem to  
make much sense anymore.

> One quirk we had to sidestep was that % searches will return objects
> for folders that have children but can't be directly accessed
> themselves.  The * search will not.
>
> So we just iterate through the php data structure and detect parentless
> children and fake up an object for them.  Then, if tree_view is turned
> on, we separate non-personal mailboxes into IMPTREE_OTHER_KEY and
> IMPTREE_SHARED_KEY.
>
> Since the list of folders at this point has all the children, we bypass
> the recursive calls to expand.
>
> This also partially fixes bug: http://bugs.horde.org/ticket/6703:
> "problems displaying subfolders whose parent's name is a substring of
> another folder name w/ a space in it"
> It fixes it in the folder drop down display but not on the folder list.
> (I'll look into that after we get people logging in nice and quick like :)
>
> If anyone wants to try it out, I've attached a diff against Tree.php,v
> 1.185 (our test env).  The actual changes aren't that big.  Blocks that
> got indented for style reasons make it look bigger.  Our changes are
> turned on and off with a new config option: $conf['imap']['tree_build']
> = 'quick';
>
> Our initial tests are very promising.  Tree building on login is down
> from about (1.7 sec per namespace)*2 + (0.2 sec per subfolder) to simply
> 1.7 seconds.  The preference permutations we've checked haven't shown
> any user interface changes (except for bug 6703).
>
> Does anyone see problems we've missed?  What would it take to make this
> patch acceptable/desirable for inclusion in imp?
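
A back-of-the-envelope version of that estimate, as a Python sketch.
The per-call timings are the ones quoted above; the function names and
the namespace and subfolder counts are made-up inputs for illustration.

```python
def walk_cost(namespaces, subfolders, list_secs=1.7, per_folder_secs=0.2):
    """Estimated seconds for the existing tree walk.

    Two full-cost listings per namespace, plus one cheaper call for
    every subfolder visited during the recursive walk.
    """
    return list_secs * namespaces * 2 + per_folder_secs * subfolders

def quick_cost(list_secs=1.7):
    """Estimated seconds for a single full (*) listing from the root."""
    return list_secs

# Hypothetical account: 3 namespaces, 40 subfolders.
print(round(walk_cost(3, 40), 1))  # 18.2
print(quick_cost())                # 1.7
```

With those assumed counts the walk lands inside the 10-30 second login
range reported above, while the single listing stays at the cost of
one call.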

I will probably take a look at this over the next few days and provide  
feedback/comments.  My preference at this point would be to simply  
scrap all of the tree walking code - 50% of IMP_Tree has to be code  
dealing with walking the tree - and just do a simple imap_list or  
imap_lsub call and build the tree with that information.
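
A minimal sketch of what "build the tree with that information" could
look like, in Python rather than PHP for brevity.  It takes a flat list
of full mailbox names, splits each on the hierarchy delimiter, and
creates non-selectable placeholder nodes for parents the listing did not
return.  The node layout, the build_tree name and the "." delimiter are
all assumptions for illustration, not IMP_Tree's actual structures.

```python
def build_tree(mailboxes, delimiter="."):
    """Build a nested dict from a flat list of full mailbox names.

    Each node is {"selectable": bool, "children": {name: node}}.
    Parents missing from the list become non-selectable placeholders.
    """
    root = {"selectable": False, "children": {}}
    for name in sorted(mailboxes):
        node = root
        parts = name.split(delimiter)
        for depth, part in enumerate(parts):
            # Create the node on first sight; placeholders start out
            # non-selectable and are upgraded if listed explicitly.
            child = node["children"].setdefault(
                part, {"selectable": False, "children": {}}
            )
            if depth == len(parts) - 1:
                child["selectable"] = True
            node = child
    return root

tree = build_tree(["INBOX", "INBOX.Sent", "shared.team.docs"])
print(tree["children"]["INBOX"]["selectable"])   # True
print(tree["children"]["shared"]["selectable"])  # False
```

One pass over the list is enough, so no per-folder round trips to the
server are needed once the listing has been fetched.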

michael

-- 
___________________________________
Michael Slusarz [slusarz at horde.org]
