[horde] Horde3 performance problems - resolved

Andrew Morgan morgan at orst.edu
Mon Oct 1 17:12:17 UTC 2007


On Sun, 30 Sep 2007, Chuck Hagenbuch wrote:

> Quoting Andrew Morgan <morgan at orst.edu>:
>
>> Running EXPLAIN on this query showed that MySQL had to examine
>> approximately 55000 rows (ouch).  We have a beefy MySQL server (2
>> dual-core Opterons, 8GB of RAM), but the load average went over 100
>> when students returned this week.
>>
>> I solved the problem by adding the following index:
>>
>> CREATE INDEX datatree_attribute_value_idx
>> ON horde_datatree_attributes (attribute_value(12));
>>
>> I chose a key size of "12" because that should cover the size of
>> typical usernames stored in the attribute_value column.  After
>> adding this index, MySQL only has to examine 9 rows to satisfy the
>> query.  The original query took anywhere from 0.5-1.5 seconds.  With
>> the index, it takes 0.00 seconds.
>
> I've added this to the sql and upgrade scripts for Horde 3.2. Thanks again!

Excellent, thanks.  Will there be another alpha release of Horde 3.2 soon? 
I want to start testing Horde 3.2 in next few weeks.  :)

>> The second issue is the function buildAttributeQuery() in
>> lib/Horde/DataTree/sql.php.  Is it necessary to use GROUP BY and
>> ORDER BY in the default case when no sorting order is specified?  I
>> can't find any instance of this function being called using an
>> aggregate SQL function (SUM, MAX, MIN, COUNT, etc).  When the
>> DataTree query is built using this function, the GROUP BY and ORDER
>> BY clauses force MySQL to create a temporary table and use a
>> filesort.  Both of these clauses seem unnecessary (at our site
>> anyways) because the query always returns 1 row.
>>
>> Before I created the index above, I tried modifying
>> buildAttributeQuery(). See the attached patch file.  This change
>> gave a modest performance improvement.
>
> Unfortunately this breaks Ansel. I haven't yet had time to figure out
> why, but some galleries simply disappear.

I don't use Ansel here, but it doesn't surprise me that removing the GROUP 
BY functionality broke things.  :(

While removing the mandatory GROUP BY and ORDER BY clauses yields some 
small performance improvements, the new index is the big winner.  If you 
want to leave buildAttributeQuery() as-is, that is fine.  After a few more 
days of measuring load average to establish a baseline, I can revert my 
changes to buildAttributeQuery() and see if they made any appreciable 
difference.

>> If it is necessary to include GROUP BY functionality in
>> buildAttributeQuery(), perhaps it could be enabled by a test for the
>> use of aggregate functions or as an optional argument passed to
>> buildAttributeQuery().  Also, I would skip the ORDER BY clause
>> entirely unless buildAttributeQuery() is called with a sortby_name
>> or sortby_key argument.  If the caller of buildAttributeQuery()
>> wants the results sorted, they should specify a sort order.
>
> That sounds great, but unfortunately I suspect that we rely on the
> ordering parts now and changing it would break apps that aren't
> updated accordingly. We are going to simply ditch DataTree entirely
> for Horde 4, when we can make these kinds of breaks. Hopefully the
> tweaks that have gone in already for 3.2, and this index, and similar
> work, will be enough for it to scrape by for now.

When is Horde 4 coming out?  Ah, I'm just messing with you...  :)

 	Andy


More information about the horde mailing list