[horde] Horde3 performance problems - resolved

Andrew Morgan morgan at orst.edu
Thu Sep 27 20:28:31 UTC 2007


Like many people on this list, I was having performance problems when I 
upgraded from Horde2 to Horde3 (using IMP, Kronolith, Turba, Nag, and 
Mnemo).

With debug logging enabled, I see a lot of Datatree related SQL queries 
anytime the portal summary page is loaded (20-25).  Each view of the 
Kronolith calendar, Nag tasks, or Mnemo notes pages generates 15 more 
Datatree queries.  One of the most common queries is:

SELECT c.datatree_id, c.datatree_name
FROM horde_datatree c
LEFT JOIN horde_datatree_attributes a1 ON a1.datatree_id = c.datatree_id
WHERE c.group_uid = 'horde.shares.nag'
AND (a1.attribute_name = 'owner' AND a1.attribute_value = 'user123 at onid.orst.edu')
GROUP BY c.datatree_id, c.datatree_name, c.datatree_order
ORDER BY c.datatree_order, c.datatree_name, c.datatree_id;

Running EXPLAIN on this query showed that MySQL had to examine 
approximately 55000 rows (ouch).  We have a beefy MySQL server (2 
dual-core Opterons, 8GB of RAM), but the load average went over 100 when 
students returned this week.

I solved the problem by adding the following index:

CREATE INDEX datatree_attribute_value_idx
ON horde_datatree_attributes (attribute_value(12));

I chose a key size of "12" because that should cover the size of typical 
usernames stored in the attribute_value column.  After adding this index, 
MySQL only has to examine 9 rows to satisfy the query.  The original query 
took anywhere from 0.5-1.5 seconds.  With the index, it takes 0.00 
seconds.



The second issue is the function buildAttributeQuery() in 
lib/Horde/DataTree/sql.php.  Is it necessary to use GROUP BY and ORDER BY 
in the default case when no sorting order is specified?  I can't find any 
instance of this function being called using an aggregate SQL function 
(SUM, MAX, MIN, COUNT, etc).  When the DataTree query is built using this 
function, the GROUP BY and ORDER BY clauses force MySQL to create a 
temporary table and use a filesort.  Both of these clauses seem 
unnecessary (at our site anyways) because the query always returns 1 row.

Before I created the index above, I tried modifying buildAttributeQuery(). 
See the attached patch file.  This change gave a modest performance 
improvement.

If it is necessary to include GROUP BY functionality in 
buildAttributeQuery(), perhaps it could be enabled by a test for the use 
of aggregate functions or as an optional argument passed to 
buildAttributeQuery().  Also, I would skip the ORDER BY clause entirely 
unless buildAttributeQuery() is called with a sortby_name or sortby_key 
argument.  If the caller of buildAttributeQuery() wants the results 
sorted, they should specify a sort order.


 	Andy
-------------- next part --------------
--- sql.php.bak	2007-09-25 13:56:35.000000000 -0700
+++ sql.php	2007-09-25 14:20:53.000000000 -0700
@@ -1429,14 +1429,13 @@
                              ($direction == 1) ? 'DESC' : 'ASC');
             $group_by = 'c.datatree_id, c.datatree_name, c.datatree_order, ' .
                 $order_table . '.attribute_value';
+            $tail = sprintf('GROUP BY %s ORDER BY %s', $group_by, $order);
         } else {
-            $order = 'c.datatree_order, c.datatree_name, c.datatree_id';
-            $group_by = 'c.datatree_id, c.datatree_name, c.datatree_order';
+            $tail = '';
         }
 
         $joins = implode(' ', $joins);
         $pairs = implode(' ', $pairs);
-        $tail = sprintf('GROUP BY %s ORDER BY %s', $group_by, $order);
 
         switch ($operation) {
         case DATATREE_BUILD_COUNT:


More information about the horde mailing list