[dev] MySQL Categories queries

John Morrissey jwm at horde.net
Fri Dec 19 10:24:58 PST 2003


I've noticed a problem under MySQL with the queries that the Categories
framework is generating. For example (pardon the wrap):

mysql> EXPLAIN SELECT category_id, category_name, category_parents, category_order FROM
    -> horde_categories WHERE (category_parents LIKE ':251563:%' OR category_id =
    -> 251563 OR category_parents = ':251563') AND group_uid = 'horde.history'
    -> ;
+------------------+------+------------------------------------------------+--------------------+---------+-------+--------+-------------+
| table            | type | possible_keys                                  | key                | key_len | ref   | rows   | Extra       |
+------------------+------+------------------------------------------------+--------------------+---------+-------+--------+-------------+
| horde_categories | ref  | PRIMARY,category_group_idx,category_parent_idx | category_group_idx |     255 | const | 125974 | Using where |
+------------------+------+------------------------------------------------+--------------------+---------+-------+--------+-------------+

These queries take several seconds because the category_parents and
category_id clauses aren't using indexes. Forcing category_parent_idx with
USE INDEX category_parent_idx doesn't help.

Here's my table structure:

mysql> desc horde_categories;
+---------------------+---------------+------+-----+---------+-------+
| Field               | Type          | Null | Key | Default | Extra |
+---------------------+---------------+------+-----+---------+-------+
| category_id         | int(11)       |      | PRI | 0       |       |
| group_uid           | varchar(255)  |      | MUL |         |       |
| user_uid            | varchar(255)  |      | MUL |         |       |
| category_name       | varchar(255)  |      | MUL |         |       |
| category_parents    | varchar(255)  |      | MUL |         |       |
| category_order      | int(11)       | YES  |     | NULL    |       |
| category_data       | text          | YES  |     | NULL    |       |
| category_serialized | smallint(6)   |      | MUL | 0       |       |
| category_updated    | timestamp(14) | YES  |     | NULL    |       |
+---------------------+---------------+------+-----+---------+-------+

I tried toying around with the query a little; here are some results.

'bad'  == 'long, unindexed query'
'good' == 'short, indexed query'

bad:  WHERE (category_parents LIKE ':251563:%') AND group_uid = 'horde.history'
good: WHERE (category_parents LIKE ':251563:%')
bad:  WHERE (category_parents LIKE ':251563:%' OR category_id = 251563) AND
        group_uid = 'horde.history'
bad:  WHERE (category_parents LIKE ':251563:%' OR category_id = 251563)
good: WHERE (category_parents LIKE ':251563:%' OR category_parents = ':251563')
bad:  WHERE (category_parents LIKE ':251563:%' OR category_parents = ':251563')
        AND group_uid = 'horde.history'
bad:  WHERE (category_id = 251563 OR category_parents = ':251563')

This could be a big problem because imp's Maillog depends on the Category
framework, which causes these queries to be run often.

I'm not sure how best to fix this; hopefully someone better with MySQL
optimization can step in.

thanks,
john
-- 
John Morrissey          _o            /\         ----  __o
jwm at horde.net        _-< \_          /  \       ----  <  \,
www.horde.net/    __(_)/_(_)________/    \_______(_) /_(_)__


More information about the dev mailing list