[dev] Large items are reducing the memcache hit rate

Michael M Slusarz slusarz at horde.org
Wed May 25 16:22:22 UTC 2011


Quoting Gonçalo Queirós <goncalo.queiros at portugalmail.net>:

> We started investigating the low hit rate of Memcache (around 50%), and
> found out that turning large_items off raised the hit rate to around
> 96%. The problem is due to the get misses generated when trying to get
> the respective _os keys of items that are not large items.
> This rise from 50% to 96% only happens because, in our case, fewer
> than 1% of items are large (the only ones we found were some sessions).
>
> With this in mind, it seems wrong to always try to get an _os key that
> most of the time will not exist.

Except that is a necessary evil.  The whole point is that you don't  
know which pieces of data are large when doing the initial query.   
Even if there is only 1 large session item, you still need to do this  
for every item.

Why is this a big deal?  Other than the fact that it skews the cache hit
rate downward, why does this matter?
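
To make the size of that skew concrete, here is a rough simulation. It
is only a sketch: CountingCache is a hypothetical in-memory stand-in for
memcache, and the lookup order (probe the _os key first, then the item
itself) is taken from the description in the quoted message, not from
the actual Horde code. The 99:1 mix of small to large items is likewise
an assumption, chosen to match the "less than 1%" figure.

```python
class CountingCache:
    """Toy in-memory stand-in for memcache that counts get hits and misses."""

    def __init__(self):
        self.data = {}
        self.hits = 0
        self.misses = 0

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        if key in self.data:
            self.hits += 1
            return self.data[key]
        self.misses += 1
        return None


def get_current(cache, key):
    """Lookup as described in the quoted message: probe key_os first."""
    pieces = cache.get(key + "_os")   # a miss for every small item
    value = cache.get(key)
    if value is None:
        return None
    if pieces is None:
        return value                  # small item: one miss, one hit
    for i in range(1, pieces):
        piece = cache.get(f"{key}_s{i}")
        if piece is None:
            return None               # a piece is gone; treat as a miss
        value += piece
    return value


cache = CountingCache()
for n in range(99):                   # 99 small items
    cache.set(f"small{n}", "x")
cache.set("big", "aaaa")              # 1 large item stored in two pieces
cache.set("big_s1", "bbbb")
cache.set("big_os", 2)

for n in range(99):
    get_current(cache, f"small{n}")
get_current(cache, "big")

hit_rate = cache.hits / (cache.hits + cache.misses)
print(f"hits={cache.hits} misses={cache.misses} hit rate={hit_rate:.1%}")
```

Every small item costs one miss plus one hit, so with almost no large
items the reported rate sits near 50% even though no requested item is
actually absent from the cache.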

> We thought of several fixes, but they all had drawbacks (more
> memcache requests, a low hit rate if memcache is mainly composed of
> large objects, greater complexity). In the end we came to
> a solution that seems to solve all these problems.
>
> Our idea is to use the Memcache flags to store the number of pieces
> that compose a large item.
> In summary, when we store an object, we also store the number of pieces
> that compose it, and when we retrieve an object we check the flags to
> see how many pieces that item is made of.
>
> Example with a large object (two pieces):
>
> new key        new flags   current key
> object_key     2           object_key
> object_key_s1  0           object_key_s1
> -              -           object_key_os (with value = 2)
>
> With the new code we only need to store and get two objects in Memcache,
> versus three with the current code.
>
> Example with a small object:
> new key      new flags   current key
> object_key   1           object_key
>
> With the new code we make one get and are done. With the current code we
> get a miss on object_key_os and then a hit on object_key.
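
For reference, the proposal quoted above could look roughly like the
sketch below. Everything here is illustrative: FlagCache is a
hypothetical in-memory stand-in that keeps a (value, flags) pair per
key, MAX_PIECE is a made-up piece size kept tiny for the demo, and
store/fetch are not existing Horde functions. A real client library may
already use some flag bits for its own purposes (compression,
serialization), which this sketch does not deal with.

```python
class FlagCache:
    """Toy stand-in for memcache storing a (value, flags) pair per key."""

    def __init__(self):
        self.data = {}
        self.gets = 0
        self.misses = 0

    def set(self, key, value, flags=0):
        self.data[key] = (value, flags)

    def get(self, key):
        self.gets += 1
        if key not in self.data:
            self.misses += 1
            return None
        return self.data[key]


MAX_PIECE = 4  # hypothetical piece size, tiny so the demo splits


def store(cache, key, value):
    """Store value; the first piece carries the piece count in its flags."""
    pieces = [value[i:i + MAX_PIECE]
              for i in range(0, len(value), MAX_PIECE)] or [value]
    cache.set(key, pieces[0], flags=len(pieces))
    for i, piece in enumerate(pieces[1:], start=1):
        cache.set(f"{key}_s{i}", piece, flags=0)


def fetch(cache, key):
    """One get for a small item; one get per piece for a large item."""
    entry = cache.get(key)
    if entry is None:
        return None
    value, count = entry
    for i in range(1, count):
        piece = cache.get(f"{key}_s{i}")
        if piece is None:
            return None               # a piece is gone; treat as a miss
        value += piece[0]
    return value


cache = FlagCache()
store(cache, "small", "abc")          # fits in one piece, flags = 1
store(cache, "large", "abcdefgh")     # two pieces, flags = 2 on first

small = fetch(cache, "small")
gets_small = cache.gets
large = fetch(cache, "large")
gets_large = cache.gets - gets_small
print(f"small: {gets_small} get, large: {gets_large} gets, "
      f"misses: {cache.misses}")
```

In this sketch the small item needs a single get and the two-piece item
needs two, with no misses in either case, which matches the request
counts given in the proposal.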

