[dev] Large items are reducing the memcache hit rate

Michael M Slusarz slusarz at horde.org
Wed May 25 16:33:53 UTC 2011


Quoting Michael M Slusarz <slusarz at horde.org>:

> Quoting Gonçalo Queirós <goncalo.queiros at portugalmail.net>:
>
>> We started investigating the low hit rate of Memcache (around 50%), and
>> found out that turning large_items off raised the hit rate to around
>> 96%. The problem is due to the get misses generated when trying to get
>> the respective _os keys of items that are not large items.
>> This rise from 50% to 96% only happens because in our case fewer than
>> 1% of items are large (the only ones we found were some sessions).
>>
>> With this in mind, it seems wrong to always try to get an _os key that
>> most of the time will not exist.
>
> Except that is a necessary evil.  The whole point is that you don't  
> know which pieces of data are large when doing the initial query.   
> Even if there is only 1 large session item, you still need to do  
> this for every item.
>
> Why is this a big deal?  Other than the fact it skews the cache hit
> rate downward, why does this matter?
>
>> We have thought of several fixes to the problem, but they all had
>> drawbacks (more memcache requests, a low hit rate if memcache is mainly
>> composed of large objects, greater complexity). In the end we came to
>> a solution that seems to avoid all of these.
>>
>> The idea is to use the Memcache flags to store the number of pieces
>> that compose a large item.
>> In summary, when we store an object we also store the number of pieces
>> that compose it, and when we retrieve an object we check the flags to
>> see how many pieces that item is made of.
>>
>> Ex with a large object (two pieces):
>>
>> new key        new flags   current key
>> object_key     2           object_key
>> object_key_s1  0           object_key_s1
>> -              -           object_key_os (with value = 2)
>>
>> With the new code we only need to store and get two objects on Memcache
>> versus 3 with the current code.
>>
>> Ex with a small object:
>> new key      new flags   current key
>> object_key   1           object_key
>>
>> With the new code we make one get and are done. With current code we
>> make a miss trying to get object_key_os and then a hit getting object_key

OK... methinks that there is a bug in the SMTP code somewhere, because  
this message was saved fine locally, but cut off at this period...

Anyway, this was the remainder of my message:


> We have made a small script to test the flag usage and everything seems
> to work as expected:
>
> $memcache = new Memcache();
> $memcache->addServer('127.0.0.1');
>
> $slices = 5;
> $memcache->set('a', 'va', $slices << 8);

I was going to say I was somewhat hesitant to use flags in this
manner.  However, the answer was buried on the get() page (rather than
the set() page, where I looked when originally writing this code).
And that is:

The lowest byte of the int is reserved for pecl/memcache internal
usage (e.g. to indicate compression and serialization status).
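
To make the bit layout concrete, here is a minimal sketch of the
arithmetic only. It assumes nothing beyond the sentence quoted above
and the "$slices << 8" shift from the test script; packPieces() and
unpackPieces() are hypothetical helper names, not part of
pecl/memcache, and 0x02 merely stands in for some internal flag bit.

```php
<?php
// Number of low bits that the quoted documentation reserves for
// pecl/memcache internal use (the lowest byte).
const RESERVED_BITS = 8;

// Hypothetical helper: put the piece count above the reserved byte and
// keep whatever internal bits were passed in the low byte.
function packPieces(int $pieces, int $internalFlags = 0): int
{
    return ($pieces << RESERVED_BITS) | ($internalFlags & 0xFF);
}

// Hypothetical helper: recover the piece count, ignoring the low byte.
function unpackPieces(int $flags): int
{
    return $flags >> RESERVED_BITS;
}

$flags = packPieces(5, 0x02);      // 0x02 is a stand-in internal bit
echo unpackPieces($flags), "\n";   // piece count, low byte ignored
echo $flags & 0xFF, "\n";          // low byte is left untouched
```

The point of the shift is that the piece count and the internal bits
never overlap, so either side can be read without disturbing the other.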

The way the set() page is written, it would appear that the $flags
parameter was only allowed for pre-defined PHP flags (e.g.
MEMCACHE_COMPRESSED).  But that appears to not be the case.

So this does seem to be a viable solution.  I would ask that you move
this thread to an enhancement ticket so we can more easily track any
patches that are provided.
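
For the ticket, the proposed lookup path could be sketched roughly as
follows. This is an illustration under stated assumptions, not the
Horde code: FakeCache is a hypothetical in-memory stand-in for a
memcache client (used only to count gets), storeWithFlags() and
fetchWithFlags() are made-up names, and the key layout (object_key,
object_key_s1, ...) is taken from the tables earlier in the thread.

```php
<?php
// Hypothetical in-memory stand-in for a memcache client. It only
// records values with their flags and counts how many gets are issued.
final class FakeCache
{
    public $gets = 0;
    private $items = [];

    public function set(string $key, string $value, int $flags = 0): void
    {
        $this->items[$key] = [$value, $flags];
    }

    // Returns [value, flags], or null on a miss.
    public function get(string $key): ?array
    {
        $this->gets++;
        return $this->items[$key] ?? null;
    }
}

// Proposed scheme: the piece count rides in the flags of the first
// piece, shifted above the byte reserved for internal use.
// $pieces is expected to be a 0-indexed list of strings.
function storeWithFlags(FakeCache $cache, string $key, array $pieces): void
{
    foreach (array_values($pieces) as $i => $piece) {
        $name  = $i === 0 ? $key : "{$key}_s{$i}";
        $flags = $i === 0 ? count($pieces) << 8 : 0;
        $cache->set($name, $piece, $flags);
    }
}

function fetchWithFlags(FakeCache $cache, string $key): ?string
{
    $first = $cache->get($key);
    if ($first === null) {
        return null;
    }
    [$data, $flags] = $first;
    $count = $flags >> 8;
    for ($i = 1; $i < $count; $i++) {
        $next = $cache->get("{$key}_s{$i}");
        if ($next === null) {
            return null; // a piece is missing: treat the item as a miss
        }
        $data .= $next[0];
    }
    return $data;
}

$cache = new FakeCache();
storeWithFlags($cache, 'small', ['va']);
storeWithFlags($cache, 'large', ['aa', 'bb']);
fetchWithFlags($cache, 'small'); // one get, no probe for an _os key
fetchWithFlags($cache, 'large'); // two gets, one per piece
echo $cache->gets, "\n";
```

A small item costs a single get with no guaranteed miss, and a
two-piece item costs two gets, which matches the counts given in the
original proposal.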

michael

___________________________________
Michael Slusarz [slusarz at horde.org]


