[dev] Large items are reducing the memcache hit rate

Wed May 25 17:24:21 UTC 2011


On 05/25/2011 05:33 PM, Michael M Slusarz wrote:
> Quoting Michael M Slusarz <slusarz at horde.org>:
>
>> Quoting Gonçalo Queirós <goncalo.queiros at portugalmail.net>:
>>
>>> We started investigating Memcache's low hit rate (around 50%) and
>>> found that turning large_items off raised it to around 96%. The
>>> problem is the get misses generated when trying to fetch the _os
>>> keys of items that are not large items.
>>> The rise from 50% to 96% only happens because, in our case, fewer
>>> than 1% of items are large (the only ones we found were some
>>> sessions).
>>>
>>> With this in mind, it seems wrong to always try to get an _os key
>>> that most of the time will not exist.
>>
>> Except that is a necessary evil.  The whole point is that you don't
>> know which pieces of data are large when doing the initial query. 
>> Even if there is only 1 large session item, you still need to do this
>> for every item.
>>
>> Why is this a big deal?  Other than the fact that it skews the cache
>> hit rate downward, why does this matter?
>>
>>> We considered several fixes, but each had drawbacks (more memcache
>>> requests, a low hit rate if memcache is mainly composed of large
>>> objects, greater complexity). In the end we arrived at a solution
>>> that seems to avoid all of them.
>>>
>>> The idea is to use the Memcache flags to store the number of pieces
>>> that compose a large item.
>>> In summary, when we store an object we also store the number of
>>> pieces that compose it, and when we retrieve an object we check the
>>> flags to see how many pieces it is made of.
>>>
>>> Ex with a large object (two pieces):
>>>
>>> new key        new flags   current key
>>> object_key     2           object_key
>>> object_key_s1  0           object_key_s1
>>> -              -           object_key_os (with value = 2)
>>>
>>> With the new code we only need to store and get two objects on Memcache
>>> versus 3 with the current code.
>>>
>>> Ex with a small object:
>>> new key      new flags   current key
>>> object_key   1           object_key
>>>
>>> With the new code we make one get and are done. With current code we
>>> make a miss trying to get object_key_os and then a hit getting
>>> object_key
>
> OK... methinks that there is a bug in the SMTP code somewhere, because
> this message was saved fine locally, but cut off at this period...
>
> Anyway, this was the remainder of my message:
>
>
>> We have made a small script to test the flag usage and everything seems
>> to work as expected:
>>
>> $memcache = new Memcache();
>> $memcache->addServer('127.0.0.1');
>>
>> // Shift the piece count past the low byte of the flags.
>> $slices = 5;
>> $memcache->set('a', 'va', $slices << 8);
>
> I was going to say I was somewhat hesitant to use flags in this
> manner.  However, the answer was buried on the get() page (not the
> set() page, which is what I consulted when originally writing this
> code).  And that is:
>
> The lowest byte of the int is reserved for pecl/memcache internal
> usage (e.g. to indicate compression and serialization status).
>
> The way the set() page is written, it would appear that the $flags
> parameter was only allowed for pre-defined PHP flags (e.g.
> MEMCACHE_COMPRESSED).  But that appears to not be the case.
>
> So this does seem to be a viable solution.  I would ask that you move
> this thread to an enhancement ticket so we can more easily track any
> patches that are provided.
>
> michael
>
> ___________________________________
> Michael Slusarz [slusarz at horde.org]
>
Done. I will try to provide a patch this week.

http://bugs.horde.org/ticket/10123
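
For reference, here is a minimal sketch of the flag arithmetic discussed
above. It is not the patch itself: the helper names (pack_pieces,
unpack_pieces, piece_keys) are hypothetical, and it assumes the piece
count sits just above the low byte that pecl/memcache reserves, with the
first piece under the base key and the rest under _s1, _s2, ... as in
the example table.

```php
<?php
// Hypothetical helpers, not Horde code. They only do the bit and key
// arithmetic, so they run without a memcached server.

const PIECE_SHIFT   = 8;     // skip the low byte reserved by pecl/memcache
const RESERVED_MASK = 0xFF;  // compression/serialization status bits

// Combine a piece count with the extension's own low-byte flags.
function pack_pieces(int $pieces, int $internalFlags = 0): int
{
    return ($pieces << PIECE_SHIFT) | ($internalFlags & RESERVED_MASK);
}

// Recover the piece count from flags returned alongside a value.
function unpack_pieces(int $flags): int
{
    return $flags >> PIECE_SHIFT;
}

// List every key holding a piece: the base key first, then _s1, _s2, ...
function piece_keys(string $key, int $pieces): array
{
    $keys = [$key];
    for ($i = 1; $i < $pieces; $i++) {
        $keys[] = $key . '_s' . $i;
    }
    return $keys;
}
```

With pecl/memcache, the stored flags come back through the by-reference
second argument of get(), so a read would be one get on the base key
followed by gets for whatever piece_keys() returns beyond it. A small
object (one piece) then costs a single get and no miss.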

-- 
Gonçalo Queirós
Eng. Software
*m.* 913918777

*Portugalmail* | plataformas de inovação
*w.* http://www.portugalmail.net