[dev] [commits] Horde branch master updated. c1f241be16aa88a278d0e51fc7baa78e39008cca

Michael M Slusarz slusarz at horde.org
Tue Aug 10 23:18:44 UTC 2010


Quoting Michael Rubinsky <mrubinsk at horde.org>:

> Quoting Chuck Hagenbuch <chuck at horde.org>:
>
>> Any reason this isn't just an option when creating uuids?
>>
>> Michael M Slusarz <slusarz at horde.org> wrote:
>>
>>> Quoting Chuck Hagenbuch <chuck at horde.org>:
>>>
>>>> Quoting Michael Rubinsky <mrubinsk at horde.org>:
>>>>
>>>>> It seems unfriendly to require a schema change simply because we've
>>>>> changed our id generation for these fields. We can truncate the id
>>>>> when it's generated in the affected apps, but I'm unsure how that
>>>>> affects the uniqueness of the id. Or we could go back to the old
>>>>> way of generating the ids in these cases. Not sure why a UUID is
>>>>> needed in some of these cases - like for kronolith's event_id
>>>>> field, for instance.
>>>>
>>>> I think we should go ahead and increase the field length, or we
>>>> should go back to auto-increment ids. We don't need UUIDs here at
>>>> all - I think we thought at one point that we could combine the id
>>>> with the uid, but we can't, so...
>>>
>>> Alternatively, we could use Horde_Support_Randomid instead (which I
>>> just committed) - which produces more compact IDs than the UUIDs since
>>> it packs the data into base-36 instead of base-16 (and doesn't have
>>> any extraneous dashes).
>>>
>>> michael
>
> I've come across an issue with this implementation. I'm getting a
> very high rate of duplicates when IDs are generated in a loop.
> For example, when importing a large .ics file into kronolith, we are
> generating ids for the kronolith_events table fairly quickly. While
> running a test loop on my dev box, I'm getting somewhere around a
> 20% duplicate rate. I imagine it has something to do with what the
> substr() is taking off, but have to admit I'm not entirely clear how
> the base conversion affects this.
>
> My test code is below. I realize this loop would run quicker than a
> real-life scenario, but this is causing DB constraint violations for
> me when importing my calendar into kronolith. I'm seeing this in
> roughly 1 out of 3 attempts at importing it.
>
> <code>
> $values = array();
> $cnt = 0;
> for($i = 0; $i < 10000; $i++) {
>     $id = (string)new Horde_Support_Randomid();
>     if (!empty($values[$id])) {
>         echo 'Duplicate: ' . $id . '<br/>';
>         $cnt++;
>     }
>     $values[$id] = !empty($values[$id]) ? $values[$id] + 1 : 1;
> }
> echo 'Duplicate count: ' . $cnt;
> print_r($values);
>
> </code>

I added unit tests for Randomid (and Uuid) in Horde_Support based on
this code, but I can't reproduce the duplicates, even after raising
the loop limit to 1,000,000.

Base conversion should not matter at all: even if microtime()
happened to be the same on two consecutive calls, uniqid()/mt_rand()
should give a completely different value.

You wouldn't happen to be running the tests on a virtual machine,
would you?  Or a 32-bit machine?  That could be the issue - it's
possible that we are creating such a large hex number that conversion
from base 16 -> 36 is causing issues.  Maybe we should split the
string into smaller chunks...
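
To illustrate the concern: PHP's base_convert() is documented as
possibly losing precision on large numbers, because it falls back to a
float internally. Below is a minimal sketch of that effect and of the
chunking idea. It is not the Randomid code; the sample hex strings and
the chunked_base36() helper are hypothetical and exist only for this
example.

```php
<?php
// Two distinct 128-bit hex strings that differ only in the last digit.
// (Hypothetical sample values, not real Randomid input.)
$a = 'ffffffffffffffffffffffffffffff01';
$b = 'ffffffffffffffffffffffffffffff02';

// Whole-string conversion: the value is far beyond what a float can
// hold exactly, so the low-order digits can be rounded away and both
// inputs can end up with the same base-36 string.
$wholeA = base_convert($a, 16, 36);
$wholeB = base_convert($b, 16, 36);
var_dump($wholeA === $wholeB);

// Hypothetical helper: convert 8 hex digits (32 bits) at a time, which
// a float represents exactly even on 32-bit builds. Each chunk is
// left-padded to 7 base-36 digits (36^7 > 2^32) so the output length
// is fixed and chunk boundaries stay unambiguous.
function chunked_base36($hex, $chunk = 8)
{
    $out = '';
    foreach (str_split($hex, $chunk) as $part) {
        $out .= str_pad(base_convert($part, 16, 36), 7, '0', STR_PAD_LEFT);
    }
    return $out;
}

var_dump(chunked_base36($a) === chunked_base36($b));
```

The chunked output is longer than a single whole-number conversion
would be, so this trades some compactness for keeping every input
digit significant.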

michael

-- 
___________________________________
Michael Slusarz [slusarz at horde.org]