[dev] Fix kronolith-agenda script

Gonçalo Queirós goncalo.queiros at portugalmail.net
Wed Mar 28 17:34:56 UTC 2012


Quoting Gonçalo Queirós <goncalo.queiros at portugalmail.net>:
> On 03/21/2012 11:40 AM, Gonçalo Queirós wrote:
>> Quoting Jan Schneider <jan at horde.org>:
>>> Quoting Gonçalo Queirós <goncalo.queiros at portugalmail.net>:
>>>> Hi there dev.
>>>>
>>>>        We were trying to use the kronolith-agenda script to send daily
>>>>        agendas to everyone on our service, but the problem is that we
>>>>        currently have more than 350.000 shares and the script just runs
>>>>        out of memory, even with 2 GB!
>>>>
>>>>        Looking at the script more closely, we think there's a way to make
>>>>        the script work regardless of the number of shares on the system,
>>>>        but we would like your opinion on that before coming up with a
>>>>        patch.
>>>>
>>>>        Current script:
>>>>        1 - Get every calendar share
>>>>        2 - For every share, list its events, to check if it has any event
>>>>        for the current day
>>>     This is not correct, there is no such step
>>    Sorry, was debugging on my own code ;-)
>>>>        3 - For the remaining list, get all users that have access to
>>>>        the calendars
>>>>        4 - For every user, check if he wants to receive the daily
>>>>        agenda (pref)
>>>>        5 - For every user that wants to receive the daily agenda, get his
>>>>        calendars
>>>>        6 - Again, for every calendar, keep the ones that have any event
>>>>        for the current day
>>>>        7 - Send the email if there's any calendar left
>>>>
>>>>        For our installation the current script stops at the first step,
>>>>        because it runs out of memory.
>>>>        We thought of creating sub-sets of the shares, but the problem is
>>>>        that we only know the full agenda of a user after we analyze all
>>>>        the shares he has access to.
>>>>
>>>>        What we propose:
>>>>        1 - Get every user that wants to receive the daily agenda (pref)
>>>     That's exactly what steps 1, 3, and 4 do.
>>    I know, the difference is that instead of asking for all the shares, we
>>    could ask directly for all the users that want to receive an agenda, so
>>    we could eventually narrow the search.
>>    Also, knowing the user instead of the calendars allows us to immediately
>>    send the user his agenda, which will free the memory when the loop for
>>    that user ends.
>>>>        2 - Execute steps 5, 6 and 7 of the original script
>>>>
>>>>        In the worst-case scenario this script will still perform better
>>>>        than the current one, because it doesn't have the first 3 steps.
>>>>        With this approach we can create sub-sets of users, which will
>>>>        allow the script to run until the end without running out of
>>>>        memory (even if this is a long process, it will execute).
>>>>
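The user-first flow proposed above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the kronolith-agenda script itself: `send_daily_agendas`, `calendars_for`, `send` and the batch size are hypothetical names chosen for the example, and calendars are modeled as plain dictionaries of `(day, title)` pairs. The point is only that each user's calendars are loaded, filtered and released inside the loop, so memory use depends on the batch size and not on the total number of shares.

```python
from datetime import date
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items, consuming the input lazily."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def build_agenda(calendars, today):
    """Keep only the calendars that have events today (steps 5 and 6)."""
    agenda = {}
    for name, events in calendars.items():
        todays = [title for day, title in events if day == today]
        if todays:
            agenda[name] = todays
    return agenda


def send_daily_agendas(users, calendars_for, send, today=None, batch_size=500):
    """Process opted-in users batch by batch; nothing is kept between users.

    `users` is any iterable of user names that already passed the pref check,
    `calendars_for(user)` and `send(user, agenda)` are hypothetical callbacks.
    """
    today = today or date.today()
    sent = 0
    for batch in batched(users, batch_size):
        for user in batch:
            agenda = build_agenda(calendars_for(user), today)
            if agenda:  # step 7: only mail users with something today
                send(user, agenda)
                sent += 1
    return sent
```

Because `users` can be a lazy iterable, the list of opted-in users never has to be held in memory all at once either.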
>>>>        Problems:
>>>>        We don't think there's currently a method to retrieve all prefs
>>>>        from the backend by name. Maybe we need to create it, and state
>>>>        that this is for admin purposes only and shouldn't be called by
>>>>        user-level code (just like the listAllShares method from
>>>>        Horde_Share_Sql).
>>>>        Currently the pref_name column is not indexed, so we expect slow
>>>>        queries. The fix for that is obvious.
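As a rough illustration of the "obvious" fix mentioned above, the sketch below builds a small preferences table in an in-memory SQLite database, adds an index that covers pref_name, and looks up all users by preference name. Only the pref_name column comes from the thread; the table name, the other columns, the index name, the scope "kronolith", the pref name "daily_agenda" and the value "yes" are assumptions made for the example, and a real Horde installation would use its own SQL backend and schema.

```python
import sqlite3

# Hypothetical lookup: all users that have a given pref set to a given value.
LOOKUP = (
    "SELECT pref_uid FROM horde_prefs "
    "WHERE pref_scope = ? AND pref_name = ? AND pref_value = ? "
    "ORDER BY pref_uid"
)


def make_prefs_db(rows):
    """Create an in-memory prefs table (assumed layout) with an index on the name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE horde_prefs ("
        "pref_uid TEXT, pref_scope TEXT, pref_name TEXT, pref_value TEXT)"
    )
    conn.executemany("INSERT INTO horde_prefs VALUES (?, ?, ?, ?)", rows)
    # The index that makes a lookup by preference name cheap.
    conn.execute(
        "CREATE INDEX horde_prefs_name_idx ON horde_prefs (pref_scope, pref_name)"
    )
    return conn


def users_with_pref(conn, scope, name, value):
    """Return the users whose preference `name` in `scope` equals `value`."""
    return [uid for (uid,) in conn.execute(LOOKUP, (scope, name, value))]


def query_plan(conn, scope, name, value):
    """Return SQLite's plan descriptions for the lookup, to check index use."""
    cur = conn.execute("EXPLAIN QUERY PLAN " + LOOKUP, (scope, name, value))
    return [row[-1] for row in cur]
```

Inspecting `query_plan(...)` is one way to confirm that the lookup searches the index instead of scanning the whole table.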
>>>     The problem is that not all backends support listing preference
>>>     details for users other than the current user. On the other hand,
>>>     those backends are already missing some features anyway, and we
>>>     already have a listScopes() method that is only implemented in a
>>>     few backends too. In the end this might be an option.
>>>     Actually, since the same script already requires the preference
>>>     backend to return any user's preference, this would be a safe
>>>     approach.
>>>
>>>     Another approach would be to take a further look at why
>>>     listAllShares() exceeds memory. Well, there is not much to look at,
>>>     actually, when creating 350.000+ share objects and then attempting
>>>     to sort them. Alternatively we could implement an Iterator
>>>     interface in the share drivers and only receive the shares one by
>>>     one while looping over them.
>>>
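The iterator idea above can be illustrated with a generator that pulls rows from the database in small chunks and hands them out one at a time. This is a Python sketch of the general technique, not the Horde_Share driver API: the `shares` table, its columns and the function names are hypothetical. Sorting is left to the database via ORDER BY, so no full list of share objects has to be built and sorted in the script's memory.

```python
import sqlite3
from collections import Counter


def iter_shares(conn, chunk_size=1000):
    """Yield (share_id, share_owner) rows one by one.

    At most `chunk_size` rows are held in memory at a time; the table and
    column names are assumptions for this sketch.
    """
    cur = conn.execute(
        "SELECT share_id, share_owner FROM shares ORDER BY share_id"
    )
    while True:
        rows = cur.fetchmany(chunk_size)
        if not rows:
            return
        yield from rows


def count_shares_per_owner(conn, chunk_size=1000):
    """Example consumer: keeps only one counter per owner, not the shares."""
    counts = Counter()
    for _share_id, owner in iter_shares(conn, chunk_size):
        counts[owner] += 1
    return counts
```

A consumer that needs every share of a user before acting would still have to accumulate them, so an iterator alone reduces the per-step memory but does not remove that constraint.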
>>>     Jan.
>>>
>>>     --
>>>     The Horde Project
>>> http://www.horde.org/
>>    If you iterate the shares one by one I think you will end up with a
>>    similar problem, because you can't send an agenda before you have all the
>>    events from all the calendars that a user has access to. So you would
>>    probably end up again with the 350.000+ shares in memory.
>>
>>    I will try to produce a patch and submit it for your review.
>   Jan, attached is a first preview patch, so you can see if things
>   are going in the desired direction.
>   Returning a class that implements ArrayAccess allows backward
>   compatibility, but unfortunately the array_* functions (array_merge,
>   array_keys, etc.) don't work on it, so a bit more refactoring is
>   needed.
>
>   One thing that needs to be done is to move the Horde_Share_List
>   factory to an injector (if I understand the injectors correctly),
>   but for now I instantiate the objects directly inside
>   Horde_Share_Sql.
>
>   Another problem I found is that Horde_Shares can have callbacks
>   for listings, so for now I create the new Horde_Share_List object
>   and, if the callback is active, that object is sent to the callback.
>   This might break things for users that rely on the array_* functions,
>   because, as stated above, they won't work now.
>   What do you devs think?
Ping? :-)

