[dev] Fix kronolith-agenda script
Jan Schneider
jan at horde.org
Thu Mar 29 14:14:23 UTC 2012
Quoting Gonçalo Queirós <goncalo.queiros at portugalmail.net>:
> On 03/21/2012 11:40 AM, Gonçalo Queirós wrote:
>> Quoting Jan Schneider <jan at horde.org>:
>>> Quoting Gonçalo Queirós <goncalo.queiros at portugalmail.net>:
>>>> Hi there dev.
>>>>
>>>> We were trying to use the kronolith-agenda script to send daily agendas
>>>> to everyone on our service, but the problem is that we currently have
>>>> more than 350.000 shares and the script just runs out of memory, even
>>>> with 2 GB!
>>>>
>>>> Looking at the script more closely, we think there's a way to make the
>>>> script work regardless of the number of shares on the system, but we
>>>> would like your opinion on that before coming up with a patch.
>>>>
>>>> Current script:
>>>> 1 - Get every calendar share
>>>> 2 - For every share, list its events, to check whether it has any
>>>> events on the current day
>>> This is not correct, there is no such step
>> Sorry, was debugging on my own code ;-)
>>>
>>>> 3 - For the remaining list, get all users that have access to the
>>>> calendars
>>>> 4 - For every user, check whether they want to receive the daily
>>>> agenda (pref)
>>>> 5 - For every user who wants to receive the daily agenda, get their
>>>> calendars
>>>> 6 - Again, for every calendar, keep the ones that have any events on
>>>> the current day
>>>> 7 - Send the email if there are any calendars left
>>>>
>>>> For our installation the current script stops at the first step, because
>>>> it runs out of memory.
>>>> We thought about creating subsets of the shares, but the problem is that
>>>> we only know the full agenda of a user after we have analyzed all the
>>>> shares they have access to.
>>>>
>>>> What we propose:
>>>> 1 - Get all users who want to receive the daily agenda (pref)
>>> That's exactly what steps 1, 3, and 4 do.
>> I know. The difference is that instead of asking for all the shares, we
>> could ask directly for all the users who want to receive an agenda, so
>> we could potentially narrow the search.
>> Also, knowing the user instead of the calendars allows us to send the user
>> their agenda immediately, which frees the memory when the loop for
>> that user ends.
>>>> 2 - Execute steps 5, 6, and 7 of the original script
>>>>
>>>> Even in the worst case, this script will still perform better than the
>>>> current one, because it doesn't have the first 3 steps.
>>>> With this approach we can create subsets of users, which will allow the
>>>> script to run to completion without running out of memory (even if this
>>>> is a long process, it will finish).
>>>>
>>>> Problems:
>>>> We don't think there's currently a method to retrieve all prefs from the
>>>> backend by name. Maybe we need to create one, and state that it is for
>>>> admin purposes only and shouldn't be called by user-level code (just
>>>> like the listAllShares method from Horde_Share_Sql).
>>>> Currently the pref_name column is not indexed, so we expect slow queries.
>>>> The fix for that is obvious.
>>> The problem is that not all backends support listing preference
>>> details for others than the current user. On the other hand, those
>>> backends are already missing some features anyway, and we already
>>> have a listScopes() method that is only implemented in a few
>>> backends too. In the end this might be an option.
>>> Actually, since the same script already requires the preference
>>> backend to return any user's preference, this would be a safe
>>> approach.
>>>
>>> Another approach would be to take a further look at why
>>> listAllShares() exceeds memory. Well, there is not much to look at,
>>> actually, when creating 350.000+ share objects and then attempting
>>> to sort them. Alternatively we could implement an Iterator
>>> interface in the share drivers and only receive the shares one by
>>> one while looping over them.
>>>
>>> Jan.
>>>
>>> --
>>> The Horde Project
>>> http://www.horde.org/
>>>
>> If you iterate over the shares one by one, I think you will end up with a
>> similar problem, because you can't send an agenda before you have all the
>> events from all the calendars that a user has access to. So you would
>> probably end up again with the 350.000+ shares in memory.
>>
>> I will try to produce a patch and submit it for your review.
>
> Jan, attached is a first preview patch, so you can see whether things are
> going in the desired direction.
> Returning a class that implements ArrayAccess allows backward
> compatibility, but unfortunately none of the array_* functions
> (array_merge, array_keys, etc.) work on it, so a bit more refactoring
> is needed.
>
> One thing that needs to be done is to move the Horde_Share_List factory
> to an injector (if I understand the injectors correctly), but for
> now I instantiate the objects directly inside Horde_Share_Sql.
>
> Another problem I found was that Horde_Shares can have callbacks for
> listings, so for now I create the new Horde_Share_List object and, if
> the callback is active, that object is passed to the callback. This
> might break things for users who rely on array_* functions, because
> as stated above they won't work now.
>
> What do you devs think?
- You cannot implement ArrayAccess in the base class and Iterator in
the sub-class only. Why do you implement ArrayAccess at all? If
necessary, the Horde_Share_List consumers can use iterator_to_array().
- You cannot use PDO functionality in the SQL driver. You need to use
the Horde_Db API. Horde_Db's select() already returns an Iterator that
you can proxy in Horde_Share_List.
- You cannot use the injector or any other globals in library code.
- You only need share_id in the SQL driver's SELECT query.
- Using the iterator obviously scales better memory-wise, but did you
test how it scales performance-wise? IIUC you are now reading each
share individually from the backend, which could be a performance
degradation.
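
The iterator points above can be sketched roughly as follows. This is an
illustration in Python rather than PHP, and it is not Horde code: the
ShareList class, the load_share callback, and the shares table layout are
all hypothetical. An sqlite3 cursor stands in for the iterator returned by
a database select, and list() plays the role that iterator_to_array()
would play for consumers that really need an array.

```python
import sqlite3


class ShareList:
    """Lazy share list: proxies a row iterator instead of building an array."""

    def __init__(self, id_rows, load_share):
        self._id_rows = id_rows        # any iterator yielding (share_id,) rows
        self._load_share = load_share  # hypothetical per-share loader

    def __iter__(self):
        # Shares are built one at a time while the caller loops.
        for (share_id,) in self._id_rows:
            yield self._load_share(share_id)


def demo():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE shares (share_id INTEGER PRIMARY KEY, share_owner TEXT)"
    )
    db.executemany(
        "INSERT INTO shares (share_owner) VALUES (?)",
        [("alice",), ("bob",), ("carol",)],
    )

    queries = 0

    def load_share(share_id):
        nonlocal queries
        queries += 1  # one extra query per share: the performance concern
        row = db.execute(
            "SELECT share_id, share_owner FROM shares WHERE share_id = ?",
            (share_id,),
        ).fetchone()
        return {"id": row[0], "owner": row[1]}

    # The listing query itself only needs share_id.
    cursor = db.execute("SELECT share_id FROM shares ORDER BY share_id")
    shares = ShareList(cursor, load_share)

    # A consumer that needs a real array materializes it explicitly.
    return list(shares), queries
```

In this sketch the listing itself stays small in memory, but every share
costs one additional query, which is the trade-off raised in the last
point.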
Jan.
--
The Horde Project
http://www.horde.org/