[dev] History of share changes
Jan Schneider
jan at horde.org
Tue Aug 11 12:23:52 UTC 2009
Zitat von Evert Pot <evertpot at gmail.com>:
>> For CalDAV and other stuff we need to track when a share has been
>> changed, or more precisely, when the content of a share, i.e the
>> events/tasks in it have been changed.
>> Currently we only store history entries for individual objects, not
>> for the complete share. To identify objects we use a UID like
>> application:share_id:object_id.
>>
>> There are two ways to solve this. The simple one is to extend the
>> History API to get the timestamp from the most recently changed
>> entry in a share with:
>> SELECT history_ts FROM horde_histories WHERE object_uid LIKE
>> 'app:share:%' ORDER BY history_ts DESC LIMIT 1
>>
>> The more complex solution is to add specific history entries for
>> share changes in all places where we already log object changes.
>> We could then use a query like:
>> SELECT history_ts FROM horde_histories WHERE object_uid = 'app:share'
>
>> Usually I would simply pick the most simple solution, but since the
>> history table could grow pretty large, this might have an impact on
>> the database performance. I'd like to get some feeback from anyone
>> who has more experience with fine tuning database queries, where
>> there may be a significant performance hit with the first query in
>> a large table. Obviously this is slower, but does it outweigh the
>> necessary code complexity and additional database record for the
>> faster query?
>>
>> Also, and this question actually goes to Ben and Evert, do we only
>> need ETags for existing shares/resources (which is why we need the
>> change timestamp), or do we have to track the time of the
>> deletion/addition of complete resources?
>
> Unless I'm misunderstanding the question, there's no need from
> CalDAV's perspective to preserve older versions of events.
> I would personally generate ETags based on the content of an item
> (MD5 hash), not the timestamp. It might be worthwhile storing this
> ETag for caching purposes.
>
> As for your points about DB speed. Like many others, my background
> is mostly MySQL based.
> I don't think you'll have to worry about the more complicated query
> if you have a proper index.
>
> However, using a string for your unique id (uid) is a bad idea. If
> you were to split up that field, and use plain old integers (appid,
> shareid) you will will be able to add a BTREE index on appid,
> shareid, history_ts. In that case size will not be really be an
> issue. If you're only interested in the history_ts field, in that
> case the index will be a covering index, and the underlying table
> will actually not be accessed at all.
>
> For large tables that are accessed often, you should pick InnoDB for
> mysql, and recommend a large setting for innodb_buffer_pool_size. If
> the latter setting is low (the default) MySQL will not be happy.
>
> If you are still worried about this after all that, create both a
> history table and a 'current' table. The history table will likely
> not be accessed as much as the current dataset, so your 'current'
> table will be smaller in size, and you can optimize for specific
> access patterns.
Any more opinions about this?
Jan.
--
Do you need professional PHP or Horde consulting?
http://horde.org/consulting/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: Digitale PGP-Unterschrift
URL: <http://lists.horde.org/archives/dev/attachments/20090811/4f577536/attachment.bin>
More information about the dev
mailing list