[Tickets #12567] Re: AS: Unlikely race condition leads to data loss
noreply at bugs.horde.org
noreply at bugs.horde.org
Tue Aug 13 10:40:35 UTC 2013
DO NOT REPLY TO THIS MESSAGE. THIS EMAIL ADDRESS IS NOT MONITORED.
Ticket URL: http://bugs.horde.org/ticket/12567
------------------------------------------------------------------------------
Ticket | 12567
Updated By | Thomas Jarosch <thomas.jarosch at intra2net.com>
Summary | AS: Unlikely race condition leads to data loss
Queue | Synchronization
Version | Git master
Type | Bug
State | Unconfirmed
Priority | 1. Low
Milestone |
Patch |
Owners |
------------------------------------------------------------------------------
Thomas Jarosch <thomas.jarosch at intra2net.com> (2013-08-13 10:40) wrote:
When Horde_ActiveSync processes an incoming change, we write down the
modseq number and the object_uid of that change among other things.
The idea is to not send back changes originating from a client _back_
to this client on the next getChanges() run.
Because of the problems detailed before, the idea is to add
an md5 based change detection. The changes to the code
is limited to Horde_ActiveSync and not that huge.
The current, existing sync map looks like this:
-----------------------------
CREATE TABLE "horde_activesync_map" (
"message_uid" varchar(255) NOT NULL,
"sync_modtime" int,
"sync_key" varchar(255) NOT NULL,
"sync_devid" varchar(255) NOT NULL,
"sync_folderid" varchar(255) NOT NULL,
"sync_user" varchar(255)
, "sync_clientid" varchar(255), "sync_deleted" boolean);
-----------------------------
The only thing we would add is the new
"sync_md5" varchar(255)
field
Here's the pseudo new code:
-----------------------------
function process_change_from_PIM()
{
$md5 = md5(array_sort($object));
if (send_change_to_backend($object) == true) {
store_into_syncmap(highest_modseq, sync_key, md5 and the others)
}
}
function send_changes_to_client($start, $end, $current_sync_key)
{
$changes = getServerChanges($start, $end);
// GC the sync map (collect all entries for this device+folder)
$client_changes = "SELECT * from sync_map WHERE folder_id =
current_folder,
dev_id = current_device,
user = current_user
ORDER BY modseq";
// <-- the ORDER BY is important here
foreach($client_changes as $row => $client_change)
{
if ($client_change['modseq'] < $start &&
$client_change['sync_key'] < $current_sync_key) {
sql("DELETE from syncmap WHERE modseq =
$client_change['modseq'] AND folder_id=current AND devid=current AND
user=current");
unset($client_change[$row]);
}
}
// collect most current md5sum (=object state) on the client
// (the client could change an object multiple times
while a big sync _to_ the client is running)
// the $client_changes array is already sorted by modseq
// therefore we have the latest object state of the client at the
end of the loop
$latest_md5 = array();
foreach($client_changes as $client_change) {
$latest_md5[$client_change['object_uid']] = $client_change['md5'];
}
// final check if server change originated on the client
foreach($changes as $id => $change) {
$object_uid = $change['object_uid'];
if (isset($latest_md5[$object_uid])) {
$object = fetch_from_backend($object_uid);
$server_md5 = md5(array_sort($object));
if ($server_md5 == $latest_md5['object_uid']) {
Horde::log("Ignoring PIM initiated change for:
object_uid, modseq, md5");
unset($changes[$id]);
}
}
}
// get all remaining changes to send
$objects = fetch_from_backend($changes);
send_to_client($objects);
}
-----------------------------
As Michael pointed out, my first implementation is not perfect
since it currently needs another roundtrip through the API.
We try to move this to the point were we need to fetch the object anyway.
The new code simplifies garbage collection of the syncMap, too
since it's self-cleaning as soon as an entry is not needed anymore.
During normal work loads (=changes coming in from the server),
the syncMap is stays empty and no md5 checksumming is done at all.
We only need to checksum changes initiated by the client.
More information about the bugs
mailing list