[Tickets #12567] Re: AS: Unlikely race condition leads to data loss

noreply at bugs.horde.org noreply at bugs.horde.org
Tue Aug 13 10:40:35 UTC 2013


DO NOT REPLY TO THIS MESSAGE. THIS EMAIL ADDRESS IS NOT MONITORED.

Ticket URL: http://bugs.horde.org/ticket/12567
------------------------------------------------------------------------------
  Ticket             | 12567
  Updated By         | Thomas Jarosch <thomas.jarosch at intra2net.com>
  Summary            | AS: Unlikely race condition leads to data loss
  Queue              | Synchronization
  Version            | Git master
  Type               | Bug
  State              | Unconfirmed
  Priority           | 1. Low
  Milestone          |
  Patch              |
  Owners             |
------------------------------------------------------------------------------


Thomas Jarosch <thomas.jarosch at intra2net.com> (2013-08-13 10:40) wrote:

When Horde_ActiveSync processes an incoming change, we write down the  
modseq number and the object_uid of that change among other things.  
The idea is to not send back changes originating from a client _back_  
to this client on the next getChanges() run.

Because of the problems detailed before, the idea is to add
an md5 based change detection. The changes to the code
is limited to Horde_ActiveSync and not that huge.

The current, existing sync map looks like this:
-----------------------------
CREATE TABLE "horde_activesync_map" (
   "message_uid" varchar(255) NOT NULL,
   "sync_modtime" int,
   "sync_key" varchar(255) NOT NULL,
   "sync_devid" varchar(255) NOT NULL,
   "sync_folderid" varchar(255) NOT NULL,
   "sync_user" varchar(255)
, "sync_clientid" varchar(255), "sync_deleted" boolean);
-----------------------------

The only thing we would add is the new
   "sync_md5" varchar(255)

field


Here's the pseudo new code:
-----------------------------
function process_change_from_PIM()
{
     $md5 = md5(array_sort($object));

     if (send_change_to_backend($object) == true) {
         store_into_syncmap(highest_modseq, sync_key, md5 and the others)
     }
}


function send_changes_to_client($start, $end, $current_sync_key)
{
     $changes = getServerChanges($start, $end);

     // GC the sync map (collect all entries for this device+folder)
     $client_changes = "SELECT * from sync_map WHERE folder_id =  
current_folder,
                                           dev_id = current_device,
                                           user = current_user
                                           ORDER BY modseq";            
  // <-- the ORDER BY is important here

     foreach($client_changes as $row => $client_change)
     {
         if ($client_change['modseq'] < $start &&  
$client_change['sync_key'] < $current_sync_key) {
              sql("DELETE from syncmap WHERE modseq =  
$client_change['modseq'] AND folder_id=current AND devid=current AND  
user=current");
              unset($client_change[$row]);
         }
     }

     // collect most current md5sum (=object state) on the client
     // (the client could change an object multiple times
         while a big sync _to_ the client is running)
     // the $client_changes array is already sorted by modseq
     // therefore we have the latest object state of the client at the  
end of the loop

     $latest_md5 = array();
     foreach($client_changes as $client_change) {
          $latest_md5[$client_change['object_uid']] = $client_change['md5'];
     }

     // final check if server change originated on the client
     foreach($changes as $id => $change) {
          $object_uid = $change['object_uid'];
          if (isset($latest_md5[$object_uid])) {
               $object = fetch_from_backend($object_uid);
               $server_md5 = md5(array_sort($object));

               if ($server_md5 == $latest_md5['object_uid']) {
                   Horde::log("Ignoring PIM initiated change for:  
object_uid, modseq, md5");
                   unset($changes[$id]);
               }
           }
     }

     // get all remaining changes to send
     $objects = fetch_from_backend($changes);
     send_to_client($objects);
}
-----------------------------

As Michael pointed out, my first implementation is not perfect
since it currently needs another roundtrip through the API.
We try to move this to the point were we need to fetch the object anyway.

The new code simplifies garbage collection of the syncMap, too
since it's self-cleaning as soon as an entry is not needed anymore.

During normal work loads (=changes coming in from the server),
the syncMap is stays empty and no md5 checksumming is done at all.
We only need to checksum changes initiated by the client.






More information about the bugs mailing list