[dev] [Corrected] Horde_Imap_Client and fetching vanished messages.

Michael M Slusarz slusarz at horde.org
Mon Jan 16 08:09:27 UTC 2012


Quoting Michael J Rubinsky <mrubinsk at horde.org>:

> Quoting Michael M Slusarz <slusarz at horde.org>:
>
>> Quoting Michael J Rubinsky <mrubinsk at horde.org>:
>>
>>> Quoting Michael M Slusarz <slusarz at horde.org>:
>>>
>>>> Quoting Michael J Rubinsky <mrubinsk at horde.org>:
>>>>
>>>>> Some background: This code is all for the purpose of syncing  
>>>>> email over ActiveSync. I'm using modseq and changedsince to  
>>>>> retrieve the uids of any recently changed email. In this  
>>>>> context, 'changed' would mean a new, never before seen email, or  
>>>>> an email that has had the seen flag added or removed.
>>>>
>>>> Pardon my ignorance: what does the ActiveSync client send to the  
>>>> server to indicate the current status of its synchronized cache?   
>>>> Is it a user-definable cache ID?  Or is it a timestamp of the  
>>>> last sync?  Or something else?
>>>
>>> First off, you probably know this, but the ActiveSync client knows  
>>> *nothing* about IMAP (or POP3 for that matter). In fact, it  
>>> doesn't care at all where the messages come from, as long as it  
>>> receives them in the format defined by the ActiveSync protocol. As  
>>> far as state information goes, the only thing the ActiveSync  
>>> client sends to or receives from the ActiveSync server is its  
>>> current syncKey. The syncKey is basically a random hash string  
>>> with an integer tacked onto the end of it. The server generates  
>>> this key during the first sync of each collection  
>>> (email|contacts|calendar etc...). The client sends this key along  
>>> with each SYNC and PING request to notify the server what it is  
>>> assuming the last known state was. When the state changes, the  
>>> server increments the syncKey after sending changes to the client.  
>>> This is the only bit of identifying information the client ever  
>>> gets or sends. Server side, this syncKey is linked to the server  
>>> state at the time that the syncKey was generated. So, e.g., with  
>>> contacts or calendar data, this state is basically a timestamp. We  
>>> use this timestamp to query the History system to get server  
>>> changes to send.
>>>
>>> For mail, the state consists of the modseq/nextuid/uidvalidity  
>>> data, along with a list of UIDs and their seen state that are on  
>>> the device.
>>
>> To be clear, this is what each setup (QRESYNC and plain RFC 3501  
>> IMAP) needs in terms of state:
>>
>> QRESYNC: UIDVALIDITY, MODSEQ
>> IMAP: UIDVALIDITY, UIDNEXT, message flag information
>>
>> QRESYNC does not require UIDNEXT or message flag information  
>> (unclear if that's what you were suggesting).
>
> I keep the flag data (actually just whether or not the seen flag is  
> set), so we know what the device thinks the message's "read" state  
> is (what ActiveSync refers to it as). That way we know if a flag  
> change needs to be sent to the device or not. Sending unnecessary  
> changes, even just flag changes, to the device wastes mobile  
> bandwidth and contributes to poor battery performance since the  
> change causes the currently running PING to terminate, a new SYNC  
> request to be made and handled, and finally, a new PING request to  
> be made. Each one of those requests, obviously, has all the normal  
> ActiveSync protocol overhead.
>
> I guess one could argue here that this point is moot since these  
> changes would rarely be duplicates; if the IMAP server is sending a  
> change, it would be rare that the flag on the device would already  
> match the flag on the IMAP server. The only place this would  
> consistently happen is when a flag is initially changed on the  
> device. This causes the change to be sent to the IMAP server  
> (through the ActiveSync code, of course) which, in turn, will cause  
> the flag change to be detected the next time we FETCH changes with  
> changedsince.

No, this shouldn't happen.  Say that, on a sync, you get a read flag  
from the ActiveSync device.  You will take this flag change and send  
it to the IMAP server.  Once you are done with all the changes needed  
on the IMAP server, you will grab the highestmodseq number and store  
it in your local cache.  Thus, if nothing changes on the IMAP server  
before the next sync, the stored highestmodseq will equal the current  
highestmodseq of the mailbox, indicating you are properly sync'd.
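
A minimal sketch of that ordering, in Python rather than PHP and with an  
in-memory stand-in for the mailbox.  FakeMailbox, changed_since() and  
sync() are hypothetical names made up for illustration, not  
Horde_Imap_Client API, and concurrent changes from other clients are  
ignored here:

```python
class FakeMailbox:
    """In-memory stand-in for a CONDSTORE mailbox (hypothetical, illustration only)."""

    def __init__(self):
        self.highestmodseq = 1
        self.messages = {}  # uid -> {"seen": bool, "modseq": int}

    def add(self, uid, seen=False):
        self.highestmodseq += 1
        self.messages[uid] = {"seen": seen, "modseq": self.highestmodseq}

    def set_seen(self, uid, seen):
        # Every flag change bumps the mailbox-wide modseq.
        self.highestmodseq += 1
        self.messages[uid] = {"seen": seen, "modseq": self.highestmodseq}

    def changed_since(self, modseq):
        # Rough equivalent of a FETCH with the CHANGEDSINCE modifier.
        return sorted(u for u, m in self.messages.items() if m["modseq"] > modseq)


def sync(mailbox, stored_modseq, device_changes):
    """Return (uids to send to the device, modseq to store for the next sync)."""
    # 1. Collect server-side changes made since the last sync.
    to_device = mailbox.changed_since(stored_modseq)
    # 2. Push the device's flag changes to the server.
    for uid, seen in device_changes.items():
        mailbox.set_seen(uid, seen)
    # 3. Only now record highestmodseq, so our own writes are already covered.
    return to_device, mailbox.highestmodseq
```

Because the modseq is recorded after the device's changes are written,  
the next sync does not report those writes back as server changes.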

> Plus, this case can be dealt with the same way we deal with  
> device-caused changes in the other collections - we save the  
> incoming change in a separate cache and compare changes that the  
> server is sending against those we *know* came from the client. When  
> we find a match, we ignore the change and remove the entry from the  
> cache. This might still cause premature PING termination, but *most*  
> of the time the change would be caught (and ignored) during the same  
> SYNC request that is sending the device changes anyway.

This is an optimization, yes, but not technically necessary.
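
For reference, the filtering described in the quoted paragraph could  
look roughly like the following.  This is only a sketch in Python;  
filter_echoes() and the (uid, seen) pair representation are assumptions  
made for illustration, not the existing ActiveSync driver code:

```python
def filter_echoes(server_changes, pending_from_device):
    """Drop server-reported changes that merely echo what the device sent.

    server_changes: list of (uid, seen) pairs reported by the server.
    pending_from_device: set of (uid, seen) pairs the device sent earlier;
    matched entries are removed from it in place.
    """
    to_send = []
    for change in server_changes:
        if change in pending_from_device:
            # The device already knows this state; forget the pending entry.
            pending_from_device.discard(change)
        else:
            to_send.append(change)
    return to_send
```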

> Since I already needed to save the UIDs to detect deleted messages,  
> it was easy to just add the flag state there as well. The bottom  
> line to this point is that if you implement this functionality in  
> Horde_Imap_Client, I would no longer need to cache the UID list and  
> flag state in the ActiveSync driver.

This is not something that can easily be built into the Imap_Client  
driver, per se.  As previously discussed, the Imap_Client object is  
only concerned about syncing between the client object and the remote  
IMAP server.  It cannot track syncing at a different level.  This was  
a valid design decision - useful because it hides all caching logic  
within the client object so that an application does not have to worry  
about how/why caching is needed.

Once you start caching an independent data store that Imap_Client has  
no knowledge of, we necessarily have to expose some level of caching  
since the application now has to manually handle the sync state  
(Imap_Client can't automatically track this anymore).  So implementing  
this functionality won't take place in the Horde_Imap_Client_Base  
drivers themselves.

I was thinking instead that this could be implemented in an overlay,  
or utility class, that just abstracts out some of the Imap calls  
needed but doesn't try to automagically control the synchronization  
process.  We can leverage the existing Horde_Imap_Client_Cache object  
which would save some overhead.  But I haven't thought this out too  
much yet.

> For IMP that makes sense. For an ActiveSync client it does not. An  
> ActiveSync client can be considered always active as long as the  
> device is turned on. Every email the IMAP server receives will be  
> pushed to the device and marked as unseen. If I am sitting at my  
> desk, dealing with email throughout the day, I don't want my device  
> to still show all of those emails as unseen when I get home.  
> Granted, this point will be moot if the emails are moved to a  
> different folder while I read them, but not everybody keeps their  
> INBOX that clean. The reverse is not as big of a problem, since even  
> if I leave IMP's session open all night and when I return I find all  
> the mail I have already read on my device still marked as unseen, I  
> can simply refresh the browser. The only way to 100% guarantee a  
> full refresh like this on an ActiveSync device is to force-remove  
> the device's state on the server and cause a complete re-sync (this  
> is also how we would deal with the need to invalidate the device's  
> state due to e.g., UIDVALIDITY changing).

The good news is that UIDVALIDITY changing should never ever ever  
never happen in practical usage.  So you should not worry about  
performance issues regarding this.

> ActiveSync NEEDS to provide a way to reliably sync flags to the  
> client. Ideally, it would be great to provide this for both QRESYNC  
> and IMAP. Perhaps make it a configuration switch to turn on the  
> support in IMAP only servers. Would be nice if the functionality was  
> abstracted in the Imap_Client before 4.1. If not, I can implement it  
> in the ActiveSync driver.

This is something that can/should be added to Horde_Imap_Client_Base.   
Most likely as a configuration flag/option along the lines of "always  
sync flag changes when opening a mailbox".  When using a  
QRESYNC/CONDSTORE server, this is already done automatically so this  
adds no load.  This flag would only cause extra work for servers that  
don't support these extensions.

> Ah. I did not realize that the IMAP client cache would be the same  
> for both the user's Horde session and the ActiveSync session. I  
> guess I'll need to keep the flags cached in ActiveSync after all,  
> right?

Technically, it does not need to be the same - you could pass a  
different cache object in.  There would just be duplication of IMAP  
data though if that happens, which could be quite a bit of data.

Maybe what needs to be done is to separate message data caching from  
mailbox list caching.  Although I'm not sure the ActiveSync code has  
access to the IMP IMAP cache object anyway, so this might be a  
non-issue.

> ActiveSync connects, in addition to getting new messages and  
> expunged messages, gets a list of changed uids/flags from  
> Horde_Imap_Client (regardless of how it determined them - IMAP or  
> QRESYNC) and can then compare those flags against the client state.  
> So, basically the ActiveSync client code would really not change  
> much from what I was planning - I would still need to compare the  
> device flags with what Horde_Imap_Client tells me they are - it's  
> just that the optimizations that would come from having QRESYNC  
> available would be done inside Horde_Imap_Client?

The optimizations from QRESYNC are already available in  
Horde_Imap_Client.  It's the opposite - abstracting the code necessary  
for the non-QRESYNC situation into Horde_Imap_Client.  So you don't  
have to write all of that code in the application.  Although again,  
not quite sure how I will/would/can do this - there does not seem to  
be an easy way of abstracting the idea of a MODSEQ value on the client  
side.  So that's what I need to figure out - an easy way to determine  
sync state on both QRESYNC and non-QRESYNC servers.
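
To illustrate what the non-QRESYNC case amounts to: without a MODSEQ,  
the application has to compare the flag state it stored against the  
state the server reports now.  A rough Python sketch, assuming both  
sides are available as {uid: seen} dictionaries taken under the same  
UIDVALIDITY (diff_flags() is a made-up name, not a proposed  
Horde_Imap_Client method):

```python
def diff_flags(known, current):
    """Compare two {uid: seen} snapshots of the same mailbox.

    Returns (new_uids, vanished_uids, flag_changed_uids), each sorted.
    """
    new = sorted(current.keys() - known.keys())
    vanished = sorted(known.keys() - current.keys())
    changed = sorted(
        uid for uid in known.keys() & current.keys() if known[uid] != current[uid]
    )
    return new, vanished, changed
```

This needs the flags of every cached message on each sync, which is the  
extra work a CHANGEDSINCE fetch avoids.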

(Wow, this thread is getting very technically dense.)

michael

___________________________________
Michael Slusarz [slusarz at horde.org]
