[dev] STATUS_SYNC* discussion (Was: [Tickets #11612] Re: Broken imap fetch query)

Mon Nov 5 19:59:09 UTC 2012

Quoting Michael M Slusarz <slusarz at horde.org>:

> Quoting Michael J Rubinsky <mrubinsk at horde.org>:
>
>> I do some switching based on the availability of QRESYNC and use  
>> the STATUS_HIGHESTMODSEQ value when it's available. At least at the  
>> time I wrote the code, QRESYNC was only being reported by  
>> Horde_Imap_Client when caching was turned on, so in a way I'm  
>> relying on the cache being available, but not actually using the  
>> data.
>
> This doesn't sound right.  Meaning a call to:
>
> $imap->queryCapability('QRESYNC');
>
> Should return true/false based on the availability of the extension  
> on the server, not the current caching state.
>
>> For that, I only take UIDNEXT into consideration for performance reasons.
>
> Word of warning: UIDNEXT is *not* available automatically on all  
> servers, so it is not necessarily a performance advantage (UIDNEXT  
> does not have to be returned automatically when SELECTing/EXAMINEing  
> a mailbox.  IIRC, Courier does not do this).  I've added a  
> STATUS_UIDNEXT_FORCE option that will automatically determine this  
> value if it doesn't exist, but this may involve an additional server  
> call (and it can't reliably generate a UIDNEXT value for an empty  
> mailbox).
>
> UIDNEXT is worthless for flag changes and message deletions so I am  
> assuming you are only using it to determine if new messages have  
> possibly been added to the mailbox.

Correct. Some background: With ActiveSync, the client "PINGS" the  
server for changes in a long-lived loop. At this stage, we don't care  
about what the changes are, just that there *is* a change we want to  
transmit to the device. The current request is then terminated and a  
new SYNC request is issued, where we send the exact changes to the  
device. When I say for "performance reasons" I mean that at the PING  
stage, I am only checking for new messages. That is the only change  
that I want to trigger a SYNC. This saves countless new connections  
(and saves battery life and bandwidth) by not causing a new SYNC to be  
issued when there is only a flag change or deletion. In other words,  
the only thing that will trigger a new SYNC request is a new email  
arriving in the mailbox - but once the SYNC request is triggered, it  
asks for new, vanished, and flags.
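
A minimal sketch of the PING-stage check described above, written in
Python only as a language-neutral illustration (the actual code is PHP).
The function name and the choice to trigger a SYNC when UIDNEXT is
unavailable are assumptions made for the example, not something taken
from the ActiveSync code.

```python
from typing import Optional


def ping_should_trigger_sync(cached_uidnext: Optional[int],
                             current_uidnext: Optional[int]) -> bool:
    """Decide whether the PING loop should end and a SYNC be issued.

    Only new mail counts as a change at this stage; flag changes and
    deletions are deliberately ignored until the SYNC request runs.
    """
    # Assumption: if either value is missing (the server did not report
    # UIDNEXT), be conservative and let the SYNC request sort it out.
    if cached_uidnext is None or current_uidnext is None:
        return True
    # UIDNEXT only grows when a message has been added to the mailbox.
    return current_uidnext > cached_uidnext
```

An unchanged UIDNEXT keeps the PING loop running; only a larger value
ends the request so the device can issue a SYNC.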

From what you are telling me though, I should be using  
STATUS_UIDNEXT_FORCE for this since the data might not be available  
without the extra server call, correct?

>> Only after some change has been detected does ActiveSync request  
>> the change specifics - at which point I want to know all the  
>> changes available, including VANISHED and flag changes.
>
> The advantage of using QRESYNC in SELECT/EXAMINE, instead of this  
> approach, is that it allows this synchronization without the  
> overhead of 2 additional server calls.
>
>> I will need to look at this more closely, since it sounds like I  
>> might be able to write some logic to first synchronize/compare  
>> *ActiveSync* code's last known MODSEQ with the *IMAP* client's  
>> STATUS_SYNCMODSEQ on the first access of each request and then for  
>> every other check during the PING loop simply check  
>> STATUS_SYNCMODSEQ.
>
> This is how we do things in IMP.  You take advantage if these values  
> happen to match, and fall back to a normal mailbox sync otherwise.
>
>> To further complicate things, I need to track  
>> different versions of the cached MODSEQ/UIDNEXT values. ActiveSync  
>> uses a sync_key to specify the last known state of the collection  
>> being synched and we need to keep at least two of these keys cached  
>> since the device can request the old sync_key if it never received  
>> the data in full or there was some other communication issue - not  
>> uncommon on a mobile device.
>
> So... are you using an imap client object that *is* caching?  My  
> initial idea was to do something like implementing a stub cache  
> driver that does nothing more than return a MODSEQ value (i.e. the  
> activesync modseq value).  This cache driver is then used to open  
> the mailbox, and any flag/vanished changes can be retrieved via the  
> STATUS_SYNC* methods.  The disadvantage is you can't cache anything  
> using this method, so it only works if using a non-caching client  
> object.

Core's ActiveSync driver gets the IMAP object directly from IMP. So,  
if it is configured to use the cache, it has it. The AS code does its  
own persistence of the data it needs. That means if QRESYNC is  
available, I cache HIGHESTMODSEQ, UIDNEXT, and UIDVALIDITY and rely on  
the various IMAP methods to get the changes I need. If it's not  
available, I have to persist all the UIDs that we know are on the  
device, along with the state of the flags AS supports. Then I  
calculate the changes by comparing the server's known UIDs/flag state  
against what I have in ActiveSync's cache of the data. It's this  
data that I need to remember at least two snapshots of (the current  
state and the previous state).
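
The non-QRESYNC path described above can be sketched roughly as follows.
This is an illustration in Python, not the ActiveSync implementation:
`diff_snapshots`, `ChangeSet`, and `SnapshotStore` are hypothetical names,
and a snapshot is assumed to be a plain mapping of UID to flag set.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional

Snapshot = Dict[int, FrozenSet[str]]  # UID -> flags known for that message


@dataclass
class ChangeSet:
    added: List[int]
    vanished: List[int]
    flag_changed: List[int]


def diff_snapshots(device: Snapshot, server: Snapshot) -> ChangeSet:
    """Compare what the device is known to hold with the server state."""
    added = sorted(server.keys() - device.keys())
    vanished = sorted(device.keys() - server.keys())
    flag_changed = sorted(
        uid for uid in device.keys() & server.keys()
        if device[uid] != server[uid]
    )
    return ChangeSet(added, vanished, flag_changed)


class SnapshotStore:
    """Keeps the most recent snapshots, keyed by sync_key (two by default)."""

    def __init__(self, keep: int = 2) -> None:
        self._keep = keep
        self._states: "OrderedDict[str, Snapshot]" = OrderedDict()

    def save(self, sync_key: str, snapshot: Snapshot) -> None:
        self._states[sync_key] = dict(snapshot)
        # Drop the oldest snapshots once more than `keep` are stored.
        while len(self._states) > self._keep:
            self._states.popitem(last=False)

    def load(self, sync_key: str) -> Optional[Snapshot]:
        return self._states.get(sync_key)
```

Keeping the previous snapshot around is what allows a device that
re-sends an old sync_key to be diffed against the state it actually has.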

It's beginning to sound like I might be duplicating some of the  
internal cache's functionality and I might be able to create a custom  
cache driver (or use the stub you are talking about), but I'm not sure  
how to implement that while still using IMP's imap object.

> The better solution (and maybe what we previously talked about) is  
> to somehow abstract the syncing code so that the calling code  
> doesn't care whether QRESYNC/CONDSTORE is in use.  I could see a  
> "getToken" method that returns a status identifier of a mailbox  
> (MODSEQ or UIDNEXT).  This token can then be passed to a sync  
> function, which would return the list of deleted messages/flag  
> changes - how this information is obtained would be transparent to  
> the calling code (i.e. QRESYNC could leverage STATUS_SYNC*; non  
> CONDSTORE enabled clients would have to use the original inefficient  
> mailbox syncing code).

This sounds like the cleaner (albeit more complicated to implement for  
you) solution.
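
One way to picture the proposed abstraction, as a rough Python sketch
rather than a design for Horde_Imap_Client: every class and method name
below is hypothetical, and both backends work on in-memory data instead
of talking to an IMAP server.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Token:
    kind: str   # "modseq" or "uidnext"
    value: int


@dataclass
class Changes:
    vanished: List[int]
    flag_changed: List[int]


class MailboxSync(ABC):
    """What the calling code sees; it never asks which extension is used."""

    @abstractmethod
    def get_token(self) -> Token: ...

    @abstractmethod
    def sync(self, token: Token) -> Changes: ...


class ModseqSync(MailboxSync):
    """Stand-in for a QRESYNC/CONDSTORE-backed implementation."""

    def __init__(self, highestmodseq: int, flag_modseqs: Dict[int, int],
                 vanished_modseqs: Dict[int, int]) -> None:
        self.highestmodseq = highestmodseq
        self.flag_modseqs = flag_modseqs          # UID -> MODSEQ of last flag change
        self.vanished_modseqs = vanished_modseqs  # UID -> MODSEQ at expunge

    def get_token(self) -> Token:
        return Token("modseq", self.highestmodseq)

    def sync(self, token: Token) -> Changes:
        # Only changes newer than the caller's token are reported.
        newer = lambda log: sorted(u for u, m in log.items() if m > token.value)
        return Changes(newer(self.vanished_modseqs), newer(self.flag_modseqs))


class SnapshotSync(MailboxSync):
    """Stand-in for the full-comparison fallback without CONDSTORE."""

    def __init__(self, uidnext: int, known: Dict[int, FrozenSet[str]],
                 server: Dict[int, FrozenSet[str]]) -> None:
        self.uidnext = uidnext
        self.known = known    # UID -> flags last reported to the caller
        self.server = server  # UID -> flags currently on the server

    def get_token(self) -> Token:
        return Token("uidnext", self.uidnext)

    def sync(self, token: Token) -> Changes:
        # The UIDNEXT token cannot reveal deletions or flag changes, so the
        # whole mailbox is compared against the last known state.
        vanished = sorted(self.known.keys() - self.server.keys())
        changed = sorted(u for u in self.known.keys() & self.server.keys()
                         if self.known[u] != self.server[u])
        return Changes(vanished, changed)
```

The caller stores whatever `get_token()` returns and later hands it back
to `sync()`; which backend answered is invisible to it.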

>>> But see Ticket #11590 - If not using Imap_Client's full caching,  
>>> there is still an opportunity to take advantage of this.  Namely:  
>>> creating a custom cache driver that has the known MODSEQ value  
>>> that can be returned from it.  The mailbox will be sync'd on open,  
>>> and the flag values can be cached in the custom cache driver so  
>>> that SYNCFLAGUIDS/VANISHED doesn't need to hit the server again  
>>> (the custom cache driver could be an in-memory cache).
>>
>> Hmm, I could use/refactor ActiveSync's current caching object  
>> (Horde_ActiveSync_Folder_Imap) as a custom cache driver and let  
>> activesync decide which sync_key's data to populate it with before  
>> injecting it into the imap client. I know this will let me capture  
>> changes after the mailbox syncs, but how would I capture the  
>> changes during that first sync, by a normal fetch query using the  
>> MODSEQ that I have?
>
> Not following you here.  Maybe I clarified what I meant by a stub  
> cache object above.

Basically, I'm talking about implementing a custom cache driver for  
the imap client that would be populated with the data that activesync  
currently persists to database storage (the modseq, uidnext etc...).  
But again, I'm not sure how to go about doing this cleanly with  
current code while still being able to use the imap object from IMP.

>> I'm probably misunderstanding something but for example:
>> ActiveSync has a last known MODSEQ of 100 and hasn't connected in  
>> e.g., a week. During that time the mailbox's MODSEQ has changed to  
>> 200. During that first connection, STATUS_SYNCMODSEQ will be set to  
>> 200 so any of the STATUS_SYNC* values will be compared to 200, not  
>> to 100, right? How do I get that first change set without having to  
>> write switching code to determine if we are using a cache etc...?
>
> No.  STATUS_SYNCMODSEQ will be set to the MODSEQ of the mailbox as it  
> existed before the mailbox was ever opened in this session.  In  
> other words, STATUS_SYNCMODSEQ is the cached MODSEQ value.  (If it  
> returns 200, it does nothing more than replicate the  
> STATUS_HIGHESTMODSEQ option).

Ah, ok. That's clearer. The language in the phpdoc was confusing me.  
It speaks about the value of "the mailbox when it was opened for the  
first time in this access."
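
To make the 100/200 example concrete, here is a toy Python model of the
semantics as explained above. It is only an illustration of the idea;
the class, its attributes, and the change-log structure are invented for
the example and do not mirror Horde_Imap_Client internals.

```python
from typing import Dict, List


class MailboxSession:
    """Toy model: state recorded when a mailbox is opened with a cached MODSEQ."""

    def __init__(self, cached_modseq: int, server_highestmodseq: int,
                 flag_change_log: Dict[int, int]) -> None:
        # The cached value from *before* the open, not the new server value.
        self.sync_modseq = cached_modseq
        # The server's current value after the open.
        self.highest_modseq = server_highestmodseq
        # Flag changes are reported relative to the cached value.
        self.sync_flag_uids: List[int] = sorted(
            uid for uid, modseq in flag_change_log.items()
            if modseq > cached_modseq
        )


# Cached MODSEQ 100, server now at 200; UID 7 changed at 150, UID 8 at 90.
session = MailboxSession(100, 200, {7: 150, 8: 90})
```

In this model `session.sync_modseq` stays at 100 while
`session.highest_modseq` is 200, and only UID 7 is listed as changed.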

> So STATUS_SYNCFLAGUIDS and STATUS_VANISHED contain all changes that  
> were sync'd to whatever existed in the cache at the beginning of the  
> page request (if messages are flagged/deleted subsequent to the  
> mailbox opening, these changes will also be stored).
>
> Conceptually, you could actually use STATUS_SYNCFLAGUIDS and  
> STATUS_VANISHED if your MODSEQ happens to be *greater* than the  
> cached MODSEQ - you will just get some changes returned that you  
> already know about.  Depending on how many duplicate changes there  
> are, it is still likely more efficient than having to make the 2  
> round-trips to the IMAP server to get the information otherwise.
>
> michael
>
> ___________________________________
> Michael Slusarz [slusarz at horde.org]
>

-- 
mike

The Horde Project (www.horde.org)
mrubinsk at horde.org