JMAP support in pimsync
I’ve finalised support for JMAP in pimsync and tagged a release with it. The feature is still experimental: preliminary testing shows it works fine, but it hasn’t had extensive long-term testing yet. Feedback is most welcome, but please keep frequent back-ups of your data while trying it out; this is still in an early state.
In my previous status update I mentioned that I’d be switching to the calcard library for converting iCalendar to and from JSCalendar. calcard uses serde for serialisation and deserialisation, and that resulted in a mismatch with my own libjmap, which (at the time) used the json library. I therefore ported libjmap to use serde instead. The resulting code is much clearer and, after a few iterations, also much more reliable.

With libjmap and calcard using the same serialisation representations, they fit together much more nicely. I then moved on to porting the `JmapStorage` implementation to use all of this.
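To give a rough idea of what the serde-based layer looks like conceptually, here is a minimal sketch. The type and field names are hypothetical, not libjmap’s or calcard’s actual API; the point is only that a JMAP object and its JSCalendar payload can share one serde representation instead of going through an intermediate JSON tree.

```rust
use std::collections::HashMap;

use serde::{Deserialize, Serialize};

// Hypothetical sketch of a JMAP calendar object modelled with serde.
// In the real libraries the JSCalendar fields would deserialise into the
// conversion library's own types; a plain map stands in for them here.
#[derive(Debug, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct CalendarEvent {
    id: String,
    calendar_ids: HashMap<String, bool>,
    #[serde(flatten)]
    jscalendar: HashMap<String, serde_json::Value>,
}

fn parse_event(body: &str) -> Result<CalendarEvent, serde_json::Error> {
    serde_json::from_str(body)
}
```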
Etag and State
The most complex part of the storage implementation was handling the mismatch between `Etag` and `State`. In CalDAV, each item has an `Etag` property, a version identifier which changes each time an item changes: if an item has changed, its `Etag` has changed. Pimsync stores an `Etag` value on a per-item basis, and tracks it each time an item is updated. This happens internally in the synchronisation logic, quite detached from individual storage implementations. Whenever we update an item, we tell the server to only update it if its `Etag` matches the one last seen. If it doesn’t match, then the item has been modified by some other client, and a conflict needs to be resolved.

For the filesystem storage, we use a combination of inode number + mtime as an `Etag`, and this matches the same conditions just fine.
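For a concrete picture of the filesystem case, a version identifier along these lines can be derived from the inode number and mtime. This is a simplified sketch, not pimsync’s actual code:

```rust
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Sketch: derive an Etag-like version identifier for a file from its inode
/// number and modification time. Any change to the file bumps the mtime, so
/// the resulting string changes too.
fn filesystem_etag(path: &Path) -> io::Result<String> {
    let meta = std::fs::metadata(path)?;
    Ok(format!("{};{}.{}", meta.ino(), meta.mtime(), meta.mtime_nsec()))
}
```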
JMAP doesn’t have an equivalent to `Etag`. It has a `State` value, but that changes whenever any item changes. We still use `State` as an `Etag` internally, since we don’t have anything else on which we can rely, but this brings about several complications.
As soon as we update a single item, we’ve invalidated the `Etag`/`State` for all other items in that storage. My early prototypes simply omitted the use of these, but that could easily result in conflicts being ignored and data being overwritten, i.e. losing changes made by some other client. This is unacceptable for anything beyond an early prototype.
In order to use the `State` value while keeping within the constraints of both pimsync and JMAP, I had to change the process of updating an item to the following sequence:
- Try to update the item, passing the last known `State`.
- If the process completes, then all is good.
- If the process fails due to a `State` mismatch, then:
  - Fetch the current `State`.
  - Request all changes between our item’s previously recorded `State` and the current `State`.
  - Check whether this particular item has changed between those two states.
  - If the item has not been modified, retry the update, passing this new `State` as the conditional.
  - If the item has been modified, then we have a real conflict: someone else has modified it since we last saw it.
This works, but has a really ugly side: uploading a single item can end up causing four network round trips. We can do better.
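In code, the sequence above looks roughly like the sketch below. The `JmapClient` trait and its method names are illustrative stand-ins, not libjmap’s real API; only the control flow mirrors the steps listed.

```rust
/// Hypothetical client interface; the method names are illustrative only.
trait JmapClient {
    /// Conditionally update an item; fails with `StateMismatch` if the
    /// server's current state differs from `if_in_state`.
    fn update_item(&self, id: &str, body: &str, if_in_state: &str) -> Result<String, Error>;
    /// Return the server's current state for this data type.
    fn current_state(&self) -> Result<String, Error>;
    /// Return ids of items changed between two states.
    fn changes_between(&self, since: &str, until: &str) -> Result<Vec<String>, Error>;
}

#[derive(Debug)]
enum Error {
    StateMismatch,
    Conflict,
    Other(String),
}

fn update_with_retry(
    client: &dyn JmapClient,
    id: &str,
    body: &str,
    last_seen_state: &str,
) -> Result<String, Error> {
    match client.update_item(id, body, last_seen_state) {
        Ok(new_state) => Ok(new_state),
        Err(Error::StateMismatch) => {
            // Our state is stale; find out whether *this* item changed.
            let current = client.current_state()?;
            let changed = client.changes_between(last_seen_state, &current)?;
            if changed.iter().any(|c| c.as_str() == id) {
                // Someone else modified this item: a real conflict.
                Err(Error::Conflict)
            } else {
                // Only unrelated items changed; retry against the new state.
                client.update_item(id, body, &current)
            }
        }
        Err(e) => Err(e),
    }
}
```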
Tracking state locally
In order to avoid continuously querying the server for state changes, I moved to caching states and their changes inside the `JmapStorage` implementation.

Essentially, it keeps a mapping from each state which shows up to the items which were modified between the previous state and that one. Each time an item is uploaded (or deleted), we pass an “expected state” to the server and it returns a new state. We record in the cache that this new state transition only modifies the items which we’ve just modified.
When we later update (or delete) some other item, we’ll have a `State` for the last time we’ve seen that item from the server. We can check our local cache, and if there’s a full path of transitions between that state and the current one, we can determine whether any of the intermediate states modified this item. If none of them did, then we can send the request telling the server to update the item assuming that it has not changed since the most recently seen state.
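A minimal sketch of such a cache might look like this; the type and field names are hypothetical, and this is simplified compared to what the storage actually needs:

```rust
use std::collections::{HashMap, HashSet};

/// Sketch of a local state-transition cache: for each state we have seen,
/// remember which state preceded it and which items changed in between.
#[derive(Default)]
struct StateCache {
    transitions: HashMap<String, Transition>,
}

struct Transition {
    previous: String,
    changed_items: HashSet<String>,
}

impl StateCache {
    /// Record that moving from `previous` to `new` only touched `items`.
    fn record(&mut self, previous: &str, new: &str, items: impl IntoIterator<Item = String>) {
        self.transitions.insert(
            new.to_string(),
            Transition {
                previous: previous.to_string(),
                changed_items: items.into_iter().collect(),
            },
        );
    }

    /// Walk back from `current` to `known`. Returns `Some(false)` if there is
    /// a full path of transitions and none of them touched `item`,
    /// `Some(true)` if some transition did, and `None` if the path is
    /// incomplete (a cache miss), in which case we must ask the server.
    fn item_changed_between(&self, known: &str, current: &str, item: &str) -> Option<bool> {
        let mut state = current;
        while state != known {
            let t = self.transitions.get(state)?;
            if t.changed_items.contains(item) {
                return Some(true);
            }
            state = &t.previous;
        }
        Some(false)
    }
}
```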
This implementation has mixed results. When continuously running (e.g.: `pimsync daemon`) there are plenty of cache hits and it saves a lot of network queries. When running ad-hoc (e.g.: `pimsync sync`) the cache usually doesn’t have enough updates to make any meaningful difference.
All of this is, essentially, a band-aid. Pimsync’s synchronisation algorithm currently expects to update items one at a time, but JMAP allows us to send a single request updating any number of items. Using this API, we could dramatically reduce the number of queries required. This requires large changes to the executor, which could also benefit the `singlefile` storage (which saves N events into a single iCalendar file). It’s a somewhat invasive change which I’d like to address at some point in the future, but there are higher priorities at the moment.
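For reference, the JMAP wire format already supports this: a single /set call can carry any number of creations, updates, and destroys, all guarded by one ifInState. A rough example of such a request body follows, built with serde_json purely for illustration; the method and capability names follow the JMAP calendars drafts and may differ per server, and the ids and states are made up.

```rust
use serde_json::json;

fn main() {
    // One request updating two events and deleting a third, all guarded by a
    // single ifInState.
    let request = json!({
        "using": ["urn:ietf:params:jmap:core", "urn:ietf:params:jmap:calendars"],
        "methodCalls": [
            ["CalendarEvent/set", {
                "accountId": "u1",
                "ifInState": "s41",
                "update": {
                    "event-1": { "title": "Team sync (moved)" },
                    "event-2": { "description": "Bring the agenda." }
                },
                "destroy": ["event-3"]
            }, "0"]
        ]
    });
    println!("{}", serde_json::to_string_pretty(&request).unwrap());
}
```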
Concurrency
Due to how state changes occur, JMAP is incapable of dealing with concurrent writes. If two clients send a write operation, they’ll instruct the server to only apply the changes if no other change has occurred. But because the state tracking is for all items, if any other item changes, then the operation fails. Because of this, the `JmapStorage` implementation only sends a single write query at a time. Any more than that would, by definition, always fail.
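Inside the storage this boils down to a lock around the write path; conceptually something like the following simplified sketch, which is not the actual implementation:

```rust
use std::sync::Mutex;

/// Stand-in for the real storage type; only the locking idea matters here.
struct JmapStorage {
    write_lock: Mutex<()>,
}

impl JmapStorage {
    fn write_item(&self, id: &str, body: &str) -> Result<(), String> {
        // Any concurrent conditional write would invalidate the state this
        // one depends on, so hold the lock for the whole operation.
        let _guard = self.write_lock.lock().map_err(|e| e.to_string())?;
        // ... perform the conditional write request here ...
        let _ = (id, body);
        Ok(())
    }
}
```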
Again, this will be ironed out once we simply write all changes in a single query, but it’s also an issue for concurrent clients trying to operate on entirely different items.
Final bits
The final bits of the implementation were integrating the `JmapStorage` into pimsync. Mostly, allowing configuration blocks for storages with type `jmap`. At this time, these storages need to specify `jmap/icalendar` or `jmap/vcard`. This is somewhat of an annoyance, since the same storage can actually handle both. In future, I’ll move the declaration of item type into `pair` blocks, so a single storage declaration can be re-used for both.
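As a rough example, a storage declaration could look something like the block below. The type values are the ones mentioned above; the remaining directives are illustrative placeholders, so check the manual page for the directive names the jmap storage actually accepts.

```
storage calendars_jmap {
	type jmap/icalendar
	# Illustrative placeholders: consult the manual page for the exact
	# directives the jmap storage accepts.
	url https://jmap.example.com/.well-known/jmap
	username carol@example.com
	password_cmd pass show jmap.example.com
}
```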
Collection IDs and URL segments
Unrelated to JMAP: it is now possible to configure which URL segment is used for collection identifiers. A few months ago, I explained why this is needed for some unusual configurations, including Google’s CalDAV implementation.
The new `collection_id_segment` configuration directive allows using the second-to-last segment as a collection id. See the manual page for reference documentation.
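The mechanism itself is simple: instead of taking the last path segment of a collection’s URL as its identifier, take the one before it. A trivial illustration of the idea (not pimsync’s code):

```rust
/// Sketch: pick a collection id from a URL path, either the last segment
/// (the default) or the second-to-last one, for servers like Google's
/// CalDAV endpoint where the last segment is not a useful identifier.
fn collection_id(path: &str, second_to_last: bool) -> Option<&str> {
    let mut segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    if second_to_last {
        segments.pop()?;
    }
    segments.pop()
}

fn main() {
    // e.g. a Google-style collection URL ending in a fixed "events" segment.
    let path = "/caldav/v2/someone@example.com/events/";
    assert_eq!(collection_id(path, false), Some("events"));
    assert_eq!(collection_id(path, true), Some("someone@example.com"));
    println!("ok");
}
```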
While native OAuth support is still on the road map, this change allows using a proxy such as google-dav-proxy[^1] with pimsync. It requires a bit more setup, but should be feasible to use today. If you do try it out, please let me know the results.
[^1]: I have neither tested nor reviewed this proxy.