JMAP support in pimsync
I’ve finalised support for JMAP in pimsync and tagged a release with it. The feature is still experimental: preliminary testing shows it works fine, but it hasn’t had extensive long-term testing yet. Feedback is most welcome, but please keep frequent back-ups of your data while trying it out; this is still in an early state.
In my previous status update I mentioned that I’d be switching to the calcard library for converting iCalendar to and from JSCalendar. calcard uses serde for serialisation and deserialisation, and that resulted in a mismatch with my own libjmap, which (at the time) used the json library. I therefore ported libjmap to use serde instead. The resulting code is much clearer and, after a few iterations, also much more reliable.

With libjmap and calcard using the same serialisation representations, they fit together much more nicely. I then moved on to porting the `JmapStorage` implementation to use all of this.
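To give a rough idea of what the serde-based layer looks like conceptually, here is a minimal sketch. The type and field names are hypothetical, not libjmap’s or calcard’s actual API; the point is only that a JMAP object and its JSCalendar payload can share one serde representation instead of going through an intermediate JSON tree.

```rust
use std::collections::HashMap;

use serde::{Deserialize, Serialize};

// Hypothetical sketch of a JMAP calendar object modelled with serde.
// In the real libraries the JSCalendar fields would deserialise into the
// conversion library's own types; a plain map stands in for them here.
#[derive(Debug, Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct CalendarEvent {
    id: String,
    calendar_ids: HashMap<String, bool>,
    #[serde(flatten)]
    jscalendar: HashMap<String, serde_json::Value>,
}

fn parse_event(body: &str) -> Result<CalendarEvent, serde_json::Error> {
    serde_json::from_str(body)
}
```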
Etag and State
The most complex part of the storage implementation was handling the mismatch between `Etag` and `State`. In CalDAV, each item has an `Etag` property, a version identifier which changes each time an item changes: if an item has changed, its `Etag` has changed. Pimsync stores an `Etag` value on a per-item basis, and tracks it each time an item is updated. This happens internally in the synchronisation logic, quite detached from individual storage implementations. Whenever we update an item, we tell the server to only update it if its `Etag` matches the one last seen. If it doesn’t match, then the item has been modified by some other client, and a conflict needs to be resolved.

For the filesystem storage, we use a combination of inode number + mtime as an `Etag`, and this matches the same conditions just fine.
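For a concrete picture of the filesystem case, a version identifier along these lines can be derived from the inode number and mtime. This is a simplified sketch, not pimsync’s actual code:

```rust
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Sketch: derive an Etag-like version identifier for a file from its inode
/// number and modification time. Any change to the file bumps the mtime, so
/// the resulting string changes too.
fn filesystem_etag(path: &Path) -> io::Result<String> {
    let meta = std::fs::metadata(path)?;
    Ok(format!("{};{}.{}", meta.ino(), meta.mtime(), meta.mtime_nsec()))
}
```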
JMAP doesn’t have an equivalent to `Etag`. It has a `State` value, but that changes whenever any item changes. We still use `State` as an `Etag` internally, since we don’t have anything else on which we can rely, but this brings about several complications.
As soon as we update a single item, we’ve invalidated the `Etag`/`State` for all other items in that storage. My early prototypes simply omitted the use of these, but that could easily result in conflicts being ignored and data being overwritten, i.e. losing changes made by some other client. This is unacceptable for anything beyond an early prototype.
In order to use the `State` value while keeping within the constraints of both pimsync and JMAP, I had to change the process of updating an item to the following sequence:
- Try to update the item, passing the last known `State`.
- If the process completes, then all is good.
- If the process fails due to a `State` mismatch, then:
  - Fetch the current `State`.
  - Request all changes between our item’s previously recorded `State` and the current `State`.
  - Check whether this particular item has changed between those two states.
  - If the item has not been modified, retry the update, passing this new `State` as the conditional.
  - If the item has been modified, then we have a real conflict: someone else has modified it since we last saw it.
This works, but has a really ugly side: uploading a single item can end up causing four network round trips. We can do better.
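In code, the sequence above looks roughly like the sketch below. The `JmapClient` trait and its method names are illustrative stand-ins, not libjmap’s real API; only the control flow mirrors the steps listed.

```rust
/// Hypothetical client interface; the method names are illustrative only.
trait JmapClient {
    /// Conditionally update an item; fails with `StateMismatch` if the
    /// server's current state differs from `if_in_state`.
    fn update_item(&self, id: &str, body: &str, if_in_state: &str) -> Result<String, Error>;
    /// Return the server's current state for this data type.
    fn current_state(&self) -> Result<String, Error>;
    /// Return ids of items changed between two states.
    fn changes_between(&self, since: &str, until: &str) -> Result<Vec<String>, Error>;
}

#[derive(Debug)]
enum Error {
    StateMismatch,
    Conflict,
    Other(String),
}

fn update_with_retry(
    client: &dyn JmapClient,
    id: &str,
    body: &str,
    last_seen_state: &str,
) -> Result<String, Error> {
    match client.update_item(id, body, last_seen_state) {
        Ok(new_state) => Ok(new_state),
        Err(Error::StateMismatch) => {
            // Our state is stale; find out whether *this* item changed.
            let current = client.current_state()?;
            let changed = client.changes_between(last_seen_state, &current)?;
            if changed.iter().any(|c| c.as_str() == id) {
                // Someone else modified this item: a real conflict.
                Err(Error::Conflict)
            } else {
                // Only unrelated items changed; retry against the new state.
                client.update_item(id, body, &current)
            }
        }
        Err(e) => Err(e),
    }
}
```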
Tracking state locally
In order to avoid continuously querying the server for state changes, I moved to caching states and their changes inside the `JmapStorage` implementation.

Essentially, it keeps a mapping from each state which shows up to the items which were modified between the previous state and that one. Each time an item is uploaded (or deleted), we pass an “expected state” to the server and it returns a new state. We record in the cache that this new state transition only modifies the items which we’ve just modified.
When we later update (or delete) some other item, we’ll have a `State` for the last time we’ve seen that item from the server. We can check our local cache, and if there’s a full path of transitions between that state and the current one, we can determine whether any of the intermediate states modified this item. If none of them did, then we can send the request telling the server to update the item assuming that it has not changed since the most recently seen state.
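A minimal sketch of such a cache might look like this; the type and field names are hypothetical, and this is simplified compared to what the storage actually needs:

```rust
use std::collections::{HashMap, HashSet};

/// Sketch of a local state-transition cache: for each state we have seen,
/// remember which state preceded it and which items changed in between.
#[derive(Default)]
struct StateCache {
    transitions: HashMap<String, Transition>,
}

struct Transition {
    previous: String,
    changed_items: HashSet<String>,
}

impl StateCache {
    /// Record that moving from `previous` to `new` only touched `items`.
    fn record(&mut self, previous: &str, new: &str, items: impl IntoIterator<Item = String>) {
        self.transitions.insert(
            new.to_string(),
            Transition {
                previous: previous.to_string(),
                changed_items: items.into_iter().collect(),
            },
        );
    }

    /// Walk back from `current` to `known`. Returns `Some(false)` if there is
    /// a full path of transitions and none of them touched `item`,
    /// `Some(true)` if some transition did, and `None` if the path is
    /// incomplete (a cache miss), in which case we must ask the server.
    fn item_changed_between(&self, known: &str, current: &str, item: &str) -> Option<bool> {
        let mut state = current;
        while state != known {
            let t = self.transitions.get(state)?;
            if t.changed_items.contains(item) {
                return Some(true);
            }
            state = &t.previous;
        }
        Some(false)
    }
}
```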
This implementation has mixed results. When continuously running (e.g.: `pimsync daemon`) there are plenty of cache hits and it saves a lot of network queries. When running ad-hoc (e.g.: `pimsync sync`) the cache usually doesn’t have enough updates to make any meaningful difference.
All of this is, essentially, a band-aid. Pimsync’s synchronisation algorithm currently expects to update items one at a time, but JMAP allows us to send a single request updating any number of items. Using this API, we could dramatically reduce the number of queries required. This requires large changes to the executor, which could also benefit the `singlefile` storage (which saves N events into a single iCalendar file). It’s a somewhat invasive change which I’d like to address at some point in the future, but there are higher priorities at the moment.
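For reference, the JMAP wire format already supports this: a single /set call can carry any number of creations, updates, and destroys, all guarded by one ifInState. A rough example of such a request body follows, built with serde_json purely for illustration; the method and capability names follow the JMAP calendars drafts and may differ per server, and the ids and states are made up.

```rust
use serde_json::json;

fn main() {
    // One request updating two events and deleting a third, all guarded by a
    // single ifInState.
    let request = json!({
        "using": ["urn:ietf:params:jmap:core", "urn:ietf:params:jmap:calendars"],
        "methodCalls": [
            ["CalendarEvent/set", {
                "accountId": "u1",
                "ifInState": "s41",
                "update": {
                    "event-1": { "title": "Team sync (moved)" },
                    "event-2": { "description": "Bring the agenda." }
                },
                "destroy": ["event-3"]
            }, "0"]
        ]
    });
    println!("{}", serde_json::to_string_pretty(&request).unwrap());
}
```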
Concurrency
Due to how state changes occur, JMAP is incapable of dealing with concurrent writes. If two clients send a write operation, they’ll instruct the server to only apply the changes if no other change has occurred. But because the state tracking is for all items, if any other item changes, then the operation fails. Because of this, the `JmapStorage` implementation only sends a single write query at a time. Any more than that would, by definition, always fail.
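Inside the storage this boils down to a lock around the write path; conceptually something like the following simplified sketch, which is not the actual implementation:

```rust
use std::sync::Mutex;

/// Stand-in for the real storage type; only the locking idea matters here.
struct JmapStorage {
    write_lock: Mutex<()>,
}

impl JmapStorage {
    fn write_item(&self, id: &str, body: &str) -> Result<(), String> {
        // Any concurrent conditional write would invalidate the state this
        // one depends on, so hold the lock for the whole operation.
        let _guard = self.write_lock.lock().map_err(|e| e.to_string())?;
        // ... perform the conditional write request here ...
        let _ = (id, body);
        Ok(())
    }
}
```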
Again, this will be ironed out once we simply write all changes in a single query, but it’s also an issue for concurrent clients trying to operate on entirely different items.
Final bits
The final bits of the implementation were integrating the `JmapStorage` into pimsync. Mostly, allowing configuration blocks for storages with type `jmap`. At this time, these storages need to specify `jmap/icalendar` or `jmap/vcard`. This is somewhat of an annoyance, since the same storage can actually handle both. In future, I’ll move the declaration of item type into `pair` blocks, so a single storage declaration can be re-used for both.
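As a rough example, a storage declaration could look something like the block below. The type values are the ones mentioned above; the remaining directives are illustrative placeholders, so check the manual page for the directive names the jmap storage actually accepts.

```
storage calendars_jmap {
	type jmap/icalendar
	# Illustrative placeholders: consult the manual page for the exact
	# directives the jmap storage accepts.
	url https://jmap.example.com/.well-known/jmap
	username carol@example.com
	password_cmd pass show jmap.example.com
}
```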
Collection IDs and URL segments
Unrelated to JMAP: it is now possible to configure which URL segment is used for collection identifiers. A few months ago, I explained why this is needed for some unusual configurations, including Google’s CalDAV implementation.
The new `collection_id_segment` configuration directive allows using the second-to-last segment as a collection id. See the manual page for reference documentation.
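The mechanism itself is simple: instead of taking the last path segment of a collection’s URL as its identifier, take the one before it. A trivial illustration of the idea (not pimsync’s code):

```rust
/// Sketch: pick a collection id from a URL path, either the last segment
/// (the default) or the second-to-last one, for servers like Google's
/// CalDAV endpoint where the last segment is not a useful identifier.
fn collection_id(path: &str, second_to_last: bool) -> Option<&str> {
    let mut segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    if second_to_last {
        segments.pop()?;
    }
    segments.pop()
}

fn main() {
    // e.g. a Google-style collection URL ending in a fixed "events" segment.
    let path = "/caldav/v2/someone@example.com/events/";
    assert_eq!(collection_id(path, false), Some("events"));
    assert_eq!(collection_id(path, true), Some("someone@example.com"));
    println!("ok");
}
```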
While native OAuth support is still on the road map, this change allows using a proxy such as google-dav-proxy[^1] with pimsync. It requires a bit more setup, but should be feasible to use today. If you do try it out, please let me know the results.
[^1]: I have neither tested nor reviewed this proxy.