Sync status
As I mentioned before, when synchronising two storages, the sync algorithm keeps around a local “status” with some basic metadata. Future executions use this metadata to understand which side has changed and which side needs updating.
I had to re-write most of my previous status implementation due to the issues that I mentioned in my previous status update. The new implementation keeps track of the metadata for the latest synchronised version, instead of the latest seen version. I’ve also made sure that all operations are atomic and race-free. Atomicity ensures that if the process is ever interrupted, the status never ends up in a partially updated state. Race-free queries ensure that if another process applied some other operation concurrently, the current one will bail rather than overwrite conflicting information.
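As a rough illustration of the "bail rather than overwrite" behaviour, here is a minimal in-memory sketch. The names `Status`, `SyncedVersion` and `compare_and_set` are hypothetical and are not the library's actual types. A real implementation would also need to perform the check and the write as one atomic step against persistent storage, which this sketch does not attempt.

```rust
use std::collections::HashMap;

/// Metadata recorded for an item the last time both sides were in sync.
/// Hypothetical type, for illustration only.
#[derive(Clone, Debug, PartialEq)]
pub struct SyncedVersion {
    pub etag_a: String,
    pub etag_b: String,
}

#[derive(Debug, PartialEq)]
pub enum StatusError {
    /// The stored entry no longer matches what the caller last read.
    Conflict,
}

/// In-memory stand-in for the persistent status.
#[derive(Default)]
pub struct Status {
    items: HashMap<String, SyncedVersion>,
}

impl Status {
    pub fn get(&self, uid: &str) -> Option<&SyncedVersion> {
        self.items.get(uid)
    }

    /// Replace the entry for `uid` only if it still equals `expected`.
    /// Otherwise leave the status untouched and report a conflict.
    pub fn compare_and_set(
        &mut self,
        uid: &str,
        expected: Option<&SyncedVersion>,
        new: SyncedVersion,
    ) -> Result<(), StatusError> {
        if self.items.get(uid) != expected {
            return Err(StatusError::Conflict);
        }
        self.items.insert(uid.to_string(), new);
        Ok(())
    }
}
```

A caller reads the current entry, does its work, and then passes what it read as `expected`; if something else changed the entry in the meantime, the call returns `StatusError::Conflict` and nothing is overwritten.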
Sync errors
I also split synchronisation errors into two groups: non-fatal and fatal.
Non-fatal errors imply that the whole operation can continue, leaving just the failing item out of sync. Some examples of non-fatal errors are a transient network error, or a server rejecting a calendar element for some server-specific reason.
Fatal errors imply that the entire operation must abort. Fatal errors occur mainly when writing to the status file fails for some unexpected reason.
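One way to model this split is a single error type with a method that classifies each variant. This is only a sketch: the variant names and the `is_fatal` helper are assumptions made for illustration, chosen to mirror the examples above, and are not taken from the library.

```rust
/// Hypothetical error type; the variants mirror the examples given above.
#[derive(Debug)]
pub enum SyncError {
    /// A transient network failure while handling one item.
    Network { item: String },
    /// The server refused one item for a server-specific reason.
    Rejected { item: String, reason: String },
    /// Writing to the status failed.
    StatusWrite { reason: String },
}

impl SyncError {
    /// Fatal errors abort the whole run; non-fatal ones only leave
    /// the failing item out of sync.
    pub fn is_fatal(&self) -> bool {
        matches!(self, SyncError::StatusWrite { .. })
    }
}
```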
Error reporting during sync
I need to provide a mechanism to report non-fatal errors to the end user. My current scope includes a command line tool, but an important goal of the sync library is that it has to provide enough functionality to build a proper GUI as well.
Initially, I tried accumulating non-fatal errors and later returning them together. This became rather entangled as fatal errors can interrupt the flow in various code paths, and the non-fatal errors need taking into consideration in each place. It also means that output would not be available until after the entire process completes.
I settled on an on_error(SyncError) parameter for the Plan::execute method.
This parameter is a function that gets called each time an error happens. The
error is not a plain string, but actually a proper error type with all the
details necessary to explain to the user what went wrong.
Vdirsyncer itself passes an on_error function that merely logs the errors.
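The callback approach can be sketched roughly as follows. Only the names `Plan::execute`, `on_error` and `SyncError` come from the text above; the exact signature, the `Action` type, and its pre-computed `outcome` field (which stands in for real storage I/O) are assumptions made so that the example is self-contained. Whether fatal errors are also passed to the callback is not stated, so this sketch simply returns them.

```rust
#[derive(Debug, PartialEq)]
pub enum SyncError {
    NonFatal { item: String, reason: String },
    Fatal { reason: String },
}

/// Hypothetical action; `outcome` simulates the result of applying it.
pub struct Action {
    pub item: String,
    pub outcome: Result<(), SyncError>,
}

pub struct Plan {
    pub actions: Vec<Action>,
}

impl Plan {
    /// Run every action in order. Non-fatal errors are handed to
    /// `on_error` as soon as they happen; the first fatal error stops
    /// the run and is returned to the caller.
    pub fn execute(self, mut on_error: impl FnMut(SyncError)) -> Result<(), SyncError> {
        for action in self.actions {
            match action.outcome {
                Ok(()) => {}
                Err(err @ SyncError::Fatal { .. }) => return Err(err),
                Err(err) => on_error(err),
            }
        }
        Ok(())
    }
}
```

A command line tool could pass a closure that logs each error, while a GUI could pass one that forwards errors to its event loop; in both cases the caller sees errors while the run is still in progress.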
Collection auto-creation
I sometimes create a new todo list on another device (or on my phone), and I want these to replicate automatically without reconfiguring vdirsyncer.
When synchronising two storages, my new implementation will replicate new collections (e.g.: calendars, address books) to the other side too. It will also delete from one side a collection that was deleted on the other.
I later intend to make this feature optional; I have no doubt that many users would prefer no auto-creation or auto-deletion. Disabling this feature would result in the same behaviour as the previous vdirsyncer.
I also intend to add a feature to prevent emptying a collection if it was emptied on the other side. Again, this is a safety measure, and also exists in the previous vdirsyncer.
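The create-or-delete decision for a single collection can be pictured as a function of three facts: whether the collection exists on side A, whether it exists on side B, and whether the status remembers it from an earlier synchronisation. The sketch below is my own reading of that rule; the type and function names are hypothetical, and the real implementation may decide differently.

```rust
#[derive(Debug, PartialEq)]
pub enum CollectionAction {
    CreateInA,
    CreateInB,
    DeleteFromA,
    DeleteFromB,
    /// Gone from both sides; only the status entry remains to clean up.
    ForgetStatus,
    /// Nothing to create or delete at the collection level.
    Nothing,
}

/// Decide what to do with one collection (assumed rule, for illustration).
pub fn plan_collection(in_a: bool, in_b: bool, in_status: bool) -> CollectionAction {
    use CollectionAction::*;
    match (in_a, in_b, in_status) {
        // Present on both sides: only the items inside need syncing.
        (true, true, _) => Nothing,
        // Never synchronised before and present on one side: it is new.
        (true, false, false) => CreateInB,
        (false, true, false) => CreateInA,
        // Synchronised before and now missing on one side: it was deleted.
        (true, false, true) => DeleteFromA,
        (false, true, true) => DeleteFromB,
        (false, false, true) => ForgetStatus,
        (false, false, false) => Nothing,
    }
}
```

For example, a todo list created on a phone shows up as present on one side only with no status entry, so it is created on the other side.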
Conflict resolution
I thought long and hard about conflict resolution, and tried a couple of prototypes. Most dedicated APIs end up having too much complexity and various limitations.
After enough thought, I settled on a simpler approach.
The module that handles synchronisation, vstorage::sync, creates a plan. This
plan has all the individual actions that need to happen to synchronise the
storages. It also includes details about the items that are in conflict
(although no explicit action is actually taken for these).
The details for in-conflict items are enough to manually resolve the conflict. E.g.: they include the path to the item on both sides as well as their UID and Etag.
Instead of using a dedicated API for “conflict resolution” provided by this
library, consumers can read details on conflicting items from the plan, and
resolve the conflict externally by updating one or both storages with the
desired final version of the item. Most of the work involved here was ensuring
that enough information is made available for consumers to resolve conflicts in
a race-free way (e.g.: by exposing the current Etags).
Vdirsyncer is built on top of this library, so it will follow these steps when resolving conflicts:
- Prepare a plan by using vstorage::sync.
- Inspect all items that are in conflict.
- If the conflict resolution strategy is keep a or keep b, vdirsyncer replaces the “conflict” actions with CopyToB or CopyToA respectively.
- If the conflict resolution is a custom command, vdirsyncer will download both sides of the conflicting item and call the custom command. If the command exits with a success exit code, vdirsyncer will update both storages with the returned content (by using the Storage::update_item method).
  - In the latter case, during the next synchronisation vdirsyncer will identify that the item is in sync again, resolving the conflict.
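The keep a / keep b step above amounts to rewriting the plan before it is executed. A minimal sketch follows, assuming a simplified plan made of a flat list of actions. Only the names CopyToA and CopyToB and the conflict details (paths, UID, Etags) come from the text; the `Action`, `ConflictInfo` and `Strategy` types and the `apply_strategy` function are hypothetical.

```rust
/// Details exposed for an item in conflict (hypothetical layout).
#[derive(Debug, Clone, PartialEq)]
pub struct ConflictInfo {
    pub uid: String,
    pub path_a: String,
    pub etag_a: String,
    pub path_b: String,
    pub etag_b: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    CopyToA { uid: String },
    CopyToB { uid: String },
    /// No action is taken for these unless the consumer replaces them.
    Conflict(ConflictInfo),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Strategy {
    KeepA,
    KeepB,
}

/// Replace every conflict with a copy in the direction implied by the
/// strategy; all other actions pass through unchanged.
pub fn apply_strategy(actions: Vec<Action>, strategy: Strategy) -> Vec<Action> {
    actions
        .into_iter()
        .map(|action| match action {
            // Keeping A's version means overwriting B, and vice versa.
            Action::Conflict(info) => match strategy {
                Strategy::KeepA => Action::CopyToB { uid: info.uid },
                Strategy::KeepB => Action::CopyToA { uid: info.uid },
            },
            other => other,
        })
        .collect()
}
```

The custom-command case does not fit this shape, since it writes new content to both storages outside the plan and relies on the next synchronisation to see the item as in sync.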
Providing enough information to resolve conflicts manually puts little burden on implementing conflict resolution while maximising flexibility.
This also means that conflict resolution can continue in the foreground (potentially with user interaction) while synchronisation continues in the background.
In the coming weeks, I will continue working on the configuration aspect of conflict resolution (e.g.: reading the configured command and executing it). Unless any new issues pop up, I intend to publish an alpha release after this.
Current state
I have used this version of vdirsyncer myself for these last several months, and feel quite pleased with two large improvements over the previous version:
- If any single item fails for any reason, all others continue to synchronise.
- I can run it in the background and it keeps things in sync without need for restarts or prompts for credentials.
On the downside, the lack of conflict resolution soon became an issue.