
Vdirsyncer status update, March 2024

2024-03-29 #status-update #vdirsyncer

Sync status

As I mentioned before, when synchronising two storages, the sync algorithm keeps around a local “status” with some basic metadata. Future executions use this metadata to understand which side has changed and which side needs updating.

I had to re-write most of my previous status implementation due to the issues that I mentioned in my previous status update. The new implementation keeps track of the metadata for the latest synchronised version, instead of the latest seen version. I’ve also made sure that all operations are atomic and race-free. Atomicity ensures that if the process is ever interrupted, the status never ends up in a partially updated state. Race-free queries ensure that if another process applied some other operation concurrently, the current one will bail rather than overwrite conflicting information.
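To illustrate the race-free part, here is a minimal sketch of a compare-and-swap style update. It assumes an in-memory map standing in for the real status storage, and `Status`, `ItemState` and `StatusError` are hypothetical names rather than the actual vdirsyncer types.

```rust
use std::collections::HashMap;

/// Metadata recorded for the latest *synchronised* version of an item.
/// (Hypothetical shape; the real status may track different fields.)
#[derive(Debug, Clone, PartialEq)]
pub struct ItemState {
    pub etag_a: String,
    pub etag_b: String,
}

#[derive(Debug, PartialEq)]
pub enum StatusError {
    /// The stored state no longer matches what the caller last read.
    Conflict,
}

/// Hypothetical in-memory stand-in for the persisted status.
#[derive(Default)]
pub struct Status {
    items: HashMap<String, ItemState>,
}

impl Status {
    pub fn get(&self, uid: &str) -> Option<&ItemState> {
        self.items.get(uid)
    }

    /// Replace the state for `uid` only if it still equals `expected`.
    /// On a mismatch nothing is written, so the caller bails instead of
    /// overwriting information recorded by someone else.
    pub fn update(
        &mut self,
        uid: &str,
        expected: Option<&ItemState>,
        new: ItemState,
    ) -> Result<(), StatusError> {
        if self.items.get(uid) != expected {
            return Err(StatusError::Conflict);
        }
        self.items.insert(uid.to_string(), new);
        Ok(())
    }
}
```

In a persisted status, the check and the write would also need to happen as one indivisible step (for example, inside a single transaction) for the atomicity guarantee to hold. The sketch only shows the "bail on mismatch" logic.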

Sync errors

I also split synchronisation errors into two groups: non-fatal and fatal.

Non-fatal errors imply that the whole operation can continue, leaving just the failing item out of sync. Some examples of non-fatal errors are transient network errors, or a server rejecting a calendar element for some server-specific reason.

Fatal errors imply that the entire operation must abort. Fatal errors occur mainly when writing to the status file fails for some unexpected reason.

Error reporting during sync

I need to provide a mechanism to report non-fatal errors to the end user. My current scope includes a command line tool, but an important goal of the sync library is that it has to provide enough functionality to build a proper GUI as well.

Initially, I tried accumulating non-fatal errors and later returning them together. This became rather entangled as fatal errors can interrupt the flow in various code paths, and the non-fatal errors need taking into consideration in each place. It also means that output would not be available until after the entire process completes.

I settled on an on_error(SyncError) parameter for the Plan::execute method. This parameter is a function that gets called each time an error happens. The error is not a plain string, but a proper error type with all the details necessary to explain to the user what went wrong.

Vdirsyncer itself passes an on_error function that merely logs the errors.
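The callback approach, combined with the fatal/non-fatal split, can be sketched roughly as follows. Only the names `Plan::execute`, `on_error` and `SyncError` come from the text above. The fields, the `FatalError` type, the return value and the `Outcome` enum (which simulates the result of each action instead of performing real I/O) are assumptions made for illustration.

```rust
use std::fmt;

/// A non-fatal error: only the affected item is left out of sync.
/// (Hypothetical fields; the real type carries more detail.)
#[derive(Debug, PartialEq)]
pub struct SyncError {
    pub item: String,
    pub reason: String,
}

impl fmt::Display for SyncError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to sync {}: {}", self.item, self.reason)
    }
}

/// A fatal error: the entire operation must abort.
#[derive(Debug, PartialEq)]
pub struct FatalError(pub String);

/// Simulated result of running one action (stands in for real I/O).
pub enum Outcome {
    Done,
    Failed(SyncError),
    StatusWriteFailed(String),
}

pub struct Plan {
    pub actions: Vec<Outcome>,
}

impl Plan {
    /// Runs every action in order. Non-fatal errors are handed to
    /// `on_error` as soon as they occur; a fatal error stops execution
    /// immediately. Returns the number of actions that completed.
    pub fn execute(self, mut on_error: impl FnMut(SyncError)) -> Result<usize, FatalError> {
        let mut completed = 0;
        for action in self.actions {
            match action {
                Outcome::Done => completed += 1,
                Outcome::Failed(err) => on_error(err),
                Outcome::StatusWriteFailed(msg) => return Err(FatalError(msg)),
            }
        }
        Ok(completed)
    }
}
```

A command line tool could pass a closure that logs each error, for instance `plan.execute(|err| eprintln!("{err}"))`, while a GUI could pass one that forwards errors to its event loop. Because the closure runs while the plan executes, output is available before the whole process completes.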

Collection auto-creation

I sometimes create a new todo list on another device (or on my phone), and I want these to replicate automatically without reconfiguring vdirsyncer.

When synchronising two storages, my new implementation will replicate new collections (e.g.: calendars, address books) to the other side too. It will also delete from one side a collection that was deleted on the other.
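One way to picture this is as a three-way comparison between the collections on each side and those recorded at the last sync. The sketch below is only an illustration under that assumption; `plan_collections`, `Side` and `CollectionAction` are hypothetical names, and collections are reduced to plain strings.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
pub enum Side {
    A,
    B,
}

#[derive(Debug, PartialEq)]
pub enum CollectionAction {
    /// Create the named collection on the given side.
    CreateOn(Side, String),
    /// Delete the named collection from the given side.
    DeleteFrom(Side, String),
}

/// Compare the collections present on each side with those known from the
/// previous sync (`status`) and decide what to replicate.
pub fn plan_collections(
    a: &BTreeSet<String>,
    b: &BTreeSet<String>,
    status: &BTreeSet<String>,
) -> Vec<CollectionAction> {
    let mut actions = Vec::new();
    // `union` on BTreeSet yields names in sorted order, without duplicates.
    for name in a.union(b) {
        let known = status.contains(name);
        match (a.contains(name), b.contains(name), known) {
            // Present on one side only and never seen before: it is new.
            (true, false, false) => actions.push(CollectionAction::CreateOn(Side::B, name.clone())),
            (false, true, false) => actions.push(CollectionAction::CreateOn(Side::A, name.clone())),
            // Present on one side only but previously synced: the other
            // side deleted it, so the deletion is replicated.
            (true, false, true) => actions.push(CollectionAction::DeleteFrom(Side::A, name.clone())),
            (false, true, true) => actions.push(CollectionAction::DeleteFrom(Side::B, name.clone())),
            // Present on both sides: nothing to do at the collection level.
            _ => {}
        }
    }
    actions
}
```

The recorded status is what distinguishes "new on this side" from "deleted on the other side"; without it, both cases look identical.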

I later intend to make this feature optional; I have no doubt that many users would prefer no auto-creation or auto-deletion. Disabling this feature would result in the same behaviour as the previous vdirsyncer.

I also intend to add a feature to prevent emptying a collection if it was emptied on the other side. Again, this is a safety measure, and also exists in the previous vdirsyncer.

Conflict resolution

I thought long and hard about conflict resolution, and tried a couple of prototypes. Most dedicated APIs end up having too much complexity and various limitations.

After enough thought, I settled on a simpler approach.

The module that handles synchronisation, vstorage::sync, creates a plan. This plan has all the individual actions that need to happen to synchronise the storages. It also includes details about the items that are in conflict (although no explicit action is taken for these).

The details for in-conflict items are enough to manually resolve the conflict. E.g.: they include the path to the item on both sides as well as their UID and Etag.

Instead of using a dedicated API for “conflict resolution” provided by this library, consumers can read details on conflicting items from the plan, and resolve the conflict externally by updating one or both storages with the desired final version of the item. Most of the work involved here was ensuring that enough information is made available for consumers to resolve conflicts in a race-free way (e.g.: by exposing the current Etags).
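As a rough sketch of what external, race-free resolution could look like: the `Conflict` struct below mirrors the details mentioned above (paths, UID and Etags), but its exact shape is an assumption, and `Storage` is a toy in-memory stand-in rather than the real storage API. The `keep_a` function is a hypothetical consumer-side helper that resolves a conflict by keeping side A's version.

```rust
use std::collections::HashMap;

/// Details a plan might expose for one conflicting item (hypothetical shape).
#[derive(Debug, Clone)]
pub struct Conflict {
    pub uid: String,
    pub path_a: String,
    pub etag_a: String,
    pub path_b: String,
    pub etag_b: String,
}

#[derive(Debug, PartialEq)]
pub struct EtagMismatch;

/// Toy storage mapping path -> (etag, data). Every write yields a new etag.
#[derive(Default)]
pub struct Storage {
    items: HashMap<String, (String, String)>,
    counter: u64,
}

impl Storage {
    /// Unconditional write; returns the new etag.
    pub fn put(&mut self, path: &str, data: &str) -> String {
        self.counter += 1;
        let etag = format!("etag-{}", self.counter);
        self.items.insert(path.to_string(), (etag.clone(), data.to_string()));
        etag
    }

    pub fn get(&self, path: &str) -> Option<&(String, String)> {
        self.items.get(path)
    }

    /// Overwrite only if the item's etag still equals `expected`.
    pub fn update(&mut self, path: &str, expected: &str, data: &str) -> Result<String, EtagMismatch> {
        let unchanged = self
            .items
            .get(path)
            .map_or(false, |(etag, _)| etag.as_str() == expected);
        if !unchanged {
            return Err(EtagMismatch);
        }
        Ok(self.put(path, data))
    }
}

/// Resolve a conflict by keeping side A's version: copy it over side B,
/// but only if neither side changed since the conflict was detected.
pub fn keep_a(conflict: &Conflict, a: &Storage, b: &mut Storage) -> Result<String, EtagMismatch> {
    let (etag, data) = a.get(&conflict.path_a).ok_or(EtagMismatch)?;
    if etag != &conflict.etag_a {
        return Err(EtagMismatch);
    }
    b.update(&conflict.path_b, &conflict.etag_b, data)
}
```

If either item changed after the plan was created, the Etag check fails and nothing is overwritten; the consumer can then build a fresh plan and try again.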

Vdirsyncer is built on top of this library, so it will follow these steps when resolving conflicts:

Providing enough information to resolve conflicts manually places little burden on consumers implementing conflict resolution, while maximising flexibility.

This also means that conflict resolution can continue in the foreground (potentially, with user interaction) while continuing synchronisation in the background.

In the coming weeks, I will continue working on the configuration aspect of conflict resolution (e.g.: reading the configured command and executing it). Unless any new issues pop up, I intend to publish an alpha release after this.

Current state

I have used this version of vdirsyncer myself for these last several months, and feel quite pleased with two large improvements over the previous version:

On the downside, the lack of conflict resolution soon became an issue.
