This month started out rather slow, with me taking a week off on vacation and then coming back home only to be sick in bed for another week. But it ended up being rather productive after all.
Configuration parsing
My initial goal since last month was to implement a small binary that can work as a drop-in replacement for vdirsyncer. With some basic sync scenarios working¹, the next step was to parse the configuration file.
The configuration format has a few quirks. First, the names of storages and pairs are part of the section titles, which means section titles aren’t static. Second, property values are treated as JSON, so I need a second parser for those.
I’m using rust-ini to parse the configuration file itself (it’s been a perfect fit so far) and the famous serde_json to parse the property values.
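To illustrate the first quirk, here is a minimal, std-only sketch of how dynamic section titles could be mapped onto typed values. The `Section` enum and the `parse_section_title` function are hypothetical names made up for this example, not the actual code. In the real binary, rust-ini supplies the title strings, and each property value then goes through serde_json.

```rust
/// The kinds of section found in a vdirsyncer-style configuration file.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
enum Section<'a> {
    /// `[general]`
    General,
    /// `[pair <name>]`
    Pair(&'a str),
    /// `[storage <name>]`
    Storage(&'a str),
}

/// Classifies a section title (the text between the square brackets).
/// Returns `None` for anything that does not match a known shape.
fn parse_section_title(title: &str) -> Option<Section<'_>> {
    let mut parts = title.split_whitespace();
    // Look at up to three words: the kind, the name, and any unexpected extra.
    match (parts.next(), parts.next(), parts.next()) {
        (Some("general"), None, None) => Some(Section::General),
        (Some("pair"), Some(name), None) => Some(Section::Pair(name)),
        (Some("storage"), Some(name), None) => Some(Section::Storage(name)),
        _ => None,
    }
}
```

With this shape, a title such as `storage my_contacts_local` yields the storage name directly, and malformed titles (a missing name, or trailing words) are rejected up front rather than surfacing later as confusing errors.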
I’ve gotten to a point where most of my real configuration file parses fine, and creates a Config object. I say “most” because one key detail is missing: the CardDav storage.
So I’ve temporarily paused development of this binary until I catch up with the CardDav implementation.
CardDav
So I set out to advance on the CardDav (i.e.: address books) implementation. Everything so far has mostly focused on CalDav (i.e.: calendars), while always leaving the API definitions in place to make sure that CardDav would fit.
Implementing CardDav is a bit repetitive. It’s very similar to CalDav but with small differences all over the place. I’m often tempted to try and abstract the common parts, but that often turns out far harder than just having two very similar looking functions.
However, for all the XML parsing aspect of things (reminder: both CalDav and CardDav are XML-based), it did bother me how little could be reused from the CalDav implementation. In fact, XML parsing was already a huge mess in itself. Parsing worked (and pretty well I might say), but it was hard to understand, hard to reuse parts, and quite non-trivial to parse any new structure (which was required for CardDav).
I ended up rewriting all the XML parsing parts of libdav. And then again.
The first rewrite improved code re-usability, but I’m not sure if it was really easier to understand. This approach used the low-level parser from quick-xml. Using the low-level parser was nice, because I could extract only the necessary bits from the XML data that I received, and discard all the noise. However, the price being paid for this (in terms of complexity) wasn’t worth it. Theoretically, performance could be stupendous with this approach, but that would require investing a lot more time into this than what is realistically ever going to happen.
After I was done, I was not happy with the result. It was a lot more code, and a lot more complexity, and I had the impression that some bits did almost the same work, but it was really hard to consolidate them into one thing.
I took a while to sit back and consider my options. It seemed like parsing the whole XML structure into an actual in-memory tree (with pointers to the original data, no copying!) would be a superb approach. Converting data from this tree into the final data type should be pretty simple. I could definitely do this, but that’s not what I signed up for: I’m here to write a calendar (and addressbook) synchronisation tool, not an XML tree generator.
Obviously, such a thing already existed. Indeed, roxmltree seemed to fit the bill perfectly and it also seems to be the best option around in terms of performance. Great!
I set out to do this whole rewrite again, which took an entire day. And by “entire day”, I mean, I started at 9:00hs and finished at around 23:00hs, just in time for bed (don’t worry, I did eat and stretch).
I’m pretty happy with the result. It’s pretty easy to extract data from an XML tree, and instead of a huge mess of intertwined Parser and Node implementations, I just have a few parse_ functions (basically, one for each query type). These functions are smaller in scope and even have a few nice individual tests each, yay.
Something that’s left for later is dealing with parsing non-UTF-8 data. So far, all the test servers use UTF-8, but I’m sure that some exception exists out there.
CardDav progress
With the XML parsing bits improved this much, I’ve turned my attention back to CardDav itself. It all looks in pretty good shape, and I’m currently writing some live tests to check functionality against live servers (currently: xandikos, radicale, baikal, nextcloud and cyrus-imap). These are showing good results so far, so I expect to close this phase pretty soon.
Some notes on configuration parsing
My intent is for the configuration format to remain unchanged, but there are some small subtleties that will have to change. I intend to clearly document these, and have a clear migration guide in place before retiring the previous tool.
Notes on synchronising email
I’ve synced up with @soywood, who is working on pimalaya (open source tools and libraries to manage emails), with regards to synchronising IMAP using the vstorage library and using my same synchronisation algorithm for synchronising emails.
A vstorage is generic on its item type and doesn’t really care what type of data it’s handling or synchronising. It only cares that the data is a bunch of bytes and that it has an API to extract a unique ID for each item.
In the case of IMAP, however, something else is relevant: flags. Flags are per-message and might change while the message does not (flags are things like “seen”, “starred”, etc). Flags are not part of the message blob itself. And re-synchronising emails just because flags change can be expensive: imagine having to re-download 10 emails with 5MiB attachments just because they’re now marked as “read”.
vstorage does have a concept for flags: properties (previously called metadata). However, these are only applicable to collections, and not items. IMAP flags apply only to items and not to collections.
It’s not clear to me how to make this work nicely. One potential option is to have a new concept of ItemProperties and shove flags in there, while defining ItemProperties as “nothing” for CalDav and CardDav. This doesn’t feel quite right, so I intend to keep thinking on this before making any decision.
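To make the idea a bit more concrete, here is a rough sketch of what per-item properties could look like, using an associated type that is simply `()` for calendar entries. Everything here (`Item`, `CalendarItem`, `EmailItem`, `Change`, `classify_change`, `email`) is a hypothetical name invented for this example and not the vstorage API; it only shows how a flag change could be told apart from a content change so that the message itself need not be fetched again.

```rust
use std::collections::BTreeSet;

/// Hypothetical item abstraction; NOT the real vstorage API.
trait Item {
    /// State that may change while the payload stays the same.
    type Properties: PartialEq;

    /// Raw payload: a calendar entry, a vCard, or a whole email message.
    fn data(&self) -> &[u8];

    /// Current per-item properties.
    fn properties(&self) -> Self::Properties;
}

/// A calendar entry has no per-item properties, so it uses the unit type.
struct CalendarItem {
    data: Vec<u8>,
}

impl Item for CalendarItem {
    type Properties = ();
    fn data(&self) -> &[u8] {
        &self.data
    }
    fn properties(&self) -> Self::Properties {}
}

/// An email carries flags such as "seen" alongside the message itself.
struct EmailItem {
    data: Vec<u8>,
    flags: BTreeSet<String>,
}

impl Item for EmailItem {
    type Properties = BTreeSet<String>;
    fn data(&self) -> &[u8] {
        &self.data
    }
    fn properties(&self) -> Self::Properties {
        self.flags.clone()
    }
}

/// What differs between two versions of the same item.
#[derive(Debug, PartialEq)]
enum Change {
    Unchanged,
    /// Only flags changed: no need to transfer the payload again.
    PropertiesOnly,
    /// The payload itself changed.
    Data,
}

/// Compares payloads first, then properties.
fn classify_change<I: Item>(old: &I, new: &I) -> Change {
    if old.data() != new.data() {
        Change::Data
    } else if old.properties() != new.properties() {
        Change::PropertiesOnly
    } else {
        Change::Unchanged
    }
}

/// Small helper to build an email for experiments.
fn email(data: &str, flags: &[&str]) -> EmailItem {
    EmailItem {
        data: data.as_bytes().to_vec(),
        flags: flags.iter().map(|f| f.to_string()).collect(),
    }
}
```

For example, comparing `email("Subject: hi", &[])` with `email("Subject: hi", &["seen"])` gives `Change::PropertiesOnly`, while two `CalendarItem`s can only ever differ in their data. A real implementation would compare hashes or etags rather than full payloads; the byte comparison here is just the simplest stand-in.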
My main thought on the topic is: I intend to implement synchronising via JMAP one day. JMAP can handle calendars, address books and emails, so it just seems logical to synchronise emails at one point too. We’ll see how that goes.
¹ Basically, CalDav with HTTPS and Basic Auth works. None of the custom TLS settings are implemented yet, nor are things like Digest Auth or OAuth.