Request for Comments: CRDTish approach to Solid

Hi there!

I’m working on a new Solid app, and I’ve decided to follow an offline-first approach. I’ve been doing some reading, and something I’ve come across multiple times is CRDTs (I also read about them in this forum a while ago).

I’m building a recipes manager, intended for use by a single user at a time. So the only type of synchronization I care about is storing offline changes on multiple devices and synchronizing them when they’re back online, not real-time collaboration.

Learning about CRDTs has given me some ideas on how I could do this in Solid. But I don’t think my approach can strictly be considered a CRDT, because the server won’t be running a CRDT node; it’ll just be “dumb storage”. That’s why I’m calling it CRDTish, so keep that in mind.

I wanted to share my solution here (which is a work in progress) in order to get some feedback.

My Solution

I am using Soukai, a library I built for working with Solid. You don’t need to know anything about Soukai, other than it uses the Active Record design pattern.

Internally, this library keeps track of the changes made to each model, and sends them to the POD upon saving. I thought a good solution to this problem would be to store operations describing each update alongside the changes themselves.

For example, if I make a new recipe called “Ramen”, and later on I change the name to “Jun’s Ramen”, this is the information I’d have stored in the POD:

Current State: { name: "Jun's Ramen" }
History:
    [T0] { name: "Ramen" }
    [T1] { name: "Jun's Ramen" }

(you can find the full example using Turtle at the end of this post)

The idea is to maintain the same format for the model data so that other applications continue understanding it, whilst adding some metadata that my app would use for CRDT merging.

In addition to the changes, operations would also store the time using Hybrid Logical Clocks (check the references at the end).

This metadata would also include a checksum of all the known operations, to avoid unnecessary processing for models that are already up to date, and a checksum of the model data, to detect changes made by other applications. When such a change is detected, my app would create a new operation with the diff since the last known state.

Concerns

These are some concerns I have with my current solution:

  • Data overhead: If you look at the example that follows, you’ll notice that there is a lot of overhead. As far as I know this is a common issue with CRDTs, but given my use-case I don’t think it will be a problem (and I could implement some algorithms to squash the history later on).

  • Interoperability: Other applications will understand the data, given that the main resource is still the same. But if they start modifying the data as well, some things could break down. I’ve already thought about it using the checksum and creating new operations in my app, but timestamps will be messed up and there could be other issues.

  • Custom Vocab: I haven’t found an existing ontology for this, given how custom it is. That isn’t such a big problem, since I can create my own vocab, but I’m reluctant to do so because it’s likely that only my apps (or apps using Soukai) will understand it.

  • Modeling Operations: Operation resources have both semantic properties (like the time, or rdfs:type) and the changes that happened to the model. I’m not sure this makes sense, because it means I’m saying, for example, that a certain operation has properties of the model. Would it make sense to have yet another block of data, let’s call it a changeset, that contains only the model properties, without any other operation metadata? I’m also only using update operations in this example, but I’ll also need other operations like add/remove if I work with lists in the model (for example, ingredients).

  • Complexity: I can’t help but wonder if I’m overthinking this. I just wanted to make an offline-first app and I ended up here; I’m not sure how far down the rabbit hole I should go. But this looks like something that could be useful in the future if I want to tackle more complex use-cases, so I’m exploring to see where this takes me.

Example

(Imagine that T0, T1 and T2 are Hybrid Logical Clock timestamps, or just timestamps)

In my app:

// at T0
const recipe = await Recipe.create({
    name: 'Ramen',
    description: 'Ramen is delicious',
});

// at T1
await recipe.update({ description: 'Ramen is life' });

// at T2
await recipe.update({
    name: "Jun's Ramen",
    description: 'Instructions: https://www.youtube.com/watch?v=9WXIrnWsaCo',
});

In the server, ramen.ttl at T0:

@prefix : <#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix schema: <http://schema.org/> .
@prefix soukai: <https://vocab.soukai.js.org/> . # This doesn't exist yet!

:it
    a schema:Recipe ;
    schema:name "Ramen" ;
    schema:description "Ramen is delicious" .

:it-metadata
    a soukai:ModelMetadata ;
    dc:subject :it ;
    soukai:created "T0" ;
    soukai:modified "T0" ;
    soukai:modelChecksum "hash(:it properties)" ;
    soukai:operationsChecksum "hash(T0)" ;
    soukai:history :it-operation-0 .

:it-operation-0
    a soukai:ModelOperation ;
    soukai:time "T0" ;
    schema:name "Ramen" ;
    schema:description "Ramen is delicious" .

In the server, ramen.ttl at T1:

@prefix : <#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix schema: <http://schema.org/> .
@prefix soukai: <https://vocab.soukai.js.org/> . # This doesn't exist yet!

:it
    a schema:Recipe ;
    schema:name "Ramen" ;
    schema:description "Ramen is life" .

:it-metadata
    a soukai:ModelMetadata ;
    dc:subject :it ;
    soukai:created "T0" ;
    soukai:modified "T1" ;
    soukai:modelChecksum "hash(:it properties)" ;
    soukai:operationsChecksum "hash(T0+T1)" ;
    soukai:history :it-operation-0, :it-operation-1 .

:it-operation-0
    a soukai:ModelOperation ;
    soukai:time "T0" ;
    schema:name "Ramen" ;
    schema:description "Ramen is delicious" .

:it-operation-1
    a soukai:ModelOperation ;
    soukai:time "T1" ;
    schema:description "Ramen is life" .

In the server, ramen.ttl at T2:

@prefix : <#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix schema: <http://schema.org/> .
@prefix soukai: <https://vocab.soukai.js.org/> . # This doesn't exist yet!

:it
    a schema:Recipe ;
    schema:name "Jun's Ramen" ;
    schema:description "Instructions: https://www.youtube.com/watch?v=9WXIrnWsaCo" .

:it-metadata
    a soukai:ModelMetadata ;
    dc:subject :it ;
    soukai:created "T0" ;
    soukai:modified "T2" ;
    soukai:modelChecksum "hash(:it properties)" ;
    soukai:operationsChecksum "hash(T0+T1+T2)" ;
    soukai:history :it-operation-0, :it-operation-1, :it-operation-2 .

:it-operation-0
    a soukai:ModelOperation ;
    soukai:time "T0" ;
    schema:name "Ramen" ;
    schema:description "Ramen is delicious" .

:it-operation-1
    a soukai:ModelOperation ;
    soukai:time "T1" ;
    schema:description "Ramen is life" .

:it-operation-2
    a soukai:ModelOperation ;
    soukai:time "T2" ;
    schema:name "Jun's Ramen" ;
    schema:description "Instructions: https://www.youtube.com/watch?v=9WXIrnWsaCo" .

References

I’ve read/watched other resources, but these are the ones I found most useful. If there’s anything I’m missing that you think I should check out, let me know!


So, what do you think? Does it make sense? Am I missing something? Am I overengineering for my use case?

All feedback is welcome!


I’m a beginner, so I might just be missing something, but it seems to me that by being CRDTish it ends up being primarily a list of operations, without any guarantees against corruption from concurrent edits (changes on multiple devices), or from those concurrent edits arriving in different orders?

I’m still struggling to get my head around CRDTs (and the difference from operational transforms), but based on this example (https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpe.5670) I would add insert/delete counters to the main data model, keep an append-only operation broadcast log in a separate document, and then reconstruct the document to detect untracked changes to the main data model?

Really great to see you working on this - looking forward to following how you tackle it.

I’m not much beyond a beginner either, I started learning about this some weeks ago, so I may be missing something as well :sweat_smile:. But here’s how I understand it.

I say it is CRDTish because the Solid POD is not a CRDT node, it’s only a “dumb store”. But this same architecture could potentially be used for nodes communicating among themselves, and that would be a proper CRDT. Although that’s not a use-case I’m considering at the moment.

There cannot be any concurrent edits, because even if two operations happen at the same time, Hybrid Logical Clocks make each event unique and sorted chronologically, so the latest operation wins. I’m still a bit fuzzy about the clocks, but worst-case scenario I’ll just use normal timestamps. That’s normally not advisable for real-time collaboration, because you can’t trust the local timestamps of different devices in a distributed system, but for my use-case I think it’s acceptable.
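For reference, here’s roughly how I understand the clock working. This is a minimal sketch loosely following the design popularized in “CRDTs for mortals”; the class name and the timestamp format are my own assumptions, not an existing library’s API:

```javascript
// Minimal Hybrid Logical Clock sketch. Timestamps are strings that sort
// chronologically with plain string comparison; the node id breaks ties
// between devices, so every event gets a unique, totally ordered stamp.
class HybridLogicalClock {
    constructor(nodeId) {
        this.nodeId = nodeId;
        this.millis = 0;
        this.counter = 0;
    }

    // Generate a timestamp for a local event: use the wall clock when it
    // moved forward, otherwise bump the logical counter.
    now(wallClock = Date.now()) {
        if (wallClock > this.millis) {
            this.millis = wallClock;
            this.counter = 0;
        } else {
            this.counter++;
        }
        return [
            String(this.millis).padStart(15, '0'),
            String(this.counter).padStart(5, '0'),
            this.nodeId,
        ].join('-');
    }

    // Merge a timestamp seen from another device, so that our next local
    // timestamp is guaranteed to sort after it even if our wall clock lags.
    receive(remoteMillis, remoteCounter) {
        if (remoteMillis > this.millis) {
            this.millis = remoteMillis;
            this.counter = remoteCounter;
        } else if (remoteMillis === this.millis) {
            this.counter = Math.max(this.counter, remoteCounter);
        }
    }
}
```

The zero-padding is what makes naive string sorting behave like chronological sorting, which is handy when the timestamps end up as literals in Turtle.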

Most of this is inspired by the talk I linked in the references: CRDTs for mortals. If you want to dig deeper, he also did an interview on a podcast and goes more in depth: Building Distributed Local-First JavaScript Applications.

As I understand it, the main difference is that in Operational Transformations the operations can be transformed after they have been created, usually by a centralized server. With CRDTs, the operations are immutable and there is eventual consistency (a node with the same operations will have the same end state).
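That eventual-consistency property is the part that makes this approach attractive to me. As a sketch (the operation shape is illustrative, not Soukai’s internal format): a Last-Write-Wins map rebuilt by replaying immutable operations reaches the same state regardless of the order in which operations arrived, as long as the set of operations is the same.

```javascript
// Rebuild a model's state by replaying its (immutable) operations.
// Sorting by the unique HLC timestamp before applying makes the result
// independent of arrival order: same operations, same end state.
function replay(operations) {
    const state = {};
    const sorted = [...operations].sort((a, b) => (a.time < b.time ? -1 : 1));
    for (const operation of sorted) {
        Object.assign(state, operation.changes);
    }
    return state;
}

// The three operations from the Ramen example above.
const operations = [
    { time: 'T0', changes: { name: 'Ramen', description: 'Ramen is delicious' } },
    { time: 'T1', changes: { description: 'Ramen is life' } },
    { time: 'T2', changes: { name: "Jun's Ramen" } },
];
```

Replaying `operations` in any order produces `{ name: "Jun's Ramen", description: 'Ramen is life' }` (the T2 operation in the full example also changes the description; I’ve trimmed it here to show that untouched fields survive).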

I’m tracking everything in the same document because it’s easier, but technically speaking it doesn’t matter in which document the state/operations are stored; they are still linked with URLs through semantic properties (dc:subject and soukai:history in my example). The operations are effectively an append-only log, and the resource (:it in my example) is the reconstruction of the state through operations, but I need to store it so that other applications understand the data without looking at the operations.

About untracked changes, that’s why I’m using a checksum to see if the current state is the result of all the operations or that something else changed it.

Thanks!
Here are a few updated thoughts:

  • It looks like James Long just uses the server as a message store, and given that the merkle tree is an optional efficiency improvement, the server being a dumb store does not prevent this from being a true CRDT. It looks like you are indeed implementing a Last-Write-Wins map plus a grow-only store, with messages stored on Solid, which makes this a CRDT? I found the annotated version of James’ example useful: GitHub - clintharris/crdt-example-app_annotated (a fork of James Long’s CRDT example app, annotated with extra code comments and a NOTES.md).
  • With a LWW approach and an appropriate timestamp, writing to the document seems to be the main spot where concurrency could corrupt the data. In that context, I was actually suggesting using WebACLs to make the document append-only, guaranteeing that deletions are not possible and that the message list is indeed grow-only. Edits to soukai:history should only ever be inserts, and deletions would add tombstones for fields in new operations? Maybe you’re already doing this.
  • Both James’ example and yours so far change entire fields at once. For the description field this could be a problem, given that a user might make separate edits to different parts of the field. Automerge has a text object that records changes to individual characters (GitHub - automerge/automerge: a JSON-like CRDT that can be modified concurrently by different users and merged again automatically). Two alternatives seem to be making it clear in the UI that the field is atomic (e.g. a label that emphasises that this is “new replacement text”), or detecting conflicts between the operations (i.e. detecting that LWW actually had to pick one edit over another).
  • I suppose part of the attraction of formally adopting a CRDT (i.e. ensuring that the implementation meets required conditions) is that it then provides guarantees. I’m not sure how I’d approach testing of concurrent edits otherwise, given the edge cases.
  • I’m quite curious to see the performance of this approach and how it scales. It seems like storing and loading operation logs for each recipe will run into similar problems as with media kraken, and continue stress testing solid server performance…

Looking forward to the next update on your RSS feed :slight_smile:

Ok, that’s cool. So maybe it is a real CRDT after all :D.

I heard him talking about the server having a node, but I didn’t realize it was only a message buffer (or I forgot about it xD). That annotated repo looks useful, I’ll take a look.

Yes, this is already taken care of in Soukai, in theory. Every time Soukai makes an update, it deletes previous properties before adding the new ones, and if it tries to delete a property that doesn’t exist (meaning, that it was changed by someone else) it’ll throw an error.

Now, I say “in theory” because I haven’t really tested this much, and ideally I’d like to use ETags with the If-Match header instead.

About tombstones: something I don’t like about CRDTs is that data is kept forever, so if I delete a resource I’ll delete it for real. I’ll see how to handle it in the UI if there are any conflicts.

Yes, I looked at Automerge, but I think it’s too complex for my use-case (I’m already second-guessing whether I should be using CRDTs at all). And I think it’d be a lot harder to implement in Solid (I’d need to store the Automerge metadata on the POD). At least for the first version, I don’t think I’ll go beyond a LWW map.

Since I’ll have the entire history, I may add something in the UI to see the history so that information is not lost and users can “fix the merge” manually. But to be honest, I’m not sure if I’ll do even that in the first version.

Indeed. One of the biggest problems in Media Kraken, for me, is the initial loading, which takes ages (on mobile). Following this new approach, that initial loading will still take place, but it’ll happen in the background, so I’ll be able to use the application instantly. This is actually my biggest motivation for going offline-first; I don’t really experience connectivity issues, but I think it’s cool to take it all the way to offline-first :).

Thanks!

In case you’re curious, yesterday I published a video with a proof of concept. Under the hood, that’s already using a Solid POD :). The code is not published anywhere because I hard-coded a lot of things, but it’s cool to see that it works!

Great to see you’re looking into this. I haven’t prototyped anything yet, but I did think about it to some extent. The main conclusion I reached relates to this:

If you want such an approach, you’re going to have to go all in. That is, interoperability is only possible with other apps that also only store commands. You can periodically store snapshots when running into performance issues, but the source of truth is the command history, and any modifications done to the snapshot can and will be discarded unless they’re stored in the command log as well.

One other thing that’s interesting is that Resources could suffice with just Append (i.e. not Write) access.

The downside, of course, is that deleted data is always retrievable, and there’s significantly added complexity and potential failure modes.

Other terms that you might be interested in researching are event sourcing / command query responsibility segregation (CQRS).

In that thread @pukkamustard mentions Distributed Mutable Containers. Just wanted to add that this DMC spec has been progressing a lot since then, and now lives at different locations:

Very interested in these myself. In combination with Domain-Driven Design (DDD), which maps well to closed Linked Data vocabularies (acting as bounded contexts to model a particular business domain), and is a very good way to take non-technical folks along a software-design process, up to testable and very modular, maintainable codebases.

(Note, the Event Sourcing is optional. You see it used in many examples, but it adds a lot of complexity in the form of eventual-consistency issues and code that is harder to test. You can always start with CQRS and extend to ES later on.)


In theory I agree with this, and I wish it were possible. Maybe if CRDTs become more popular, and a vocabulary for CRDTs becomes as common as schema.org is today, that could be an option.

But in practice, in the state we’re in today, I don’t think that’s feasible. I could do it, sure, but it’d be synonymous with my app not being interoperable. Also, I don’t think following this approach is doing it half-way; if anything, I’m making it backwards compatible. An app aware of the operations would behave as expected, and what I mentioned about amending the history by adding a new operation with the diff doesn’t make it wrong.

Having said that, I haven’t looked into this a lot and it may come back to bite me in the future. But I think interoperability is one of the most important aspects that differentiates Solid from other technologies, and if users don’t start experiencing it, Solid won’t be any different from other solutions.

One of the reasons why I’m so adamant about this is that when @aveltens used Ramen, he told me that he was already using schema:Recipe for recipes in his POD, so that was a great experience for him (or something like that, correct me if I’m wrong xD).

I want to see more of that :).

I knew about event sourcing, and to be honest I think it’s almost the same as what I’m doing (or maybe I don’t know it well enough to tell the difference). CQRS is one of those buzzwords I’ve heard multiple times but don’t really know the meaning of, so I’ll check it out.

Thanks for the suggestions!

I suppose a possible test case is:

  • I make an edit offline that I cannot immediately sync
  • On another device, I decide to delete the item instead

In order for the deletion operation to win over the edit operation, or to detect the conflict, I think the deletion operation does need to be stored even if the properties are deleted for real?
I suppose the only “corruption” then is that a copy of (part of) the item still exists in the edit operation, unless you have a garbage-collection process?

I understand there are other CRDTs that handle deletions more elegantly but need to do some more reading…

I still have to think about this, but my current idea is that I’ll just show a message: “This recipe was deleted, but you’ve made changes to it. What do you want to do? DELETE IT | RESTORE IT”.

Which, again, doesn’t make it a CRDT, because that is a conflict :). But I think it’s important to delete data for real if you want to give people control over their data. As I said, I’d also like to eventually add a way to squash the history, with a similar UX (resolving conflicts manually).
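For what it’s worth, here’s roughly how I imagine the app deciding when to show that message. This is a sketch under my own assumptions about the operation shape, not actual Soukai code:

```javascript
// Decide what to do with a model given its operation log, for the
// delete-vs-edit test case discussed above. An update with a later HLC
// timestamp than the latest deletion means the user kept editing on
// another device, so the UI should surface a conflict.
function resolveDeletion(operations) {
    const deletions = operations
        .filter(operation => operation.type === 'delete')
        .sort((a, b) => (a.time < b.time ? 1 : -1)); // latest deletion first

    if (deletions.length === 0) {
        return 'alive';
    }

    const editedAfterDeletion = operations.some(
        operation => operation.type === 'update' && operation.time > deletions[0].time,
    );

    return editedAfterDeletion ? 'conflict' : 'deleted';
}
```

Note that this only works if the deletion is itself recorded as an operation, which is the point raised above: even when the model’s properties are deleted for real, the delete operation has to survive long enough to be compared against offline edits.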

Yes, I would always expect a Solid app to find and use my existing data as far as possible. Of course, there can be differences in the amount of data used/understood. Some apps might use more or fewer terms than others, but this should not prevent them from using as much as they can understand, and from conforming to existing conventions (like storing my recipes in a certain folder I already chose, instead of “inventing” a new one).


So, there was something bothering me: LWW is a state-based CRDT, not an operation-based one, yet James Long’s approach uses a message database that seems to list operations.

I think I’ve now got my head around it: basically, with a state-based CRDT the messages could be deleted after the replicas have updated; unless one wants to store history, the messages are not actually a long-term part of the CRDT. A state-based CRDT just involves merging two states to create a new one (see the CRDT Glossary on Conflict-free Replicated Data Types).

So instead of treating the Solid pod as a message database, it should actually be possible to just make it a replica, and the key issue is the granularity of edits.

Both an ETag and modified timestamps provide an ordering of edits, so just checking those already provides a crude LWW register at the level of a document.
Implementing the LWW register simply involves not replacing the document if our edit is older.

We would prefer to do this at the level of a triple or a record, which means we need a timestamp at that lower level for LWW. However, the fact that we have the timestamp at the higher level could still provide a level of robustness to other applications that wouldn’t store the more granular timestamps.
In terms of implementation, the CRDT state-merging logic either needs to be embedded in the SPARQL update query, or live in the client app, which then pushes the updated document to the pod. The latter approach seems potentially easier if ETags are available and concurrent edits are not too frequent.
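The document-level register is simple enough to sketch. This is illustrative (a real client would compare ETags or dc:modified values fetched from the pod, and the shape of the replica object is an assumption):

```javascript
// Document-level Last-Write-Wins register: when two replicas of the
// same document disagree, keep the one with the newer timestamp.
// merge(a, b) === merge(b, a), and merging a replica with itself is
// idempotent, which is what makes this a valid state-based CRDT merge.
function mergeDocuments(local, remote) {
    return remote.modified > local.modified ? remote : local;
}
```

The client-side version of this would fetch the remote document, merge, and push the winner back with If-Match so a concurrent write fails loudly instead of being silently overwritten.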

It seems that the existing RDF CRDT implementations don’t use LWW, so I’m still planning to do some more reading, but thought I’d share what I’ve learnt.
