I’ve been working on m-ld, a software component for live information sharing [1]. It uses RDF as its native data representation, and CRDTs for eventual consistency. So far I’ve been focused on making m-ld generally applicable and useful in many architectures, whether RDF-based or not, by working top-down from a JSON API.
I’m aware of the need for convergence though, and I’d like to start a deliberate project to lean in to Solid with m-ld: to make interoperation with Solid seamless.
This could have lots of upsides, for example working directly or indirectly on:
Multi-collaborator editable Pod data (e.g. for private group data, as an extension of personal data)
Local-first offline editing of Pod data (e.g. [2])
Reliable caching of Pod data at the edge (e.g. [3])
A generally applicable patch distribution model for Pods [4]
An authorisation and identity model for m-ld (currently delegated to the owning app [5])
I have in mind to apply for grant funding for this project. But I think the support of the Solid community is even more important.
Is this timely?
Would anyone like to work on this with me?
Would anyone like to work on this with me if we can get funding?
Thanks to Sarven Capadisli’s prompt on solid/chat, I have searched more of the Solid community for motivation and prior work in this area. I’m sure the references below are not yet comprehensive, but it’s a start.
Opinion
In the corpus of documented use-cases and requirements, support for live collaborative editing of Pod data appears only by allusion; it is not called out as a motivation for the Solid specification. However, engineers have made calls for patch-passing data distribution mechanisms, for strong reasons which do include enablement of user features. This is a disconnect.
There is an accelerating trend towards remote collaboration in software, including live collaborative editing becoming table stakes for many kinds of user content; as well as version control with branching and merging, for others. I think engineers are aware of this, but it’s hard to capture feature requirements for new applications that users don’t have in front of them yet. This is especially a problem when other motivations for patch-passing are to do with performance and resilience, things that are only perceived by the user when they’re broken.
Would it be fair to say that integration of m-ld with Solid would involve using the Pod as a message queue + persistent clone that would need to be updated by a client?
More generally in the context of the issues you highlighted from solid/specification, in your opinion what specific support from the spec is needed for live data sharing to make sense? Is what’s there currently sufficient?
It would definitely make sense to persist data in a Pod – it would be a natural division of labour to use m-ld for a live document, and Solid for the long-term data availability.
With my current understanding of Solid I would be very careful about proposing to use a Pod as a persistent message queue. Message queueing is hard to make robust, scalable and performant. As evidence I present Apache Kafka, RabbitMQ, Eclipse Mosquitto – all substantial long-term projects dedicated only to that task.
Further, to me it dilutes the value of Solid to have implementation artefacts like operation messages in a Pod. You might be able to hide them away using access control, but they have very different lifecycle needs to the characteristic “personal state” data. So having the right interfaces to manage such data will bloat the Solid spec and complicate the vision of switching apps without switching storage (storage independence).
I did discuss this with Noel and his needs were not for multi-actor editing but for offline support. While these could be boiled down to the same computer science problem, his engineering needs seemed quite tractable to him, so he decided to go ahead with messages-in-Solid.
So that’s a big question, which I was hoping to answer with the project. And the answer will almost certainly be “it depends!” There are lots of possible architectural choices with different trade-offs in liveness, resilience and complexity, which apply not only to m-ld + Solid but also to any CRDT + any personal data store.
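To make the "any CRDT" part concrete, here is a minimal sketch of a state-based CRDT, a two-phase set, in TypeScript. It is only an illustration of the convergence property and is not m-ld's algorithm; all names (`TwoPhaseSet`, `addValue`, `removeValue`, `mergeSets`, `liveValues`) are made up for this example.

```typescript
// Two-phase set (2P-Set): a simple state-based CRDT.
// Illustrative only; hypothetical names, not m-ld's API or algorithm.
type Flags = { [value: string]: true };

interface TwoPhaseSet {
  added: Flags;   // every value ever added
  removed: Flags; // tombstones: a removed value stays removed
}

function emptySet(): TwoPhaseSet {
  return { added: {}, removed: {} };
}

function addValue(s: TwoPhaseSet, value: string): void {
  s.added[value] = true;
}

function removeValue(s: TwoPhaseSet, value: string): void {
  // Only values this replica has seen can be removed
  if (s.added[value] === true) {
    s.removed[value] = true;
  }
}

function unionFlags(a: Flags, b: Flags): Flags {
  const out: Flags = {};
  for (const k of Object.keys(a)) out[k] = true;
  for (const k of Object.keys(b)) out[k] = true;
  return out;
}

// Merge is a union of both components, so it is commutative,
// associative and idempotent: replicas converge whatever the order.
function mergeSets(a: TwoPhaseSet, b: TwoPhaseSet): TwoPhaseSet {
  return {
    added: unionFlags(a.added, b.added),
    removed: unionFlags(a.removed, b.removed),
  };
}

function liveValues(s: TwoPhaseSet): string[] {
  return Object.keys(s.added)
    .filter((v) => s.removed[v] !== true)
    .sort();
}

// Two replicas edit independently, e.g. while offline
const alice = emptySet();
const bob = emptySet();
addValue(alice, "task-1");
addValue(alice, "task-2");
addValue(bob, "task-2");
addValue(bob, "task-3");
removeValue(bob, "task-2");

// Merging in either direction gives the same visible state
const mergedAtAlice = liveValues(mergeSets(alice, bob));
const mergedAtBob = liveValues(mergeSets(bob, alice));
console.log(mergedAtAlice, mergedAtBob);
```

Because the merged state does not depend on delivery order, a store that only ever holds the latest merged snapshot could be enough for this kind of CRDT. Operation-based designs trade that simplicity for smaller messages, which is one of the trade-offs mentioned above.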
I can only say that the lowest-complexity option I can intuitively come up with does not require much that is new from the spec as I understand it. If you want it to be robust and still achieve storage independence, then tying down Solid’s consistency model (e.g. spec ticket and ticket) would certainly help.
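As a rough illustration of why the consistency model matters when a client persists a snapshot, here is a sketch of a conditional write in the style of HTTP `If-Match` preconditions. The store is an in-memory stand-in, not a Solid server; `InMemoryStore`, its methods, and the example URL are hypothetical, and whether a given Pod server behaves this way is an assumption.

```typescript
// In-memory stand-in for a resource store that honours If-Match-style
// preconditions. Hypothetical names; this is not a Solid client API.
interface Stored {
  body: string;
  etag: string;
}

interface PutResult {
  status: number; // 201 created, 200 replaced, 412 precondition failed
  etag?: string;
}

class InMemoryStore {
  private resources: { [url: string]: Stored } = {};
  private counter = 0;

  get(url: string): Stored | undefined {
    return this.resources[url];
  }

  // When ifMatch is given, the write only succeeds if the stored
  // version still carries that ETag; otherwise it is unconditional.
  put(url: string, body: string, ifMatch?: string): PutResult {
    const current: Stored | undefined = this.resources[url];
    if (ifMatch !== undefined && (current === undefined || current.etag !== ifMatch)) {
      return { status: 412 };
    }
    this.counter += 1;
    const etag = `"v${this.counter}"`;
    this.resources[url] = { body, etag };
    return { status: current === undefined ? 201 : 200, etag };
  }
}

const pod = new InMemoryStore();
const resourceUrl = "https://pod.example/notes/doc1"; // hypothetical URL

// Initial snapshot; both clients then read the same version
const created = pod.put(resourceUrl, "snapshot A");
const etagSeenByBoth = created.etag;

// Client 1 persists first and succeeds
const first = pod.put(resourceUrl, "snapshot from client 1", etagSeenByBoth);

// Client 2 still holds the old ETag, so its write is rejected
// instead of silently overwriting client 1's snapshot
const second = pod.put(resourceUrl, "snapshot from client 2", etagSeenByBoth);

console.log(first.status, second.status);
```

On a rejected write, the second client would re-read, merge its clone's state with what is stored, and retry. If the store gave no such guarantee, the client would need another way to detect a lost update, which is where a tied-down consistency model would help.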
I’m sorry to report that there may be some delay in addressing the ideas in this topic.
Dear Applicant,
We are sorry to inform you that, after going through the Evaluation process described in the Guide for Applicants (Section 4), your proposal has not been selected to take part in the Support Programme of NGI Pointer.
Your proposal has been evaluated by 2 recognized experts, who assessed the potential of your project. Your proposal failed to pass the overall threshold of 10 points.
Find below the final score and comments provided by those evaluators as feedback.
Final Score of your proposal: 9,5 out of 15 points.
Sorry to hear that! There definitely seems to be a fair bit of work required for Solid to offer a credible and usable solution in this space. Hopefully there’ll be other opportunities…
That is indeed a bummer @gsvarovsky … so close! Next time I’m sure you’ll make it. I find your project very interesting, and hope to look more closely in near future if time permits.
Wanted to mention some CRDT-related information resources (which I’m sure you are already aware of, since you’ll be presenting together with @pukkamustard at the next NGI event, Semantic web and metadata solutions):
Many thanks Arnold. I have come across the DREAM project before but I have not yet been in contact. There’s no time like the present though… @pukkamustard, it would be great to sync up, so to speak! I’ll DM you on the Dream forum.
I created a post at SocialHub in the topic Querying ActivityPub collections where I mentioned Meld as interesting (possible solution?), but I cannot gauge the extent to which this is true. Maybe you’d like to jump in and provide some more info @gsvarovsky?