A sketch of RDF about the Solid environment

Hello, friends,

I recently discovered the Solid initiative and community, and I must say it was a pleasant surprise. I had no idea that anyone was approaching Semantic Web technologies in a practical way these days.

I would like to introduce a knowledge graph (RDF) about the Solid environment. It shows how facts can be described compactly and kept on a POD.

Here is the repo: prozion/solid-kgr: An RDF knowledge graph over Solid platform environment - Codeberg.org

And here is RDF on my POD: SolidOS Web App

The project follows my belief that knowledge can be coded just like an algorithm in a programming language, and that RDF processing can be done on the user side, without relying on heavy centralized SPARQL endpoints (which come and go all the time).

I guess this project may raise questions and doubts on several points, because it is perhaps not a familiar way of doing things in SW. So I would appreciate feedback and discussion.

Also, please excuse my English, as I haven’t communicated with English-speaking human beings for quite a long time :)


Cool stuff. Any particular reason why you are not using people's WebIDs as subjects? Also, the WebIDs are missing the http(s) protocol, and mine is also missing the #me fragment.


Thank you for the reply, Angelo!

Yes, the WebIDs without https: are a known issue (are there still http:// ones anywhere in practice?). I am going to fix it asap.

As for not using WebIDs as subjects, and in general avoiding indexes as subjects the way databases and large ontologies do: I like to label things as close to their names as possible (for a person, ‘Name_Surname’). It is like having a sane name for a variable or function in a programming language: you can then use it in different parts of the project just by remembering it.

Another reason: I don’t write in Turtle format but use a custom one (prozion/clj-tabtree: Library to work with OWL ontologies, taxonomies and data encoded in Tabtree format. - Codeberg.org), where each expression fits on one line (lines are easier to manipulate in a code editor than blocks of code as in Turtle). So I have to keep lines short, and object ids must not be too long. WebIDs, as identifiers of persons, tend to appear as objects in other triples, and they are quite long.

Of course, this raises the problem of duplicated names. But hopefully in small, controlled knowledge graphs this is not a big problem. For other cases, namespaces (prefixes) could be a solution.
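To illustrate the prefix idea, here is a minimal Python sketch of how short, name-like ids could stay short in the source while still resolving to distinct full IRIs. The namespaces, the `expand` helper, and the `prefix:Name_Surname` form are all assumptions made up for this example; this is not Tabtree syntax or anything taken from the repo.

```python
# Hypothetical namespaces, invented for illustration only.
PREFIXES = {
    "team": "https://example.org/team#",
    "forum": "https://example.org/forum#",
}

def expand(short_id, prefixes=PREFIXES):
    """Expand 'prefix:Local_Name' into a full IRI; return anything else unchanged."""
    prefix, sep, local = short_id.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return short_id

# Two people with the same readable name no longer collide once prefixed.
a = expand("team:John_Smith")
b = expand("forum:John_Smith")
```

Ids that are already full IRIs pass through untouched, because their scheme (for example `https`) is not a registered prefix.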

Nice, @denis. We’re a Clojure shop currently storing information as JSONB because there seems to be no precedent for porting centralized data to distributed PODs.

I would love to hear more about your thoughts on how this work would operate in a real-world Solid application. So you think that Tabtree would help with the management of personal data? As far as I understand, Solid doesn’t really promise us SPARQL because you’re really dealing with a filesystem frontend that has linked data underneath. So Tabtree gives us both readability for the individual and machine parsability of individual/combined files?

Hi @schmudde,

Honestly, I didn’t think much about Tabtree in an app-centric way.

What I needed when I developed it was the ability to assemble large-enough RDF files, collecting data from ground zero, and then make SPARQL queries over them.

I am not sure about migrating existing data from centralized SQL databases into PODs…

But thinking about it… Given big static RDF data files covering various concerns (gathered from Solid app GUIs, built from Tabtree, or dumped and converted from SQL), what I think would be nice to develop is a way to pick just the referenced nodes from them, thus building a custom RDF (= graph) on the fly, depending on the assembly configuration.

Looking deeper into the implementation: it could read an RDF file into a Clojure map where the key is the subject id, and then filter it, keeping only the part of the map that contains the referenced ids. Do this over all the assembled RDF files, merge, serialize the resulting map back to RDF, and load it into a triple store for SPARQL, inferencing, or other standard operations, depending on the tasks of the app.

But the “source of truth” will be the static RDF files, located on one or several PODs in whatever directories (containers).
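The read, filter, merge, and serialize steps described above could look roughly like the sketch below. It is written in Python rather than Clojure purely for illustration, with a plain dict standing in for the Clojure map. The function names, the sample triples, and the `referenced` set are assumptions for the example. Parsing real RDF files and loading the result into a triple store are left out.

```python
from collections import defaultdict

def to_subject_map(triples):
    """Group (subject, predicate, object) triples into {subject: [(predicate, object), ...]}."""
    subject_map = defaultdict(list)
    for s, p, o in triples:
        subject_map[s].append((p, o))
    return dict(subject_map)

def keep_referenced(subject_map, referenced_ids):
    """Keep only the entries whose subject id is in referenced_ids."""
    return {s: pos for s, pos in subject_map.items() if s in referenced_ids}

def merge_maps(maps):
    """Merge several subject maps, dropping duplicate (predicate, object) pairs."""
    merged = defaultdict(list)
    for m in maps:
        for s, pos in m.items():
            for po in pos:
                if po not in merged[s]:
                    merged[s].append(po)
    return dict(merged)

def to_triples(subject_map):
    """Flatten a subject map back into a list of triples, ready to serialize."""
    return [(s, p, o) for s, pos in subject_map.items() for p, o in pos]

# Hypothetical data standing in for two already-parsed static RDF files.
people = [
    ("Alice_Smith", "rdf:type", "foaf:Person"),
    ("Bob_Jones", "rdf:type", "foaf:Person"),
]
projects = [
    ("Alice_Smith", "foaf:currentProject", "Some_Project"),
    ("Some_Project", "rdf:type", "doap:Project"),
    ("Other_Project", "rdf:type", "doap:Project"),
]

# In a real app this set would come from the assembly configuration.
referenced = {"Alice_Smith", "Some_Project"}

parts = [keep_referenced(to_subject_map(f), referenced) for f in (people, projects)]
graph = to_triples(merge_maps(parts))
```

Here `graph` ends up holding only the triples about the referenced subjects, while the input lists (the stand-ins for the static files) are left unmodified.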