Is RDF "hard"?

dynamoRando · January 28, 2022, 4:45pm

I just looked at your links. I’ve got a pretty busy day ahead of me, so I don’t know if I’ll be able to get to mentally boot all of these into my head, but I’m glancing over at Shape Expressions based off of your links, since I’m not familiar with that concept either.

Is it possible to take a ShapeEx => Language Of Choice?

I see that you linked a project to take ShapeEx => Typescript; would it be possible to do for other languages?

For me, the other nice thing would be, if someone’s already kind of done the modeling from RDF to Object (which I’m not sure if that’s the whole point of ShapeEx, again, kind of in a hurry at the moment), then it’d be nice if I could publish that work in a format that others could use in their own projects so that their time is not also spent trying to map vocabularies to objects in their own work.

naturzukunftSolid · January 28, 2022, 4:51pm

I feel I know exactly what you mean. That is why I started with a kind of “OM”.

Sample: AsObject.java

And a more concrete one: PostalAddressLoa.java

And these annotated POJOs I was able to convert:

	ModelCreator<?> mc = new ModelCreator<>(myPojo);
	Model model = mc.toModel();

The model is then easy to convert, as in the post above.
The other way:

	Converter<MyPojo> c = new Converter<MyPojo>(MyPojo.class);
	MyPojo result = c.fromModel(subject, model).get();

This was my first draft, which did not work all that badly.
But I very quickly came to the conclusion that this was the wrong way to go. This may work quite well if you have very concrete objects. But then I give up the advantages of RDF.

I’m not sure yet how I will work in the long run, probably it will be a mixture.
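To make the object-mapping idea above concrete, here is a minimal TypeScript sketch of the same round trip. It does not use the ModelCreator or Converter classes from the post. The PostalAddress interface, the field-to-predicate table, and the example IRIs are assumptions made up for illustration; only the two schema.org property IRIs are real.

```typescript
// A statement with IRIs for subject and predicate and a plain string value.
type Triple = { subject: string; predicate: string; object: string };

// Hypothetical plain object, standing in for an annotated POJO.
interface PostalAddress {
    streetAddress: string;
    postalCode: string;
}

// Field-to-predicate table; in the Java version, annotations would carry this.
const ADDRESS_PREDICATES: Record<keyof PostalAddress, string> = {
    streetAddress: 'https://schema.org/streetAddress',
    postalCode: 'https://schema.org/postalCode',
};

// Object -> triples: one triple per mapped field.
function toTriples(subject: string, address: PostalAddress): Triple[] {
    return (Object.keys(ADDRESS_PREDICATES) as (keyof PostalAddress)[]).map((field) => ({
        subject,
        predicate: ADDRESS_PREDICATES[field],
        object: address[field],
    }));
}

// Triples -> object: returns undefined when a mapped field is missing.
function fromTriples(subject: string, triples: Triple[]): PostalAddress | undefined {
    const result: Partial<PostalAddress> = {};
    for (const field of Object.keys(ADDRESS_PREDICATES) as (keyof PostalAddress)[]) {
        const match = triples.find(
            (t) => t.subject === subject && t.predicate === ADDRESS_PREDICATES[field],
        );
        if (match === undefined) return undefined;
        result[field] = match.object;
    }
    return result as PostalAddress;
}

const subjectIri = 'https://my-pod.example.com/addresses/1';
const triples = toTriples(subjectIri, { streetAddress: 'Foo bar 1', postalCode: '12345' });
const roundTripped = fromTriples(subjectIri, triples);
console.log(triples.length, roundTripped);
```

Note that fromTriples silently ignores any statement whose predicate is not in the table. That may be one way to read the concern above: a fixed object shape is convenient, but it discards whatever else the graph says about the same subject.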

NoelDeMartin · January 28, 2022, 5:13pm

In theory yes, in practice you’ll have to implement it yourself. You’ll find that this is the answer to a lot of questions in Solid. The opportunities are huge in theory, but in practice there is a lot of work to be done.

I think in principle that’s the idea of what my Soukai library does. To use your example with a Customer, you can do this:

class Customer extends SolidModel {

    static fields = {
        firstName: FieldType.String,
        lastName: FieldType.String,
        shippingAddressUrl: FieldType.Key,
        billingAddressUrl: FieldType.Key,
    };

    public shippingAddressRelationship() {
        return this.belongsToOne(Address, 'shippingAddressUrl');
    }

    public billingAddressRelationship() {
        return this.belongsToOne(Address, 'billingAddressUrl');
    }

}

class Address extends SolidModel {

    static fields = {
        streetName: FieldType.String,
        stateOrProvince: FieldType.String,
        postalCode: FieldType.String,
        countryName: FieldType.String,
    };

}

And this is how you use it:

// To create a new customer...
const customer = new Customer({ firstName: 'John', lastName: 'Doe' });

customer.relatedShippingAddress.create({ streetName: 'Foo bar', ... });

await customer.save();

// To retrieve an existing customer (a different name avoids redeclaring `customer`)
const existingCustomer = await Customer.find('https://my-pod.example.com/customers/123');

console.log(`Hello, ${existingCustomer.firstName}!`);

The code above works in practice, that’s what I’m doing in my apps (take it as pseudo-code though, I haven’t run this code).

Potentially, my library could be combined with shapes so that you don’t have to define all those classes and fields manually. You could generate them from a local shex file for development, or from a url if you want to generate models at runtime. That’d be similar to what shex-codegen is doing in practice today (I think, I haven’t used this library myself). But I don’t have any plans to do this anytime soon, I haven’t found the need to declare models by hand so hard. If anything, I prefer it because it gives me more flexibility.
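As a rough illustration of what "generating models from shapes" could mean, the sketch below derives field definitions from a hand-written, heavily simplified shape description. Everything in it is an assumption: SimpleShapeProperty is not a real ShEx structure, FieldKind is a local stand-in rather than Soukai's FieldType, and the vocab.example.org IRIs are invented. It calls neither Soukai nor shex-codegen.

```typescript
// Hypothetical, heavily simplified stand-in for a parsed shape: each entry
// names a predicate IRI and the kind of value expected. Real ShEx schemas are
// richer (cardinalities, value sets, nested shapes) and need a proper parser.
interface SimpleShapeProperty {
    predicate: string;
    valueKind: 'literal' | 'iri';
}

// Local stand-in for a model library's field types.
type FieldKind = 'String' | 'Key';

// Derive a field name from the part of the IRI after the last '/' or '#'.
function fieldNameFor(predicate: string): string {
    const cut = Math.max(predicate.lastIndexOf('/'), predicate.lastIndexOf('#'));
    return predicate.slice(cut + 1);
}

// Literal-valued properties become string fields; IRI-valued properties
// become key fields with a "Url" suffix, mirroring the classes above.
function fieldsFromShape(shape: SimpleShapeProperty[]): Record<string, FieldKind> {
    const fields: Record<string, FieldKind> = {};
    for (const property of shape) {
        const base = fieldNameFor(property.predicate);
        if (property.valueKind === 'iri') fields[`${base}Url`] = 'Key';
        else fields[base] = 'String';
    }
    return fields;
}

const customerShape: SimpleShapeProperty[] = [
    { predicate: 'https://vocab.example.org/firstName', valueKind: 'literal' },
    { predicate: 'https://vocab.example.org/lastName', valueKind: 'literal' },
    { predicate: 'https://vocab.example.org/shippingAddress', valueKind: 'iri' },
];

const customerFields = fieldsFromShape(customerShape);
console.log(customerFields);
```

For this input the result has the fields firstName, lastName and shippingAddressUrl, which matches part of the hand-written Customer class. The naming rule is the fragile part, and a hand-declared model keeps that decision in the developer's hands.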

dynamoRando · January 28, 2022, 5:30pm

That looks really sweet.

Thank you @NoelDeMartin and @naturzukunftSolid both for your comments. I’ll need to dig into all this material this weekend, I’ve got meetings today.

Based off of the responses, I have a sneaking suspicion that I emotionally will land in the same spot as when people were building ORMs and NoSQL solutions on top of, or away from, SQL… The personal lesson I learned was that, after it was all said and done, it’s often easier to just get super in tune with the tech than to try to build abstractions on top of it. Not that there isn’t a place for abstractions; just that there exists a threshold of diminishing returns based on what you’re trying to do.

kidehen · January 28, 2022, 8:46pm

Here’s the reply I posted to Hacker News.

What you’ve outlined above is a common misconception about RDF. Here’s another way of looking at RDF:

An abstract data definition language for structured data represented as entity relationship graphs. Basically, the same Entity-Attribute-Value model that’s existed for eons plus the addition of formalized identifiers (i.e., IRIs) for denoting entities, attributes, and values.

That’s it!

RDF Schema, OWL Ontology, etc. are simply Data Dictionaries comprising terms that describe Entity Relationship Types.
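The Entity-Attribute-Value reading can be sketched in a few lines of TypeScript. The type and helper names here (Statement, iri, literal, toNTriples) are made up for illustration, as are the my-pod.example.com IRIs. The serializer follows the general N-Triples layout but only escapes backslashes and double quotes, so treat it as a sketch rather than a conforming writer.

```typescript
// An RDF-style statement: entity (subject), attribute (predicate), value (object).
// Entities and attributes are IRIs; values are either IRIs or literals.
type IriTerm = { kind: 'iri'; value: string };
type LiteralTerm = { kind: 'literal'; value: string };
type Statement = { entity: IriTerm; attribute: IriTerm; value: IriTerm | LiteralTerm };

const iri = (value: string): IriTerm => ({ kind: 'iri', value });
const literal = (value: string): LiteralTerm => ({ kind: 'literal', value });

// Serialize one statement in N-Triples style.
// Simplification: only backslashes and double quotes are escaped.
function toNTriples(s: Statement): string {
    const term = (t: IriTerm | LiteralTerm): string =>
        t.kind === 'iri'
            ? `<${t.value}>`
            : `"${t.value.replace(/\\/g, '\\\\').replace(/"/g, '\\"')}"`;
    return `${term(s.entity)} ${term(s.attribute)} ${term(s.value)} .`;
}

const john = iri('https://my-pod.example.com/customers/123');
const statements: Statement[] = [
    // A literal value...
    { entity: john, attribute: iri('https://schema.org/givenName'), value: literal('John') },
    // ...and a value that is itself an identified entity.
    {
        entity: john,
        attribute: iri('https://schema.org/address'),
        value: iri('https://my-pod.example.com/addresses/1'),
    },
];

const lines = statements.map(toNTriples);
console.log(lines.join('\n'));
```

The second statement is where the graph comes from: its value is another entity's identifier, so following it leads to more statements about that entity.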

Summary

Solid project seems most focused on a Grand Vision where its success hinges on tackling enormous complexity and widespread adoption of standards and ecosystem projects.

There’s no gradual path to adoption with stable well-productized milestones all along the roadmap and options to choose from. No way for people to get acquainted with Solid without going all-in and facing the brunt of the complexity.

There’s little focus on the process of software development, how Solid fits, and what benefits it brings in the short term (i.e. before the Grand Vision is realized).

While there is a deep technical focus, there seems to be almost a business myopia, as to all the process and design best-practices that are also needed to create interoperable apps that satisfy people’s needs. Social aspects of development are neglected.

Recommendations:

Focus on all the practical things that help average developers leverage Solid technology right now in their apps. Tools, documentation, different languages supported, etc. And with these people now invested in Solid, entice them towards deeper community involvement.

Ensure that not just tech is covered, but that Process is as well. How do I design my app with reasonable expectations for interoperability? How can and should I collaborate with others, and what organization structure and tools can we offer to help with that?

I notice the tendency to deep-dive into code, and I understand that as you are all coding apps and libraries. I am just chiming in as an outside observer going deliberately beyond the technical. And I may see this all wrong, of course, but I’m writing this down in hopes it may be useful, not to be critical.

Developer experience

Some have said that RDF is actually pretty simple, but that’s just when looking at the format and how it presents information. However, presenting meaningful, machine-readable information that is interoperable with other apps is where it gets tricky and where - according to the HN comment - “layers upon layers of complexity” are added and “no good library and tool support in general” exists. As @NoelDeMartin mentions, as soon as a standard is not implemented in your language you have to create it from scratch.

My view: Unless you don’t care about broad interoperability, things become pretty hard pretty quickly on a technical level.

Semantic Interoperability

Why go through all this trouble? The answer is the potential ease of interoperability, i.e. of others reusing and building upon your data model without having to closely cooperate with you to formalize what that looks like, right? The work has been done, and a common vocabulary is up for grabs.

@jeffz outlines a process for doing so, i.e. first go to the most widely used ontology at schema.org and then move on to the LOV ontology search engine and find more ontologies to fill the gaps. This focus on the Process is a big step in the right direction, and I feel that it is exactly the process part that gets overlooked by the more tech-oriented people in the community.

@anon85132706 wrote in another topic:

I don’t know about that, but I argue that the process above only partly solves interoperability issues. By choosing from common ontologies we’ve only increased the potential for our data to be reused in interoperable ways by others, and without being super duper careful while doing so we may have increased the chances that that happens in the wrong way and will lead to problems down the road.

My view: Without a very well-defined and deep understanding of the process, and without following it properly, the technical deep-dive may not be a worthwhile exercise at all.

What do I mean here? Given the example above, I have a certain semantic interpretation of a “Customer” as it exists in my app. Now I start to look for a representation in a common ontology that I think matches my idea of the concept. If someone else using the same ontology has a slightly different interpretation, we have a semantic mismatch. I think these mismatches are very easy to make, and everyone makes them all the time and at all different places in the interconnected graph. I can query with SPARQL all I want, but the further I get from my application boundary the more meaningless the data becomes. I cannot trust the results to be useful, unless the common understanding of their meaning is still guaranteed.
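A toy TypeScript version of such a mismatch, under invented assumptions: two hypothetical apps type their people with the same made-up class IRI (https://vocab.example.org/Customer) but mean different things by it. The filter function stands in for a SPARQL query over the merged data. Only the rdf:type IRI is real.

```typescript
type Fact = { subject: string; predicate: string; object: string };

const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';
// Hypothetical shared vocabulary term that both apps reuse.
const CUSTOMER = 'https://vocab.example.org/Customer';

// App A (a shop) means "someone who has placed at least one order".
const shopData: Fact[] = [
    { subject: 'https://shop.example.com/people/alice', predicate: RDF_TYPE, object: CUSTOMER },
];

// App B (a newsletter) means "anyone who signed up", buyer or not.
const newsletterData: Fact[] = [
    { subject: 'https://news.example.com/people/bob', predicate: RDF_TYPE, object: CUSTOMER },
    { subject: 'https://news.example.com/people/carol', predicate: RDF_TYPE, object: CUSTOMER },
];

// Stand-in for a query: every subject typed as a Customer.
function customersIn(graph: Fact[]): string[] {
    return graph
        .filter((f) => f.predicate === RDF_TYPE && f.object === CUSTOMER)
        .map((f) => f.subject);
}

// The query is perfectly valid over the merged graph...
const merged = [...shopData, ...newsletterData];
const allCustomers = customersIn(merged);

// ...but only one of the three results matches the shop's meaning of "Customer".
console.log(allCustomers.length, customersIn(shopData).length);
```

Nothing in the data flags the problem. The statements are well-formed and use the agreed term, so the mismatch only shows up when someone acts on the results.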

Of course, having really well-designed ontologies helps here. And some of these exist, where their concepts are so ubiquitous that they are hard to misinterpret. But it is as @dynamoRando says:

In reaction to @anon85132706 a reference was made by @justin to Solid Application Interoperability, a new draft standard that builds upon other standards, and aims to address Problems and Goals for Interoperability, Collaboration, and Security. That is great, but in the short term it brings us full circle to adding more “layers upon layers of complexity” and “lack of tool support”.

The Faces of Solid

Some personal impressions, that may be wrong. To me it seems like there are 2 tracks to the Solid project that are intertwined:

1) The practical path. Defining ways to have secure personal data vaults that are decoupled from their application, and where the linked data in them increases the potential for semantic interoperability (lowers the barrier to reuse). Comprehensive, mostly existing standards suffice.
2) The Big Vision. Broadscale, seamless interoperability with ease. Building on 1) with many additional specs and layers of complexity, but abstracted away by a rich and mature ecosystem. A rebooting of the Semantic Web.

You can very well build apps that align to 1) currently and then RDF will be “manageable”, but if you want to also comply or evolve to 2) then RDF is “hard” and you should accept years of ‘growing pains’ of the ecosystem yet to come. While 1) is within reach now, the visionary path 2) may never be reached. It requires broad acceptance of the standards and widespread adoption for it to materialize.

To me it seems that 2) The Big Vision is what the core team is most interested in, and I sometimes wonder if there shouldn’t be much more attention to a gradual transition path from track 1) to track 2) and a clearer distinction between both. It feels as if track 1) is neglected as a valid approach for people who just want to make a start with Solid technology.

After all that is what this whole thread is about. We get smacked around the ears with all the inter-related and still very much emerging standards and their low-level technical implementations.

Process

Back to the process. What do we try to do? Make delightful software that addresses people’s needs!

The whole “what RDF should I have?” is much further down the line of the software development lifecycle. It is a technical concern. And so too is the answer to the question “How interoperable does my application need to be?”, which ultimately translates to technical requirements.

In all this technical onslaught we forget that software development is in large part a social process.

I find this whole notion mostly missing from this community, and frankly so too do I miss that in the SocialHub community (dedicated to evolving the Fediverse) where I am more active. (It is I think the reason why many apps when adding federation support seem to me like they just added Microblogging capabilities. Because that’s the technical interop infrastructure that the first popular app Mastodon put in place. And with a pure tech focus it seems that that should be “plugged into”.)

Where is the Process? We can code as much as we want, but without proper process how do we know we build the right thing? That we translate real needs into code? Even more so, how - without attention to process every step of the way - do we know that track 2) of The Big Vision will eventually even support the processes we need? That things will live up to their promise?

I feel that Solid project would be much stronger if it provided good stepping stones towards the big vision, with production-ready deliverables all along the way and proper guidance in place to go from one stone to the next. That way it can take an ever growing developer base along its trajectory, instead of just the most adventurous developers that are in for the long haul.

Domain modeling

Stepping away from Solid for a bit, and looking at software development. In general we analyse stakeholder needs to learn about the business domain, then we create a business model - which has both a particular data structure as well as business rules / logic - and only then we start our elaboration into ever more technical details. It can be iterative and agile, but we always do this, even when not being explicit about it.

Strategic Domain Driven Design is a more explicit method. It tries to capture the common or Ubiquitous Language of the business and then split it into Bounded Contexts, which are like the scope where the semantic meaning of concepts is well-defined. A “Customer” in the Ordering context isn’t the same as the “Customer” in the Shipping context. And that makes them manageable.
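A small TypeScript sketch of the two-contexts point, with invented field names: each context gets its own Customer type, and moving between them is an explicit translation rather than a shared class. The types and the toShippingCustomer helper are illustrative assumptions, not taken from any DDD library.

```typescript
// Ordering context: a customer is whoever pays for an order.
interface OrderingCustomer {
    customerId: string;
    billingEmail: string;
}

// Shipping context: a customer is whoever receives a parcel.
interface ShippingCustomer {
    customerId: string;
    recipientName: string;
    deliveryAddress: string;
}

// Crossing the boundary is an explicit translation. Only the identifier is
// shared; the shipping details must be supplied by the Shipping context.
function toShippingCustomer(
    customer: OrderingCustomer,
    recipientName: string,
    deliveryAddress: string,
): ShippingCustomer {
    return { customerId: customer.customerId, recipientName, deliveryAddress };
}

const buyer: OrderingCustomer = { customerId: 'c-123', billingEmail: 'john@example.com' };
const recipient = toShippingCustomer(buyer, 'John Doe', 'Foo bar 1');
console.log(recipient);
```

The billing email never reaches the Shipping side, and the compiler rejects passing an OrderingCustomer where a ShippingCustomer is expected. A shared ontology term offers no comparable check on meaning.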

If we want to fall back on using well-known ontologies to express ‘universal semantics’ we are not only reusing a data format, but we are reusing all the social interaction that led to their design. We have to understand and adhere to the business domain they were defined for. Or, alternatively, we have to have these layers of complexity in place that can parse the universal semantics in machine-readable fashion.

This is not the first time I wrote about this. In Aligning efforts in LD schema / ontology design + adoption I quoted from Kevin Feeney:

“[..] the big problem is that the well-known ontologies and vocabularies such as foaf and dublin-core that have been reused, cannot really be used as libraries in such a manner. They lack precise and correct definitions and they are full of errors and mutual inconsistencies [1] and they themselves use terms from other ontologies — creating huge and unwieldy dependency trees.”

I feel that it is worthwhile to ask what we want to achieve with Solid in terms of benefits, both in the short and in the long term, and pay more attention to where it sits in the process of going from people’s needs to actual code.

dynamoRando · January 29, 2022, 9:54pm

Long post ahead.

Wow, there’s a lot here to unpack. I want to be mindful that I’m not sucking the oxygen out of the thread, but there are some tangential thoughts that I have overall. I’m also going to do a little bit of self promotion as well.

Topics

Item Number	Item
1	Developing With RDF
2	Solid Project Execution / Dev Experience
3	What problem are we trying to solve?

I’m going to skip the first item for a moment and come back to it.

Solid Project Execution

In response to @aschrijver:

I mentioned this in my introduce yourself post, but I’m coming back to the Solid project after reading about it a few years back.

A few years ago, I could see that the Solid project was just getting off the ground, and as I poked around, didn’t see initially a whole lot that I was wanting before jumping in (more in a second); so I figured I’d give it a year or two as the project matured. As mentioned previously, I was kind of expecting more DevRel.

I’m going to skip defining what DevRel is; but having been used to that experience with big tech companies (Microsoft, Apple, Google) - I’ve been spoiled with an abundance of:

Docs
Video walk-throughs
Social media exposure

Since I’ve been mostly exposed to Microsoft technologies, I’ll refer to what I’m familiar with when getting booted with any new technology that Microsoft introduces (or takes over).

If I need …

To figure out how to use a feature, I can almost always find both a technical walkthrough as well as an architectural one on Microsoft Docs. This is going to sound dumb, but I love the documentation. It’s really good in my view.
To understand a general walkthrough, I can almost always find a corresponding YouTube video, either by the community or by MS employees (example 1 and 2) that walks me through technology item X.
To ask questions: there’s almost always a Discord server I can hop into to ask questions; and/or there’s usually MS people on Twitter that I can tweet at and usually get a response. Or alternatively, Stack Overflow.

The point is: there’s a lot of channels to flatten the learning curve.

Now, I understand this is not a fair comparison. Microsoft is a $2 trillion company. They can afford to have an army to accomplish all this.

And I’m not saying that the Solid project isn’t doing this. What I am saying is there’s a chance to enhance and enrich what is already out there.

For example, I’m really grateful for the walkthrough that @jeffz gave to my earlier question in the thread:

I’d love for that mental walkthrough to be available in the documentation, or done in a YouTube video, etc. Anything that is a channel for others to not have to go through what I just did in this thread, which is: “How do I even start this?” And yes, I could continue to Google for more examples, but in my view, just put it on the SolidProject.org documentation pages: one less click and Google search to have to do, reducing the time to learning and code for any developer who’s come to SolidProject.org and is just trying to “figure it all out.”

I’m probably wrong on all this, and maybe I’m just not good at being educated and finding answers on my own; but that’s something that has been lurking around in my head for the past few days.

What problem are we trying to solve?

This is something I’ve been thinking about for a while.

At the root, we are trying to give internet citizens more authority over their data. Data breaches, identity theft, surveillance capitalism, I think we all agree these are “bad things.”

When I say giving citizens authority over their data, I define this to mean:

Giving users the ability to define where data about them is stored. When I sign up at Amazon.com, my order history, shipping address, etc. are stored on their servers. Why can’t I have my data on my server?
Giving users the ability to define access rights to their data. If I don’t want Amazon to have access anymore to my data, why can’t I revoke it? (There’s a big disclaimer here, which I’ll acknowledge in a second.)
Determining where my data can be used. If I write up a post on Facebook, and then later decide I don’t like Facebook, why can’t I just move my data to Twitter? Or, a more common example: if I decided I don’t like Spotify, how can I move my playlists to Apple Music? This topic, as I’m sure others are aware of, is about data portability or interoperability.

As mentioned, there is a big disclaimer on item #2 from above: access rights to data. Once someone has read data about you, even once, all bets are off on how it’ll be used. This is something that Solid admits as well, and to be clear, this problem already exists today. Now, it’s been suggested that perhaps the way to solve the above is by legal measures, but that’s a topic I don’t want to go into right now.

Now we come to the first topic I opened with and that others have already pointed out…

Developing with RDF

I appreciate the clarification here. I didn’t think about this, but it’s true that I don’t even have to try and make it work with everyone else…

I also appreciate this, because in theory, if I want to put in the minimum amount of effort to claim “Hey, my app gives users freedom of data!” I wouldn’t be wrong… I’d just be a pain because I didn’t put in my effort to make it use a well known vocabulary. But that is an advantage of RDF.

Before I came back to Solid this month, I had been thinking about ways to give users authority over their data. A few years back (by my estimation, about the same time that Solid and Inrupt were announced), rather than implementing the solution via protocols, I wanted to solve it at the data layer. Specifically, I thought the questions of where data is stored and who has access rights could easily be solved if we simply modified the way existing relational databases worked. So I’ve spent the last few years attempting to write my own database system with no prior systems programming experience. As you can imagine, it has not gone well. I did a write-up of the high level concept here on my blog:

https://dynamorando.com/blog/aboutdrummerdb/

Which itself is about the third or fourth time I’ve tried to implement this idea I had of “cooperative database systems.” The repo is a paper that I wrote (rather poorly) of a bunch of ideas that I had on how to maybe make this work.

The advantage in my view was this: Depending on implementation, it would give developers less of a surface area to have to grasp. My intent was that if we just introduce a few new concepts and new SQL keywords, we could make this work quickly for existing programmers today. Developers already understand SQL, we just need to alter the data tier a little bit to make it work.

The disadvantage to this is that we still don’t really make data portable. I also had thought about this, and in my database implementation I actually tried to hand wave this problem away by making the database schema by default public. Either by publishing the full database schema to all participants of the database system, or by making it a public read-only endpoint that could be queried.

Back now to RDF and to parts of why I started this thread: I’m all for using something that already works, and if RDF works, I’m all for instead directing energy towards something that already works. As someone once said, it’s better to steal someone else’s good ideas rather than spending your time inventing your own bad ones.

So as of today, I’m still trying to understand whether the lack of adoption of RDF is because it’s not actually hard but just hasn’t been advertised well, or because the tooling isn’t there, or because there’s no external impetus, i.e. data protection laws that would drive tech toward more adoption of something like RDF.

So far, thanks to all of you wonderful folks on this thread, I’m coming 'round to the notion that -

RDF, and triples in particular, are foundational concepts like rows, columns, tables, and keys in a database system; and just like anything new, these concepts take a little bit of getting used to
RDF is not hard so long as you don’t care about making it interoperable
If you do want to make it interoperable, it will take a little bit of effort, but this can be overcome over time for a developer with experience (and maybe with more advertising on the internet)
There’s plenty of tooling available to work with RDF - depending on how you want to work with RDF. If you want to work with it natively, there are probably already libraries available. If you want to abstract it away to work with it in a more object oriented format for example, there may be some tooling there, but not fully fleshed out.

If you’ve made it this far, thanks again, and thanks again for your patience and for everyone’s sharing of ideas and encouragement.

aschrijver · January 30, 2022, 7:27am

Great post! I am not sure about DevRel wrt Solid, but that is just because the strategy of Solid vs. Inrupt regarding positioning and finding adoption in ‘the market’ is not clear to me. That’s okay, they don’t need to share that with me, and there may be commercial interests that make other audiences more interesting to pitch to. But I can’t help but wonder if Solid couldn’t have a way bigger, way more active developer community by now. That is followed by the thought that maybe Solid is still at too early a stage of development to spend much time on ‘DevRel’, or that I am just not in the audience that’s interesting for Inrupt and other companies / institutions involved. Maybe the intention is to find uptake through corporate, governmental and academic channels mostly. I don’t know.

Yes, @jeffz is among the group of very helpful community members. Such help is so important to get people started. Thank you, Jeff!

Yes! I’ve had several similar discussions on ‘product’ positioning in the past. The Solid website has been significantly improved and now states more clearly what Solid is about. The things you mention provide a simple, comprehensive and compelling offering that could be one of those ‘stable stepping stones’ I mentioned, that can entice technologists to become passionate for Solid and stay on as long-term technology advocates building all kinds of ecosystem tools in different languages.

Compared to not so long ago the various specification documents have been much improved, so I am hopeful that such a stable starting point is indeed forged. Due attention then also needs to be given to some of the DevRel-related activities you mention.

Thanks @NoelDeMartin for reminding us about your great article on Interoperable serendipity. In the discussion at the time I mentioned the challenges that I experience promoting this kind of interoperability for the Fediverse. A big issue here is that developers tend to focus most on their own apps and take what is available to them, expand a bit on that, but then do not spend the time to make their extension easily accessible to others. The result is what I call Ad-hoc Interoperability. Over time this creates a patchwork of slightly incompatible apps that make interoperability harder and harder.

This is what I described above. But now we’ve only created the potential for reuse. As your article states this is not enough, and will not work if something extra isn’t added. Those imho are social aspects, collaboration, etc. that are missing. And this is something I feel as an issue for Solid Project to tackle too, as there’s imho too much of “If we just have all these specs then Interoperability is a solved problem”.

Btw, I came upon a much better, more universal, term for Lens-based interoperability, namely interconnectivity. I was introduced to it by @steffen who started Iconet Foundation that is dedicated to the concept.

On the whole I think that social aspects are crucial, not just within the software development lifecycle of one project, but across an entire ecosystem. And that focusing on formation of a healthy “social fabric” is key to interoperable serendipity. I am involved in an initiative that is not yet launched called Social Coding that intends to explore these social aspects for the Free Software Development Lifecycle (FSDL) and leverages the Fediverse (a Peopleverse!) as the social fabric to do so. This will be a crowdsourced initiative.

megoth · February 3, 2022, 11:59pm

Reckon https://github.com/w3c/EasierRDF should be of interest to people interested in this topic.

aschrijver · February 4, 2022, 5:51am

That is truly very interesting. Thank you @megoth.

kidehen · February 4, 2022, 7:33pm

Here is a link to a post about N-Tuple (Tables) and 3-Tuple (RDF Graph) Entity Relationship Types (Relations):

Why Graph Database is a confusing moniker
