Is RDF "hard"?

Preface: I’m going to conflate two different things here, Solid and RDF/Linked Data, even though I know they are distinct: Solid is a spec built on many open standards such as RDF/Linked Data. (I know it’s also a suite of libraries, etc., but I’m simplifying here.)

I agree that there’s likely a gap in selling the idea: both of RDF and Solid. And perhaps I’m wrong on the latter, but in my view a lot of people who come to the Solid project already have the sense that things are “wrong” with the internet and need to be fixed. That’s okay - the problem is trying to sell that notion to the rest of the "developer-verse."

In my view the target audience is the rest of the developers who are writing apps without really thinking about the social impact of what they create. Either you have to convince more of the developers in the world to think about these things up front, or you solve this at a technical level and make it easier, such that the toolkit they write apps with has these protections built in, or both. And finally, you have to just continually advertise these ideas and prove them out with demonstrative and impactful products. The goal of that last point is to make “new” concepts (yes, I know the Semantic Web is not a new idea) less abstract, so that developers get a warm fuzzy feeling of “ok, I see how this works and why I should do this, I can also do this.” Yes, it’s marketing when it all gets boiled down, but to me that’s just the nature of things.

I popped into SolidWorld last month intending to only watch, but when I heard that there’d be an open Q&A session/working group I decided to stick around. I asked if there was an active DevRel effort being made, and I don’t think the question was really understood. That was probably my fault: I had some technical issues at the time, and I’m not always good at explaining myself.

As I’m sure people who are on this forum are already aware: good ideas are not automatically implemented. And ideas are easy, production is hard.

I’m game for talking more about this, if others are interested.

Finally, I also want to state that this isn’t intended to be critical of what Inrupt and the Solid project are doing right now. I enjoyed the SolidWorld presentation this month and think those things are really great. I’m just of the view that there’s a really good potential to expand in that area: maybe a “Solid Developers” YouTube channel, a Twitter account, etc. And while I have personal problems with those companies, the point is sustainable and accessible channels for “educational advertisement” of the production and usefulness of Solid/open standards.

EDIT: I guess what I’m saying is there needs to be more product champions.

EDIT EDIT: Sorry, the coffee is kicking in. To me, the question for any developer sitting down at their keyboard to start a project becomes: Why should I use Solid and develop my app to use Solid principles over what I already know? Why not the MEAN stack, or SAFE stack, or any other stack really? In my view, if you can shift the general conversation to “I really should be writing my app in a ‘Solid’ manner most of the time,” then we’ve reached our goal. (And by Solid, I’m abstracting here to really mean: leveraging socially aware protocols, such as the Semantic Web, etc.)

2 Likes

It is funny. Just now there’s a cool HN thread, “The Block Protocol | Hacker News”, where one of the comments says “Oh look, someone reinvented semantic web again.” Further on it goes into “Why is RDF a bad thing?”, with this response:

  • Layers upon layers of complexity: Implementing CURIs alone is a non trivial task, although all that’s really needed to describe entities and attributes is 128bit UUIDs.

  • There is no good built-in way for authentication and trust.

  • Description Logic (the foundation of OWL) has a fundamentally prescriptive philosophy, which makes it inappropriate for most practical applications.

  • No good library and tool support in general, due to the complexity.

  • Blank Nodes

  • No good consistency mechanism for distributed data generation, and Quads (having multiple graphs) don’t properly solve this.

  • Using human readable entity and attr id’s leads to more bike-shedding and accidental collisions than it’s worth.

  • High barriers to entry.

  • After years of developer disappointment the earth is pretty much salted.

Some of these refer to what we discuss here.

Yes, I feel the same as you @dynamoRando, and I also want to stress that there’s no criticism in my post; it’s mostly an encouragement to delve deeper into aspects that maybe deserve more attention.

1 Like

That is funny, and in my view, encouraging. It seems to me that everyone agrees that we need more interoperability, but we can’t all agree on how to solve it. In that sense, it’s probably because the problem is actually hard.

Solid is actually my first exposure ever to RDF. Prior to this I wasn’t even aware of the Semantic Web. This thread has been helpful in my understanding the existing … pain points around it. It makes me feel better with the notion that there is some friction behind leveraging it; at least to the point that it’s not widely adopted by developers.

I was mulling this over yesterday, and while I don’t have the energy, experience, and bandwidth to do this, I was thinking that what I wanted as a developer to work with RDF is something akin to Github Copilot.

Basically, if I acknowledge up front that the data in my application is likely not unique, then maybe I can have something magical build the RDF for me if I describe it “well enough.”

For example, building an online retail app is not a new problem. There’s customers, there’s products, there’s orders. I don’t want to sit down and figure out the RDF for all of that - I would just like to describe it and have the data model built for me. (With the understanding that I shouldn’t accept the defaults, but rather validate it.)

More specifically, I’d love it if the AI built for me the code base for a repository for my app to work with, and handled the RDF well enough that I didn’t have to figure out all the needed vocabularies, etc.

Kind of a wild idea, anyway.

1 Like

maybe I can have something magical build the RDF for me if I describe it “well enough.”

I don’t know if this qualifies as the kind of magical build you are talking about. But I have a library work-in-progress that lets you generate high level web page components, apps, and websites from declarative RDF. Basically you use SPARQL or several other methods to define the data you want and then pour it into one of many templates. See the demo and the repo.

Ironically, I produce less coding by using RDF itself declaratively where its human readability becomes a big plus.

2 Likes

That’s pretty interesting! Although I was thinking about it from the other direction, i.e. starting with the data itself.

So, for example, let’s say that I wanted to start building the online retailer app that I was describing earlier. Traditionally, I would say, ok, I need to store Customer information, and maybe model a Customer entity and supporting structures (this is for example purposes only) –

class Customer 
{
   string FirstName;
   string LastName;
   Address ShippingAddress;
   Address BillingAddress;
}

class Address
{
   string StreetName;
   string StateOrProvince;
   string PostalCode;
   string CountryName;
}

And then build some supporting data structures behind it in a data store of my choice (in my case, usually SQL) and just keep going from there.

When I say that “the data in my app is not really unique”; I mean just that: the concept of an Address is pretty well understood. So at least to me, it seems like I should easily be able to build this into an RDF document? This way, I can store it in a pod.

Except that… at the moment, I have no idea how to do that. In the Developers section of the Solid project, there are links to various vocabularies and some well known ones. It’s not immediately obvious to me which I should use. I assume I should use vCard for addresses? What about customer? I’ll need to use FOAF for name, right?

And so on. I admit I’m completely lost on this part, and this is the part that I wish to abstract away - I’d love to just feed a magical box my data entities, and have it figure out the supporting RDF for me, or at least take a first stab with recommendations on alternatives.

I’m definitely open to being educated on this. This is the other part I was referencing on: is this confusion natural, or is this just the way it works to get started using linked data? Do we just need to educate more developers on this? And so on.

1 Like

How I would do it: a first stop for common RDF tasks is schema.org. I go there, look up “customer”, and find a number of predicates. I poke around to see if schema.org has enough to cover what I need (it often does for common tasks). If something is missing, I go to the LOV ontology search engine and search for customer. I spend an hour or two poking around, looking at vocabularies that cover it. All in all I’ve spent half a day or a day researching ontologies and terms, which sharpens my understanding of the domain. After I’ve done this for several projects, the time it takes drops drastically.
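
To make the end result of that lookup concrete, here is a minimal sketch of what the Customer/Address example from earlier in the thread could look like once schema.org terms are chosen. The term choices (schema:Person, schema:givenName, schema:familyName, schema:address, schema:PostalAddress and its properties) are one plausible mapping, not the only valid one. The function names, the example IRI, and the "#shipping-address" fragment are assumptions made up for illustration, and the sketch assumes the customer IRI has no fragment of its own. Only one address is handled, to keep it short.

```typescript
// Plain application types, mirroring the Customer/Address example from this thread.
interface Address {
  streetName: string;
  stateOrProvince: string;
  postalCode: string;
  countryName: string;
}

interface Customer {
  firstName: string;
  lastName: string;
  shippingAddress: Address;
}

// Quote a value as a Turtle string literal, escaping the characters that need it.
function turtleLiteral(value: string): string {
  const escaped = value
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r");
  return '"' + escaped + '"';
}

// Serialize one customer as Turtle using schema.org terms.
// The caller chooses the customer IRI; the address gets a fragment IRI derived from it
// (a hypothetical naming choice for this sketch).
function customerToTurtle(customerIri: string, customer: Customer): string {
  const addressIri = customerIri + "#shipping-address";
  const address = customer.shippingAddress;
  return [
    "@prefix schema: <https://schema.org/> .",
    "",
    "<" + customerIri + "> a schema:Person ;",
    "  schema:givenName " + turtleLiteral(customer.firstName) + " ;",
    "  schema:familyName " + turtleLiteral(customer.lastName) + " ;",
    "  schema:address <" + addressIri + "> .",
    "",
    "<" + addressIri + "> a schema:PostalAddress ;",
    "  schema:streetAddress " + turtleLiteral(address.streetName) + " ;",
    "  schema:addressRegion " + turtleLiteral(address.stateOrProvince) + " ;",
    "  schema:postalCode " + turtleLiteral(address.postalCode) + " ;",
    "  schema:addressCountry " + turtleLiteral(address.countryName) + " .",
  ].join("\n");
}

// Usage: print the Turtle for a made-up customer.
const sampleCustomerTurtle = customerToTurtle("https://example.com/customers/1", {
  firstName: "John",
  lastName: "Doe",
  shippingAddress: {
    streetName: "1 Example Street",
    stateOrProvince: "Bavaria",
    postalCode: "82273",
    countryName: "Germany",
  },
});
console.log(sampleCustomerTurtle);
```

The application types stay as they are; only the serialization step knows about the vocabulary, so swapping schema.org for vCard or FOAF terms would be limited to that one function.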

3 Likes

Not really a whole lot different from developing ER diagrams for a database. And one is wise to put a bit of up-front effort into either a database or RDF.

1 Like

As Solid matures, there will be off-the-shelf templates for most common data structures.

1 Like

@jeffz Perfect!

That walk through is the part that I was missing. Oddly, when I was punching in “rdf vocabulary customer” into Google, Schema.org never came up. This is the kind of thing I wanted to be educated on, so thank you for that!

I hope that, as I get better booted into this space, the time it takes me to compose these things will come down too. That’s the part I’m trying to understand about whether doing this is “hard”: is it a lack of understanding/education, a lack of supporting technology, or just the nature of things? I agree that if something is going to be well done, it usually takes time.

@anon36056958 - That looks interesting! I’ll have to take a look over that when I get a chance.

1 Like

I think this is the terrain that TerminusDB is exploring. From the start they wanted to become the “Github for Linked Data”. I posted about them in this forum before, in “TerminusDB a delightful database for linked data”.

Since then they have come a long way, and launched their cloud-based TerminusX product:

I haven’t looked at that service in detail yet. The TerminusDB database is open-source, and this is obviously not. So it may become a FOMO + network effects de facto walled garden like Github over time, if successful.

A great project @jeffz, thanks for posting. To what extent does this depend on Solid-specific stuff vs. usable directly on any RDF compatible (lower-level) apps?

Wow, indeed. Thanks @anon36056958

1 Like

The issues you mention are some I’ve been thinking about for a while, so here’s my two cents on the topic :).

First, is RDF “hard”? I don’t think so. In fact, I think RDF is easy. I also came from a background similar to yours (I think), because I didn’t know anything about the Semantic Web and I was used to just creating an app with Laravel. But it wasn’t too difficult to understand how RDF worked. In particular, after reading the RDF primer and the RDF Schema spec I just saw RDF as a more general way of declaring an object-oriented mental model I already had. Along the way I’ve been learning more, and I realized that some of my initial assumptions were wrong. But overall, I’d say the mental model I got in the first weeks of learning RDF still applies.

But there’s a caveat to that. RDF is not hard; but choosing a vocabulary for your app is hard, and I think that’s where the issues arise. However, if you don’t even try to be interoperable, it’s very easy to create your own vocabulary. You just create a class and properties like you would in a normal object oriented programming language.
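
As a rough illustration of how little is needed for an app-specific vocabulary, here is a minimal sketch that emits RDF Schema declarations (rdfs:Class, rdf:Property, rdfs:domain, rdfs:range) as Turtle. The rdf:, rdfs: and xsd: namespaces are the standard ones; the "my:" prefix, the example namespace IRI, and the function and type names are assumptions invented for this sketch. It also assumes the class and property names are valid Turtle local names.

```typescript
// One property of the made-up vocabulary.
interface PropertySpec {
  localName: string; // e.g. "firstName"
  range: string;     // prefixed name of the value type, e.g. "xsd:string" or "my:Address"
}

// Emit a tiny RDF Schema vocabulary as Turtle: one class plus its properties.
function vocabularyToTurtle(
  namespaceIri: string,
  className: string,
  properties: PropertySpec[]
): string {
  const lines: string[] = [
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .",
    "@prefix my: <" + namespaceIri + "> .",
    "",
    "my:" + className + " a rdfs:Class .",
  ];
  for (const property of properties) {
    lines.push("");
    lines.push("my:" + property.localName + " a rdf:Property ;");
    lines.push("  rdfs:domain my:" + className + " ;");
    lines.push("  rdfs:range " + property.range + " .");
  }
  return lines.join("\n");
}

// Usage: a Customer class with two string properties and one link to an Address.
const customerVocabulary = vocabularyToTurtle("https://example.com/vocab#", "Customer", [
  { localName: "firstName", range: "xsd:string" },
  { localName: "lastName", range: "xsd:string" },
  { localName: "shippingAddress", range: "my:Address" },
]);
console.log(customerVocabulary);
```

This reads much like a class declaration with typed fields, which is the point being made above; the hard part is deciding whether an existing vocabulary already covers the same ground.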

If I don’t care about data portability, then I’m not going to use RDF, I’ll just continue to store my data in a SQL database, or a CSV, or whatever.

I don’t agree with this part, because RDF/Solid has an inherent advantage over SQL, CSV, or whatever. In Solid, even if your vocabulary is unique to your app, the data is available to users. So even if you made up a vocabulary and no other app uses it, the community will be able to start using your vocabulary, or implement tools to convert from your data to other formats, without your involvement. Using a traditional architecture, the data will be enclosed in your server and you need to implement a custom API if you want to expose it. I wrote a blog post talking about this, maybe you find it interesting: Interoperable serendipity.
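
To show what such a community-written conversion tool could look like at its simplest, here is a sketch that rewrites the predicates of a list of triples from one vocabulary to another. The Triple shape, the function name, and the "https://example.com/vocab#" terms are assumptions invented for the example; real conversions often need more than a one-to-one predicate swap (restructuring, datatype changes), which this sketch does not attempt.

```typescript
// A triple with IRIs (or literal values) kept as plain strings, for illustration only.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

// Return a copy of the triples where every predicate found in predicateMap is replaced
// by its target; predicates without an entry are left untouched.
function translatePredicates(
  triples: Triple[],
  predicateMap: Record<string, string>
): Triple[] {
  return triples.map((triple) => {
    const target = Object.prototype.hasOwnProperty.call(predicateMap, triple.predicate)
      ? predicateMap[triple.predicate]
      : undefined;
    return {
      subject: triple.subject,
      predicate: target !== undefined ? target : triple.predicate,
      object: triple.object,
    };
  });
}

// Usage: map a made-up app vocabulary onto schema.org terms.
const appToSchemaOrg: Record<string, string> = {
  "https://example.com/vocab#firstName": "https://schema.org/givenName",
  "https://example.com/vocab#lastName": "https://schema.org/familyName",
};

const convertedTriples = translatePredicates(
  [
    {
      subject: "https://example.com/customers/1",
      predicate: "https://example.com/vocab#firstName",
      object: "John",
    },
  ],
  appToSchemaOrg
);
console.log(convertedTriples);
```

Because the data sits in the user's pod rather than behind an app-specific API, a mapping table like this can be written by anyone who can read the data, not just the original developer.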

Now, having said that, I would agree that it’s more difficult for developers to get started with Solid/RDF. But I don’t think it’s because it’s more difficult to understand; it’s just that the community is smaller and not many people are focusing on developer experience. I can make a comparison to PHP. I remember, years ago, that there was a joke running around that PHP was dead and everybody hated working with it. But then Laravel came around, and now a lot of people love PHP. Sure, the joke is still going around, but Laravel has a thriving community with a lot of happy developers. I think the same could happen with Solid, but we haven’t got our Taylor Otwell yet (the creator of Laravel).

Personally, I have been working on some tools in that direction. But to be honest, I’m not really doing it to contribute to the community; I’m just working in the open, and that’s why I open source my code and document my libraries. But it takes me ages to finish anything because I’m just working on side projects and it’s not my intention to go full time on this at this point (I’ve been working on my latest Solid app for over a year now xD). In case you’re interested though, here’s the library I’ve been working on: Soukai Solid. And you can check out an app using it here: Media Kraken. Eventually, I’d like to publish a framework as well, allowing for the “framework new my-project” workflow that you mention. But it’ll probably take months or even years until that happens.

4 Likes

Hi all,
I have read @aschrijver’s thread on SocialHub and most of this one here. And now I’d like to separate the things discussed here into different topics. The title is ‘Is RDF “hard”?’, but most of the problems described in this thread are not really RDF problems, are they?

It’s more about finding existing vocabularies. And that’s independent of RDF.
So if you don’t have to worry about interoperability, is RDF still “hard”?

Did I understand it correctly:
We are talking about RDF itself, not a form of representation like JSON-LD, right?

3 Likes

Java sample to generate RDF:

	void generateSample() {
		
		Model model = new ModelBuilder()
				.subject("http://example.com/myTestObject")
					.add(RDF.TYPE, SCHEMA_ORG.PostalAddress)
					.add(SCHEMA_ORG.name, literal("John's address"))
					.add(SCHEMA_ORG.postalCode, literal("82273"))
					.add(SCHEMA_ORG.addressLocality, literal("Munich"))
					.add(SCHEMA_ORG.addressCountry, literal("Germany"))
				.build();
		
		Rio.write(model, System.out, RDFFormat.TURTLE);
	}

output:

<http://example.com/myTestObject> a <https://schema.org/PostalAddress>;
  <https://schema.org/name> "John's address";
  <https://schema.org/postalCode> "82273";
  <https://schema.org/addressLocality> "Munich";
  <https://schema.org/addressCountry> "Germany" .

And to create a database and save that model, I have to add:

		// create a database (inMemory)
		SailRepository repository = new SailRepository(new MemoryStore());
		
		// Save the model to the database
		try(RepositoryConnection con = repository.getConnection() ) {
			con.add(model);
		}

So using Java and rdf4j is not so bad.

OK, I admit that this is a simple example, and getting to it was hard for me too ;-) But it was worth it. A good year ago, “semantic web” was the headline of an article I skipped, and I had no idea what RDF was.

And now, about one year later: I love RDF.

2 Likes

Maybe one day there will be:

Repository repository = new SolidRepository("https://john.solidcommunity.net/");

:heart_eyes:

2 Likes

@ludwigschubi has been working on something along those lines, I think. You can check out his shex-codegen tool, and there are a couple of threads in this forum talking about it:

3 Likes

A great project @jeffz, thanks for posting. To what extent does this depend on Solid-specific stuff vs. usable directly on any RDF compatible (lower-level) apps?

Currently it can take as a ui:dataSource any RDF from any provenance, and it is specific to Solid only in the sense that if you are logged in, you can access private Solid materials. The RDF may be in the form of a Collection, or the library can gather a Collection from a SPARQL query. I am just now finalizing additional specific dataSources for RSS/Atom feeds, Wikidata & Internet Archive searches from which I munge RDF.

1 Like

That’s a fair point - the title of the thread is misnamed in retrospect due to my misunderstanding of the concepts.

To try and condense what is “hard”, at least for me at the start of this thread, and what I’ve learned so far:

| Item Number | Challenge | Remarks | Conclusions/Resolutions |
|---|---|---|---|
| 1 | When modeling entities in a new application, how do you find the vocabularies for it? | This is a requirement if and only if you want your RDF to be compatible. | Yes, this naturally takes time. Over time though, it may be easier as a function of experience and working with RDF. Toolkits exist and are being made to also help reduce the time to implement. |
| 2 | When coming from a traditional SQL relational model, how do you map to RDF triples? | RDF triples are a foundational concept; foundational in that you really need to understand them, just as you would take the time to understand how SQL tables, keys, columns and rows work. | This only takes time if you are new. The more education and experience that becomes available to developers, the more the time to implement this can be reduced. |
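
For item 2, a small sketch can make the SQL-to-triples step less abstract: each row becomes a subject, each column a predicate, and each cell an object, with foreign keys turning into links to other subjects. This is only loosely in the spirit of the W3C Direct Mapping and is not an implementation of it. The IRI scheme (base + table + "/" + key for rows, base + table + "#" + column for predicates) and all names below are assumptions chosen for illustration.

```typescript
type SqlValue = string | number | boolean | null;

interface MappedTriple {
  subject: string;      // IRI of the row
  predicate: string;    // IRI derived from table and column name
  object: string;       // IRI (for foreign keys) or the literal value as text
  objectIsIri: boolean; // true when the object points at another row
}

// Turn one table row into triples. NULL cells produce no triple, and the key column
// is only used to build the subject IRI.
function rowToTriples(
  baseIri: string,
  table: string,
  keyColumn: string,
  row: Record<string, SqlValue>,
  foreignKeys: Record<string, string> = {} // column name -> referenced table
): MappedTriple[] {
  const key = row[keyColumn];
  if (key === null || key === undefined) {
    throw new Error("Row has no value for key column " + keyColumn);
  }
  const subject = baseIri + table + "/" + encodeURIComponent(String(key));
  const triples: MappedTriple[] = [];
  for (const column of Object.keys(row)) {
    const value = row[column];
    if (column === keyColumn || value === null || value === undefined) {
      continue;
    }
    const predicate = baseIri + table + "#" + encodeURIComponent(column);
    const referencedTable = Object.prototype.hasOwnProperty.call(foreignKeys, column)
      ? foreignKeys[column]
      : undefined;
    if (referencedTable !== undefined) {
      // A foreign key becomes a link to the referenced row's IRI.
      const target = baseIri + referencedTable + "/" + encodeURIComponent(String(value));
      triples.push({ subject, predicate, object: target, objectIsIri: true });
    } else {
      triples.push({ subject, predicate, object: String(value), objectIsIri: false });
    }
  }
  return triples;
}

// Usage: one row of a made-up "customer" table.
const customerRowTriples = rowToTriples(
  "https://example.com/",
  "customer",
  "id",
  { id: 7, first_name: "John", shipping_address_id: 3, notes: null },
  { shipping_address_id: "address" }
);
console.log(customerRowTriples);
```

The generated predicates here are app-specific; replacing them with shared vocabulary terms is item 1 in the table.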

To your point @naturzukunftSolid, RDF is not new and there exist plenty of other frameworks for working with it. I’ve been using dotnetRDF as an example for myself in some of the learning projects I’ve been building.

Going back to kind of what I’d like to do (I’ve been evolving this in my head), given again the previous example:

Given a simple model for handling customers in an online retail website:

class Customer
{
   string FirstName;
   string LastName;
   Address ShippingAddress;
   Address BillingAddress;
}

class Address
{
   string StreetName;
   string StateOrProvince;
   string PostalCode;
   string CountryName;
}

Based off of what I’m reading here, if I could tell my code to infer what vocabularies to use (something similar to leveraging a standard library like in C++; a standard or “common” mapping to various vocabularies), and build the corresponding mapping at compile time (in C#, maybe using reflection or source generators), that would be helpful.

So, for example, just as the Solid website points out that there are well known vocabularies for common things; it would be nice if people could create various “bundles” of vocabularies that might fit a data model (online retail, etc.)

So what I’d like to do, code wise, is something like:

string retailerVocabularies = "Bundle of vocabularies to infer from goes here as a link, or something, that might commonly be used in online retail";

var rdfModeler = new RDFModeler().UseVocabularyBundle(retailerVocabularies);
rdfModeler.RegisterType<Customer>();
rdfModeler.RegisterType<Address>();

Essentially, I’m trying to reduce the implementation time of Item #1 in the table of challenges I mentioned above.

The modeler would use the bundle of vocabularies (maybe it defaults to trying to bind objects that seem to fit with common vocabularies like from Schema.org, w3.org, etc.) to build code at compile time that does the mapping for me - essentially producing the code that you just wrote in Java:

void generateSample() {
        Model model = new ModelBuilder()
                .subject("http://example.com/myTestObject")
                    .add(RDF.TYPE, SCHEMA_ORG.PostalAddress)
                    .add(SCHEMA_ORG.name, literal("John's address"))
                    .add(SCHEMA_ORG.postalCode, literal("82273"))
                    .add(SCHEMA_ORG.addressLocality, literal("Munich"))
                    .add(SCHEMA_ORG.addressCountry, literal("Germany"))
                .build();

        Rio.write(model, System.out, RDFFormat.TURTLE);
    }

I’d like to keep my objects as-is, so then if I had a repository object, I could just leverage it to use my RDF modeler. As an example (this code is obviously made up, but trying to build upon your code example):

// configure to save in-memory
// and pass my rdfModeler object from before
// to help it understand how to map things

var retailerRepository = new Repository(new MemoryStore()).ConfigureWith(rdfModeler);

And so I could continue to keep working with my objects as-is:

// skip actually initializing the Customer for this example; just trying to show that the customer
// object works "as-is" in code

retailerRepository.SaveCustomer(new Customer());

And the repository would leverage the rdfModeler to write out the model to the storage location, in this case, in-memory as an RDF document.

There are probably some things I’m overlooking, but that’s kind of where I’ve mentally landed on what would be nice. I’m sure that there’s likely a chance that this theoretical RDFModeler could get a bunch of things wrong, so ideally there’d also be an option to inspect the generated source code or to decorate your objects to override what you think the libraries should be, if needed.
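
To sketch what the "bundle of vocabularies" idea might boil down to, here is a minimal TypeScript version that does the mapping at runtime with a lookup table, rather than at compile time as described above. Everything in it is hypothetical: the bundle format, the function name, and the choice of schema.org terms for the retail example. Fields the bundle does not cover (or that are not plain strings, such as nested addresses) are reported back instead of being guessed, which is one way to keep the result inspectable.

```typescript
interface Statement {
  subject: string;
  predicate: string;
  object: string;
}

// A hypothetical "bundle": for each application type, the RDF class IRI and a
// field name -> predicate IRI table.
interface TypeMapping {
  classIri: string;
  properties: Record<string, string>;
}
type VocabularyBundle = Record<string, TypeMapping>;

interface MappingResult {
  statements: Statement[];
  unmappedFields: string[]; // fields the developer still has to decide on
}

const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

// Map a plain object to statements using the bundle entry for its type.
function mapWithBundle(
  bundle: VocabularyBundle,
  typeName: string,
  subjectIri: string,
  instance: Record<string, unknown>
): MappingResult {
  const mapping = Object.prototype.hasOwnProperty.call(bundle, typeName)
    ? bundle[typeName]
    : undefined;
  if (mapping === undefined) {
    throw new Error("No mapping registered for type " + typeName);
  }
  const statements: Statement[] = [
    { subject: subjectIri, predicate: RDF_TYPE, object: mapping.classIri },
  ];
  const unmappedFields: string[] = [];
  for (const field of Object.keys(instance)) {
    const value = instance[field];
    const predicate = Object.prototype.hasOwnProperty.call(mapping.properties, field)
      ? mapping.properties[field]
      : undefined;
    if (predicate !== undefined && typeof value === "string") {
      statements.push({ subject: subjectIri, predicate: predicate, object: value });
    } else {
      unmappedFields.push(field);
    }
  }
  return { statements, unmappedFields };
}

// A made-up retail bundle using schema.org terms.
const retailBundle: VocabularyBundle = {
  Customer: {
    classIri: "https://schema.org/Person",
    properties: {
      firstName: "https://schema.org/givenName",
      lastName: "https://schema.org/familyName",
    },
  },
  Address: {
    classIri: "https://schema.org/PostalAddress",
    properties: {
      streetName: "https://schema.org/streetAddress",
      stateOrProvince: "https://schema.org/addressRegion",
      postalCode: "https://schema.org/postalCode",
      countryName: "https://schema.org/addressCountry",
    },
  },
};

// Usage: "loyaltyTier" is not in the bundle, so it ends up in unmappedFields.
const mappedCustomer = mapWithBundle(retailBundle, "Customer", "https://example.com/customers/1", {
  firstName: "John",
  lastName: "Doe",
  loyaltyTier: "gold",
});
console.log(mappedCustomer.statements, mappedCustomer.unmappedFields);
```

A bundle like this is just data, so it could in principle be published and shared between projects; whether a generic bundle fits a given app well enough is exactly the part that would still need human review.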

I haven’t investigated all the links here, but it sounds like maybe others have tried to solve this as well?

I also want to thank everyone for their input and enthusiasm. It keeps me going, and I appreciate the insight into all of this.

2 Likes

@NoelDeMartin -

I just looked at your links. I’ve got a pretty busy day ahead of me, so I don’t know if I’ll be able to get to mentally boot all of these into my head, but I’m glancing over at Shape Expressions based off of your links, since I’m not familiar with that concept either.

Is it possible to take a ShapeEx => Language Of Choice?

I see that you linked a project to take ShapeEx => Typescript; would it be possible to do for other languages?

For me, the other nice thing would be, if someone’s already kind of done the modeling from RDF to Object (which I’m not sure if that’s the whole point of ShapeEx, again, kind of in a hurry at the moment), then it’d be nice if I could publish that work in a format that others could use in their own projects so that their time is not also spent trying to map vocabularies to objects in their own work.

I feel I know exactly what you mean. That’s why I started with a kind of “OM”.

Sample: AsObject.java

And a more concrete one: PostalAddressLoa.java

And I was able to convert these annotated POJOs:

	ModelCreator<?> mc = new ModelCreator<>(myPojo);
	Model model = mc.toModel();

The model is then easy to serialize, as in the post above.
The other way:

	Converter<MyPojo> c = new Converter<MyPojo>(MyPojo.class);
	MyPojo result = c.fromModel(subject, model).get();

This was my first draft, which was not working too badly.
But I very quickly came to the conclusion that this was the wrong way to go. This may work quite well if you have very concrete objects. But then I lose the advantages of RDF.

I’m not sure yet how I will work in the long run; probably it will be a mixture.

1 Like

In theory yes, in practice you’ll have to implement it yourself. You’ll find that this is the answer to a lot of questions in Solid. The opportunities are huge in theory, but in practice there is a lot of work to be done.

I think in principle that’s the idea of what my Soukai library does. To use your example with a Customer, you can do this:

class Customer extends SolidModel {

    static fields = {
        firstName: FieldType.String,
        lastName: FieldType.String,
        shippingAddressUrl: FieldType.Key,
        billingAddressUrl: FieldType.Key,
    };

    public shippingAddressRelationship() {
        return this.belongsToOne(Address, 'shippingAddressUrl');
    }

    public billingAddressRelationship() {
        return this.belongsToOne(Address, 'billingAddressUrl');
    }

}

class Address extends SolidModel {

    static fields = {
        streetName: FieldType.String,
        stateOrProvince: FieldType.String,
        postalCode: FieldType.String,
        countryName: FieldType.String,
    };

}

And this is how you use it:

// To create a new customer...
const customer = new Customer({ firstName: 'John', lastName: 'Doe' });

customer.relatedShippingAddress.create({ streetName: 'Foo bar', ... });

await customer.save();

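// To retrieve an existing customer
// (named differently from `customer` above, since a const cannot be redeclared in the same scope)
const existingCustomer = await Customer.find('https://my-pod.example.com/customers/123');

console.log(`Hello, ${existingCustomer.firstName}!`);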

The code above works in practice, that’s what I’m doing in my apps (take it as pseudo-code though, I haven’t run this code).

Potentially, my library could be combined with shapes so that you don’t have to define all those classes and fields manually. You could generate them from a local shex file for development, or from a url if you want to generate models at runtime. That’d be similar to what shex-codegen is doing in practice today (I think, I haven’t used this library myself). But I don’t have any plans to do this anytime soon; I haven’t found declaring models by hand to be that hard. If anything, I prefer it because it gives me more flexibility.

1 Like