How to define RDF for interchangeability between apps


#1

Hey Guys,
I want to develop a linked data collections application for Solid as part of my master's thesis. I chose a collections app because it is one of the use cases I am most excited about. Currently, data like this is stored with the content providers and is a pain to migrate to a new platform. I want a single source of truth for all the movies, TV series and songs I have ever watched or listened to, one that can be synced with existing content providers.

I have set up the basics already, but now I have come to the actual storage part of the application. I have been wrapping my head around countless articles on how to store RDF correctly and on the ontologies in use. But some points are still a problem, especially how to model the stored data so that other apps can easily compare it.

Some of the questions/ideas that are still open:

Should I use an existing ontology to save the data, and what data should I save?

I was thinking about defining my own ontology for the saved data, but could I somehow store that within my Solid pod?

I could also use the existing Movie schema from schema.org, but then where do I stop with saving data?

Has somebody experienced similar problems?

The thing is, we don't have a go-to solution for this kind of thing yet, and we need to figure out how we really want to do it. We could save only the most basic data, like a link to the movie plus a rating, comments and a state such as seen, but in reality most of this data is quite static, so it might make sense to store it with the entity.
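To make the "most basic data" option concrete, here is a minimal sketch that writes one movie with a link, a rating, a comment and a seen state as Turtle, using schema.org terms. The `entry` object shape, the function names `movieToTurtle` and `turtleString`, the fragment identifiers (`#movie`, `#review`, `#watched`) and the example URL are all assumptions made for illustration. Mapping "seen" to `schema:WatchAction` is just one possible modeling, not an agreed convention.

```javascript
// Escape a JavaScript string for use inside a double-quoted Turtle literal.
function turtleString(value) {
  const escaped = value
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r");
  return `"${escaped}"`;
}

// Build a small Turtle document for one movie plus the user's own data about it.
// entry = { sameAs, name, rating, comment, seen } is a hypothetical shape.
// Note: entry.sameAs is inserted as-is and is not validated as an IRI.
function movieToTurtle(entry) {
  const blocks = ["@prefix schema: <http://schema.org/> ."];

  // The movie itself: a name and a link to an external page identifying it.
  blocks.push(
    [
      "<#movie> a schema:Movie",
      `schema:name ${turtleString(entry.name)}`,
      `schema:sameAs <${entry.sameAs}>`,
    ].join(" ;\n    ") + " ."
  );

  // Optional review carrying the comment and/or the rating.
  const review = [];
  if (entry.comment) {
    review.push(`schema:reviewBody ${turtleString(entry.comment)}`);
  }
  if (entry.rating !== undefined) {
    if (!Number.isFinite(entry.rating)) {
      throw new TypeError("rating must be a finite number");
    }
    review.push(
      `schema:reviewRating [ a schema:Rating ; schema:ratingValue ${entry.rating} ]`
    );
  }
  if (review.length > 0) {
    blocks.push(
      ["<#review> a schema:Review", "schema:itemReviewed <#movie>", ...review]
        .join(" ;\n    ") + " ."
    );
  }

  // Optional "seen" state, modeled here as a WatchAction on the movie.
  if (entry.seen) {
    blocks.push("<#watched> a schema:WatchAction ;\n    schema:object <#movie> .");
  }

  return blocks.join("\n\n") + "\n";
}

// Usage with made-up data:
console.log(
  movieToTurtle({
    sameAs: "https://example.org/movies/blade-runner",
    name: "Blade Runner",
    rating: 9,
    comment: "Great",
    seen: true,
  })
);
```

Keeping the personal data (review, watch action) in separate subjects from the movie makes it easier to later move the static movie description into its own entity file.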

In my opinion, we should have one file per entity (e.g. movie, TV series, artist), containing its sub-entries such as episodes, albums and tracks.

Then there would be collection files that link to the entity files, where a collection could be, for example, "watched", "owned" or "want to watch".

For individual actions like watched, added to collection or rated, I would create something like an activity stream per resource entity. The activity stream would write to a new file each month or year to keep the files manageable.
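The layout described above (one file per entity, collection files, and an activity stream split by month) could be sketched as a few URL helpers. Every folder name, the `.ttl` file naming, the `slugify` rule and the example pod root are assumptions made up for this sketch. It also reads "per resource entity" as "per entity type", which is only one interpretation.

```javascript
// Hypothetical pod layout helpers. All folder and file names are assumptions.
const KINDS = { movie: "movies", series: "series", artist: "artists" };

// Turn a title into a URL-safe file name (ASCII only, for simplicity).
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Look up the folder for an entity kind, rejecting unknown kinds.
function folderFor(kind) {
  const folder = KINDS[kind];
  if (!folder) throw new Error(`Unknown entity kind: ${kind}`);
  return folder;
}

// One file per entity; episodes, albums or tracks would live inside that file.
function entityUrl(root, kind, title) {
  return `${root}${folderFor(kind)}/${slugify(title)}.ttl`;
}

// One file per collection, e.g. "watched", "owned", "want to watch".
function collectionUrl(root, name) {
  return `${root}collections/${slugify(name)}.ttl`;
}

// One activity file per entity kind and calendar month (UTC).
function activityUrl(root, kind, date) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  return `${root}activities/${folderFor(kind)}/${year}-${month}.ttl`;
}

// Usage with a made-up pod root:
const root = "https://alice.example/media/";
console.log(entityUrl(root, "movie", "Blade Runner 2049"));
console.log(collectionUrl(root, "Want to watch"));
console.log(activityUrl(root, "movie", new Date(Date.UTC(2019, 2, 14))));
```

A collection file would then only contain links to entity URLs, so the same entity file can appear in several collections without duplicating its data.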

Any input on this is highly welcome!


#2

That looks a lot like my own questions :) There are two threads here, "My first app - adding resources?" and "Browsing larg-ish lists - best representation?", that may answer some of your questions.


#3

Regarding ontologies: just use your own namespace/ontology as a starting point. Anything can be stored in your POD. Once your basic app works, you can explore standard ontologies and refactor them in.


#4

You can avoid all the heavy lifting of HTTP if you use the rdflib.js library. I have had luck with these:

Adding resource at URL X:

  • Add statements <X, some-predicate, some-value, X> for each property of the resource. Use store.add(X,p,v,X) to save the statements locally;
  • PUT the new statements onto the web: Use fetcher.putBack(X);

Reading list of resources:

  • Load all resources into local store using “globbing”. Use fetcher.load('baseUrl/*')
  • Extract all the found data from the local store. Use store.match(…)

Remove a single resource at URL X:

  • Remove the resource from the web. Use fetcher.delete(X).
  • Remove local knowledge of the resource: Use store.removeDocument(X).
  • Remove all local statements: Use store.removeMatches(X);

See my repositories at https://github.com/JornWildt/SolidRC/tree/master/wwwroot/js for an implementation.
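The three flows above can be sketched roughly as follows. This is only an illustration: the function names `addResource`, `listResources` and `removeResource` are made up, and `store` and `fetcher` are assumed to be rdflib.js objects (a graph store and a `Fetcher` bound to it) that the caller creates and passes in. The globbing load also assumes the server supports `*` at the end of a container URL, as described above.

```javascript
// store and fetcher are assumed to be rdflib.js objects created by the caller,
// e.g. const store = $rdf.graph(); const fetcher = new $rdf.Fetcher(store);
// doc is assumed to be a NamedNode for the resource URL X.

// Add a resource: one statement per property, all stored in document X,
// then PUT the document back to the web.
async function addResource(store, fetcher, doc, properties) {
  for (const [predicate, value] of properties) {
    store.add(doc, predicate, value, doc);
  }
  await fetcher.putBack(doc);
}

// Read a list: load everything under the container via globbing,
// then pull the matching statements out of the local store.
async function listResources(store, fetcher, containerUrl, predicate) {
  await fetcher.load(containerUrl + "*");
  return store.match(null, predicate, null, null);
}

// Remove a resource: delete it on the web, then forget it locally.
async function removeResource(store, fetcher, doc) {
  await fetcher.delete(doc.value); // doc.value is the URL string of the node
  store.removeDocument(doc);
  store.removeMatches(doc, null, null, null);
}
```

Passing `store` and `fetcher` in as arguments keeps the sketch independent of how rdflib.js is loaded (script tag or module import) and makes the flows easy to exercise with stand-in objects.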


#5

I already checked out the linked threads, thank you. The saving part is not that big of a problem; the main problem is how to save the data so that others can also use it. I think we should somehow develop a standard for how we save data like this. Maybe a working group that defines standard formats for collections or entities like these. What do you think?


#6

Ah, sorry, I wouldn’t know where to start and end that project. But I do like the idea of asking the music content providers to copy your usage data into your POD in a standardized way. I am pretty sure, though, that they won’t rely on that alone, as they need all the data centralized for statistics.


#7

The representation is RDF (and in Solid that means you must handle RDF in Turtle, though other serializations are optional). And RDF ontologies are what define how particular concepts are represented.

So I’m not sure if you are aware that you can first look for existing ontologies, and only need to define your own if the ones available are not suitable?

I agree that where new ontologies are needed, it would be helpful to have ways of coordinating this. And the same for discovery.

It is inevitable that there will still be holes, overlap and duplication, so it’s not an easy thing to do well. Maybe one way to help with that is a Solid ontologies group?