Reuse of data by multiple applications

Let’s say I want to create two different applications with Solid and reuse some of the data between them.

The first application is a calendar application in which calendar items can be inserted, updated and deleted in a user’s pod.
These calendar items can be, for example, a dentist appointment, a friend’s birthday or a soccer game. Let’s say that each calendar item has three fields: title, date and location.

The second application is a birthday application showing a list of upcoming birthdays. These dates are taken directly from a friend’s pod or from the user’s own pod.
Here we have two fields: the name of the friend and the date. Since I want to reuse data that already exists in the pod, I would ask the user to indicate which dates found in their pod can and cannot be used by the birthday application.

  • How should both applications save the data?
  • How can different applications use the same data without knowing the data structure used in another application?
  • I need to keep track of dates that can and cannot be used by the birthday application, what would be a good solution for this?
  • What about duplicate data in a user’s pod? Should an application only rely on their own data?
  • When the user creates a new date of birth in the birthday calendar and stores it in their pod, and then wants to use this date in the calendar application, it doesn’t have the same fields. Would the title become “John’s birthday”, for example, with a missing location?

Firstly, I would recommend reading through [Post deleted] and Let’s talk about pods | Ruben Verborgh if you haven’t already, as they cover a lot of ground in what you’re asking.

How should both applications save the data?

Pick a good ontology that accurately describes your domain to write your data in, and for now pick a location of your choice in the Pod to write the data.
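As a rough sketch of what writing the calendar data in a shared ontology could look like, the snippet below describes a calendar item with schema.org terms (Event, name, startDate, location) as plain triples. The `Triple` type, the `calendarItemToTriples` helper and the example.org IRIs are assumptions made for illustration and are not part of any Solid library; a real app would use an RDF library to build and write the data.

```typescript
// Hypothetical minimal triple type, used only for this illustration.
type Triple = { s: string; p: string; o: string };

const SCHEMA = "https://schema.org/";
const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

// The three fields from the question; location is optional.
interface CalendarItem {
  title: string;
  date: string; // ISO 8601 date or date-time
  location?: string;
}

// Describe one calendar item using schema.org terms.
function calendarItemToTriples(iri: string, item: CalendarItem): Triple[] {
  const triples: Triple[] = [
    { s: iri, p: RDF_TYPE, o: SCHEMA + "Event" },
    { s: iri, p: SCHEMA + "name", o: item.title },
    { s: iri, p: SCHEMA + "startDate", o: item.date }
  ];
  if (item.location !== undefined) {
    triples.push({ s: iri, p: SCHEMA + "location", o: item.location });
  }
  return triples;
}

// Assumed example IRI inside a user's pod.
const dentist = calendarItemToTriples(
  "https://example.org/pod/calendar#dentist",
  { title: "Dentist appointment", date: "2025-03-14T09:00:00Z", location: "Main Street 1" }
);
```

Because the predicates are shared schema.org IRIs rather than app-specific field names, another application can recognise the title and date without knowing anything about the calendar app itself.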

How can different applications use the same data without knowing the data structure used in another application?

At the moment you will need to rely on the different applications modelling data in the same way. If there is data modelled slightly differently or in a different ontology that you are aware of and want to re-use then you could use eye-js to perform schema alignment using rules you specify (any reasoner of your choice will work - I’ve singled out eye-js because it runs natively in the browser). We are aiming to improve automation of such alignment in due course.
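To make the idea of schema alignment concrete, here is a hand-rolled sketch that is not eye-js and not a real reasoner: it applies simple "predicate A implies predicate B" rules to a list of triples, which is roughly what an N3 rule such as `{ ?s ex:dateOfBirth ?o } => { ?s foaf:birthday ?o }` expresses. The `ex:dateOfBirth` predicate, the `PredicateRule` type and the `align` function are all hypothetical names chosen for this example.

```typescript
// Hypothetical minimal triple type, used only for this illustration.
type Triple = { s: string; p: string; o: string };

// A rule saying: whenever `from` is used as predicate, also assert `to`.
type PredicateRule = { from: string; to: string };

// Return the original triples plus everything the rules derive from them.
function align(triples: Triple[], rules: PredicateRule[]): Triple[] {
  const derived: Triple[] = [];
  for (const t of triples) {
    for (const r of rules) {
      if (t.p === r.from) {
        derived.push({ s: t.s, p: r.to, o: t.o });
      }
    }
  }
  return triples.concat(derived);
}

const FOAF_BIRTHDAY = "http://xmlns.com/foaf/0.1/birthday";

// Data written by another app in an assumed custom vocabulary.
const otherAppData: Triple[] = [
  { s: "https://example.org/pod/friends#john", p: "https://example.org/vocab#dateOfBirth", o: "1990-07-02" }
];

const aligned = align(otherAppData, [
  { from: "https://example.org/vocab#dateOfBirth", to: FOAF_BIRTHDAY }
]);
```

After alignment the birthday application can query for `foaf:birthday` only, while the original triples stay untouched.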

I need to keep track of dates that can and cannot be used by the birthday application, what would be a good solution for this?

This should be made clear from the way the data is modelled. In particular, when we talk about birthdays we are usually describing the birthday of a particular individual, so here we can use the FOAF ontology to write the data as follows:

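@prefix ex: <https://example.org/> .  # placeholder namespace
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
ex:Sam a foaf:Person ;
    foaf:name "Sam Smith" ;
    foaf:birthday "1985-04-23T00:00:00Z"^^xsd:dateTime .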

So the dates that can be re-used by the birthday application are those dates that are the object of a foaf:birthday. Your other events can also be described using specific schema.org event types.
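In code, "the dates that are the object of a foaf:birthday" amounts to a filter on the predicate. The sketch below assumes the pod data has already been parsed into plain triples; the `Triple` type, the `birthdaysIn` helper and the example.org IRIs are hypothetical.

```typescript
// Hypothetical minimal triple type, used only for this illustration.
type Triple = { s: string; p: string; o: string };

const FOAF_BIRTHDAY = "http://xmlns.com/foaf/0.1/birthday";

// Keep only the triples whose predicate is foaf:birthday.
function birthdaysIn(triples: Triple[]): { person: string; date: string }[] {
  return triples
    .filter(t => t.p === FOAF_BIRTHDAY)
    .map(t => ({ person: t.s, date: t.o }));
}

const podData: Triple[] = [
  { s: "https://example.org/Sam", p: "http://xmlns.com/foaf/0.1/name", o: "Sam Smith" },
  { s: "https://example.org/Sam", p: FOAF_BIRTHDAY, o: "1985-04-23T00:00:00Z" },
  // A calendar item: it has a date, but it is not a birthday.
  { s: "https://example.org/pod/calendar#dentist", p: "https://schema.org/startDate", o: "2025-03-14T09:00:00Z" }
];

const usableDates = birthdaysIn(podData);
```

The dentist appointment is skipped without the user having to flag it, because the modelling itself says which dates are birthdays.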

The general rule of thumb here is to be as specific as possible in the way you describe your data (e.g. describe a sporting event as a schema.org SportsEvent rather than just an Event).

What about duplicate data in a user’s pod? Should an application only rely on their own data?

Data can and should be re-used. As I discussed in “Client to client standard resources and guidelines? - #6 by jeswr”, in the short term Type Indexes provide a means to discover data written by other applications. In the long term, work related to “Let’s talk about pods | Ruben Verborgh” will make it much easier to re-use data.
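The discovery step with Type Indexes can be sketched as a lookup from a class to the places where instances of that class are stored. The snippet assumes the registrations (which use `solid:forClass`, `solid:instance` and `solid:instanceContainer` in the type index document) have already been fetched and parsed into plain objects; the `TypeRegistration` interface, the `locationsFor` helper and the example.org URLs are hypothetical.

```typescript
// Hypothetical in-memory view of one type index registration.
interface TypeRegistration {
  forClass: string;           // value of solid:forClass
  instance?: string;          // value of solid:instance (a single resource)
  instanceContainer?: string; // value of solid:instanceContainer (a container)
}

// Collect every location registered for the given class.
function locationsFor(registrations: TypeRegistration[], classIri: string): string[] {
  const locations: string[] = [];
  for (const r of registrations) {
    if (r.forClass !== classIri) continue;
    if (r.instance !== undefined) locations.push(r.instance);
    if (r.instanceContainer !== undefined) locations.push(r.instanceContainer);
  }
  return locations;
}

// Assumed registrations written earlier by the calendar app and a contacts app.
const registrations: TypeRegistration[] = [
  { forClass: "https://schema.org/Event", instance: "https://example.org/pod/calendar/events.ttl" },
  { forClass: "http://xmlns.com/foaf/0.1/Person", instanceContainer: "https://example.org/pod/contacts/" }
];

const whereToFindPeople = locationsFor(registrations, "http://xmlns.com/foaf/0.1/Person");
```

The birthday application would then read the returned locations and look for `foaf:birthday` values there, instead of keeping its own duplicate copy.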

When the user creates a new date of birth in the birthday calendar and stores it in their pod, and then wants to use this date in the calendar application, it doesn’t have the same fields. Would the title become “John’s birthday”, for example, with a missing location?

In this instance it is up to your app as to how it wants to display this information. What you describe is sensible - another option would be for your calendar application to prompt you for the missing fields.
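Both options can be sketched in a few lines: derive a title from the name, leave the location empty, and report which fields the calendar app could prompt for. The `Birthday` and `CalendarItem` interfaces and both helper functions are hypothetical and only mirror the fields named in the question.

```typescript
// The two fields of the birthday application.
interface Birthday {
  name: string;
  date: string;
}

// The three fields of the calendar application; location may be absent.
interface CalendarItem {
  title: string;
  date: string;
  location?: string;
}

// Display a birthday as a calendar item with a derived title and no location.
function birthdayToCalendarItem(b: Birthday): CalendarItem {
  return { title: b.name + "'s birthday", date: b.date };
}

// List the optional fields the calendar app could prompt the user to fill in.
function missingFields(item: CalendarItem): string[] {
  return item.location === undefined ? ["location"] : [];
}

const johnsItem = birthdayToCalendarItem({ name: "John", date: "1990-07-02" });
const toPromptFor = missingFields(johnsItem);
```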

Wouldn’t that also maximise hindering of reuse between different clients?

It’s very difficult to code against a data type that may or may not be (a) present to begin with, (b) in the expected shape, or (c) arbitrarily complete.

I guess the reality is that different clients are just going to make their own specific vocab for their use case because that is the easiest thing to do. And easy always wins.

Wouldn’t that also maximise hindering of reuse between different clients?

Quite the contrary. In the example I gave, SportsEvent is a subclass of Event and both are from the schema.org ontology, so the data can be re-used both by applications that are interested specifically in SportsEvents (e.g. a fitness app) and by applications that are interested in all Events (e.g. your calendar app), provided that the applications you build reason about such relationships.
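"Reasoning about such relationships" can be as small as walking up a subclass table. The sketch below hard-codes the one schema.org fact used in this thread (SportsEvent is a subclass of Event); the `SUBCLASS_OF` table, the `isSubClassOf` function and the fragment IRIs are hypothetical stand-ins for what a reasoner or RDF library would provide.

```typescript
const EVENT = "https://schema.org/Event";
const SPORTS_EVENT = "https://schema.org/SportsEvent";

// Child class -> parent class. Only the relationship discussed here is listed.
const SUBCLASS_OF: { [cls: string]: string | undefined } = {
  "https://schema.org/SportsEvent": "https://schema.org/Event"
};

// True if `cls` is `target` or a (transitive) subclass of it.
function isSubClassOf(cls: string, target: string): boolean {
  const visited: string[] = []; // guards against accidental cycles in the table
  let current: string | undefined = cls;
  while (current !== undefined && visited.indexOf(current) === -1) {
    if (current === target) return true;
    visited.push(current);
    current = SUBCLASS_OF[current];
  }
  return false;
}

const typedItems = [
  { iri: "#soccer-game", type: SPORTS_EVENT },
  { iri: "#dentist", type: EVENT }
];

// The calendar app wants every Event; a fitness app wants only SportsEvents.
const forCalendar = typedItems.filter(i => isSubClassOf(i.type, EVENT));
const forFitness = typedItems.filter(i => isSubClassOf(i.type, SPORTS_EVENT));
```

Describing the soccer game with the more specific type loses nothing for the calendar app and gains something for the fitness app.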

I guess the reality is that different clients are just going to make their own specific vocab for their use case because that is the easiest thing to do. And easy always wins.

I’d argue that for common use-cases such as a calendar app it is easier to look up and re-use an existing vocab, where all the data modelling is done for you, than to build your own vocab from scratch.

Yeah, sure, if the sports event app does bother to do inheritance, which is more work (mentally) than creating your own. For this toy example it seems a no-brainer, but if you had something like an invoicing and expense claim system for a multinational company, with dozens of apps and hundreds of types in a possibly large, complex hierarchy, then the mental and bureaucratic overhead becomes great. But perhaps this is no worse than any other large microservices mesh. The same lessons might be applicable here?

I might be wrong, but nobody has done anything beyond a toy example with this stuff, so we can’t really know. I can only guess from my 25 years of experience as a software engineer. I mean, if you walk into a job and you have to learn 20-30 small ontologies vs one, albeit large, ontology, which would you choose? The fact that the public ontologies are well documented is in their favour.

Perhaps someone should try to create a large complex system as an experiment to test/push this technology? It would be a terrific way to inform the spec community going forward.

In my experience, developers always prefer to design something themselves from scratch rather than learn how a legacy system works. The deeper question here is: is learning an existing vocabulary easier than, or preferred to, creating your own?

Another great resource for reusability are shapes, e.g. ShEx or SHACL. They describe the structure of a graph, and can be used for documenting or validating graph structures.

(I’m not up-to-date on how much shapes are used in Solid, but they are used in the spec work on webid-profile and type-indexes. I also think they will be important in the work of the Solid Data Interoperability Panel.)
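To give a feel for what a shape buys you, here is a hand-rolled stand-in that is neither ShEx nor SHACL: it only checks how many values a node has for each listed predicate. It addresses the earlier worry about data that may or may not be present or complete, since an app can validate before it renders. The `PropertyConstraint` type, the `conforms` function and the example IRIs are hypothetical.

```typescript
// Hypothetical minimal triple type, used only for this illustration.
type Triple = { s: string; p: string; o: string };

// One line of a very simplified "shape": a predicate with a cardinality range.
interface PropertyConstraint {
  predicate: string;
  min: number;
  max: number;
}

// True if `node` has an allowed number of values for every constrained predicate.
function conforms(node: string, triples: Triple[], shape: PropertyConstraint[]): boolean {
  for (const c of shape) {
    const count = triples.filter(t => t.s === node && t.p === c.predicate).length;
    if (count < c.min || count > c.max) return false;
  }
  return true;
}

// Assumed shape for the calendar app: exactly one name and start date, optional location.
const calendarItemShape: PropertyConstraint[] = [
  { predicate: "https://schema.org/name", min: 1, max: 1 },
  { predicate: "https://schema.org/startDate", min: 1, max: 1 },
  { predicate: "https://schema.org/location", min: 0, max: 1 }
];

const data: Triple[] = [
  { s: "#dentist", p: "https://schema.org/name", o: "Dentist appointment" },
  { s: "#dentist", p: "https://schema.org/startDate", o: "2025-03-14T09:00:00Z" },
  { s: "#untitled", p: "https://schema.org/startDate", o: "2025-04-01T10:00:00Z" }
];

const dentistOk = conforms("#dentist", data, calendarItemShape);
const untitledOk = conforms("#untitled", data, calendarItemShape);
```

Real shape languages go much further (datatypes, value sets, nested shapes), but the principle is the same: the expected structure is written down once and can be checked by any client.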

As a sidenote, I’m currently building an experimental Solid app using LDO, a library that allows you to work with TypeScript and ShEx shapes. I’m using a custom vocabulary and created my own shapes (it’s only an experiment, not designed to be anything for production), but I’m finding the combination of TS and ShEx (and Next/React) to be quite powerful.

The code is over at GitHub if you want to have a look, but be warned that it’s still quite messy.

I guess this video answers the questions around the practical application of many ontologies in a large organisation. Quite inspiring!