For an internship I’m currently working on a PoC implementation to use Solid as a publication platform for IoT data.
The idea is to use a Solid pod as a storage and sharing platform for the data produced by consumer IoT devices (e.g. weather stations, air quality sensors, activity trackers, power consumption meters).
The owner could then share this data with third parties such as research groups, companies or cities, for use in applications or research the owner believes in.
My belief is that this could simplify the process of setting up a citizen science or voluntary data-driven research experiment, while offering maximal transparency about the data the user is sharing.
So far I’ve succeeded in saving sensor measurements (translated from SenML to either SAREF or SSN through an RML mapper).
There are, however, a couple of challenges I’m still thinking about, and I hoped I could draw some inspiration from this community.
- The SenML messages contain very little metadata. The sensor isn’t required to present itself before publishing messages, and at most it sends the unit of measurement along with the measured value.
I believe for the data to be more useful, it should be possible for the user to add some metadata (sensor description, what’s being measured, sensor location) at the moment of sharing.
- Once the IoT data is stored on the pod, some mechanism must be in place for the third parties to fetch the data and pull it into the data processing application of their choice. But how do they know which resources on which pods are available to them?
- Since multiple ontologies exist for describing measurement data, and only very few data processors support linked data yet, I suppose the data ‘aggregator’ tool should ideally have the means to translate the data back and expose it in more classic formats.
- If I understand correctly, access control on Solid pods works at the file level, so it currently isn’t possible to give access to only certain resources within one ttl file. If only the data of a single sensor is to be shared, each sensor would then need its own file to store its measurements in.
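To make the first and last points a bit more concrete, here is a minimal Python sketch of how they could fit together. It resolves the SenML base fields, gives every sensor its own resource URL so it can be shared separately, and attaches whatever metadata the user supplies. The pod root, the `.ttl` naming scheme and the metadata keys are assumptions made up for illustration. The sketch also ignores the SenML base unit and base value fields and the relative-time rule, and it skips the RDF mapping step entirely.

```python
from urllib.parse import quote

# Hypothetical pod container for IoT data (an assumption for this sketch).
POD_ROOT = "https://alice.example/iot/"


def resolve_senml(pack):
    """Resolve the SenML base name (bn) and base time (bt) into full records."""
    base_name, base_time = "", 0
    resolved = []
    for rec in pack:
        # Base fields apply to the record they appear in and to later records.
        base_name = rec.get("bn", base_name)
        base_time = rec.get("bt", base_time)
        resolved.append({
            "name": base_name + rec.get("n", ""),
            "unit": rec.get("u"),
            "value": rec.get("v"),
            "time": base_time + rec.get("t", 0),
        })
    return resolved


def resource_url(sensor_name):
    """One resource per sensor, so access can be granted per sensor."""
    return POD_ROOT + quote(sensor_name, safe="") + ".ttl"


def group_by_sensor(records, user_metadata):
    """Group records per sensor resource and attach user-supplied metadata."""
    resources = {}
    for rec in records:
        entry = resources.setdefault(resource_url(rec["name"]), {
            "metadata": user_metadata.get(rec["name"], {}),
            "measurements": [],
        })
        entry["measurements"].append(rec)
    return resources
```

Calling `group_by_sensor(resolve_senml(pack), metadata)` on a parsed SenML pack would return one entry per sensor, keyed by the URL of the file that sensor's measurements would be written to.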
This is the solution I could come up with for points 1-3:
Build a REST API which collects and combines the data from the resources it has access to and re-exposes it in JSON format through GET requests, in essence hiding the distributed nature and RDF format of the data. This should make getting the data less cumbersome for the people who want to work with it.
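A rough sketch of what such an aggregator could look like, using only the Python standard library. The `fetch` callable is a placeholder I made up: in a real setup it would do an authenticated GET against the pod and turn the RDF into plain dicts with a `time` key, which is not shown here. The `/measurements` path and the shape of the JSON response are assumptions as well.

```python
import json


def aggregate(resource_urls, fetch):
    """Merge measurements from every readable resource, sorted by time."""
    combined, skipped = [], []
    for url in resource_urls:
        try:
            rows = fetch(url)  # hypothetical: GET the resource and parse its RDF
        except PermissionError:
            # The owner has not (or no longer) granted access to this resource.
            skipped.append(url)
            continue
        for row in rows:
            combined.append({**row, "source": url})
    combined.sort(key=lambda row: row["time"])
    return {"measurements": combined, "skipped": skipped}


def make_app(resource_urls, fetch):
    """Return a WSGI app that exposes GET /measurements as JSON."""
    def app(environ, start_response):
        is_get = environ.get("REQUEST_METHOD") == "GET"
        if not is_get or environ.get("PATH_INFO") != "/measurements":
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found"]
        body = json.dumps(aggregate(resource_urls, fetch)).encode("utf-8")
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]
    return app
```

For a quick local test, the app could be served with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`. Keeping the `source` URL on every measurement means the consumer can still trace each value back to the pod it came from, even though the distribution is hidden.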
To add metadata (sensor description, location, …), a simple web interface could be built to edit the resources; through the same interface, the owner could grant the data aggregator read access to them. A POST request could then be sent to the API to add the resource to the list of resources the aggregator has access to.
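The registration step could be handled roughly like this. The JSON body with a single `resource` field, the status codes and the in-memory set are all assumptions for the sake of the example; a real aggregator would probably also check that it can actually read the resource before accepting it.

```python
import json
from urllib.parse import urlparse


def register_resource(registry, body):
    """Handle the JSON body of a hypothetical POST request that registers a resource.

    Returns an (HTTP status code, response payload) pair.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, {"error": "body must be JSON"}
    url = payload.get("resource") if isinstance(payload, dict) else None
    if not isinstance(url, str):
        return 400, {"error": "missing 'resource' field"}
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.netloc:
        return 400, {"error": "resource must be an https URL"}
    created = url not in registry
    registry.add(url)
    # 201 for a newly registered resource, 200 if it was already known.
    return (201 if created else 200), {"resource": url}
```

The set of registered URLs is then exactly the list of resources the aggregator walks through when it answers a GET request.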
While I think this would technically work, something tells me this isn’t the best way to go, hence this post.
Any feedback, questions, remarks, other solutions and resources you believe might help me are more than welcome. Please do keep in mind I’m still a student and this project (which I’m working on over a couple of weeks) is more than anything a learning opportunity for me.