Why "on top of the file-system"?


#1

Hello Solid Community,

The node-solid-server tagline is “Solid server on top of the file-system in NodeJS”. This is very appropriate as data is stored to the filesystem as Turtle files (and other filetypes).
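
To make "on top of the file-system" concrete, here is a minimal sketch of the general idea: a URL path maps to a file under a data root, and writing or reading a resource is an ordinary file write or read. This is not node-solid-server's actual code. The function names (`resource_path`, `put_resource`, `get_resource`) and the directory layout are assumptions made up for illustration.

```python
from pathlib import Path, PurePosixPath


def resource_path(root: Path, url_path: str) -> Path:
    """Map a URL path such as '/profile/card.ttl' to a file under root (hypothetical layout)."""
    # Drop leading slashes so the path is always joined relative to root.
    relative = PurePosixPath(url_path.lstrip("/"))
    # Refuse parent-directory segments so a request cannot escape the data root.
    if ".." in relative.parts:
        raise ValueError(f"path escapes the data root: {url_path}")
    return root.joinpath(*relative.parts)


def put_resource(root: Path, url_path: str, body: str) -> Path:
    """Store a document (for example Turtle text) as a plain file."""
    target = resource_path(root, url_path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(body, encoding="utf-8")
    return target


def get_resource(root: Path, url_path: str) -> str:
    """Read the stored document back as text."""
    return resource_path(root, url_path).read_text(encoding="utf-8")
```

A real server also has to handle content negotiation, access control and metadata, none of which is shown here.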

I am wondering why such an architecture was chosen as opposed to storing data in a database (a relational database or a triple store)? I imagine implementing SPARQL queries would be easier if data is stored in a triple store that can run SPARQL queries (see related GitHub issue)? Another possible advantage of using an existing database is performance. Are there any particular reasons why node-solid-server stores data in plain files?

Background: I am working on a social networky/open data project (https://miaengiadina.github.io/openengiadina/), am trying to understand existing standards and am interested in architectural decisions made by various implementations.


#2

I’m a complete newbie here, so take my answer with plenty of salt. The specifications, as far as I can see, don’t mention or even recommend any particular storage or I/O mechanism. In a sense, the network as a whole is like a big filesystem. SOLID no more answers the problem of putting bytes to metal than an HTML5 browser dictates the requirements of a web server.

But, I think this is nevertheless a valid discussion - especially for use cases where latency, data streaming or long term archival becomes an issue. Internet development today has a plethora of rather complicated/proprietary solutions to highly available or highly secure storage. Countless databases, cloud hosts and even filesystem flavours catering to an ever growing diversity of perceived needs could be rather well upended by an evolutionary approach to data storage. The DAT project and IPFS from Protocol Labs stand out as projects that are perhaps closer to solving this infrastructural problem.

In the end, if you want to not just access and use SOLID, but really contribute to the ecosystem, you’ll be installing servers and choosing a server platform. Virtuoso purports to support a wide range of local or remote database sources; the others all seem to work from a plain old filesystem. Speaking as an engineer might, it seems wise to go for the path of least resistance. A dependency on databases, especially of a particular flavour, can slow or even block adoption in the places where the idea of distributed data ownership is probably needed most.

:wave: @pukkamustard ~ really interesting project, by the way! I bet the local open data community would be keen to hear more about your work :-)


#3

IIRC one reason for choosing the filesystem was to make the data easily accessible to a large set of available tools for working with data, backing up etc. The *nix command line tools and shell pipes for example, rsync, scp etc, as well as text editors.
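
As a small illustration of that point: because the documents are plain text files, a few lines of standard-library code (or equally `grep -rl`) can search a pod's data directly, with no database client involved. This is only a sketch. The helper name `find_mentions` and the assumption that Turtle documents carry a `.ttl` extension are mine, not something the server guarantees.

```python
from __future__ import annotations

from pathlib import Path


def find_mentions(root: Path, needle: str) -> list[Path]:
    """Return every .ttl file under root whose text contains needle (hypothetical helper)."""
    matches = []
    # rglob walks the directory tree recursively; sorting keeps the result order stable.
    for path in sorted(root.rglob("*.ttl")):
        if needle in path.read_text(encoding="utf-8", errors="replace"):
            matches.append(path)
    return matches
```

The same directory could just as well be copied with rsync or opened in a text editor, which is the accessibility argument made above.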

There’s no one best backend; each has pros and cons. My own aim is to help implement support for Solid using SAFE Network and a decentralised, secure, perpetual Web/data store.


#4

Good question.

The file system is the simplest platform for using, understanding, and exploiting the file (document) create, save, and share pattern. A DBMS, despite being an application that operates on database documents hosted by an operating system, isn’t as straightforward to understand and exploit.

Linked Data Principles Recap
Fundamentally, this is about identifying anything with a hyperlink (specifically, an HTTP URI) and describing everything using RDF sentences, where a hyperlink identifies the subject, the predicate, and optionally the object (the object could instead be a literal, i.e., a value of one of several primitive datatypes).
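
The shape of an RDF sentence can be sketched in a few lines. This is a simplified illustration, not a full RDF model: it leaves out blank nodes, datatypes and language tags, and the class names (`IRI`, `Literal`, `Triple`) are chosen here for the example. The output format is N-Triples.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class IRI:
    """A hyperlink used as an identifier."""
    value: str

    def to_ntriples(self) -> str:
        return f"<{self.value}>"


@dataclass(frozen=True)
class Literal:
    """A plain string literal (datatypes and language tags omitted in this sketch)."""
    value: str

    def to_ntriples(self) -> str:
        # Escape backslashes first, then quotes and line breaks.
        escaped = (
            self.value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return f'"{escaped}"'


@dataclass(frozen=True)
class Triple:
    """One RDF sentence: subject and predicate are IRIs, the object is an IRI or a literal."""
    subject: IRI
    predicate: IRI
    object: Union[IRI, Literal]

    def to_ntriples(self) -> str:
        return (
            f"{self.subject.to_ntriples()} "
            f"{self.predicate.to_ntriples()} "
            f"{self.object.to_ntriples()} ."
        )
```

For example, `Triple(IRI("https://alice.example/profile/card#me"), IRI("http://xmlns.com/foaf/0.1/name"), Literal("Alice"))` describes a (made-up) person by giving her name as a literal object.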

Linked Data Goals
To enable anyone to say anything about anything, whenever, and from wherever using tags (e.g., hashtags – which are RDF sentences in disguise) or richer RDF sentences.

Powerful data access and flow across boundaries (machines, operating systems, applications) using the same follow-your-nose interaction pattern popularized by the World Wide Web.

Experience via Solid Pod
You initiate your pod, or have someone grant you access to theirs.
Start describing entities that catch your interest i.e., no different to tweeting or blogging but in a space that isn’t tightly bound to a 3rd party service provider who ultimately controls the data you are creating.

Database Management Systems and Solid Pods
Our Virtuoso product supports the same open standards used by Solid i.e., you can experience the same file (document) create, save, and share pattern using an instance of this DBMS platform via its WebDAV, LDP, and ODS-Briefcase modules.

Here are live example links referencing two of my Solid Pods:

Here are screenshots illustrating how a Solid Pod is mounted to a Virtuoso DBMS instance:

OpenID Connect selection as Authentication Protocol

Authentication

File Manager View, after successful authentication

Conclusion

Solid’s focus on File Systems is about simplifying entry into the read-write realm of Linked Data. There is already DBMS support, as demonstrated by our Virtuoso implementation.

Also note, I can mount my Virtuoso instance to a Solid Pod via OS mount functionality too, courtesy of WebDAV support across most operating systems :slight_smile:


#5

Thank you all for the answers!

I intend to work on a backend that uses a triple store as the underlying database and have started two small experiments to explore this idea:

If interest exists I will gladly post notable updates to this forum.

@sodacamper: Hi! Sent you a private message.


#6

I think @michielbdejong and @jaxoncreed are investigating using Redis, so they might have more to say about this.


#7

Yes, we’re getting closer now to getting https://github.com/inrupt/pod-server to work, so that will mean there are two implementations of a Solid pod server. Our aim is to update inrupt.net to use our new implementation, and then solid.community can keep using node-solid-server, and others can choose which one fits their needs better. We built the pod-server code from scratch, in NodeJS + TypeScript, but drawing inspiration from how node-solid-server works, and obviously both adhere to the same spec. :slight_smile: See also https://github.com/solid/test-suite for a list of all Solid servers (also including Gold and Trellis, and Virtuoso to be added soon, hopefully)