Solid: File server vs database?


#1

I have started developing applications with Solid, and I am currently working on a fork of the Otto AA Solid File Manager, which I found in the Solid App Listing.
From what I have read, Solid aims to be a way to build applications that compete with Facebook, Instagram, WhatsApp, TripAdvisor, etc. I fully agree with the importance of this goal, but I have a technical concern: the data of those applications is managed by databases, whereas Solid is a file server. So I see Solid more as a Google Drive or Dropbox competitor than as a competitor to those social networks.
Did I miss something?

Edit: See this post by @Adventure. I fully agree with what he says (If you can’t beat 'em, join 'em - Queen), but the same problem remains: how could Facebook applications work with a file server?


#2

Hi @okilele. Solid is not just a file server - although it can function as such, using helpful projects like Solid File Manager. However, it’s explicitly intended to store structured data. You might be interested in the page Understanding Solid, for more on that.


#3

OK, thanks for your answer.
But in the end, information must be stored somewhere. Everything in computing is stored in files. You can access those files through an engine that manages them, which is called a database, or you can access and parse them directly if a specific format is used, XML or JSON for instance.
The second way, using flat files and parsing them, is good for small chunks of information, and we all use it in development environments, but it is not practical if you have more than, let's say, 1000 “records” to store.
So again, did I miss something?
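
To make the two access styles concrete, here is a minimal Python sketch using only the standard library. The record shape, the file names, and the helper functions are assumptions made up for illustration; nothing here is defined by Solid.

```python
import json
import os
import sqlite3
import tempfile

# Hypothetical records; the shape is an assumption for illustration only.
records = [{"id": i, "note": f"note {i}"} for i in range(1000)]
workdir = tempfile.mkdtemp()

# Way 1: a flat JSON file that is parsed in full on every lookup.
json_path = os.path.join(workdir, "records.json")
with open(json_path, "w", encoding="utf-8") as f:
    json.dump(records, f)


def find_in_flat_file(record_id):
    """Load and scan the whole file; the cost grows with the number of records."""
    with open(json_path, encoding="utf-8") as f:
        for record in json.load(f):
            if record["id"] == record_id:
                return record["note"]
    return None


# Way 2: a database engine manages the file and keeps an index on the key.
conn = sqlite3.connect(os.path.join(workdir, "records.db"))
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, note TEXT)")
conn.executemany(
    "INSERT INTO records (id, note) VALUES (?, ?)",
    [(r["id"], r["note"]) for r in records],
)
conn.commit()


def find_in_database(record_id):
    """Look the record up by primary key; the engine uses its index."""
    row = conn.execute(
        "SELECT note FROM records WHERE id = ?", (record_id,)
    ).fetchone()
    return row[0] if row else None
```

Both functions return the same answer; the difference is that the flat-file version re-reads everything for each lookup, which is the cost the post is worried about as the number of records grows.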


#4

You are totally right.
In my understanding, the core target of Solid is to set up personal data stores. When we interact with such data stores, the most important part is the interface (SPARQL/OWLs). For now the effort is focused on file-based storage. Since performance is not yet the bottleneck, the file-based approach is good enough for the community to start experimenting with application ideas while fine-tuning the standards.
It would be appreciated if database developers started working on a DB version for Solid pods. However, it wouldn’t be a big deal if they did not. One day, if some app makes a bang, such DB projects will emerge very quickly. If the interface is finalized, migrating to a DB would not have much impact on existing apps.


#5

Yes, one day… But the best way to make that day happen is to be able to use Solid with a database.
A “joint venture” with Neo4j or MongoDB?


#6

There are two things that are relevant there, I think:

  1. Node Solid Server currently uses flat files to store data. However, that is not at all mandated by the Solid standards, and it’s very much possible to create a server implementation that is designed to scale to larger data, and uses a database as the back-end.
  2. The scalability requirements of Pods are often different from those of regular websites, in that Pods commonly store personal data for a single person. Databases that need to scale to tens of thousands of users will need to store a couple of orders of magnitude more rows, whereas e.g. a Document that stores my personal notes is far smaller. That’s not to say that there aren’t/won’t be challenges, but the bottlenecks might not always be what you’d expect.
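
Point 1 can be illustrated with a rough Python sketch in which application code talks to a storage interface and the back-end behind it is swappable. The `ResourceStore` interface, both implementations, and `save_note` are hypothetical names invented for this example; they are not the API of Node Solid Server or of any Solid specification, and the flat-file version does no path sanitising.

```python
import os
import sqlite3
import tempfile
from abc import ABC, abstractmethod


class ResourceStore(ABC):
    """Hypothetical storage interface; not part of any Solid specification."""

    @abstractmethod
    def write(self, path: str, body: str) -> None: ...

    @abstractmethod
    def read(self, path: str) -> str: ...


class FlatFileStore(ResourceStore):
    """Keeps each resource as one file under a root directory."""

    def __init__(self, root: str) -> None:
        self.root = root

    def _file(self, path: str) -> str:
        return os.path.join(self.root, path.strip("/"))

    def write(self, path: str, body: str) -> None:
        target = self._file(path)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "w", encoding="utf-8") as f:
            f.write(body)

    def read(self, path: str) -> str:
        with open(self._file(path), encoding="utf-8") as f:
            return f.read()


class SqliteStore(ResourceStore):
    """Keeps each resource as one row keyed by its path."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS resources (path TEXT PRIMARY KEY, body TEXT)"
        )

    def write(self, path: str, body: str) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO resources (path, body) VALUES (?, ?)",
            (path, body),
        )
        self.conn.commit()

    def read(self, path: str) -> str:
        row = self.conn.execute(
            "SELECT body FROM resources WHERE path = ?", (path,)
        ).fetchone()
        if row is None:
            raise KeyError(path)
        return row[0]


def save_note(store: ResourceStore, text: str) -> str:
    """Caller code depends only on the interface, not on the back-end."""
    store.write("/notes/today.ttl", text)
    return store.read("/notes/today.ttl")
```

Calling `save_note(FlatFileStore(tempfile.mkdtemp()), text)` and `save_note(SqliteStore(), text)` gives the same result, which is the sense in which an app would not need to know how a server stores its data.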

#7

Neo4j is a good choice. It’s designed to optimize complex graph-based structures. The mapping to RDF is also intuitive. One difference is that Neo4j supports properties on both entities and relations. So assigning the RDF literal values to Neo4j entity properties may enhance query performance.
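
As a rough illustration of that mapping, the Python sketch below turns triples with literal objects into node properties and triples with IRI objects into relationships. It uses plain dictionaries rather than the Neo4j driver, and the `Literal` class, the `to_property_graph` function, and the sample data are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """An RDF literal value, as opposed to an IRI naming another node."""
    value: object


def to_property_graph(triples):
    """Split triples into node properties (literal objects) and edges (IRI objects)."""
    nodes = {}  # node IRI -> {property name: value}
    edges = []  # (source IRI, relationship type, target IRI)
    for subject, predicate, obj in triples:
        nodes.setdefault(subject, {})
        if isinstance(obj, Literal):
            # Literal values become properties stored on the subject node.
            nodes[subject][predicate] = obj.value
        else:
            # IRI objects become relationships between two nodes.
            nodes.setdefault(obj, {})
            edges.append((subject, predicate, obj))
    return nodes, edges


# Hypothetical sample data using prefixed names for brevity.
triples = [
    ("ex:alice", "foaf:name", Literal("Alice")),
    ("ex:alice", "foaf:age", Literal(24)),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", Literal("Bob")),
]
nodes, edges = to_property_graph(triples)
```

One limitation of this simple version: a second literal for the same subject and predicate overwrites the first, so multi-valued properties would need a list instead of a single value.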


#8

I was not sure Solid could use a database back-end, so I am happy to learn this.
But point 2 of your post needs clarification: you are talking about “a document that stores my personal notes”, whereas I see a Solid server as the storage for everything I want to share: my personal notes, of course, but also the records of my health watch (just an example, I would never wear this kind of thing) or my posts on Facebook, TripAdvisor, etc.
Do you agree with this way of using Solid?


#9

I still believe a DB will eventually be the solution. If a POD is only local, it might not matter much. But many people (if not the majority) may tend to delegate it to a POD provider, which may handle millions. Moreover, the file-based approach potentially introduces more local heterogeneity and logic collisions. Say that, with authorization, the apps Facebooklet and Twotter write (Alice, age:owl1, 24) and (Alice, age:owl2, 26) to different paths in your POD respectively; that is not easy to identify or fix. A DB with a single access point can apply guard logic more easily.
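
A minimal Python sketch of such guard logic follows, assuming the access point knows which predicates mean the same thing and which properties may hold only one value. `GuardedStore`, `ConflictError`, and the `canonical` mapping are hypothetical and only illustrate the idea.

```python
class ConflictError(Exception):
    """Raised when a write contradicts a value already stored."""


class GuardedStore:
    """Hypothetical single access point that checks every write before storing it."""

    def __init__(self, canonical, single_valued):
        self.canonical = canonical          # predicate -> canonical property name
        self.single_valued = single_valued  # canonical properties allowing one value
        self.values = {}                    # (subject, property) -> (value, app)

    def write(self, app, subject, predicate, value):
        # Map ontology-specific predicates onto one canonical property.
        prop = self.canonical.get(predicate, predicate)
        key = (subject, prop)
        if prop in self.single_valued and key in self.values:
            existing, owner = self.values[key]
            if existing != value:
                raise ConflictError(
                    f"{app} wants {prop}={value} for {subject}, "
                    f"but {owner} already stored {existing}"
                )
        self.values[key] = (value, app)


store = GuardedStore(
    canonical={"age:owl1": "age", "age:owl2": "age"},  # assumed equivalence
    single_valued={"age"},
)
store.write("Facebooklet", "Alice", "age:owl1", 24)
try:
    store.write("Twotter", "Alice", "age:owl2", 26)
    conflict = None
except ConflictError as error:
    conflict = str(error)
```

Here the second write is rejected because both predicates map to the same single-valued property. With two apps writing separate files, no single component would see both values at write time.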


#10

@okilele Absolutely, but that’s still data with different properties than e.g. Facebook’s data.

@dprat0821 I was certainly not trying to imply that a database would never be necessary, just that the challenges are non-traditional and thus might, but might also not, have non-traditional solutions. And certainly, large-scale Pod providers will have their own additional challenges.