Why Backend-for-Frontend for Solid is categorically wrong

Recently the Inrupt team (@maxleonard) posted an article titled “Backend for Frontend for Solid”. Here is my response to it (originally posted on LinkedIn).

Backend-for-Frontend for Solid is absolutely and categorically the wrong approach for building Solid applications, in my opinion.

This results in Solid effectively being used as mere data storage, without the other benefits of controlling and knowing where your data is and what systems it feeds into or is stored in. All the applications you use end up being built on top of proprietary APIs; if the provider shuts down that API, the application becomes useless, and likely the data it stored does as well.

The correct model, in my opinion, would be storage-local compute. “Deploy this application’s business logic to run against my storage using this access grant, but with controls as to where data flows”. This could be achieved by allowing serverless and wasm-based interactions with storage, where when you “install” an application, it gives you information on what systems it talks to (if it needs to even communicate with the outside world at all), what APIs it exposes (if needed, as you could effectively have an application inbox → permanent storage model, where data is read from inbox, processed, and then permanent storage is updated with the results), and what data the application works with (e.g., a schema of data it consumes and data it produces).
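As a rough illustration of the “install” step described above, here is a minimal sketch of what such an application declaration could look like. Nothing like this exists in the Solid specifications today; `AppManifest`, its fields, `install_summary`, and the `ex:` shape names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppManifest:
    """Hypothetical install-time declaration (not part of any Solid spec)."""
    name: str
    external_hosts: tuple[str, ...] = ()  # systems the app talks to; empty means none
    exposed_apis: tuple[str, ...] = ()    # APIs the app exposes, if any
    consumes: tuple[str, ...] = ()        # schemas/shapes of data it reads
    produces: tuple[str, ...] = ()        # schemas/shapes of data it writes


def install_summary(m: AppManifest) -> list[str]:
    """Render the declaration as lines a user could review before installing."""
    talks_to = ", ".join(m.external_hosts) or "nothing outside your storage"
    apis = ", ".join(m.exposed_apis) or "none"
    return [
        f"Install {m.name}?",
        f"Talks to: {talks_to}",
        f"Exposes APIs: {apis}",
        f"Reads: {', '.join(m.consumes)}",
        f"Writes: {', '.join(m.produces)}",
    ]


# An inbox -> permanent storage app that never contacts the outside world.
inbox_processor = AppManifest(
    name="inbox-processor",
    consumes=("ex:InboxNote",),
    produces=("ex:Bookmark",),
)
for line in install_summary(inbox_processor):
    print(line)
```

The point of the sketch is only that the declaration is data the storage provider can show to the user and enforce, rather than a promise buried in a privacy policy.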

Further, such a storage-local compute model would enable a marketplace of resources: company X wants to do Y against your storage, we can calculate billing for company X’s usage, and pay that to the storage provider / user, allowing for true data economies without data ever leaving your realm of control.

This is what I think @timbl originally envisioned Solid to be, or at least, it’s more true to the model set out in the specifications: You own and control your data, others just get access to it on your terms.

12 Likes

I wouldn’t go as far as saying that it is “wrong”, because the point of Solid is that it can be used in many ways. But I agree with the sentiment, and I also think that this BFF approach should be minimized as much as possible.

I’ve read through the post, and I get the point. But one of the things I disagree with the most is the one on “Universal reach & accessibility”. If anything, following this pattern would make it less accessible (for one, because it’d be difficult to make alternative clients in the spirit of Bring Your Own Client).

Many of the advantages they point out are also a symptom of bad practices, like this part:

This pattern helps these organizations deploy Solid in a way that better fits their existing technology governance and security policies, and it allows for simpler integrations.

If you need this pattern to fit your existing governance, it’s probably not much in line with the vision of Solid. It reminds me a lot of what happened with the BBC Together+ app, and I already shared my gripes with that.

Though something I can really sympathize with is the need for server-side computing. I agree that something like storage-local compute would be great. Even better, it would be awesome if you could use the POD itself for computation (I raised a similar idea in the spec repo). But with the current state of the spec, this BFF architecture seems to be the best choice to solve that problem. Although I think it should be done as a 3rd party service used to augment the app’s capabilities, rather than making it a requirement for the app to work.

I don’t think this should be a problem if the data is modeled properly. But that’s a big if. I’m not seeing a lot of emphasis on interoperability, so in practice I agree that this type of BFF architecture would result in data being useless without the service.

For me, the line that separates a “real” Solid app from one that is just using Solid underneath is that it lets me store data in my own self-hosted POD. After all, that’s the point of Solid: to choose where to store your data and what to do with it. But this BFF architecture and many of the statements in the post smell of closed PODs.

4 Likes

I’m suggesting categorically wrong because ultimately it’ll be used as the golden path for development, and alternatives (such as storage-local compute) won’t be invested in, because we can carry on with business as usual.

6 Likes

I am with @NoelDeMartin here (including the part saying “But that’s a big if”). Anyway, is the problem different between BFF and purely client-side applications? What prevents the latter from storing data in a format that is practically unusable by any other application?

I really like the storage-local compute model proposed by @ThisIsMissEm, and I agree that we must explore this path as an alternative to “business as usual”. But most importantly, we must insist that regardless of their model, applications write interoperable and reusable data to the user’s storage, leaving them the choice to switch from one application to another – and therefore, eventually migrate from a legacy BFF app to a more privacy preserving storage-local compute app.

2 Likes

That would still be a problem, but I think the big difference here is that client-side applications don’t have a centralized dependency (the server). So even if the server shuts down, technically you can continue using that client-side app.

Now, I’m aware that “client-side” is not the same as “open source”. So in practice, if I discontinue a client-side app it’ll probably stop working because nobody is downloading the app assets for offline use (unless the app is a mobile/desktop app). But I can see how it could still be salvaged by web.archive.org or other means, and I think that’s a big difference.

But yes, in practice badly modeled data doesn’t help anyone. Even if the application doesn’t shut down, the point of Solid is interoperability so if you’re only using some data with one app I don’t see the point of Solid in the long term.

2 Likes

The other thing here is that with purely client-side applications, you can inspect the network traffic and see data flowing between the app and your pod, and nowhere else.

In BFF, you’ve no idea if they’re siphoning off your data to data brokers or aggregating it in internal databases for reasons other than the advertised functionality.

4 Likes

I’m kinda disheartened to read this post. SPAs were kinda a mistake by the JS ecosystem imo, and the shift to “hybrid” websites is really the future. Also, most of the web relies on SSR, not SPA JS frameworks… Is the Solid protocol not meant to apply to SSR?

SPA websites are bad for SEO, for resilient archivable websites, and for slow or intermittent connections.

1 Like

I’m not rallying against SSR, but rather against building SPAs with proprietary APIs that are not the Solid protocol. If we wanted to center user data protection, we’d have a platform that allows deploying applications local to storage. Without that, where your data goes and who can view it is anyone’s guess, and the only protections you have are legal ones: laws that attempt to prosecute lax privacy practices, or suing companies when they inevitably have data breaches.

Providing a sandboxed environment for applications to be run against pod contents and being able to strictly control what that application can do with data is important as a part of the puzzle in trying to solve the current woes of data in corporate control.

e.g., given a compute platform, an application could request to allow HTTP requests to return data via SSR, and you’d at least know that the application wasn’t directly funnelling your data into a third party data silo/warehouse. Instead you’d see odd requests come up in request logs & audit history allowing you to see when someone is doing something nefarious with your data. You can’t audit infrastructure you don’t control, you can only request they have an audit conducted.

3 Likes

I’ll throw my hat in to say that this would be awesome! It would take a lot of spec development and security testing, but a great feature to have.

2 Likes

Many (if not all?) frontend frameworks nowadays support SSR, so I don’t see a limitation there. Also, you’re saying that SPAs were a mistake as a matter of fact, but that’s actually your opinion. SPAs have pros and cons and depending on the app you’re building, it can make sense or not. In any case, the SPAs vs MPAs discussion is certainly something ongoing in the community and I don’t want to get into that. But I don’t think it has anything to do with Solid. You could have an SPA with SSR that still communicates with the Solid POD in the client (or maybe even has a hybrid approach, communicating with the POD in the server during the first request, but then using the client for any subsequent navigation).

SPAs are bad for SEO: Also a misconception. And personally, my opinion is that I don’t care about SEO and it shouldn’t exist. Ideally for indexing data and searching information we should be using RDF graphs, that’s also the point of Solid.

SPAs are bad for resilient archivable websites? How come? If anything, SPAs are more resilient because they don’t need a server so they can work forever. For MPAs you can save snapshots using web.archive.org but that’s all, snapshots. Also, again, if what you care about is not losing information we should care about storing the real data in RDF or any other open format.

And finally, slow internet connections? That’s actually one of the strong points of SPAs, if they are built correctly. You only request what you need once, and I’m not sure you can build an offline-first app if it isn’t an SPA. And if you’re thinking about the problem of downloading the whole application on your first visit, there are things such as tree-shaking and code splitting to prevent that.

In any case, as I said, I wouldn’t want this to become a discussion of SPAs vs MPAs, there are plenty of places to discuss that elsewhere and it’s mostly a matter of opinion and the use-case of each particular app. But I don’t think it has anything to do with Solid, the point of Solid is that you can use it with any architecture you like. And for everything we’ve mentioned in this thread until now, I don’t like architectures that rely too heavily on a centralized server (it’s just that without Solid, you don’t even have a choice).

6 Likes

Yes, and I stated it as such. (imo = in my opinion)

I think I was misunderstanding MissEm’s point.

I also didn’t want to debate whether SPA vs MPA (or hybrid) was the best choice… I just wanted to express that if Pods only supported SPAs, it would be a lot harder to get adoption, as there is a big shift to hybrid apps (SSR first load) and to leveraging the platform more. I for one would give up trying to use it.

3 Likes

I just noticed that my account apparently had an older signature associated with it, that I wasn’t aware of still existing, so just to be clear: this thread is my own opinion, and not that of my former employer, Inrupt, who in fact recommend “backend for frontends” for Solid applications.

I’ve removed the old signature from my account, so now my posts should definitely appear as just “me” posting.

5 Likes

Sorry about that, I misread your post :bowing_man:

Yes, I 100% agree with that, I don’t think SPAs (or any particular architecture) should be the only choice for making Solid apps. I think Solid apps, like any other app, should use the architecture that makes the most sense for the use-case they are trying to solve.

2 Likes

This has been an excellent discussion, thank you all for your thoughts. A forum member in another thread remarked on how difficult to follow this conversation is for those not up-to-date on the latest acronyms and industry terms. I myself had to google several of them. Obviously these terms are useful to those in the discussion and I wouldn’t ask you to have responded any differently. I wonder though - could I ask one of you to summarize the discussion in a more accessible manner? Y’all make such good points, I think a wider audience could benefit. P.S. This is not in any way a critique of the participants in this thread, rather wondering how this forum can support experts sharing their thoughts on a high level and those just coming on board or those who come to Solid from somewhere other than coding but need to understand these larger architectural issues.

3 Likes

I don’t have time/energy to summarise, but here’s a list of the acronyms used:

  • SSR: server-side rendering, when you create dynamic HTML on the server before sending it to the client.
  • SPA: single-page application, when you send minimal HTML to the client and have javascript do the bulk of the work in deciding what to display and how to fetch data (also common here is “client-side routing”, as opposed to handling routing on the server side).
  • MPA: essentially the opposite of SPAs, so multi-page applications, where the majority of the work is done on the server and javascript is upgrading that experience.
  • WASM: WebAssembly, a new technology / programming runtime… different languages can compile to it, and it has additional security guarantees due to the functionality it allows.
  • SEO: search engine optimisation, which these days is arguably a term devoid of meaning because the rules from search engines keep changing. It’s not “one hard and fast thing” like “javascript is bad”: that once was the case, but modern search engines will run some amount of javascript.
  • BFF: backend for frontend, basically building an API to specifically service your frontend single page application. (Or if you’re of the younger generation I’m told this means “best friends foreva”, but let’s go with the first meaning in this context!)

Hope this helps

9 Likes

I really appreciate the clarification. I think you are talking about something interesting, but I had to look up several of these acronyms to know what they mean.

I would prefer to have the option of running computations on the server rather than in the browser. This doesn’t mean I want all computations to be done on the server. I just imagine a use case where my POD works as a SPARQL service. Let G be an RDF graph in my POD, and assume I want to give multiple applications and users different levels of access to G. For example, let A be an application that is allowed to list my published posts, but not to see my unpublished posts. To this end, I could create a middleware with an API that lists my published posts. However, I would rather not depend on a particular API; I prefer giving uniform access to my data. So, the middleware could expose a graph G’ that only contains information about my published posts. For instance, I could define G’ as a view over G (i.e., the result of executing a SPARQL CONSTRUCT query over G). Graph G’ can be read-only, but for other applications I could allow writing. Indeed, maybe another view allows adding feedback to my posts.
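To make the idea of G’ as a view over G concrete, here is a small sketch that imitates the CONSTRUCT query with plain Python tuples instead of a real RDF library or SPARQL endpoint. The `ex:` predicates, the post identifiers, and the function name `published_view` are assumptions made up for the illustration.

```python
Triple = tuple[str, str, str]

# G: the full graph in the POD, as (subject, predicate, object) triples.
G: set[Triple] = {
    ("ex:post1", "ex:status", "published"),
    ("ex:post1", "ex:title", "Hello Solid"),
    ("ex:post2", "ex:status", "draft"),
    ("ex:post2", "ex:title", "Unfinished thoughts"),
}


def published_view(g: set[Triple]) -> set[Triple]:
    """Build G' from G, keeping only triples about published posts.

    Roughly what this SPARQL would express (prefix declaration omitted):
        CONSTRUCT { ?post ?p ?o }
        WHERE     { ?post ex:status "published" ; ?p ?o }
    """
    published = {s for (s, p, o) in g if p == "ex:status" and o == "published"}
    return {(s, p, o) for (s, p, o) in g if s in published}


G_prime = published_view(G)
for triple in sorted(G_prime):
    print(triple)
```

Application A would then be granted access to G’ only, so it sees `ex:post1` but has no way to learn that `ex:post2` exists.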

In general, my concern is not with the backend-for-frontend strategy, but with having non-interoperable frontends which are not following the same standards.

On the other hand, consider an application B that only accesses posts that are encoded as Atom files. In this case, I don’t see a problem with implementing server-side rendering functionality that exposes my published posts (which are in the graph view G’) as an Atom file. Atom is application-specific, but I have no major concern with it because it is open and popular. Of course, I would be happier if web feed clients accessed G’ directly as RDF data, but in some cases I simply cannot modify all possible clients.

I am not sure whether I am interpreting your issue correctly.

1 Like

So I’m not arguing for all computation to be done on the client, nor on the server, but just that when we do wish to do computation server-side (perhaps in response to events inside our pod, like changed or created resources), the computation should be local to storage and have a strict permissions policy applied.

i.e., you use Pods EU but wish to use App US, it’s much easier to ensure data safety & privacy if App US runs in compute that’s local to Pods EU, as data would never leave Pods EU, and if App US needs additional access to say a web service, then it has to explicitly declare that in an allow list which the user can review & understand the privacy implications of.
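A minimal sketch of how such an allow list could be enforced and audited by the compute platform, assuming a hypothetical `EgressPolicy` object that the sandbox consults before every outbound request. The class, its fields, and the example hostnames are invented for illustration and are not part of Solid.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class EgressPolicy:
    """Hypothetical per-app network policy enforced by storage-local compute."""
    app: str
    allowed_hosts: frozenset[str]  # hosts the user approved at install time
    audit_log: list[str] = field(default_factory=list)

    def check(self, url: str) -> bool:
        """Return True if the request may leave the storage provider; log either way."""
        host = urlsplit(url).hostname or ""
        allowed = host in self.allowed_hosts
        verdict = "allowed" if allowed else "DENIED"
        self.audit_log.append(f"{self.app} -> {host}: {verdict}")
        return allowed


# App US declared one external web service; anything else is blocked and logged.
policy = EgressPolicy("App US", frozenset({"api.weather.example"}))
policy.check("https://api.weather.example/v1/today")
policy.check("https://broker.example/upload")
for entry in policy.audit_log:
    print(entry)
```

Because every attempt is recorded, including denied ones, the user gets the request log and audit history mentioned earlier in the thread without having to trust the app developer’s own infrastructure.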

5 Likes

I agree and this is also how we’re suggesting developers should use.id

Tokens should contain an azp claim, which is the application that was used to retrieve the token. This claim can then be used to restrict access to storage to only specific apps.
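A rough sketch of that check, assuming the token’s signature has already been verified elsewhere and its claims decoded into a dictionary. The function name `app_may_access`, the WebID, and the client identifiers are hypothetical.

```python
def app_may_access(claims: dict, allowed_apps: set[str]) -> bool:
    """Allow access only if the token was obtained through an approved app.

    `claims` is the already-verified token payload; `azp` identifies the
    application that was used to retrieve the token.
    """
    azp = claims.get("azp")
    return isinstance(azp, str) and azp in allowed_apps


# The storage owner has approved exactly one app for this resource.
allowed = {"https://notes.example/id"}

good = {"webid": "https://alice.example/profile#me", "azp": "https://notes.example/id"}
other = {"webid": "https://alice.example/profile#me", "azp": "https://tracker.example/id"}

print(app_may_access(good, allowed))
print(app_may_access(other, allowed))
```

Note that the same user (same WebID) is refused when acting through an app that is not on the list, and a token with no `azp` claim is refused as well.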

1 Like

@ThisIsMissEm I agree with a lot of what you’re saying here w/r/t providing the user a guarantee that their data won’t be used in an unauthorized way, in particular that an app provider can’t persist a copy of the data on their servers and use it either a) within the boundaries of the grant but after it was revoked or b) outside the boundaries of the grant. This seems like a valuable guarantee in all cases, and critical in some (health records and financial data come to mind).

Along those lines I notice that even in the non BFF part (left side) of the diagram in the BFF post, the Pod is in the “Organization” box. If the Pod is in the physical infrastructure of the App Developer won’t they have access to that data as well? If the Pod storing the data is not in the App Dev’s infra (self hosted pod, or separate provider), and it’s a client side only app, then like you said the user might be able to pull the thread on something nefarious, and at that point sue, shame, etc, the dev if they’re doing something “wrong”…but if the storage or compute is in the clear on their infra I’m not sure what the guarantee is. I’m new here, so maybe I’m missing something?

You mention Sandboxes: is there any provision in Solid where a user can require that compute take place in a Sandbox, or some type of secure/attested environment?

This probably gets back to the flaw in the approach of Inrupt and others who are selling pod hosting to enterprises: it runs counter to the decentralisation that Solid needs to fulfil its goals.

An organisation simply hosting pods out of the goodness of their hearts isn’t likely, because you’re basically giving members of the public (fancy) S3 buckets: they can put absolutely anything they want in their pod as long as they own and control their pod.

If the pod can only store the data that the pod provider decides, then that vastly limits the usage of the pod for other applications’ use cases.

Ideally, the Solid Pod and the organisation are independent entities, but that’s not currently the business model we’re seeing touted across the solid ecosystem.

Currently nothing related to storage-local compute or sandboxes really exists within Solid, which I think is ultimately a pretty glaring omission, and likely to lead to Cambridge Analytica type data exposure incidents and other “not great” things happening.

1 Like