Encrypted POD? Is solid designed with this in mind? If not, would it be possible to add?


#1

@taggart wrote:

My organization has several applications that might be interesting candidates to move to Solid. We currently store the application data on our own backends, but we could potentially become a POD provider and allow our users to use us or another provider instead. One of our requirements for doing so would be a way to encrypt data on the POD provider so that the POD provider itself cannot read it. Or, put in more generic terms, access control should be cryptographically ensured.

Some of the reasons for doing this include:

  • organized crime, nation-state, etc hacking of services to steal user data
  • data breach through misconfiguration, API bugs, etc
  • physical theft of equipment
  • see https://haveibeenpwned.com/
  • malicious POD providers monetizing user data, injecting ads/malware/etc

The best way to protect the users is for the POD provider to not have access to the data.
Is Solid designed with this in mind? If not, would it be possible to add?

Thanks

There has been some discussion on https://github.com/solid/solid/issues/170 with @panleya and @tatwater, and I would like to invite more here.


#2

I’m curious (and have a rather limited knowledge of JavaScript crypto) - how could that be done in JavaScript? Is there a secure way to create and store an encryption certificate/secret in the browser, such that the browser can encrypt data before sending it to the POD (and vice versa)?

How would such a secret be moved to a different device such that I can access my POD from different devices?


#3

OK, so I interpret this question as a desire to not need to trust your POD provider.

Now, don’t take my answer as authoritative, and not as an official Inrupt answer. It is more like a braindump, because I think this question is really important and deserves an answer, and it has been open for long enough without one. Anyway, before I try to answer, let me just note a few things:

  1. You don’t need to trust any random POD provider, you can be your own, like I am, by installing the Solid server on hardware under your control. I think this should satisfy many of those who do not want to trust a POD provider, but I realize, not necessarily the poster.
  2. Stopping short of full end-to-end cryptography, the connection is always TLS in Solid, so that part is encrypted. Moreover, it is trivial to encrypt the file system the data resides on. However, this kinda misses the point, since the data will still be in clear text at some point in the Solid backend, and can be intercepted by an intruder or the POD provider itself.
  3. App developers can always encrypt on the client side, and make sure the literals in the RDF are encrypted with the user’s key.
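
A minimal sketch of point 3, not an official Solid API: the app encrypts a literal with AES-GCM through the Web Cryptography API before writing it to the POD, so the server only ever stores ciphertext. The helper names (`generateUserKey`, `encryptLiteral`, `decryptLiteral`) and the `iv.ciphertext` base64 storage format are assumptions made for illustration; only the `crypto.subtle` calls are standard. It assumes a browser or a recent Node version where `crypto`, `btoa` and `atob` are globals.

```typescript
// Hypothetical helpers; only the crypto.subtle calls are standard Web Crypto.
const subtle = globalThis.crypto.subtle;

// Create a 256-bit AES-GCM key that stays with the user, never with the POD provider.
async function generateUserKey(): Promise<CryptoKey> {
  return subtle.generateKey({ name: "AES-GCM", length: 256 }, true, ["encrypt", "decrypt"]);
}

function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) binary += String.fromCharCode(bytes[i]);
  return btoa(binary);
}

function fromBase64(text: string) {
  const binary = atob(text);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return bytes;
}

// Encrypt one literal. A fresh 96-bit IV is generated for every call and
// stored next to the ciphertext (assumed format: "<iv>.<ciphertext>", base64).
async function encryptLiteral(key: CryptoKey, plaintext: string): Promise<string> {
  const iv = globalThis.crypto.getRandomValues(new Uint8Array(12));
  const data = new TextEncoder().encode(plaintext);
  const ciphertext = await subtle.encrypt({ name: "AES-GCM", iv }, key, data);
  return toBase64(iv) + "." + toBase64(new Uint8Array(ciphertext));
}

// Reverse of encryptLiteral. Rejects if the key is wrong or the stored
// value was modified, because AES-GCM authenticates the ciphertext.
async function decryptLiteral(key: CryptoKey, stored: string): Promise<string> {
  const [ivText, ciphertextText] = stored.split(".");
  const plaintext = await subtle.decrypt(
    { name: "AES-GCM", iv: fromBase64(ivText) },
    key,
    fromBase64(ciphertextText),
  );
  return new TextDecoder().decode(plaintext);
}
```

Note that only the literal value is hidden here; the subject and predicate of each triple stay visible to the server.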

To really address the concerns of the poster, we would need to have the data encrypted on the server side all the way down to disk with a key that the user controls. If the user is the only one who would have access to the data, that’s easy enough, but that wouldn’t be very social. I’m not a crypto guy, but it appears that TLS doesn’t make this easy: the TLS connection terminates before we get to the disk. Again, I’m not a crypto guy; there may well be solutions to this, and it would be interesting to hear about them.

However, there are two other things that make it hard:

  1. People want to be social, they want to share data.
  2. More advanced apps are likely to soon require a more advanced query system, we might want to use e.g. SPARQL.

Without being a crypto guy, I could imagine that we build a protocol on Solid that uses the Web Access Control for key management to enable sharing keys with the people that you share data with. It doesn’t seem to me that this would require very substantial changes to Solid as it is today. It would require additional protocols, but I suspect it could be done with additions to Solid rather than a more extensive change, and so, to some extent the design of Solid should be accommodating.
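
One way such key sharing could work, as a sketch only and not an existing Solid protocol: the data is encrypted once with a random content key, and that content key is then wrapped (encrypted) separately for each person it is shared with, using their public key. The function names and the idea of keying the wrapped copies by WebID are assumptions for illustration; where the wrapped keys are stored and how access to them is controlled is left open.

```typescript
// Hypothetical envelope-encryption helpers built on standard Web Crypto calls.
const subtle = globalThis.crypto.subtle;

// Each person holds an RSA-OAEP key pair; the private key is non-extractable.
async function generateRecipientKeyPair(): Promise<CryptoKeyPair> {
  return subtle.generateKey(
    { name: "RSA-OAEP", modulusLength: 2048, publicExponent: new Uint8Array([1, 0, 1]), hash: "SHA-256" },
    false,
    ["wrapKey", "unwrapKey"],
  );
}

// The content key encrypts the actual data. It must be extractable so it can be wrapped.
async function generateContentKey(): Promise<CryptoKey> {
  return subtle.generateKey({ name: "AES-GCM", length: 256 }, true, ["encrypt", "decrypt"]);
}

// Produce one wrapped copy of the content key per recipient.
// `recipients` maps an identifier (assumed here to be a WebID) to a public key.
async function wrapForRecipients(
  contentKey: CryptoKey,
  recipients: Record<string, CryptoKey>,
): Promise<Record<string, ArrayBuffer>> {
  const wrapped: Record<string, ArrayBuffer> = {};
  for (const webId of Object.keys(recipients)) {
    wrapped[webId] = await subtle.wrapKey("raw", contentKey, recipients[webId], { name: "RSA-OAEP" });
  }
  return wrapped;
}

// A recipient recovers the content key with their own private key.
async function unwrapContentKey(wrapped: ArrayBuffer, privateKey: CryptoKey): Promise<CryptoKey> {
  return subtle.unwrapKey(
    "raw",
    wrapped,
    privateKey,
    { name: "RSA-OAEP" },
    { name: "AES-GCM" },
    false,
    ["encrypt", "decrypt"],
  );
}
```

In this design, revoking someone's access means generating a new content key and re-encrypting, since a recipient who already unwrapped the old key can keep it.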

Evaluating queries over encrypted data is very much an active area of research in academia. It has been going on for a number of years, and I have noted that quite a lot of this research revolves around RDF data on the Web, and thus, much of the research that is going into this is immediately applicable to Solid.

In conclusion, I think that Solid can enable a future where you don’t need to trust your POD provider, but right now, the shortest path to that is to install it on your own hardware. Beyond that, it will require quite a lot of work, but I certainly see the value of thinking in that direction.


#4

Our users are non-technical and aren’t the kind of people who could set up their own POD provider; they count on us to provide them communication services in the most privacy-protecting ways possible. We might be able to educate them about what a POD is, explain the concepts of providers and that they have a choice in who they choose as a provider. If we also knew that the POD provider couldn’t see the data, we could explain that rather than give stern warnings that “you better strongly trust your POD providers because they will have access to all your data in the clear”.

It sounds like it would always be possible to build the end-to-end encryption into our web apps on top of Solid, but I was hoping there would be some way Solid could provide this in a standard way (which would also help prevent app developers from getting it wrong; crypto is hard to do right).


#5

Thank you for your insight, @taggart! Indeed, I think there is a pretty strong case for end-to-end crypto-enabled PODs, as I also think there is a pretty strong case for verifiable claims, zero-knowledge proofs and that kind of stuff in Solid. Maybe we should see this as a whole. I think it would take some time to get there though.


#6

Let’s say that I want to save my files to a server that is not fully trusted.

For me, integrity is the most important security aspect. If I share some content publicly, I would like some guarantee that the content is not tampered with. This would mean that my (web) application has to cryptographically sign the content, and the web framework needs methods to verify the integrity.
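
A minimal sketch of that sign-then-verify flow with the Web Cryptography API, assuming ECDSA on the P-256 curve. The function names are hypothetical, and how the public key is published and trusted is deliberately left out.

```typescript
// Hypothetical signing helpers; the crypto.subtle calls are standard Web Crypto.
const subtle = globalThis.crypto.subtle;

// The private key is non-extractable and stays with the author;
// the public key is what readers need in order to verify.
async function generateSigningKeyPair(): Promise<CryptoKeyPair> {
  return subtle.generateKey({ name: "ECDSA", namedCurve: "P-256" }, false, ["sign", "verify"]);
}

// The author's app signs the content before uploading it.
async function signContent(privateKey: CryptoKey, content: string): Promise<ArrayBuffer> {
  const data = new TextEncoder().encode(content);
  return subtle.sign({ name: "ECDSA", hash: "SHA-256" }, privateKey, data);
}

// A reader's app checks the downloaded content against the signature.
// Resolves to false if either the content or the signature was altered.
async function verifyContent(publicKey: CryptoKey, content: string, signature: ArrayBuffer): Promise<boolean> {
  const data = new TextEncoder().encode(content);
  return subtle.verify({ name: "ECDSA", hash: "SHA-256" }, publicKey, signature, data);
}
```

The signature can be stored on the untrusted server next to the content; only the public key has to come from somewhere the reader trusts.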

Closely followed comes confidentiality. I have some content I would like to keep private. This means that my (web) application has to encrypt the content before sending it to the server.

For this to work, I would need a cryptographic key that I can use for signing and encryption. The key cannot be stored on the same server as my content. So I would need a trusted party that can safely hold my key.

For instance, the identity, cryptographic keys and the content/files could be serviced by different organizations… Maybe I would pay my identity provider and key holder a bit more than I would for file storage…? :thinking:


#7

An opinion on encryption from @JornWildt in another thread: Basic question about authentication


#8

In short, I think it should be possible to encrypt PODs and it should be on top of existing specs, not changing them.

Already a couple of people (@JornWildt, @hovenko) are suggesting storing keys separately from data itself. This is relevant (as said) where data is large and cheaper to store and keys are small but expensive to store in a trusted place. This is an attempt to gather and expand on previous comments.

  1. Signing is the easier part as no secrets must be distributed, but requires all apps to verify the signatures on all data. The public key is still stored in a separate trusted location (e.g., with the Web-ID). This ensures data is not tampered with. Recommendations early on in the documentation to generate, store and verify signatures will help.

  2. Encryption for privacy requires keys to be distributed. Encryption keys will be stored separately. Sending these is not good as each accessor should be managed separately. Storing the keys on another (trusted) Solid server would make sense but is not required. ACLs are handled by the key storing server. Two ways for doing this so far:

2.1. Encrypt content only but leave the links/meta data in plain text. Hosts can still be Solid servers.

2.2. Encrypting entire PODs makes the hosts plain web hosts only, and presumably another Solid server will have the keys.

Concerns

  1. Server functionality – encryption of entire PODs or even text or other browsable content will prevent server functionality. As mentioned above, encrypted PODs are hosted on a dumb web server and Solid functionality will be handled by another Solid server, possibly the key-storing one, but probably yet another. Equally, apps can do this. Some would even argue for apps doing the processing in any case (“all is frontend with SOLID”).

  2. SPARQL is limited by encryption – yes, of course, but if the app doing the query has all the keys, it can still do the query. Admittedly, this is not straightforward now, but a layer should be added to handle the decryption (and another one for verification).

  3. Queries over encrypted data are probably relevant.

  4. Leakage – if an app provides summary data but preserves individually private data, there is a risk of leakage, but I think this is up to the app, not Solid, to resolve.

In conclusion, I think it would be good to do the following (in order):

  1. Documentation: add recommendations in the Solid documentation for signing and for public key storage and advertising.
  2. Code: add client library functionality for data verification.

Encryption looks much more difficult to agree on but should be easy to add without changing library specs. Signing should be possible later too, but looks much easier and gives credibility that security is considered early on.


#9

Hi,
Just one comment on this: “I think that Solid can enable a future where you don’t need to trust your POD provider, but right now, the shortest path to that is to install it on your own hardware. Beyond that, it will require quite a lot of work, but I certainly see the value of thinking in that direction”

I agree, but building on top of existing open source software might not be as difficult as one might think. Pushing security off into the future leads to problems later - look at DNS, SMTP, HTTP etc where the security had to be bolted on, and it has taken decades to do so and it is not complete in many cases. For example, spoofing senders, while humorous back in the 1980s, is still causing problems today with spam, phishing etc, and while there have been proposals to deal with it, none have succeeded and now there are tons of bolt-on techniques to deal with suspicious emails, spam or whatever. Adding it in early will save a world of hurt later.

Ditto for decentralization for DNS, and distributed PODs - having central points of failure or (perhaps worse) central points of pressure (e.g. from governments around the world) is already a problem. If someone in China or Russia wants their POD at MIT, they shouldn’t have to worry about a government hack from China, Russia, the US, UK, Iran, France or anywhere. Likewise, if someone from the US wants their POD in Australia, they shouldn’t have to trust the Australian government. Whatever the jurisdiction, if the POD is solely located there, it is at risk of being either hacked or taken down by the government of that jurisdiction, with no automatic network backup. When the contents are plaintext, it will make it easy for a government to say “delete this POD or we’ll take your entire server offline.” One doesn’t want to build a new system while leaving the issues of censorship, centralized control, and surveillance unaddressed.

I think it is clear that having PODs in plaintext on a server means they will be hacked and lost at some point, either through API bugs or, more likely, problems in the server software (e.g. many bugs, including Heartbleed) or hardware (e.g. Spectre, Meltdown). The lesson since the 1970s is that security needs to be built-in and even then won’t be perfect, but at least there will be a default.

To the question above, there are a good number of crypto libraries in JavaScript - e.g. bitcoinlib-js, twister-lib-js as two examples - which can be leveraged, but something to note is that if the app developers have to implement the crypto themselves vs having it “built-in”, it will cause screw-ups. Likewise, if it isn’t included, a lot of developers won’t use it - they’ll do it “for version 2”.

There are decentralized domains and decentralized hosting so you don’t get tied to a particular provider or lose your POD if a provider goes down, is hacked, or is taken down. :slight_smile:


#10

I’m pretty new to the whole SOLID topic but I find the idea fascinating and very attractive.

However one of the first things that came into my mind was: “how will the POD - and hence my data - be protected server sided?”.

From what I understand, SOLID wants to bring back control over your personal data by storing them in a place where every person can control exactly who is allowed to access those data (the POD). I think it is safe to say that over the years we all learned that we cannot “trust” any system or institution with taking care of our data (basically the points @taggart and @centaur outlined).

If we assume that the “average” user will try to store most - if not all - of his personal data in one POD, the POD becomes a real nice and shiny gold nugget for data sellers, hackers etc. In some way it would be a lot more attractive and worth a lot more effort to get into possession of a well maintained POD with personal data.

Of course I could just host the POD myself (naively assuming my non-professional home hosting environment is perfectly safe) but I don’t think this solution is practicable for the broad mass of users since:

  • they are not capable of setting up a server

  • they don’t want to deal with the struggle of setting up a server

  • their internet provider doesn’t allow them to host a “website like” service

So in order to introduce POD serving “as a service”, people need to be able to trust that their PODs are safe, no matter whether the hosting environment is “corrupted” by the hosting company’s intention or by a security breach.

I hope that POD encryption (or any other kind of securing the POD) makes it to the agenda.


#11

As a member of Netsso.com, I can load my files onto my Dropbox with (Netsso-provided) encryption, done properly on my local machine (whatever machine, no local software required). That is, when logged in to Netsso. Then I can download the files to any other machine, no local software required and they arrive decrypted, via my Netsso login. Or I can send them, encrypted, to any other Netsso member, i.e. sharing encrypted data, end-to-end. And all files can be managed individually via remote links, no need to log in to Dropbox, etc.
All my files may be stored in third party storages, managed via Netsso. Always encrypted, unreadable by storage administrators. Does this make Netsso a POD? Could Solid do the same?


#12

Yes, but people don’t always want to be social and they don’t always want to share data.


#13

Take a look at the Web Cryptography API
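
For the earlier question about moving a secret between devices, one possible approach with that API, sketched under assumptions rather than recommended as the answer: derive the encryption key from a passphrase with PBKDF2, so that any device where the user types the same passphrase arrives at the same key and no key file has to be copied around. The function names and the iteration count are assumptions for illustration; the security of this scheme depends entirely on the strength of the passphrase.

```typescript
// Hypothetical helpers; the crypto.subtle calls are standard Web Crypto.
const subtle = globalThis.crypto.subtle;

// The salt is random but not secret. It can be stored next to the encrypted
// data so that every device uses the same one.
function newSalt() {
  return globalThis.crypto.getRandomValues(new Uint8Array(16));
}

// Derive a non-extractable AES-GCM key from a passphrase.
// The same passphrase, salt and iteration count always give the same key.
async function deriveKeyFromPassphrase(
  passphrase: string,
  salt: BufferSource,
  iterations = 600000, // assumed work factor; tune for the target devices
): Promise<CryptoKey> {
  const baseKey = await subtle.importKey(
    "raw",
    new TextEncoder().encode(passphrase),
    "PBKDF2",
    false,
    ["deriveKey"],
  );
  return subtle.deriveKey(
    { name: "PBKDF2", salt, iterations, hash: "SHA-256" },
    baseKey,
    { name: "AES-GCM", length: 256 },
    false,
    ["encrypt", "decrypt"],
  );
}
```

Because the derived key is created as non-extractable, page scripts can use it to encrypt and decrypt but cannot read the raw key bytes back out.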