Inter-app access control

I think you describe the issue correctly. And I think Solid’s way to deal with this is through CORS (the ‘same origin’ policy, or similar), so that an app has access to your server but can’t write to another app’s data.

I’d like to understand this better myself too, because I’m not sure it is bulletproof. This is currently a hard problem on the Web, so I’m not sure how well anything can protect against it using existing browsers and protocols; that makes it an important question.

I may be wrong, so better verify this, and I may have used the wrong terminology (CORS).

1 Like

I am not fluent in CORS, but I do not think it solves the problem, since the POD server actually wants to serve ajax requests from all apps. See for instance https://github.com/solid/solid-tutorial-angular/blob/master/solid_spec.md#cors---cross-origin-resource-sharing.

The default behaviour on the web is the “same-origin policy” (see https://en.wikipedia.org/wiki/Same-origin_policy) - but in Solid we actually want an “any origin” policy.

And it makes sense to allow it too. Assuming two apps, https://app-a.com and https://app-b.com: to allow both to work with my POD at https://my-pod.com, the POD server has to allow requests from each app. And I do want both apps to be able to access my POD - but each only a subset of it (namely “My contacts” OR “My memos” - not both).
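Concretely, the “any origin” behaviour just means the POD server echoes each app’s origin back in its CORS response headers rather than using a wildcard. A rough sketch, reusing the illustrative hostnames above:

```
# Request issued by app-a's JavaScript against my POD:
GET /contacts/ HTTP/1.1
Host: my-pod.com
Origin: https://app-a.com

# Response: the server reflects the requesting origin ("*" is not
# allowed when credentials are involved):
HTTP/1.1 200 OK
Access-Control-Allow-Origin: https://app-a.com
Access-Control-Allow-Credentials: true
```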

I am pretty sure Solid apps could identify themselves by a certificate exposed by the server they are served from - and Solid apps could in principle use that to make authenticated requests to my POD where they pass BOTH my credentials, to access my data, AND some “app token” that can be verified against their corresponding server-side certificate. The POD server could then slice the data by assigning each Solid app its own private share of my POD data.

And on top of that we then need the ability to grant different apps access to each other’s data.

/Jørn

3 Likes

Thanks, maybe one of the Solid folk has better insight on this issue. We have the same concern on SAFE Network, and this is one reason why we have a SAFE-specific browser which, by default, won’t let apps access http(s):// at all (only safe://). But even so, it is hard to stop an app from storing data in a place that leaks it. It’s a smaller problem for us, because an app must have access rights to the place it writes - which in effect means the user running the app must have those rights (rather than the app) - but I think a dodgy app could write to a place where, say, anyone can write.

It is easier to deal with, though, both by limiting the amount of data an app can write on your behalf, and by detecting this kind of behaviour.

I expect further measures will be added to prevent this in SAFE (such as limiting an app to writing only to storage owned by the user), but at this time that is it.

I think this is an important privacy issue so I’m hoping it can be made watertight.

1 Like

That thing about CORS and allowing any website to access my POD, combined with this part of the spec https://github.com/solid/solid-tutorial-angular/blob/master/solid_spec.md#finding-out-the-identity-currently-used, scares me a bit.

The referenced spec states:

Regardless of the authentication mechanism that was used during an HTTP request, the server must always return a User header, which contains a URI representing the user’s identity

It makes me believe that any website on the great Internet can embed a piece of JavaScript that makes an ajax request to, for instance, the solid.community server and 1) gets my WebID back, and then 2) uses the rdflib.js code to read all my data, as if I had requested it manually, without me ever noticing it or giving consent to it.

Can that really be true?

Okay, it turns out the “User” header is for WebID-TLS only (if it works at all - that spec reference was three years old). There is no User header when signing in using OpenID Connect.

I just spent a few hours looking at lists of pre-existing RDF ontologies for use in modeling the data for an app idea. The vast preponderance of them had their origin in government and academic settings, where the overriding concern was how to make the data easily shared and easily combined/mashed. The privacy concerns of individuals, in apps designed with a ‘Person’ as the data owner, just don’t exist in those contexts, and as a result have not yet been made a priority. It’s clear that the answer here is ‘we haven’t figured that out yet - what do you think we should do?’

If I, as an institution, have written the application that I am running against my Solid server, it’s probably ok to assume that I don’t need to worry overmuch about privacy access controls. But individuals need methods to ensure that applications from 3rd parties can be trusted to enable and respect privacy constraints, while still supporting ease of data sharing / mashup, when desired.

I think it’s something that needs its own working group to focus on the issue. Any community protocol on establishing one?

2 Likes

In thinking about how access security might be implemented I’ve been considering the data I want to have for my related apps.

As the owner of the data ‘personalProperty’ and ‘friends’, I want to allow my friends to see a filtered list of my personal property, limited to items tagged ‘shareable’, ‘toGiveAway’, or ‘forSale’ - allowing only apps that support such filters to expose my property data, and only to my friends.

As the owner of the data ‘personalProperty’, I want to allow my insurance company to see a filtered list of items that were stored in my home before it was destroyed by fire, including linked (meta)data: acquisitionDate, photos, modelIds, serialNumbers, storageLocation, cost, replacementValue, receipts, wasGift (but not giftedBy) - allowing only apps that support such filters to expose my property data to my insurance company and related persons, such as independent agents or their/my legal representation.

As the owner of the data ‘personalProperty’, I want to create a Schedule B form in support of my federal bankruptcy filing. I want to allow a selected application to produce an ordered list of personal property including only the metadata subType, description, ownershipClassification, currentValue, and appraisal - allowing only applications that support this filter to expose this data to their report-formatting app or my legal representation.

I have other use cases for this data, but I think I’ll stop at these three for now. It seems to me that access control needs to be via query or ‘view’ in order to support the fine-grained privacy control that data owners who are individuals have come to expect.
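As a sketch of such a ‘view’, the friends filter from the first use case could be expressed as a query that is the only thing friends’ apps are ever allowed to run against the data. The ex: vocabulary here is made up purely for illustration:

```sparql
PREFIX ex: <https://example.org/ns#>

# Hypothetical 'friendView': only property items carrying one of the
# shareable tags are visible through this view.
SELECT ?item ?description WHERE {
  ?item a ex:PersonalProperty ;
        ex:description ?description ;
        ex:tag ?tag .
  FILTER (?tag IN ("shareable", "toGiveAway", "forSale"))
}
```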

Next up for discussion: create, update, and delete permissions on data. In the context of individual ownership of data, does permission to create/update a type of data imply permission to delete it? Not necessarily. What if meta/linked data not supported by your app exists? What end-user UX messaging should exist to explain a delete failure in such cases?

3 Likes

Further to the ‘friends’ filter/view: if the user controls what friends can see, does the user also control which apps their friends can use to see the data? I can imagine a generic ‘friendView’ filter under the complete control of the user, to which any app being used by a friend could then be granted permission to view the data.

The way sharing is done today is by controlling access to resources - e.g. files. You grant access to agents, whether those are users or an app acting on behalf of a user. The UX of this is a bit crude today, and we’re working on mechanisms/models that enrich this approach.

A short contribution to a long thread, but I hope this helps? (I’m still learning the model myself, so I might not be completely understanding it all yet, so take it with a grain of salt.)

1 Like

So, how and by whom is this being addressed? Is there a working group? Should there be one? I’d like to follow the progress, and perhaps contribute. I would certainly like to contribute my use cases outlined above. I find it helps to have a consistent set of examples to center discussion around, in order to save time, and ensure everyone is on the same page.

I created a thread for this question here:

2 Likes

Right now Solid is all about filtering by WebID. It is as though you have a choice about who you invite into your house, but once they are there, you also need some choice about what it is that they do. I think this is very relevant to questions about what data is harvested and which 3rd-party apps have access to which data. So we need filters not just for people, but for ideas. We need to be able to set rules based on ideas, not just based on people or organizations of people. For this, a capability to do natural-language processing seems pretty fundamental - just as fundamental to privacy as filtering by people and their IDs.

Well, it is slightly worse than that - you do not even have to invite them! If they know where you live, they can just walk in all by themselves - provided that you visit their website.

Scenario:

  1. It is normal for people to make their WebID publicly available on the web together with some kind of contact information, just like being listed in a phonebook (see for instance Ruben’s friend list https://ruben.verborgh.org/profile/#me or https://www.npmjs.com/package/solid-server#contributing).

  2. Using the public information and some social engineering, an attacker can craft a message to those people and ask them to visit a certain web page.

  3. Once they visit the webpage, it can use rdflib.js to access their data in the background - and if they happen to be logged in, the website can read/write/modify/delete any and all of their private data.

This is possible due to the fact that there is no application validation - I do not grant access to an application. This is very unlike Facebook and others, where I explicitly have to grant each and every third-party application access to my data before they can, well, access it.
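Step 3 of the scenario, spelled out as a minimal sketch. The pod URL is illustrative, and plain fetch stands in for rdflib.js:

```javascript
// Hypothetical attack: any page the victim visits could try a
// credentialed cross-origin request against their pod.
function craftAttackRequest(podResource) {
  return {
    url: podResource,
    options: {
      method: 'GET',
      credentials: 'include',            // ride on the victim's session
      headers: { Accept: 'text/turtle' }
    }
  };
}

// A malicious page would then simply run:
//   const { url, options } =
//     craftAttackRequest('https://victim.solid.community/inbox/');
//   fetch(url, options).then(r => r.text()).then(exfiltrate);
// Note, however, that the browser forcibly attaches an Origin header
// identifying the attacker's site - which, as clarified later in this
// thread, is exactly what the pod's ACL check uses to refuse it.
```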

Things to understand about the current situation as of 2018/10: you need to distinguish between different sorts of apps.

  • If you install a native app on your Mac or PC, under macOS, Linux, etc., then it runs under your control with full privilege, like Word or Quicken or Chrome. The OS doesn’t give us the ability to stop it accessing anything you as a person can access. Desktop apps are trusted.

  • If you run a web app in a web page, the browser does severely limit its access to anything on the web, using the Same Origin Policy, and in a complicated way allows limited access between different web apps, where a web app is defined by its domain (“origin”), like foo.solid.community. Web apps are not trusted. That’s largely why each user gets a different domain. In Solid, if you try to get at data from a web app, then the web app AND you must BOTH have the access required in the access-control system. You do this in the ACL by making an Authorization which has an origin property. Because that is a bit fiddly, we are also thinking about giving the publisher of data a way to say, in their public profile, a list of web apps they trust and will allow users to use with their own data.

  • If an app runs as a service at a remote place on the internet (some other computer out there), then it can only get access to your data by logging in just like a person. It has to be set up with a way of storing the password (or certificate) it uses in a secure way. You then give it access just like a person. You can even put it in groups. So you can allow myimaginaryphotoprintingservice.com to have read access to your photos. Remote services are agents which are authorized just like people.

  • If an app runs on a locked-down system like iOS, the operating system can limit its access. As far as I know, iOS, while it does allow you to block an app from accessing the web, does not yet have a way of warning the server which app is doing the access (like the browser does for a web app). The Solid code cannot control iOS apps; Apple does. iOS apps are trusted if (but only if) you give them network access.

So when we talk about authentication of apps, we have to distinguish between these situations. Web apps are the most pressing case, the most complex, and the place where we can potentially add more functionality specifically to the Solid platform.
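As a rough sketch, the web-app case above (user AND app must both be allowed) can be written in an ACL resource as an Authorization carrying an origin property. All URIs here are illustrative:

```turtle
@prefix acl: <http://www.w3.org/ns/auth/acl#>.

# Both conditions must hold: the request is made by this agent,
# AND it arrives from this web app's origin.
<#photosAccess>
    a            acl:Authorization ;
    acl:agent    <https://alice.solid.community/profile/card#me> ;
    acl:origin   <https://app-a.com> ;
    acl:accessTo <https://alice.solid.community/photos/> ;
    acl:mode     acl:Read, acl:Write .
```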

10 Likes

Ah, thanks :slight_smile: So there actually is a (future-idea) way to protect against malicious web apps, by including the value of the “Origin” header in the access control list. That makes it possible to solve the privacy problem.

The next step would be to make that solution a bit more accessible, for instance by 1) disallowing access by default, and 2) adding a standard protocol/workflow for granting access in an easy way.

Again, thanks for the clarification.

2 Likes

But now we have a similar situation with “remote services”, that is, applications running on servers that login as users. How do I grant such services access to my data?

Yes, the server-side apps can use some credentials to sign in as normal WebID users … but I do not want to hand over my login credentials to some random server on the web. And even if I did, it could easily spoof the “Origin” header any way it wanted (unlike apps running inside the browser), making it impossible to restrict its access to parts of my data.

That problem is normally solved using good old OAuth2, where I grant access to the (server-side) application using the Authorization Code grant with a couple of redirects.
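For reference, the Authorization Code flow being suggested runs roughly as follows; the endpoints and parameters are illustrative, not something Solid specifies today:

```
1. The service redirects my browser to my POD's authorization endpoint:
     GET https://my-pod.com/authorize?response_type=code
         &client_id=print-service
         &redirect_uri=https://service.example/callback
2. I log in and approve - at my POD, never at the service.
3. The POD redirects back with a one-time code:
     302 Location: https://service.example/callback?code=a1b2c3
4. The service's backend exchanges the code using its own secret:
     POST https://my-pod.com/token
          grant_type=authorization_code&code=a1b2c3&client_secret=...
5. The service accesses my data with the resulting access token;
   my password never leaves my POD.
```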

1 Like

So there is actually a (future idea) way to protect against mailicious web apps, by including the value of the “Origin” header in the access control list.

No, that is how it works now. Right NOW, web apps are blocked by default.

No, you never do that. The remote service has its own secret key/password that you never get to see; it logs in with ITS credentials. It has a WebID. You only give it the read/write/append access you want to explicitly give it, using the ACL system as it stands right now.
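For example (all URIs hypothetical), granting a remote printing service read access is just another agent entry in the ACL, using the service's own WebID:

```turtle
@prefix acl: <http://www.w3.org/ns/auth/acl#>.

# The service authenticates with its own credentials and WebID;
# it is authorized exactly like a person would be.
<#printServiceAccess>
    a            acl:Authorization ;
    acl:agent    <https://myimaginaryphotoprintingservice.com/profile#it> ;
    acl:accessTo <https://alice.solid.community/photos/> ;
    acl:mode     acl:Read .
```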

It is really important to distinguish the different cases. The origin header is only used with web apps. Web apps are not trusted - now, in Solid, right now. The browser always restricts the script as to what it can do, and alerts the server. For a web app, the browser forces the Origin header, and the app can’t turn it off. For any other case it is unused and irrelevant. You wouldn’t bother spoofing it; you would remove it, and then be a trusted app.

Yes, that is typical in a lot of systems. We could add this workflow. We would have to design it so that the remote service has a webid, as we would need to put that into the ACL system to grant it access to things.

(Down the road, it may be useful to have a form of profile for an app logging in as an agent, where it can declare itself as (say) a corporate agent rather than a human agent. It could also, down the road, carry digitally signed certificates that the service has been vetted as being benevolent, and things like that.)

FYI this has sparked a parallel discussion on the SAFE Network forum where we have the similar concerns about privacy and security, while enabling the switch to user ownership of data, and allowing apps access on a permissive basis.

However, it led me to see how much better the situation would be, in both security and user experience, if we could allow apps free rein to access our data, but have the ability to monitor and control where they are able to send it (other than writing to our own user storage).

I’m not sure if that’s feasible, but if it is, I think we could fully realise the vision of users owning their data, with app developers competing on the basis of features and functionality instead of on the ability to manipulate users and exploit our data. More in this post.

5 Likes

we’re working on mechanisms/models that enrich this approach.

So, how and by whom is this being addressed? Is there a working group? Should there be one? I’d like to follow the progress, and perhaps contribute. I would certainly like to contribute my use cases outlined above. I find it helps to have a consistent set of examples to center discussion around, in order to save time, and ensure everyone is on the same page.

We are working on the process of how we should be public about our work and how to let the community contribute to work on cases like this. We should have something ready soon.

I’ll note the new thread you’ve created, and try to link to whatever public resource we make available as soon as it is matured to that point. Hopefully it’s not long until we can do that.

1 Like

Thanks for the clarification. I can see that my test web app is incapable of reading my private data (the inbox) unless I (1) give access to “everyone” - or (2) retry the same request outside the web browser, without the Origin header. My assumptions so far were obviously wrong and had only been tested on the public part of my data (which anyone can read anyway).

The next problem is more about the usability of the data browser (which has one of the most obscure interfaces I have seen so far): how do I grant access to a web app (adding my test web app’s Origin value)? I can drag that globe to the intended access panel (how do you expect users to figure that out?) - but where is the “Enter WebID here” kind of input for granting access? Maybe I am just supposed to enter the ACL triples myself at this point?

This is where the OAuth2-style mechanism comes in - it would be very useful to let the web app ask for permissions on a specific folder through something like OAuth2.

1 Like