Motivation of Solid

Hi, everyone,

I’m beginning to learn about Solid specifications for my final project by trying to figure out what Solid is designed for and what problems it aims to solve. I’m doing this by formulating my own questions about Solid and, in the end, answering them. This process will help me answer the final question: “Is Solid a good idea for my final project, and what should I learn to thoroughly understand it?”

So far, I believe Solid was created for two primary reasons:

I. Main Motivation: Empowering Users to Fully and Securely Control Their Personal Data:

  • To shift towards integrating social features into apps.
  • To reduce the dominance of major tech companies like Facebook, TikTok, and Google, which offer free services in exchange for vast amounts of user data.

The list of questions I’ve developed for this motivation includes:

  1. How can we prevent apps from copying data? What if a company deliberately creates multiple different apps to gain extensive access to user data? Will companies collaborate to aggregate large amounts of user data?
  2. Will there no longer be any free services available?
  3. How should the management and storage of vast amounts of data, particularly data not associated with specific individuals, be handled? For example, how should data collected by Google from user searches be managed?
  4. How does this differ from a data server that provides a user interface for data management and permissions?
  5. Will Pod servers become “social big tech” platforms?
  6. If users manage and grant data permissions through a data management app, does this mean that compromising the app would grant control over the data?
  7. Is the current trend of integrating social elements into the web and apps going in the wrong direction when everyone can create multiple WebIDs, fake WebIDs, and impersonate others to claim a WebID segment?

II. Interoperability: All Apps Can Share Data and Work Together.

  1. What is the workflow for this cooperation?
  2. How can we ensure that every member in the ecosystem works honestly? How can we be sure that an app with the right to update data does so correctly?

After listing my questions, I realize that there are some core components and concepts I need to learn about Solid specifications to answer them:

  1. Identification
  2. Storage
  3. Data in Solid
  4. Pods-Apps Communication

Thank you for taking the time to read all of this :slight_smile: I would greatly appreciate your feedback and contributions:

  1. Is my general, basic knowledge about Solid correct? Am I on the right path in terms of learning about it?
  2. Can you help me answer any of the above questions or recommend resources that I can refer to for self-study?
  3. Share your own questions about Solid specifications.
  4. Let me know if there are any unresolved questions about Solid up to now.

Thank you. I appreciate your assistance and input! :blush:


Hi @tranbau

I see this post is already a couple weeks old, and I saw that you were able to get at least one answer in another post, but for what it’s worth, I think that you’re off to a great start, and you’re asking very good and important questions. If you’re reading the protocols, you’re doing a pretty deep dive, IMO.

You asked a very important question that relates to preventing aggregation and abuse of data, and this is something that I also spend a lot of time thinking about. For me, it helped to think about the nature of personal data on the internet currently, how it is handled, and how Solid compares.

To do this, I came up with the list below, which divides data into five types based on who creates it and who it is shared with. Going through the list, and thinking a lot about it, I believe that the Solid protocol makes it possible to have the same or a higher level of data security than what exists today, in every instance; however, the actual level of security in a specific instance will depend on the implementation and on how consistently people apply best practices. For example, as a Solid pod owner, I may have to choose between two or more apps that serve a similar purpose, and I can choose the app that requires the minimum read/write privileges to my pod, or not, based on my preferences.
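
To make the "minimum privileges" choice concrete, here is a minimal sketch in Python. The mode names (Read, Append, Write, Control) follow the access modes used in Solid's Web Access Control; everything else is hypothetical and invented for illustration: the app names, the modes each app requests, and the `least_privileged` helper. Counting modes is only a crude stand-in for "least privilege".

```python
from enum import Flag, auto


class Mode(Flag):
    """Access modes, named after those in Web Access Control."""
    READ = auto()
    APPEND = auto()
    WRITE = auto()
    CONTROL = auto()


# Hypothetical apps and the modes each one asks for on my pod.
requested = {
    "calorie-tracker-a": Mode.READ | Mode.WRITE | Mode.CONTROL,
    "calorie-tracker-b": Mode.READ | Mode.APPEND,
}


def mode_count(modes: Mode) -> int:
    """Number of individual modes contained in a combined flag value."""
    return sum(1 for m in Mode if m in modes)


def least_privileged(apps: dict[str, Mode], needed: Mode) -> str:
    """Pick the app that covers `needed` while requesting the fewest modes."""
    candidates = [name for name, modes in apps.items() if needed & modes == needed]
    if not candidates:
        raise ValueError("no app covers the needed modes")
    return min(candidates, key=lambda name: mode_count(apps[name]))


# Both apps can read my data, but the second one asks for less.
choice = least_privileged(requested, needed=Mode.READ)
```

Whether I actually pick the app that asks for less is still up to me; the point is only that the requested privileges are visible and comparable.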

Anyway, here’s the list I came up with for data, based on whose data it is, and how it’s used or shared:

  • Personal data, for personal use
    • examples: tracking how many calories I eat a day, my phone contacts
    • this is personal information, potentially sensitive, intended for use only by an individual
    • currently, we have apps that help us keep track of things that are important to us, and all of these apps require that we: 1) trust the provider, 2) agree to a set of terms and privacy conditions, and 3) trust that the data is stored in a way that is not easily hackable
  • Personal, for use by friends
    • examples: texts, emails
    • this information is shared with a very small group, possibly two people, but, like anything shared with anyone that can be transmitted to a third party, it could be made public, and this is part of the intrinsic nature of the act of sharing
    • currently, there are applications that try to protect confidentiality (using encryption, or by deleting messages after a set time), but they cannot guarantee confidentiality
  • Personal, for public use
    • examples: public social posts (tweets, facebook posts, instagram posts, etc.), blog posts
    • this data is public by default, and intended to be so
    • this data is not intended to have any “read” protections, but does require “write” protections
    • currently, unless we are posting using software we have control over (like a personal WordPress blog), we are at the mercy of the major platforms to protect our accounts from being hijacked–we have to agree to their terms, use their tools, and trust them to apply security best practices
  • Public, for public use
    • examples: public weather data, government reports, laws, etc.
    • similar to the category directly above, this data is not intended to have any “read” protections but does require “write” protections; the difference is that the “write” access will probably be shared
  • Public, for personal use
    • examples: the data might have the same or similar origins as in the category above, but in this case it’s meant to be used in private apps, like a personalized weather app with location info, or a personal movie watchlist that uses public info but organizes it according to the user’s preferences
    • this data could be intended to be private, and should be protected by default
    • currently, there are lots of apps that provide services like these, that take public info and give users different ways of interacting with it; in most cases, users have to: 1) accept terms of service and privacy policies (which aren’t negotiable), 2) trust the app providers, 3) accept whatever security measures are in place

What Solid gives us, in all of these cases:

  • to the extent we as app users choose to exercise it, we can have more fine-grained control over the read/write privileges to our data
  • more options between apps and app providers, eventually (this depends on wide adoption)
  • the potential for more transparency: in terms of seeing what the apps we use are doing (especially if the apps we use are open source); and in terms of seeing where and how our data is being accessed (since it lives in our Pod, not on a major Platform’s server)
  • freedom from Platform lock-in (due to Solid’s interoperability) along with their Terms of Service and Privacy Policies, and their level of service–again, this benefit will depend to some degree on level of adoption (vendor lock-in requires the lack of viable alternatives)
  • no downsides (that I’m aware of) in relation to the current internet

Here’s my personal view on your thoughts (“-” meaning no comment from me):

I. Main Motivation: Empowering Users to Fully and Securely Control Their Personal Data:

Not sure if this is the main motivation, but for sure a big part. Imo, allowing new apps to reuse existing data (instead of every app harvesting from 0) could be the main motivation too.

  • To shift towards integrating social features into apps.

I personally see Solid as more general, not only for “social data” (even though the name itself says so). You can use it for personal data, but you could also put research data, encyclopedias, etc. on Solid pods. However, you’re right, social tends more towards social agents than abstract entities (see e.g. the Solid WebID Profile).

  • To reduce the dominance of major tech companies like Facebook, TikTok, and Google, which offer free services in exchange for vast amounts of user data.

… or to allow new companies to be competitive without having to harvest data themselves. It’s probably the same result, but I prefer to look at it from this side.

The list of questions I’ve developed for this motivation includes:

  1. How can we prevent apps from copying data? What if a company deliberately creates multiple different apps to gain extensive access to user data? Will companies collaborate to aggregate large amounts of user data?

For the first part, see this thread: Can I restrict where user's data can go?

One thing to keep in mind is that with less vendor lock-in, it is easier to choose companies with a good reputation that you trust with your data. If they behave badly, it is easier to switch to another company, so the incentive to maintain a good reputation is much higher.

  2. Will there no longer be any free services available?

On the one hand, it will be easier to host applications, as many of them won’t need a backend server (eg put them on github.io and it’s deployed without you paying any money).

On the other hand, the ad-selling business would at least be different in Solid, so I’m not sure whether it would still be a viable way to finance development.

  3. How should the management and storage of vast amounts of data, particularly data not associated with specific individuals, be handled? For example, how should data collected by Google from user searches be managed?

From my point of view, the naive Solid-style answer is: the search is a piece of data belonging to the user who made it, so it is stored in their pod. But if the user agrees, Google could use a copy of the data to improve their services. The question then is how different this would be from the Google of today. I’m not sure whether there is, or has been, a discussion on this.

  4. How does this differ from a data server that provides a user interface for data management and permissions?

I think the main differences are that Solid pods are standardized and intended to be filled with Linked Data.

  5. Will Pod servers become “social big tech” platforms?

  6. If users manage and grant data permissions through a data management app, does this mean that compromising the app would grant control over the data?

Yes. I’ve written a small document on Security in Solid if you’re interested: GitHub - Otto-AA/solid-security-basics: Basic security considerations for solid applications

  7. Is the current trend of integrating social elements into the web and apps going in the wrong direction when everyone can create multiple WebIDs, fake WebIDs, and impersonate others to claim a WebID segment?

When I see a YouTube channel called “Elon Musk” I don’t necessarily believe it to be made by him. I think the same would apply to WebIDs: you believe it when you have learned about the WebID from a source you trust (directly from the person, or from someone or something you trust that shared the WebID with you).

II. Interoperability: All Apps Can Share Data and Work Together.

  1. What is the workflow for this cooperation?

You give app A and app B access to your pod. App A stores data in your pod in an interoperable way. Then app B can use and modify this data (based on the permissions you gave). They could also use Notifications to subscribe to changes to this data.
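
This workflow can be sketched with a tiny in-memory model in Python. To be clear, this is not the Solid protocol or any real library: `Pod`, `grant`, `read`, `write`, and `subscribe` are hypothetical names, and the resource path and app identifiers are invented for illustration. It only shows the shape of the interaction: the owner grants access, app A writes, app B reads and modifies, and a subscriber is told about each change.

```python
from collections import defaultdict
from typing import Callable

Callback = Callable[[str, dict], None]


class Pod:
    """Toy stand-in for a pod: resources plus per-app permissions."""

    def __init__(self) -> None:
        self._data: dict[str, dict] = {}
        self._grants: dict[str, set[str]] = defaultdict(set)
        self._subscribers: dict[str, list[Callback]] = defaultdict(list)

    def grant(self, app: str, *modes: str) -> None:
        """The pod owner gives an app some access modes ("read", "write")."""
        self._grants[app].update(modes)

    def _check(self, app: str, mode: str) -> None:
        if mode not in self._grants[app]:
            raise PermissionError(f"{app} lacks {mode} access")

    def read(self, app: str, path: str) -> dict:
        self._check(app, "read")
        return dict(self._data[path])  # copy, so callers cannot mutate the pod

    def write(self, app: str, path: str, value: dict) -> None:
        self._check(app, "write")
        self._data[path] = dict(value)
        for callback in self._subscribers[path]:
            callback(path, dict(value))  # notify everyone watching this resource

    def subscribe(self, app: str, path: str, callback: Callback) -> None:
        self._check(app, "read")  # assumption: watching requires read access
        self._subscribers[path].append(callback)


pod = Pod()
pod.grant("app-a", "read", "write")
pod.grant("app-b", "read", "write")

# App A asks to be told whenever the note changes.
seen_by_a: list[dict] = []
pod.subscribe("app-a", "/notes/1", lambda path, value: seen_by_a.append(value))

# App A stores a note; app B reads it and writes back a modified version.
pod.write("app-a", "/notes/1", {"text": "buy milk", "done": False})
note = pod.read("app-b", "/notes/1")
note["done"] = True
pod.write("app-b", "/notes/1", note)
```

The sketch leaves out the hard part: for app B to make sense of what app A wrote, both have to agree on where the data lives and which vocabulary describes it.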

  2. How can we ensure that every member in the ecosystem works honestly? How can we be sure that an app with the right to update data does so correctly?

We can’t ensure it; we can only create an ecosystem where there are incentives to do so (incentives being that you can simply switch apps, policies and laws, social norms, etc.). And we can observe it a bit better, because we have more transparency about who accesses which of our data.

  1. Is my general, basic knowledge about Solid correct? Am I on the right path in terms of learning about it?

Yes, pretty good questions :slight_smile:

  2. Can you help me answer any of the above questions or recommend resources that I can refer to for self-study?
  3. Share your own questions about Solid specifications.
  4. Let me know if there are any unresolved questions about Solid up to now.

I like Ruben Verborgh’s blog: Linked Research Articles | Ruben Verborgh
There are also some discussions you can find here concerning these articles and more. A particularly interesting and open question is how this “App A stores data in your pod in an interoperable way” would work in practice, e.g. here: [Post deleted]

Have a nice day :slight_smile:
