Location versus Content

SOLID is organized around location (URL, folder, file, etc.) rather than around content or data irrespective of location (unlike, for example, IPFS).

Have people explored other options that are still “SOLID” but organized more around content / data?

Essentially, SOLID talks about owning and controlling data, but that isn’t really possible; what you control is merely the location.

SOLID access control applies to PODs, folders, files, etc.; it does not apply to contents or data (graphs, triples, individual URIs, etc.).

For example, what if I just encrypted my URIs and triples client-side, and then put them anywhere that had consistent public access?

What else gets interesting here?

@pukkamustard recently published some very interesting work on SocialHub: https://socialhub.activitypub.rocks/t/content-addressing-and-signatures/744


Thanks for the mention @aschrijver!

Yes! I have recently done some work towards this and am thrilled to see this being mentioned.

The basic idea is to make RDF content-addressable, in contrast to location addressable, in a way that keeps it compatible with existing data.

Concretely, a set of RDF triples is referenced by the hash digest of the triples themselves.

I think there are three tricks involved:

  1. Define a grouping of RDF triples
  2. Define a canonical representation for this grouping of RDF triples
  3. Use a URN to describe the grouping
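The three steps above can be sketched roughly as follows. This is only a minimal illustration, not the scheme from the proposal: it assumes the group of triples contains no blank nodes, it canonicalises by simply de-duplicating and sorting N-Triples lines, and both the `urn:sha256:` prefix and the function name `content_urn` are made up for this example.

```python
import hashlib


def content_urn(ntriples_lines):
    """Return a hash-based URN for a group of blank-node-free triples."""
    # Step 1: the "grouping" is simply the list of triples passed in.
    cleaned = {line.strip() for line in ntriples_lines if line.strip()}
    # Step 2: naive canonical form = de-duplicated, sorted N-Triples lines.
    canonical = "\n".join(sorted(cleaned))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Step 3: name the group by its digest (the URN prefix is illustrative).
    return "urn:sha256:" + digest


group = [
    '<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .',
    '<http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .',
]

# The identifier depends only on the triples, not on their order or location.
print(content_urn(group))
print(content_urn(group) == content_urn(list(reversed(group))))
```

Blank nodes are what make real canonicalisation hard, which is why a sort-based approach like this would not be sufficient for arbitrary RDF.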

I have a proposal on how to do this here: Content-addressable RDF.

This is an interesting point. Naive content-addressing does not address the issue of confidentiality. Pieces of content are not encrypted, and any transporting and caching peers will be able to read the content. Also, large content is stored as a single chunk, which complicates transport.

To solve this there has been interesting previous work (Tahoe-LAFS, the file-sharing application of GNUnet, and Datashards).

We have a proposal on how to do this that is optimized for smaller file sizes and adds a verification capability that allows caching peers to cache the entire content without being able to read it: An Encoding for Robust Immutable Storage.
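To illustrate the general idea of separating "can verify" from "can read", here is a toy sketch. It is not the encoding from the proposal: the key is derived from the content itself, the cipher is an insecure XOR keystream built from keyed BLAKE2b purely so the example needs only the standard library, and all function names are hypothetical.

```python
import hashlib


def _keystream(key, length):
    # Toy keystream: keyed BLAKE2b over a counter. Illustration only, NOT secure.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.blake2b(counter.to_bytes(8, "big"), key=key).digest())
        counter += 1
    return bytes(out[:length])


def encode_block(content):
    """Encrypt content with a key derived from the content itself."""
    key = hashlib.blake2b(content, digest_size=32).digest()
    ciphertext = bytes(a ^ b for a, b in zip(content, _keystream(key, len(content))))
    # The reference is the hash of the ciphertext; peers only need this to verify.
    reference = hashlib.blake2b(ciphertext, digest_size=32).digest()
    return ciphertext, reference, key


def verify_block(ciphertext, reference):
    """What a caching peer can do: check integrity without holding the key."""
    return hashlib.blake2b(ciphertext, digest_size=32).digest() == reference


def decode_block(ciphertext, key):
    """What a reader holding the key can do: recover the content."""
    return bytes(a ^ b for a, b in zip(ciphertext, _keystream(key, len(ciphertext))))


ciphertext, reference, key = encode_block(b"<urn:a> <urn:b> <urn:c> .")
print(verify_block(ciphertext, reference))
print(decode_block(ciphertext, key))
```

A peer that stores `ciphertext` and knows `reference` can detect tampering, while only someone given `key` can read the content. Because the key is derived from the content, identical content always yields the same ciphertext and reference.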

We also have a demo that shows both ideas (content-addressable RDF + Encrypted content-addressing): An Encoding for Robust Immutable Storage - Demo.

I believe that RDF is a very well suited data-model for decentralized systems (even more decentralized than what SOLID and ActivityPub currently are). I see content-addressing as the stepping stone to be able to use RDF in decentralized systems.

As part of the openEngiadina project we intend to explore these ideas.


@markjspivey have you read @sergejspomer’s post and work? Much of it is about how best to organise data in Solid. See: Concepts for apps in a decentralised web (Solid forum)

This led to a discussion on the SAFE forum which is a good exploration of both his work and a way to combine several useful metaphors: addressing by location (folders, files, etc.), addressing by classification (labels which identify the creating app; data types such as photo, image, etc.; classifications such as personal, business, financial, etc.; mostly applied automatically), and inherent content addressing of all data (so three metaphors which all work together orthogonally). See: Concepts for apps in a decentralised web (SAFE forum)

I’m very interested to read about a fourth, using content addressing of triples @pukkamustard. RDF is a native data type for SAFE, so I’ll have a think about how this could work with capability based access control. I think it will be fine because access control is orthogonal - addressing by content is just knowing where the data is. You will still need permission in order to access and decrypt it.

So my question is more how content addressing based on triples could be used to locate the data in the backend (whether Solid, SAFE, IPFS or whatever). I’ll go read!

@pukkamustard I see your motivation is reliability in a system where relying on servers creates inaccessibility. Do you have other use cases for this?

This would not be an issue with SAFE, so I’m wondering about other uses. Intuitively I can see it would likely be useful, and I’m aware that different approaches to supporting this kind of content addressing might be selected based on the use cases. Actually, having read that you have an efficient scheme for creating a canonical representation of any RDF, my question is not relevant. This could definitely be implemented on SAFE, making any RDF document, including the canonical representation, content addressable. I’ll pass your paper on because I think this is an important capability.

I posted on the SAFE Development forum for comment but the devs are very busy atm so I don’t expect an immediate response. See: Content addressing RDF


Thanks for the links and projects to check out. This is what I’ve been looking for while exploring how to architect my own projects; I will share anything I work up as well.


Any thoughts regarding IPLD, or blockchain non-fungible token contracts, etc.?

@pukkamustard @happybeing, do you see your approaches as peer-level competition to those, or as a possible part of an ecosystem that includes any and all of those things?

This ties into my post regarding What is most minimum SOLID?: essentially, content-addressed URIs for RDF purposes break Linked Data (even non-http URIs break it by definition), while certainly being valid RDF.

@pukkamustard I assume the concept of homomorphic encryption can also get interesting regarding what you’re exploring.

Regarding SOLID itself, I’ve been exploring the concept of distributed / decentralized / virtualized Linked Data Containers that are content-addressed, as it seems very impractical to continue thinking of a “POD” as a specific protected folder that actually exists only on one server and is exposed by a location-based address, etc.

My approach has been to provide a way for Solid apps to run on SAFE Network, and I have demonstrated the essentials working by providing a Solid API on top of the SAFE API that can be utilised with a slightly modified Solid app library (e.g. forks of rdflib.js and solid-auth-client both work).

How Solid and Solid on SAFE systems relate is hard to say at this stage. Ideally they can sit side by side, one able to utilise the other, and Solid could integrate SAFE as a backend to a regular Solid server, but there are compromises with any close coupling that need to be worked out because of the different protocols. Unlike the dat: or ipfs: protocols, the safe: protocol is designed to avoid use of web services, which even when encrypted are vulnerable to attacks, tracking, censorship and so on. To solve these problems SAFE clients should, by default at least, block all non-safe: traffic, and any client (or server) which allows non-safe: protocols loses many of the privacy and security benefits of using SAFE in the first place. So it can be done, but whether or not and how it is done will depend on the reasons for using SAFE and http: together.


Would you consider the first-class constructs / granularity / atomicity of your work here to be at the RDF URI and triple level, or at the RDF graph, document, or file (Turtle, etc.) level?

From an initial skim, I can see that it accounts for many levels, but for some of the levels the detailed focus appears to be on the file / graph level, especially regarding the immutable storage and the discussion of how to handle the inevitability of blank nodes, etc.

Does this assumption line up with your intent, or am I not interpreting it correctly?

Do you think your proposal could be combined with js-ipfs https://js.ipfs.io/ in a web app?


One thing that I think content-addressing makes very easy is offline-first and decentralized applications. Clients can create content and reference that content from other content without having to interact with a server, as the identifiers are computable completely locally.

This might be just one aspect of a broader sense of “reliability”.

Absolutely!

The Web demo actually uses js-ipfs already. Unfortunately I was not able to get IPFS working reliably from the web browser.

I was able to test this much more reliably from node.js. See this example (and also from the Guile implementation).

Yes, that is correct. I introduce a grouping of triples called Fragment Graph, which is very similar to the Minimum Spanning Graph proposed by Tummarello et al.


Thanks, I read more up on the usage of Fragment Graphs in your example.

How would you compare and contrast your example here regarding interfacing with SOLID (without regarding things like Linked Data Platform or ActivityPub etc for now), for starters:

  • WebIDs
  • PODs
  • etc

The proposed scheme is for immutable content only. It is not possible to mutate content. Mutation is required to update a WebID and to add or remove something to a POD.

Slightly more abstractly, this functionality of mutating content may be seen as dynamic namespaces. I’ve sketched out some previous work and intend to do further research/experiments in that area.
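One way to picture this split is a mutable name that points at immutable, content-addressed versions. The sketch below is only an assumption about what such a layer could look like, not the work referred to above: the class names `ImmutableStore` and `Namespace` are hypothetical, everything lives in memory, and a real system would need signed updates so that only the owner can move a name.

```python
import hashlib


class ImmutableStore:
    """Content-addressed storage: a block can be added but never changed."""

    def __init__(self):
        self._blocks = {}

    def put(self, content):
        ref = hashlib.sha256(content).hexdigest()
        self._blocks[ref] = content
        return ref

    def get(self, ref):
        return self._blocks[ref]


class Namespace:
    """Mutable layer: maps a stable name to the current content reference."""

    def __init__(self):
        self._heads = {}

    def update(self, name, ref):
        self._heads[name] = ref

    def resolve(self, name):
        return self._heads[name]


store = ImmutableStore()
names = Namespace()

# "Updating" a profile means storing a new immutable version and moving the name.
v1 = store.put(b'<#me> <http://xmlns.com/foaf/0.1/name> "Alice" .')
names.update("alice/profile", v1)
v2 = store.put(b'<#me> <http://xmlns.com/foaf/0.1/name> "Alice B." .')
names.update("alice/profile", v2)

print(store.get(names.resolve("alice/profile")))
print(store.get(v1))  # the old version is still retrievable by its hash
```

In this picture a WebID or a POD listing would correspond to a name in the mutable layer, while each version of the data stays immutable and content-addressed.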

There is also a very nice write-up on the ideas by Joe Armstrong.

I am thinking of a combination of Solid and IPFS: an app could use a WebID to connect and store data both on a POD and on IPFS.

Some interesting thoughts about ERIS (content-addressable RDF) and Underscore Protocol (git for ideas): https://matrix.to/#/!OvRXRRVeYwAsqeIkoR:matrix.org/$tlO83t6IBm0z0TbHrhZHsPpyXjWQTcQYQZseL9yFY3A?via=matrix.org&via=ungleich.ch&via=public.cat