Location versus Content

Thanks for the mention @aschrijver!

Yes! I have recently done some work towards this and am thrilled to see this being mentioned.

The basic idea is to make RDF content-addressable rather than location-addressable, in a way that keeps it compatible with existing data.

Concretely, a set of RDF triples is referenced by the hash digest of the triples themselves.

I think there are three tricks involved:

  1. Define a grouping of RDF triples
  2. Define a canonical representation for this grouping of RDF triples
  3. Use a URN to describe the grouping

I have a proposal on how to do this here: Content-addressable RDF.
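To make the three steps above a bit more concrete, here is a minimal Python sketch. It is not the scheme from the proposal, just an illustration under some loud assumptions: triples are ground (no blank nodes), terms are already serialized as N-Triples strings, the canonical form is simply the sorted, de-duplicated lines, SHA-256 is the hash, and `urn:sha256:` is a made-up URN prefix. The function names `canonicalize` and `content_urn` are hypothetical.

```python
import hashlib

def canonicalize(triples):
    """Step 2: canonical form of a grouping of triples.

    Assumes ground triples whose terms are already N-Triples strings.
    Sorting and de-duplicating makes the result independent of input order.
    """
    lines = {f"{s} {p} {o} ." for (s, p, o) in triples}
    return "\n".join(sorted(lines)) + "\n"

def content_urn(triples):
    """Step 3: name the grouping by the hash of its canonical form.

    The 'urn:sha256:' prefix is illustrative only, not a registered namespace.
    """
    digest = hashlib.sha256(canonicalize(triples).encode("utf-8")).hexdigest()
    return f"urn:sha256:{digest}"

# Step 1: a grouping of triples (here just a Python list).
note = [
    ("<http://example.org/note>", "<http://purl.org/dc/terms/title>", '"Hello"'),
    ("<http://example.org/note>", "<http://purl.org/dc/terms/creator>", "<http://example.org/alice>"),
]

# The same triples in a different order yield the same identifier.
assert content_urn(note) == content_urn(list(reversed(note)))
```

Handling blank nodes is the hard part of canonicalization and is deliberately left out here.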

This is an interesting point. Naive content-addressing does not address confidentiality: content is not encrypted, so any peer that transports or caches it can read it. Also, large content is stored as a single chunk, which complicates transport.

To solve this, there has been interesting previous work (Tahoe-LAFS, the file-sharing application of GNUNet, and Datashards).

We have a proposal on how to do this that is optimized for small file sizes and adds a verification capability, which allows caching peers to cache the entire content without being able to read it: An Encoding for Robust Immutable Storage.
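The general idea of "peers can verify and cache, but not read" can be sketched in a few lines of Python. This is not the encoding from the proposal: it handles a single block only, derives the key from the content itself (an assumption on my part), and uses a toy XOR keystream built from SHAKE-256 that is for illustration only and not suitable for real use. The names `put`, `get` and `verify` are hypothetical.

```python
import hashlib

def _xor_keystream(key: bytes, data: bytes) -> bytes:
    # Toy stream cipher (illustration only): XOR data with SHAKE-256(key) output.
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

def put(store: dict, content: bytes):
    """Encrypt content, store the ciphertext under its own hash."""
    key = hashlib.blake2b(content, digest_size=32).digest()  # key derived from content
    block = _xor_keystream(key, content)                     # encrypted block
    ref = hashlib.blake2b(block, digest_size=32).digest()    # address of the ciphertext
    store[ref] = block
    return ref, key  # ref alone lets a peer verify; ref + key lets a reader decrypt

def verify(ref: bytes, block: bytes) -> bool:
    """What a caching peer can do: check integrity without the key."""
    return hashlib.blake2b(block, digest_size=32).digest() == ref

def get(store: dict, ref: bytes, key: bytes) -> bytes:
    """What a reader can do: fetch, verify, then decrypt."""
    block = store[ref]
    if not verify(ref, block):
        raise ValueError("block does not match its reference")
    return _xor_keystream(key, block)

store = {}
ref, key = put(store, b"some RDF content")
assert verify(ref, store[ref])                     # peer can check the block
assert get(store, ref, key) == b"some RDF content" # reader can recover the content
```

A peer holding only `ref` and the stored block can confirm it is caching the right bytes, but without `key` it only sees ciphertext.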

We also have a demo that shows both ideas (content-addressable RDF + encrypted content-addressing): An Encoding for Robust Immutable Storage - Demo.

I believe that RDF is a very well-suited data model for decentralized systems (even more decentralized than what SOLID and ActivityPub currently are). I see content-addressing as the stepping stone to being able to use RDF in decentralized systems.

As part of openEngiadina, we intend to explore these ideas.
