Fun fact - using SPARQL to query the type registry

I have just managed to use SPARQL to query your POD’s personal “type registry”. This is a registry that tells web apps where you prefer to store data of various types (for instance contacts or photos). Since Google is incapable of locating any documentation for this, I thought it might be useful for others.

You can see the discovery spec. here: https://github.com/solid/solid/blob/master/proposals/data-discovery.md

The basic technique goes like this:

  • Use rdflib.js and its built-in “fetcher” to fetch data into a data store as explained here: https://solid.inrupt.com/docs/manipulating-ld-with-rdflib

  • First fetch the user profile document. Here you find this little reference to the type registry:

    solid:publicTypeIndex </settings/publicTypeIndex.ttl> ;
    solid:privateTypeIndex </settings/privateTypeIndex.ttl> 
    
  • Use rdflib to find the triples shown above in the profile document using “match” or “any” and extract the URLs.

  • Use the fetcher again to load the type index resources. My private index tells apps that I want data about my radio-controlled hobby stored in /rc-data:

    @prefix solid: <http://www.w3.org/ns/solid/terms#>.
    @prefix solidrc: <http://solid-rc.net/ns/solidrc#>.
    <>
        a solid:TypeIndex ;
        a solid:ListedDocument.
    
    <#reg1>
        solid:forClass solidrc:data;
        solid:instanceContainer </rc-data/>.
    
  • Now the discovery spec says “Find statement pairs of solid:forClass and solid:instanceContainer to get class reference and data location”. In SPARQL we can express this as:

    SELECT ?cl ?loc
    WHERE 
    {
      ?reg <http://www.w3.org/ns/solid/terms#forClass> ?cl.
      ?reg <http://www.w3.org/ns/solid/terms#instanceContainer> ?loc.
    }
    

    This means: "Find all pairs of cl and loc that apply to a common resource reg, ignore reg and return (cl, loc)."

  • We have read all the necessary resources (Turtle documents) into the local store and now we only need to run the query:

    const sparql = `
      SELECT ?cl ?loc
      WHERE 
      {
        ?reg <http://www.w3.org/ns/solid/terms#forClass> ?cl.
        ?reg <http://www.w3.org/ns/solid/terms#instanceContainer> ?loc.
      }`
    
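    // The wrapper name is made up here; "store" is the rdflib store holding the fetched documents.
    function listTypeRegistrations(store)
    {
      // The second argument is rdflib's testMode flag.
      const query = $rdf.SPARQLToQuery(sparql, false, store);
      return new Promise((accept, reject) =>
        store.query(query, result =>
        {
          // Called once per match; bindings are indexed by variable name, including the "?".
          if (result["?cl"] && result["?loc"])
          {
            console.debug(result["?cl"].value + " / " + result["?loc"].value);
          }
        },
        null,              // no fetcher: everything is already in the store
        () => accept()));  // called when the query has finished
    }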

  • The query engine calls the result callback for each pair found.

  • The “gotcha!” part here is that you need to index the result by "?cl" and "?loc" - and you won’t be able to detect that by JSON-dumping the result, because it comes out as an empty array!
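The gotcha in the last step can be reproduced without rdflib. This is only a minimal sketch, assuming the result behaves like a JavaScript array that carries its bindings as named properties (that would explain the empty dump, but the actual rdflib internals may differ). The `fakeResult` object and the example.org URL are made up for illustration.

```javascript
// Hypothetical stand-in for one query result: an array with named properties.
// Assumption: this mimics the behaviour described above, not rdflib's real internals.
const fakeResult = [];
fakeResult["?cl"] = { value: "http://solid-rc.net/ns/solidrc#data" };
fakeResult["?loc"] = { value: "https://example.org/rc-data/" };

// JSON.stringify only serializes the indexed elements of an array,
// so the named properties are silently dropped and the dump is "[]".
const dumped = JSON.stringify(fakeResult);

// Object.keys still lists the named properties, so the bindings can be discovered.
const keys = Object.keys(fakeResult);

console.log(dumped);
console.log(keys);
console.log(fakeResult["?cl"].value + " / " + fakeResult["?loc"].value);
```

If a result looks empty when dumped, listing its keys with `Object.keys` is a quick way to check whether bindings are hiding behind variable names.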

It seems like the SPARQL library can fetch data too, but I haven’t tried that (yet).
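To make the meaning of the query in the steps above more concrete, here is a rough plain-JavaScript sketch of the same join over the triples from the example type index. It does not use rdflib at all; the `findRegistrations` helper and the array-based triple representation are assumptions made purely for illustration.

```javascript
// Triples from the example type index as plain [subject, predicate, object] arrays.
const SOLID = "http://www.w3.org/ns/solid/terms#";
const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
const triples = [
  ["", RDF_TYPE, SOLID + "TypeIndex"],
  ["", RDF_TYPE, SOLID + "ListedDocument"],
  ["#reg1", SOLID + "forClass", "http://solid-rc.net/ns/solidrc#data"],
  ["#reg1", SOLID + "instanceContainer", "/rc-data/"],
];

// Hypothetical helper: pair every forClass object with every instanceContainer
// object that shares the same subject (the ?reg variable), then drop the subject.
function findRegistrations(allTriples) {
  const pairs = [];
  for (const [reg, p1, cl] of allTriples) {
    if (p1 !== SOLID + "forClass") continue;
    for (const [reg2, p2, loc] of allTriples) {
      if (reg2 === reg && p2 === SOLID + "instanceContainer") {
        pairs.push({ cl, loc });
      }
    }
  }
  return pairs;
}

const registrations = findRegistrations(triples);
console.log(registrations);
```

The two type statements about the document itself are ignored because they match neither pattern, which is the same reason they never show up in the SPARQL results.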


Great!

On a related note, you can use any SPARQL Query implementation that supports URI-variable and URI-constant dereference in the body of a query to crawl any Solid Pod, subject to ACLs in place.

Here is a SPARQL Query example using our URIBurner Service (a live Virtuoso instance):

## Remove comments to sponge-afresh

# DEFINE input:grab-all "yes"

SELECT DISTINCT ?o
WHERE {
  { <https://kidehen3.solid.openlinksw.com:8444/public/CoolStuff/> ldp:contains ?o .
    OPTIONAL { ?o owl:sameAs ?o2 }
  }
}

Examples:

  1. Live Query Results Page for one of my Folders
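For anyone who wants to send a query like the one above from code, here is a hedged sketch of building a request URL according to the SPARQL 1.1 Protocol, which accepts a query via GET in the URL-encoded `query` parameter. The endpoint URL and the `buildQueryUrl` helper are hypothetical, the PREFIX lines are added so that the query does not rely on predefined prefixes, and whether an endpoint dereferences the URIs in the query depends on the implementation.

```javascript
// Hypothetical endpoint; replace it with the SPARQL endpoint of the service you use.
const endpoint = "https://example.org/sparql";

// The folder query from above, with explicit prefix declarations added.
const folderQuery = `
PREFIX ldp: <http://www.w3.org/ns/ldp#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?o
WHERE {
  <https://kidehen3.solid.openlinksw.com:8444/public/CoolStuff/> ldp:contains ?o .
  OPTIONAL { ?o owl:sameAs ?o2 }
}`;

// Hypothetical helper: put the query text into the "query" parameter,
// letting URLSearchParams take care of the encoding.
function buildQueryUrl(endpointUrl, sparql) {
  const url = new URL(endpointUrl);
  url.searchParams.set("query", sparql);
  return url.toString();
}

const requestUrl = buildQueryUrl(endpoint, folderQuery);
console.log(requestUrl);
```

The resulting URL could then be requested with any HTTP client, typically with an Accept header such as `application/sparql-results+json`.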
