Searching and querying distributed data in a Solid ecosystem


This offer is part of a collaboration between the WIMMICS research team and the company Startin’blox.

WIMMICS is a joint research team between Inria, the Université Côte d’Azur and the CNRS (I3S). Its researchers are interested in the representation and processing of knowledge graphs, particularly on the Web (see "Wimmics: Bridging social semantics and formal semantics on the web").

Startin’blox is developing an innovative and ethical technology, built on open standards, for creating federated applications from linked data and web components.

The objective of this collaboration is the design and evaluation of methods for search, indexing and discovery of services and datasets within the SoLiD ecosystem.

The SoLiD project, for “SOcial Linked Data”, launched in 2015 by Tim Berners-Lee and incubated at the W3C, specifies a new web application architecture that fully decouples data storage from business applications. Widespread deployment of applications that respect SoLiD standards would thus make it possible to re-establish decentralisation on the web and let users keep control of their data, stored in “personal servers” called PODs.

At present, the project consists of about ten specifications at varying stages of maturity, and a very active community is working on several implementations. However, some areas are not yet covered, such as the querying of this distributed data.


The aim is to design and evaluate methods for searching and querying distributed data in a SoLiD ecosystem.

The ability to perform advanced searches on large volumes of data with acceptable performance is one of the foundations of information flow and the construction of social applications.

The candidate will therefore investigate possible ways to add, on top of the SoLiD architecture, capabilities for service discovery, pathfinding and access to distributed datasets, by standardising the search and filtering capabilities of PODs. To do this, we could use SPARQL traversal or decentralised query approaches to design a pilot architecture that also meets the performance challenges, for example via cache or index systems. This would help the SoLiD ecosystem spread at web scale.
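
To make the idea of traversal-based querying with a cache more concrete, here is a minimal sketch in Python. It is only an illustration under strong assumptions, not the approach the project will adopt: the PODs are simulated by an in-memory dictionary, the URLs and the names `PODS`, `CachingFetcher`, `document_of` and `traverse` are hypothetical, and a real system would fetch RDF documents over HTTP and evaluate SPARQL patterns rather than match on a single predicate.

```python
from collections import deque

# Hypothetical in-memory stand-in for documents hosted on PODs:
# each document URL maps to the (subject, predicate, object) triples it holds.
PODS = {
    "https://alice.example/profile": [
        ("https://alice.example/profile#me", "foaf:name", "Alice"),
        ("https://alice.example/profile#me", "foaf:knows",
         "https://bob.example/profile#me"),
    ],
    "https://bob.example/profile": [
        ("https://bob.example/profile#me", "foaf:name", "Bob"),
        ("https://bob.example/profile#me", "foaf:knows",
         "https://carol.example/profile#me"),
    ],
    "https://carol.example/profile": [
        ("https://carol.example/profile#me", "foaf:name", "Carol"),
    ],
}


class CachingFetcher:
    """Returns the triples of a document, remembering documents already fetched."""

    def __init__(self, source):
        self.source = source   # stands in for the network
        self.cache = {}        # document URL -> list of triples
        self.fetch_count = 0   # number of simulated network fetches

    def get(self, url):
        if url not in self.cache:
            self.fetch_count += 1
            self.cache[url] = self.source.get(url, [])
        return self.cache[url]


def document_of(iri):
    """Strip the fragment so that an IRI points at the document hosting it."""
    return iri.split("#", 1)[0]


def traverse(seed, fetcher, follow="foaf:knows", wanted="foaf:name", max_docs=10):
    """Breadth-first traversal: collect `wanted` values, follow `follow` links."""
    queue = deque([document_of(seed)])
    seen = set()
    results = []
    while queue and len(seen) < max_docs:
        doc = queue.popleft()
        if doc in seen:
            continue
        seen.add(doc)
        for subject, predicate, obj in fetcher.get(doc):
            if predicate == wanted:
                results.append((subject, obj))
            elif predicate == follow:
                queue.append(document_of(obj))
    return results


fetcher = CachingFetcher(PODS)
names = traverse("https://alice.example/profile#me", fetcher)
```

Running `traverse` a second time with the same `fetcher` is served entirely from the cache, which is the kind of effect a cache or index layer would aim for at a much larger scale. The `max_docs` bound is a simple guard against unbounded traversal, one of the performance issues the text alludes to.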

More information: 2022-05143 - Searching and querying distributed data in a SoLiD ecosystem