I just finished the first version of a Solid app for tracking movies that I’ve been working on. I’ll post about that next week, but for now I’d like to discuss the state of the art for querying large containers (as in LDP containers).
I am using node-solid-server, and I have a container with 1411 documents. I wouldn’t say that this is “a large container”, but I’m calling it such because I’ve struggled a lot to make the application performant. The request to the container alone takes more than a second, which is already slow for a single GET request. But then trying to query the documents is a nightmare. I tried using globbing, but after loading for a while I get a 500 error (probably due to an out-of-memory exception, although I haven’t checked). So I have to request the documents one by one, and even when making parallel requests in chunks, it takes almost 3 minutes for the application to load.
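To make the "parallel requests in chunks" part concrete, here is a minimal sketch of the idea. It is not the app's actual code: `chunk`, `fetchInChunks` and the injected `fetchOne` callback are hypothetical names chosen for illustration, and the chunk size is arbitrary.

```typescript
// Hypothetical helper: split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Fetch every URL, running at most `size` requests in parallel.
// `fetchOne` is injected (an assumption of this sketch) so that the
// helper does not depend on any particular Solid client library.
async function fetchInChunks<R>(
  urls: string[],
  size: number,
  fetchOne: (url: string) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (const group of chunk(urls, size)) {
    // Requests inside a chunk run concurrently; chunks run one after another.
    const responses = await Promise.all(group.map((url) => fetchOne(url)));
    results.push(...responses);
  }
  return results;
}
```

With the standard `fetch` API, `fetchOne` could be something like `(url) => fetch(url, { headers: { Accept: "text/turtle" } }).then((r) => r.text())`. Results come back in the same order as the input URLs, and a single failed request rejects the whole call, so real code would probably want per-document error handling.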
Given this situation, the only solution I could think of is caching all the documents locally. This still doesn’t fix the initial loading time of 3 minutes (much worse on mobile), but at least subsequent sessions are acceptable.
But there is still an additional problem. Given that I’m caching all the documents, I need to know when a document has been updated in order to invalidate the cache. I am using the purl:modified of each document returned in the container request, and that works fine. But as I said, the container request takes more than a second. It is much worse on mobile, although I suspect that is related to parsing a big Turtle document (that’s a discussion for another day). So I’d like to avoid this request as well. What I’ve done is read the purl:modified of the container itself, which I get by requesting the container’s parent first. The problem I’ve found is that this value seems to change every time I perform a GET request on the container. I don’t understand why that happens; I was assuming that reading the container wouldn’t cause it to be modified. The consequence is that every time I use the app on one device, my other devices will be slower to boot up, because they see a new timestamp and request the container again.
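The per-document invalidation step can be sketched roughly as follows. This is only an illustration under stated assumptions: `Listing`, `CachedDoc` and `diffCache` are hypothetical names, the listing is assumed to have already been parsed into a plain object mapping each document URL to its purl:modified value, and timestamps are compared as opaque strings.

```typescript
// Hypothetical shapes: none of these types come from node-solid-server
// or from a Solid client library.
type Listing = Record<string, string>; // document URL -> modified timestamp

interface CachedDoc {
  modified: string; // timestamp stored when the document was cached
  body: string;     // cached document contents
}

interface CacheDiff {
  stale: string[];   // documents to (re)fetch
  removed: string[]; // cached documents no longer in the container
}

function has(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function diffCache(listing: Listing, cache: Record<string, CachedDoc>): CacheDiff {
  // Refetch a document when it is new or its timestamp differs from the cached one.
  const stale = Object.keys(listing).filter(
    (url) => !has(cache, url) || cache[url].modified !== listing[url]
  );
  // Anything cached but no longer listed was deleted from the container.
  const removed = Object.keys(cache).filter((url) => !has(listing, url));
  return { stale, removed };
}
```

Only the URLs in `stale` need to be requested again, so a session where nothing changed costs one container request and no document requests.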
In case you’re wondering why I need all the documents when I start the application, that’s because, as far as I know, SPARQL is still not supported. Without loading everything I wouldn’t be able to filter and sort documents, and that is not an option for this application.
I don’t know. In general, I’ve done everything I could think of and the application is still disappointingly slow. Where am I going wrong? Is node-solid-server the problem? Is there something I’m missing?