State of the art for querying large containers

I don’t really have a solution to your problem, but here are some remarks and maybe starting points for you.

But then trying to query the documents is a nightmare. I tried using globbing, but after loading for a while I get a 500 error (probably due to an out-of-memory exception, although I haven’t checked).

Globbing is going to be removed from the spec, so I wouldn’t recommend using it anyway (source).

I’m not sure exactly what role SPARQL plays in Solid, but it doesn’t seem like it will be fully supported for public use. I think this issue will be relevant to you (I only skimmed through the answer): Querying multiple subjects in one request · Issue #162 · solid/specification · GitHub

The TL;DR from there:

I think that the question here is “fast access to multiple documents” and that the appropriate answer is “HTTP/2” (and a decent server implementation).

Maybe you will find more relevant issues with more discussion in the specification repository; I didn’t look into it that much.
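To make the “many requests over HTTP/2” idea a bit more concrete, here is a minimal sketch of requesting a list of documents concurrently instead of one after another. The names `fetchAll` and `TextFetch` are made up for this example, and `fetchFn` stands for whatever fetch you already use (the global `fetch`, or an authenticated one). Whether the requests actually get multiplexed over a single connection depends on the server and the client speaking HTTP/2; the code is the same either way.

```typescript
// Hypothetical minimal fetch signature; the global `fetch` satisfies it.
type TextFetch = (url: string) => Promise<{
  ok: boolean;
  status: number;
  text(): Promise<string>;
}>;

// Start all requests at once and wait for all of them to finish.
// Resolves to a map from URL to response body; rejects if any request fails.
async function fetchAll(
  urls: string[],
  fetchFn: TextFetch
): Promise<Map<string, string>> {
  const entries = await Promise.all(
    urls.map(async (url): Promise<[string, string]> => {
      const res = await fetchFn(url);
      if (!res.ok) {
        throw new Error(`GET ${url} failed with status ${res.status}`);
      }
      return [url, await res.text()];
    })
  );
  return new Map(entries);
}
```

For a very large container you would probably want to cap the number of requests in flight rather than firing them all at once, but I haven’t measured where that limit should be.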

Given that I’m caching all the documents, I need to know when a document has been updated to invalidate the cache.

Regarding caching, you could take a look at ETags (MDN reference). The server sends an ETag with each resource; it changes whenever the resource changes and can be used for conditional fetching. ETags are part of the api-rest spec, so you can rely on their existence in Solid.
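As a rough illustration of conditional fetching, the sketch below stores the ETag next to the cached body and sends it back in an `If-None-Match` header. If the server answers `304 Not Modified`, the cached copy is still valid and no body is transferred. The names `fetchWithETag`, `CacheEntry` and `ConditionalFetch` are invented for this example, and the in-memory `Map` is only a stand-in for whatever cache you already have.

```typescript
// Hypothetical cache entry: the body plus the ETag it was served with.
interface CacheEntry {
  etag: string;
  body: string;
}

// Hypothetical minimal fetch signature; the global `fetch` satisfies it.
type ConditionalFetch = (
  url: string,
  init?: { headers?: Record<string, string> }
) => Promise<{
  status: number;
  headers: { get(name: string): string | null };
  text(): Promise<string>;
}>;

async function fetchWithETag(
  url: string,
  fetchFn: ConditionalFetch,
  cache: Map<string, CacheEntry>
): Promise<string> {
  const cached = cache.get(url);
  const headers: Record<string, string> = {};
  if (cached) {
    // Ask the server to send the body only if it differs from our copy.
    headers["If-None-Match"] = cached.etag;
  }

  const res = await fetchFn(url, { headers });

  // 304 Not Modified: the cached copy is still current.
  if (res.status === 304 && cached) {
    return cached.body;
  }
  if (res.status < 200 || res.status >= 300) {
    throw new Error(`GET ${url} failed with status ${res.status}`);
  }

  const body = await res.text();
  const etag = res.headers.get("ETag");
  if (etag) {
    cache.set(url, { etag, body });
  } else {
    // Without an ETag we cannot revalidate, so do not keep a stale entry.
    cache.delete(url);
  }
  return body;
}
```

This still costs one request per document to check for changes, but an unchanged document only costs the headers, not the full body.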

EDIT: Maybe you could also try merging all the files into a single one. I haven’t thought through the pros and cons of this, but I could imagine it speeding up initial loading and other bulk operations.