I’m currently developing a Solid app using React and Inrupt’s JS SDK. I’ve run into a problem when I want to fetch all child resources of a parent resource. Let’s say I have this structure in my pod:
parent/
child-1
child-2
…
child-n
Both the parent and the children are RDF resources. Currently, to fetch all the child resources, I first fetch the parent resource, get all the things defined in it, and loop over that array of things. Each thing contains the URL of an individual child resource, which I then use to fetch that child.
The problem with this approach is that I’m sending n+1 requests to the pod provider. When the number of child resources grows, this results in 429 Too Many Requests, as I’m rate-limited by the pod provider. Is it possible to send one request that returns all child resources? Is there a way to paginate so that I can fetch a subset? I’ve read about both LDP and SPARQL, but from my understanding, LDP is a bit limited, and SPARQL is not supported by all pod providers.
Unfortunately that is correct, Solid does not mandate a way to fetch all that data in a single request. I think your two options are either accepting the latency, or changing the data model to spread the data out over fewer Resources.
Thanks for the reply. Unfortunately, I cannot segregate the data any further, as users who upload child resources will only have append access. I guess I have to figure out something clever to solve my problem.
There are some search things in the works but none are implemented on open source Solid servers. You might want to look at TrinPod which I believe can support your use case.
Things depend on your use case a bit, but am I understanding correctly that each child Resource represents a single user, and that that user has Append access to that particular Resource? One other thing you could do is to have your own bot running that has access to all child Resources, and that indexes them into a single Resource. It’s a bit of extra work, which doesn’t feel great, but I think it should work.
Or indeed, if all those users are writing to a single server that you control anyway, you could indeed look at using a particular server that supports this. (Though in that case, you could also consider just not using Solid for that part? Since for the users there would be no material difference, I think.)
Depending on the app you’re making, you could also consider hosting each user’s resource on their own pod, then keep the links in an index.
This has some implications for control and discovery and may not be well suited for your use case. Otoh it could be an exercise of decentralization and user control, and may address your rate limit error, since you’d be fetching from many different resources.
To be more concrete, my app has a feature where users can send notifications to each other via an inbox. When you create an account, the inbox RDF is created with public append enabled. If user A wants to send a notification to user B, user A can create a new resource and add it to the inbox. The inbox RDF lives inside user B’s pod, and B has full control over this resource.
Because A only has append access, I’m limited in how the data can be structured:
User A cannot create a new thing and add it directly to the inbox RDF. He has to create a new resource and append it as a child RDF with the notification as a thing inside the child RDF.
User A cannot choose the name of the child resource (at least when testing with Inrupt, the child resource is named with a random UUID)
User A cannot create sub-containers inside the Inbox.
The problem with fetching multiple children appears when user B logs into the app and wants to see his inbox. That’s when I have to fetch the inbox RDF to get the links to all the children, then send one request per child to get the details about the notifications.
Edit: I can also add that when listing the inbox to the user, I prefer that the notifications are sorted by the time they were sent. The notification thing inside each child resource has this information, but this means that all child resources have to be fetched into memory before they can be sorted. It just seems like this approach scales poorly.
Your best option here is to have an automaton which listens to the inbox resource, and that would handle the process of listening for new appended resources and retrieving them, at which point it can sort the notifications and dispatch them in appropriate order.
The problem with fetching multiple children appears when user B logs into the app and wants to see his inbox. That’s when I have to fetch the inbox RDF to get the links to all the children, then send one request per child to get the details about the notifications.
One way to address this is to store a cache using web workers, since your model will support it. Since other users have append-only access, existing child resources won’t change, so you only need to fetch each one once and cache the result. The next time the user logs in and the request is executed, you only need to fetch the new ones.
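A minimal sketch of that caching idea, assuming child resources never change once they have been appended. The names `selectUncached` and `syncInbox` are made up for illustration, `fetchChild` stands in for whatever app-specific function loads and parses one child resource, and the cache here is just an in-memory `Map` (persisting it between sessions is left out).

```javascript
// Hypothetical helper: return only the child URLs that are not cached yet.
function selectUncached(childUrls, cache) {
  return childUrls.filter((url) => !cache.has(url));
}

// Hypothetical sync step: fetch only the new children and add them to the cache.
// `fetchChild` is an assumed app-specific function: (url) => Promise<notification>.
async function syncInbox(childUrls, cache, fetchChild) {
  for (const url of selectUncached(childUrls, cache)) {
    // Sequential on purpose, to keep the request rate low.
    cache.set(url, await fetchChild(url));
  }
  return cache;
}
```

The first login still costs n+1 requests, but later logins only pay for the inbox listing plus whatever was appended since the last sync.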
Sorry I didn’t get notification about your response.
I’m pleased to see that you’re interested in the inboxes! It’s a great way to send information between people (you might be familiar with Linked Data Notifications and ActivityPub). Just be mindful that the current lack of validation and provenance in specification (afaik) can open doors to spam - e.g. a bot appending gigabytes of crap to your inbox. I hope that will be addressed in the future.
So, your issue is that there are many child documents and you have to fetch them all?
To address it, other approaches could be:
process and delete the notifications from inbox
Have the app (e.g. when owner signs in), or some dedicated bot (whenever) process the notifications, save the relevant info somewhere permanent, and then delete the processed ones.
append resource with specific name (if it’s possible)
You said you couldn’t name the resource that you append to inbox, but it might be possible. Then your app could name the notifications using timestamp or something. I suspect you tried POST ./inbox/. Have you tried the following?
PUT ./inbox/[timestamp]-[some-unique-string].ttl - this might work if the file doesn’t exist. Additionally you can add header If-None-Match: * if it makes any difference.
PATCH ./inbox/[timestamp]-[some-unique-string].ttl with n3 patch with only inserts. This may create a new file if it doesn’t exist already.
I don’t see why Append-only permission should limit these requests when the file doesn’t exist. But idk for sure.
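Roughly what I mean by the PUT variant, as an untested sketch. The helper names `notificationName` and `buildCreateRequest` and the `pod.example` URL are made up, and whether a server accepts this with Append-only access is not guaranteed.

```javascript
// Build a resource name from a timestamp plus a unique string.
// Colons and dots in the ISO timestamp are replaced to keep the name URL-friendly.
function notificationName(date, uniquePart) {
  const stamp = date.toISOString().replace(/[:.]/g, '-');
  return `${stamp}-${uniquePart}.ttl`;
}

// Build the URL and fetch options for creating the resource with PUT.
function buildCreateRequest(inboxUrl, date, uniquePart, turtle) {
  return {
    url: new URL(notificationName(date, uniquePart), inboxUrl).href,
    init: {
      method: 'PUT',
      headers: {
        'Content-Type': 'text/turtle',
        // Only create the resource; the request should fail if it already exists.
        'If-None-Match': '*',
      },
      body: turtle,
    },
  };
}
```

You would then pass the result to an authenticated fetch, e.g. `session.fetch(request.url, request.init)`. Because the names start with a timestamp, sorting the container listing by name would also sort the notifications by time.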
Currently, I’m using pods hosted through Inrupt’s PodSpaces. When appending resources, the resources are given a random UUID. Nothing I’ve tried enables us to change this behavior.
I have, however, found a solution, where the app starts a “cleanup” process whenever the user signs in to the application. This process runs in the background. It takes all resources directly in the root of the inbox, sorts them by upload date, and redistributes them into sub-containers labeled by page indices. This allows for pagination in the client, fetching only one page at a time.
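The planning part of that cleanup could look something like this simplified sketch. It only computes which resource goes on which page; actually moving the resources into the sub-containers is left out. The `{ url, uploadedAt }` shape, the `page-N/` container names, and the function name `planPages` are assumptions for illustration.

```javascript
// Sort resources oldest-first and split them into pages of `pageSize`.
// Each resource is assumed to look like { url, uploadedAt }, where
// uploadedAt is an ISO 8601 date string.
function planPages(resources, pageSize) {
  const sorted = [...resources].sort(
    (a, b) => Date.parse(a.uploadedAt) - Date.parse(b.uploadedAt)
  );
  const pages = [];
  for (let i = 0; i < sorted.length; i += pageSize) {
    pages.push({
      // Hypothetical sub-container name, relative to the inbox.
      container: `page-${pages.length}/`,
      urls: sorted.slice(i, i + pageSize).map((r) => r.url),
    });
  }
  return pages;
}
```

The client then only needs to fetch the resources listed in one `page-N/` container at a time.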
For PUT and PATCH, both URLs ./inbox/ and ./inbox/[file-name].ttl result in 403 Forbidden.
POST to ./inbox/ works, but the resource is given a random UUID
POST to ./inbox/[file-name].ttl results in 403 forbidden
After some digging, I found this fetch that adds a slug header. This seems to work:
const response = await session.fetch('./inbox/', {
  method: 'POST',
  headers: {
    'Content-Type': 'text/turtle',
    Slug: 'my-inbox-item', // Can define file name here
  },
  body: turtleData,
});
However, it is said that not all pod providers support the Slug header. If it is not supported, the name of the resource falls back to a random UUID. I have not been able to verify this last behavior, as Slug works with PodSpaces from Inrupt.
Edit: Can confirm that the Community Solid Server (CSS) also supports slugs. However, unlike the Enterprise Solid Server (ESS), which returns a 409 Conflict error when attempting to append a resource with a name that already exists, CSS appends the resource with a random UUID instead of the name specified in the slug.
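Since servers differ here, the app can check what name it actually got. A rough sketch, assuming the server answers a successful POST with a `Location` header pointing at the new resource. The function names `postWithSlug` and `slugWasHonored` are made up, and `fetchFn` is any fetch-compatible function such as `session.fetch`.

```javascript
// Return the last path segment of a resource URL.
function lastSegment(url) {
  const parts = new URL(url).pathname.split('/').filter(Boolean);
  return parts.length > 0 ? parts[parts.length - 1] : '';
}

// True if the created resource name starts with the requested slug
// (a server might append a file extension, hence startsWith).
function slugWasHonored(createdUrl, slug) {
  return createdUrl !== null && lastSegment(createdUrl).startsWith(slug);
}

// POST with a Slug header and report what the server did with it.
async function postWithSlug(fetchFn, containerUrl, slug, turtle) {
  const response = await fetchFn(containerUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'text/turtle', Slug: slug },
    body: turtle,
  });
  if (response.status === 409) {
    // The name is already taken (the ESS behavior described above).
    return { created: false, conflict: true, url: null };
  }
  if (!response.ok) {
    throw new Error(`POST failed with status ${response.status}`);
  }
  // Assumption: Location holds the URL of the new resource, possibly relative.
  const location = response.headers.get('Location');
  const url = location ? new URL(location, containerUrl).href : null;
  return { created: true, conflict: false, url };
}
```

With the returned `url`, `slugWasHonored(url, slug)` tells you whether the server kept the requested name or fell back to something else, such as a UUID.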
I didn’t know the slug header. Very interesting. Thank you for sharing!
(On second look, it seems to have been removed from the specification.)
I did some testing with PUTting the resource against Community Solid Server. It turns out it only works when the acl:Authorization on the container has an <...> acl:default <./> triple for the agent. If it’s missing, an attempt to PUT a child resource fails with 403. I suppose PATCH will behave similarly.
In case of inboxes, allowing appends to children doesn’t seem desirable though.
Testing:
I put my script in a gist in case you want to try it. It assumes you first install the dependencies and start a Community Solid Server locally on port 3000 with npx @solid/community-server@latest.