Twitter mining with Solid


#1

Hello Solid Community. I am new here. I have been collecting millions of Tweets over the past year as part of an independent research project. Somewhat by accident, this research has led me to develop a Python application for collecting and interactively querying large Twitter data sets.

I am interested in supporting user ownership of data in general, so as a mostly non-developer I am trying to understand whether Solid could be integrated into something like my Twitter collecting/analyzing app.

Intuitively it seems plausible to me that my app, instead of holding a database governed only by Twitter terms of use, could support each Tweet-author’s ability to opt in or out of my research. Does that sound right?
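As a rough illustration of the opt-in/opt-out idea, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (`tweets`, `consent`, `opted_in`) are made up for the example, and it assumes each author's choice has already been recorded somewhere the app can read, for instance copied from a document in that author's pod. How that lookup would work with Solid is not shown.

```python
import sqlite3

# Hypothetical schema: these tables and columns are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tweets (tweet_id INTEGER PRIMARY KEY, author_id INTEGER, text TEXT);
    CREATE TABLE consent (author_id INTEGER PRIMARY KEY, opted_in INTEGER NOT NULL);
""")
conn.executemany("INSERT INTO tweets VALUES (?, ?, ?)", [
    (1, 100, "first tweet"),
    (2, 200, "second tweet"),
    (3, 300, "third tweet"),
])
# Author 100 opted in, author 200 opted out, author 300 never answered.
conn.executemany("INSERT INTO consent VALUES (?, ?)", [(100, 1), (200, 0)])


def usable_tweets(conn, require_opt_in=True):
    """Return the tweets allowed for research under the chosen consent policy."""
    if require_opt_in:
        # Opt-in policy: keep only tweets whose author explicitly said yes.
        sql = """SELECT t.tweet_id, t.author_id, t.text
                 FROM tweets t
                 JOIN consent c ON c.author_id = t.author_id
                 WHERE c.opted_in = 1
                 ORDER BY t.tweet_id"""
    else:
        # Opt-out policy: keep everything except authors who explicitly said no.
        sql = """SELECT t.tweet_id, t.author_id, t.text
                 FROM tweets t
                 LEFT JOIN consent c ON c.author_id = t.author_id
                 WHERE c.opted_in IS NULL OR c.opted_in = 1
                 ORDER BY t.tweet_id"""
    return conn.execute(sql).fetchall()


print(usable_tweets(conn))                        # opt-in policy
print(usable_tweets(conn, require_opt_in=False))  # opt-out policy
```

The two policies differ only in how they treat authors who never answered, which matters a lot when the collection covers tens of millions of users.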

Based on my initial scan of Solid material, it seems like there are no immediate resources to help someone like me who is curating a (currently local SQL-based) collection of Tweets from tens of millions of independent users. Does that also sound right?

I am curious how the Solid vision interacts with the idea of something like the Twitter API that allows actors of many different stripes to do the kind of thing I am doing – collect other people’s content at scale for the purpose of research. It seems like Solid speaks to the idea of replacing Twitter’s terms of use with something decentralized, but it’s less clear to me how a computational sociologist (for example) would incorporate Solid into their use of the Twitter API.

I’m curious to hear perspectives on this. Thanks!


#2

Disclaimer: I’m quite a noob to Solid, just throwing a few ideas out there…

In theory, you could use the Twitter API to pull tweets and store them in a Solid pod as linked data. You could then query the data with SPARQL, which is similar to SQL, although I would guess it is not as fast as a modern database.
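To give a feel for what "as linked data" could mean, here is a minimal sketch that maps one tweet to a JSON-LD document using schema.org terms, with only the Python standard library. The input field names (`id`, `author_id`, `created_at`, `text`) and the `https://example.org/tweets/` identifier are assumptions for the example, and the choice of schema.org's `SocialMediaPosting` type is just one plausible vocabulary. Uploading the result to a pod (authentication, HTTP requests) is not covered.

```python
import json


def tweet_to_jsonld(tweet):
    """Map a tweet-like dict to a JSON-LD document using schema.org terms."""
    return {
        "@context": "https://schema.org",
        "@type": "SocialMediaPosting",
        # Placeholder identifier; a real app would pick its own URL scheme.
        "@id": f"https://example.org/tweets/{tweet['id']}",
        "author": {"@type": "Person", "identifier": str(tweet["author_id"])},
        "dateCreated": tweet["created_at"],
        "text": tweet["text"],
    }


# Made-up sample record, not real Twitter data.
sample = {
    "id": 1,
    "author_id": 100,
    "created_at": "2020-01-01T12:00:00Z",
    "text": 'He said "hi"\n',
}
doc = tweet_to_jsonld(sample)
print(json.dumps(doc, indent=2))
```

Building a plain dict and letting `json.dumps` do the serialization avoids hand-escaping quotes and newlines in the tweet text.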

It may well be a breach of Twitter’s T&C to do that, though (as far as I know, there are restrictions on storing and redistributing tweet data, so check the developer agreement), and I’m not sure what you’d actually gain from it. It’s just copying data to another format.

Solid could be used for building an alternative to Twitter where ownership of your data is built in, rather than assigning ownership of everything you tweet to the platform.