All my tweets are on Solid now

aveltens · June 20, 2020, 12:26pm

Thanks to GDPR, all users can export their Twitter data in a machine-readable format. I did this, converted all my tweets to Activity Streams RDF, and put them on my Pod. Here is the result.

All tweets are grouped in containers by year and month starting from: https://angelo.veltens.org/public/tweets/

An example tweet:

https://angelo.veltens.org/public/tweets/2020/01#1216690474544771072

In case you are interested in the code (which is totally hacked together quickly): Convert tweets to activity streams ($1988617) · Snippets · GitLab

Perhaps someone wants to build a cleaner solution, e.g. a CLI tool to do this.
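As a starting point for such a tool, here is a minimal Python sketch of the conversion step. It is not the code from the snippet above. The input keys (`id`, `created_at`, `text`), the helper names, and the author IRI `https://example.org/profile/card#me` are assumptions made for illustration, and the real Twitter export would need to be mapped onto this shape first. The output mirrors the structure of the tweets on the Pod: one Note per tweet, named by a fragment, in a document per year and month.

```python
from datetime import datetime, timezone

# Prefix block written once at the top of each monthly document.
PREFIXES = (
    "@prefix : <#>.\n"
    "@prefix as: <https://www.w3.org/ns/activitystreams#>.\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.\n"
)


def turtle_literal(text):
    """Escape a Python string as a double-quoted Turtle literal."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def document_path(published):
    """Return the year/month path of the document, e.g. '2020/06'."""
    return published.strftime("%Y/%m")


def tweet_to_turtle(tweet, author_iri):
    """Return (document path, Turtle statements) for one tweet.

    `tweet` is a dict with the hypothetical keys 'id', 'created_at'
    (ISO 8601 with an explicit UTC offset) and 'text'.
    """
    published = datetime.fromisoformat(tweet["created_at"]).astimezone(timezone.utc)
    timestamp = published.strftime("%Y-%m-%dT%H:%M:%SZ")
    tweet_id = tweet["id"]
    lines = [
        f":{tweet_id}",
        "    a as:Note;",
        f"    as:attributedTo <{author_iri}>;",
        f"    as:content {turtle_literal(tweet['text'])};",
        f'    as:published "{timestamp}"^^xsd:dateTime;',
        f"    as:url <https://twitter.com/i/web/status/{tweet_id}/>.",
    ]
    return document_path(published), "\n".join(lines)


# Placeholder tweet; the text is made up for the example.
example = {
    "id": "1268797880401362945",
    "created_at": "2020-06-05T06:52:42+00:00",
    "text": 'Hello "Solid"',
}
path, turtle = tweet_to_turtle(example, "https://example.org/profile/card#me")
print(path)
print(PREFIXES)
print(turtle)
```

A full tool would additionally group the statements by document path and upload each document to the Pod, which is left out here.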

shuhei · July 16, 2020, 11:30am

Hi @aveltens! I am also trying to export my data from Twitter, Facebook, Google Maps and so on, and put it on my Pod. I’m interested in how you converted your tweets to RDF, and I have a question, although I’m an RDF beginner.

A sample of your data on your Pod is as follows.

:1268797880401362945
    a n0:Note;
    n0:attributedTo c:me;
    n0:content
        "RT @codecentric: \ud83d\ude4c Den Softwerker gibt es jetzt auch als Event! Am 19.6. veranstalten wir die erste SoftwerkerKonf als Online Event, kosten\u2026";
    n0:published "2020-06-05T06:52:42Z"^^XML:dateTime;
    n0:url n1:.

You used the tweet’s ID, like #1268797880401362945, as the RDF subject, but that is not a globally unique ID. I think the subject should be https://twitter.com/i/web/status/1268797880401362945/ as follows, because a URL is globally unique.

@prefix n1: <https://twitter.com/i/web/status/1268797880401362945/>.
...

n1
    a n0:Note;
    n0:attributedTo c:me;
    n0:content
        "RT @codecentric: \ud83d\ude4c Den Softwerker gibt es jetzt auch als Event! Am 19.6. veranstalten wir die erste SoftwerkerKonf als Online Event, kosten\u2026";
    n0:published "2020-06-05T06:52:42Z"^^XML:dateTime.

Why did you adopt the tweet’s ID as the RDF subject?
What do you think about my suggestion? I’m looking forward to your answer.

aveltens · July 19, 2020, 7:48am

Hi @shuhei, first of all welcome to the Forum.

The tweet you cite actually has a globally unique identifier, it is:

https://angelo.veltens.org/public/tweets/2020/06#1268797880401362945

I used the original tweet ID as the fragment as a pragmatic way to mint a unique URI. I could also have generated a UUID or just used an incremental counter per document. It does not really matter; the tweet is identified by the full URI.

So, why did I mint a new URI instead of using the original twitter URL https://twitter.com/i/web/status/1268797880401362945/ ?

Because the twitter URL does not identify the tweet. The URL identifies and resolves to a human readable HTML document citing the tweet. Twitter does not follow linked data principles. If it did, I would not have to extract my data at all, I would just link to it.

Note that I do not just abandon the twitter URL, I refer to it via as:url which is described as:

Identifies one or more links to representations of the object

Since the twitter page is a representation of the object (tweet), this seemed a proper choice.
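To illustrate the distinction in general terms (this is standard URI handling, not code from the conversion script), the following Python sketch splits both identifiers with `urllib.parse.urldefrag`. The minted URI has a fragment, so it can name the tweet while the part before the `#` is the document a client fetches. The Twitter URL has no fragment and simply names the HTML page.

```python
from urllib.parse import urldefrag

tweet_uri = "https://angelo.veltens.org/public/tweets/2020/06#1268797880401362945"
twitter_url = "https://twitter.com/i/web/status/1268797880401362945/"

# The fragment is never sent to the server: a client fetches the document
# and then looks up the thing named by the fragment inside it.
document_url, fragment = urldefrag(tweet_uri)
print(document_url)  # the RDF document describing the tweets of that month
print(fragment)      # the local name of the tweet within that document

# The Twitter URL has no fragment, so it identifies the page itself.
page_url, page_fragment = urldefrag(twitter_url)
print(page_url, repr(page_fragment))
```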

happybeing · July 19, 2020, 11:02am

I think this is one of the things about LD that’s hard. When explained, it’s simple, but it’s so easy to forget unless you are actively building this stuff!

shuhei · July 20, 2020, 9:26am

@aveltens Thank you for the reply! I understand, and I have another question.

The tweet you cite actually has a globally unique identifier, it is:
https://angelo.veltens.org/public/tweets/2020/06#1268797880401362945

@prefix : <#>.
...

:1268797880401362945
    a n0:Note;
    ...

I understand that @prefix : <#>. and :1268797880401362945 together mean https://angelo.veltens.org/public/tweets/2020/06#1268797880401362945, but is it a common or standardized convention that @prefix : <#>. refers to the URL of the file itself?

I think it could also be written as follows.

@prefix : <https://angelo.veltens.org/public/tweets/2020/06#>.
...

:1268797880401362945
    a n0:Note;
    ...

aveltens · July 20, 2020, 12:37pm

@prefix : <https://angelo.veltens.org/public/tweets/2020/06#>.

Yes, this would also work, but the relative form (@prefix : <#>.) is more flexible when copying or moving files around.
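The effect of the relative form can be sketched with Python's `urllib.parse.urljoin`, which applies the same kind of relative reference resolution against a base URL. The second base URL, `https://example.org/archive/tweets/2020/06`, is a made-up location used only to show what happens after a move.

```python
from urllib.parse import urljoin

# A relative IRI such as <#1268797880401362945> is resolved against the
# URL of the document it appears in.
relative = "#1268797880401362945"

original_base = "https://angelo.veltens.org/public/tweets/2020/06"
moved_base = "https://example.org/archive/tweets/2020/06"  # hypothetical new location

original_uri = urljoin(original_base, relative)
moved_uri = urljoin(moved_base, relative)

print(original_uri)
print(moved_uri)
```

With the absolute prefix, the file would keep naming the tweet under the old location after being copied. With the relative prefix, the same file content yields identifiers under wherever it is served from.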

For more information about the Turtle language in general, you can consult https://www.w3.org/TR/turtle/. Of course, feel free to keep asking questions in this forum as well.
