Thanks to GDPR, all users can export their Twitter data in a machine-readable format. I did this, converted all my tweets to Activity Streams RDF, and put them on my Pod. Here is the result.
Hi @aveltens! I am also trying to export my data from Twitter, Facebook, Google Maps and so on, and put it on my Pod. I'm interested in your tweets converted to RDF, and I have a question, although I'm an RDF beginner.
Here is a sample of the data on your Pod:
    :1268797880401362945
        a n0:Note;
        n0:attributedTo c:me;
        n0:content
            "RT @codecentric: \ud83d\ude4c Den Softwerker gibt es jetzt auch als Event! Am 19.6. veranstalten wir die erste SoftwerkerKonf als Online Event, kosten\u2026";
        n0:published "2020-06-05T06:52:42Z"^^XML:dateTime;
        n0:url n1:.
You used the tweet's ID, as in #1268797880401362945, as the RDF subject, but that ID is not globally unique on its own. I think the subject should be https://twitter.com/i/web/status/1268797880401362945/, as follows, because a URL is globally unique.
    @prefix n1: <https://twitter.com/i/web/status/1268797880401362945/>.
    ...
    n1:
        a n0:Note;
        n0:attributedTo c:me;
        n0:content
            "RT @codecentric: \ud83d\ude4c Den Softwerker gibt es jetzt auch als Event! Am 19.6. veranstalten wir die erste SoftwerkerKonf als Online Event, kosten\u2026";
        n0:published "2020-06-05T06:52:42Z"^^XML:dateTime.
Why did you use the tweet's ID as the RDF subject?
What do you think of my suggestion? I'm looking forward to your answer.
I used the original tweet ID as a fragment as a pragmatic way to mint a unique URI. I could also have generated a UUID or just used an incremental counter per document. It does not really matter; the tweet is identified by the full URI.
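To make the three minting options mentioned above concrete, here is a minimal Python sketch. The document URL and the function names are hypothetical and chosen only for illustration; this is not the script used for the actual conversion.

```python
import itertools
import uuid

# Hypothetical document URL, used only for illustration.
DOC = "https://example.org/tweets/2020/06"


def from_tweet_id(doc: str, tweet_id: str) -> str:
    """Reuse the original tweet ID as the fragment."""
    return f"{doc}#{tweet_id}"


def from_uuid(doc: str) -> str:
    """Use a freshly generated random UUID as the fragment."""
    return f"{doc}#{uuid.uuid4()}"


def counter_minter(doc: str):
    """Return a function that mints URIs from a per-document counter."""
    counter = itertools.count(1)
    return lambda: f"{doc}#{next(counter)}"


print(from_tweet_id(DOC, "1268797880401362945"))
next_uri = counter_minter(DOC)
print(next_uri())
print(next_uri())
```

In all three cases the fragment only has to be unique within the document, because the document URL in front of the `#` makes the full URI globally unique.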
Because the Twitter URL does not identify the tweet. The URL identifies and resolves to a human-readable HTML document citing the tweet. Twitter does not follow linked data principles. If it did, I would not have to extract my data at all; I would just link to it.
Note that I do not just abandon the Twitter URL. I refer to it via as:url, which is described as:
Identifies one or more links to representations of the object
Since the Twitter page is a representation of the object (the tweet), this seemed a proper choice.
    @prefix : <#>.
    ...
    :1268797880401362945
        a n0:Note;
        ...
I understand that @prefix : <#>. and :1268797880401362945 together mean https://angelo.veltens.org/public/tweets/2020/06#1268797880401362945, but is it common or standardized for @prefix : <#>. to refer to the URL of this file?
I think it could also be written as follows.
    @prefix : <https://angelo.veltens.org/public/tweets/2020/06#>.
    ...
    :1268797880401362945
        a n0:Note;
        ...
Yes, this would also work, but the relative form (@prefix : <#>.) is more flexible when copying or moving files around.
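In Turtle, a relative IRI such as <#> is resolved against the base IRI of the document, which is typically the URL it was loaded from unless an @base directive says otherwise. The following Python sketch imitates that resolution with the standard library to show why the relative form survives a move. The helper name `expand` and the "moved" URL are hypothetical, and joining the prefix and local name before resolving is a simplification of what a Turtle parser does.

```python
from urllib.parse import urljoin

TWEET_ID = "1268797880401362945"
ORIGINAL = "https://angelo.veltens.org/public/tweets/2020/06"
# Hypothetical new location of the same file, for illustration only.
MOVED = "https://example.org/archive/tweets"


def expand(base: str, prefix_iri: str, local_name: str) -> str:
    """Expand a prefixed name and resolve it against the document's base IRI.

    Simplification: the prefix IRI and local name are concatenated first,
    then resolved as one IRI reference.
    """
    return urljoin(base, prefix_iri + local_name)


# Relative prefix: the subject IRI follows the document wherever it lives.
print(expand(ORIGINAL, "#", TWEET_ID))
# -> https://angelo.veltens.org/public/tweets/2020/06#1268797880401362945
print(expand(MOVED, "#", TWEET_ID))
# -> https://example.org/archive/tweets#1268797880401362945

# Absolute prefix: the subject IRI still points at the old location after a move.
print(expand(MOVED, ORIGINAL + "#", TWEET_ID))
# -> https://angelo.veltens.org/public/tweets/2020/06#1268797880401362945
```

Whether following the document is desirable depends on the use case: it keeps a copied file self-consistent, but the copied tweets then have different URIs than the originals.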
For more information about the Turtle language in general, you can consult https://www.w3.org/TR/turtle/. Of course, feel free to ask questions in this forum as well.