open source alternatives

hosting an open source 𝕏 archive that you control

It’s time to leave 𝕏 if you haven’t done so already. I stopped posting there and locked my account immediately after the US election when it became clear that the owner was a de facto US Government functionary. Now that the owner has de jure political power and is clearly a neo-Nazi, it’s definitely time to leave.

But what if you’re a librarian (or a former librarian like me) or someone else who has a deep-seated pathological need to archive and preserve? My 𝕏 archive is, to my shame, the largest corpus of text that I have ever produced and some of it at least is worth preserving. If you have an institutional 𝕏 account, you may have more serious policy and legislative reasons to preserve the archive and make it publicly accessible. This post outlines the ways that I have preserved my 𝕏 archive using both the Internet Archive and a tweetback website that I control.

downloading your archive

First you need to download your 𝕏 archive following the instructions at https://help.x.com/en/managing-your-account/how-to-download-your-x-archive. It takes around 24 hours for them to produce your archive; mine ended up being 3.33 GB.

Your archive will be a zip file containing several files and folders. The tweets are in ./data/tweets.js (and possibly ./data/tweets-part1.js in addition if you have a large archive).
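If you want to look inside tweets.js yourself before handing it to any other tool, a few lines of Python will do. This is only a minimal sketch, and it rests on two assumptions about the file format that you should check against your own archive: that the file is a JavaScript assignment (something like `window.YTD.tweets.part0 = [ ... ]`) wrapping a plain JSON array, and that each entry nests the tweet under a "tweet" key. The function name load_tweets_js and the sample data are made up for illustration.

```python
import json


def load_tweets_js(text):
    """Parse the text of a tweets.js file into a list of tweet dicts."""
    # Assumption: the file looks like `window.YTD.tweets.part0 = [ ... ]`,
    # so everything from the first "[" onwards is a plain JSON array.
    start = text.index("[")
    entries = json.loads(text[start:])
    # Assumption: each entry wraps the tweet under a "tweet" key;
    # fall back to the entry itself if that key is absent.
    return [entry.get("tweet", entry) for entry in entries]


# A tiny invented sample in the assumed format, standing in for ./data/tweets.js.
sample = 'window.YTD.tweets.part0 = [{"tweet": {"id_str": "123", "full_text": "hello"}}]'
tweets = load_tweets_js(sample)
print(len(tweets), tweets[0]["full_text"])
```

To run it on a real archive, read the file with open("data/tweets.js", encoding="utf-8").read() and pass the result to load_tweets_js; the same would apply to tweets-part1.js if you have one.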

The Internet Archive

Uploading your tweets to the Internet Archive is a good way to ensure your tweets are preserved in the collective cultural record. The Internet Archive is a US non-profit that preserves and provides free access to digitised media like music, books, and videos as well as born-digital material like websites. Their Wayback Machine has snapshots of billions of web pages for the historical record and, if you point them in the right direction, they will save your tweets in a way that ensures their long-term preservation and makes them publicly accessible.

The Internet Archive provides instructions for archiving tweets with the Wayback Machine at https://help.archive.org/help/how-to-archive-your-tweets-with-the-wayback-machine/ but essentially they ask you to use their interface to process your tweets.js file into a CSV file containing all the URLs for your tweets, save that CSV file as a Google Sheets spreadsheet, and then give them access to that spreadsheet for their automated processing to archive every URL in it.
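To make that conversion step less mysterious, here is a rough sketch of what turning tweets into a CSV of URLs could look like. It is not the Internet Archive's tool and may not match the layout they expect, so follow their instructions for the real upload. It assumes each tweet has an "id_str" field and that tweet URLs follow the pattern https://twitter.com/<username>/status/<id>; the function names, the username example_user, and the sample IDs are all invented for illustration.

```python
import csv
import io


def tweet_urls(tweets, username):
    """Build one public URL per tweet."""
    # Assumption: each tweet dict carries its ID as a string under "id_str",
    # and tweet URLs follow https://twitter.com/<username>/status/<id>.
    return [f"https://twitter.com/{username}/status/{t['id_str']}" for t in tweets]


def urls_to_csv(urls):
    """Return CSV text with one URL per row and no header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for url in urls:
        writer.writerow([url])
    return buffer.getvalue()


# Invented sample tweets standing in for the parsed contents of tweets.js.
sample_tweets = [{"id_str": "123"}, {"id_str": "456"}]
urls = tweet_urls(sample_tweets, "example_user")
csv_text = urls_to_csv(urls)
print(csv_text)
```

Writing csv_text to a file gives you something you can open in any spreadsheet program to check that the URLs resolve to your tweets.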

tweetback

Giving my tweets to the Internet Archive was a good way to preserve them long-term but I also wanted more immediate access to my archive: a site that I could control and host myself that also allowed me to search my tweets and display them in a more accessible way. So I’m hosting my own public 𝕏 archive at tweets.simonxix.com using tweetback. tweetback is open source software (MIT License) originally developed by Zach Leatherman that ingests your tweets and produces a static site to host the archive. If you have server space available or are a cultural institution that can host a website easily, this is a good option for preserving your tweets and making them accessible.

tweetback is a little more technically involved than uploading to the Internet Archive and requires familiarity with the command line and ideally Node.js. The full instructions are available in the project’s README but the basic process involves running an import script on your tweets.js file to import the tweets into a small SQLite database, then running a build script to get Eleventy to build a static website. This isn’t explained in the README but an issues discussion alerted me to the fact that you can also import tweets-part1.js: after you’ve imported the first part of your archive, simply rename tweets-part1.js to tweets.js and run the import process again before building the site. You can then move the site to wherever you choose to host it. tweetback has dedicated instructions for deploying to GitHub Pages; I’ve chosen to host on my own server by moving the whole _site directory that Eleventy built into a directory on my server and serving it to the web through Nginx (the basic Nginx configuration for that can be found here).
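Before moving the _site directory anywhere, it can be worth checking the built site on your own machine. The sketch below is not part of tweetback and has nothing to do with the Nginx setup; it just uses Python's standard library to serve a directory of static files locally. The function name make_preview_server, the port 8080, and the assumption that you run it from the folder containing _site are all mine.

```python
import functools
import http.server


def make_preview_server(directory="_site", port=8080):
    """Create a local-only HTTP server for a directory of static files."""
    # Bind the handler to the chosen directory rather than the current one.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Listening on 127.0.0.1 keeps the preview private to this machine.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


# To start the preview, call serve_forever() and stop it with Ctrl+C:
#   make_preview_server().serve_forever()
```

With the server running, the site should be browsable at http://127.0.0.1:8080/ until you stop it. This is only for checking the build; a public site still needs a proper web server or static host.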

As well as a searchable archive of all my tweets, tweetback also offers some interesting statistics on my tweets and lists of my most recent and most popular tweets. There were some statistics that I didn’t want to include, namely those relating only to the last 12 months, so I simply commented out those sections in the index.html homepage of the generated site. I also renamed ‘Recent tweets’ as ‘Final tweets’ to underscore the completeness of this self-hosted archive.

conclusion

The whole point of this newsletter is to advocate for digital sovereignty: you should have control over your interactions with the digital and you should have a space online that you control. You should have a website. You should also have control over the archive of your own textual production whether that’s scholarly work or a decade of tweets. Whether it’s by archiving with the collective cultural record of the Internet Archive or hosting your own archive, if you want your tweets preserved then you should archive them so they’re not exclusively owned by a neo-Nazi.

I’ll leave off with my most popular tweet from over a decade on the platform. There has been so much more fascism than I expected in my 30s.

tweet reading 'So far the main difference between my 20s and my 30s is all the fascism. No-one tells you how much fascism there will be in your 30s.'