I wanted a way to browse and search my old tweets, archive them in case Twitter ever dies, and do this all on my own web server. There were a few roundabout ways and hacky approaches to doing this out there, but I felt there were too many steps involved for something so simple, so I created Archive My Tweets. Now I have a browsable and searchable archive of all my tweets.

Due to a limitation of Twitter’s API, only the latest 3200 tweets can be retrieved. If you haven’t started grabbing your tweets yet, you might want to start now so you don’t have to manually copy them in later.
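To make the limit concrete, here is a small arithmetic sketch. It assumes the page size of 200 that appears in the cron.php output quoted in the comments below; the function names are made up for illustration and are not part of Archive My Tweets.

```python
import math

MAX_RETRIEVABLE = 3200  # API ceiling stated in the post
PER_PAGE = 200          # assumption: page size seen in the cron.php output in the comments


def pages_needed(tweet_count: int, per_page: int = PER_PAGE) -> int:
    """Pages required to fetch everything the API is willing to return."""
    reachable = min(tweet_count, MAX_RETRIEVABLE)
    return math.ceil(reachable / per_page)


def unreachable(tweet_count: int) -> int:
    """Tweets older than the newest 3200, which would have to be copied in by hand."""
    return max(0, tweet_count - MAX_RETRIEVABLE)


if __name__ == "__main__":
    # 4,074 is the tweet count one commenter reports further down.
    print(pages_needed(4074), unreachable(4074))
```

An account with 4,074 tweets would need 16 pages, and 874 of its oldest tweets would be out of reach of the API.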

Twitter Display Guidelines and Terms of Service

(August 17, 2012 update) After reading through Twitter’s new Display Guidelines, the Developer Rules of the Road, and the Terms of Service, I’m not convinced that Archive My Tweets falls under the new display guidelines.

As defined by Twitter, a Twitter Client allows a Twitter user to log in, view his timeline (meaning tweets from people he follows), create new tweets, retweet, etc. Archive My Tweets does none of that; it’s only personal backup software.

My understanding is that using Archive My Tweets to store one’s own tweets is allowed by their TOS. First, as described in the TOS, the creator of the content (the tweet) still owns all rights to that content after posting to Twitter. Second, the default Archive My Tweets setup is created in such a way that you should only be storing tweets that you’ve created. The way I see it, you’re just taking content you already own back from Twitter, and displaying that content in a manner that you choose.

Comments

58 Responses to “Archive My Tweets”

  1. Hi Andrew,
    After searching for a while for a way to archive my tweets, I’m using your application. It’s great to have this wonderful app running fine.

    The only thing is that it doesn’t update my (new) tweets automatically after I set up a cron job in cPanel.

    Now cron keeps sending mail:
    /bin/sh: 0: command not found

    What does this mean?

    Kind regards,

  2. Samira, I’m not exactly sure what could cause that problem, but it looks like it has something to do with the way your web host is running the commands in cron jobs. You may want to contact your host or look in their help documents to see if someone else has had that problem with cron.

  3. Hiiiiiii :) )
    I’ve been using Twitter since 2008 with multiple usernames. In the beginning it was, to me, like YouTube, where everyone has one or more accounts to create her/his own friends and subs to become (more) popular.

    Let’s say, my usernames have together between 14k and 15k tweets, is it possible to use this lovely application for multiple twitter accounts on the same website?

  4. Sorry, it doesn’t have support for multiple usernames. You could install it in multiple directories if you wanted, e.g. /tweets/username1/ and /tweets/username2/.

  5. Hi :) ,
    I forgot to fill in a name at the previous post.
    I tried it with 3 different directories and 3 usernames (www.example.com/tweets/username/), and they are all showing exactly the same tweets, even when a user has tweeted less than the others or hasn’t tweeted at all.

    All the usernames’ cron secrets start with ‘tw_’, and I wondered if this has something to do with the cron secret.

  6. You need to set the DB_TABLE_PREFIX to something different for each user so the tweets are stored in separate tables. For example, “username1_”, “username2_”, and “username3_”.
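Why a distinct prefix keeps the accounts apart can be sketched with an in-memory SQLite database. This is only an illustration: the `tweets` suffix, the two-column schema, and the helper names are assumptions, not the application's actual PHP code or table layout.

```python
import re
import sqlite3


def table_name(prefix: str) -> str:
    # Identifiers cannot be bound as SQL parameters, so validate before interpolating.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", prefix):
        raise ValueError(f"unsafe table prefix: {prefix!r}")
    return f"{prefix}tweets"


def install(conn: sqlite3.Connection, prefix: str) -> None:
    """Create the (hypothetical) tweet table for one installation."""
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table_name(prefix)} "
        "(id INTEGER PRIMARY KEY, tweet TEXT)"
    )


def add_tweet(conn: sqlite3.Connection, prefix: str, tweet_id: int, text: str) -> None:
    conn.execute(
        f"INSERT OR IGNORE INTO {table_name(prefix)} (id, tweet) VALUES (?, ?)",
        (tweet_id, text),
    )


def count(conn: sqlite3.Connection, prefix: str) -> int:
    return conn.execute(f"SELECT COUNT(*) FROM {table_name(prefix)}").fetchone()[0]


conn = sqlite3.connect(":memory:")
for prefix in ("username1_", "username2_"):
    install(conn, prefix)  # one "installation" per prefix

add_tweet(conn, "username1_", 1, "hello from user 1")
```

With separate prefixes, the tweet stored for `username1_` does not show up under `username2_`. With a shared prefix, every installation would read and write the same table, which matches the symptom described above.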

  7. Hiii,
    If (‘DB_TABLE_PREFIX’, ‘tw_’); is changed to (‘DB_TABLE_PREFIX’, ‘username_’); it shows the message below.
    Archive My Tweets is not yet installed. Please see the documentation.

    There should be a solution how to get this great app running more accounts.

    Thanks in advance

  8. That message is correct. Once you change the database table prefix you’ll need to go through the installation for each username. You’ll also need a separate cron job running for each user. The code doesn’t handle multiple accounts, so installing it three times is a workaround, and it’ll take some extra work to get it all set up.

  9. It’s working properly, thanks for your answers :)

  10. Hi there. I found archive-my-tweets via Jon Stokes’ post at Wired Cloudline. Everything seemed to work out fine until I got to visit the cron.php file plus its secret key. At first it added the amt_tweets table, but after that all I got was an error message: “PHP Warning: date() expects parameter 2 to be long, string given in E:\home\pittsburgh\Web\tweets\classes\archivemytweets.php on line 141” (I have not changed this file in any way.) No records were added to the table.

    Do you have any idea what I should change to have it working?

  11. After I added a port (:3306, same as example), error message was entirely changed:

    Retrieving 200 tweets per page.
    Exception: Invalid response.
    Got 200 results on page 1.
    Got 200 results on page 2.
    Exception: Invalid response.
    Got 200 results on page 3.
    Exception: Invalid response.
    Got 200 results on page 4.
    Exception: Invalid response.
    Got 200 results on page 5.
    Got 200 results on page 6.
    Got 200 results on page 7.
    Exception: Invalid response.
    Got 200 results on page 8.
    Exception: Invalid response.
    Twitter is being flaky. Too many exceptions! Try again later.

  12. It should all still work despite this warning. I wrote this code almost 2 years ago… some things with the Twitter API have changed and are causing some warnings to be issued. I’ll see if I can get around to fixing them.

  13. This looks fine, but sometimes the Twitter API doesn’t handle a lot of requests very well. Just give it a little time then try it again later.

  14. No luck yet. hehehehehe What I found out is that I get the long error message above (with small variations) when I hit enter after typing the URL, but I get the short one when refreshing the page. Thanks for the help anyway. I’ll keep trying, since the other commenters have made it this month.

  15. Andrew,

    I just wanted to thank you for donating your time on this project. I just thought I’d chime in to say that for the past few days, every attempt has resulted in:

    Retrieving 200 tweets per page.
    Exception: Invalid response.
    Exception: Invalid response.
    Got 200 results on page 1.
    Exception: Invalid response.
    Got 200 results on page 2.
    Exception: Invalid response.
    Got 200 results on page 3.
    Exception: Invalid response.
    Got 200 results on page 4.
    Exception: Invalid response.
    Twitter is being flaky. Too many exceptions! Try again later.

    I haven’t checked the DB directly, but nothing is getting posted when visiting http://www.mysite.com/tweets/.

    This isn’t a complaint or a request, just a friendly FYI.

    Thanks again!

    Rob

  16. Betcha lovin’ the publicity ;-)

    Thought I’d add to the pile:

    Retrieving 200 tweets per page.
    No tweets on page 1.
    0 new tweets over 1 query.
    API: (349/350 remaining) API count resets at 9:26pm1.
    ERROR INSERTING INTO DATABASE:

  17. I’ve had the same issue as Keith. It seems to be authenticating with twitter ok, because when I change any of the secure keys it fails.

    The thing that’s stumping me is the ERROR INSERTING INTO DATABASE: line. The table was created fine, so it appears to have all the correct permissions.

    I’ll monitor it to see if it’s just the API playing up.

  18. I never stumbled on that ERROR INSERTING INTO DATABASE message. I do know that no record was inserted into the database table, but the table itself was created automatically, and it was again after I dropped it and ran the script again.

  19. I posted version 0.2 on Google code that may fix some of these problems (http://code.google.com/p/archive-my-tweets/downloads/list). If you’ve installed directly from SVN on your web server, just do ‘svn update’. If you downloaded the zip file, then download the 0.2 source, unzip, and overwrite all of your current files (keeping your custom config.php file intact).

  20. Thanks for the update, Andrew. It didn’t work for me, though. The message changed, and I got the dreaded ERROR INSERTING INTO DATABASE message:

    Retrieving 200 tweets per page.
    No tweets on page 1.
    0 new tweets over 1 query.
    API: (347/350 remaining) API count resets at 9:22pm America/Sao_Paulo.
    ERROR INSERTING INTO DATABASE:

    Also, with the previous version it always took something like a minute for the messages to appear; now it’s almost instantaneous.

    Any clue? :)

  21. I made some more additions, updates, and fixes. Download version 0.3 here: http://code.google.com/p/archive-my-tweets/downloads/list. As always just replace all the files with the new ones except for your custom config.php.

    I fixed an API parameter bug which will hopefully fix the problem with everyone getting zero rows inserted into their databases. I added paging to the ‘all tweets’ view, cleaned up some error messages, removed some deprecated functions, and raised the twitter api exception limit to 25 from 5.

  22. Also, retweets are now included. Unfortunately, this will only apply to new tweets as they’re added.

    If you have fewer than 3200 tweets (the maximum that can be retrieved from the Twitter API), you can empty your database table and re-run your cron.php file. All your tweets will then be reimported.

  23. Andrew, you’re a champion. It’s all working now.

    Thanks for the update.

  24. Thanks (again), Andrew. But… it hasn’t worked again. :(

    The first time I run the page, I get this message: “PHP Fatal error: Maximum execution time of 90 seconds exceeded in E:\home\pittsburgh\Web\tweets\classes\twitter.php on line 473”. It does create the table on my database, though (I had dropped it).

    If I refresh the page, the message changes to this or some variation of it:

    Retrieving 200 tweets per page.
    Got 200 results on page 1.
    Got 200 results on page 2.
    Got 200 results on page 3.
    Got 200 results on page 4.
    Got 200 results on page 5.
    Got 200 results on page 6.
    Got 200 results on page 7.
    Got 200 results on page 8.
    Got 200 results on page 9.
    Got 200 results on page 10.
    Got 200 results on page 11.
    Exception: Invalid response.
    Got 200 results on page 12.
    Exception: Invalid response.
    Got 199 results on page 13. (Some tweets may have been filtered out.)
    Got 200 results on page 14.
    Exception: Invalid response.
    Got 200 results on page 15.
    Exception: Invalid response.
    Got 200 results on page 16.
    No tweets on page 17.
    3199 new tweets over 17 queries.
    API: (308/350 remaining) API count resets at 9:51pm America/Sao_Paulo.
    ERROR INSERTING INTO DATABASE: MySQL server has gone away

    Table remains empty after it: http://screencast.com/t/OvwxDLkOt

    I do have more than 3,200 tweets (4,074). Maybe that’s an issue?

    Again, your effort has been very much appreciated!

  25. Alexandre, I think you’re almost there. I added a slightly new version (0.4). It will add the tweets to your database after fetching every 200 results rather than saving the database insertion until the very end.
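The idea of saving after each page instead of once at the end can be sketched as follows. This is a minimal illustration using SQLite and a simulated fetch that fails partway through; the table layout and function names are hypothetical and do not reflect the real PHP implementation.

```python
import sqlite3


def archive_pages(conn: sqlite3.Connection, pages) -> int:
    """Store each page as soon as it arrives; return the number of pages committed."""
    conn.execute("CREATE TABLE IF NOT EXISTS tweets (id INTEGER PRIMARY KEY, tweet TEXT)")
    committed = 0
    for page in pages:
        with conn:  # one transaction per page: commit on success, roll back on error
            conn.executemany(
                "INSERT OR IGNORE INTO tweets (id, tweet) VALUES (?, ?)", page
            )
        committed += 1
    return committed


def flaky_pages():
    """Simulated fetch: two good pages, then a failure (timeout, dropped connection, ...)."""
    yield [(3, "c"), (4, "d")]
    yield [(1, "a"), (2, "b")]
    raise TimeoutError("simulated mid-run failure")


conn = sqlite3.connect(":memory:")
try:
    archive_pages(conn, flaky_pages())
except TimeoutError:
    pass  # the run died, but earlier pages are already stored

saved = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
```

Here the run fails on the third fetch, yet the four tweets from the first two pages remain in the table. If the insert were deferred until the end, a timeout or lost database connection would leave the table empty.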

  26. Andrew, I can’t say enough good things about your help, so I don’t know how to say this, but… it didn’t work again. :)

    At first I got the same “maximum execution time” error message, but I refreshed the page and it seemed to start. Yet…:

    Retrieving 200 tweets per page.
    Getting tweets with an id greater than 154876345120342016.
    No tweets on page 1.
    0 new tweets over 1 query.
    API: (328/350 remaining) API count resets at 11:05am America/Sao_Paulo.

    154876345120342016 is the ID of my latest tweet. Maybe that’s it?

    (On a related-but-unrelated note, I started deleting many tweets in order to try to get to 3,000. I got from 4,075 to a little less than 4,000, but, boy, is it hard to delete an old tweet in the current format! As soon as you delete one, you are sent almost all the way back to the top.)

    Thanks again, Andrew!

  27. Out of the blue, I went to the page created, and the tweets were there (and, of course, they are in the DB as well). So that must be why it was looking for something greater (more recent) than my latest tweet.

    It imported 1,919 of my 3,800+ tweets, and it got a few dates wrong (Dec. 1982, Oct. 1988, Oct. 2014, Mar. 2021, Mar. 2027 and Apr. 2028). Other than that, it was perfect. I’ll give it a try again, renaming the current table and generating a new one.

    I can’t say thank you enough, Andrew!

  28. Correction: it wasn’t the script’s fault that some dates were imported incorrectly: it was Twitter’s fault (http://twitter.com/#!/agiesbrecht/status/6459629815).

    Also, the tweets were imported in the first step, when I got the “maximum execution time” error.

    The second try gathered 1,794 tweets, but it went back just as far in time as the first try (10/1/2009). It’s just that I had deleted many more tweets in the meantime. Is this because of the Twitter API or something that’s coded?

    I’m not looking for a fix, just reporting it so someone might have their question answered here. Sorry that I have flooded the comments! :)

  29. Set it up in less than 25 minutes. Great exercise in using the Twitter API.

    Is there something additional that can be done to capture favorites and retweets?

  30. One thing I was thinking recently: What is the impact of leaving a link to my Cron.php page with its secret key on it? I’d do so in order to update my database more often. The way I see, even if someone clicks on it, the worst thing they’d do would be updating my database. Or am I missing something here? Is there anything bad that can be done with my secret key? (It has nothing to do with any other password I have, either on this system or any other.)

  31. Alexandre, the impact of having your cron.php file public is that it eats away at your Twitter API limits. You (your username) have a limit of 350 requests per hour. This limit is per username, and applies to ALL Twitter apps you use. If your cron.php link goes public, you might find that you’d go over your limit and be unable to use other Twitter clients to check your tweets. It’s possible that you wouldn’t reach the limit, but I recommend against making it public for the reasons above.
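A rough budget calculation shows how quickly that limit could be used up. The sketch assumes each query counts as one request against the 350-per-hour limit; the figures of 17 queries for a full import and 1 query for an incremental check are taken from the cron.php output quoted earlier in the thread, and the function names are illustrative only.

```python
HOURLY_LIMIT = 350  # per-username limit quoted in the comment above


def runs_per_hour(queries_per_run: int, limit: int = HOURLY_LIMIT) -> int:
    """How many cron.php runs fit in one hour (assumes one request per query)."""
    if queries_per_run <= 0:
        raise ValueError("queries_per_run must be positive")
    return limit // queries_per_run


def remaining_after(runs: int, queries_per_run: int, limit: int = HOURLY_LIMIT) -> int:
    """Requests left for other Twitter apps after the given number of runs."""
    return max(0, limit - runs * queries_per_run)


if __name__ == "__main__":
    print(runs_per_hour(17))       # full imports that fit in an hour
    print(remaining_after(6, 1))   # left over after six incremental checks
```

Under these assumptions, about 20 full imports triggered by strangers clicking a public link would exhaust the hourly budget, while a cron job that makes one incremental query every 10 minutes uses only 6 requests.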

  32. I KNEW there was a reason! :) Thanks again, Andrew.

  33. Can you make a dummy’s guide on how to do this? Because I’m not much of an IT whiz.

  34. Hello Andrew,

    Is there a way to modify this to archive a specific #hashtag?

    Been playing with the code…

  35. Hi, works beautifully.

    Here are rewrite rules for Nginx. This might not be the prettiest way to do this, but at least it works on my server. These rules are for http://example.com/tweets/ and should be in the server block of the Nginx configuration. It took me some time to get these working, so I thought I might as well share them.

    server {

    rewrite ^/tweets/([0-9]+)/?$ /tweets/index.php?id=$1 last;
    rewrite "^/tweets/archive/([0-9]{4})/([0-9]{2})/([0-9]{2})/?$" /tweets/index.php?year=$1&month=$2&day=$3 last;
    rewrite "^/tweets/archive/([0-9]{4})/([0-9]{2})/?$" /tweets/index.php?year=$1&month=$2 last;
    rewrite "^/tweets/archive/([0-9]{4})/?$" /tweets/index.php?year=$1 last;
    rewrite ^/tweets/page/([0-9]+)/?$ /tweets/index.php?page=$1 last;

    }
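To see what these rules do, the same patterns can be tried against sample URLs outside of Nginx. The sketch below reuses the regular expressions from the rules above and maps each capture group to the query parameter it feeds; it only illustrates the matching and is not a substitute for testing on a real server. The `RULES` table and `rewrite` helper are made up for this example.

```python
import re

# (pattern, query parameter names in capture-group order), mirroring the rules above.
RULES = [
    (re.compile(r"^/tweets/([0-9]+)/?$"), ("id",)),
    (re.compile(r"^/tweets/archive/([0-9]{4})/([0-9]{2})/([0-9]{2})/?$"), ("year", "month", "day")),
    (re.compile(r"^/tweets/archive/([0-9]{4})/([0-9]{2})/?$"), ("year", "month")),
    (re.compile(r"^/tweets/archive/([0-9]{4})/?$"), ("year",)),
    (re.compile(r"^/tweets/page/([0-9]+)/?$"), ("page",)),
]


def rewrite(path: str):
    """Return the query parameters index.php would receive, or None if no rule matches."""
    for pattern, names in RULES:
        match = pattern.match(path)
        if match:
            return dict(zip(names, match.groups()))
    return None


if __name__ == "__main__":
    for sample in ("/tweets/42", "/tweets/archive/2012/08/17/", "/tweets/archive/2012/8/"):
        print(sample, "->", rewrite(sample))
```

Note that the month and day patterns require exactly two digits, so a URL such as /tweets/archive/2012/8/ matches none of these rules.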

  36. Andrew – thanks so much for this. I wish I had started this earlier. Now we need a way to extract > 3200 from the archives – not so much automated – but more one-time.

    I made one change to the .htaccess to make the rewrite rules work – it might be my PHP or my ISP (1and1) – but I could not make it work without this:

    RewriteBase /twitter/

    Of course that is the URL where I placed the app.

    Again – thanks so much.

  37. Andrew –

    A lovely script with beautiful output. Very many thanks!

    Quentin

  38. Great code that works out of the box. Can it be modified to display the archived tweets of any twitter user and is this easy to do?

  39. Charles, if you need just for backup, I have found a service (called BackUpMyTweets) which does that: all my tweets have been downloaded in XML format, not just the last 3,200. I just haven’t found a way to insert them into this system, except manually, via the database, which I have done for a few that are important to me.

  40. What is the chron.php command?

  41. Worked perfectly first time! Thanks for the great script; exactly what I needed!

  42. Adrian, yes, it can archive any Twitter user, but the account needs to be public and not protected. The cron.php file just updates the database by checking if there have been new tweets posted since the last check.
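The "only what's new since the last check" behaviour can be sketched like this. It assumes tweets are compared by numeric ID, as suggested by the "Getting tweets with an id greater than ..." line in the cron.php output quoted earlier; the function names and the dictionary layout are hypothetical.

```python
def newest_stored_id(stored_ids):
    """Highest tweet ID already archived, or None for an empty archive."""
    return max(stored_ids, default=None)


def select_new(timeline, since_id):
    """Keep only tweets with an ID greater than since_id (everything if since_id is None)."""
    if since_id is None:
        return list(timeline)
    return [tweet for tweet in timeline if tweet["id"] > since_id]


# The stored ID below is the one quoted in an earlier comment; the newer one is invented.
archived = [154876345120342016]
timeline = [
    {"id": 154876345120342016, "text": "already archived"},
    {"id": 154900000000000000, "text": "posted after the last check"},
]

fresh = select_new(timeline, newest_stored_id(archived))
```

When nothing has been posted since the last run, the selection is empty, which corresponds to the "0 new tweets over 1 query" output some commenters saw.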

  43. Mike Galloway

    08/09/2012 at 8:37 am

    Hi Andrew

    Great script, very well presented and documented.

    I’ve tried to set-up to get tweets for other user but despite having username in the config.php and a different db table I keep getting my tweets downloaded? The output appears with the other user name at the top of the page, but then my tweets. I’m clearly doing something wrong … but can’t find what!

    Mike

  44. Thanks for the awesome archive application, got it working great in a very short time. Do you have any plans to update it against the new display guidance from twitter regarding their api?

  45. Hi, Chris. I haven’t been able to dive into any of the new API considerations. I don’t have immediate plans to update it, but I’ll see what I can do.

  46. Update: see the top of this page for my thoughts on the new guidelines and how they impact this software (spoiler: I don’t think the new guidelines impact it at all).

  47. This is awesome. Almost everything is working great for me. Only thing I can’t figure out is when browsing tweets on my own site’s new /twitter page, I get a 404 error when I click on the monthly archive links or the “older tweets” at the bottom of the first page. Searching brings up any old tweets just fine. Just can’t seem to browse by date or chron. Do I need to do something to create pages?

  48. Doh! Fixed it. I had neglected the hidden .htaccess file. Just needed to enable “show hidden files” in my ftp client and copy that file over.

  49. Thanks a lot. Perfect tool. I wish I had as elegant a solution for archiving hashtags.

  50. Hi Andrew,
    Twitter changed its API (again) and your beautiful work has been broken by that. Do you plan to follow their guidelines? https://dev.twitter.com/docs/api/1.1/overview

  51. Hi Andrew, many thanks for making this tool available, as well as the straightforward implementation instructions.

    I’d been frustrated with Twitter Tools (and its dependence upon another plugin for the scheduling) as it only pulled in a fraction of my tweets after the API change earlier this year.

    So starting afresh, using your tool (and the handy instructions on getting the Twitter archive transferred over from Twitter.com), I’m up and running again.

    Signed, very happy non-expert web blog person.

  52. Hi Andrew,

    Thank you for developing this tool. I installed this 3 days ago and had issues getting the cron.php working. When I looked through my server log for the errors, this is what it said:

    PHP Warning: Unexpected character in input: ‘\’ (ASCII=92) state=1 in /path/to/the/cron.php on line 14
    PHP Parse error: syntax error, unexpected T_STRING in /path/to/the/cron.php on line 14

    I’m not a php expert, however I did google potential solutions. Many say it has to do with my version of PHP. After running PHPinfo, my version says it is 5.3.27.

    Would you happen to know what it is? Any help you could offer would be great.

  53. Karen, is the actual Archive My Tweets website running just fine, for example, you’re seeing the pages and your tweets?

    If it’s running fine on the web, but your cron.php isn’t working, your host might have a different version of PHP running on the command line vs. the web. The error you’re getting points to your cron.php file being run with PHP 5.2 or earlier. Version 5.3.0+ is required.

    In the documentation, see the Setting Up a Cron Job section, and try using one of the alternate methods there. That will force the cron.php file to be run under the same version of PHP that your website is using.

  54. Yes, the actual website is fine.

    I tried putting in “/usr/bin/env curl --silent --compressed http://example.com/tweets/cron.php?secret=MY_SECRET” but the cron email returned that I was not authorized.

  55. OK, so it’s almost working. Double check that the MY_SECRET part of your URL there matches the TWITTER_CRON_SECRET you’ve set up in your config.php file.

    Instead of waiting for cron to run again, you can also test it out by visiting that URL in your browser, and making sure the output is something besides “not authorized”: http://example.com/tweets/cron.php?secret=MY_SECRET
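Conceptually, the check behind "not authorized" compares the `secret` query parameter with the configured value. The sketch below shows one way such a comparison could be written; it is an assumption about the general technique, not the actual cron.php code, and the function names and the sample secret are invented for illustration.

```python
import hmac
from urllib.parse import parse_qs, urlparse


def secret_from_url(url: str) -> str:
    """Extract the 'secret' query parameter, or an empty string if it is missing."""
    values = parse_qs(urlparse(url).query).get("secret", [""])
    return values[0]


def is_authorized(supplied: str, configured: str) -> bool:
    """Constant-time comparison; an empty configured secret never authorizes."""
    if not configured:
        return False
    return hmac.compare_digest(supplied.encode("utf-8"), configured.encode("utf-8"))


CONFIGURED = "s3cr3t-example"  # hypothetical stand-in for TWITTER_CRON_SECRET

ok = is_authorized(
    secret_from_url("http://example.com/tweets/cron.php?secret=s3cr3t-example"), CONFIGURED
)
wrong = is_authorized(
    secret_from_url("http://example.com/tweets/cron.php?secret=MY_SECRET"), CONFIGURED
)
```

The comparison is exact and case-sensitive, so a secret that differs by even one character, or a placeholder left unreplaced in the URL, is rejected.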

  56. It works!!! Turns out I was using the wrong secret. I checked it in the config.php file and it worked.

    Thank you for helping me work this out. :) I really appreciate it.

  57. Roman Ožana

    11/27/2013 at 8:33 am

    Here is the latest nginx configuration:

    location /tweets {
    rewrite ^/tweets/([0-9]+)/?$ /tweets/index.php?id=$1 last;
    rewrite "^/tweets/archive/([0-9]{4})/([0-9]{2})/([0-9]{2})/?$" /tweets/index.php?year=$1&month=$2&day=$3 last;
    rewrite "^/tweets/archive/([0-9]{4})/([0-9]{2})/([0-9]{2})/page/([0-9]+)/?$" /tweets/index.php?year=$1&month=$2&day=$3&page=$4 last;
    rewrite "^/tweets/archive/([0-9]{4})/([0-9]{2})/?$" /tweets/index.php?year=$1&month=$2 last;
    rewrite "^/tweets/archive/([0-9]{4})/([0-9]{2})/page/([0-9]+)/?$" /tweets/index.php?year=$1&month=$2&page=$3 last;
    rewrite "^/tweets/archive/([0-9]{4})/?$" /tweets/index.php?year=$1 last;
    rewrite "^/tweets/archive/([0-9]{4})/page/([0-9]+)/?$" /tweets/index.php?year=$1&page=$2 last;

    rewrite ^/tweets/client/(.*)/$ /tweets/index.php?client=$1 last;
    rewrite ^/tweets/client/(.*)/page/([0-9]+)/?$ /tweets/index.php?client=$1&page=$2 last;
    rewrite ^/tweets/page/([0-9]+)/?$ /tweets/index.php?page=$1 last;

    rewrite ^/tweets/favorites/?$ /tweets/index.php?favorites=1 last;
    rewrite ^/tweets/favorites/page/([0-9]+)/?$ /tweets/index.php?favorites=1&page=$1 last;
    rewrite ^/tweets/stats/?$ /tweets/index.php?method=stats last;

    port_in_redirect off;
    try_files $uri /tweets/index.php?$query_string;

    }

  58. If your hosting provider doesn’t provide cron jobs, you may use easycron.com. It allows setting up cron jobs that run every 10 minutes for free.
