UPDATE June 19, 2013: Since twitter has killed its version 1 API, there is no longer a public RSS feed provided for twitter activity. Expect new solutions to emerge; one that is usable now is this method from labnol to convert the new JSON feeds to RSS using a Google Script. This method works in FeedWordPress.

I’m growing more and more and more and more (more?) interested in building out more syndication architectures like we have done in ds106, at a range of scales from the 600 feeds we crunch for ds106 to the 40 or so we did for the Project Community Class down to the 2 I do for my own self syndication.

Leaning towards the bigger end, I have been working to set this up for the ETMOOC thing Alec Couros (and about 90 other people it seems) are launching soon. It’s been a great chance to stretch some WordPress chops with FeedWordPress in place for the syndication engine.

Below I outline how I created the site that is archiving the tweets – http://etmooc.org/tweets

The ETMOOC site itself is set up as a WordPress multisite (using directory structure for additional blogs). The main site is set up with a basic Twenty Eleven theme with a big bold header. I have set up two additional sites: one is the Hub, for syndication of participant blogs (I’ll write more on that one later), and the other, which I just tinkered with yesterday, is a site to archive the tweets tagged #etmooc.

It makes sense to theme these two sites in a similar fashion. Using the header from the main blog, I edited it to add a title and some graphic fun. I was thinking of first using some of the twitter bird graphics, but took a swing into the flickr creative commons, and landed on a great image I have used before:


cc licensed ( BY NC ) flickr photo shared by TarikB

I cropped it a bit, cranked up the contrast, and moved the bottom line and bird to be level with the top ones, as one long line with birds on it – this played out nicely:

etmooc-tweets

I have to start with a decent graphic 😉

I enabled the FeedWordPress plugin; it is really just being used for one feed, the twitter search result on #etmooc.

Now twitter does not make it easy to find the RSS feed; it’s out there, but not very accessible. From past experience, I knew how to construct the URL:

http://search.twitter.com/search.rss?q=%23etmooc
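As a small illustration of how that URL is put together, here is a minimal PHP sketch. The function name is made up for this example, and the search.twitter.com endpoint is simply the one shown above, which stopped responding once the version 1 API was retired.

```php
<?php
// Hypothetical helper (not part of any plugin): build the retired
// Twitter search RSS URL for a given query string.
function etmooc_search_feed_url( $query ) {
    // rawurlencode() turns "#" into "%23" so the hashtag survives in the query string.
    return 'http://search.twitter.com/search.rss?q=' . rawurlencode( $query );
}

echo etmooc_search_feed_url( '#etmooc' ) . "\n";
```

The only real trick is the encoding of the `#` character; left unencoded, everything after it would be treated as a URL fragment and never reach the server.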

For more on this, see The Socialable’s “How to get your Twitter RSS feed using this simple hack.”

What you get is, well okay, but not ideal. The twitter RSS feed puts the full tweet in both the title and the content:

etmooc tweers 20122

My first idea was to try the WordPress P2 Theme, which made for slightly better output, but I could see confusion, as the theme is really meant to build a twitter-like environment in WordPress (which is neat and useful) and mixes things up by putting a post box at the top. Plus, the icons would be blank as they represent the faux user accounts FeedWordPress creates. I thought about doing some theme hacking, but it seemed to make for a more cohesive site to make a mod of the Twenty Eleven theme used on the main site.

It would be basically just using for output the title of the RSS feed, since that is the tweet, and maybe parsing out the twitter user name to make a link.
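A rough sketch of the user name parsing idea, assuming the name is taken from the tweet’s permalink rather than from the feed title (the post does not say which would be used). The function name and the URL pattern are assumptions made for this example.

```php
<?php
// Hypothetical helper: pull the user name out of a tweet permalink such as
// https://twitter.com/cogdog/status/287647396089974784
function etmooc_tweet_username( $tweet_url ) {
    $path = parse_url( $tweet_url, PHP_URL_PATH );
    if ( is_string( $path ) && preg_match( '#^/([A-Za-z0-9_]+)/status(?:es)?/\d+#', $path, $m ) ) {
        return $m[1];
    }
    return null; // not a recognizable tweet URL
}

$user = etmooc_tweet_username( 'https://twitter.com/cogdog/status/287647396089974784' );
if ( $user !== null ) {
    // The pattern only admits letters, digits and underscores, so the name is safe to print.
    echo '<a href="https://twitter.com/' . $user . '">@' . $user . '</a>' . "\n";
}
```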

That seemed doable too.

But it was when I was out for a walk yesterday that an idea sprang on me, one that was both easier to code and gave better output.

You see, in recent versions of WordPress they have built in automatic embed technology, so if you just put a URL on a blank line for a flickr page, or a youtube page, or in our case, a single tweet’s URL, wordpress will automatically turn that into embedded media when published.

So if I have a highly meaningful tweet at say

https://twitter.com/cogdog/status/287647396089974784

Just added to my blog, I get

So here is the thing, the URL for every tweet we get arrives in RSS as the link value, and when published via Feedwordpress is technically the permalink url for the post.

My light bulb was that I just had to rewrite the wordpress loop to use only that URL and the function that does the embed (the capability can be called from PHP).

Any time I do a mod of a WordPress theme, I set it up via the approach of Child Theming. This means I make a new theme that just overlays tiny changes on the “parent” theme, but I never edit the parent’s files. Ideally this helps if themes get updated, as you can just update the parent, and the child inherits the changes. In practice, I cannot say I have ever had an issue where a theme stopped working and required updating. The theme I use on this blog is at least 4 years old.

It is, however, just better code practice to theme in this manner, especially as I am going to have to make two versions, one for each of the sites I use.

etmooc themes

The basic idea is to create a new folder for my theme (I call it etmooctweets) at the same level (/wp-content/themes) as the parent. The only thing you need to start is a style.css file, which creates the relationship:

/*
Theme Name: 	Son of ETMOOC Tweets
Theme URI:      http://etmooc.org
Description:    Child theme of Twenty Eleven for etmooc
Author:         Alan Levine
Template:       twentyeleven
Author URI:     http://cogdogblog.com/
Version:        0.1.2
*/

@import url(../twentyeleven/style.css);

The name listed for Template is important as it identifies the parent theme by the name of its folder, as is the @import statement, which just says “load all of the CSS from the parent”. This means your site gets all the style sheet info from the main theme.

Any CSS you add can override or augment this CSS, so you just put in the styles that are different. A screenshot.png file (a 320px wide PNG) is handy as it provides a preview in the theme viewer area.

Now the real beauty comes in when you decide you need to modify a theme file, say a footer template or a page template. You just make a copy of that from the parent theme directory, give it the same name in the child directory, and it replaces the original. So instead of editing the core theme files, you make modified copies.

I don’t know about you, but to me that is genius.

It takes a bit of knowledge of how themes work to take the next steps. The core of the site is driven by the index.php template, where The Loop resides, which is where WordPress iterates for doing things like showing the 10 most recent posts, or all posts in a category, etc. You find this in Twenty Eleven, but the work is actually parceled out to another template:
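A sketch of that hand-off, written from general knowledge of how Twenty Eleven structures its loop rather than copied from the theme, so treat it as an approximation. It only runs inside a WordPress theme, where `have_posts()`, `the_post()`, `get_template_part()` and `get_post_format()` are available.

```php
<?php
// Approximation of The Loop in Twenty Eleven's index.php (not the exact file).
if ( have_posts() ) :
	while ( have_posts() ) :
		the_post(); // set up the current post for template tags

		// Hand the post to content.php, or to content-{format}.php if the
		// post has a format and the theme ships a matching template.
		get_template_part( 'content', get_post_format() );
	endwhile;
endif;
```

Because `get_template_part()` looks in the child theme first, a content.php placed in the child folder is picked up without touching this file.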



  

 

The work for each post it processes is done in another template, content.php. The original has all kinds of stuff we don’t need: it posts the title, the content, the author, date published, tags, categories, comments… For what I need to do it turns out to be grandly simplified:


Basically, the only thing output is tossing the permalink (which is the link to the tweet) into the wp_oembed_get() function and echoing the output.
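A minimal sketch of what such a pared-down content.php might look like, based only on the description above. The wrapper markup and the fallback link are additions for this example, not something the post confirms. It relies on WordPress being loaded and on FeedWordPress pointing each post’s permalink at the original tweet.

```php
<?php
/**
 * Sketch of a stripped-down content.php for the child theme.
 * Runs inside The Loop, so get_permalink() refers to the current post;
 * with FeedWordPress that permalink is the URL of the original tweet.
 */
$tweet_url = get_permalink();
$embed     = wp_oembed_get( $tweet_url );
?>
<article id="post-<?php the_ID(); ?>" <?php post_class(); ?>>
	<?php
	if ( $embed ) {
		echo $embed; // embed markup returned by the oEmbed provider
	} else {
		// wp_oembed_get() returns false on failure; fall back to a plain link
		// (the fallback is an assumption added for this sketch).
		echo '<a href="' . esc_url( $tweet_url ) . '">' . esc_html( $tweet_url ) . '</a>';
	}
	?>
</article>
```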

I came up with one more modification: we are going to collect thousands of tweets, and to make it easier to page back in time, it helps to change the default navigation from “previous/next” to something that allows you to access any set of ten tweets via paged navigation. I installed the Prime Strategy Page Navi plugin, and added the extra CSS it needs to my child theme’s CSS.

So next I copied the index.php template from the parent theme, and replaced the occurrences of


   :
   :
   :

with

I changed some of the CSS to make the selected page black background, and the hover class to be a bit more obvious.
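The navigation swap described above might look something like the following sketch. It assumes the parent theme’s previous/next links come from `twentyeleven_content_nav()` and that the plugin exposes a `page_navi()` template tag; both names are from memory, so check them against the installed theme and plugin before relying on them.

```php
<?php
// In the child theme's copy of index.php, at the spot where the parent theme
// prints its previous/next links (assumed: twentyeleven_content_nav()).
if ( function_exists( 'page_navi' ) ) {
	page_navi(); // numbered page links, assumed to come from Prime Strategy Page Navi
} else {
	twentyeleven_content_nav( 'nav-below' ); // fall back to the parent's links
}
```

Guarding the call with `function_exists()` keeps the site from breaking if the plugin is ever deactivated.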

The last idea I wanted was to have the site display the total number of tweets it has archived, namely how many posts it has published; as always, WordPress provides a handy function for this. I added it below the new navigation code:



archiving <?php echo wp_count_posts()->publish; ?> tweets since Dec 21, 2012

The CSS class allows me to do some styling.

So now the site is rocking…

etmooc tweets themed

It looks good, but now I see problems. You see FeedWordpress by default refreshes once an hour. If there are more than 20 tweets per hour, we will miss some. For now I am cutting the time to every 10 minutes, which is technically a bit abusive of RSS requests.
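The arithmetic behind that worry can be sketched as follows. The figure of 20 items per fetch is inferred from the sentence above, and the function name is hypothetical.

```php
<?php
// Hypothetical helper: the most tweets per hour an RSS poll can keep up with,
// given how many items one fetch returns and how often the feed is polled.
function etmooc_max_tweets_per_hour( $items_per_fetch, $minutes_between_fetches ) {
    return (int) floor( $items_per_fetch * 60 / $minutes_between_fetches );
}

echo etmooc_max_tweets_per_hour( 20, 60 ) . "\n"; // hourly refresh: 20
echo etmooc_max_tweets_per_hour( 20, 10 ) . "\n"; // every 10 minutes: 120
```

So the shorter interval raises the ceiling from 20 to 120 tweets an hour, which helps but still falls short during a busy chat.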

In the long run, fetching tweets by RSS is doable for a medium or low activity search, but this one is going to have hundreds of tweets. I’m thinking for an improvement I might have to figure out how to access the twitter API directly, where I can fetch up to 1500 at a time.

In the meantime, I have set up the amazing tool created by Martin Hawksey to archive tweets in a Google Spreadsheet. His genius is that it updates itself, and there are built in tools to summarize and generate visualizations of the twitter activity.

I have a copy of that spreadsheet running now for ETMOOC. I am thinking an easy way to populate the Twitter archive site is just to update from the data in that spreadsheet, e.g. publishing the summary as a CSV, and parsing that for new tweet URLs.
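A sketch of that CSV idea, under stated assumptions: the published sheet has a header row, the tweet link lives in a column called `status_url` (the real column name may differ), and the list of already archived URLs is supplied by the caller. The function name is hypothetical.

```php
<?php
// Hypothetical helper: return tweet URLs found in CSV text that are not in $seen.
// Note: splitting on newlines is a simplification; a tweet containing a line
// break inside a quoted field would need a real CSV stream parser.
function etmooc_new_tweet_urls( $csv_text, array $seen, $url_column = 'status_url' ) {
    $lines  = preg_split( '/\r\n|\r|\n/', trim( $csv_text ) );
    $header = str_getcsv( array_shift( $lines ), ',', '"', '\\' );
    $index  = array_search( $url_column, $header, true );
    if ( $index === false ) {
        return array(); // the expected column is missing
    }

    $new = array();
    foreach ( $lines as $line ) {
        $row = str_getcsv( $line, ',', '"', '\\' );
        $url = isset( $row[ $index ] ) ? $row[ $index ] : '';
        if ( $url !== '' && ! in_array( $url, $seen, true ) ) {
            $new[] = $url;
        }
    }
    return $new;
}

$csv = "from_user,status_url\n"
     . "cogdog,https://twitter.com/cogdog/status/287647396089974784\n";
print_r( etmooc_new_tweet_urls( $csv, array() ) );
```

Each URL returned could then be published as a post whose permalink is the tweet, which is all the theme needs in order to embed it.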

There are always things to tinker with…

An early 90s builder of web stuff and blogging Alan Levine barks at CogDogBlog.com on web storytelling (#ds106 #4life), photography, bending WordPress, and serendipity in the infinite internet river. He thinks it's weird to write about himself in the third person. And he is 100% into the Fediverse (or tells himself so) Tooting as @cogdog@cosocial.ca

Comments

  1. Really interested to see where you go with this one. Something that occurred to me was categorising the tweets when imported. You could just extract any additional tags in the tweet or maybe go a bit further and categorising tweets containing ‘?’ as questions.

    Another thing to be aware of is Twitter is killing RSS access in March as part of their new API (I’m sure a number of API 1.1 to RSS solutions will pop up).

    One last thought was where you could use the spreadsheet as a way to feed wordpress. Maybe croning something like the CSV importer plugin http://wordpress.org/extend/plugins/csv-importer/ ?

    1. Brilliant gems are your ideas (as usual), Martin, thanks. The tweets archive is not really meant to replace the functionality of your spreadsheet, it’s more of a public facing interface to work with the web site. I see both working together.

      I pondered the idea of turning twitter hashtags into wordpress tags; I think it would be easy with a WP filter on publishing. The question idea is interesting; I guess we could also mark tweets with 😉 as “whimsy”?

      I really did not see RSS as the best vehicle for grabbing the tweets (it will likely fail when the stream gets very active), it was a first step in.

      The thing with importing the CSV is that it’s going to keep growing in time; I can see making a new sheet that constructs the CSV headings that the plugin can use. I am not sure how that importer works with duplicate content.

      I could see a PHP script that marches down the spreadsheet, publicly published as CSV, processing rows until it hits a tweet it has already processed (matching on something like date or maybe the id). Or I might try the python script tony shared to at least pull down the tweets as local data, and make another script to process.

      1. love the idea of a ‘whimsy’ category. You should also have ‘WTF’ for Interrobangs (?!)

        I had a look at hooking a XML-RPC call to the spreadsheet script to push new tweets but didn’t get very far. Might revisit.

        Chris Zarate has come up with this https://github.com/mlaa/tags-viewer which is an interesting take (might be ideas in this you want to reuse)

  2. If you want the tweets to be archived directly from the stream API there is an easy way: Just add it to https://chatdir.kneaver.com as if it was a TwitterChat (I’m the maintainer of the site). Then with a simple script I can expose it as an RSS feed on times we agree, or just a list of tweets links to embed in WP as I do for http://recaps.kneaver.com/

    Like this the page will continue to grow 🙂 and #etmooc will continue to be archived

    1. Thanks Bruno for sharing your Open Chat Directory, that’s a nifty service to recommend.

      I should have clarified my statement about twitter’s (insert cussing) (un-necessary) (I think) nuking of RSS feeds; I did find a solution I have used regularly on about 10 sites after ETMOOC: this approach from Digital Inspiration uses a Google Script to convert any twitter search (hashtag, user, keyword) into an RSS feed; it’s been super reliable http://www.labnol.org/internet/twitter-rss-feed/28149/

      I could have gone back and added it for ETMOOC (and probably should) but I wanted to leave that there as a statement so people understand what happens when a resource they count on pulls the rug out from beneath their feet. There is no simple way to go back and collect the missing tweets, but maybe I will add it in, since ETMOOC is still going, going, going…

      1. Got it Alan,

        I found this post hopping from post to post on a long trail this morning. I read your posts on static web publishing, read the work of Dave Winer, and then the Google app to collect twitter statements.

        Catching up old #etmooc is surely possible but tricky, not sure it’s worth it.

        It raises two questions.

        Should one update old posts? My take is yes, because we change our perspectives, learn more and reach more comprehensive conclusions. I asked the question on twitter two weeks ago and had different responses. The historic view also has its benefit, if only for learning lessons, and this could be one.

        All the services are free, but only apparently so. We pay by giving away our contents, our data, our attention. Based on this I protested against Twitter, but they are strictly within their rights. They started to warn that search may not be reliable (return all tweets), and streaming will only be a sampling. Again, they are within their rights. Free and public services are two different things. It’s very annoying, but we should be aware of it, and a bit prepared.

        Sadly RSS will soon be a thing of the past. It’s going to be slowly changed to json. More compact, faster to read, native in many languages. Twitter just anticipated it, as did GoogleReader. For them weeding features is a matter of saving server time (energy) and support overload.

        Let’s beat it, be creative and share our own answers 🙂
