cc licensed ( BY NC ) flickr photo shared by Thomas Hawk

Last week I wrote about the ideas and content changes I had in mind for the next semester of ds106. The car polish is looking shiny. Over the last few weeks I had out my WordPress wrenches, calipers and engine pulls to do some work under the hood. The current site is coming up on two years old, and has been a great example of growth by accretion, experimenting, adding things on. That is all part of running experiments.

But it also got a bit wobbly last semester. No one likes their web site going down, but we were sure ringing the buzzer frequently at our hosts Castiron Coding for server restarts. It wasn’t clear whether the cause was the database or the demands on the server processes, but I took it as a mission to keep happy the unicorns that run the server room.

server room unicorns

First of all, our site is a WordPress multisite that not only serves up ds106, covering numerous sections taught over the last 2 years, but also the Assignment bank, the Daily Create, the Assignment Remix Machine, in[SPIRE], plus the archive of Camp Magic Macguffin, and (just unearthed) the May 2011 class taught by Martha Burtis. We also run a MediaWiki install as a content engine for our documentation (using the Wiki Embed plugin).

The database was and is over 600 MB. More on that later. There was also a long list of plugins, active and inactive, and quite a few were not in use.

I’ve done quite a bit and hope I can remember it all! An off the cuff summary…

Before anything was done, I made a duplicate of the entire database. We have full backups done elsewhere by our hosts, but having my own copy of the database meant I could revert a table if I really munged things up.

Moving the Blog Flow Off the Front Page
This was part of the redesign, but putting the chronological flow of posts on an interior page (http://ds106.us/flow) likely helps people who visit the site just looking for info or a link elsewhere. Displaying the blog flow takes a series of database requests for each view.

So Long BuddyPress
I like the concept of BuddyPress, and until Martha created the new registration system that adds blogs to FeedWordPress, we needed it to collect profile info. But we made light to little use of the community features. We found that we ask students and participants to connect in so many other spaces (Twitter, blog comments, flickr, youtube, etc) that it is not reasonable to have them come to yet another place to connect.

And BuddyPress adds overhead to the site, so I de-activated it, then exported and deleted its database tables. It took some digging to change up the login/registration page of our Salutation theme, which was designed (and selected) to work with BP. Sorry Boone. We may come back at a future date.

I bet this helped a fair amount.

Prune the Categories
The way we have FeedWordPress set up now to deal with tags/categories is not exactly how I would do it from a new start. Basically all tags and categories that people use on their blogs are converted to categories on ds106.us. But we also use categories to organize some of our content, like the assignments for my current class. And because we ingest a lot of blog tags and categories, the database was choked with 20,000 categories. FWIW we use WP tags to mark the content of incoming feeds, so we can do things like organize just the posts from my current class.

I do not think it made an impact on the public site (especially once we took off a category cloud widget), but any editing interface that listed categories (posts view, edit post) was choking slow because it was reading in 20,000 terms.

Now the tags/categories people use are interesting to study, but once a post is imported (and, for assignments, re-syndicated to the Assignments site), we do not even need them anymore.

I struggled a bit to find a way to separate the user imported categories I wanted to archive and delete from the ones we needed to keep (about 4 in all). This ended up being a manual job! First I exported the 3 key tables (wp_terms, wp_term_taxonomy, wp_term_relationships). Then I went to Posts -> Categories, where the admin screen options let you change the number of items you view from 20; I bumped it to 100. Viewing 100 at a time, I checked all, unchecked the key ones I needed to keep, and deleted the rest. It made for something to do while watching a movie.

But then I noticed that in a few days, the old categories were coming back, and they climbed quickly to 3000. I realized that FeedWordPress checks feeds and updates local copies to match, so it was undoing all of that work! The solution turned out to be adding the FeedWordPress add on Limit Posts By Date, which lets me universally keep FeedWordPress from checking feed items older than a week. That means if you miss a tag in a post, you have a week to correct it. It seems like a reasonable balance.
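The one week rule is easy to picture as a filter over feed items. This is only a minimal sketch of the idea, not how the Limit Posts By Date add on is actually written; the function name, the item dictionaries, and the sample date are all made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def recent_items(items, now, max_age_days=7):
    """Keep only feed items published within the last max_age_days.

    `items` is a list of dicts with a hypothetical "published" datetime key.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [item for item in items if item["published"] >= cutoff]


# Hypothetical sample data: one post from this week, one from a month ago.
now = datetime(2013, 1, 10, tzinfo=timezone.utc)
items = [
    {"title": "fresh post", "published": now - timedelta(days=2)},
    {"title": "old post", "published": now - timedelta(days=30)},
]
kept = recent_items(items, now)
```

Anything older than the cutoff is never looked at again, which is why the deleted categories on old posts stop coming back.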

Plus if someone adds a blog, it keeps the site from reaching too far back in its history.

I did not see much change in site performance, but the admin interface got a lot more usable.

Still, there is a fascinating world of tags people use. I saved a few screen caps of ones that made me chortle.

tags

tags2

If anybody really wants to explore these, I saved the exported tables.

For anyone who managed to get this far, my current strategy for setting up FeedWordPress would be to keep the internal terms separate from imported ones:

  • Convert all inbound categories and tags to become WP tags.
  • Use WP categories to organize the content on my site, e.g. course info, announcement, etc.
  • And I would start off with FWP Limit Posts By Date as a rule, just to keep you from grabbing anyone’s posts that are older than is of interest.
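The first two points above can be sketched as a small routing function. This is just an illustration of the separation, assuming made up names (`route_terms`, the sample terms); in practice this is a FeedWordPress setting, not code you write.

```python
def route_terms(inbound_categories, inbound_tags, local_categories=()):
    """Send every inbound term to tags; categories stay under local control.

    Terms are de-duplicated case-insensitively, keeping the first spelling seen.
    """
    seen = set()
    tags = []
    for term in list(inbound_categories) + list(inbound_tags):
        cleaned = term.strip()
        key = cleaned.lower()
        if key and key not in seen:
            seen.add(key)
            tags.append(cleaned)
    return {"categories": list(local_categories), "tags": tags}


# Hypothetical syndicated post: its own categories and tags all become tags,
# while the only category is one assigned on the local site.
routed = route_terms(
    ["DesignAssignments", "ds106"],
    ["ds106", "gif"],
    local_categories=["announcement"],
)
```

With this split, the category list in the admin screens only ever holds the handful of terms the site itself defines.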

Nixing our Twitter RSS Subscribing
We had FeedWordPress set up to subscribe to the Twitter search results for #ds106 with the idea of archiving tweets. Initially these were brought in as posts, and last year I managed to separate them out as their own custom post type. We did not really do anything with this, and I heard that Twitter’s RSS might be going away. Besides, we have one of Martin Hawksey’s magical spreadsheets set up to archive tweets, and it does a better job of grabbing them.

So I nixed that feed, archived the post data, and removed them.

Minor change.

Pruning the Plugins
I wish I had counted how many plugins we had installed; it was at least 70, maybe 80. There are some 53 now active on the network, with usage by site:

  • ds106 (19)
  • assignments (11)
  • daily create (9)

Figuring out what was used or not was a bit of just seeing which ones were activated, in many places guessing at the functionality. I know they can just be de-activated. So I did that first, and if I saw no adverse effects later, I deleted the plugin (that may not be necessary, but I like a clean house).

I found a few more by looking at the source of the web sites for extra style sheets or js libraries that plugins load. I noted Typekit, a font rendering one, that we did not use at all, and I stopped using Cufon because while it made pretty text titles, I hated how a user could not copy/select the text on those pages.
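I did this by eye, but the same check can be roughed out in a few lines. The sketch below relies on the usual WordPress convention that plugin assets are served from a wp-content/plugins/ folder named after the plugin; the sample HTML and plugin names are invented, and plugins that load nothing on the front end will not show up this way.

```python
import re

# Plugin assets normally live under wp-content/plugins/<plugin-folder>/
PLUGIN_PATH = re.compile(r"wp-content/plugins/([A-Za-z0-9_.-]+)/")


def plugin_slugs(html):
    """Return the sorted, de-duplicated plugin folder names referenced in a page."""
    return sorted(set(PLUGIN_PATH.findall(html)))


# Hypothetical page source with two made up plugins.
sample_html = """
<link rel="stylesheet" href="http://example.com/wp-content/plugins/some-widget/style.css">
<script src="http://example.com/wp-content/plugins/some-widget/widget.js"></script>
<script src="http://example.com/wp-content/plugins/font-thing/font.js"></script>
"""
slugs = plugin_slugs(sample_html)
```

Comparing a list like this against the active plugins is one way to spot ones that add page weight without earning their keep.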

We had a few generations of Wiki include plugins, and quite a few widget making ones not in use.

Updating WordPress and Plugins
We were only about one or two decimal places behind. This was the easiest. One click. No problems seen.

Deleting Old Spam Comments
Who needs ’em? Akismet usually takes care of this, but a few sites did not have the setting right.

Closing a Small Spam Account Door
I changed the WP network options to disable registrations. Since we are not using BuddyPress, and all of our accounts are created by the registration scripts, there is no need to have a “Create Account” link on the wp login screens. We saw “people” creating these accounts; they could not do anything as subscribers, but they were like having roach poop in the kitchen.

Killing the Comment Count
I was still noticing some really long page load times when we were displaying the syndicated blog posts. I realized it was the nifty code that Martha had created to access the number of comments per aggregated blog post.

It is clever: it checks whether the blog has a feed for comments (which you get with most, but not all blogs). But it means that for every post, you are making a feed request (an HTTP request) and parsing the results to count. That means to display 10 posts, you are making an additional 10 outward HTTP requests.

Just by taking out this code, the page display almost leapt out of the page.

Now I like having this feature on our site. I am thinking it needs to be recoded in some method where this checking is done in the background as a cron process, and maybe only for posts 2-3 weeks old, and then it caches the results as post meta data. Well, that’s how I see it: some sort of caching of this info needs to be done.
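Here is a rough sketch of what that background approach might look like. It is not Martha's code and not WordPress code; the function names, the post dictionaries, and the dict standing in for post meta are all assumptions, and I am reading "2-3 weeks old" as a 21 day cutoff. The feed fetcher is passed in so the sketch runs without touching the network.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta


def count_feed_items(feed_xml):
    """Count the <item> elements in an RSS 2.0 comments feed."""
    root = ET.fromstring(feed_xml)
    return len(root.findall("./channel/item"))


def refresh_comment_counts(posts, cache, fetch_feed, now, max_age_days=21):
    """Background job: update cached counts, but only for recent posts.

    posts: dicts with hypothetical keys "id", "published", "comments_feed".
    cache: dict standing in for post meta (post id -> comment count).
    fetch_feed: callable(url) -> XML string, injected by the caller.
    """
    cutoff = now - timedelta(days=max_age_days)
    for post in posts:
        if post["published"] < cutoff or not post["comments_feed"]:
            continue  # older posts keep whatever count was cached last
        cache[post["id"]] = count_feed_items(fetch_feed(post["comments_feed"]))
    return cache


def cached_count(cache, post_id):
    """Page display reads only the cache, so it makes no HTTP requests."""
    return cache.get(post_id, 0)


# Hypothetical data: a canned feed with three comments, one new and one old post.
SAMPLE_FEED = "<rss><channel><item/><item/><item/></channel></rss>"
now = datetime(2013, 1, 10)
posts = [
    {"id": 1, "published": now - timedelta(days=3), "comments_feed": "http://example.com/a/feed"},
    {"id": 2, "published": now - timedelta(days=60), "comments_feed": "http://example.com/b/feed"},
]
cache = refresh_comment_counts(posts, {}, lambda url: SAMPLE_FEED, now)
```

The point of the design is that the slow part (one HTTP request per post) happens on a schedule, while displaying 10 posts costs 10 cache lookups instead of 10 outward requests.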

That’s most of what’s been done. The real test will be in a few weeks when our students are banging the server hard and the aggregation amount spikes.

There are still more things to consider:

  • Experiment with caching plugins. We are not running any now; I have wondered whether it is feasible given how dynamic the site needs to be.
  • Clean out the Syndication Bus. I might see 20-30 feeds that are broken or defunct. They need to be checked and likely turned off.
  • Clean Post Revisions. This is not a big deal on ds106 since 99% of the posts are syndicated. But it’s worth a peek.
  • Clean the wp_options tables. These are some of the largest ones in the database; the ds106 one is over 200 MB. Apparently the bloat is from settings of un-used plugins (?). I have read about some of the approaches, such as a database query, but I need to run some tests on them.
  • Clean the wp_term_relationships tables. These remain large too, even with deleting all of those categories. I am betting there is a query to remove relationships for terms that no longer exist.
  • Clean the wp_postmeta tables. Same thing, there is probably post meta for posts that no longer exist.
  • Clean un-needed user accounts. A bit of a longer story; our script is not quite exactly associating feeds with user accounts, so it ends up creating subscriber user accounts for blog feeds. I am still trying to sort this out.
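For the two orphan clean ups in that list (term relationships and post meta), the queries I am betting on would look something like the sketch below. It is untested against our real database: it uses an in-memory SQLite database as a stand in for MySQL, trimmed down versions of the standard WordPress tables, and invented rows. On a multisite each site has its own table prefix, and anything like this should only be run after a backup.

```python
import sqlite3

# In-memory stand-in for the real MySQL database, with only the columns needed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY);
CREATE TABLE wp_postmeta (meta_id INTEGER PRIMARY KEY, post_id INTEGER, meta_key TEXT);
CREATE TABLE wp_term_taxonomy (term_taxonomy_id INTEGER PRIMARY KEY, term_id INTEGER);
CREATE TABLE wp_term_relationships (object_id INTEGER, term_taxonomy_id INTEGER);
INSERT INTO wp_posts (ID) VALUES (1);
INSERT INTO wp_postmeta (post_id, meta_key) VALUES (1, 'kept'), (99, 'orphan');
INSERT INTO wp_term_taxonomy (term_taxonomy_id, term_id) VALUES (10, 5);
INSERT INTO wp_term_relationships VALUES (1, 10), (1, 777);
""")

# Remove relationships that point at a term_taxonomy row that no longer exists.
conn.execute("""
DELETE FROM wp_term_relationships
WHERE term_taxonomy_id NOT IN (SELECT term_taxonomy_id FROM wp_term_taxonomy)
""")

# Remove post meta whose post no longer exists.
conn.execute("""
DELETE FROM wp_postmeta
WHERE post_id NOT IN (SELECT ID FROM wp_posts)
""")

relationships_left = conn.execute(
    "SELECT object_id, term_taxonomy_id FROM wp_term_relationships"
).fetchall()
postmeta_left = conn.execute(
    "SELECT post_id, meta_key FROM wp_postmeta"
).fetchall()
```

Swapping each DELETE for a SELECT COUNT(*) with the same WHERE clause first would show how many rows are at stake before anything is removed.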

There are probably all kinds of performance enhancers (not those kind!) that someone who really knows their WordPress / server chops can do. I am using a big old club on the engine and guessing where to bang it. And probably a whole raft of things could be done with more AJAXy transactions.

The site does feel more light and responsive, but maybe that is just my optimism. I hope we are off and racing.


cc licensed ( BY NC ND ) flickr photo shared by kaneda99

An early 90s builder of the web and blogging Alan Levine barks at CogDogBlog.com on web storytelling (#ds106 #4life), photography, bending WordPress, and serendipity in the infinite internet river. He thinks it's weird to write about himself in the third person.
