This is a second cast of this blog post. The first one, nicely composed, well crafted, and punctually perfect, got eaten when this hotel's wireless network decided it did not want to post. It’s not letting me upload photos to flickr either, and I left it on “publish” when I hit the sack at 3:00 AM.
Okay, backtrack to yesterday. It was raining when our flight left Phoenix (60 minutes late) and raining when we landed in Atlanta (90 minutes late), so it must be one massive cross country downpour. Arrived at a relatively downtown area at the CNN center, the center of something. Met up with my co-presenter Heid to rehearse, and then got to the priority, catching up with colleagues in the bar for food, drink, and conversation (in reverse order?)…. It was after 1:00 AM when I got upstairs.
Waiting were a few emails letting me know some of our NMC WordPress sites were down. A few times in the past, I have had to kick start the MySQL server, but this time it would not turn over from the Webmin interface. Going to the command line, I was unable to do anything due to error messages scrolling like:
Message from syslogd@colo at Sun Jan 21 22:26:16 2007 ...
colo kernel: CPU#0: Temperature above threshold
Message from syslogd@colo at Sun Jan 21 22:26:16 2007 ...
colo kernel: CPU#0: Running in modulated clock mode
Oi. I contacted our hosting company via their web support form and requested a reboot of the server, and was pleasantly surprised when a tech called about 15 minutes later to let me know it was done.
Okay, the temperature warnings were gone, but I still could not restart MySQL. Trying it from the command line got me the informative response… “Error!”
I was now reaching the far limits, or going over the edge, of my meager server admin expertise. After a while I managed to find the error logs, which finally gave me:
070121 23:35:15 InnoDB: Flushing modified pages from the buffer pool...
070121 23:35:15 InnoDB: Started; log sequence number 0 4182731552
070121 23:35:15 [ERROR] /usr/sbin/mysqld: Error writing file '/var/lib/mysql/xxxxxxxpid' (Errcode: 28)
070121 23:35:15 [ERROR] Can't start server: can't create PID file: No space left on device
“No space on device”?? The PID file is where the server records its process ID, and it definitely was not where it should be; it was not there at all. As usual, I had some luck running a Google search on the error message “Can’t start server: can’t create PID file: No space left on device”, which suggested the obvious: our disk was full. Running the lovely, rememberful “df” command confirmed the situation.
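For anyone who would rather catch this before the database falls over, here is a minimal Python sketch of the same check that “df” does. The function names and the 90% threshold are my own assumptions for illustration, not something from the server described here, and the percentage can differ a little from what “df” prints because “df” accounts for blocks reserved for root.

```python
import shutil


def disk_report(path="/"):
    """Return (total, used, free, percent_used) for the filesystem holding `path`.

    Sizes are in bytes. `disk_report` is a hypothetical helper name.
    """
    usage = shutil.disk_usage(path)
    percent_used = 100.0 * usage.used / usage.total if usage.total else 0.0
    return usage.total, usage.used, usage.free, percent_used


def is_nearly_full(path="/", threshold=90.0):
    """True when the filesystem holding `path` is at or above `threshold` percent used.

    The 90% default is an arbitrary assumption; pick whatever suits the server.
    """
    return disk_report(path)[3] >= threshold


if __name__ == "__main__":
    total, used, free, pct = disk_report("/")
    print(f"/ is {pct:.1f}% used, {free // (1024 * 1024)} MB free")
    if is_nearly_full("/"):
        print("Warning: disk is nearly full")
```

Run from cron against the partition that holds the MySQL data directory, something like this could send a warning well before the PID file can no longer be written.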
Deleted a bunch of monstrous backup files and other un-needed big media, and we were back in business.
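Hunting for those monstrous files can be scripted too. This is a rough sketch, assuming nothing about the actual layout of our server: it walks a directory tree and lists the biggest files it finds. The name `largest_files` and the default of ten results are made up for the example.

```python
import heapq
import os


def largest_files(root, count=10):
    """Return up to `count` (size_in_bytes, path) pairs, largest first.

    Symbolic links are skipped, and files that vanish or cannot be
    read while walking are ignored.
    """
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if os.path.islink(full):
                    continue
                sizes.append((os.path.getsize(full), full))
            except OSError:
                # File disappeared or is unreadable; move on.
                continue
    return heapq.nlargest(count, sizes)


if __name__ == "__main__":
    for size, path in largest_files("."):
        print(f"{size / (1024 * 1024):10.1f} MB  {path}")
```

It only reports; deciding which backups and media files are safe to delete is still a job for a human.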
I’m rather lucky. We’re getting ready to move our whole web server to a bigger, faster box, and it is not too soon to do so.
So that was the late night at the database bar.
(Psst – Alan. You messed up closing the “pre” after the error log stuff. Don’t worry, I don’t think anybody noticed)
Ouch, I hate running out of space. I have a MySQL backup script that mails the backups to my gmail account. I noticed my gmail space growing rapidly, so I checked the db attachments, and my WordPress one was huge. I checked it out via phpMyAdmin and saw I had a ton of spam comments from Spam Karma that were not being deleted. After deleting those, I also had about 500 spam comments identified by Akismet. They get deleted when they’re older than 15 days, but I figured why not, so I deleted all of them. After optimizing the wp_comments table I saved several megs. Unfortunately I get so much spam I could probably afford to do it every day.
I’ve also had problems with logfiles growing too big. On one server, the Apache access log reached 2 gigs; on another, the Tomcat logs ate up all the space. So if your system doesn’t provide automatic log rotation, you might check on that too.
Good luck with the move.
Todd – yeah, I’ve seen the same thing happen with logfiles. The webserver at my school has been inundated with spam e-mail, so the log files get pretty big. There have been a few occasions when the web site has been down, webmin was down and we couldn’t even secure shell in to the server to check on things. That reminds me – I need to go check the free partition space on the server!