Rara Avis
'Twas an interesting evening.
The forums went down Thursday night, about 10:38 EST as best I can tell, and remained down until the wee hours of the morning (see time on this post). We can all thank Nan for letting me know there was a problem (though she blamed the problem on me "doing something"), because we would have lost a LOT of data if I hadn't shut the forums down immediately.
As it is, I think we lost exactly one thread, from Open #36. It was number 003109, but I don't know the title or the author. We owe someone an apology, but I don't know to whom.
The short story is that we ran out of disk space on the server.
That has the potential to do a lot of damage, because the server keeps right on trucking, trying to write data to a disk that won't accept it. Every time it tries to update an existing data file, like thread 003109, it instead mangles it horribly. That's why shutting down the forum right away was so incredibly important. We were darn lucky Nan emailed me, and darn lucky I was still at the keyboard to get her message.
The long story is a bit more technical.
Every time someone responds to an existing thread, the server saves the post to a database file and then updates the HTML page with the same post. Essentially, everything is saved twice, once in a database and once as a web page. The big exception to that rule is our Archives section. Because those typically get a lot less traffic, the system only builds an HTML page when someone actually clicks on a link. Most of the data is thus only stored once, instead of twice, and that savings amounts to many gigabytes of disk space.
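The lazy-build idea is simple enough to sketch in a few lines. This is just an illustration, not the forum software's actual code; the `ARCHIVE_DB` dictionary, `CACHE_DIR` name, and `get_archive_page` function are all made up for the example:

```python
import os

# Hypothetical stand-in for the database copy of each archived thread.
ARCHIVE_DB = {"003109": "<p>Thread body from the database</p>"}
CACHE_DIR = "archive_html"

def get_archive_page(thread_id):
    """Return the HTML page for a thread, building it only on first click.

    Until someone clicks the link, the thread exists only in the database,
    so the Archives normally cost no extra disk space at all.
    """
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, thread_id + ".html")
    if not os.path.exists(path):
        # First visit: render from the database copy and cache it on disk.
        with open(path, "w") as f:
            f.write("<html><body>" + ARCHIVE_DB[thread_id] + "</body></html>")
    with open(path) as f:
        return f.read()
```

The catch, of course, is that every first click leaves a cached file behind, which is exactly how a crawler visiting every archived thread can eat the disk.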
Would you believe someone went through the forums this week and clicked on every single thread in every single archive? That's nearly 200,000 clicks, each one of which created a new HTML page. Ergo, we very quickly ran out of disk space.
Ironically, when a Linux box is out of disk, it's difficult to do much of anything. Even deleting a file requires a few bytes of swap space, and the bigger the file you want to delete, the more swap space is needed. It took me almost two hours of zapping little files to get the disk from 100% used to 99% used, giving me enough free disk to actually be able to do something useful. Getting rid of all those extra HTML files in the Archives took another hour, but quickly took the disk down to only 52% used (which is about where we were last week before our mysterious culprit went click-crazy in the Archives).
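The cleanup step could be scripted along these lines. Again, this is only a sketch of the approach described above (delete small files first, then sweep out the generated archive pages); `purge_generated_pages` is a made-up name, and the real forum almost certainly did this by hand or with shell tools:

```python
import os

def purge_generated_pages(cache_dir):
    """Delete cached archive HTML files, smallest first, and report bytes freed.

    Starting with the little files mirrors the tactic in the post: on a
    completely full disk, small deletions are the easiest way to claw back
    enough room to do anything bigger.
    """
    paths = [os.path.join(cache_dir, name)
             for name in os.listdir(cache_dir)
             if name.endswith(".html")]
    paths.sort(key=os.path.getsize)  # smallest files go first
    freed = 0
    for path in paths:
        freed += os.path.getsize(path)
        os.remove(path)
    return freed
```

Because the Archives rebuild each page on the next click anyway, deleting every cached page is safe; the only cost is regenerating pages as readers return to them.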
Once I found some elbow room, I had to rebuild virtually every single file in the forums. I'm about four hours into the process as I write this, with probably another two hours to go. Yeah, it's been a real interesting night.
So, who went crazy in the Archives and created almost 200K files in the process?
Blame it on Google.
p.s. Please, everyone, keep your eyes open for potentially corrupt data files or web pages. I *think* everything is fixed, but a few extra sets of eyes sure won't hurt.