Sorry for the Dead Air

In addition to the approaching holiday, which always means more work, I had one of my data centers lose power the other day and the generator barfed. When I tried to manually start it, sparks were flying. It’s a very old generator that came with the building, so it had to be replaced by a new one that periodically tests itself and all that happy stuff. But I’m having to troubleshoot a whole other host of issues that went along with the outage, like how it didn’t get detected until the machines went down when the UPS battery ran out. Turns out I am missing some redundancy I didn’t think about. We actually lost one of the UPSes, and that killed Internet, which isn’t supposed to happen. Either way, I have to fix all this quickly, hence very little time for the blog.

6 Responses to “Sorry for the Dead Air”

  1. Zermoid says:

    Do what you gotta do, ain’t like we are paying for your services here!

    I appreciate the time you do spend on the blog.

  2. harp1034 says:

    Get the business fixed then enjoy your time off. We will catch you later.

  3. John L says:

    who needs 5 nines uptime anyway…

    • Sebastian says:

      5 nines is a ghost people chase :) I don’t think there’s anyone that can really pull it off. Even Amazon goes down more than 5 minutes a year.

      • Patrick says:

        I’ve seen it done, but not with off-the-shelf tech. Everything was custom built, and mostly dead-simple. No operating systems, no extraneous code. Bare metal running assembly with some kind of cross-linked CPU and memory (every instruction ran through two CPUs/MMUs at the same time in case one had an issue).

        In other words: a unicorn in the wild. It ran for something like 20 years until it was replaced by a modern system that goes down about once an hour, I suspect.

        Good luck with your datacenter. What you experienced happens to the best. Rackspace lost their entire Dallas center (huge, fwiw) a few years back and it sounds exactly like what you saw: redundant systems didn’t play the way they planned. I think their fancy generators never got the signal to start, and the UPS actually turned off instead of supplying power. I got a refund for services that month even though I didn’t complain.

      • Ian Argent says:

        You do it with redundancy. Lots and lots of redundancy.