Saturday, October 07, 2006

It has been a while

It seems I never managed to find solid time to sit down and get this blogging going again. Things have been crazy at work, with a new Supply Chain implementation project starting, and we actually lost all power to our server room twice in September. This is on top of the outage on July 5th, details here.

On Sept 11th (yes, 9/11 - talk about karma), an electrician installing new power outlets short-circuited one of them, which in turn created a loopback or something like that and caused our main GFCI (Ground Fault Circuit Interrupter) to trip before our breakers could. That in turn sent a surge to our centralized UPS, which shut down to protect the equipment downstream. No power and no UPS, so down went the servers. We eventually got power restored and had all systems up and running by around 9pm, about 12 hours later.

Three days later, while we were still recovering from that outage, an accident on the street took out the transformer and we again lost power to the whole building, including the server room. Thank goodness the UPS kicked in this time, and we were able to gracefully shut down all systems and servers until power was restored. Things came back up pretty well and we were done by around 3pm. We still have to finalize the recovery process, as everything was still pretty much in flux. One thing that worked well was designating someone as the "incident commander" and having that person drive the whole recovery process, including communication with the various affected business areas.

Hopefully that was it for us for a long time. I will try to post regular updates, and for Oracle Open World I think it would be best to blog each day so that I don't forget what was covered, as I will need to debrief our IT guys when I get back from the conference.

