2 Servers 1 UPS, Windows 2012 Edition

In a previous post I showed how to shut down two servers safely using just one UPS with a single communications port. It was pretty straightforward with the comms port connected to a Windows Server 2003 machine.

But doing the same with Windows Server 2012 is much more difficult, since Microsoft decided to remove the ability to run a program on a low battery event from its power management settings. To make things worse, I discovered that a bug in Server 2008 and later meant that issuing a Shutdown command from the native power settings would not perform a clean shutdown, instead killing the power within a few seconds. This is not good news for RAID arrays and data integrity.

Time for a new solution, and since Microsoft are of no use, help would need to come from a 3rd party. After some research and testing, the answer came from Shutter, a small trigger-and-event program that covers a variety of scenarios, battery discharging status being one. Luckily two instances of the program can be run, one to shut down the remote servers and another for the host machine. Importantly, the program can also be run as a Windows service, but more on this in the walkthrough. Here is how it is done:

Continue reading “2 Servers 1 UPS, Windows 2012 Edition”

2 Servers, 1 UPS

With the new server up and running it seemed fitting to connect it to my UPS, and thanks to the low power consumption of the HP Microservers I still get around 40 minutes of battery-only time with both servers running before the UPS runs out of juice.

Unfortunately, the UPS in use only has one monitoring port, which is connected to my original server. During a prolonged power outage the new server will not know when to shut down safely and will keep running until the UPS battery is exhausted, leaving it vulnerable to the same data corruption that affected the original server before it had battery backup.

But with a simple script and some setup both servers can shut down safely before the batteries run out.
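As a rough illustration of the idea (this is not the script from the full post), the sketch below builds the standard Windows `shutdown` command for each remote machine, which the server attached to the UPS could run when the low battery event fires. The host name `SERVER2`, the 60 second delay and the comment text are assumptions made up for the example, and the account running it would need permission to shut down the remote machine.

```python
import subprocess
from typing import List

# Hypothetical host name; replace with the machine(s) sharing the UPS.
REMOTE_SERVERS = ["SERVER2"]


def build_shutdown_command(host: str, delay_seconds: int = 60) -> List[str]:
    """Return the Windows `shutdown` invocation for a remote host."""
    return [
        "shutdown",
        "/s",                      # shut down rather than restart
        "/m", f"\\\\{host}",       # target a remote computer (\\HOST)
        "/t", str(delay_seconds),  # grace period before shutdown begins
        "/c", "UPS on battery",    # comment shown to logged-on users
    ]


def shutdown_remote_servers(hosts: List[str], dry_run: bool = True) -> List[List[str]]:
    """Build (and, unless dry_run is True, execute) one command per host."""
    commands = [build_shutdown_command(h) for h in hosts]
    if not dry_run:
        for cmd in commands:
            # check=False: one unreachable host should not stop the rest.
            subprocess.run(cmd, check=False)
    return commands


if __name__ == "__main__":
    # Dry run by default: only print what would be executed.
    for cmd in shutdown_remote_servers(REMOTE_SERVERS):
        print(" ".join(cmd))
```

Defaulting to a dry run means the commands can be checked on screen before anything is actually switched off.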

Continue reading “2 Servers, 1 UPS”

UPS Investment

After the last post, the idea of having a UPS in my home might have put me off forever, but to put it into context, the unit had been installed before I was employed over 12 years ago, and for the past 2 years it had been beeping intermittently to indicate a fault that a convenient press of any button would silence.

The post before that, however, had more gravitas: having my server offline for close to a month, all due to a one-second power cut, made me feel vulnerable to another bout of downtime over something I couldn’t control. It was time to look into an Uninterruptible Power Supply to protect my server from power cuts that could knock my RAID out of sync.

Continue reading “UPS Investment”

Spectacular UPS Failure

A bit off topic, but I should document what happened in work today. I got called to investigate a burning smell in one of the offices that houses all the servers and the network head end. The request was placid enough not to cause alarm, but when I got to the room the smell hit as soon as the door opened. Narrowing it down, the smell was coming from a caged-off area underneath a desk that held the servers: an ancient IBM RS6000 with a UPS and two HP Proliant ML350 G5s with a shared UPS in two modules, along with what seemed like decades of dust, discarded cables and old computer hardware that had accumulated over the years.

Servers claimed by years of dust

Once I got down there and started to work out which cables were in use and what could be safely isolated without stopping operations, small wafts of smoke could be seen drifting up from under the desk. At this point it was obvious that the time available for diagnosing the issue was shrinking, along with the grace period before the smoke detectors triggered the fire alarms and cleared the store.

On first look, I noticed that one of the Proliant servers had a flashing LED next to a power symbol. I put two and two together and assumed that a power supply had failed spectacularly, so I chose to switch the server off, knowing it was only there for redundancy.

A minute passed with no let-up in the smoke, and by this time a CO2 extinguisher, pin pulled, was close at hand. Out of ideas, I pulled all the plugs from the wall. The RS6000 UPS failed immediately, while the Proliants carried on under battery power with 105 minutes left according to their UPS display (one was still powered off). I left it another minute to rule out a problem with an input to the UPS, and with nervous relief the smoke subsided. A few back office systems went down with the RS6000, but the customer-facing Proliant stayed online.

With the batteries keeping customer-facing systems online for a further hour or so, it was a safe time to find the culprit. After an extensive sniff test, the UPS for the RS6000 was identified as the source of the incident, which is possibly why it failed as soon as power was cut. It was taken out of commission and bypassed to get the IBM machine back online.

Failed UPS, I'm not so trusting of you anymore

A rather eventful day compared to the normal, mundane non-IT job. I haven’t opened up the failed UPS to see what went wrong, nor would I want to, given the state the (probably) lead-acid cells are likely to be in.