Review: Gigabyte Brix GB-BXBT-3825

Let’s hope three servers is a charm, as it’s time for a new server. But this time I’m moving away from the HP Microserver. Why? Well, the new server is destined to be a dedicated web server for my sites. Ever concerned with security and protecting my network, I thought it wise to physically separate the public-facing websites from my data, adding an extra layer of security.

The choice was to go for a NUC-based machine or nettop: their small footprint allows them to be placed out of the way, they are in keeping with my low-power requirements, and the often fan-less design keeps them quiet. As it’s to be a web server only, the restrictions of a device this size, such as space for multiple hard drives, graphics performance and upgradability, are not an issue.


Spectacular UPS Failure

A bit off topic, but I should document what happened at work today. I got called to investigate a burning smell in one of the offices that houses all the servers and the network head-end. The request was placid enough not to cause alarm, but when I got to the room the smell hit as soon as the door opened. Narrowing it down, it was coming from a caged-off area underneath a desk that held the servers: an ancient IBM RS6000 with its own UPS, and two HP ProLiant ML350 G5s with a shared UPS in two modules, along with what seemed like decades of dust, discarded cables and old computer hardware that had accumulated over the years.

Servers claimed by years of dust

Once I got down there and started to fathom out what cables were in use and what could be safely isolated without stopping operations, small wafts of smoke could be seen drifting up from under the desk. At this point it was obvious that the timescale for diagnosing the issue was shrinking, along with the grace period before the smoke detectors triggered the fire alarm and cleared the store.

On the initial look, I noticed that one of the ProLiant servers had a flashing LED next to a power symbol. Putting two and two together, I thought a power supply had failed spectacularly, so I chose to switch it off, knowing the server was just there for redundancy.

A minute passed with no let-up in the smoke; by this time a CO2 extinguisher, pin pulled, was close at hand. Out of ideas, I pulled all the plugs from the wall. The RS6000 UPS failed immediately, while the ProLiants carried on under battery power with 105 minutes left according to their UPS display (one was still powered off). I left it another minute to rule out a problem with the input to the UPS, and with nervous relief the smoke subsided. A few back-office systems went down with the RS6000, but the customer-facing ProLiant stayed online.

With the batteries keeping customer-facing systems online for a further hour or so, there was time to find the culprit safely. An extensive sniff test pointed to the UPS for the RS6000 as the source of the incident, which is possibly why it failed as soon as mains power was cut. It was taken out of commission and bypassed to get the IBM machine back online.

Failed UPS, I'm not so trusting of you anymore

A rather eventful day compared to my normal, mundane non-IT job. I haven't opened up the failed UPS to see what went wrong, nor would I want to, thinking about what state the (probably lead-acid) cells are in.

Major Outage

It had to happen: after years of reliability (apart from an ISP-related failure), I had my first hardware-related downtime, caused by a power cut that lasted all of one second.

During the brief power loss only a few electricals switched off, but my HP Microserver was one of them, losing power and restarting. Looking at the headless unit after boot, all lights were on, the HDD light was on solid and the network lamp was flickering away as normal. However, when trying to access the server, even down to a simple ping, it was unresponsive.

Time for investigation: the server was ripped out of its kitchen-cupboard home and connected to a TV along with a keyboard and mouse. From there it was painfully apparent that the RAID mirror had been corrupted and the BIOS couldn't find the OS. The OS drive was in a RAID 1 mirror, so I took out the primary master disk (first in the BIOS boot priority) and tried to boot from the remaining half of the mirror. This time it started Windows. All seemed back on track: I waited for the other mirror holding data to re-sync, then changed the boot priority in Windows (not the BIOS) to use the good OS drive first. After a restart to plug the un-synced HDD back in, it booted fine, no SMART errors were reported for the removed drive, and it started to rebuild the system mirror.

Things then took an ugly turn. The desktop would behave as normal for around 90 seconds, then the system would freeze (apart from the mouse) for minutes at a time, before coming back to life and displaying the applications requested before the freeze. It seemed as if the system was having big problems reading from the disk: it would run fine simply moving the mouse around, but selecting a program would cause a freeze, and depending on what was requested the wait could be up to 20 minutes. While in this frozen state the HDD lamp on the Microserver would be solid, so naturally it pointed to either a bad hard drive or a problem with the RAID mirror.

The HP Microserver in a state of repair

Not finding a solution, I admitted defeat and did a fresh install of Windows, but I still wanted to recover the latest data. The data mirror was easy: just remove the drives, as they can be imported into a new install. The OS drive was a bit trickier, as the system would freeze if I tried to copy files as-is. Luckily I had the old 250GB drive that came with the Microserver; it had Windows 2003 on it and ran the system until more capacity was required, when it was swapped out for a 1TB drive. Not so lucky was that the only software I could find to copy files from a foreign RAID mirror cost £50, which I shelled out as my data was worth more than the asking fee. Along with an extra 1TB drive to hold the data while I juggled drives, it ended up costing me a few quid.

All this from a 1 second power cut.

RAID 1 on a system disk:

Research says this is not a good idea. While it will run with no issues during normal operation, after an unexpected shutdown a RAID controller just can't tell the difference between a good copy and an un-synced, corrupt copy, so the controller will either guess, which could restore an out-of-date file, or create a mismatch of current and out-of-date files that ultimately brings your OS to a halt.
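To picture the guessing problem, here is a toy Python sketch. It is purely illustrative (real controllers keep metadata and dirty-region logs, and nothing below comes from any real product): two mirrored "disks" hold the same files, a power cut lands mid-write so only one half gets the new copy, and a naive resync that blindly picks a source disk is just as likely to copy the stale block over the fresh one.

```python
# Toy illustration of a RAID 1 mirror after an unclean shutdown.
# Hypothetical names and behaviour -- without a clean record of the
# interrupted write, "pick a source disk and copy it" is essentially
# the guess described above.
import random

# Two halves of the mirror, in sync before the power cut.
disk_a = {"boot.cfg": "v1", "data.db": "v1"}
disk_b = {"boot.cfg": "v1", "data.db": "v1"}

# A write was in flight when the power went: disk A got the new copy,
# disk B never did, so the halves now silently disagree.
disk_a["data.db"] = "v2"

# Naive resync on the next boot: the controller cannot tell fresh from
# stale, so it guesses a source disk and copies it over the other one.
source, target = random.choice([(disk_a, disk_b), (disk_b, disk_a)])
target.update(source)

print("After resync, data.db =", disk_a["data.db"])
# Roughly half the time this prints "v1": the out-of-date copy has quietly
# replaced the good one, which on a system disk can stop the OS booting.
```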

Major Internet Outage

Last week my websites suffered their first major outage since I got my new server in April 2011. Luckily it wasn't the server itself; it was the internet connection that let me down.

I took delivery of a Netgear FVS318N router to replace a basic hub, installed it and did a bit of cable management, which involved unplugging the Sagem F@st 2504 that I use as a modem.

However, upon powering up the Sagem after tidying the cables, it showed no life, apart from this strange arrangement of lights on the front:

Power supply failure on Sagem F@st 2504

I called Sky (my ISP) support, who happily informed me that there is a common issue with the power supply to the Sagem router that causes them to fail. Wanting to get back on the net immediately, and with the conversation about a replacement power supply giving only vague delivery lead times, I opted to purchase the new Sky-branded router (dubbed the Sky Hub).


The outage lasted five days as I waited for delivery of the new modem. An added annoyance is that I had a spare, working ADSL modem, but it could not be used as Sky does not give out the credentials to log on to their network, instead choosing to pre-load them onto the modem before shipping.

Overall, it's an example of the unexpected issues that can arise when running a home server on a budget.

BOOTNOTE:

It has been mentioned in many Sky internet forums that using an unapproved router, i.e. one not supplied by Sky, is a breach of the Terms & Conditions. However, whilst on the phone to Sky broadband technical support, the representative told me it was acceptable to use a third-party router if the user was confident, and acknowledged that no support would be given unless a Sky-provided router was in use.

The case may be that you still need to hand over the cash to Sky for one of their routers and keep it to hand, but after that the choice is yours!

RPi kills my internet

It was all going so well: I got my Raspberry Pi, and after the initial fiddle with Debian Squeeze I got another SD card and put Raspbmc on it. Things were great!

The only niggle in my head was that the card I put Raspbmc on was 8GB, and that the bigger card would be put to better use in my camera, which was using a 4GB card. I thought it would be no problem to reformat the cards and swap them over?

Wrong!

The 8GB card in the camera was fine, and I used the Raspbmc installer as before to load it onto the new SD card. The trouble was that when I first booted up the Pi, it seemed to freeze on the message:

Sending HTTP request to server

No problem, I thought: hop on my laptop and find out if other users had experienced the same. But lo and behold, the internet on my laptop ceased to work, with strange requests for proxy passwords from sites like Facebook and even the Weather gadget on Windows 7!

My first thought was that I had cooked my router, as I had been downloading a lot, and on a warm day too (yes, there was a warm day… I think!). But after it had been off for as long as I could stand, I powered it back on and normal service resumed.

After rebooting all the network equipment it finally dawned on me that the internet would go down for everything connected to my network whenever the Pi was powered up! I had never experienced this before and could not for the life of me fathom it out. I wondered whether a defect in the Pi meant that some sort of power surge was knocking out the system? This was quickly dismissed, as local traffic was unaffected, meaning the network hardware was operating normally.

A quick glance at my Sky-broadband-supplied Sagem F@ST 2504 modem showed the internet connection had failed, with the internet indicator glowing orange and pulsing red every second. Stranger still, upon unplugging the Raspberry Pi, the connection to the net was restored within seconds!
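For what it's worth, the "local traffic fine, internet dead" observation is simple enough to script. Below is a minimal sketch of that check in Python; the gateway address 192.168.0.1 is an assumption (substitute whatever address your Sagem actually sits on), and it uses Linux ping flags.

```python
# Quick LAN-vs-WAN check: if the gateway answers but an outside host
# doesn't, the local network is fine and it is the internet connection
# that has failed. 192.168.0.1 is an assumed gateway address -- change it.
import subprocess

GATEWAY = "192.168.0.1"   # assumed address of the Sagem router
OUTSIDE = "8.8.8.8"       # any well-known external IP will do

def ping(host):
    """Return True if a single ping to host gets a reply (Linux ping syntax)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

lan_ok = ping(GATEWAY)
wan_ok = ping(OUTSIDE)

if lan_ok and not wan_ok:
    print("LAN is up, internet is down - the modem's WAN side has failed")
elif not lan_ok:
    print("Can't even reach the gateway - local network problem")
else:
    print("Both gateway and outside host respond - connection looks fine")
```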


So how can a network device have the ability to target and destroy an internet connection? It's my understanding that a Pi has no ability to retain settings other than what's stored on the SD card, yet this issue continued when using two different memory cards.

Drilling down to an extreme form of troubleshooting, all network devices, including my second switch/access point, were disconnected from the Sagem router, leaving just the Pi connected. Then, from Midori on Debian Squeeze (remembering that the internal network was unaffected), I rebooted the router using its web interface.

Suddenly the Pi could connect. After attaching my whole network back together, I found that everything was back to normal:

Laptop, Pi, iPhone, everything!

And this is the worst thing: I don't know what caused it, or what I specifically did in the reboot process that solved it.

So I would love to hear if this has happened to you, and if there was something you could pinpoint as the issue. This one has got me completely stumped!