HA stands for High Availability. It began as a computer design approach, then evolved into a broader paradigm and a Linux-based software solution. We'll talk about both aspects.
Everything started with the growing need for ever more available applications: banking, science, cloud computing, e-mail, gaming and so on. The goal is to get as close as possible to 100% availability, which means zero interruption of service since the appliance was first started. So far, the only system widely regarded as 100% available is the Global Positioning System network.
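To make "as close as possible to 100%" concrete, the allowed downtime per year follows directly from the definition availability = uptime / (uptime + downtime). A minimal sketch (the function name and the list of targets are illustrative, not from any standard):

```python
# Downtime per year implied by common availability targets ("nines").
MINUTES_PER_YEAR = 365.25 * 24 * 60

def downtime_minutes_per_year(availability: float) -> float:
    """Minutes of allowed downtime per year for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability)

for target in (0.99, 0.999, 0.9999, 0.99999):
    print(f"{target:.5%} available -> "
          f"{downtime_minutes_per_year(target):8.2f} min/year of downtime")
```

Even "five nines" (99.999%), a common marketing figure, still allows a few minutes of downtime every year; true 100% allows none at all.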
Be careful: uptime and availability are not the same thing. What if your NAS is still running (so no downtime) but its network link has been severed? Your system is up but unavailable.
A design approach: will my architecture resist disasters?
To guarantee a high availability rate, you first have to look at where a failure can potentially occur. Here are some examples, each with a way to avoid it; you'll notice the keyword is “redundancy”. All these solutions are part of a global HA approach when you design your datacenter or network.
– Power shortage: redundant power sources, backup batteries
– Network connection failure: MPIO, multiple switches
– Hardware failure (excluding HDD): stand-by redundant appliance, Dual DOM
– HDD failure: RAID protection, frequent disk scan to search for errors
– Natural disaster (earthquake, typhoon, snow storm and so on): second distant site, backups on protected media, cloud computing
– Human mistake: training programs, standardized procedures
– Firmware bug: keep your firmware up to date
– The list could go on and on; the only limit is your imagination!
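The reason redundancy keeps appearing in the list above is simple probability: with independent spares, the system fails only when every copy fails at once. A minimal sketch (the 99% component figure is an illustrative assumption, and independence is an idealization):

```python
def redundant_availability(component_availability: float, copies: int) -> float:
    """Availability of a set of independent, redundant components:
    the system is down only if all copies are down simultaneously."""
    return 1 - (1 - component_availability) ** copies

# A single 99%-available power supply vs. a redundant pair:
print(redundant_availability(0.99, 1))  # a single unit: two nines
print(redundant_availability(0.99, 2))  # a redundant pair: roughly four nines
```

Doubling a 99% component turns roughly 3.7 days of expected yearly downtime into under an hour, which is why redundant power, paths and sites dominate the list.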
The bottom line is that no system is 100% safe. But with the technological progress embedded in the HA approach, availability rates are getting ever closer to the elusive 100%.
Linux-HA: an Open-Source High Availability Solution
Let’s start with a quote from the official website http://www.linux-ha.org/: “The Linux-HA project maintains a set of building blocks for high availability cluster systems, including a cluster messaging layer, a huge number of resource agents for a variety of applications, and a plumbing library and error reporting toolkit.”
What does that mean? It is actually easy to grasp, even if all the programming behind it is a serious piece of work. A cluster is a group of linked computers working together; from an outside point of view, you only see one computer: a single IP address, one monitor output, and so on. A cluster contains several units called nodes, and Linux-HA helps manage an HA cluster through several solutions such as Heartbeat. The principle is as follows: in the most basic HA cluster, with two nodes, one appliance (let's say an N16000) is active while the other one is on standby, synchronized with the first. If the first N16000 can't do its job for any reason (natural disaster, network failure, power shortage, whatever you can imagine), the second one automatically takes over, in a heartbeat. This way the service continues without any interruption.
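The two-node principle above can be sketched as a toy simulation: the standby node promotes itself once the active node's heartbeat messages stop arriving. This is an illustrative assumption of the idea only (the `Node` class, names and timeout are invented here), not Heartbeat's actual implementation:

```python
import time

HEARTBEAT_TIMEOUT = 3.0  # seconds of peer silence before failover (illustrative)

class Node:
    """Toy model of one member of a two-node active/standby cluster."""

    def __init__(self, name: str, active: bool):
        self.name = name
        self.active = active
        self.last_peer_heartbeat = time.monotonic()

    def receive_heartbeat(self) -> None:
        """Called each time a heartbeat message arrives from the peer."""
        self.last_peer_heartbeat = time.monotonic()

    def check_peer(self) -> None:
        """Standby promotes itself if the peer has been silent too long."""
        silent_for = time.monotonic() - self.last_peer_heartbeat
        if not self.active and silent_for > HEARTBEAT_TIMEOUT:
            self.active = True  # take over the service
            print(f"{self.name}: peer silent for {silent_for:.1f}s, taking over")

standby = Node("N16000-B", active=False)
standby.last_peer_heartbeat -= 5.0  # simulate 5 s without a heartbeat
standby.check_peer()                # the standby node becomes active
```

In a real deployment the takeover also involves claiming the shared service IP address and resources, so clients keep talking to "the same computer" without noticing the switch.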