IP failover - High Availability explained

A popular feature available at Slicehost is the ability to have a failover IP 'shared' between Slices.

This article outlines what that means. Further articles will explain how to set this up on your Slice(s) to allow you to create a High Availability setup.


Standard Slice setup

A typical Slicehost account contains one or two Slices.

Let's take a look at one Slice and see how it might be set up to serve a simple website.

The owner installs a webserver and creates a virtual host. After a few minutes of hair pulling, he remembers he needs to create a DNS zone for his domain. Once all that is done, he is very happy as he is now serving his website and all is well.

When a request for his domain comes in, the IP address for the domain is matched to his Slice and the request is forwarded to the Slice. On receipt of the request, the web server (whether it is Apache, Nginx, etc.) looks at its virtual host files and then delivers the content for the domain.
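
The two lookups described above can be sketched in Python, purely as an illustration. The domain, IP address and document root below are made-up placeholders, and real web servers such as Apache or Nginx match virtual hosts with far more elaborate rules than this.

```python
# Hypothetical data: a DNS zone mapping domains to IPs, and a web server's
# virtual host table mapping Host headers to document roots.
DNS_ZONE = {"example.com": "192.0.2.10"}
VIRTUAL_HOSTS = {"example.com": "/home/demo/public_html/example.com"}


def resolve(domain):
    """Step 1: DNS matches the domain to the Slice's IP address."""
    return DNS_ZONE.get(domain)


def pick_document_root(host_header):
    """Step 2: the web server matches the Host header to a virtual host."""
    host = host_header.split(":")[0].lower()  # drop any port, ignore case
    return VIRTUAL_HOSTS.get(host)


print(resolve("example.com"))
print(pick_document_root("Example.com:80"))
```

Both functions return None for a domain they do not know about, which roughly corresponds to a missing DNS record or a missing virtual host.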

Of course, it is slightly more complicated than that but the basics of the system are that simple.

Works well. Everyone is happy.

Slice issue

But what if there is a Slice issue? Perhaps there is a rogue process which uses all the resources, or the server has to be rebooted (even we have hardware issues on occasion!).

Once the Slice is not available, the domain's website is not available. The same basic process is completed in that the IP address of the domain is matched to the Slice, but nothing is delivered as the Slice is not available.

Eeek.

Although not a common issue, this situation could cost you money and time repairing the Slice while you have customers yelling at you.

Failover IPs

This is where Failover IPs come in. You can actually 'share' an IP between two Slices so when one Slice is not available the other takes over the IP address.

For this you need two Slices. Let's keep it simple and call one the 'Master' Slice and one the 'Slave' Slice.

The Master Slice is set up just as described above (with a web server and virtual host, or however you want the Slice to be set up).

Next is the Slave Slice. This is actually a mirror of the Master Slice. We need it to be the same as it will take over the duties of the Master Slice if the Master Slice is not available for some reason.

Not automatic

The failover system is not automatic. You need to install an application to allow the failover to occur. There are a couple of ways of approaching this but we will use what is known as 'Heartbeat' for this (full details and instructions on how to install Heartbeat are in the next article).

Heartbeat runs on both the Master and Slave Slices. They chat away and keep an eye on each other. If the Master Slice goes down, the Slave Slice notices this and brings up the same IP address that the Master Slice was using.
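
The idea of one Slice watching the other can be sketched as a toy model in Python. This is not how Heartbeat itself is implemented or configured; the class name, the 10 second timeout and the IP address are all assumptions made for the example.

```python
class StandbyMonitor:
    """Toy model of the Slave Slice watching the Master Slice.

    Not Heartbeat's actual protocol; the 10 second dead time is an
    arbitrary value chosen for this example.
    """

    def __init__(self, shared_ip, dead_time=10.0):
        self.shared_ip = shared_ip
        self.dead_time = dead_time
        self.last_seen = 0.0
        self.holds_ip = False

    def heartbeat_received(self, now):
        """Record that the Master was heard from at time `now` (seconds)."""
        self.last_seen = now

    def check(self, now):
        """Take over the shared IP if the Master has been silent too long."""
        if not self.holds_ip and now - self.last_seen > self.dead_time:
            self.holds_ip = True  # a real tool would bring the IP up here
        return self.holds_ip


monitor = StandbyMonitor("192.0.2.10")
monitor.heartbeat_received(now=100.0)
print(monitor.check(now=105.0))  # Master heard from 5 s ago: False
print(monitor.check(now=115.0))  # Master silent for 15 s: True
```

The sketch leaves out what happens when the Master comes back, which a real failover tool also has to handle.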

This ensures that even though the Master Slice is down, the website will still be served - only this time from the Slave Slice.

High Availability

What this comes down to is creating a High Availability network with your Slices. Your site won't go down. It doesn't matter if one of the two Slices is down, as the other one will take over.

Pretty cool.
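
To put a rough number on the benefit, here is a small back-of-the-envelope calculation in Python. The 99.9% availability figure is an assumption picked for illustration, not a Slicehost statistic, and the formula assumes the two Slices fail independently and ignores the short time the takeover itself needs.

```python
HOURS_PER_YEAR = 24 * 365

# Assumed figure for illustration only, not a Slicehost statistic.
single_availability = 0.999


def pair_availability(a):
    """Chance that at least one of two independent Slices is up."""
    return 1 - (1 - a) ** 2


def downtime_hours(a):
    """Expected hours per year during which the site is unreachable."""
    return (1 - a) * HOURS_PER_YEAR


print(downtime_hours(single_availability))
print(downtime_hours(pair_availability(single_availability)))
```

Under these assumptions a single Slice would be unreachable for about 8.76 hours a year, while the pair would be unreachable for roughly half a minute.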

Multiple Slice setups

Most sites won't need this sort of High Availability setup (for a start you need a minimum of two Slices), but those sites that do often have a Load Balancing front end.

What that means is that a small Slice is used as the front end to a cluster of 'Application' Slices. All requests for the website come to the front end Slice. That Slice then proxies the request to larger Slices running in the backend of the network.

These can be very large Slices running multiple mongrels (for example). Having four or five 4G Slices means a lot of power is available for the site. Often the database is also on another Slice.

All these requests go through the front end Slice.
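
As a rough sketch of what the front end Slice does, the Python below hands each incoming request to the next backend in turn. Round-robin is just one possible strategy and is an assumption here, as are the backend addresses and the class name; a real front end would use a proxy such as Nginx or Apache rather than hand-written code.

```python
from itertools import cycle

# Hypothetical private addresses for the backend 'Application' Slices.
BACKENDS = ["10.0.0.11:8000", "10.0.0.12:8000", "10.0.0.13:8000"]


class FrontEnd:
    """Toy front end that hands each request to the next backend in turn."""

    def __init__(self, backends):
        self._rotation = cycle(backends)

    def route(self, path):
        """Return a description of where the request for `path` is sent."""
        backend = next(self._rotation)
        return f"{path} -> {backend}"


front_end = FrontEnd(BACKENDS)
for path in ["/", "/about", "/contact", "/"]:
    print(front_end.route(path))
```

Every request passes through the single FrontEnd object, no matter how many backends sit behind it.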

Have you noted the issue here? It is the same as with the most basic Slice setup. There is still a 'single point of failure'.

If the front end Slice goes down then the whole site goes down. It doesn't matter that the multiple 4G Slices are up and running. If the front end is not available, then the site is kaput.

In comes the failover IP system. You can have two 'front end Slices'. If the main front end Slice goes down, the other Slice simply picks up where the main one left off.

No downtime, no issues.

Summary

It is worth stressing that it is rare for a server to be rebooted. However, it does happen. We live in the real world and there is the odd hardware failure.

Most sites would not really need this sort of failover IP setup.

However, for those that do need High Availability, this system offers peace of mind. Not only that, it is very easy to set up.

The next article shows how to set up two Slices with failover IPs and Heartbeat.

PickledOnion

Article Comments:

Nadav BBlum commented Tue Feb 24 08:19:55 UTC 2009:

Cool. But what about the db server? It also needs an IP failover mechanism + replication. Right?

Michael Glass commented Fri Apr 03 07:30:07 UTC 2009:

Is there a way to guarantee that two slices aren't on the same hardware?

PickledOnion commented Sat Apr 04 18:46:27 UTC 2009:

Michael,

All Slices are automatically placed on different servers within the datacenter.

We can't actually put them on the same server if we are requested to do so.

However, feel free to get in touch at support@slicehost.com and we can check for you.

PickledOnion
