The purpose of this guide is to answer one of the questions I receive most often:
How do I host my web applications in the cloud in a way that is redundant but also inexpensive?
Before you begin reading the guide, try to keep the following things in mind:
- Try to understand what an application is doing before blindly configuring it as the guide states. This helps in two ways: it lets you start thinking about ways to improve the configuration for your specific needs, and it gives you the tools to fix things when they break later.
- Stay lean. Some portions of the guide may not apply to your application’s needs. Rather than spending time on daemons you don’t need, skip any parts that don’t apply to your specific application. On the other hand, if your application needs more functionality than this guide provides, add it carefully, and understand what each addition does before you configure it (see the previous bullet point).
- This is not the only way to configure a redundant cloud environment. This guide covers the configuration that I like best. If you don’t like a particular daemon or Linux distribution mentioned in the guide, use what you’re most comfortable with or what you prefer.
- Cloud is what you make of it. Don’t be afraid to forge your own path.
- Give me feedback. If you spot something that’s incorrect, or if you find a more efficient way to handle a particular problem, let me know! I’ll be glad to consider it for the guide and you’ll receive proper attribution.
With that out of the way, let’s begin the guide.
The high-level overview
To get an idea of the end result, review the diagram shown below:
There are three main service groups that I need to host my applications:
- Load balancing layer: Two needs are fulfilled at this layer – the distribution of load as well as redirection of traffic away from problematic web nodes.
- Web service layer: As you might imagine, this layer is the workhorse of the entire configuration. This is where web content is served, and where it is stored in a clustered filesystem.
- Database/caching layer: Without this layer, the configuration would grind to a halt. The applications running on the web services layer depend on this layer for rapid storage and retrieval of information.
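The two jobs of the load balancing layer (spreading requests and steering them away from problematic web nodes) can be modelled in a few lines of Python. This is only a toy sketch of the idea, not how LVS or ldirectord are implemented; the class name, the node names `web1` and `web2`, and the `mark_down`/`mark_up` methods are all made up for illustration.

```python
class RoundRobinBalancer:
    """Toy model: rotate through web nodes, skipping any marked unhealthy."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self._next = 0  # index of the node to try first on the next pick

    def mark_down(self, node):
        # A health check failed: stop sending traffic to this node.
        self.healthy.discard(node)

    def mark_up(self, node):
        # A health check passed again: put the node back in rotation.
        if node in self.nodes:
            self.healthy.add(node)

    def pick(self):
        # Try each node at most once, continuing from where the last pick stopped.
        for _ in range(len(self.nodes)):
            node = self.nodes[self._next]
            self._next = (self._next + 1) % len(self.nodes)
            if node in self.healthy:
                return node
        raise RuntimeError("no healthy web nodes available")


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["web1", "web2"])  # hypothetical node names
    print([balancer.pick() for _ in range(3)])
    balancer.mark_down("web2")
    print([balancer.pick() for _ in range(3)])
```

In the real configuration the monitoring daemon plays the role of `mark_down` and `mark_up`, and the kernel does the actual packet forwarding.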
In order to follow this guide, you’ll need the following:
- Stable Linux distribution – pick whichever one you prefer, but I’ll be using Fedora
- Six virtual machines – fewer than six will get a bit tricky, and it reduces your redundancy
- Public and private network interfaces on each virtual machine – not required, but it’s highly recommended
- One extra IP address – this will be your virtual IP address for load balancing (you will need more if you’re hosting multiple sites with SSL, unless you want to use SNI)
- Ability to share an IP between multiple virtual machines – this will be a requirement for LVS-TUN (if you can’t share IPs, you can try using LVS-NAT, but I wouldn’t recommend it)
- Kernel modules – you’ll need a few kernel modules, or the ability to compile and use them with your running kernel
- Linux kernel 2.6.27 or later – there are some great performance improvements for virtual machines and the fuse module in these kernels (not a strict requirement, but highly recommended)
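If you want to check the kernel recommendation on each virtual machine, a small script along these lines may help. It is a sketch that assumes the release string starts with dotted numbers (as in `2.6.27.5-117.fc10.i686`); the function names and the `MINIMUM_KERNEL` constant are my own, not part of any tool mentioned in this guide.

```python
import platform
import re

MINIMUM_KERNEL = (2, 6, 27)  # recommended minimum from the list above


def parse_kernel_release(release):
    """Return the leading numeric parts of a release string as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing third component (e.g. "3.10") is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def kernel_is_recent_enough(release=None, minimum=MINIMUM_KERNEL):
    """Compare a release string (default: the running kernel) to the minimum."""
    if release is None:
        release = platform.release()
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    print(platform.release(), kernel_is_recent_enough())
```

Tuples compare element by element, so `(2, 6, 26)` sorts below `(2, 6, 27)` while `(3, 10, 0)` sorts above it.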
Step by step
I’ve broken the guide up into functional pieces to allow you to build your configuration and test it along the way. Click on the title of each step to see detailed instructions, diagrams and explanations:
- Step 1: Setting up a redundant database/caching layer
  - Includes: setting up MySQL with drbd and heartbeat, installing memcached, testing failover
- Step 2: Communication between web nodes and the database/caching layer
  - Includes: configuring haproxy, testing failover
- Step 3: Configuring LVS-TUN and monitoring of web service nodes
  - Includes: ldirectord and heartbeat installation on the load balancers, tunnel configuration on web nodes
- Step 4: Wrapping up
  - Includes: security tightening and final adjustments
What’s the total cost?
Right now, I’m hosting this configuration with Slicehost with the following setup:
- load balancers: two 256MB instances (2 x $20/month)
- web nodes: two 1024MB instances (2 x $70/month)
- database nodes: two 512MB instances (2 x $38/month)
That adds up to $256 per month for the entire configuration at Slicehost. That price also includes 2.1TB of public bandwidth (since the bandwidth is pooled between instances). The only large consumers of bandwidth are the web nodes since they send out a lot of traffic. The load balancers simply receive requests on the public interface and shuttle them to the web nodes over the private network. The database servers would only talk to the public network for package updates.
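The total is easy to re-check, or to redo with your own instance sizes. The sketch below uses only the counts and prices listed above; the dictionary layout and the `monthly_total` name are just for illustration.

```python
# Tier name -> (instance count, price per instance in USD per month),
# taken from the Slicehost list above.
instances = {
    "load balancers": (2, 20),
    "web nodes": (2, 70),
    "database nodes": (2, 38),
}


def monthly_total(tiers):
    """Sum count * price across all tiers."""
    return sum(count * price for count, price in tiers.values())


if __name__ == "__main__":
    print(monthly_total(instances))  # 256
```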
If you wanted to host the same configuration with Rackspace’s Cloud Servers, you could do it for as little as $153.30 per month, but your bandwidth would be billed at the utility rates. For low-traffic sites, this may be the better-priced option.
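To see where "low traffic" ends, you can estimate the break-even point between the two price quotes. This is a rough sketch under loud assumptions: it treats bandwidth as a single blended per-GB rate that you supply yourself (the actual utility rates are not given here), and it ignores any charges beyond the pooled Slicehost allowance. The function names are hypothetical.

```python
SLICEHOST_MONTHLY = 256.00      # flat price quoted above, bandwidth pooled
CLOUD_SERVERS_MONTHLY = 153.30  # base price quoted above, bandwidth billed separately


def cloud_servers_total(gb_transferred, rate_per_gb):
    """Base price plus metered bandwidth at an assumed blended rate (USD per GB)."""
    return CLOUD_SERVERS_MONTHLY + gb_transferred * rate_per_gb


def break_even_gb(rate_per_gb):
    """Monthly transfer at which both options cost the same, for a given rate."""
    if rate_per_gb <= 0:
        raise ValueError("rate_per_gb must be positive")
    return (SLICEHOST_MONTHLY - CLOUD_SERVERS_MONTHLY) / rate_per_gb


if __name__ == "__main__":
    example_rate = 0.25  # placeholder only; look up your provider's real rate
    print(round(break_even_gb(example_rate), 1))
```

Below the break-even transfer the metered option is cheaper; above it, the flat price wins.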