Auto Scaling with Elastic Load Balancing

In addition to Elastic Load Balancing, which I showed you how to set up in my previous post, Amazon provides the ability to auto scale your instances. Auto Scaling groups let you define groups of instances that scale up and down based on triggers you create. For example, you can set up a scaling group to always contain 2 instances, and to launch another server if the CPU utilization of the group rises above a certain threshold. This is extremely helpful when you receive unexpected traffic and cannot react in time to add new instances manually. The beauty of Auto Scaling in conjunction with Elastic Load Balancing is that new instances are automatically registered with the load balancer you specify.

Creating an Auto Scaling Launch Config

The first step in setting up Auto Scaling is to create a launch config. The launch config determines which EC2 image (AMI) and instance type (small, medium, etc.) will be used when a new instance is launched for your Auto Scaling group. To set up a launch config you call the as-create-launch-config command. For example, to create a new launch config called auto-scaling-test that launches the image ami-12345678 with the instance type c1.medium, you would run the following command:

as-create-launch-config auto-scaling-test --image-id ami-12345678 --instance-type c1.medium

Create an Auto Scaling Group

The next step in enabling Auto Scaling is to set up an Auto Scaling group. An Auto Scaling group tells Amazon which availability zones you want your instances created in, the minimum and maximum number of instances to ever run, and which launch config to use. To create an Auto Scaling group you call the as-create-auto-scaling-group command. For example, to create a new group named auto-scaling-test in the us-east-1a availability zone, with a minimum of 2 instances and a maximum of 4, using our newly created launch config, you would run:

as-create-auto-scaling-group auto-scaling-test --availability-zones us-east-1a --launch-configuration auto-scaling-test --max-size 4 --min-size 2

When this command is executed, 2 new instances will be created as per the directions of the launch config. The as-create-auto-scaling-group command can also be linked to a load balancer, so if we wanted this group set up with the load balancer we created in the previous article, we would run:

as-create-auto-scaling-group auto-scaling-test --availability-zones us-east-1a --launch-configuration auto-scaling-test --max-size 4 --min-size 2 --load-balancers test-balancer

After execution this would launch 2 new instances as per the instructions of the launch config, and register those instances with the load balancer test-balancer.

Creating Auto Scaling Triggers

Triggers are used by Auto Scaling to determine whether to launch or terminate instances within an Auto Scaling group. To set up a trigger you use the as-create-or-update-trigger command. Here is an example using the Auto Scaling group we created earlier:

as-create-or-update-trigger auto-scaling-test --auto-scaling-group auto-scaling-test --measure CPUUtilization --statistic Average --period 60 --breach-duration 120 --lower-threshold 30 --lower-breach-increment=-1 --upper-threshold 60 --upper-breach-increment 2

Let's walk through what this command does. It creates a new trigger called auto-scaling-test that operates on the Auto Scaling group auto-scaling-test. The trigger measures the average CPU utilization of the instances in the group every 60 seconds. If CPU utilization stays above 60% for 120 seconds, 2 new instances are launched; if it drops below 30% for 120 seconds, 1 instance is terminated. Remember that a trigger will never shrink the group below the minimum number of instances, nor grow it beyond the maximum, as defined in the Auto Scaling group.
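If it helps to see the thresholds spelled out, here is a toy shell sketch of the decision this trigger encodes. This is an illustration only, not how the Auto Scaling service is actually implemented:

```shell
# Toy model of the trigger above: given the average CPU utilization
# (percent) over the breach duration, print the change in instance count.
scaling_decision() {
    avg_cpu=$1
    if [ "$avg_cpu" -gt 60 ]; then
        echo 2      # upper threshold breached: launch 2 instances
    elif [ "$avg_cpu" -lt 30 ]; then
        echo -1     # lower threshold breached: terminate 1 instance
    else
        echo 0      # within thresholds: no change
    fi
}

scaling_decision 75    # prints 2
scaling_decision 45    # prints 0
```

In the real service the resulting group size is also clamped to the min-size and max-size of the Auto Scaling group.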

Shutting Down an Auto Scaling Group

Shutting down an Auto Scaling group can initially be a bit tricky, as you cannot delete a group until all of its instances are terminated or deregistered. The best way to remove an Auto Scaling group, its triggers, and its launch config is to follow these steps:

  • Delete all triggers
  • Update the Auto Scaling group to have a minimum and maximum number of instances of 0
  • Wait for the instances registered with the Auto Scaling group to be terminated
  • Delete the Auto Scaling group
  • Delete the launch config

To do this with the examples we used above we would issue the following commands:

as-delete-trigger auto-scaling-test --auto-scaling-group auto-scaling-test

as-update-auto-scaling-group auto-scaling-test --min-size 0 --max-size 0

as-delete-auto-scaling-group auto-scaling-test

as-delete-launch-config auto-scaling-test

With those 4 commands you can easily delete your Auto Scaling group as well as any launch configs or triggers that are associated with it.


Auto Scaling provides an easy and efficient way to grow and shrink your hosting solution based on your current traffic. When you are hit with unexpected traffic, Auto Scaling provides a failsafe by automatically launching new instances and scaling up your solution to meet the new demand. When the traffic subsides, Auto Scaling scales your solution back down so that you are not wasting money running more instances than you require.

One thing to note: if you know beforehand that you will receive a traffic spike at a specific time, it may be more beneficial to launch new instances manually before the spike. This saves your system from getting hammered before Auto Scaling can launch new instances to cope with the additional load. If you rely on Auto Scaling alone in this scenario, many requests at the start of the spike could time out or fail, as the minimum number of instances likely won't be able to handle the traffic load.

Load Balancing with Amazon Web Services

In my previous post I talked about load balancing with HAProxy. In this post I am going to discuss load balancing with Amazon Web Services, using Elastic Load Balancing.

Elastic Load Balancing vs HAProxy

Elastic Load Balancing (ELB) was introduced by Amazon Web Services in May of 2009. Elastic Load Balancing distributes traffic across multiple EC2 instances and scales up as your traffic increases. This all sounds like a perfect load balancing solution, however it does present a few issues. The rate at which ELB scales up is often a little slow. If your website typically does not get a lot of traffic and then receives a giant spike, it will likely take the load balancer some time to catch up to the demand. If your traffic increases are more gradual, however, ELB appears to scale up a lot more smoothly.

HAProxy, on the other hand, handles large traffic spikes very well. In addition, HAProxy gives you the power to fine-tune your load balancing, and offers additional features such as error pages, real-time statistics, and ACLs. Not everyone requires these features, so you will need to decide for yourself whether you want the easy solution or the more robust route. I will describe how to set up Elastic Load Balancing, as well as how to auto scale with ELB or with HAProxy.

Setting Up Elastic Load Balancing

Setting up Elastic Load Balancing is as easy as running a single command using the Amazon API Tools. To create a new load balancer you will run the elb-create-lb command. When you run the command you will specify the name of the new load balancer, the availability zones you want to use, the incoming port and the protocol to use for it, and the port the traffic should be sent to on the instance. For example if we want to create a new load balancer called test-balancer, in the us-east-1a zone, that is listening on port 80 for http traffic, and will send traffic to port 80 on our instances we would run:

elb-create-lb test-balancer --availability-zones us-east-1a --listener "lb-port=80,instance-port=80,protocol=http"

Executing this command will return the DNS name of the new load balancer. You point your traffic at this DNS name so that it gets distributed by the load balancer. Unfortunately, the recommended way to do this is via a DNS CNAME, which is a problem because you cannot use a CNAME with a root domain: you could not point a bare domain such as example.com at the load balancer, but you could point a subdomain such as www.example.com at it. Using a subdomain appears to be the common workaround. The other option is to look up the IP address behind the DNS name and point at that directly, but with this approach you have to constantly monitor the DNS name from Amazon, as they do not use static IPs for Elastic Load Balancing.

Once you have the load balancer set up, you will need to associate running EC2 instances with it. To do this you run the elb-register-instances-with-lb command, which registers instances with a specific load balancer. For example, if we had 2 instances with instance IDs i-123456 and i-123457 that we wanted to associate with the test-balancer load balancer, we would run:

elb-register-instances-with-lb test-balancer --instances i-123456 i-123457

In the same manner you can use the elb-deregister-instances-from-lb command to remove instances from a given load balancer. Finally, if you wish to view all the load balancers you currently have set up, you can use the elb-describe-lbs command.

Ultimately, it really is that easy to set up load balancing with Amazon Web Services. If you do not require all the features of HAProxy and want something quick and easy to set up, Elastic Load Balancing cannot be beat.

Load Balancing with HAProxy

There comes a time for many websites when their single hosting solution just doesn’t provide enough power to handle the amount of traffic they are receiving.

Scaling Basics

One solution is to simply get a more powerful server. This is an example of vertical scaling. However it is generally expensive, and it doesn't allow you to scale nicely when you reach your next performance plateau. Typically you want to avoid scaling this way, as it is usually less cost effective and harder to continue to scale.

The alternative and preferred method is horizontal scaling. Rather than adding more hardware to your single server, you add additional servers. To do this you will need one server (or two or more if you want redundancy) to act as a load balancer, and then several other servers to host the content (let's call these the application servers). As you continue to grow your website and need more power, you simply add another application server and modify the load balancer's configuration. It is that simple.

Configuring Load Balancing with HAProxy

HAProxy is a lightweight, high performance TCP/HTTP load balancer. It can be run on a budget server without any problems. Under extreme traffic you are more likely to saturate your network before your server running HAProxy runs out of memory or hits a performance wall.

Setting up HAProxy is pretty easy. On most current Linux distributions you can simply install it via the package manager (apt, yum, etc). If you have to compile HAProxy from source, note that it builds with a plain Makefile: a simple make TARGET=<your platform> followed by make install.

After installing the application you will want to edit your configuration file. The configuration file can be broken up into 4 parts: global settings, default settings, frontend settings, and backend settings.

Global Settings

In your global settings you are going to setup options such as the user and group to run the application as, where to log to, whether to run as a daemon or not, and so on. Here is a typical global section that I use:

global
        log local0
        log local1 notice
        maxconn 50000
        user haproxy
        group haproxy
        daemon
        chroot /var/chroot/haproxy

The two log lines send logging to the local syslog daemon. The maxconn line sets the maximum per-process number of concurrent connections; it is important to set this number high so that users don't go unserviced. The user and group lines set the user and group HAProxy runs as. Finally, daemon tells HAProxy to run in the background, and chroot confines the process to /var/chroot/haproxy for security reasons.

Default Settings

A “defaults” section sets default parameters for all other sections that follow its declaration, so you can use it to set up common configuration before fine-tuning your frontends and backends. Here is a typical defaults section I use:

defaults
        log     global
        mode    tcp
        option  httplog
        option  forwardfor
        option  redispatch
        retries 2
        maxconn 50000
        timeout connect 10000
        timeout client  30000
        timeout server  60000
        stats uri /ha_stats
        stats realm Global\ statistics
        stats auth myusername:mypassword

The log global line tells HAProxy that all logging will be the same as the global settings, which ensures that all requests get logged.
The mode line sets the default connection mode to tcp.
The option httplog will give you more in depth logging compared to the very sparse logs that are produced otherwise.
The option forwardfor will enable the X-Forwarded-For header to get sent to your application servers. This is important if you want to log the remote ip address of your requester on your application servers.
The retries option sets the number of attempts HAProxy will try when connecting to an application server. Best to keep this to a low number of 2 or 3.
The redispatch option will allow the final connection attempt to be made to a different application server. This is helpful if one of your application servers is under high load, as the request could then be sent to another server.
The maxconn line is the same as the global section.
The next 3 lines specify timeouts (values are in milliseconds). timeout connect is how long HAProxy waits when making the initial connection to an application server; timeout client is how long a client may stay inactive before it is timed out; and timeout server is how long the application server may take to respond before it is timed out. You may want a higher timeout server if you run longer requests. For example, in Drupal, running the Drupal cron can take a long time before the server sends any response data back, so a high value here is needed to keep the request from timing out.
Finally, the last 3 lines set up a web accessible area where you can view HAProxy statistics. stats uri states which URI to serve the stats at (make it unique so it won't collide with any webpages you are load balancing), stats realm is the authentication realm shown in the password prompt, and stats auth specifies the username and password.

Frontend Settings

The frontend settings are used to describe sets of listening sockets that will accept connections from a client. A typical frontend section I like to use is:

frontend www *:80
        maxconn 40000
        mode http
        default_backend www_farm

The initial declaration gives the frontend a name (www) and sets it to listen on port 80 on all IPs.
The maxconn option is similar to the one above. As with the global setting, you will want to set it extremely high so that clients don't time out while trying to connect.
The mode I am now setting to http. By setting the mode to http additional checks are made to make sure the request is a valid http request.
Finally the default_backend specifies what backend set of application servers will have this traffic load balanced to. Note that a single backend section can have multiple application servers.

Backend Settings

The backend settings are used to specify a set of application servers that will handle the requests being load balanced through the frontend. Here is a typical backend section:

backend www_farm
        mode http
        balance roundrobin
        server backend-server-1 maxconn 250 check
        server backend-server-2 maxconn 250 check

The backend option determines the name of the backend you are defining. Note you use this name in the frontend section.
The next line is once again setting the mode to http.
The balance option sets the type of load balancing you want to do. I find roundrobin to be the most effective. It will send each new request to a different application server in order. There are many different balancing types you can set, and it is best to view the documentation to find the one that suits you best.
Finally, the server lines specify each application server in this backend that will receive traffic. You give the server a name, then its IP and port, and then some optional settings. I like to set maxconn, the maximum number of connections the application server can receive at a given time, which helps prevent your application servers from being overrun with requests. I also like to add the check setting, which periodically checks whether the application server is up and healthy; if a server fails this check 3 times it is removed from the load balancing rotation. This is also helpful for allowing servers to cool down when you are getting hit with a lot of requests.

The Full Config

As you can see, HAProxy is really easy to set up and configure. It has a lot of features that I didn't cover, and I suggest checking out the documentation on the HAProxy website. Here is the full config from above:

global
        log local0
        log local1 notice
        maxconn 50000
        user haproxy
        group haproxy
        daemon
        chroot /var/chroot/haproxy

defaults
        log     global
        mode    tcp
        option  httplog
        option  forwardfor
        option  redispatch
        retries 2
        maxconn 50000
        timeout connect 10000
        timeout client  30000
        timeout server  60000
        stats uri /ha_stats
        stats realm Global\ statistics
        stats auth myusername:mypassword

frontend www *:80
        maxconn 40000
        mode http
        default_backend www_farm

backend www_farm
        mode http
        balance roundrobin
        server backend-server-1 maxconn 250 check
        server backend-server-2 maxconn 250 check

Advanced Features of HAProxy

The above configuration is a pretty basic configuration for HAProxy and is enough to get a pretty solid load balancer up and running. There are a few additional settings I find useful however, which I will outline below.

Error Files

The errorfile option in HAProxy allows you to specify an HTML page to display to a client when an application server returns an error code. If you have ever seen Twitter's whale messages saying their servers are over capacity, you will know what I am talking about. An errorfile can be defined in any section of the HAProxy config file; I typically set it up under the defaults section unless I want different error pages for different sites. You can set up custom messages for the following codes: 400, 403, 408, 500, 502, 503, and 504. The directive looks like this:

        errorfile 400 /path/on/haproxy/server/400.http

Where 400 is the error code, and 400.http is the HTTP response file to serve. It is important to note that the file should contain the full HTTP response sent back to the user. For example, here is a sample 400.http file:

HTTP/1.0 400 Bad request^M
Cache-Control: no-cache^M
Connection: close^M
Content-Type: text/html^M
<html><body><img src="" /><h1>400 Bad request</h1>
Your browser sent an invalid request.

It is important that you include the proper carriage returns for an HTTP response. It is also highly important not to display images that are served through your load balancer. If you are showing an over-capacity error because all your application servers are under very high load, an image served through the load balancer sends yet another request to your already overloaded application servers, causing even more load. That is why it is important to host these images off site, such as on a CDN or a free image host like ImageShack.


ACLs

ACLs are what make HAProxy so versatile as a load balancer. With ACLs you can define tests and send traffic to different backends based on the results of those tests. For example, you may have a set of application servers running Apache/PHP hosting one group of websites, and another application server running Ruby on Rails for a different group. Rather than running both Apache/PHP and Ruby on Rails on every application server, you can keep two separate groups of servers: one running Apache/PHP, the other running Ruby on Rails. And rather than running 2 load balancers to split the traffic, you can set up an ACL in your HAProxy config to send traffic to the correct servers and have everything flow through your single HAProxy server.

So let's pretend we have 4 servers: 2 running Apache/PHP and 2 running Ruby on Rails. We will set up 2 backends in our configuration for these servers. It will look something like this:

backend apache_php_farm
        mode http
        balance roundrobin
        server apache_server0 maxconn 250 check
        server apache_server1 maxconn 250 check

backend ruby_on_rails_farm
        mode http
        balance roundrobin
        server ruby_server0 maxconn 250 check
        server ruby_server1 maxconn 250 check

We now have our backends defined, so we need an ACL to send traffic to the correct one. Let's say a couple of your websites run on Ruby on Rails, and the rest run on Apache/PHP. We can set up an ACL that matches the Ruby on Rails domains and send all other traffic to the Apache/PHP servers. This is done in the frontend section, and looks something like the following:

frontend www *:80
        maxconn 40000
        mode http
        acl ruby hdr_sub(host) rails-site1.example.com rails-site2.example.com
        use_backend ruby_on_rails_farm if ruby
        default_backend apache_php_farm

With 3 lines in the config we have now separated our Ruby traffic from our Apache/PHP traffic. Let's dissect those configuration lines.

The acl line (under mode http) starts by giving the ACL a name; we named ours ruby. The next option sets the matching criterion. In our case we used hdr_sub(host), which returns true if the Host header of the request contains one of the strings that follow it. Finally we list the domains we want to match the host against.

On the next line in the config we tell the load balancer to use the Ruby on Rails backend if our acl matched true. Finally we set the default backend to be our Apache/PHP servers.

There are a lot of different criteria you can use for ACLs, so I highly advise looking at the HAProxy documentation for further clarification.

Wrapping Up

HAProxy is an easy load balancer to set up and configure. It is fast, lightweight, and extremely powerful. In a future blog post I will show you how, with a simple cron script, you can auto-generate your HAProxy configuration so that you can auto-scale your application using Amazon Web Services.

Resyncing a Slave to a Master in MySQL with Amazon Web Services

My latest blog post on Resyncing a Slave to a Master in MySQL made me think of how I could achieve the same result in the cloud using Amazon Web Services (AWS). Of note, the previous guide works just fine in the cloud to resync a slave and a master, however there is an alternative approach using the tools that AWS provides.

Let's assume that in the cloud we had two EC2 instances running MySQL in a master-slave configuration, and that each server was using Elastic Block Storage (EBS). The procedure would be very similar, apart from how the data is transferred from one machine to the other.

Amazon Web Services provides the ability to create snapshots of an EBS volume. Creating a snapshot is quick, and once it has been created, you can mount the snapshot on any of your instances. This makes the backup portion even easier.

From my previous article you would proceed with steps 1-6. Then, instead of step 7, you would first create a snapshot of the slave's EBS volume. You can do this either via the Firefox ElasticFox plugin or with the command line tools as follows:

ec2-create-snapshot VOLUME_ID -d "Mysql snapshot for resync of slaves"

Once the snapshot is created you then need to create a new volume for the snapshot. Again in ElasticFox this is as easy as right clicking the snapshot and selecting “create a new volume from this snapshot”. From command line:

ec2-create-volume --snapshot SNAPSHOT_ID -z ZONE

Finally our last step is to attach this volume to the master MySQL instance and mount it. In ElasticFox you right click the new volume and select “attach this volume”. From command line you would run:

ec2-attach-volume VOLUME_ID -i INSTANCE_ID -d DEVICE

Once the volume is attached you can then mount it on the master server to a temporary folder, copy over all of the folders you need to the mysql directory, and proceed with step 8 of my previous guide. When everything is working correctly you can then unmount the newly created volume from the instance, and delete it.

Overall the two approaches are very similar, and both will work. File transfer between instances in the cloud is extremely quick, however so is creating new snapshots and attaching them to instances. Ultimately it is up to you to decide which route to take.

Resyncing a Slave to a Master in MySQL

Recently one of my clients had a problem with their MySQL setup. They were running a two server setup where one server was acting as a master server that was replicating to the second server which was acting as a slave. Unfortunately they ran a really bad query on their master server which pushed the load extremely high and made the server unresponsive. They had their hosting provider restart the master server, which unfortunately caused table corruption, so when the server restarted their database was in an unusable state.

This is where I came in. I shut down the MySQL instance running on that server and began a myisamchk to repair the corruption. However I quickly realized that this was going to take a very long time, as they had an extremely large database. So we agreed to switch their application from the master database over to the slave to limit the amount of downtime they would have.

Unfortunately when you switch to a slave your master and slave become out of sync. This was known when we did the switch, and was done to retain their uptime. So how do you re-sync your slave to the master, and get the replication up and going again? Here is my quick guide.

  1. Since the re-syncing of the servers will cause downtime, it is important to schedule a time to do the sync, and notify any major clients on your server that will be affected by taking down the database. A maintenance page should also be created if your load balancer doesn’t automatically display one.
  2. When it is time to do the sync you should setup your maintenance page to display from all sites that utilize the database.
  3. On the slave server, log into MySQL and reset the slave before issuing a read lock.
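
    On the slave the statements would look something like this (slave
    reset first, then the read lock):

        STOP SLAVE;
        RESET SLAVE;
        FLUSH TABLES WITH READ LOCK;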


  4. Shut down MySQL on the slave server (and shut down the master server if it was running).
  5. If your application has a config file pointing to the database IP address, it is best to change that back to the master so that there is no chance writes start getting sent to the slave after you start back up the servers.
  6. Prepare the new mysql data directory on the master. What I generally do is backup the old data directory then create a new folder called mysql. I then copy the mysql database into this folder.
  7. Next you will want to dump/copy all your databases from the slave to the master. Mysqldump is the safest way to do this (yes, you will need to start the slave for this), however with large datasets it is usually not very time efficient, so I generally just transfer the database files directly. When transferring the files, make sure you don't send the file, any binary log or relay log files, or the mysql database. I find the fastest way to send data from one server to another is to use netcat. Here is a sample of how to use netcat to send a set of databases.

    On the Slave (from within the mysql data directory):

    tar c database1 database2 database3 … databasexx | nc -l 7878

    On the Master (from within the mysql data directory):

    nc slave-ip 7878 | tar xv

    This will tar all the databases and files and send them to the master via netcat.

  8. Once this transfer is complete you can start your master. Once it is started you will need to reset the master and then get the new coordinates to use when you start the slave (in case there are issues with the replication).
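
    On the master this would be something along the lines of:

        RESET MASTER;
        SHOW MASTER STATUS;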


    Record the information from the SHOW MASTER STATUS.

  9. Start the slave server, then check whether it is replicating data. If it is replicating fine you are set; if not, you will need to run the following command on the slave:
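
    Using the coordinates recorded from SHOW MASTER STATUS (the host,
    user, password, log file, and position below are placeholders):

        CHANGE MASTER TO
            MASTER_HOST='master-ip',
            MASTER_USER='replication-user',
            MASTER_PASSWORD='replication-password',
            MASTER_LOG_FILE='mysql-bin.000001',
            MASTER_LOG_POS=106;
        START SLAVE;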


  10. Release the table lock on the master server:

    unlock tables;

  11. Remove the maintenance page on your webservers and return your application to a usable state.

The step that takes the longest amount of time is generally the transfer of the database files. To estimate the amount of downtime you will have you can simply calculate how long it will take you to transfer the database from one server to another. Then add in a buffer in case of any problems that may arise in the process.
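As a back-of-the-envelope sketch of that estimate (the directory size and transfer rate here are assumed figures, not numbers from this incident):

```shell
# Estimate transfer time for a 50 GB data directory at ~100 MB/s
# (roughly a saturated gigabit link).
size_gb=50
rate_mb_per_s=100
seconds=$(( size_gb * 1024 / rate_mb_per_s ))   # 512 seconds
echo "$(( seconds / 60 )) minutes"              # prints: 8 minutes
```

Add a healthy buffer on top of this for the steps before and after the transfer, and for anything that goes wrong along the way.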

With the steps above you should be able to quickly and easily re-sync a master and slave, and have replication working between the two servers again.