Reducing Writes on Amazon RDS

Recently I transitioned a client from using an EC2-based MySQL server to Amazon RDS. After the migration I noticed a large number of writes to the system, which was degrading the performance of MySQL. Inserts and updates were taking multiple seconds to complete, and this was on a larger instance size than the EC2 solution.

I initially found this article, which suggested that changing the innodb_flush_log_at_trx_commit variable to either 0 or 2 could help solve the problem. However, even after this change the system was still writing extensively to disk.
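As a rough sketch of how you might check the current setting (the endpoint and credentials below are placeholders, and on RDS the value has to be changed through a DB parameter group rather than SET GLOBAL, since the master user does not have the SUPER privilege):

# 1 = write and flush the log to disk at every commit (safest, most I/O)
# 2 = write at every commit, flush to disk roughly once per second
# 0 = write and flush roughly once per second (least I/O)
mysql -h myinstance.rds.amazonaws.com -u admin -p \
  -e "SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit';"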

I dug a little deeper and found that in MySQL the large majority of queries were writing temporary tables, which would explain the extensive writing to disk. After analysing the previous my.cnf file and comparing it to the RDS instance, I realized I had failed to mirror all the variables I previously had set up, specifically tmp_table_size, max_heap_table_size and query_cache_size. Of the three, the one that had the most dramatic effect on performance and writes was query_cache_size. After setting this variable to what the EC2 instance was using, the CPU load and system writes dropped substantially.
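If you want to confirm whether temporary tables are spilling to disk on your own instance, the server status counters make it fairly easy to check. A minimal sketch, again with a placeholder endpoint and credentials:

# Created_tmp_disk_tables vs Created_tmp_tables shows how often implicit
# temporary tables are written to disk instead of staying in memory
mysql -h myinstance.rds.amazonaws.com -u admin -p \
  -e "SHOW GLOBAL STATUS LIKE 'Created_tmp%tables';"

# Compare these against the values in the old my.cnf
mysql -h myinstance.rds.amazonaws.com -u admin -p \
  -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN
      ('tmp_table_size','max_heap_table_size','query_cache_size');"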

See the following charts:

So if you run into a similar problem, try tweaking your query_cache_size and see if it affects your system writes as dramatically as it did for me.

Resyncing a Slave to a Master in MySQL with Amazon Web Services

My latest blog post on Resyncing a Slave to a Master in MySQL made me think about how I could achieve the same result in the cloud using Amazon Web Services (AWS). Of note, the previous guide would work just fine in the cloud to resync a slave and a master; however, there is an alternative way using the tools that AWS provides.

Let's assume that in the cloud we have two EC2 instances set up running MySQL in a master-slave configuration, and that each server is using Elastic Block Store (EBS). Essentially the procedure would be very similar, apart from the transfer of data from one machine to the other.

Amazon Web Services provides the ability to create snapshots of an EBS volume. Creating a snapshot is quick, and once it has been created, you can create a new volume from it and attach that volume to any of your instances. This makes the backup portion even easier.

From my previous article you would proceed with steps 1-6. Then, instead of step 7, you would first create a snapshot of the slave's EBS volume. You can do this either via the Firefox ElasticFox plugin or with the command-line tools as follows:

ec2-create-snapshot VOLUME_ID -d "Mysql snapshot for resync of slaves"

Once the snapshot is created you then need to create a new volume from the snapshot. Again, in ElasticFox this is as easy as right-clicking the snapshot and selecting “create a new volume from this snapshot”. From the command line:

ec2-create-volume --snapshot SNAPSHOT_ID -z ZONE

Finally, our last step is to attach this volume to the master MySQL instance and mount it. In ElasticFox you right-click the new volume and select “attach this volume”. From the command line you would run:

ec2-attach-volume VOLUME_ID -i INSTANCE_ID -d DEVICE

Once the volume is attached you can then mount it on the master server to a temporary folder, copy over all of the folders you need to the mysql directory, and then proceed with step 8 of my previous guide. When everything is working correctly you can unmount the newly created volume from the instance, detach it, and delete it.
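As a rough sketch of that mount-and-copy step, assuming the volume was attached as /dev/sdh and the master's MySQL data directory is /var/lib/mysql (both placeholders for your own setup):

# Mount the volume created from the slave's snapshot on a temporary folder
mkdir /mnt/slave-data
mount /dev/sdh /mnt/slave-data

# Copy over the database folders you need (the path inside the volume depends
# on where the data directory lived on the slave) and fix ownership
cp -a /mnt/slave-data/database1 /var/lib/mysql/
chown -R mysql:mysql /var/lib/mysql

# Once everything checks out, unmount so the volume can be detached and deleted
umount /mnt/slave-data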

Overall the two approaches are very similar, and both will work. File transfer between instances in the cloud is extremely quick, but so is creating new snapshots and attaching them to instances. Ultimately it is up to you to decide which route to take.

Resyncing a Slave to a Master in MySQL

Recently one of my clients had a problem with their MySQL setup. They were running a two-server setup where one server was acting as a master, replicating to the second server, which was acting as a slave. Unfortunately they ran a really bad query on the master server, which pushed the load extremely high and made the server unresponsive. They had their hosting provider restart the master server, which unfortunately caused table corruption, so when the server restarted their database was in an unusable state.

This is where I came in. I shut down the MySQL instance running on that server and started a myisamchk to correct the corruption. However, I quickly realized that this was going to take a very long time, as they had an extremely large database. So we agreed to switch their application over from using the master database to the slave to limit the amount of downtime they would have.

Unfortunately, when you switch to a slave, your master and slave become out of sync. We knew this when we did the switch, and it was done to retain their uptime. So how do you re-sync your slave to the master and get replication up and running again? Here is my quick guide.

  1. Since the re-syncing of the servers will cause downtime, it is important to schedule a time to do the sync, and notify any major clients on your server that will be affected by taking down the database. A maintenance page should also be created if your load balancer doesn’t automatically display one.
  2. When it is time to do the sync, you should set up your maintenance page to display on all sites that utilize the database.
  3. On the slave server, log into MySQL and make sure to stop and reset the slave before issuing a read lock.

    STOP SLAVE;
    RESET SLAVE;
    FLUSH TABLES WITH READ LOCK;

  4. Shut down MySQL on the slave server (and shut down the master server if it is running).
  5. If your application has a config file pointing to the database IP address, it is best to change that back to the master so that there is no chance writes start getting sent to the slave after you bring the servers back up.
  6. Prepare the new MySQL data directory on the master. What I generally do is back up the old data directory, then create a new folder called mysql and copy the mysql database (the system database containing the grant tables) into this folder. A rough sketch of this step is included after this list.
  7. Next you will want to dump/copy all of your databases from the slave to the master. Mysqldump is the safest way to do this (yes, you will need to start the slave for this); however, with large datasets it is usually not very time efficient, so I generally just transfer the database files directly. When transferring the files, make sure you don't send the master.info file, any binary log or relay log files, or the mysql database. I find the fastest way to send data from one server to another is to use netcat. Here is a sample of how to use netcat to send a set of database directories.

    On the Slave (from within the mysql data directory):

    tar cf - database1 database2 database3 … databasexx | nc -l 7878

    On the Master (from within the mysql data directory):

    nc slave-ip 7878 | tar xvf -

    This will tar all of the database directories and files and send them to the master via netcat.

  8. Once this transfer is complete you can now start your master. Once it is started, you will need to reset the master and then get the new coordinates to use when you start the slave (in case there are issues with the replication).

    RESET MASTER;
    SHOW MASTER STATUS;

    Record the log file name and position from the SHOW MASTER STATUS output.

  9. Start the slave server, then check to see if it is replicating data. If it is replicating fine you are set; if not, you will need to run the following on the slave to point it at the recorded coordinates and restart replication:

    CHANGE MASTER TO
    MASTER_HOST='host',
    MASTER_USER='user',
    MASTER_PASSWORD='password',
    MASTER_LOG_FILE='recorded_log_file_name',
    MASTER_LOG_POS=recorded_log_position;

    START SLAVE;

  10. Release the table lock on the slave server:

    unlock tables;

  11. Remove the maintenance page on your webservers and return your application to a usable state.
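As mentioned in step 6, here is a minimal sketch of preparing the new data directory on the master, assuming the data directory is /var/lib/mysql (paths and ownership will vary by distribution):

# Back up the old, corrupted data directory and create a fresh one
mv /var/lib/mysql /var/lib/mysql.old
mkdir /var/lib/mysql

# Keep the mysql system database (users and grants) from the old directory
cp -a /var/lib/mysql.old/mysql /var/lib/mysql/
chown -R mysql:mysql /var/lib/mysql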

The step that generally takes the longest amount of time is the transfer of the database files. To estimate the amount of downtime you will have, you can simply calculate how long it will take to transfer the database from one server to the other; as a rough example, 100 GB of data over a gigabit link works out to somewhere around 15 minutes. Then add in a buffer in case any problems arise in the process.

With the steps above you should be able to quickly and easily re-sync a master and slave, and have replication working between the two servers again.