It has been a while since I wrote on my blog, and I thought this would be a good addition to the knowledge base. I came across an online backup tool, wal-e, that I believe is worth writing about. It takes care of incremental backups, and base backups are compressed and sent across to S3, so there is no more writing shell scripts to ship them there, which I thought was pretty neat. I did struggle a bit to get it installed on a CentOS box and did not find much online help for it. Hopefully this will help someone get this module up and going in no time.
python (>= 2.6)
And of course, since we are talking about a Postgres database here, any Postgres version >= 8.4 should work with it.
First we have to get Python 2.6 or greater, which for some reason I could not get from the CentOS repo even after a yum update.
1. Download and build Python from the link below
wget http://python.org/ftp/python/2.7.6/Python-2.7.6.tar.xz

Extract the file and compile it:

tar -xf Python-2.7.6.tar.xz
cd Python-2.7.6
./configure
make
sudo make install

NOTE: if you are having trouble extracting the tar.xz file, install xz first:

wget http://tukaani.org/xz/xz-5.0.5.tar.gz
tar -xzf xz-5.0.5.tar.gz
cd xz-5.0.5
./configure
make
sudo make install
2. We will need setuptools as well to build the module
wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py

Then install it for the Python 2.7 that you installed above:

sudo /usr/local/bin/python2.7 ez_setup.py
NOTE: Here I was set back again because the ez_setup.py script was trying to download setuptools with a certificate check. All I did was add --no-check-certificate to the wget command in the script, like so:
def download_file_wget(url, target):
    cmd = ['wget', url, '--no-check-certificate', '--quiet',
           '--output-document', target]
    _clean_check(cmd, target)
3. Install pip using the newly installed setuptools:
sudo easy_install-2.7 pip
4. Install virtualenv for Python 2.7 and then wal-e itself:

pip2.7 install virtualenv
sudo pip2.7 install wal-e
Now that we have wal-e installed, we are going to make a couple of configuration changes that will enable wal-e to work with S3.
1. Create an environment directory to use wal-e
mkdir -p /etc/wal-e.d/env
chown -R postgres:postgres /etc/wal-e.d
echo "secret_key_goes_here" > /etc/wal-e.d/env/AWS_SECRET_ACCESS_KEY
echo "access_id_for_s3_goes_here" > /etc/wal-e.d/env/AWS_ACCESS_KEY_ID
echo 's3://bucket_name/optional_directory_you_created_in_the_bucket' > /etc/wal-e.d/env/WALE_S3_PREFIX
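Note that envdir is not part of wal-e itself; it comes from the daemontools package (there is also a standalone envdir package on PyPI). All it does is run a command with one environment variable per file in the directory. A minimal sketch of that behavior, purely for illustration (use the real envdir in production):

```shell
# fake_envdir: illustrative sketch of what daemontools' envdir does.
# For each file in DIR, export an environment variable named after the
# file whose value is the file's contents, then run the given command.
fake_envdir() {
  dir=$1; shift
  for f in "$dir"/*; do
    # variable name = file name, value = file contents
    export "$(basename "$f")=$(cat "$f")"
  done
  "$@"
}
```

This is why the archive and backup commands below are all prefixed with `envdir /etc/wal-e.d/env`: it is how wal-e picks up the AWS credentials and the S3 prefix without them living in postgresql.conf.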
2. Since this is going to be an incremental backup setup, we have to turn archiving on in postgresql.conf.
wal_level = archive
archive_mode = on
archive_command = 'envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-push %p'
archive_timeout = 60
NOTE: you will have to restart your Postgres database so that these changes are picked up.
That's it! Now you can take a base backup and forget about the incrementals, as wal-e automatically ships the WAL files to S3 :). That is thanks to the archive_command we set up in postgresql.conf.
1. To take a base backup:
su postgres
envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-push /path/to/your/datadir

You can always list the backups that you have on S3 with:

envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-list

name                                   last_modified            expanded_size_bytes wal_segment_backup_start wal_segment_offset_backup_start wal_segment_backup_stop wal_segment_offset_backup_stop
base_00000001000000AD000000C7_00000040 2014-03-06T17:51:26.000Z                     00000001000000AD000000C7 00000040
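You will probably want base backups taken on a schedule rather than by hand. A hypothetical crontab entry for the postgres user could look like the following; the data directory path and the 01:00 schedule are assumptions, so adjust them for your setup:

```conf
# Hypothetical: take a base backup every night at 01:00
# (crontab -e as the postgres user; /var/lib/pgsql/data is an assumed path)
0 1 * * * envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-push /var/lib/pgsql/data
```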
2. Deleting or retaining a number of backups is easy as well.
If you want to delete a specific backup:

wal-e delete [--confirm] base_00000004000002DF000000A6_03626144

Or you can delete backups older than a given base backup by using the before clause:

wal-e delete [--confirm] before base_00000004000002DF000000A6_03626144

Or retain only a set number of the most recent backups:

wal-e delete [--confirm] retain 5
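The retention policy can be scheduled the same way as the base backup. A sketch of a hypothetical weekly cron entry (the day and time are assumptions; note that delete also needs the envdir prefix so it can reach S3):

```conf
# Hypothetical: every Sunday at 02:00, keep only the 5 most recent base backups
0 2 * * 0 envdir /etc/wal-e.d/env /usr/local/bin/wal-e delete --confirm retain 5
```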
3. Restoring using backup-fetch
To restore the complete database on a separate box:

envdir /etc/wal-e.d/env wal-e backup-fetch /path/to/your/datadir LATEST

(backup-fetch takes the target data directory and a backup name from backup-list; LATEST picks the most recent one.) WAL fetch can also be accomplished with wal-e:

envdir /etc/wal-e.d/env wal-e wal-fetch wal_segment_name /path/to/write/the/segment/to
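In practice you usually do not run wal-fetch by hand; Postgres calls it for you during recovery. A sketch of the recovery.conf on the restore box, assuming the same env directory and install path as above:

```conf
# recovery.conf on the box being restored (hypothetical, matches the
# paths used earlier in this post); %f/%p are filled in by Postgres
restore_command = 'envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-fetch "%f" "%p"'
```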
There are a couple more things that can be done with wal-e, like encrypting backups, managing tablespace backups (this is important if you have user-defined tablespaces in your database), controlling the I/O of base backups, increasing the throughput of wal-push, etc. You might want to look into those options before putting this in production, as base backup I/O can take a decent amount of CPU overhead if not configured properly. Here is the link that will help with further information on this module.
Feel free to ask questions, and I hope this helped.