Archive for category Backups

WAL-E (Incremental backups with S3 support)

It has been a while since I wrote on my blog, and I thought this would be a good addition to the knowledge base. I came across an online backup tool, WAL-E, that I believe is worth writing about. It takes care of incremental backups, and it compresses base backups and sends them to S3. No more writing shell scripts to ship backups to S3, which I thought was pretty neat. I did struggle a bit to get it installed on a CentOS box and did not find much help online. Hopefully this post will help someone get the module up and running in no time.

Dependencies:

python (>= 2.6)
lzop
pv
And of course we are talking about a Postgres database here, so any Postgres version >= 8.4 should work with it.

First we have to get Python 2.6 or greater, which for some reason I could not get from the CentOS repo even after a yum update.

1. Download Python from the link below

wget http://python.org/ftp/python/2.7.6/Python-2.7.6.tar.xz

Extract the file and compile it with
xz -d Python-2.7.6.tar.xz
tar -xf Python-2.7.6.tar
cd Python-2.7.6
./configure
make
sudo make install

NOTE: if you are having trouble extracting the tar.xz file, install xz first
wget http://tukaani.org/xz/xz-5.0.5.tar.gz
tar -xzf xz-5.0.5.tar.gz
cd xz-5.0.5
./configure
make
sudo make install

2. We will need Setuptools as well to have our module compiled

wget https://bitbucket.org/pypa/setuptools/raw/bootstrap/ez_setup.py

Then install it for Python 2.7 that you installed above.
sudo /usr/local/bin/python2.7 ez_setup.py

NOTE: Here I was set back again because the ez_setup.py script was trying to download setuptools with a certificate check. All I did was add --no-check-certificate to the wget call in the script (which skips certificate verification for that download), like this:

def download_file_wget(url, target):
    cmd = ['wget', url, '--no-check-certificate', '--quiet', '--output-document', target]
    _clean_check(cmd, target)

3. Install pip using the newly installed setuptools:

sudo easy_install-2.7 pip

4. Install virtualenv for Python 2.7

pip2.7 install virtualenv

5. Install wal-e itself

sudo pip install wal-e

Now that we have wal-e installed we are going to make a couple of configuration changes that will enable wal-e to work with S3.

1. Create an environment directory to use wal-e

mkdir -p /etc/wal-e.d/env
chown -R postgres:postgres /etc/wal-e.d
echo "secret_key_goes_here"> /etc/wal-e.d/env/AWS_SECRET_ACCESS_KEY
echo "access_id_for_s3_goes_here"> /etc/wal-e.d/env/AWS_ACCESS_KEY_ID
echo 's3://specify_bucket_name/directory_if_you_have_created_one_in_the_bucket' > /etc/wal-e.d/env/WALE_S3_PREFIX
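
Under the usual 022 umask the commands above leave the key files readable by every local user. A minimal sketch of a tighter variant follows. It assumes envdir (which ships with daemontools and has to be installed separately) takes the first line of each file as the value of the variable named after the file. ENV_DIR is a name introduced here, and it defaults to a scratch directory so the sketch can be tried without root; on a real server it would be /etc/wal-e.d/env.

```shell
#!/bin/sh
# Sketch only: populate an envdir-style directory with owner-only permissions.
# ENV_DIR is a hypothetical variable; it defaults to a throwaway directory.
ENV_DIR="${ENV_DIR:-$(mktemp -d)}"

umask 077                      # new files: readable and writable by the owner only
mkdir -p "$ENV_DIR"
chmod 700 "$ENV_DIR"

# One file per variable: the file name is the variable name, the first line its value.
printf '%s\n' 'secret_key_goes_here'               > "$ENV_DIR/AWS_SECRET_ACCESS_KEY"
printf '%s\n' 'access_id_for_s3_goes_here'         > "$ENV_DIR/AWS_ACCESS_KEY_ID"
printf '%s\n' 's3://specify_bucket_name/directory' > "$ENV_DIR/WALE_S3_PREFIX"

ls -l "$ENV_DIR"
```

If the files are created as root, the directory still has to be handed to the postgres user afterwards (chown -R postgres:postgres), since it is postgres that runs the archive command.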

2. Since this is going to be an incremental backup setup, we have to turn archiving on in postgresql.conf.

wal_level = archive
archive_mode = yes
archive_command = 'envdir /etc/wal-e.d/env /usr/local/bin/wal-e wal-push %p'
archive_timeout = 60

NOTE: you have to restart your Postgres database for these changes to take effect. The wal_level setting exists only in Postgres 9.0 and later, so leave it out on 8.4.

That's it! Now you can take a base backup and forget about the incrementals, as wal-e automatically ships the WAL files to S3 :). The reason is the archive_command we set up in the postgresql.conf file.

1. To take a base backup:

su postgres
envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-push /path/to/your/datadir

You can always list the backups that you have on S3 with:
envdir /etc/wal-e.d/env /usr/local/bin/wal-e backup-list

name last_modified expanded_size_bytes wal_segment_backup_start wal_segment_offset_backup_start wal_segment_backup_stop wal_segment_offset_backup_stop
base_00000001000000AD000000C7_00000040 2014-03-06T17:51:26.000Z 00000001000000AD000000C7 00000040
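
Because the listing is plain whitespace-separated text with the backup name in the first column and an ISO 8601 timestamp in the second, it is easy to script against. Here is a minimal sketch that picks out the newest backup name. latest_backup is a helper made up for this post, and the second sample row is invented purely so the sort has something to do.

```shell
#!/bin/sh
# latest_backup: read `wal-e backup-list` output on stdin and print the
# name of the most recently modified base backup.
latest_backup() {
    # Skip the header, emit "timestamp name", sort (ISO 8601 sorts as text),
    # keep the last line, and print only the name.
    awk 'NR > 1 && NF >= 2 { print $2, $1 }' | LC_ALL=C sort | tail -n 1 | awk '{ print $2 }'
}

# Sample input: header and first row from the listing above; the second row is invented.
sample='name last_modified expanded_size_bytes wal_segment_backup_start wal_segment_offset_backup_start wal_segment_backup_stop wal_segment_offset_backup_stop
base_00000001000000AD000000C7_00000040 2014-03-06T17:51:26.000Z 00000001000000AD000000C7 00000040
base_00000001000000AD000000C9_00000040 2014-03-07T17:51:26.000Z 00000001000000AD000000C9 00000040'

printf '%s\n' "$sample" | latest_backup
```

Against a live bucket the input would come from the backup-list command itself, piped into latest_backup.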

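2. Deleting old backups or retaining a fixed number of them is easy as well. Without --confirm, wal-e delete only does a dry run and reports what it would remove.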

If you want to delete a specific backup
wal-e delete [--confirm] base_00000004000002DF000000A6_03626144

Or you can delete all backups older than a given base backup by using the before clause:
wal-e delete [--confirm] before base_00000004000002DF000000A6_03626144

To retain only the five most recent backups:
wal-e delete [--confirm] retain 5
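
Base backups and pruning are usually run on a schedule. Here is a minimal sketch of a nightly job, assuming the envdir directory and wal-e path used earlier in this post. nightly_backup, run, DRY_RUN, WALE_ENV and PGDATA_DIR are names invented here, and the data directory is a placeholder. With DRY_RUN left at its default of 1 the function only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of a nightly WAL-E job: push a base backup, then keep the newest five.
WALE_ENV=/etc/wal-e.d/env
WALE=/usr/local/bin/wal-e
PGDATA_DIR=/path/to/your/datadir   # placeholder, replace with the real data directory

# run: print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

nightly_backup() {
    # Prune only if the base backup succeeded.
    run envdir "$WALE_ENV" "$WALE" backup-push "$PGDATA_DIR" &&
    run envdir "$WALE_ENV" "$WALE" delete --confirm retain 5
}

nightly_backup
```

To use it for real, run it as the postgres user (from cron, for example) with DRY_RUN=0.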

3. Restoring using backup-fetch

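To restore the complete database on a separate box, pass the target data directory and a backup name (or LATEST for the most recent one):
envdir /etc/wal-e.d/env wal-e backup-fetch /path/to/your/datadir LATEST

WAL segments can also be fetched with wal-e, typically from the restore_command in recovery.conf:
restore_command = 'envdir /etc/wal-e.d/env wal-e wal-fetch "%f" "%p"'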

There are a few more things that can be done with wal-e, such as encrypting backups, managing tablespace backups (this is important if you have user-defined tablespaces in your database), controlling the I/O of a base backup, and increasing the throughput of wal-push. You might want to look into those options before putting this in production, as base backup I/O can take up a decent amount of CPU if not configured properly. Here is the link that will help with further information on this module.

Feel free to ask questions and hope this helped.


pg_rman install issues

pg_rman is a tool for taking incremental backups on Postgres, although I do not see any recent development on it since June 2011. I still wanted to install it and try it in a test environment to justify its usage. I ran into some issues during installation and could not find an answer to the errors I saw.

pg_rman can be downloaded from here.

Once downloaded, just untar and build it using:

tar -xvzf pg_rman-1.2.4.tar.gz

cd pg_rman-1.2.4

sudo make USE_PGXS=1

I got the error message below and started searching Google for help. All I could find was advice to put the Postgres binaries on your PATH, which mine already were. The issue, it seemed to me, was that the pg_rman Makefile could not find pg_config.

Also, make sure you have the postgresql-devel package installed, as that is the one that provides pg_config.

make: pg_config: Command not found
cc -c -o backup.o backup.c
In file included from backup.c:10:0:
pg_rman.h:12:25: fatal error: postgres_fe.h: No such file or directory
compilation terminated.
make: *** [backup.o] Error 1

I then modified the Makefile with the full path to pg_config and it just worked. Below is the change (the path to your pg_config could be different, so adjust it accordingly):

PG_CONFIG = /opt/PostgreSQL/8.4.9/bin/pg_config
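
Editing the Makefile is not the only option: a variable given on the make command line overrides an ordinary assignment inside the Makefile. One possible cause of the original failure, not confirmed here, is that sudo resets PATH so that pg_config is no longer visible to make. Here is a minimal sketch that locates pg_config first. find_pg_config and PG_CONFIG_PATH are names invented for this post, and the candidate directories are examples only.

```shell
#!/bin/sh
# find_pg_config: print the first pg_config found on PATH or in the
# directories given as arguments; return non-zero if none is found.
find_pg_config() {
    # Prefer whatever is already on PATH.
    if command -v pg_config >/dev/null 2>&1; then
        command -v pg_config
        return 0
    fi
    # Otherwise try each candidate directory in turn.
    for dir in "$@"; do
        if [ -x "$dir/pg_config" ]; then
            printf '%s\n' "$dir/pg_config"
            return 0
        fi
    done
    return 1
}

# Example directories only; adjust them to your installation.
if PG_CONFIG_PATH=$(find_pg_config /opt/PostgreSQL/8.4.9/bin /usr/pgsql-9.3/bin); then
    echo "would run: make USE_PGXS=1 PG_CONFIG=$PG_CONFIG_PATH"
else
    echo "pg_config not found; is postgresql-devel installed?"
fi
```

With the path in hand, the build from earlier becomes sudo make USE_PGXS=1 PG_CONFIG=/opt/PostgreSQL/8.4.9/bin/pg_config, with no change to the Makefile.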
