Archive for category PostgreSQL upgrade

PGREPLAY (replicating production load in a test environment)

Ever wondered why all the testing passes ok but when it comes to real production load things start to look bad on the database. Only if we knew how to replicate the database load in real time on a testing environment we would have some data to analyze. I was given the same task and pgreplay came to rescue.
I find this tool very helpul when:

–> I want to test a new version of PostgreSQL database
–> Or even when I need to have some statistics around how the database would behave on a new hardware.
–> Also, if I make a configuration change to the database how would it behave in a real time load scenario.

All of the above scenarios are worth spending time and you can have a clear idea how your database will behave when its live in production.

About the module:

pgreplay reads the SQL statements in a PostgreSQL log file, extracts them and executes them in the same order and relative time against a PostgreSQL database cluster.

Installation:

Assuming you have a test box with the same data as production and are using postgres as your database.
Download and install pgreplay from the below link(I would install this on a test box and not production)

su - postgres
wget http://pgfoundry.org/frs/download.php/3345/pgreplay-1.2.0.tar.gz
tar -xvf pgreplay-1.2.0.tar.gz
./configure --with-postgres=/opt/PostgreSQL/9.2.4/bin
make
sudo make install

Configuration:

pgreplay needs to read a postgres log file and that log has to be in a specific format. The change would need a restart of you postgres database. Make a copy of the postgresql.conf file before you go ahead with this change.

log_destination = 'stderr'
logging_collector = on
log_directory = '/somemountpoint/pg_log' (this is the location where your postgres log file will be generated. Make sure postgres user has write privs)
log_min_duration_statement = 100 (this value is set to milliseconds. If you want to log all statements put 0)
log_min_messages = error  (or more)
   (if you know that you have no cancel requests, 'log' will do)
log_min_error_statement = log  (or more)
log_connections = on
log_disconnections = on
log_line_prefix = '%m|%u|%d|%c|'  (if you don't use CSV logging)
log_statement = 'all'
lc_messages must be set to English (the encoding does not matter)
bytea_output = escape  (from version 9.0 on, only if you want to replay
                        the log on 8.4 or earlier)

Thats it! restart postgres and logging will start the way pgreplay can parse the statements.

Usage:

There are two ways to replay the postgres logs to another postgres database

1. pgreplay [<parse options>] [<replay options>] [<infile>] 

This will parse a PostgreSQL log file and replays the
statements against a database.

2. pgreplay -f [<parse options>] [-o <outfile>] [<infile>]

This will parse a PostgreSQL log file and writes the
contents to a “replay file” that can be replayed with -r

I like the second option better because then I can just copy the parsed file to any database and do whatever testing is necessary
Now that you have a couple of postgres log files in the log location you specified in postgresql.conf file all you have to do is copy the logs over to a test box that has pgreplay installed and run the below procedure:

I first create a parse file:

pgreplay -f -o test.replay /somemountpoint/pg_log/postgresql.log
notice you will find the test.replay log file in the current working directory

Now just replay the test.replay parse file and watch the postgres database replay all the statements from production to a test box:

pgreplay -r -j test.replay

There are a lot more options that you can try out with pgreplay I will not be explaining them all but feel free to ask any questions about the module. More information on pgreplay is available here.

Advertisements

3 Comments

Upgrade PostgreSQL using pg_upgrade

Overview:

Moving forward from the 8.x release, 9.1 is a major release and provides a wide range of features, bug fixes and enhancements in terms of performance and manageability of the PostgreSQL database. A list of all the improvements in this feature and what to expect is given in the link below:

http://www.postgresql.org/docs/9.1/static/release-9-1-3.html

I have listed some of the major bug fixes that you can benefit from in 9.1 as going through the huge list is not what everyone would want to see in this documentation.

* Fix btree index corruption from insertions concurrent with vacuuming (Tom Lane)

An index page split caused by an insertion could sometimes cause a concurrently-running VACUUM to miss removing index entries that it should remove. After the corresponding table rows are removed, the dangling index entries would cause errors (such as “could not read block N in file …”) or worse, silently wrong query results after unrelated rows are re-inserted at the now-free table locations. This bug has been present since release 8.2, but occurs so infrequently that it was not diagnosed until now. If you have reason to suspect that it has happened in your database, reindexing the affected index will fix things.

* Fix handling of data-modifying WITH subplans in READ COMMITTED rechecking (Tom Lane)

A WITH clause containing INSERT/UPDATE/DELETE would crash if the parent UPDATE or DELETE command needed to be re-evaluated at one or more rows due to concurrent updates in READ COMMITTED mode.

* Fix CLUSTER/VACUUM FULL handling of toast values owned by recently-updated rows (Tom Lane)

This oversight could lead to “duplicate key value violates unique constraint” errors being reported against the toast table’s index during one of these commands.

* Avoid crashing when we have problems deleting table files post-commit (Tom Lane)

Dropping a table should lead to deleting the underlying disk files only after the transaction commits. In event of failure then (for instance, because of wrong file permissions) the code is supposed to just emit a warning message and go on, since it’s too late to abort the transaction. This logic got broken as of release 8.4, causing such situations to result in a PANIC and an unrestartable database.

* Recover from errors occurring during WAL replay of DROP TABLESPACE (Tom Lane)

Replay will attempt to remove the tablespace’s directories, but there are various reasons why this might fail (for example, incorrect ownership or permissions on those directories). Formerly the replay code would panic, rendering the database unrestartable without manual intervention. It seems better to log the problem and continue, since the only consequence of failure to remove the directories is some wasted disk space.

* Fix planner’s ability to push down index-expression restrictions through UNION ALL (Tom Lane)

This type of optimization was inadvertently disabled by a fix for another problem in 9.1.2.

* Fix planning of WITH clauses referenced in UPDATE/DELETE on an inherited table (Tom Lane)

This bug led to “could not find plan for CTE” failures.

* Fix GIN cost estimation to handle column IN (…) index conditions (Marti Raudsepp)

This oversight would usually lead to crashes if such a condition could be used with a GIN index.

* Fix I/O-conversion-related memory leaks in plpgsql (Andres Freund, Jan Urbanski, Tom Lane)

Certain operations would leak memory until the end of the current function.

Deployment methods:

PG_UPGRADE module: this module has been a part of contrib and has been well tested since 8.4 to 9.0, 9.1 and will be a part of the upgrade process for all postgresql major upgrades. There are 2 ways to use this module with the upgrade plan.

a. PG_UPGRADE without in place upgrade: this method does a quick scan on all the data files, updates the headers and copies them over to the 9.1 data directory. This upgrade takes around 1/3 rd the time than it takes to do a regular pg_dump or restore. The drawback is it needs double the space on the server where the upgrade is happening.

b. In place upgrade with pg_upgrade module. This is the fastest way to do a major upgrade. The process behind this upgrade path is that the pg_upgrade module uses the same data directory and converts the data header files to be 9.1 compatible. There is one data directory and no copy process, hence reducing the amount of disk space required for an upgrade. Drawback of this process is if there is an error during the upgrade process (which is unlikely if all the steps are followed) the data directory is in an unusable state and clean up is required to restart the upgrade. The best approach here would be to take a backup of the database just prior to starting the upgrade. I will list the various options in the upgrade section below.

Upgrading PostgreSQL Server to 9.1:

Prerequisites:

* Install PostgreSQL 9.1.3 using source as:

./configure '--prefix=/opt/PostgreSQL/9.1.3 '--with-openssl' '--with-perl' '--with-python' '--with-tcl' '--with-pam' '--with-krb5' '--enable-thread-safety' '--with-libxml' '--with-ossp-uuid' '--docdir=/opt/PostgreSQL/9.1.3/doc/postgresql' '--with-libxslt'

make

make install

* After postgresql server has been installed go back to the source location and cd to contrib/pg_upgrade

compile pg_upgrade so that it is available in the 9.1.3 bin directory

make

make install

* Initialize the 9.1.3 cluster

./initdb –lc-ctype=en_US.UTF-8 –lc-collate=en_US.UTF-8 –encoding=UTF8 –xlogdir=/postgres/translogs/pg_xlog -D /datadirectorypath

Change port of the 9.1.3 database in the postgresql.conf file to anything other than the 8.4 db port

port=2525

* Now would be a good time to start a full backup of the 8.4 cluster

Upgrading inplace:

Before we start the upgrade we have to make sure both the instances 8.4 and 9.1 are shut down and then just issue the commands below:

Check for cluster compatibility: The below command will first check if the two clusters 8.4 and 9.1 are compatible for the upgrade.

./pg_upgrade -b /opt/PostgreSQL/8.4.9/bin/ -B /opt/PostgreSQL/9.1/bin/ -c -d /opt/PostgreSQL/8.4.9/data/ -D /opt/PostgreSQL/9.1/data/ -p 6543 -P 2525 -k -l /opt/PostgreSQL/9.1/log/upgrade.log -v

After the check is passed we can go ahead and start the upgrade:

./pg_upgrade -b /opt/PostgreSQL/8.4.9/bin/ -B /opt/PostgreSQL/9.1/bin/ -d /opt/PostgreSQL/8.4.9/data/ -D /opt/PostgreSQL/9.1/data/ -p 6543 -P 2525 -k -l /opt/PostgreSQL/9.1/log/upgrade.log -v

In case there are any issues with the upgrade the upgrade.log file can be checked for errors.

No inplace upgrade:

The above commands can be run without using -k and this will start the copy process of the data files once the ./pg_upgrade command is invoked.

Post installation:

Once the upgrade has been done successfully. Following steps will ensure we are using 9.1 instead of the old binaries:

  • Update the /etc/init.d/postgresql script to include the path for 9.1 instead of 8.4.9
  • Change postgresql port of 9.1 in postgresql.conf file to 6543
  • Update /home/postgres/.bash_profile (or any other location where you have set the postgres OS user profile file) to set the environment variables for 9.1
  • Make sure new changes have been implemented source .bash_profile ( or log out and login)
  • Start postgres with the service file sudo service postgresql start

,

1 Comment

%d bloggers like this: