Archive for May, 2012

pg_top

pg_top is a utility similar to the UNIX top command, but it has much more to offer, and I find it very useful for monitoring a PostgreSQL database.

It is tedious to run top, grep for the postgres user, and look at the commands running, while keeping another session open on pg_stat_activity to figure out what the queries are doing. Instead, it is easier to run pg_top and get all of the below with a single command:

View currently running SQL statement of a process.

View query plan of a currently running SQL statement.

View locks held by a process.

View user table statistics.

View user index statistics.

Let's first install the tool:

--> wget http://pgfoundry.org/frs/download.php/1780/pg_top-3.6.2.tar.bz2

--> tar -xjvf pg_top-3.6.2.tar.bz2

--> cd pg_top-3.6.2

--> ./configure --prefix='path to where you want it installed'

--> make

--> make install

If you have environment variables configured for the postgres user, pg_top should pick up the port and binaries automatically.

If not, you can invoke pg_top as:

pg_top -p 5432 -U postgres -d database -W

Where ‘-d’ specifies the database to connect to, ‘-U’ specifies the database user, and ‘-W’ will prompt for the password if a password is required to connect.

Currently Running SQL Statement

Use the ‘Q’ key and enter a PID once pg_top is started.

Query Plan

To see the query plan, use the ‘E’ key and enter a PID once pg_top is started. The ‘E’ command runs ‘EXPLAIN’; ‘EXPLAIN ANALYZE’ can be run with the ‘A’ key followed by a PID.

Locks Acquired

Use the ‘L’ key and enter a PID once pg_top is started.

Table Statistics

Use the ‘R’ key to display user table statistics. Use ‘t’ to toggle between displaying cumulative and differential statistics.

Index Statistics

Use the ‘X’ key to display user index statistics. Use ‘t’ to toggle between displaying cumulative and differential statistics.
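Under the hood these displays come from the standard statistics views, so the same numbers can be pulled with plain SQL when pg_top is not available (a rough equivalent, not pg_top's exact queries):

```sql
-- Per-table activity, roughly what the 'R' display shows
SELECT relname, seq_scan, idx_scan, n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_user_tables
ORDER BY seq_scan + idx_scan DESC;

-- Per-index usage, roughly what the 'X' display shows
SELECT relname, indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC;
```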

Update for PostgreSQL 9.2: I tried to use pg_top against 9.2 and ran into an issue where pg_top does not work well with this version (mainly because some of the columns in pg_stat_activity have changed: procpid became pid, current_query became query, and a state column was added). I took the latest version and still had no luck, so I searched on Google and found a fix that worked for me. Here is what I did to make it work:

downloaded the latest pg_top source: http://pgfoundry.org/frs/download.php/1780/pg_top-3.6.2.tar.bz2

untarred it and applied the following patch before compiling:

curl -L http://ln-s.net/+6lb|gzip -dc|patch -p1 -d pg_top-3.6.2

then ran the normal configure, make, and make install.
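The column renames that break older pg_top are easy to see by querying pg_stat_activity directly on each version:

```sql
-- Pre-9.2
SELECT procpid, current_query FROM pg_stat_activity;

-- 9.2 and later (state distinguishes active queries from idle sessions)
SELECT pid, state, query FROM pg_stat_activity;
```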


Warm Standby

For users on PostgreSQL 8.x, a warm standby option is available. If there is an issue with the primary server, the standby can be activated as the new primary, preventing data loss. To set up a warm standby, follow the steps below:

Steps to setup warm standby

–> Configure the standalone DB server
–> Start the backup mode
–> Take the base backup
–> Stop the backup mode
–> Configure the standby DB server
–> Start the standby server
–> View the logs
–> Switch to the standby server

 

Configure the Stand Alone DB server
1) Edit the postgresql.conf file to activate the archive mode of the server and define
where to archive the transactional logs

postgres@px:~> vi data/postgresql.conf
archive_mode = on
archive_command = 'cp -i %p /d0/data/archive_backups/%f'
archive_timeout = 300s

2) Start the DB server

postgres@px:~> pg_ctl -D data start
server starting

3) Create some objects

postgres@px:~> psql
Welcome to psql 8.3.4, the PostgreSQL interactive terminal.
[local]:5432:postgres# \l
List of databases
   Name    |  Owner   | Encoding
-----------+----------+----------
 postgres  | postgres | UTF8
 template0 | postgres | UTF8
 template1 | postgres | UTF8
(3 rows)
[local]:5432:postgres# create database newdb ;
CREATE DATABASE
[local]:5432:postgres# \c newdb
You are now connected to database "newdb".
[local]:5432:newdb# CREATE TABLE sample as select * from
pg_class,pg_namespace ;
SELECT
[local]:5432:newdb# \dt
List of relations
Schema | Name | Type | Owner
--------+--------+-------+----------
public | sample | table | postgres
(1 row)
postgres@px:~/data> psql
[local]:5432:postgres# CREATE TABLE classproc as select * from
pg_class natural join pg_proc ;
SELECT

 

Start the Backup Mode

[local]:5432:postgres# select pg_start_backup('L1') ;
pg_start_backup
-----------------
0/C621394
(1 row)

Take the Base Backup

postgres@px:~> cd basebkp
postgres@px:~/basebkp> cp -r /home/postgres/data .
postgres@px:~/basebkp> ls -l
total 12
drwxr-xr-x  3 postgres users 4096 2008-12-29 15:28 ./
drwxr-xr-x  5 postgres users 4096 2008-12-29 15:21 ../
drwx------ 11 postgres users 4096 2008-12-29 15:28 data/
postgres@px:~/basebkp> du -sh data
369M    data
postgres@px:~/basebkp> mv data/ standby
postgres@px:~/basebkp> ls
standby

 

Stop the Backup Mode

[local]:5432:postgres# CREATE TABLE chktable as select *
from pg_tables ;
SELECT
[local]:5432:postgres# CREATE TABLE chk12 as select * from
pg_tables natural join pg_class ;
SELECT
[local]:5432:postgres# select pg_stop_backup() ;
pg_stop_backup
----------------
0/1547E284
(1 row)
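While backup mode is active you can sanity-check that WAL is advancing and reaching the archive (these function names are the 8.x ones, matching the version used in this walkthrough):

```sql
-- Current WAL write location; should have moved past the pg_start_backup position
SELECT pg_current_xlog_location();

-- Force a segment switch so the last partial segment gets archived
SELECT pg_switch_xlog();
```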

 

Configuring the Stand By Server

postgres@px:~/basebkp> vi standby/recovery.conf
postgres@px:~/basebkp> cat standby/recovery.conf
restore_command = 'pg_standby -l -d -s 2 -t
/tmp/pgsql.trigger.5432 /d0/data/archive_backups %f %p %r'
postgres@px:~/basebkp> vi standby/postgresql.conf
port = 5433 ##change port
postgres@px:~/basebkp/standby> rm -f postmaster.pid

Start the Stand By Server

postgres@px:~/basebkp> pg_ctl -D standby start
server starting
postgres@px:~/basebkp> psql -p 5433 -l
psql: FATAL: the database system is starting up

 

View LOG files of the Stand By Server

postgres@px:~> less ~/basebkp/standby/pg_log/postgresql-2008-12-29_162601.log
[3263 4958acc1.cbf 1]LOG: database system was interrupted; last known up at 2008-12-29 15:27:28 IST
[3263 4958acc1.cbf 2]LOG: starting archive recovery
[3263 4958acc1.cbf 3]LOG: restore_command = 'pg_standby -l -d -s 2 -t /tmp/pgsql.trigger.5432 /d0/data/archive_backups %f %p %r'
Trigger file          : /tmp/pgsql.trigger.5432
Waiting for WAL file  : 00000001.history
WAL file path         : /d0/data/archive_backups/00000001.history
Restoring to...       : pg_xlog/RECOVERYHISTORY
Sleep interval        : 2 seconds
Max wait interval     : 0 forever
Command for restore   : ln -s -f "/d0/data/archive_backups/00000001.history" "pg_xlog/RECOVERYHISTORY"
Keep archive history  : 000000000000000000000000 and later
running restore       : OK

 

Stop Stand Alone Server and Use the server as a Live DB server

postgres@px:~> touch /tmp/pgsql.trigger.5432
postgres@px:~/basebkp/standby/pg_log> less postgresql-2008-12-29_163930.log
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...trigger file found
[12107 4958afea.2f4b 22]LOG: could not open file "pg_xlog/00000001000000000000001A" (log file 0, segment 26): No such file or directory
[12107 4958afea.2f4b 23]LOG: redo done at 0/1960BAB8
[12107 4958afea.2f4b 24]LOG: last completed transaction was at log time 2008-12-29 16:32:41.142195+05:30
Trigger file          : /tmp/pgsql.trigger.5432
Waiting for WAL file  : 000000010000000000000019
WAL file path         : /d0/data/archive_backups/000000010000000000000019
Restoring to...       : pg_xlog/RECOVERYXLOG

Now let's check that everything is in place and you are all set.


Monitoring PostgreSQL

Let's talk about the most widely used monitoring system for PostgreSQL. There are other tools available, some paid, that do a really good job, but I would like to use something that is free for the general public and reliable: Nagios!
Nagios is a widely used open source network monitoring system that monitors the equipment (hardware) and services (software) you specify, alerting you when their behavior is not as desired.

Configuring the monitoring server:

Requirements

Web Server (This manual assumes that you use Apache)

Php (On the web interface)
Gcc
Libgd (a graphics library is needed to display the statusmap)
Perl
Mail Server

Installation:

Create a user nagios

useradd -m nagios
passwd nagios

Install nagios

wget http://sourceforge.net/projects/nagios/files/nagios-3.x/nagios-3.4.1/nagios-3.4.1.tar.gz
tar -xzvf nagios-3.4.1.tar.gz
cd nagios-3.4.1
./configure --with-command-group=nagios
make all
make install
make install-init
make install-config
make install-commandmode

Configuration:

Edit the file /usr/local/nagios/etc/objects/contacts.cfg to set the email address that alerts are sent to: replace "nagios@localhost" with the administrator's address.

Configure the web interface:

make install-webconf
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
chown nagios.nagios /usr/local/nagios/etc/htpasswd.users
chmod 664 /usr/local/nagios/etc/htpasswd.users
service httpd restart

Install Nagios Plugins:

wget http://sourceforge.net/projects/nagiosplug/files/nagiosplug/1.4.15/nagios-plugins-1.4.15.tar.gz/download
tar -xzvf nagios-plugins-1.4.15.tar.gz
cd nagios-plugins-1.4.15
./configure --with-nagios-user=nagios --with-nagios-group=nagios --without-pgsql
make
make install

Start Nagios:

cd /etc/init.d
chkconfig --add nagios
chkconfig nagios on

To check that all is well:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Finally:

service nagios start

At this point you can access Nagios by opening a browser at http://localhost/nagios
Log in as the nagiosadmin user with the password you set while configuring the web interface.
If you are installing on the same machine that will be monitored, continue straight to the PostgreSQL plugin installation; otherwise continue with the next section.

Set the machine to be monitored:

Requirements

PostgreSQL (The instructions in this manual were tested using version 8.3.7)
Xinetd
Gcc
Perl

Create a nagios user:
useradd -m nagios
passwd nagios

Install Nagios Plugins:
We install the PostgreSQL plugin manually to make sure we use the latest version.

wget http://sourceforge.net/projects/nagiosplug/files/nagiosplug/1.4.15/nagios-plugins-1.4.15.tar.gz/download
tar -xzvf nagios-plugins-1.4.15.tar.gz
cd nagios-plugins-1.4.15
./configure --with-nagios-user=nagios --with-nagios-group=nagios --without-pgsql
make
make install

chown -R nagios.nagios /usr/local/nagios

Install the Nagios nrpe module:

wget http://sourceforge.net/projects/nagios/files/nrpe-2.x/nrpe-2.13/nrpe-2.13.tar.gz
tar -xzvf nrpe-2.13.tar.gz
cd nrpe-2.13
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config
make install-xinetd

Configure the module nrpe:

Edit the file /etc/xinetd.d/nrpe and set the only_from field to the IP of the monitoring server.
Edit the file /etc/services and add the following line: "nrpe 5666/tcp # NRPE"
Then configure the monitoring server's services to check the remote machine.

Test the module nrpe:

/usr/local/nagios/libexec/check_nrpe -H server
Edit the file /usr/local/nagios/etc/objects/commands.cfg

Add the following lines at the end of the file:
define command {
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
We must tell Nagios that we will monitor another machine by editing the file /usr/local/nagios/etc/nagios.cfg and adding the following line in the "OBJECT CONFIGURATION FILE(S)" section:

cfg_file=/usr/local/nagios/etc/objects/dbhost.cfg

Create the host definition file dbhost.cfg:

define host {
use linux-server
host_name dbserver
alias pgsql 8.3
address <IP of the PostgreSQL server>
}

service nagios restart

Install the PostgreSQL plugin (on the server to be monitored):

Download check_postgres.tar.gz from http://bucardo.org/check_postgres
gunzip -dc check_postgres.tar.gz | tar xvf -
cd check_postgres
cp check_postgres.pl /usr/local/nagios/libexec/.
perl /usr/local/nagios/libexec/check_postgres.pl --symlinks
To install the checks, create an entry in /usr/local/nagios/etc/nrpe.cfg (in the "COMMAND DEFINITIONS" section) for each service to monitor, similar to:

command[check_postgres_locks]=/usr/local/nagios/libexec/check_postgres_locks -w 2 -c 3
The available commands are documented at http://bucardo.org/check_postgres/check_postgres.pl.html

Then specify the services to monitor by creating entries like the following in /usr/local/nagios/etc/objects/dbhost.cfg on the monitoring server:

define service {
use generic-service
host_name dbserver
service_description PGSQL locks
check_command check_nrpe!check_postgres_locks
}
define service {
use generic-service
host_name dbserver
service_description CPU Load
check_command check_nrpe!check_load
}
define service {
use generic-service
host_name dbserver
service_description Current Users
check_command check_nrpe!check_users
}
define service {
use generic-service
host_name dbserver
service_description Total Processes
check_command check_nrpe!check_total_procs
}
define service {
use generic-service
host_name dbserver
service_description Zombie Processes
check_command check_nrpe!check_zombie_procs
}


Bloats

Bloat can cause serious performance impact on queries, and you will see server load rise from an earlier baseline. Table/index bloat can have various causes: updates, rollbacks, failed transactions, etc. What is important is to identify this bloat and remove it before things get too bad.

I have simplified the process of identifying bloat with the script below. The result set you will see looks like this:

relation                         | reltuples | relpages | otta | bloat | wastedpages | wastedbytes | pwastedbytes | relbytes   | prelbytes
---------------------------------+-----------+----------+------+-------+-------------+-------------+--------------+------------+----------
account.activation_return_policy | 973111    | 164186   | 5240 | 31.3  | 158946      | 1302085632  | 1242 MB      | 1345011712 | 1283 MB
billing.search_bill_info         | 688832    | 6754     | 3372 | 2.0   | 3382        | 27705344    | 26 MB        | 55328768   | 53 MB
(2 rows)

Here you see two tables with bloat factors of 31.3 and 2.0 respectively (the bloat column is the ratio of actual pages to the estimated minimum). It is a good idea to run pg_reorg or the CLUSTER command, depending on your database availability requirements, to remove this bloat.

NOTE: The CLUSTER command locks the table while it removes the bloat and is slower than pg_reorg. Make a practical choice here by looking at the options you have. If you need more information on pg_reorg, look through my post about pg_reorg.
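For example, removing the bloat from the first table above with CLUSTER looks like this (the USING form is available on 8.4 and later; the index name here is hypothetical, pick one of the table's real indexes), followed by ANALYZE to refresh planner statistics:

```sql
CLUSTER account.activation_return_policy USING activation_return_policy_pkey;
ANALYZE account.activation_return_policy;
```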

The SQL I used to get this information is given below. It is a good idea to put it in a crontab and email out the result set, so that you are aware of bloated tables and can work on reindexing, clustering, or running pg_reorg at a later time.

SELECT
schemaname||'.'||tablename as relation, reltuples::bigint, relpages::bigint, otta,
ROUND(CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS bloat,
CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,
CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,
pg_size_pretty((CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END)::bigint) AS pwastedbytes,
bs*relpages::bigint as relbytes,
pg_size_pretty((bs*relpages::bigint)::bigint) as prelbytes
FROM (
SELECT
schemaname, tablename, cc.reltuples, cc.relpages, bs,
CEIL((cc.reltuples*((datahdr+ma-
(CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta
FROM (
SELECT
ma,bs,schemaname,tablename,
(datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
(maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
FROM (
SELECT
schemaname, tablename, hdr, ma, bs,
SUM((1-null_frac)*avg_width) AS datawidth,
MAX(null_frac) AS maxfracsum,
hdr+(
SELECT 1+count(*)/8
FROM pg_stats s2
WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
) AS nullhdr
FROM pg_stats s, (
SELECT
(SELECT current_setting('block_size')::numeric) AS bs,
CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
FROM (SELECT version() AS v) AS foo
) AS constants
GROUP BY 1,2,3,4,5
) AS foo
) AS rs
JOIN pg_class cc ON cc.relname = rs.tablename and cc.relkind = 'r'
JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'
) AS sml
WHERE sml.relpages - otta > 128
AND ROUND(CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages/otta::numeric END,1) > 1.2
AND CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END > 1024 * 100
ORDER BY wastedbytes DESC;


Table partition with foreign keys

When to partition:

Many times I have seen people say, "Oh, the table is not so big in size and does not need partitioning." Let me clarify a little here: it's never just about the size of a table but about the number of rows in it when deciding whether to partition. The main reason we partition a table is to improve the performance of queries that are taking a long time.

First things first: let's not rush to conclusions, but step back and think about the queries themselves. Can the query be tuned to perform better? That is the question you should ask yourself. Check the explain plan: could you add an index? Increase work_mem? Rearrange the joins, or even remove joins that are not necessary? (I will write another post about how to tune queries soon.) Once you are convinced the query cannot be improved further, it is time to decide to partition the table. The way I do this is to pick up every query that references the table and check the best way forward. For example, if your application queries filter on the id column and you partition on the date column, you are in for bigger problems and performance will degrade. Find the column the partition should be based on, and then move ahead with testing and implementation. Personally, I would create the partitions on a test box and run some serious tests using pgbench. That will tell you whether partitioning the table is worth it.

The process: 

I started working on table partitioning in Postgres and thought this might come in handy when dealing with foreign keys, which are not recognized across partitions. The standard partitioning method tells us how to create and manage partitions, but when it comes to foreign keys we have to go the extra mile to make sure they are maintained.

I have an example below with two tables that I want to partition (accounts and billing). The billing table has a foreign key on the billing_id column referencing accounts.account_id, as shown below:

billing:

Table "public.billing"
Column | Type | Modifiers
--------------+---------+-----------
billing_id | integer | not null
billing_type | boolean |
dt_billing | date |
Indexes:
"billing_pkey" PRIMARY KEY, btree (billing_id)
Foreign-key constraints:
"bill_id_fk" FOREIGN KEY (billing_id) REFERENCES accounts(account_id) ON UPDATE RESTRICT ON DELETE RESTRICT

accounts:

Table "public.accounts"
Column | Type | Modifiers
--------------+------------------------+-----------
account_id | integer | not null
account_name | character varying(100) |
dt_account | date |
Indexes:
"accounts_pkey" PRIMARY KEY, btree (account_id)
Referenced by:
TABLE "billing" CONSTRAINT "bill_id_fk" FOREIGN KEY (billing_id) REFERENCES accounts(account_id) ON UPDATE RESTRICT ON DELETE RESTRICT

Lets start by setting up partition on both of these tables with the date field.

NOTE: I have created a shadow table for accounts to keep track of the foreign keys. The reason is that foreign keys can be created from one partitioned table to another table, or to a specific partition, but not to the top level of a partitioned table, because the parent table has no data. In many cases you are not going to partition on the foreign key column, and then table 2 (billing) can't see the respective rows in table 1 (accounts). To make partitions work with foreign keys, I simply created a shadow table with the account_id column and reference it from all the child billing tables. To keep the shadow table in sync, I also create a rule that inserts into it whenever the primary accounts table receives a row. You can call it a hack, but it works well. If you don't care about the foreign keys, just drop them and follow the steps without the shadow table and rule.

-- Creating accounts parent table

CREATE TABLE accounts
(
account_id integer NOT NULL,
account_name character varying(100),
dt_account date,
CONSTRAINT accounts_pkey PRIMARY KEY (account_id )
)
WITH (
OIDS=FALSE
);
ALTER TABLE accounts OWNER TO postgres;

--Creating accounts child tables

CREATE TABLE accounts_2012
(
account_id integer NOT NULL,
account_name character varying(100),
dt_account date,
CONSTRAINT accounts_2012_pkey PRIMARY KEY (account_id ),
CHECK ( EXTRACT(YEAR FROM dt_account) = 2012))
INHERITS (accounts);
ALTER TABLE accounts_2012 OWNER TO postgres;

CREATE TABLE accounts_2011
(
account_id integer NOT NULL,
account_name character varying(100),
dt_account date,
CONSTRAINT accounts_2011_pkey PRIMARY KEY (account_id ),
CHECK ( EXTRACT(YEAR FROM dt_account) = 2011))
INHERITS (accounts);
ALTER TABLE accounts_2011 OWNER TO postgres;

CREATE TABLE accounts_2010
(
account_id integer NOT NULL,
account_name character varying(100),
dt_account date,
CONSTRAINT accounts_2010_pkey PRIMARY KEY (account_id ),
CHECK ( EXTRACT(YEAR FROM dt_account) = 2010))
INHERITS (accounts);
ALTER TABLE accounts_2010 OWNER TO postgres;
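A variant worth considering: constraint exclusion matches the CHECK expression against the query's WHERE clause, so plain range checks on the raw column tend to prune partitions better than EXTRACT-based ones. The same child could be declared as (a sketch):

```sql
CREATE TABLE accounts_2012
(
account_id integer NOT NULL,
account_name character varying(100),
dt_account date,
CONSTRAINT accounts_2012_pkey PRIMARY KEY (account_id),
CHECK (dt_account >= DATE '2012-01-01' AND dt_account < DATE '2013-01-01'))
INHERITS (accounts);
```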

--Creating shadow table for primary and foreign keys

CREATE TABLE accounts_shadow
(
account_id integer NOT NULL,
CONSTRAINT accounts_shadow_pkey PRIMARY KEY (account_id )
)
WITH (
OIDS=FALSE
);
ALTER TABLE accounts_shadow OWNER TO postgres;

--Creating parent billing table

CREATE TABLE billing
(
billing_id integer NOT NULL,
billing_type boolean,
dt_billing date,
CONSTRAINT billing_pkey PRIMARY KEY (billing_id ),
CONSTRAINT bill_id_fk FOREIGN KEY (billing_id)
REFERENCES accounts_shadow (account_id) MATCH SIMPLE
ON UPDATE RESTRICT ON DELETE RESTRICT
)
WITH (
OIDS=FALSE
);
ALTER TABLE billing
OWNER TO postgres;

--Creating child billing tables

CREATE TABLE billing_2012
(
billing_id integer NOT NULL,
billing_type boolean,
dt_billing date,
CONSTRAINT billing_2012_pkey PRIMARY KEY (billing_id ),
CONSTRAINT bill_id_2012_fk FOREIGN KEY (billing_id)
REFERENCES accounts_shadow (account_id) MATCH SIMPLE
ON UPDATE RESTRICT ON DELETE RESTRICT,
CHECK ( EXTRACT(YEAR FROM dt_billing) = 2012))
INHERITS (billing);
ALTER TABLE billing_2012 OWNER TO postgres;

CREATE TABLE billing_2011
(
billing_id integer NOT NULL,
billing_type boolean,
dt_billing date,
CONSTRAINT billing_2011_pkey PRIMARY KEY (billing_id ),
CONSTRAINT bill_id_2011_fk FOREIGN KEY (billing_id)
REFERENCES accounts_shadow (account_id) MATCH SIMPLE
ON UPDATE RESTRICT ON DELETE RESTRICT,
CHECK ( EXTRACT(YEAR FROM dt_billing) = 2011))
INHERITS (billing);
ALTER TABLE billing_2011 OWNER TO postgres;

CREATE TABLE billing_2010
(
billing_id integer NOT NULL,
billing_type boolean,
dt_billing date,
CONSTRAINT billing_2010_pkey PRIMARY KEY (billing_id ),
CONSTRAINT bill_id_2010_fk FOREIGN KEY (billing_id)
REFERENCES accounts_shadow (account_id) MATCH SIMPLE
ON UPDATE RESTRICT ON DELETE RESTRICT,
CHECK ( EXTRACT(YEAR FROM dt_billing) = 2010))
INHERITS (billing);
ALTER TABLE billing_2010 OWNER TO postgres;

After creating the tables, it's time to create the trigger functions and triggers, like so:

--Function for billing table

CREATE OR REPLACE FUNCTION billing_trig() RETURNS TRIGGER
AS
$$
BEGIN
IF TG_OP = 'INSERT' THEN
IF EXTRACT (YEAR FROM NEW.dt_billing) = 2012 THEN
INSERT INTO billing_2012 VALUES(NEW.*);
ELSIF EXTRACT (YEAR FROM NEW.dt_billing) = 2011 THEN
INSERT INTO billing_2011 VALUES(NEW.*);
ELSIF EXTRACT (YEAR FROM NEW.dt_billing) = 2010 THEN
INSERT INTO billing_2010 VALUES(NEW.*);
ELSE
RAISE NOTICE 'Wrong Year in NEW.dt_billing Values%', NEW.dt_billing;
END IF;
END IF;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;
CREATE TRIGGER insert_billing_tg
BEFORE INSERT ON billing
FOR EACH ROW
EXECUTE PROCEDURE billing_trig();

--Function for accounts table

CREATE OR REPLACE FUNCTION accounts_trig() RETURNS TRIGGER
AS
$$
BEGIN
IF TG_OP = 'INSERT' THEN
IF EXTRACT (YEAR FROM NEW.dt_account) = 2012 THEN
INSERT INTO accounts_2012 VALUES(NEW.*);
ELSIF EXTRACT (YEAR FROM NEW.dt_account) = 2011 THEN
INSERT INTO accounts_2011 VALUES(NEW.*);
ELSIF EXTRACT (YEAR FROM NEW.dt_account) = 2010 THEN
INSERT INTO accounts_2010 VALUES(NEW.*);
ELSE
RAISE NOTICE 'Wrong Year in NEW.dt_account Values%', NEW.dt_account;
END IF;
END IF;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;
CREATE TRIGGER insert_account_tg
BEFORE INSERT ON accounts
FOR EACH ROW
EXECUTE PROCEDURE accounts_trig();

Finally, let's create a rule on the accounts table to insert new values into the shadow table:

CREATE RULE accounts_id_rule AS ON INSERT
TO accounts
DO INSERT INTO accounts_shadow VALUES
(
NEW.account_id
);
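A quick smoke test of the whole setup: insert a row through the parent and confirm it landed in the right child table and in the shadow table (for a 2012 date the expected child is accounts_2012):

```sql
INSERT INTO accounts VALUES (1, 'test account', DATE '2012-03-01');

SELECT * FROM ONLY accounts;   -- the parent itself stays empty (trigger returns NULL)
SELECT * FROM accounts_2012;   -- the row is routed here by the trigger
SELECT * FROM accounts_shadow; -- account_id is copied here by the rule
```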


pgbench

Today I am going to write about a benchmarking tool that has been part of the Postgres contrib module for a couple of years now. I have found it very useful for several reasons:

–> If you have a specific batch of queries that you would like to benchmark it can be done using pgbench.

–> Upgrading to the new postgres version and would like to set a couple of benchmark jobs to check the performance.

–> Thinking about changing memory parameters and would like to run some tests before you can implement it in production.

–> Testing new hardware

First lets talk about the install process:

I am assuming you have Postgres set up from source and the source tree is available to build pgbench.

Change working directory to contrib/pgbench

postgres@debian:~$make

postgres@debian:~$make install

Once you have pgbench installed its time to initialize pgbench on the database you would like to run pgbench on.

postgres@debian:~$pgbench -i test

-i initializes pgbench: it creates the tables and data that will be used for the benchmark tests.

Next we tell pgbench the number of transactions (say 10) and the number of clients (10) to use for the test.

postgres@debian:~$pgbench -c 10 -t 10 test

If you get the error

ERROR:  relation "pgbench_branches" does not exist at character 22

you might need to initialize pgbench for the database you are running the test on.

By the end of the test you will receive the result set as below:

number of clients: 10
number of transactions per client: 10
number of transactions actually processed: 100/100
tps = 964.822569 (including connections establishing)
tps = 1145.317940 (excluding connections establishing)

A more realistic result comes from increasing the number of transactions and connected clients to match your current production standards. Run the test a few times and take an average to see what kind of results you can expect.

Apart from the standard test using pgbench you can also use it to test some of your current queries:

postgres@debian:~$pgbench -f testscript.sh test

The script can contain any production SQL; pgbench will execute the statements and report the time taken to run them.
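A custom script can also mix pgbench meta-commands with SQL. For example, against the tables pgbench -i creates (the \setrandom meta-command is the one used by pgbench of this era; the range should match your scale factor):

```sql
-- one randomized read per transaction
\setrandom aid 1 100000
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
```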

There are many more options: how long the test should last, whether vacuum should run before the benchmark, the scale factor, select-only tests, etc. Use any of those options to get the desired test results.

If you have any questions then leave a comment and I will be sure to get back.


Slony replication using perl tools

I have been working with Slony for a couple of years now and have started to feel very comfortable with this replication solution. For beginners it might not be so user friendly, but as long as you understand how this replication system works, things start becoming easy. I am going to list easy ways to set up, configure, and maintain it.

Environment:

I have two CentOS boxes that I am going to set up replication on. You can have as many as you want; it's as simple as adding another node and subscribing it to the data. The PostgreSQL version I am using is 8.4.9; the same steps apply to other versions of Postgres. The Slony version I have is 2.1.0. It is always a good idea to read the Slony release notes to check compatibility with your Postgres server version.

There are two ways to drive Slony: shell (slonik) or the Perl tools. I prefer Perl; it's the easier and more reliable option, although I do not see much information on the web about using the Perl tools.

Installation:

I assume you have Postgres installed on both servers and that the master node has the database set up and ready to be replicated. I also assume the tables you are going to replicate have primary keys, as this is a requirement for setting up replication.

1. Download slony 2.1.0 source from the following link:

http://slony.info/downloads/2.1/source/slony1-2.1.0.tar.bz2

2. This is a source install, so we need to configure with the current Postgres install location, in this case /opt/PostgreSQL/8.4; use the prefix option, then make and make install. E.g.:

./configure '--prefix=/opt/slony/slony2.1'
'--with-perltools=/opt/slony/slony2.1'
'--with-pgconfigdir=/opt/PostgreSQL/8.4/bin'

make
make install

The above step needs to be performed on all nodes (master + slaves)

NOTE: This would be a good time to add a schema only copy of the database to all the slony slaves. If you already have some data in the slave tables it will be deleted when we start the slony subscribe process.

3. Create slony configuration file. This is the file that will list information on all the master and slave nodes and tables and sequences that will be added to replication. Sample copy of the config file is given below for reference.

if ($ENV{SLONYNODES}) {
require $ENV{SLONYNODES};
} else {
$CLUSTER_NAME = 'slony_test';
$LOGDIR = '/postgres/tracelogs/slony_log';
$MASTERNODE = 1;
add_node(node => 1,
host => '127.0.0.1',
dbname => 'test',
port => 6543,
user => 'postgres',
password => );
add_node(node => 2,
host => '127.0.0.1',
dbname => 'test',
port => 6543,
user => 'postgres',
password => );

#(you can add more nodes here by copying the above node)

}

$SLONY_SETS = {
"set1" => {
"set_id" => 1 ,
"table_id" => 1 ,
"sequence_id" => 1 ,
"pkeyedtables" => [
"public.test",
"public.test1",
"public.test2",
"public.test3",
],
"keyedtables" => {},
"serialtables" => [],
"sequences" => [

#sequences go in this block for the 1st set
"public.test_id_seq",
"public.test1_id_seq",
],
},
};
if ($ENV{"SLONYSET"}) {
require $ENV{"SLONYSET"};
}
1;

4. Initialize the slony cluster using the config file we created above:

./slonik_init_cluster -c slon_test.conf |slonik

5. Now we start slon process as:

./slon_start -c slon_test.conf 1

./slon_start -c slon_test.conf 2

What we are doing above is starting slon processes for node 1 and node 2. A good practice is to run these processes on the slave box so that they use memory on the Slony slave.

6. Then we need to create the set:

./slonik_create_set -c slon_test.conf 1|slonik

7. The last step is to subscribe the set, one node at a time:

./slonik_subscribe_set -c slon_test.conf 1 2|slonik

Make no mistake here, or your master node will be empty by the time the subscribe completes.

"slon_test.conf 1 2" above means subscribe set 1 to node 2. If you reverse it as "slon_test.conf 2 1", Slony will truncate the tables on the master node and you will lose data.

8. Now it's waiting time: let the initial sync complete. It is a good idea to look at the log files for any errors, and at the sl_status view in the Slony schema of your database, to check how things are moving along.
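The lag check looks like this; the schema name is the cluster name prefixed with an underscore, so _slony_test for the config above:

```sql
-- Events not yet confirmed by the subscriber, and how old the backlog is
SELECT st_origin, st_received, st_lag_num_events, st_lag_time
FROM _slony_test.sl_status;
```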

Maintaining slony replication:

The easiest part about maintaining Slony replication is to remember one rule: never do DDL on Slony nodes without slonik_execute_script, or without taking the table out of replication first. The reason is that Slony does not replicate DDL statements to other nodes; it only replicates DML.

slonik_execute_script is a utility that lets you alter a table without breaking replication.

If you have multiple DDL statements, just put them in a script and run it with slonik_execute_script:

execute_script [options] set# full_path_to_sql_script_file
execute_script [options] -c SCRIPT set#

Executes the contents of a SQL script file on the specified set.
The script only needs to exist on the machine running the slon
daemon.

set# The set to which this script applies.

-c SCRIPT Pass the SQL to be executed via the command line instead
of as a file.

-n NUM
--node=NUM Override the set origin specified in the configuration
file.

If you need to add or drop a node or a table, or do much more to manage Slony, there are a bunch of scripts available in the Slony install directory. I am not going to explain them, as they are pretty straightforward. Refer to the document below for further details, and if you need help with Slony, just leave a comment and I will be sure to write back.

http://slony.info/adminguide/2.1/doc/adminguide/slony.pdf

