Once in a while we get into a situation where we are given the task of working on a new database. Without knowing the design of the database it is hard to write and explain queries and gather the desired results. There is one open-source tool that can come in very handy before you dive into understanding the database design.
So I figured I would write something on SchemaSpy, which works very well on a majority of databases. We will only talk about PostgreSQL for now; configuration and installation are pretty much the same for all other database types.
OK, so let's get started with the install process. This install is on Ubuntu 12.04 but will be pretty much the same for any other Linux-based OS.
Install the following dependencies:
1. vik@debian$ sudo apt-get install graphviz
2. vik@debian$ wget http://jdbc.postgresql.org/download/postgresql-9.1-902.jdbc4.jar (place the jar where you can access it with SchemaSpy)
3. vik@debian$ sudo apt-get install openjdk-7-jdk openjdk-7-jre
vik@debian$ wget http://sourceforge.net/projects/schemaspy/files/schemaspy/SchemaSpy%205.0.0/schemaSpy_5.0.0.jar/download -O schemaSpy_5.0.0.jar
Let's fire it up:
vik@debian$ java -jar schemaSpy_5.0.0.jar -t pgsql -host localhost:2525 -db test -s testschema -u postgres -p password -dp /home/vik/plugins/postgres/postgresql-9.1-902.jdbc4.jar -o /home/vik/dbgui
What does the above mean?
-t pgsql (selects pgsql as the database type to connect to)
-host (where your PostgreSQL database lives; append :port# if needed)
-db (database to connect to)
-s (schema name)
-u (database user to connect as)
-p (that user's password)
-dp (location of the JDBC driver jar)
-o (directory where SchemaSpy will put all the GUI representation files)
If you want the GUI representation to be viewable in a browser from any box, install apache2 and create a symlink like so:
vik@debian$ sudo apt-get install apache2
vik@debian$ cd /var/www
vik@debian$ sudo ln -s /home/vik/dbgui test
(test is the db name; you can use your own name there)
That's it! Access SchemaSpy from your browser at http://<server>/test (the symlink you created above).
Here is a sample schema output from the SchemaSpy website:
During the course of database and application maturity there are times when we try hard to improve the performance of the database to meet customer expectations in accordance with the agreed standards. These are the times when we think: let's add an index on that old table to make it work faster.
There are a couple of things you should keep in mind when you are looking at adding an index to a table. Ask yourself the question: "will this really help performance?" Consider a scenario where you have a table that is busy with inserts all day. You have a requirement for a report from that table during the night and add an index to speed up that report. It works perfectly for your little report running during off hours, but what about the inserts happening during the day on that heavily used table? You will notice a fall in performance there, because the new index also needs to be updated with every insert/update, increasing the time each write takes.
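Before committing to a new index, it can help to quantify how write-heavy the table actually is. A minimal sketch, assuming a hypothetical table named busy_table:

-- Write traffic on the table: every row written here also has to update each index
SELECT relname, n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_user_tables
WHERE relname = 'busy_table';

If the insert/update counts dwarf the handful of nightly report scans, the new index is likely a net loss.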
Make sure you run the query below once a month to get a report on the unused indexes in your database. This can be pretty useful when trying to boost your database's performance. You will also find that you have freed up quite some space that was lying around unused.
I would recommend running this off of crontab and sending the output via email every month (a minimal sketch follows the query below).
select indexrelid::regclass as index, relid::regclass as table
from pg_stat_user_indexes
JOIN pg_index USING (indexrelid)
where idx_scan = 0 and indisunique is false;
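A minimal crontab sketch for that monthly report; the database name and email address are assumptions:

# run at 06:00 on the 1st of every month as the postgres user
0 6 1 * * psql -d mydb -c "select indexrelid::regclass as index, relid::regclass as table from pg_stat_user_indexes join pg_index using (indexrelid) where idx_scan = 0 and indisunique is false;" | mail -s "monthly unused index report" dba@example.com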
NOTE: don't just blindly drop the indexes listed here; make sure they are useless first. Also, running this right after a fresh restore or after calling pg_stat_reset() is pointless. If your stats have been reset, let at least one business cycle run on the database before running these SQL queries.
Index maintenance also covers removing duplicates. A lot of times I have seen an index on a table's unique key column when that same column already has a primary key constraint; primary keys are already unique. Run the SQL below to identify duplicate indexes that can be removed from the database. This will also help improve performance and reclaim lost space.
SELECT pg_size_pretty(sum(pg_relation_size(idx))::bigint) AS size,
       (array_agg(idx))[1] AS idx1, (array_agg(idx))[2] AS idx2,
       (array_agg(idx))[3] AS idx3, (array_agg(idx))[4] AS idx4
FROM (
    SELECT indexrelid::regclass AS idx,
           (indrelid::text ||E'\n'|| indclass::text ||E'\n'|| indkey::text ||E'\n'||
            coalesce(indexprs::text,'') ||E'\n'|| coalesce(indpred::text,'')) AS key
    FROM pg_index
) sub
GROUP BY key
HAVING count(*) > 1
ORDER BY sum(pg_relation_size(idx)) DESC;
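Once the report confirms two indexes in a row really are duplicates, verify their definitions (\d tablename in psql shows them side by side) and drop the redundant one. A hedged example with a hypothetical index name:

-- keep the primary key, drop the redundant unique index
DROP INDEX billing_id_idx;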
If you have ever thought about finding foreign key columns that are missing indexes, below is something that might help. Again, check whether you really need to add the index; don't just assume based on the query below:
select src_table, dst_table, fk_name,
       pg_size_pretty(s_size) as s_size,
       pg_size_pretty(d_size) as d_size, d
from (
    select distinct on (1,2,3,4,5)
           textin(regclassout(c.conrelid)) as src_table,
           textin(regclassout(c.confrelid)) as dst_table,
           c.conname as fk_name,
           pg_relation_size(c.conrelid) as s_size,
           pg_relation_size(c.confrelid) as d_size,
           array_upper(di.indkey::int2[], 1) + 1 - array_upper(c.conkey::int2[], 1) as d
    from pg_constraint c
    left join pg_index di on di.indrelid = c.conrelid
         and array_to_string(di.indkey, ' ') ~ ('^' || array_to_string(c.conkey, ' ') || '( |$)')
    join pg_stat_user_tables st on st.relid = c.conrelid
    where c.contype = 'f'
    order by 1,2,3,4,5,6 asc
) mfk
where mfk.d is distinct from 0 and mfk.s_size > 1000000
order by mfk.s_size desc, mfk.d desc;
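When the query flags a large child table, the fix is usually a plain index on the foreign key column. A hedged example using hypothetical table and column names:

-- CONCURRENTLY avoids blocking writes while the index builds (PostgreSQL 8.2+)
CREATE INDEX CONCURRENTLY billing_billing_id_idx ON billing (billing_id);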
pg_top is a utility just like the UNIX top command, but it has much more to offer and I find it very useful for monitoring a PostgreSQL database.
It is tedious to run top, grep for the postgres user, and look for the commands running, while keeping another session on pg_stat_activity to figure out what the queries are doing. Instead it is easier to just run pg_top and get all of the below with a single command:
View currently running SQL statement of a process.
View query plan of a currently running SQL statement.
View locks held by a process.
View user table statistics.
View user index statistics.
Let's first install the tool:
--> wget http://pgfoundry.org/frs/download.php/1780/pg_top-3.6.2.tar.bz2
--> tar -xjvf pg_top-3.6.2.tar.bz2
--> cd pg_top-3.6.2
--> ./configure --prefix='path to where you want it installed'
--> make
--> make install
If you have environment variables configured for the postgres user, pg_top should pick up the port and binaries automatically. A sketch of such an environment is below.
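A minimal sketch of that environment in the postgres user's ~/.bash_profile; the install path is an assumption:

export PATH=/usr/local/pgsql/bin:$PATH   # so pg_top finds the postgres binaries
export PGPORT=5432
export PGUSER=postgres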
If not, you can invoke pg_top as:
pg_top -p 5432 -U postgres -d database -W
Where ‘-d’ specifies the database to connect to, ‘-U’ specifies the database user, and ‘-W’ will prompt for the password if a password is required to connect.
To view the currently running SQL statement of a process, use the ‘Q’ key and enter a PID once pg_top is started.
To see the query plan of a currently running SQL statement, use the ‘E’ key and enter a PID. The ‘E’ command runs ‘EXPLAIN’; for ‘EXPLAIN ANALYZE’, use the ‘A’ key followed by a PID.
To view the locks held by a process, use the ‘L’ key and enter a PID.
Use the ‘R’ key to display user table statistics. Use ‘t’ to toggle between displaying cumulative and differential statistics.
Use the ‘X’ key to display user index statistics. Use ‘t’ to toggle between displaying cumulative and differential statistics.
Update for Postgres 9.2: I tried to use pg_top with Postgres 9.2 and ran into an issue where pg_top does not work well with this version (mainly because some of the columns in pg_stat_activity have changed, e.g. procpid changed to pid, current_query changed to query, and state has been added). I took the latest version and still had no luck, so I searched Google and found a patch that worked for me. Here is what I did to make it work:
downloaded the latest source of pg_top: http://pgfoundry.org/frs/download.php/1780/pg_top-3.6.2.tar.bz2
untarred it and applied the following patch before compiling:
curl -L http://ln-s.net/+6lb | gzip -dc | patch -p1 -d pg_top-3.6.2
then ran the normal configure, make, and make install
For users on earlier Postgres 8.x versions, there is a warm standby option available. If there is an issue with the primary server, the standby can be activated and used as the primary, preventing data loss. To set up a warm standby, follow the steps below:
Steps to set up warm standby:
–> Configure the Stand Alone DB server
–> Start the Backup Mode
–> Take the Base Backup
–> Stop the Backup Mode
–> Configure the Stand By DB server
–> Start the Stand By Server
–> View the Logs
–> Switch to the Stand By Server
Configure the Stand Alone DB server
1) Edit the postgresql.conf file to activate archive mode and define where to archive the transaction logs:
postgres@px:~> vi data/postgresql.conf
archive_mode = on
archive_command = 'cp -i %p /d0/data/archive_backups/%f'
archive_timeout = 300s
2) Start the DB server
postgres@px:~> pg_ctl -D data start
server starting
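(Optional, not part of the original steps) A quick sanity check that archiving works: force a WAL switch and look for a new file in the archive directory defined in archive_command above:

postgres@px:~> psql -c "select pg_switch_xlog();"
postgres@px:~> ls /d0/data/archive_backups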
3) Create some objects
postgres@px:~> psql
Welcome to psql 8.3.4, the PostgreSQL interactive terminal.
[local]:5432:postgres# \l
        List of databases
   Name    |  Owner   | Encoding
-----------+----------+----------
 postgres  | postgres | UTF8
 template0 | postgres | UTF8
 template1 | postgres | UTF8
(3 rows)
[local]:5432:postgres# create database newdb ;
CREATE DATABASE
[local]:5432:postgres# \c newdb
You are now connected to database "newdb".
[local]:5432:newdb# CREATE TABLE sample as select * from pg_class,pg_namespace ;
SELECT
[local]:5432:newdb# \dt
        List of relations
 Schema |  Name  | Type  |  Owner
--------+--------+-------+----------
 public | sample | table | postgres
(1 row)
postgres@px:~/data> psql
[local]:5432:postgres# CREATE TABLE classproc as select * from pg_class natural join pg_proc ;
SELECT
Start the Backup Mode
[local]:5432:postgres# select pg_start_backup('L1') ;
 pg_start_backup
-----------------
 0/C621394
(1 row)
Take the Base Backup
postgres@px:~> cd basebkp
postgres@px:~/basebkp> cp -r /home/postgres/data .
postgres@px:~/basebkp> ls -l
total 12
drwxr-xr-x  3 postgres users 4096 2008-12-29 15:28 ./
drwxr-xr-x  5 postgres users 4096 2008-12-29 15:21 ../
drwx------ 11 postgres users 4096 2008-12-29 15:28 data/
postgres@px:~/basebkp> du -sh data
369M    data
postgres@px:~/basebkp> mv data/ standby
postgres@px:~/basebkp> ls
standby
Stop the Backup Mode
[local]:5432:postgres# CREATE TABLE chktable as select * from pg_tables ;
SELECT
[local]:5432:postgres# CREATE TABLE chk12 as select * from pg_tables natural join pg_class ;
SELECT
[local]:5432:postgres# select pg_stop_backup() ;
 pg_stop_backup
----------------
 0/1547E284
(1 row)
Configuring the Stand By Server
postgres@px:~/basebkp> vi standby/recovery.conf
postgres@px:~/basebkp> cat standby/recovery.conf
restore_command = 'pg_standby -l -d -s 2 -t /tmp/pgsql.trigger.5432 /d0/data/archive_backups %f %p %r'
postgres@px:~/basebkp> vi standby/postgresql.conf
port = 5433 ##change port
postgres@px:~/basebkp/standby> rm -f postmaster.pid
Start the Stand By Server
postgres@px:~/basebkp> pg_ctl -D standby start
server starting
postgres@px:~/basebkp> psql -p 5433 -l
psql: FATAL: the database system is starting up
View LOG files of the Stand By Server
postgres@px:~> less ~/basebkp/standby/pg_log/postgresql-2008-12-29_162601.log
[3263 4958acc1.cbf 1]LOG: database system was interrupted; last known up at 2008-12-29 15:27:28 IST
[3263 4958acc1.cbf 2]LOG: starting archive recovery
[3263 4958acc1.cbf 3]LOG: restore_command = 'pg_standby -l -d -s 2 -t /tmp/pgsql.trigger.5432 /d0/data/archive_backups %f %p %r'
Trigger file         : /tmp/pgsql.trigger.5432
Waiting for WAL file : 00000001.history
WAL file path        : /d0/data/archive_backups/00000001.history
Restoring to...      : pg_xlog/RECOVERYHISTORY
Sleep interval       : 2 seconds
Max wait interval    : 0 forever
Command for restore  : ln -s -f "/d0/data/archive_backups/00000001.history" "pg_xlog/RECOVERYHISTORY"
Keep archive history : 000000000000000000000000 and later
running restore      : OK
Stop the Stand Alone Server and Use the Stand By Server as the Live DB Server
postgres@px:~> touch /tmp/pgsql.trigger.5432
postgres@px:~/basebkp/standby/pg_log> less postgresql-2008-12-29_163930.log
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...trigger file found
[12107 4958afea.2f4b 22]LOG: could not open file "pg_xlog/00000001000000000000001A" (log file 0, segment 26): No such file or directory
[12107 4958afea.2f4b 23]LOG: redo done at 0/1960BAB8
[12107 4958afea.2f4b 24]LOG: last completed transaction was at log time 2008-12-29 16:32:41.142195+05:30
Trigger file         : /tmp/pgsql.trigger.5432
Waiting for WAL file : 000000010000000000000019
WAL file path        : /d0/data/archive_backups/000000010000000000000019
Restoring to...      : pg_xlog/RECOVERYXLOG
Now let's check that everything is in place and you are all set:
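A minimal verification sketch; the table queried is one created earlier in this walkthrough:

postgres@px:~> psql -p 5433 -l
postgres@px:~> psql -p 5433 -c "select count(*) from chktable;"

If the promoted standby answers on its port and the tables created just before pg_stop_backup() are present, the switchover worked.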
Let's talk about the most widely used monitoring system for PostgreSQL. There are other tools available, some paid, that do a really good job, but I would like to use something that is free for the general public and reliable: Nagios!
Nagios is a widely used open-source network monitoring system that monitors the equipment (hardware) and services (software) you specify, alerting you when their behavior is not as desired.
Configuring the monitoring server:
Web Server (this manual assumes that you use Apache)
PHP (for the web interface)
libgd (a graphics library needed to display the status map)
Create a user nagios:
useradd -m nagios
passwd nagios
wget http://sourceforge.net/projects/nagios/files/nagios-3.x/nagios-3.4.1/nagios-3.4.1.tar.gz
tar -xzvf nagios-3.4.1.tar.gz
cd nagios-3.4.1
./configure --with-command-group=nagios
make all
make install
make install-init
make install-config
make install-commandmode
Edit the file /usr/local/nagios/etc/objects/contacts.cfg to set the email address that alert notifications are sent to.
Replace "nagios@localhost" with the administrator's email address.
Configure the web interface:
make install-webconf
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
chown nagios.nagios /usr/local/nagios/etc/htpasswd.users
chmod 664 /usr/local/nagios/etc/htpasswd.users
service httpd restart
Install Nagios Plugins:
wget http://sourceforge.net/projects/nagiosplug/files/nagiosplug/1.4.15/nagios-plugins-1.4.15.tar.gz/download -O nagios-plugins-1.4.15.tar.gz
tar -xzvf nagios-plugins-1.4.15.tar.gz
cd nagios-plugins-1.4.15
./configure --with-nagios-user=nagios --with-nagios-group=nagios --without-pgsql
make
make install
cd /etc/init.d
chkconfig --add nagios
chkconfig nagios on
To check that all is well:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
service nagios start
At this point it is possible to access Nagios by opening a browser at: http://localhost/nagios
Log in with the nagiosadmin user and the password you set when configuring the web interface.
If Nagios is installed on the same machine that will be monitored, skip ahead to the PostgreSQL plugin installation; otherwise continue with the next section.
Set the machine to be monitored:
PostgreSQL (The instructions in this manual were tested using version 8.3.7)
Create a user nagios:
useradd -m nagios
passwd nagios
Install Nagios Plugins:
The PostgreSQL plugin will be installed manually afterwards, to make sure the latest version is used.
wget http://sourceforge.net/projects/nagiosplug/files/nagiosplug/1.4.15/nagios-plugins-1.4.15.tar.gz/download -O nagios-plugins-1.4.15.tar.gz
tar -xzvf nagios-plugins-1.4.15.tar.gz
cd nagios-plugins-1.4.15
./configure --with-nagios-user=nagios --with-nagios-group=nagios --without-pgsql
make
make install
chown -R nagios.nagios /usr/local/nagios
Install the Nagios NRPE module:
wget http://sourceforge.net/projects/nagios/files/nrpe-2.x/nrpe-2.13/nrpe-2.13.tar.gz
tar -xzvf nrpe-2.13.tar.gz
cd nrpe-2.13
./configure
make all
make install-plugin
make install-daemon
make install-daemon-config
make install-xinetd
Configure the NRPE module:
Edit the file /etc/xinetd.d/nrpe and set the only_from field to the IP address of the monitoring server.
Edit the file /etc/services and add the following line: "nrpe 5666/tcp # NRPE"
Next, configure the monitoring server's services to check the remote machine.
Test the NRPE module:
/usr/local/nagios/libexec/check_nrpe -H <IP of the monitored server>
Edit the file /usr/local/nagios/etc/objects/commands.cfg and add the following lines at the end of the file:
define command {
    command_name check_nrpe
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
We must tell Nagios that we will monitor another machine, by editing the file /usr/local/nagios/etc/nagios.cfg. Add the following line in the "OBJECT CONFIGURATION FILE(S)" section:
cfg_file=/usr/local/nagios/etc/objects/dbhost.cfg
Create the host definition file dbhost.cfg:
define host {
    use       linux-server
    host_name dbserver
    alias     pgsql 8.3
    address   <IP of the PostgreSQL server>
}
service nagios restart
Install the PostgreSQL plugin (on the server to be monitored):
Download check_postgres.tar.gz from http://bucardo.org/check_postgres
gunzip -dc check_postgres.tar.gz | tar xvf -
cd check_postgres
cp check_postgres.pl /usr/local/nagios/libexec/.
perl /usr/local/nagios/libexec/check_postgres.pl --symlinks
To install the checks, create an entry for each service we will monitor in the file /usr/local/nagios/etc/nrpe.cfg (they must be added in the "COMMAND DEFINITIONS" section), similar to:
command[check_postgres_locks]=/usr/local/nagios/libexec/check_postgres_locks -w 2 -c 3
The available commands are documented at http://bucardo.org/check_postgres/check_postgres.pl.html
Then specify the services to monitor by creating entries like the following in /usr/local/nagios/etc/objects/dbhost.cfg on the monitoring server:
define service {
    use                 generic-service
    host_name           dbserver
    service_description locks
    check_command       check_nrpe!check_postgres_locks
}
define service {
    use                 generic-service
    host_name           dbserver
    service_description CPU Load
    check_command       check_nrpe!check_load
}
define service {
    use                 generic-service
    host_name           dbserver
    service_description Current Users
    check_command       check_nrpe!check_users
}
define service {
    use                 generic-service
    host_name           dbserver
    service_description Total Processes
    check_command       check_nrpe!check_total_procs
}
define service {
    use                 generic-service
    host_name           dbserver
    service_description Zombie Processes
    check_command       check_nrpe!check_zombie_procs
}
Bloat can cause a serious performance impact on queries, and you will see server load rise from what it was at an earlier point in time. Table or index bloat can build up for various reasons: updates, rollbacks, failed transactions, etc. What is important is to identify this bloat and remove it before things get too bad.
I have simplified the process of identifying bloat with the script below. The result set you will see looks like this:
             relation             | reltuples | relpages | otta | bloat | wastedpages | wastedbytes | pwastedbytes |  relbytes  | prelbytes
----------------------------------+-----------+----------+------+-------+-------------+-------------+--------------+------------+-----------
 account.activation_return_policy |    973111 |   164186 | 5240 |  31.3 |      158946 |  1302085632 | 1242 MB      | 1345011712 | 1283 MB
 billing.search_bill_info         |    688832 |     6754 | 3372 |   2.0 |        3382 |    27705344 | 26 MB        |   55328768 | 53 MB
(2 rows)
Here you see two tables with bloat factors of 31.3 and 2.0 respectively. It is a good idea to run pg_reorg or the CLUSTER command, depending on the database availability requirements, to remove this bloat.
NOTE: the CLUSTER command will lock the table for the time it takes to remove the bloat and is slower than pg_reorg. Make a practical choice here by looking at the options you have. If you need more information on pg_reorg, look through my post about pg_reorg.
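If the availability window allows an exclusive lock, a minimal CLUSTER sketch for the first table in the report above; the index name is an assumption:

-- rewrites the table in index order, dropping the dead space (takes an exclusive lock)
CLUSTER account.activation_return_policy USING activation_return_policy_pkey;
ANALYZE account.activation_return_policy;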
The SQL I used to get this information is given below. It is a good idea to put it in a crontab and send out the result set via email, so that you are aware of the bloated tables and can schedule reindexing, clustering, or pg_reorg at a later time.
SELECT
  schemaname||'.'||tablename AS relation,
  reltuples::bigint,
  relpages::bigint,
  otta,
  ROUND(CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS bloat,
  CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,
  pg_size_pretty((CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END)::bigint) AS pwastedbytes,
  bs*relpages::bigint AS relbytes,
  pg_size_pretty((bs*relpages::bigint)::bigint) AS prelbytes
FROM (
  SELECT
    schemaname, tablename, cc.reltuples, cc.relpages, bs,
    CEIL((cc.reltuples*((datahdr+ma-
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta
  FROM (
    SELECT
      ma, bs, schemaname, tablename,
      (datawidth+(hdr+ma-(CASE WHEN hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
      (maxfracsum*(nullhdr+ma-(CASE WHEN nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
    FROM (
      SELECT
        schemaname, tablename, hdr, ma, bs,
        SUM((1-null_frac)*avg_width) AS datawidth,
        MAX(null_frac) AS maxfracsum,
        hdr+(
          SELECT 1+count(*)/8
          FROM pg_stats s2
          WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
        ) AS nullhdr
      FROM pg_stats s, (
        SELECT
          (SELECT current_setting('block_size')::numeric) AS bs,
          CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
          CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
        FROM (SELECT version() AS v) AS foo
      ) AS constants
      GROUP BY 1,2,3,4,5
    ) AS foo
  ) AS rs
  JOIN pg_class cc ON cc.relname = rs.tablename AND cc.relkind = 'r'
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid
    AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'
) AS sml
WHERE sml.relpages - otta > 128
  AND ROUND(CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages/otta::numeric END,1) > 1.2
  AND CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END > 1024 * 100
ORDER BY wastedbytes DESC;
When to partition:
Many times I have seen people say, "oh, the table is not so big in size and does not need partitioning." Let me clarify a little bit here. It's never about the size of a table but the number of distinct rows in it when making a decision on whether to partition. The main reason we partition a table is to improve the performance of queries that are taking a long time.
First things first: let's not rush to conclusions; step back and think about the queries themselves. Can the query be tuned to perform better? That is the question you should ask yourself. Check the explain plan: could you add an index? Increase work_mem? Or just rearrange the joins, or even remove some joins that are not necessary? (I will write another post about how to tune queries soon.) Once you are convinced that the query cannot be improved further, it's time to decide to partition the table. The way I do this is to pick up every query that references the table and check the best way forward. For example, if your application queries run off of an id column and you partition on the date column, you are in for bigger problems and performance will degrade. Find the column the partitioning should be based on, and then move ahead with testing and implementation. Personally, I would create the partitions on a test box and run some serious tests using pgbench (a sketch follows); that will tell you whether partitioning the table is worth it.
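A minimal pgbench sketch for that kind of test; the database name, client count, and script file are assumptions:

# replay a representative query against the partitioned test copy for 5 minutes;
# query_on_partitioned_table.sql is a hypothetical file holding the query under test
pgbench -n -c 8 -T 300 -f query_on_partitioned_table.sql testdb

Compare the tps numbers between the partitioned and unpartitioned copies before committing to the change.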
I started working on table partitioning in Postgres and thought this might come in handy when dealing with foreign keys, which are not recognized when dealing with partitions. The standard method of partitioning tells us how to create and manage partitions, but when it comes to foreign keys we have to go the extra mile to make sure they are maintained.
I have an example below with two tables that I want to partition (accounts and billing). The billing table has a foreign key on the billing_id column referencing account_id, as shown below:
                Table "public.billing"
    Column    |  Type   | Modifiers
--------------+---------+-----------
 billing_id   | integer | not null
 billing_type | boolean |
 dt_billing   | date    |
Indexes:
    "billing_pkey" PRIMARY KEY, btree (billing_id)
Foreign-key constraints:
    "bill_id_fk" FOREIGN KEY (billing_id) REFERENCES accounts(account_id) ON UPDATE RESTRICT ON DELETE RESTRICT

                 Table "public.accounts"
    Column    |          Type          | Modifiers
--------------+------------------------+-----------
 account_id   | integer                | not null
 account_name | character varying(100) |
 dt_account   | date                   |
Indexes:
    "accounts_pkey" PRIMARY KEY, btree (account_id)
Referenced by:
    TABLE "billing" CONSTRAINT "bill_id_fk" FOREIGN KEY (billing_id) REFERENCES accounts(account_id) ON UPDATE RESTRICT ON DELETE RESTRICT
Let's start by setting up partitioning on both of these tables using the date field.
NOTE: I have created a shadow table for accounts to keep track of the foreign keys. The reason is that foreign keys can be created from one partitioned table to another table, or to a specific partition, but not to the top level of a partitioned table, because the parent table holds no data. In many cases you are not going to partition on that foreign key column, and then table 2 (billing) can't see the respective column in table 1 (accounts). To make foreign keys work with the partitions, I simply created a shadow table holding the account_id column and referenced it from all the child billing tables. To keep the shadow table maintained, I also created a rule that inserts into it whenever a row is inserted into the primary accounts table. You can call it a hack, but it works well. If you don't care about the foreign keys, just drop them and follow the steps without creating the shadow table and rule.
-- Creating accounts parent table
CREATE TABLE accounts (
    account_id integer NOT NULL,
    account_name character varying(100),
    dt_account date,
    CONSTRAINT accounts_pkey PRIMARY KEY (account_id)
) WITH (OIDS=FALSE);
ALTER TABLE accounts OWNER TO postgres;

-- Creating accounts child tables
CREATE TABLE accounts_2012 (
    account_id integer NOT NULL,
    account_name character varying(100),
    dt_account date,
    CONSTRAINT accounts_2012_pkey PRIMARY KEY (account_id),
    CHECK (EXTRACT(YEAR FROM dt_account) = 2012)
) INHERITS (accounts);
ALTER TABLE accounts_2012 OWNER TO postgres;

CREATE TABLE accounts_2011 (
    account_id integer NOT NULL,
    account_name character varying(100),
    dt_account date,
    CONSTRAINT accounts_2011_pkey PRIMARY KEY (account_id),
    CHECK (EXTRACT(YEAR FROM dt_account) = 2011)
) INHERITS (accounts);
ALTER TABLE accounts_2011 OWNER TO postgres;

CREATE TABLE accounts_2010 (
    account_id integer NOT NULL,
    account_name character varying(100),
    dt_account date,
    CONSTRAINT accounts_2010_pkey PRIMARY KEY (account_id),
    CHECK (EXTRACT(YEAR FROM dt_account) = 2010)
) INHERITS (accounts);
ALTER TABLE accounts_2010 OWNER TO postgres;

-- Creating shadow table for primary and foreign keys
CREATE TABLE accounts_shadow (
    account_id integer NOT NULL,
    CONSTRAINT accounts_shadow_pkey PRIMARY KEY (account_id)
) WITH (OIDS=FALSE);
ALTER TABLE accounts_shadow OWNER TO postgres;

-- Creating parent billing table
CREATE TABLE billing (
    billing_id integer NOT NULL,
    billing_type boolean,
    dt_billing date,
    CONSTRAINT billing_pkey PRIMARY KEY (billing_id),
    CONSTRAINT bill_id_fk FOREIGN KEY (billing_id)
        REFERENCES accounts_shadow (account_id) MATCH SIMPLE
        ON UPDATE RESTRICT ON DELETE RESTRICT
) WITH (OIDS=FALSE);
ALTER TABLE billing OWNER TO postgres;

-- Creating child billing tables
CREATE TABLE billing_2012 (
    billing_id integer NOT NULL,
    billing_type boolean,
    dt_billing date,
    CONSTRAINT billing_2012_pkey PRIMARY KEY (billing_id),
    CONSTRAINT bill_id_2012_fk FOREIGN KEY (billing_id)
        REFERENCES accounts_shadow (account_id) MATCH SIMPLE
        ON UPDATE RESTRICT ON DELETE RESTRICT,
    CHECK (EXTRACT(YEAR FROM dt_billing) = 2012)
) INHERITS (billing);
ALTER TABLE billing_2012 OWNER TO postgres;

CREATE TABLE billing_2011 (
    billing_id integer NOT NULL,
    billing_type boolean,
    dt_billing date,
    CONSTRAINT billing_2011_pkey PRIMARY KEY (billing_id),
    CONSTRAINT bill_id_2011_fk FOREIGN KEY (billing_id)
        REFERENCES accounts_shadow (account_id) MATCH SIMPLE
        ON UPDATE RESTRICT ON DELETE RESTRICT,
    CHECK (EXTRACT(YEAR FROM dt_billing) = 2011)
) INHERITS (billing);
ALTER TABLE billing_2011 OWNER TO postgres;

CREATE TABLE billing_2010 (
    billing_id integer NOT NULL,
    billing_type boolean,
    dt_billing date,
    CONSTRAINT billing_2010_pkey PRIMARY KEY (billing_id),
    CONSTRAINT bill_id_2010_fk FOREIGN KEY (billing_id)
        REFERENCES accounts_shadow (account_id) MATCH SIMPLE
        ON UPDATE RESTRICT ON DELETE RESTRICT,
    CHECK (EXTRACT(YEAR FROM dt_billing) = 2010)
) INHERITS (billing);
ALTER TABLE billing_2010 OWNER TO postgres;
After creating the tables, it's time to create the functions and triggers like so:
-- Function for billing table
CREATE OR REPLACE FUNCTION billing_trig()
RETURNS TRIGGER AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    IF EXTRACT(YEAR FROM NEW.dt_billing) = 2012 THEN
      INSERT INTO billing_2012 VALUES(NEW.*);
    ELSIF EXTRACT(YEAR FROM NEW.dt_billing) = 2011 THEN
      INSERT INTO billing_2011 VALUES(NEW.*);
    ELSIF EXTRACT(YEAR FROM NEW.dt_billing) = 2010 THEN
      INSERT INTO billing_2010 VALUES(NEW.*);
    ELSE
      RAISE NOTICE 'Wrong Year in NEW.dt_billing Values %', NEW.dt_billing;
    END IF;
  END IF;
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER insert_billing_tg
  BEFORE INSERT ON billing
  FOR EACH ROW EXECUTE PROCEDURE billing_trig();

-- Function for accounts table
CREATE OR REPLACE FUNCTION accounts_trig()
RETURNS TRIGGER AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    IF EXTRACT(YEAR FROM NEW.dt_account) = 2012 THEN
      INSERT INTO accounts_2012 VALUES(NEW.*);
    ELSIF EXTRACT(YEAR FROM NEW.dt_account) = 2011 THEN
      INSERT INTO accounts_2011 VALUES(NEW.*);
    ELSIF EXTRACT(YEAR FROM NEW.dt_account) = 2010 THEN
      INSERT INTO accounts_2010 VALUES(NEW.*);
    ELSE
      RAISE NOTICE 'Wrong Year in NEW.dt_account Values %', NEW.dt_account;
    END IF;
  END IF;
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER insert_account_tg
  BEFORE INSERT ON accounts
  FOR EACH ROW EXECUTE PROCEDURE accounts_trig();
Finally, let's create a rule on the accounts table to insert new values into the shadow table:
CREATE RULE accounts_id_rule AS
    ON INSERT TO accounts
    DO INSERT INTO accounts_shadow VALUES (NEW.account_id);
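A quick smoke test of the whole setup; the values are hypothetical:

-- insert through the parent: the trigger routes the row, the rule fills the shadow table
INSERT INTO accounts VALUES (1, 'acme', DATE '2012-03-01');
SELECT * FROM ONLY accounts;   -- empty: the trigger returned NULL for the parent
SELECT * FROM accounts_2012;   -- the routed row lives here
SELECT * FROM accounts_shadow; -- populated by the rule
-- the billing foreign key now validates against accounts_shadow
INSERT INTO billing VALUES (1, true, DATE '2012-03-05');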