Bloats

Bloats can cause some serious performance impact on queries and you will see server load rise from what it was at an earlier point in time. The reason for bloats in a table/index could be due to various reasons. Updates rollbacks failed transactions etc. What is important here to identify these bloats and remove them before things get too bad.

I have simplified the process of identifying these bloats using the below script. The result set what you will see is:

relation | reltuples | relpages | otta | bloat | wastedpages | wastedbytes | pwastedbytes | relbytes | prelbytes
---------------------------------+------------+------------+------+-------+------------------+-----------------+-------------------+------------+-------------------------------------
account.activation_return_policy | 973111 | 164186 | 5240 | 31.3 | 158946 | 1302085632 | 1242 MB | 1345011712 | 1283 MB
billing.search_bill_info                   | 688832 | 6754 | 3372 | 2.0 | 3382 | 27705344 | 26 MB | 55328768 | 53 MB
(2 rows)

Here you see that there are two tables that have 31.3% and 2.0% bloats respectively. It is a good idea to run pgreorg or cluster command depending on the database availability requirement to remove these bloats.

NOTE: CLUSTER command will lock the table for the time it removes the bloat and is slower than pgreorg. Make a practical choice here by looking at the options you have. If you need more information on pgreorg , look though my post about pgreorg. 

The sql that I have used to get this information is given below. It is a good idea to put the sql in a crontab and probably send out the result set via email so that you are aware of the bloated tables and can work on reindexing /clustering or using pgreorg at a later time.

SELECT
schemaname||'.'||tablename as relation, reltuples::bigint, relpages::bigint, otta,
ROUND(CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS bloat,
CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,
CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,
pg_size_pretty((CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END)::bigint) AS pwastedbytes,
bs*relpages::bigint as relbytes,
pg_size_pretty((bs*relpages::bigint)::bigint) as prelbytes
FROM (
SELECT
schemaname, tablename, cc.reltuples, cc.relpages, bs,
CEIL((cc.reltuples*((datahdr+ma-
(CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)) AS otta
FROM (
SELECT
ma,bs,schemaname,tablename,
(datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
(maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
FROM (
SELECT
schemaname, tablename, hdr, ma, bs,
SUM((1-null_frac)*avg_width) AS datawidth,
MAX(null_frac) AS maxfracsum,
hdr+(
SELECT 1+count(*)/8
FROM pg_stats s2
WHERE null_frac<>0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
) AS nullhdr
FROM pg_stats s, (
SELECT
(SELECT current_setting('block_size')::numeric) AS bs,
CASE WHEN substring(v,12,3) IN ('8.0','8.1','8.2') THEN 27 ELSE 23 END AS hdr,
CASE WHEN v ~ 'mingw32' THEN 8 ELSE 4 END AS ma
FROM (SELECT version() AS v) AS foo
) AS constants
GROUP BY 1,2,3,4,5
) AS foo
) AS rs
JOIN pg_class cc ON cc.relname = rs.tablename and cc.relkind = 'r'
JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname <> 'information_schema'
) AS sml
WHERE sml.relpages - otta > 128
AND ROUND(CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages/otta::numeric END,1) > 1.2
AND CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END > 1024 * 100
ORDER BY wastedbytes DESC;
Advertisements

, , ,

  1. #1 by Anandan Kanagarajan on August 18, 2015 - 10:43 am

    Hi,

    I getting the following error when I am running this query on a postgres 9.3.5 cluster. Any help is appreciated.

    psql:/bkup/bt.sql:4: ERROR: syntax error at or near “;”
    LINE 4: CASE WHEN relpages <
    ^
    psql:/bkup/bt.sql:5: ERROR: syntax error at or near “otta”
    LINE 1: otta THEN 0 ELSE relpages::bigint – otta END AS wastedpages,
    ^
    psql:/bkup/bt.sql:42: ERROR: syntax error at or near “otta”
    LINE 1: otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS waste…
    ^
    psql:/bkup/bt.sql:43: ERROR: syntax error at or near “128”
    LINE 1: 128
    ^
    psql:/bkup/bt.sql:44: ERROR: syntax error at or near “1.2”
    LINE 1: 1.2
    ^
    psql:/bkup/bt.sql:44: ERROR: syntax error at or near “otta”
    LINE 1: otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END >
    ^
    psql:/bkup/bt.sql:45: ERROR: syntax error at or near “1024”
    LINE 1: 1024 * 100
    ^

    • #2 by Vikram Rai on August 19, 2015 - 1:50 am

      try this and let me know what you get. it works for 9.3.5

      SELECT
      current_database(), schemaname, tablename, /*reltuples::bigint, relpages::bigint, otta,*/
      ROUND(CASE WHEN otta=0 THEN 0.0 ELSE sml.relpages/otta::NUMERIC END,1) AS tbloat,
      CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::BIGINT END AS wastedbytes,
      iname, /*ituples::bigint, ipages::bigint, iotta,*/
      ROUND(CASE WHEN iotta=0 OR ipages=0 THEN 0.0 ELSE ipages/iotta::NUMERIC END,1) AS ibloat,
      CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes
      FROM (
      SELECT
      schemaname, tablename, cc.reltuples, cc.relpages, bs,
      CEIL((cc.reltuples*((datahdr+ma-
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::FLOAT)) AS otta,
      COALESCE(c2.relname,'?') AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,
      COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::FLOAT)),0) AS iotta — very rough approximation, assumes all cols
      FROM (
      SELECT
      ma,bs,schemaname,tablename,
      (datawidth+(hdr+ma-(CASE WHEN hdr%ma=0 THEN ma ELSE hdr%ma END)))::NUMERIC AS datahdr,
      (maxfracsum*(nullhdr+ma-(CASE WHEN nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
      FROM (
      SELECT
      schemaname, tablename, hdr, ma, bs,
      SUM((1-null_frac)*avg_width) AS datawidth,
      MAX(null_frac) AS maxfracsum,
      hdr+(
      SELECT 1+COUNT(*)/8
      FROM pg_stats s2
      WHERE null_frac0 AND s2.schemaname = s.schemaname AND s2.tablename = s.tablename
      ) AS nullhdr
      FROM pg_stats s, (
      SELECT
      (SELECT current_setting(‘block_size’)::NUMERIC) AS bs,
      CASE WHEN SUBSTRING(v,12,3) IN (‘8.0′,’8.1′,’8.2’) THEN 27 ELSE 23 END AS hdr,
      CASE WHEN v ~ ‘mingw32’ THEN 8 ELSE 4 END AS ma
      FROM (SELECT version() AS v) AS foo
      ) AS constants
      GROUP BY 1,2,3,4,5
      ) AS foo
      ) AS rs
      JOIN pg_class cc ON cc.relname = rs.tablename
      JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname = rs.schemaname AND nn.nspname ‘information_schema’
      LEFT JOIN pg_index i ON indrelid = cc.oid
      LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid
      ) AS sml
      ORDER BY wastedbytes DESC

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: