PostgreSQL Vacuum Script
PostgreSQL does have a built in auto vacuum , but sometimes you just want a small script that can be ran through Jenkins to perform the vacuum for you.
Wanted to share with you guys a small Python script I wrote that will perform a VACUUM VERBOSE ANALYZE on every table within a database.
You will need to get psycopg2 installed from PyPi first:
pip install psycopg2
At which point you should be able to use the below script with the correct environment variables to vacuum your database:
import os
import psycopg2
from datetime import datetime
dbname = os.environ.get('PG_DATABASE', '')
user = os.environ.get('PG_USER', '')
host = os.environ.get('PG_HOST', '')
password = os.environ.get('PG_PASS', '')
schema = os.environ.get('PG_SCHEMA', '')
class Vaccum(object):
def __init__(self):
"""
Connect to Postgre database.
"""
c = "dbname='%s' user='%s' host='%s' password='%s'"
try:
self.conn = psycopg2.connect(c % (dbname, user, host, password))
except:
raise Exception('Unable to connect to PostgreSQL.')
def get_tables(self):
"""
Get all tables in database's schema.
"""
query = """
SELECT table_name
FROM information_schema.tables
WHERE table_schema = '%s'
""" % schema
cur = self.conn.cursor()
cur.execute(query)
for table in cur.fetchall():
yield(table[0])
def vaccum(self, table):
"""
Run Vacuum on a given table.
"""
query = "VACUUM VERBOSE ANALYZE %s" % table
# VACUUM can not run in a transaction block,
# which psycopg2 uses by default.
# http://bit.ly/1OUbYB3
isolation_level = self.conn.isolation_level
self.conn.set_isolation_level(0)
cur = self.conn.cursor()
cur.execute(query)
# Set our isolation_level back to normal
self.conn.set_isolation_level(isolation_level)
return self.conn.notices
if __name__ == '__main__':
v = Vaccum()
for table in v.get_tables():
# simple verbose printout for VACUUM process.
now = datetime.utcnow()
verbose = '[%s] VACUUM VERBOSE ANALYZE %s' % (now, table)
print '=' * len(verbose)
print verbose
# print out the VACUUM VERBOSE results
# using formating.
notices = v.vaccum(table)
for notice in notices:
print notice
Usage looks like the following:
## set our environment variables for our database.
export PG_DATABASE=project
export PG_USER=user
export PG_PASS=password
export PG_SCHEMA=public
export PG_HOST=localhost
## execute the script to vacuum the database
$ python pg_vacuum.p
You should then get verbose output from the vacuum:
=====================================================================
[2016-03-08 18:56:28.332974] VACUUM VERBOSE ANALYZE django_migrations
INFO: vacuuming "public.django_migrations"
INFO: index "django_migrations_pkey" now contains 22 row versions in 2 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "django_migrations": found 0 removable, 22 nonremovable row
versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: vacuuming "pg_toast.pg_toast_95319"
INFO: index "pg_toast_95319_index" now contains 0 row versions in 1 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_95319": found 0 removable, 0 nonremovable row versions
in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.django_migrations"
INFO: "django_migrations": scanned 1 of 1 pages, containing 22 live rows
and 0 dead rows; 22 rows in sample, 22 estimated total rows