PostgreSQL does have a built in auto vacuum , but sometimes you just want a small script that can be ran through Jenkins to perform the vacuum for you.
Wanted to share with you guys a small Python script I wrote that will perform a VACUUM VERBOSE ANALYZE on every table within a database.
You will need to get psycopg2 installed from PyPi first:
pip install psycopg2
At which point you should be able to use the below script with the correct environment variables to vacuum your database:
import os import psycopg2 from datetime import datetime dbname = os.environ.get('PG_DATABASE', '') user = os.environ.get('PG_USER', '') host = os.environ.get('PG_HOST', '') password = os.environ.get('PG_PASS', '') schema = os.environ.get('PG_SCHEMA', '') class Vaccum(object): def __init__(self): """ Connect to Postgre database. """ c = "dbname='%s' user='%s' host='%s' password='%s'" try: self.conn = psycopg2.connect(c % (dbname, user, host, password)) except: raise Exception('Unable to connect to PostgreSQL.') def get_tables(self): """ Get all tables in database's schema. """ query = """ SELECT table_name FROM information_schema.tables WHERE table_schema = '%s' """ % schema cur = self.conn.cursor() cur.execute(query) for table in cur.fetchall(): yield(table) def vaccum(self, table): """ Run Vacuum on a given table. """ query = "VACUUM VERBOSE ANALYZE %s" % table # VACUUM can not run in a transaction block, # which psycopg2 uses by default. # http://bit.ly/1OUbYB3 isolation_level = self.conn.isolation_level self.conn.set_isolation_level(0) cur = self.conn.cursor() cur.execute(query) # Set our isolation_level back to normal self.conn.set_isolation_level(isolation_level) return self.conn.notices if __name__ == '__main__': v = Vaccum() for table in v.get_tables(): # simple verbose printout for VACUUM process. now = datetime.utcnow() verbose = '[%s] VACUUM VERBOSE ANALYZE %s' % (now, table) print '=' * len(verbose) print verbose # print out the VACUUM VERBOSE results # using formating. notices = v.vaccum(table) for notice in notices: print notice
Usage looks like the following:
## set our environment variables for our database. export PG_DATABASE=project export PG_USER=user export PG_PASS=password export PG_SCHEMA=public export PG_HOST=localhost ## execute the script to vacuum the database $ python pg_vacuum.p
You should then get verbose output from the vacuum:
===================================================================== [2016-03-08 18:56:28.332974] VACUUM VERBOSE ANALYZE django_migrations INFO: vacuuming "public.django_migrations" INFO: index "django_migrations_pkey" now contains 22 row versions in 2 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: "django_migrations": found 0 removable, 22 nonremovable row versions in 1 out of 1 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. Skipped 0 pages due to buffer pins. 0 pages are entirely empty. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: vacuuming "pg_toast.pg_toast_95319" INFO: index "pg_toast_95319_index" now contains 0 row versions in 1 pages DETAIL: 0 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: "pg_toast_95319": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages DETAIL: 0 dead row versions cannot be removed yet. There were 0 unused item pointers. Skipped 0 pages due to buffer pins. 0 pages are entirely empty. CPU 0.00s/0.00u sec elapsed 0.00 sec. INFO: analyzing "public.django_migrations" INFO: "django_migrations": scanned 1 of 1 pages, containing 22 live rows and 0 dead rows; 22 rows in sample, 22 estimated total rows