Telegraf and Missing CPU Interrupts

As I’ve been playing around with Telegraf and Grafana , I’ve noticed CPU interrupts and context switches are not apart of the standard metric gathering.

We know vmstat shows this information, and can be shown it in a easy processable list form:

$ vmstat -s
 1016888 K total memory
 497920 K used memory
 184412 K active memory
 173296 K inactive memory
 518968 K free memory
 70276 K buffer memory
 86416 K swap cache
 522236 K total swap
 88460 K used swap
 433776 K free swap
 18175 non-nice user cpu ticks
 0 nice user cpu ticks
 17799 system cpu ticks
 172172 idle cpu ticks
 7214 IO-wait cpu ticks
 0 IRQ cpu ticks
 2412 softirq cpu ticks
 0 stolen cpu ticks
 227755 pages paged in
 986944 pages paged out
 2353 pages swapped in
 121572 pages swapped out
 458705 interrupts
 1467529 CPU context switches
 1461773910 boot time
 3456 forks

I decided to hack together a little exec plugin for Telegraf using a Python script, running the script will get you standard out JSON:

$ /cpu.py
{
 "CPU context switches": 1487011,
 "IO-wait cpu ticks": 7375,
 "IRQ cpu ticks": 0,
 "K active memory": 190392,
 "K buffer memory": 70756,
 "K free memory": 512988,
 "K free swap": 433776,
 "K inactive memory": 173296,
 "K swap cache": 86560,
 "K total memory": 1016888,
 "K total swap": 522236,
 "K used memory": 503900,
 "K used swap": 88460,
 "boot time": 1461773910,
 "forks": 3607,
 "idle cpu ticks": 183663,
 "interrupts": 465536,
 "nice user cpu ticks": 0,
 "non-nice user cpu ticks": 18424,
 "pages paged in": 227767,
 "pages paged out": 990560,
 "pages swapped in": 2353,
 "pages swapped out": 121572,
 "softirq cpu ticks": 2418,
 "stolen cpu ticks": 0,
 "system cpu ticks": 17902
}

The script is a little hacky, but gets the job done for these test: cpu.py

#!/usr/bin/env python
import commands
import json

output = {}
raw = commands.getstatusoutput('vmstat -s')
raw = raw[1].split('\n')

for row in raw:
    value = int(row.split()[0])
    key = ' '.join(row.split()[1:])
    output[key] = value

print json.dumps(output)

All that is left is to add this script to Telegraf as a plugin:

$ cat /etc/telegraf/telegraf.d/cpu.conf
[[inputs.exec]]
 command = "/cpu.py"
 data_format = "json"
 name_suffix = "_cpu"
 interval = "5s"

And now we have access to all the counters shown in vmstat:

Screen Shot 2016-04-27 at 9.59.11 AM