If I were teaching Server Health 101, I'd start with four key metrics:
- CPU Usage
- Memory Usage
- Network I/O
- Disk Utilization
The approach for fetching these metrics on Linux hosts is tried-and-true (hint: look in the /proc folder). However, I was curious about Docker containers: where do I access these critical metrics? What tools does Docker provide? Are there any gotchas?
Read on for what I found.
Where are container metrics accessed on the host?
Under the hood, Docker containers (and all Linux-based containers) are built on control groups. Control groups (cgroups) isolate the resource usage (CPU, memory, disk I/O, and network) of a collection of processes.
Remember how I mentioned that system resource metrics for hosts are found under the /proc folder? Well, for cgroups, they are found under /sys/fs/cgroup.
For example, if I wanted to fetch the memory metrics for a running container, I'd first grab the long-form container ID (docker ps --no-trunc), then look for its metrics under /sys/fs/cgroup/memory/docker/(id).
Here’s some truncated example output for a cgroup’s memory metrics:
$ sudo cat /sys/fs/cgroup/memory/docker/10b0fb69677ef5e42cd8dc817b452e179104145a0216b6cb010c8ac0a9351208/memory.stat
...
total_cache 110592
total_rss 211771392
total_rss_huge 174063616
...
total_rss is the memory in use, so my container is using about 202 MiB of memory.
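If you'd rather script this than cat the file by hand, here's a minimal sketch (my own, not part of the post's tooling) that parses memory.stat and reports total_rss. The cgroup path and the container-ID argument are assumptions based on the layout above:
#!/usr/bin/env ruby
# Minimal sketch: read a container's memory.stat from the cgroup pseudo
# file system and print total_rss in MiB. Assumes the cgroup v1 layout
# shown above; pass the long-form container ID as the first argument.
container_id = ARGV[0]
stat_path = "/sys/fs/cgroup/memory/docker/#{container_id}/memory.stat"

stats = {}
File.readlines(stat_path).each do |line|
  key, value = line.split
  stats[key] = value.to_i
end

puts "total_rss: %.1f MiB" % (stats["total_rss"] / (1024.0 * 1024))
You'll likely need to run it with sudo, just like the cat example above.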
You can fetch memory, CPU, and disk I/O metrics from the /sys/fs/cgroup pseudo files, but what about network activity? Well, network metrics aren't accessible under this pseudo file system. Getting access to them is, well, more complicated.
Rather than fetching network metrics through those contortions, let's jump to the Docker Stats API. It provides an easier way.
The Docker Stats API
Starting with Docker 1.5, there are two handy ways to view stats on your containers. Both are much easier than reading from the /sys/fs/cgroup pseudo files or jumping through hoops to read network metrics.
Command Line
docker stats, given the names or IDs of containers, renders a handy top-like display of key metrics. For example, if I'm running an Elasticsearch container named (you guessed it) "elasticsearch", I'd run docker stats elasticsearch and see a constantly updating display in my terminal:
CONTAINER       CPU %     MEM USAGE/LIMIT        MEM %     NET I/O
elasticsearch   0.75%     202.2 MiB/1.958 GiB    10.09%    9.937 KiB/10.11 KiB
Boom! There are CPU, memory, and network metrics. The only thing missing is disk activity.
Let's look at the Docker Remote API for that.
API Endpoint
Many more metrics are available by querying the Docker Remote API directly via GET /containers/(id)/stats. This handy endpoint encapsulates all of the key health metrics (CPU, memory, disk I/O, and network I/O) for a single container.
How about some example code for fetching disk metrics, since those aren't rendered by docker stats? I'll use the Ruby Excon gem to connect to the Docker socket, fetch metrics for a container, and exit:
#!/usr/bin/env ruby
# Pass the container name as an argument
require 'excon'
require 'json'

# The stats endpoint streams JSON; print the disk I/O counters from the
# first chunk and exit.
def streamer
  lambda do |chunk, remaining_bytes, total_bytes|
    stats = JSON.parse(chunk)
    puts stats["blkio_stats"]["io_service_bytes_recursive"]
    exit
  end
end

connection = Excon.new('unix:///', socket: '/var/run/docker.sock')
connection.request(method: :get, path: "/containers/#{ARGV[0]}/stats", response_block: streamer)
Looking at the disk activity, it appears our short-lived container has only read 3.62 MB of data.
Now, a couple of gotchas here:
- Streaming: When you connect to the Docker API stats endpoint, it starts streaming stats every second. Don't fret, this is going to disappear: in the near future, you'll be able to fetch a non-streaming snapshot of container stats via GET /containers/(id)/stats?stream=false.
- Counters, not rates: Many of the metrics you fetch via the Docker API, including the disk I/O metrics, are incrementing counters rather than rates of activity (e.g., disk writes/min). In a full implementation of Docker stats monitoring, you'd likely want to convert these to rates. I've left that to you :) (though a rough sketch of one approach follows below).
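To make that second gotcha concrete, here's a rough sketch of one way to turn the streaming counters into per-second rates by diffing consecutive samples. It reuses the Excon pattern from above; the "network" and "rx_bytes" field names are assumptions about the stats payload and may differ in your Docker version:
#!/usr/bin/env ruby
# Rough sketch: convert cumulative counters from the streaming stats
# endpoint into per-second rates by diffing consecutive samples.
# The "network"/"rx_bytes" field names are assumptions; pass the
# container name or ID as the first argument.
require 'excon'
require 'json'

previous_rx = nil

rate_calculator = lambda do |chunk, _remaining_bytes, _total_bytes|
  stats = JSON.parse(chunk)
  current_rx = stats["network"]["rx_bytes"]

  # The stream emits roughly one sample per second, so the difference
  # between consecutive counter readings approximates bytes/second.
  puts "rx bytes/sec: #{current_rx - previous_rx}" unless previous_rx.nil?
  previous_rx = current_rx
end

connection = Excon.new('unix:///', socket: '/var/run/docker.sock')
connection.request(method: :get,
                   path: "/containers/#{ARGV[0]}/stats",
                   response_block: rate_calculator)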
A sample monitoring script
I threw together a sample script that reports metrics via everyone's favorite metrics protocol, StatsD, so you can easily view charts of container metrics. You'll need to have Ruby installed on the host. (For a rough idea of what the reporting loop looks like, see the sketch after the steps below.)
On your host running Docker:
1. Grab the script:
   wget https://gist.githubusercontent.com/blurredbits/1f716615998bb44d0be3/raw/5921aa6870a69beb69f4e5327dbe9b5b4a32fe41/container_monitor.rb
2. I'll use Scout to quickly get up and running, as it supports StatsD:
   curl -Sso scout_install.sh https://scoutapm.com/scout_install.sh; sudo /bin/bash ./scout_install.sh YOUR_ACCOUNT_KEY_HERE
3. Make the script executable:
   chmod +x container_monitor.rb
4. Run the script:
   ./container_monitor.rb
5. Start a container:
   docker run -it ubuntu /bin/bash
6. Execute a few commands within the container, and watch the metrics flow in Scout!
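If you're curious what a reporting loop like this might look like inside, here's a minimal sketch along the same lines (not the gist's actual code): it consumes the streaming stats for one container and emits StatsD gauges over UDP. The metric names, the localhost:8125 StatsD address, and the "memory_stats"/"network" fields are assumptions:
#!/usr/bin/env ruby
# Minimal sketch (not the gist's actual code): stream stats for one
# container and emit StatsD gauges over UDP. Metric names, the
# localhost:8125 StatsD address, and the "memory_stats"/"network"
# field names are assumptions.
require 'excon'
require 'json'
require 'socket'

STATSD = UDPSocket.new
STATSD_HOST = 'localhost'
STATSD_PORT = 8125

# StatsD gauge wire format: "<name>:<value>|g"
def gauge(name, value)
  STATSD.send("#{name}:#{value}|g", 0, STATSD_HOST, STATSD_PORT)
end

reporter = lambda do |chunk, _remaining_bytes, _total_bytes|
  stats = JSON.parse(chunk)
  gauge("docker.#{ARGV[0]}.memory.usage", stats["memory_stats"]["usage"])
  gauge("docker.#{ARGV[0]}.network.rx_bytes", stats["network"]["rx_bytes"])
end

connection = Excon.new('unix:///', socket: '/var/run/docker.sock')
connection.request(method: :get,
                   path: "/containers/#{ARGV[0]}/stats",
                   response_block: reporter)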
TL;DR
Whether you use Docker or another Linux-based container system, containers rely on cgroups under the hood. Memory, CPU, and disk I/O metrics are accessible in the pseudo file system under /sys/fs/cgroup. Grabbing network stats is more involved.
Docker conveniently encapsulates fetching container stats via the docker stats
command. You can also fetch more detailed streaming metrics via the remote API.
Also see
- Monitoring Docker Events
- docker-scout
- Implementing Docker event monitoring from scratch
- Scout joins Docker Technology Partner Program
For more Docker monitoring insights, follow us on Twitter.