August 12, 2015
Customers telling me our app is slow? I'm looking at a response time graph.
On the front page of Hacker News? I'm looking at requests per second and response time on a graph.
Lots of things going wrong? Show me ALL the metrics.
The challenges with building a one-page dashboard of app health?
We track eight key health metrics for our applications:
So, what are some approaches to help me get an at-a-glance view of app health?
Let's start simple: we'll put a timeseries chart for each metric on a page, one per row, each 350px in height (including margin).
That's 2,800 pixels in height. My MacBook has a screen height of 800 pixels, so unless I build a script to continuously scroll up and down and take a hit of Dramamine, I'm not going to view all the metrics at once.
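The arithmetic is simple, but a quick sketch makes the gap obvious. The constants below come from the numbers above; the function names are only illustrative.

```typescript
// Numbers taken from the post: 8 metrics, 350px per chart row, 800px screen.
const METRIC_COUNT = 8;
const ROW_HEIGHT_PX = 350; // includes margin
const SCREEN_HEIGHT_PX = 800;

// Total page height when every metric gets its own full-size row.
function stackedHeight(metricCount: number, rowHeightPx: number): number {
  return metricCount * rowHeightPx;
}

// Number of complete rows visible without scrolling.
function visibleRows(screenHeightPx: number, rowHeightPx: number): number {
  return Math.floor(screenHeightPx / rowHeightPx);
}

const totalHeightPx = stackedHeight(METRIC_COUNT, ROW_HEIGHT_PX); // 2800
const rowsOnScreen = visibleRows(SCREEN_HEIGHT_PX, ROW_HEIGHT_PX); // 2
```

Only two of the eight charts fit on screen at once, so most of the page is always out of view.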
A scaled representation is below:
A good chunk of the time, I care about trends, not absolutes. I care if our response time is increasing, but not the absolute value of it (as long as it is acceptable).
Sparklines are great for this - you can grasp where a metric is heading with little space:
If I had a 140px x 20px sparkline for each of our 8 metrics, I could actually fit them in a single row across my browser window. I've got at-a-glance app health!
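For a sense of how little a sparkline needs, here is a minimal sketch that scales a series into a 140px x 20px box and emits an inline SVG. It is not how our dashboard renders them; `sparklinePoints` and `sparklineSvg` are hypothetical helpers written for illustration.

```typescript
// Map a series of values onto a width x height box and return the
// "x,y x,y ..." string used by an SVG <polyline points="..."> attribute.
function sparklinePoints(values: number[], width = 140, height = 20): string {
  if (values.length === 0) return "";
  const min = Math.min(...values);
  const max = Math.max(...values);
  const range = max - min || 1; // flat series: avoid dividing by zero
  const stepX = values.length > 1 ? width / (values.length - 1) : 0;
  return values
    .map((v, i) => {
      const x = i * stepX;
      // SVG's y axis points down, so invert to draw larger values higher.
      const y = height - ((v - min) / range) * height;
      return `${x.toFixed(1)},${y.toFixed(1)}`;
    })
    .join(" ");
}

// Wrap the points in a minimal SVG element with no axes or labels.
function sparklineSvg(values: number[], width = 140, height = 20): string {
  const points = sparklinePoints(values, width, height);
  return (
    `<svg width="${width}" height="${height}" xmlns="http://www.w3.org/2000/svg">` +
    `<polyline fill="none" stroke="currentColor" points="${points}"/></svg>`
  );
}
```

Because the values are scaled to each series' own minimum and maximum, the line shows the trend rather than the absolute level. Eight of these side by side take 1,120px of width before any spacing.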
...but wait. I often need to interact with a metric on a chart to look at values in more detail, and sparklines are too small to support that kind of interaction well. It's also very helpful to view some metrics with stacking (e.g., response time by category), and that won't fit well in a sparkline.
Where we're at:
As Comcast says, bundle it!
The display I settled on: a large chart above a row of sparklines for each key metric:
The selected view is stored in the URL with pushState(). If I want to share a view with a colleague, they'll see that state (e.g., the error rate in the large chart).
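A minimal sketch of that idea, assuming the only state worth sharing is which metric sits in the large chart. The `ViewState` shape, the `metric` query parameter, and the metric names are hypothetical, not our actual URL scheme.

```typescript
// Hypothetical view state: which metric is shown in the large chart.
interface ViewState {
  metric: string;
}

// Serialize the state into a query string such as "?metric=error_rate".
function toQuery(state: ViewState): string {
  return "?metric=" + encodeURIComponent(state.metric);
}

// Parse the state back out of a query string; fall back to a default metric.
function fromQuery(search: string, fallback: string): ViewState {
  const pairs = search.replace(/^\?/, "").split("&");
  for (const pair of pairs) {
    const [key, value = ""] = pair.split("=");
    if (key === "metric" && value !== "") {
      return { metric: decodeURIComponent(value) };
    }
  }
  return { metric: fallback };
}

// In a browser, selecting a sparkline could then update the address bar:
//   history.pushState(state, "", toQuery(state));
// and a page load could restore the view with:
//   fromQuery(location.search, "response_time");
```

Keeping the serialization in two small pure functions means the link a colleague opens and the click that produced it go through the same code path.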
Here's a sample interaction. Note that the blank metrics aren't available yet, but will be during our beta period:
Visit scoutapm.com to sign up for early access to application monitoring.
Follow us on Twitter for more as we build the app monitoring solution we've always wanted to use.