Detect and Fix HAProxy+Apache+Passenger Queue Backlogs
To inspire hard work, some young men hang a poster on their wall that includes: (1) an exotic sports car (2) a
I don't like running errands because I don't like waiting in lines. My nightmare: having to
- Finding a parking spot
- Getting a shopping cart
- Checking out
Modern web apps face the same queuing issues serving web requests under heavy traffic. For example, a web request served by Scout passes through several queues.
That's Apache (for SSL processing) to HAProxy on the load balancer, then Apache to Passenger to the Rails app on a web server.
A request can get stuck in any of those five spots. The worst part about queues? Time in
Now, before you start worrying about queues, take a deep breath. First, each of these systems
Third (and most importantly), each of these systems handles queues in remarkably similar ways. Understanding some basic queuing concepts will go a long way. Let's take a look at some basics and then specific examples for Apache, HAProxy, and Passenger.
Global queues prevent large outliers
If you're shopping during
You don't need to do anything to enable global queuing for Apache and HAProxy. For Passenger, it depends on your version: according to the Passenger docs, the default value for
Beware of cascading backlogs
You're opening a hot new club in the warehouse area of the city because all hot new clubs open in warehouse areas. You tell the bouncer to keep the line outside the door long to make your club look busy. On the inside though, things are calm: there's no wait at the bar.
However, your burly bouncer is a teddy bear and lets everyone inside so they don't have to wait in the cold. Suddenly, there's no line, but the bartenders are overwhelmed. The backlog was just shifted to another queue.
It's the same with your web app: increasing the number of max HAProxy connections will push more traffic to your web servers. This may cause backlogs on the web servers. It may cause higher database activity. You'll need to closely monitor the performance of your app when you open the floodgates on your load balancer.
Faster app performance = fewer backlogs
Busy lunch spots know the faster they
More capacity = more memory
Brick-and-mortar businesses need a healthy balance of staff vs. customers. Too much staff during a non-peak time wastes money; too little during a busy time means upset customers.
It's the same for your Rails app. Increasing your capacity has real-world costs: memory. The biggest consumer of memory will likely be Passenger as it serves the actual Rails app (which may be hundreds of MB in size).
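To make the memory trade-off concrete, here is a rough back-of-the-envelope sketch in Python. The function name and all of the numbers are hypothetical, not measurements from our servers: it assumes you know roughly how much memory one app process uses and how much you want to hold back for everything else on the box.

```python
def max_app_processes(total_mb, reserved_mb, per_process_mb):
    """Estimate how many app processes fit in memory.

    total_mb:       RAM on the web server
    reserved_mb:    memory held back for the OS, Apache, etc. (an assumption)
    per_process_mb: approximate footprint of one Rails process (an assumption)
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    usable_mb = total_mb - reserved_mb
    return max(usable_mb // per_process_mb, 0)


# Hypothetical example: 4 GB server, 1 GB reserved, 200 MB per Rails process.
print(max_app_processes(4096, 1024, 200))  # 15
```

Treat the result as a ceiling to test against, not a setting to copy: real process sizes grow over time, so leave yourself some slack.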
Requests will back up in Apache if the maximum allowed connections
Watch this line:
If idle workers
The command above refreshes the Apache Server Status every second. We
To enable the status page, you’ll need to use
Requests will back up in HAProxy if:
- The global number of maximum connections is exceeded
- The max connections allowed for a specific backend is exceeded
The easiest way to check for an HAProxy backlog is to examine the output of the HAProxy stats page. There are four important metrics to watch.
At the top of the page, maxconn shows the maximum number of connections HAProxy will handle, and current conns shows the number of connections HAProxy is handling now:
In this case, we're using 20 of 4096 available connections. There is plenty of headroom.
For each backend server, look at the
Limit columns under the
In this case, the maximum number of concurrent sessions a single web server handled was 561. This is a bit more than half of the 1,000 connection limit specified in the adjacent column.
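If you would rather not eyeball the page, HAProxy can also serve the same stats as CSV (typically by appending ;csv to the stats URI). The Python sketch below is one possible way to compute headroom from that CSV. It assumes the standard column names (pxname, svname, qcur, scur, slim), and the sample rows are made up for illustration, borrowing the limits mentioned above.

```python
import csv
import io


def haproxy_usage(csv_text):
    """Return one dict per stats row that has a session limit configured."""
    # The header line starts with "# "; strip it so the column names are clean.
    text = csv_text.strip().lstrip("# ")
    usage = []
    for row in csv.DictReader(io.StringIO(text)):
        slim = row.get("slim") or ""
        if not slim or int(slim) <= 0:
            continue  # no limit on this row, so there is no headroom to compute
        current = int(row["scur"])
        limit = int(slim)
        usage.append({
            "proxy": row["pxname"],
            "server": row["svname"],
            "queued": int(row["qcur"] or 0),  # empty for frontend rows
            "current": current,
            "limit": limit,
            "utilization": current / limit,
        })
    return usage


# Made-up sample data in the shape of HAProxy's CSV output.
SAMPLE = """# pxname,svname,qcur,qmax,scur,smax,slim,stot,
www,FRONTEND,,,20,561,4096,1000,
app,web1,0,3,12,561,1000,900,
app,web2,5,9,1000,1000,1000,950,
"""

for entry in haproxy_usage(SAMPLE):
    if entry["queued"] > 0 or entry["utilization"] >= 0.8:
        print(entry["proxy"], entry["server"], "is backing up:", entry)
```

In this sample only web2 is flagged: it is at its session limit and has requests queued.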
Enabling the stats page
stats enable
stats uri /haproxy?stats
stats auth administrator:PASS
To modify the global maximum number of connections, change
server web1 web1.host.com:80 maxconn 1000
Phusion Passenger serves our Rails and Sinatra apps. To look for a queue backlog we use the following command:
This displays information on each Passenger process. If the Sessions count is high for a process, it has a queue of requests waiting to be processed:
There is a 20-session backlog for the process above. It looks like we need to increase the number of Passenger processes.
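The same check can be scripted. The Python sketch below scans status output for processes with a high Sessions count. The line layout it expects ("PID: ... Sessions: ...") is an assumption based on older Passenger releases, and the sample text and threshold are made up, so adjust the pattern to whatever your version prints.

```python
import re

# Assumed per-process line format, e.g.:
#   PID: 12345   Sessions: 20   Processed: 834   Uptime: 2h 3m 1s
PROCESS_LINE = re.compile(r"PID:\s*(\d+)\s+Sessions:\s*(\d+)")


def backlogged_processes(status_text, threshold=5):
    """Map PID -> session count for processes above the threshold."""
    backlog = {}
    for match in PROCESS_LINE.finditer(status_text):
        pid, sessions = int(match.group(1)), int(match.group(2))
        if sessions > threshold:
            backlog[pid] = sessions
    return backlog


# Made-up sample output for illustration only.
SAMPLE_STATUS = """
/var/www/app:
  PID: 12345   Sessions: 20   Processed: 834   Uptime: 2h 3m 1s
  PID: 12346   Sessions: 1    Processed: 790   Uptime: 2h 3m 1s
"""

print(backlogged_processes(SAMPLE_STATUS))  # {12345: 20}
```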
As I mentioned earlier, Passenger is likely to be the biggest consumer of memory in your web stack, so it requires some special attention. For instructions on tuning Passenger, take a look at our previous post, "Production Rails Tuning with Passenger: PassengerMaxProcesses".
A big challenge with ongoing queue monitoring is sampling: the commands used to watch for queue backups in Apache, HAProxy, and Passenger all show the current status. If you aren’t watching these over time, you may miss a backup.
The easiest solution we’ve found is monitoring request times at the highest point in the stack. For us, this is Apache on our load balancer. Anything that happens beneath Apache (HAProxy, or on an individual web server) will show up here. If we’re seeing larger request times we can dig deeper into the stack.
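As a rough illustration of watching request times at the top of the stack, the Python sketch below averages Apache request times per minute and flags the slow minutes. It assumes your LogFormat ends with %D (request time in microseconds), which is not part of the default combined format. The function name, threshold, and sample lines are hypothetical.

```python
import re
from collections import defaultdict

# Captures the timestamp down to the minute and the trailing %D value.
LOG_LINE = re.compile(
    r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [^\]]+\].* (\d+)$"
)


def slow_minutes(lines, threshold_ms):
    """Return {minute: average request time in ms} for minutes above the threshold."""
    totals = defaultdict(lambda: [0, 0])  # minute -> [total microseconds, count]
    for line in lines:
        match = LOG_LINE.search(line.strip())
        if not match:
            continue  # skip lines that don't fit the assumed format
        minute, micros = match.group(1), int(match.group(2))
        totals[minute][0] += micros
        totals[minute][1] += 1
    slow = {}
    for minute, (total, count) in totals.items():
        avg_ms = total / count / 1000
        if avg_ms > threshold_ms:
            slow[minute] = avg_ms
    return slow


# Made-up log lines: the last field is %D in microseconds.
SAMPLE_LOG = [
    '1.2.3.4 - - [10/Oct/2011:13:55:36 -0700] "GET / HTTP/1.1" 200 2326 120000',
    '1.2.3.4 - - [10/Oct/2011:13:55:40 -0700] "GET / HTTP/1.1" 200 2326 80000',
    '1.2.3.4 - - [10/Oct/2011:13:56:02 -0700] "GET / HTTP/1.1" 200 2326 2400000',
    '1.2.3.4 - - [10/Oct/2011:13:56:09 -0700] "GET / HTTP/1.1" 200 2326 1600000',
]

print(slow_minutes(SAMPLE_LOG, threshold_ms=500))
```

An average hides outliers, so a percentile would be a sensible refinement. The point is simply that because every request is logged, a backup that a spot check of the status pages would miss still shows up here.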
We use Scout's Apache Log Analyzer and Rails Monitoring plugins to monitor request times. A spike in Apache request times can indicate a queue backlog:
We use Scout’s HAProxy Monitoring plugin to monitor HAProxy.
For more, see our previous post, "Is your Rails app under-provisioned?"
- Configuration Documentation
- HAProxy Monitoring Scout Plugin
- Phusion Passenger