Scaling tip: Tailor your HAProxy queues to the workloads you expect
The Official Posterous Tech Blog
Say you have a large bucket, 10 pounds of sand and 3 pounds of rocks. If you put the sand in first, you won’t have space for the rocks. But if you put the rocks in first, you’ll be able to fit the sand in the gaps between the rocks. That’s similar to how the typical load on a web server behaves for different kinds of requests.
You’ve got some requests that are fast, sometimes 5ms or less. They’re simple redirects. They’re lightweight. You can plow through them fast. That’s the sand.
You’ve got some slow requests — those are the rocks. Those hit your databases, or are infrequently used API requests.
If you try to put them together in the wrong order, you'll get a mess. What should be fast ends up slow. What's already slow ends up extra slow.
That's why you should use HAProxy to separate them out. HAProxy is a software load balancer that gives you great reporting and fine-grained control over which requests go to which servers, and how many connections to allow at any given time (minconn / maxconn).
This is an invaluable feather in your cap if you have to run a large production website. Here's what our HAProxy setup looks like. We have three types of requests: normal Rails requests, getfile requests (the sand: fast S3 redirects), and RSS. Every so often we get inundated by bots requesting RSS feeds, and separating our traffic out prevents these bots from overwhelming the rest of the site.
We raise the number of connections the getfile backend will take (minconn 60) because these requests are absurdly fast and they're the sand: they should be able to get through at all times. Before we added a separate setting for these requests, they'd get backlogged behind the huge Rails requests. It costs nothing for a Rails request (200ms) to wait for a 5ms getfile redirect, but the reverse is not true.
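To make that cost concrete, here's a quick back-of-envelope simulation. This isn't from our stack; it's just a single-worker FIFO model using the 200ms and 5ms service times mentioned above, with a burst of ten requests of each kind:

```python
# Head-of-line blocking: fast requests stuck behind slow ones in a
# shared FIFO queue, vs. getting their own queue. All timings in ms;
# the 200ms/5ms figures come from the post, the rest is illustrative.

def completion_times(service_times_ms):
    """FIFO completion time of each request on a single worker."""
    t, out = 0, []
    for s in service_times_ms:
        t += s
        out.append(t)
    return out

def avg(xs):
    return sum(xs) / len(xs)

# Shared queue: 10 slow Rails requests queued ahead of 10 fast redirects.
shared = completion_times([200] * 10 + [5] * 10)
fast_shared = shared[10:]  # completion times of the getfile requests

# Separate queues: the fast requests get their own worker.
fast_separate = completion_times([5] * 10)

print(avg(fast_shared))    # 2027.5 -- the sand is stuck behind the rocks
print(avg(fast_separate))  # 27.5   -- the sand flows freely
```

Same requests, same total work; only the queueing changes, and the fast requests go from roughly two seconds of average latency to under 30ms.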
Here's our HAProxy config (irrelevant details omitted) that outlines how we put separate HAProxy queues to work.
global
    daemon
    pidfile /var/run/haproxy.pid
    user haproxy
    group haproxy

backend getfile
    server app01 172.32.1.1:80 minconn 60 maxconn 120 check inter 10s rise 1 fall 3 weight 1
    server app02 172.32.1.2:80 minconn 60 maxconn 120 check inter 10s rise 1 fall 3 weight 1
    server app03 172.32.1.3:80 minconn 60 maxconn 120 check inter 10s rise 1 fall 3 weight 1
    server app04 172.32.1.4:80 minconn 60 maxconn 120 check inter 10s rise 1 fall 3 weight 1
    server app05 172.32.1.5:80 minconn 60 maxconn 120 check inter 10s rise 1 fall 3 weight 1
    server app06 172.32.1.6:80 minconn 60 maxconn 120 check inter 10s rise 1 fall 3 weight 1
    server app07 172.32.1.7:80 minconn 60 maxconn 120 check inter 10s rise 1 fall 3 weight 1

backend rss
    server app08 172.32.1.8:80 minconn 20 maxconn 40 check inter 10s rise 1 fall 3 weight 1

backend posterous_default
    server app01 172.32.1.1:80 minconn 20 maxconn 40 check inter 10s rise 1 fall 3 weight 1
    server app02 172.32.1.2:80 minconn 20 maxconn 40 check inter 10s rise 1 fall 3 weight 1
    server app03 172.32.1.3:80 minconn 20 maxconn 40 check inter 10s rise 1 fall 3 weight 1
    server app04 172.32.1.4:80 minconn 20 maxconn 40 check inter 10s rise 1 fall 3 weight 1
    server app05 172.32.1.5:80 minconn 20 maxconn 40 check inter 10s rise 1 fall 3 weight 1
    server app06 172.32.1.6:80 minconn 20 maxconn 40 check inter 10s rise 1 fall 3 weight 1
    server app07 172.32.1.7:80 minconn 20 maxconn 40 check inter 10s rise 1 fall 3 weight 1

frontend posterous 0.0.0.0:8282
    acl getfile_path path_beg /getfile
    acl rss_path path_beg /rss
    use_backend getfile if getfile_path
    use_backend rss if rss_path
    default_backend posterous_default
As an added bonus, HAProxy can even be used for rate limiting and fighting denial of service attacks which we’ll be incorporating soon into our configuration. Check back here later to see more about how it works for us in production.
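For the curious, a sketch of what that might look like: HAProxy's stick tables can track per-client connection rates and reject abusive sources at the frontend. This is not our production config (we haven't deployed it yet); the table size, time window, and threshold below are illustrative, and these directives require a reasonably recent HAProxy version:

    frontend posterous 0.0.0.0:8282
        # Track each client IP's connection rate over a 10s window
        stick-table type ip size 200k expire 2m store conn_rate(10s)
        tcp-request connection track-sc0 src
        # Reject clients opening more than 100 connections per 10s
        tcp-request connection reject if { sc0_conn_rate gt 100 }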