Building a scalable, highly available system requires some sort of load balancing. It lets you scale out to handle high loads, and it allows you to do deployments without any downtime:

  1. Bootstrap new box and wait until ready
  2. Put new box into load balancer, observe
  3. Gracefully take old box out of load balancer
  4. Destroy old box
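The four steps above can be sketched as plain Ruby, modeling the pool as a list of backend names. All function and box names here are hypothetical; in reality each step would talk to your provisioner and load balancer.

```ruby
# Hypothetical sketch of a zero-downtime box rotation.

def bootstrap(name)
  # Step 1: provision the instance and wait until its health check passes.
  { name: name, ready: true }
end

def rotate(pool, new_box, old_box)
  raise "new box not ready" unless new_box[:ready]
  pool += [new_box[:name]]  # Step 2: put the new box into the pool, observe.
  pool -= [old_box]         # Step 3: gracefully take the old box out.
  pool                      # Step 4 (destroying the old box) happens elsewhere.
end

pool = ["web-1"]
pool = rotate(pool, bootstrap("web-2"), "web-1")
puts pool.inspect  # prints ["web-2"]
```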

Every developer who takes DevOps seriously knows that you need a staging environment. Since staging should be as close to production as possible, we decided to put a load balancer in front of our staging environment, just as we do for prod. Why, you ask? Because our load balancers are (in our case) not transparent: they

  • Do TLS termination
  • Perform request throttling
  • Add HTTP headers (like the client’s IP address)
  • etc.

Our system has two HTTP endpoints: our website (customer acquisition and delivery of our browser app) and our API. As a start-up you have limited resources, especially in a beta phase, so we decided in favor of HAproxy running on an EC2 instance instead of a managed Elastic Load Balancer (ELB).

To be fair, ELB might be a better pick for you: you can add or remove instances conveniently via the AWS API, so it can be part of your Continuous Deployment pipeline, and as a managed, multi-AZ service it offers better uptime. However, it may not be worth the money when only a single box is running behind it 99% of the time. So we went with HAproxy.

We came up with an automation for the inclusion and removal of back-end servers in the HAproxy config. Since our boxes are managed via Chef, we can use the knife search node feature.

Assuming you also use Chef (not chef-solo), you can easily query for other nodes in a recipe:

template "/etc/haproxy/haproxy.cfg" do
  source "haproxy.cfg.erb"
  owner "haproxy"
  group "haproxy"
  variables({
    # sort_by gives a stable order, so the rendered file only changes
    # when the set of nodes actually changes.
    web_backend_nodes: search(:node, "chef_environment:production AND role:web AND NOT exclude_from_loadbalancer:true").sort_by { |n| n.name },
    api_backend_nodes: search(:node, "chef_environment:production AND role:api AND NOT exclude_from_loadbalancer:true").sort_by { |n| n.name }
  })
  notifies :reload, "service[haproxy]", :delayed # only reload when nodes changed
end

service "haproxy" do
  supports status: true, restart: true, reload: true
  action [ :enable, :start ] # ensure it is running so the reload can fire
end

See the NOT exclude_from_loadbalancer part? It tells Chef not to return nodes whose exclude_from_loadbalancer attribute is set to true. We default this attribute to false for all our boxes through a base recipe, but you can also leave it undefined. With this toggle we can take nodes gracefully out of the load balancer by setting exclude_from_loadbalancer to true in the node’s override_attributes and then re-running chef-client on the load balancer.
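For illustration, the base-recipe default mentioned above could live in an attribute file like this (cookbook layout and file name are assumptions):

```ruby
# attributes/default.rb in a hypothetical base cookbook:
# every node starts out eligible for the load balancer.
default['exclude_from_loadbalancer'] = false
```

A node is then removed by flipping the attribute to true in its override_attributes, which beats the default in Chef's attribute precedence.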

And this is the excerpt from the haproxy.cfg.erb template file:

frontend https-in
  bind :443 ssl crt /etc/ssl/key.pem ca-file /etc/ssl/bundle.crt
  acl host_web hdr(host) -i www.domain.com
  acl host_api hdr(host) -i api.domain.com

  use_backend web    if host_web
  use_backend api    if host_api

backend web
  mode http
  http-check expect status 200
  option httpchk GET /your-healthcheck-url
  <% @web_backend_nodes.each do |backend_node| %>
  server <%= backend_node.name %> <%= backend_node['ec2']['local_ipv4'] %>:80 check
  <% end %>

backend api
  mode http
  http-check expect status 200
  option httpchk GET /your-healthcheck-url
  <% @api_backend_nodes.each do |backend_node| %>
  server <%= backend_node.name %> <%= backend_node['ec2']['local_ipv4'] %>:80 check
  <% end %>
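With two web nodes matching the search, the rendered web backend would come out roughly like this (server names and private IPs are made up):

```
backend web
  mode http
  http-check expect status 200
  option httpchk GET /your-healthcheck-url
  server web-1 10.0.1.10:80 check
  server web-2 10.0.1.11:80 check
```

The check keyword makes HAproxy run the httpchk health check against each server and only route traffic to servers that answer with a 200.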

You can either let chef-client run periodically on the load balancer, or trigger it on demand with knife ssh "role:loadbalancer" "sudo chef-client".


Published

19 May 2016