Discussing load balancing with HAProxy
When an application becomes popular, it receives an increasing number of requests. A single application server may not be able to handle the entire load alone. We can always scale up the underlying hardware, that is, add more memory and more powerful CPUs to increase the server capacity, but these improvements do not always scale linearly. To solve this problem, multiple replicas of the application server are created and the load is distributed among these replicas. Load balancing can be implemented at OSI Layer 4, that is, at the TCP or UDP protocol level, or at Layer 7, that is, at the application level with protocols such as HTTP, SMTP, and DNS.
In this recipe, we will install a popular load balancing, or load distributing, service: HAProxy. HAProxy receives all the requests from clients and directs them to the actual application servers for processing. The application server's response is relayed back to the client through HAProxy. We will be setting up HAProxy to load balance TCP connections.
Getting ready
You will need two or more application servers and one server for HAProxy:
- You will need root access on the server where you want to install HAProxy
- It is assumed that your application servers are properly installed and working
How to do it…
Follow these steps to set up load balancing with HAProxy:
- Install HAProxy:
$ sudo apt-get update
$ sudo apt-get install haproxy
- Enable the HAProxy init script to automatically start HAProxy on system boot. Open /etc/default/haproxy and set ENABLED to 1:
ENABLED=1
- Now, edit the HAProxy configuration file, /etc/haproxy/haproxy.cfg. You may want to create a copy of this file before editing:
$ cd /etc/haproxy
$ sudo cp haproxy.cfg haproxy.cfg.copy
$ sudo nano haproxy.cfg
- Find the defaults section and change the mode and option parameters to match the following:
mode tcp
option tcplog
- Next, define the frontend, which will receive all requests:
frontend www
    bind 57.105.2.204:80        # haproxy public IP
    default_backend as-backend  # backend used
- Define the backend application servers:
backend as-backend
    balance leastconn
    mode tcp
    server as1 10.0.2.71:80 check  # application srv 1
    server as2 10.0.2.72:80 check  # application srv 2
- Save and quit the HAProxy configuration file.
- We need to set rsyslog to accept HAProxy logs. Open the rsyslog configuration file, /etc/rsyslog.conf, and uncomment the following parameters:
$ModLoad imudp
$UDPServerRun 514
- Next, create a new file under /etc/rsyslog.d to specify the HAProxy log location:
$ sudo nano /etc/rsyslog.d/haproxy.conf
- Add the following line to the newly created file:
local2.* /var/log/haproxy.log
- Save the changes and exit the new file.
- Restart the rsyslog service:
$ sudo service rsyslog restart
- Restart HAProxy:
$ sudo service haproxy restart
- Now, you should be able to reach your backend application through the HAProxy IP address.
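One way to check the setup is to send a handful of requests to the HAProxy address and count the distinct responses. The following is a minimal sketch, assuming curl is installed and that each application server answers HTTP on port 80 with a body that identifies it (for example, its hostname). Neither assumption comes from the recipe, and tally_responses is a hypothetical helper name.

```shell
# Hypothetical helper: request a URL N times and count identical first lines.
# Assumes each backend returns a body that identifies it (not part of the recipe).
tally_responses() {
    url="$1"
    count="${2:-10}"
    i=0
    while [ "$i" -lt "$count" ]; do
        # A failed request is counted under its own label instead of aborting.
        body=$(curl -s --max-time 5 "$url" 2>/dev/null) || body="request-failed"
        printf '%s\n' "$body" | head -n 1
        i=$((i + 1))
    done | sort | uniq -c
}

# Example (HAProxy public IP from the frontend section):
#   tally_responses http://57.105.2.204/ 20
```

If both application servers are healthy, the output should list a count for each of them. A single line, or a request-failed entry, suggests checking the backend definitions and the health checks.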
How it works…
Here, we have configured HAProxy as a frontend for a cluster of application servers. Under the frontend section, we have configured HAProxy to listen on the public IP of the HAProxy server. We also specified a backend for this frontend. Under the backend section, we have set the private IP addresses of the application servers. HAProxy will communicate with the application servers through a private network interface. This will help to keep the internal network latency to a minimum.
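A mistake in the frontend or backend section will stop HAProxy from starting, so it can help to validate the file before a restart. The sketch below wraps HAProxy's check mode (the -c and -f options). Here check_haproxy_cfg is a hypothetical helper name, and the default path is the one used in this recipe.

```shell
# Hypothetical helper: validate an HAProxy configuration file without starting the proxy.
check_haproxy_cfg() {
    cfg="${1:-/etc/haproxy/haproxy.cfg}"
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg" >&2
        return 2
    fi
    # -c checks the configuration and exits; -f names the file to check.
    haproxy -c -f "$cfg"
}

# Example: restart only when the configuration passes the check.
#   check_haproxy_cfg && sudo service haproxy restart
```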
HAProxy supports various load balancing algorithms. Some of them are as follows:
- roundrobin distributes the load across the backend servers in turn. This is the default algorithm.
- leastconn selects the backend server with the fewest connections.
- source hashes the client's IP address and maps the result to a backend server. This ensures that requests from the same client IP address are served by the same backend server, as long as the set of available servers does not change.
We have selected the leastconn algorithm, which is set in the backend section with the balance leastconn line. The choice of a load balancing algorithm depends on the type of application and the length of its connections. For example, leastconn suits long-lived connections, while roundrobin works well for short ones.
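To make the difference between these selection rules concrete, here is a toy sketch of two of them in shell. It is an illustration only and not HAProxy's implementation: pick_leastconn and pick_by_source are hypothetical names, the input format is made up for the example, and cksum stands in for whatever hash HAProxy uses internally.

```shell
# Toy model of leastconn: read "<server> <active-connections>" lines on stdin
# and print the server with the fewest connections (the first one wins a tie).
pick_leastconn() {
    awk 'NR == 1 || ($2 + 0) < min { min = $2 + 0; name = $1 } END { print name }'
}

# Toy model of source: hash the client IP (cksum here, an assumption) and map
# it onto one of the servers passed as the remaining arguments.
pick_by_source() {
    ip="$1"
    shift
    hash=$(printf '%s' "$ip" | cksum | cut -d ' ' -f 1)
    # Skip (hash mod server-count) servers; the next one is the choice.
    shift $(( hash % $# ))
    printf '%s\n' "$1"
}

# Examples:
#   printf 'as1 5\nas2 3\n' | pick_leastconn    # prints as2
#   pick_by_source 203.0.113.7 as1 as2          # same server every time for this IP
```

The second function shows why source gives stickiness without shared state: the choice depends only on the client address and the server list, so it changes only when that list changes.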
Lastly, we configured rsyslog to accept logs over UDP. HAProxy does not provide a separate logging system; it passes logs to the system log daemon, rsyslog, over UDP.
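The logging chain can be checked from the shell. The sketch below first confirms that the two UDP directives are active in rsyslog.conf, then sends a test message to the local2 facility over UDP. It assumes the util-linux version of logger (for the -d, -n, and -P options) and that the global section of haproxy.cfg logs to 127.0.0.1 with facility local2, which the recipe does not show. The function names are hypothetical.

```shell
# Hypothetical helper: succeed only if both UDP input directives are uncommented.
rsyslog_udp_enabled() {
    conf="${1:-/etc/rsyslog.conf}"
    grep -Eq '^[[:space:]]*\$ModLoad[[:space:]]+imudp' "$conf" &&
        grep -Eq '^[[:space:]]*\$UDPServerRun[[:space:]]+514' "$conf"
}

# Hypothetical helper: send one test message to facility local2 over UDP port 514.
send_test_log() {
    logger -d -n 127.0.0.1 -P 514 -p local2.info -t haproxy-test "$1"
}

# Example:
#   rsyslog_udp_enabled && send_test_log "hello from logger"
#   sudo tail -n 1 /var/log/haproxy.log
```

If the test message shows up in /var/log/haproxy.log but HAProxy's own entries do not, the log directive in haproxy.cfg is the next place to look.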
There's more…
Depending on your Ubuntu version, you may not get the latest version of HAProxy from the default apt repository. Use the following repository to install the latest release:
$ sudo apt-get install software-properties-common
$ sudo add-apt-repository ppa:vbernat/haproxy-1.6 # replace 1.6 with required version
$ sudo apt-get update && sudo apt-get install haproxy
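Before adding the PPA, it may be worth checking whether the packaged version is already recent enough. This is a minimal sketch, assuming GNU sort (for -V) and a Debian-style system with dpkg-query. Here version_ge is a hypothetical helper, and 1.6 is just the version from the example above.

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on GNU sort's -V (version sort) ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example: compare the installed package version against a required minimum.
#   installed=$(dpkg-query -W -f='${Version}' haproxy 2>/dev/null)
#   version_ge "$installed" 1.6 || echo "consider installing from the PPA"
```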
See also
- An introduction to HAProxy and load balancing concepts at https://www.digitalocean.com/community/tutorials/an-introduction-to-haproxy-and-load-balancing-concepts