DevOps Concepts #1: Load Balancing Algorithms
Find out what is Load Balancing and how it works. Learn different ways to do Load Balancing.
Today we are going to speak about balance. No, not the kind where you have to breathe at a certain rate for a minute to calm down. We are talking about load balancing.
When you run multiple instances of your application, one of the important things to get right is proper load balancing.
Without it, all traffic may be sent to a single instance, which then has to handle everything on its own. Proper load balancing is a prerequisite for scaling.
Stateless vs Stateful services
First things first, there are two types of services:
Stateless
Stateful
When you build a service as stateful, your server remembers previous transactions. In a stateful auth system, for example, sessions are persisted on the server.
This creates a problem: requests from a given client have to be routed to the same server every time in order to be served. It makes scaling much harder because requests are not evenly distributed between servers.
That’s why we should aim to build stateless services. They process every request as a new one, without knowledge of the past: the same request always gives the same output. JWTs, for example, are good for building stateless auth, because any server can verify the token, so requests can be routed to any server.
So to build proper load balancing, first build stateless services. Let’s now dive into the different algorithms.
Different algorithms
When you start building an app, one server might be enough: all requests go to it, and it serves the traffic. The problem starts when you grow and that single server can no longer keep up with the requests.
In this setup, the load balancer sends all traffic to one instance/server, which after a while gets overwhelmed and stops processing it.
One way to improve this is to add more servers, but then you need to choose how to divide the traffic between them. The first and simplest algorithm is Round Robin.
Round Robin
Round Robin is as simple as it gets: it sends traffic to the servers in order. If you have 3 servers, it cycles through them one by one, so each receives an equal number of requests.
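The rotation described above can be sketched in a few lines (the server names are made up for illustration):

```python
from itertools import cycle

servers = ["server-1", "server-2", "server-3"]
rotation = cycle(servers)  # endless iterator over the pool, in order

def route() -> str:
    """Round robin: each request simply goes to the next server in line."""
    return next(rotation)

assignments = [route() for _ in range(6)]
print(assignments)  # each server got exactly 2 of the 6 requests
```

Note there is no state about server load here at all, which is exactly where the problems described next come from.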
The problem with this algorithm is that not all requests are the same. One server may receive more expensive requests and slow down. But guess what! Even if it slows down, Round Robin will keep sending it the same share of traffic.
Your servers can also differ in size, while the algorithm effectively assumes they are identical, since it sends them the same amount of traffic. A bigger server should receive more traffic, but here it doesn’t.
When a server gets overloaded, you can add a request queue, which lets you serve that traffic eventually. But request latency goes up, because requests sit in the queue for some time before they are processed. And it doesn’t solve the real issue: we want to serve them as soon as possible.
In this setup there is also a problem if you run a stateful service: you want all requests from one client to be routed to the same server. To achieve that, you need Sticky Round Robin.
Sticky Round Robin
Sticky Round Robin sends all requests from the same client to the same server. When a request from a new client arrives, it is assigned to one of the servers in round-robin order.
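A minimal sketch of that stickiness, assuming the load balancer can keep a client-to-server map in memory (real load balancers usually pin clients via a cookie or by hashing the client IP):

```python
from itertools import cycle

servers = ["server-1", "server-2", "server-3"]
rotation = cycle(servers)
sessions: dict[str, str] = {}  # client id -> pinned server

def route(client_id: str) -> str:
    """New clients are assigned round robin; known clients
    keep hitting the server they were first pinned to."""
    if client_id not in sessions:
        sessions[client_id] = next(rotation)
    return sessions[client_id]

print(route("user-1"))  # server-1
print(route("user-2"))  # server-2
print(route("user-1"))  # server-1 again: stuck to its server
```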
This should mostly be avoided, because round robin struggles in this situation. Traffic is no longer spread evenly, and if your servers differ in size, the problem gets even worse.
As you can see in the image, with sticky sessions all requests from User 2 go to the second server, while User 1 is pinned to the first. This results in an unequal number of requests between servers.
This can be avoided by building a stateless service, so users are not required to hit the same server.
Weighted Round Robin
Weighted Round Robin is another improvement. When you add a server to the load balancer, you assign it a weight.
This way, the load balancer knows how much traffic each server should receive. If your servers differ in size, it helps you distribute traffic between them more sensibly.
It’s better than basic Round Robin, as you can see in the image: when a server is bigger, it can handle more traffic, so it receives more requests.
In this setup, your biggest server is utilised the most, which is good: you are not wasting resources. But the underlying problem remains that not all requests cost the same, so static weights alone don’t fully solve it.
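The simplest way to implement weights is to repeat each server in the rotation according to its weight; the weights below are hypothetical:

```python
from itertools import cycle

# Hypothetical weights: server-1 is twice the size of the others.
weights = {"server-1": 2, "server-2": 1, "server-3": 1}

# Simplest form: repeat each server in the rotation by its weight.
rotation = cycle([s for s, w in weights.items() for _ in range(w)])

picks = [next(rotation) for _ in range(8)]
print(picks)  # server-1 appears twice per cycle of four picks
```

Production load balancers use smoother interleavings (e.g. NGINX’s smooth weighted round robin) so a heavy server’s requests aren’t bunched back to back, but the proportions are the same idea.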
Least Connections
Round Robin is good, but what if I told you there is something better? It’s the Least Connections algorithm. Because the load balancer distributes the traffic, it knows how many connections exist toward each server. We can leverage that and send new traffic to the server with the lowest number of connections.
This helps distribute work to the servers that can actually take it.
It’s a great choice because it spreads the traffic based on real load and uses all the resources you have in the pool.
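The bookkeeping is small: track connection counts, pick the minimum, and decrement when a request finishes. A sketch with made-up counts:

```python
# Active connection counts the load balancer already tracks.
connections = {"server-1": 12, "server-2": 4, "server-3": 9}

def route() -> str:
    """Least connections: pick the server with the fewest active
    connections; the new request opens one more on it."""
    server = min(connections, key=connections.get)
    connections[server] += 1
    return server

def finished(server: str) -> None:
    """Call when a request completes so the counts stay accurate."""
    connections[server] -= 1

print(route())  # server-2: it had only 4 active connections
```

A slow server naturally accumulates connections, so it automatically receives less new traffic, which is what plain Round Robin cannot do.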
Least time
There is one more algorithm worth mentioning. It tracks the response time of requests and sends traffic to the server with the lowest processing time. This is similar to the weight in Weighted Round Robin, except the weight is measured rather than configured.
It’s great because it uses the servers that are most capable of handling the request, and you don’t need to manage weights yourself.
As the response time of the currently fastest server increases, another server starts taking over the requests, and the load balancer shifts traffic to the next instance in the pool.
It tracks latency over some number of requests and calculates a moving average, weighted so that the newest measurements contribute the most and the oldest contribute the least (an exponential moving average).
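That weighted average can be sketched as an exponential moving average; the smoothing factor and latency figures below are made up for illustration:

```python
# Exponential moving average of per-server latency: recent samples
# dominate, old samples fade out. ALPHA is a hypothetical smoothing
# factor; higher means faster reaction to new measurements.
ALPHA = 0.3

latency = {"server-1": 120.0, "server-2": 80.0}  # current averages, ms

def record(server: str, sample_ms: float) -> None:
    """Blend the newest measurement into the running average."""
    latency[server] = ALPHA * sample_ms + (1 - ALPHA) * latency[server]

def route() -> str:
    """Least time: send the next request to the fastest server."""
    return min(latency, key=latency.get)

record("server-2", 300.0)  # server-2 starts slowing down
record("server-2", 300.0)
print(route())  # server-1: server-2's average has risen above it
```

Two slow samples are enough to push server-2’s average past server-1’s, so traffic shifts without anyone retuning weights.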
Conclusion
By default, the AWS Application Load Balancer uses the Round Robin algorithm. It also supports Least Outstanding Requests, which routes to the target with the fewest in-flight requests (closer in spirit to Least Connections than to Least Time).
Which one to use depends on the use case and the scale. In the past, Round Robin has never let me down, but it’s good to have the option to change it. Everything has its time and place, so choose wisely.