Crash Course on Load Balancing Algorithms
A load balancer is a device or service that distributes traffic across multiple servers or microservices.
What is load balancing?
Load balancing is the method of distributing network traffic equally across a pool of resources that support an application. Modern applications must process millions of users simultaneously and return the correct text, videos, images, and other data to each user in a fast and reliable manner. To handle such high volumes of traffic, most applications have many resource servers with duplicate data between them. A load balancer is a device that sits between the user and the server group and acts as an invisible facilitator, ensuring that all resource servers are used equally.
What are load balancing algorithms?
Static load balancing
Static load balancing algorithms follow fixed rules and are independent of the current server state. The following are examples of static load balancing.
Round-robin method
Servers have IP addresses that tell the client where to send requests. An IP address is a long number that is difficult to remember, so the Domain Name System (DNS) maps website names to servers. When you enter aws.amazon.com into your browser, the request first goes to the site's authoritative name server, which returns the server's IP address to your browser.
In the round-robin method, an authoritative name server does the load balancing instead of specialized hardware or software. The name server returns the IP addresses of different servers in the server farm turn by turn or in a round-robin fashion.
Pros:
Simple and easy to implement.
Works well when all servers have similar capacity and workloads.
Cons:
Doesn't account for server health or load.
Can lead to uneven distribution if servers have varying capacities.
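As a minimal sketch of the round-robin idea (server names here are placeholders, not part of any real deployment), each incoming request is simply handed to the next server in a fixed rotation:

```python
from itertools import cycle

# Hypothetical server pool; names are illustrative only.
servers = ["server-a", "server-b", "server-c"]

def make_round_robin(pool):
    """Return a function that hands back the next server in turn."""
    it = cycle(pool)
    return lambda: next(it)

next_server = make_round_robin(servers)
assignments = [next_server() for _ in range(6)]
# Six requests are spread evenly: each server receives exactly two.
```

Note that nothing in this loop looks at server health or load, which is exactly the weakness listed above.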
Weighted round-robin method
In weighted round-robin load balancing, you can assign different weights to each server based on their priority or capacity. Servers with higher weights will receive more incoming application traffic from the name server.
Pros:
Balances workloads better for servers with different capacities.
Allows prioritization of more powerful servers.
Cons:
Requires manual configuration and monitoring.
Not suitable for rapidly changing workloads.
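One naive way to sketch weighted round-robin (weights and names are assumptions for illustration) is to expand each server into the schedule as many times as its weight, so heavier servers appear more often in the rotation. Production balancers use smoother interleavings, but the effect is the same:

```python
def weighted_round_robin(pool):
    """Yield servers in proportion to their weights.

    pool is a list of (server, weight) pairs; a server with weight 3
    receives three requests for every one sent to a weight-1 server.
    """
    schedule = [name for name, weight in pool for _ in range(weight)]
    i = 0
    while True:
        yield schedule[i % len(schedule)]
        i += 1

# Hypothetical weights: "big" can absorb 3x the traffic of "small".
gen = weighted_round_robin([("big", 3), ("small", 1)])
first_four = [next(gen) for _ in range(4)]
```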
IP hash method
In the IP hash method, the load balancer performs a mathematical computation, called hashing, on the client IP address. It converts the client IP address to a number, which is then mapped to individual servers.
Pros:
Ensures session persistence without needing additional mechanisms.
Simple to implement for session-based systems.
Cons:
Uneven distribution if client IPs are not evenly distributed.
Not suitable for dynamic scaling, as new servers disrupt the hash mapping.
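A minimal sketch of IP hashing (the pool addresses are made up): hash the client IP and map the result onto the pool with a modulo, so the same client consistently lands on the same server as long as the pool size is unchanged:

```python
import hashlib

# Hypothetical backend pool.
servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

def pick_server(client_ip, pool):
    """Deterministically map a client IP onto a server in the pool."""
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return pool[int(digest, 16) % len(pool)]

a = pick_server("203.0.113.7", servers)
b = pick_server("203.0.113.7", servers)  # same client -> same server
```

The modulo step is also why adding or removing a server reshuffles most clients, the scaling drawback noted above; consistent hashing is the usual remedy.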
Dynamic load balancing
Dynamic load balancing algorithms examine the current state of the servers before distributing traffic. The following are some examples of dynamic load balancing algorithms.
Least connection method
A connection is an open communication channel between a client and a server. When the client sends the first request to the server, they authenticate and establish an active connection between each other. In the least connection method, the load balancer checks which servers have the fewest active connections and sends traffic to those servers. This method assumes that all connections require equal processing power for all servers.
Pros:
Dynamically adjusts to real-time server loads.
Effective in environments with long-lived connections (e.g., HTTP/2, WebSockets).
Cons:
Adds computational overhead to monitor connection counts.
May not account for variations in connection load (e.g., some connections are resource-heavy).
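The selection step can be sketched in a few lines (connection counts here are invented for illustration): the balancer tracks active connections per server and routes each new request to the smallest count:

```python
# Hypothetical live connection counts maintained by the balancer.
active = {"server-a": 12, "server-b": 3, "server-c": 7}

def least_connections(counts):
    """Pick the server with the fewest active connections."""
    return min(counts, key=counts.get)

target = least_connections(active)
active[target] += 1  # record the newly opened connection
```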
Weighted least connection method
Weighted least connection algorithms assume that some servers can handle more active connections than others. You assign each server a weight reflecting its capacity, and the load balancer sends each new client request to the server with the fewest active connections relative to its weight.
Pros:
Balances based on both server capacity and active connections.
More effective for heterogeneous server setups.
Cons:
Complexity increases with weight calculations.
Still susceptible to imbalanced load if weights aren't calibrated correctly.
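As a sketch of the weighted variant (the pool state is hypothetical), the balancer compares connections per unit of weight rather than raw counts, so a bigger server can carry more connections and still be the preferred target:

```python
# Hypothetical pool: server -> (active connections, capacity weight).
state = {"big": (8, 4), "small": (3, 1)}

def weighted_least_connections(pool):
    """Pick the server with the lowest connections-per-weight ratio."""
    return min(pool, key=lambda s: pool[s][0] / pool[s][1])

target = weighted_least_connections(state)
# big: 8/4 = 2.0 vs small: 3/1 = 3.0, so "big" wins despite
# holding more raw connections.
```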
Least response time method
The response time is the total time that the server takes to process the incoming requests and send a response. The least response time method combines the server response time and the active connections to determine the best server. Load balancers use this algorithm to ensure faster service for all users.
Pros:
Dynamic and adjusts based on server performance.
Effective for latency-sensitive applications.
Cons:
Adds monitoring overhead to measure response times.
Ineffective if response times fluctuate significantly.
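A toy scoring for this method might combine the two signals as follows; the metrics are invented, and real balancers tune how response time and connection count are weighted rather than using a simple product:

```python
# Hypothetical measurements: server -> (avg response time ms, active connections).
metrics = {"server-a": (120, 4), "server-b": (80, 9), "server-c": (95, 2)}

def least_response_time(pool):
    """Score each server by response time scaled by its load.

    Lower is better; the +1 avoids ranking an idle server purely
    on a stale latency sample.
    """
    return min(pool, key=lambda s: pool[s][0] * (pool[s][1] + 1))

target = least_response_time(metrics)
# server-c scores 95 * 3 = 285, the lowest of the three.
```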
Resource-based method
In the resource-based method, load balancers distribute traffic by analyzing the current server load. Specialized software called an agent runs on each server and calculates usage of server resources, such as its computing capacity and memory. Then, the load balancer checks the agent for sufficient free resources before distributing traffic to that server.
Pros:
Balances requests based on actual server capacity and current load.
Avoids overloading any single server, improving reliability and response times.
Adjusts in real time to fluctuations in resource usage and traffic patterns.
Ensures faster response times by selecting the least loaded server.
Cons:
Constant monitoring of server resources adds computational and network overhead.
Requires integration with monitoring tools to collect and analyze resource metrics.
Routing decisions can lag if resource updates are infrequent.
Incorrect or outdated metrics can lead to suboptimal routing decisions.
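The agent-report flow can be sketched like this (the report format and thresholds are assumptions for illustration): the balancer filters out servers whose agents report usage above a limit, then picks the one with the most headroom:

```python
# Hypothetical agent reports: fraction of CPU and memory in use.
reports = {
    "server-a": {"cpu": 0.85, "mem": 0.60},
    "server-b": {"cpu": 0.30, "mem": 0.40},
    "server-c": {"cpu": 0.55, "mem": 0.90},
}

def pick_by_resources(agents, cpu_limit=0.8, mem_limit=0.8):
    """Drop saturated servers, then choose the one with most CPU headroom."""
    eligible = {s: r for s, r in agents.items()
                if r["cpu"] < cpu_limit and r["mem"] < mem_limit}
    if not eligible:
        return None  # every server is saturated: queue or shed load
    return min(eligible, key=lambda s: eligible[s]["cpu"])

target = pick_by_resources(reports)
# server-a is over the CPU limit, server-c over the memory limit,
# leaving server-b as the only eligible target.
```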
What are the types of load balancing?
We can classify load balancing into three main categories depending on what the load balancer checks in the client request to redirect the traffic.
Application load balancing
Complex modern applications have several server farms with multiple servers dedicated to a single application function. Application load balancers look at the request content, such as HTTP headers or SSL session IDs, to redirect traffic.
For example, an ecommerce application has a product directory, shopping cart, and checkout functions. The application load balancer sends requests for browsing products to servers that contain images and videos but do not need to maintain open connections. By comparison, it sends shopping cart requests to servers that can maintain many client connections and save cart data for a long time.
Network load balancing
Network load balancers examine IP addresses and other network information to redirect traffic optimally. They track the source of the application traffic and can assign a static IP address to several servers. Network load balancers use the static and dynamic load balancing algorithms described earlier to balance server load.
Global server load balancing
Global server load balancing occurs across several geographically distributed servers. For example, companies can have servers in multiple data centers, in different countries, and in third-party cloud providers around the globe. In this case, local load balancers manage the application load within a region or zone. They attempt to redirect traffic to a server destination that is geographically closer to the client. They might redirect traffic to servers outside the client’s geographic zone only in case of server failure.
DNS load balancing
In DNS load balancing, you configure your domain to route network requests across a pool of resources on your domain. A domain can correspond to a website, a mail system, a print server, or another service that is made accessible through the internet. DNS load balancing is helpful for maintaining application availability and balancing network traffic across a globally distributed pool of resources.
Redundant Load Balancers
A load balancer can itself be a single point of failure. To overcome this, a second load balancer can be connected to the first to form a cluster. Each load balancer monitors the health of the other, and because both are equally capable of serving traffic and detecting failure, the second load balancer takes over if the main one fails.
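The failover idea above can be sketched as a toy active/passive pair (class and field names are illustrative, not from any real product): each heartbeat cycle checks the active balancer and promotes the standby when its peer goes down:

```python
class LoadBalancer:
    """Toy model of one balancer in an active/passive pair."""
    def __init__(self, name, active):
        self.name = name
        self.active = active
        self.healthy = True

def heartbeat(primary, secondary):
    """One monitoring cycle: promote the standby if the active peer is down."""
    if primary.active and not primary.healthy:
        primary.active = False
        secondary.active = True
    return primary if primary.active else secondary

lb1 = LoadBalancer("lb-1", active=True)
lb2 = LoadBalancer("lb-2", active=False)

lb1.healthy = False         # simulate the primary failing
serving = heartbeat(lb1, lb2)  # the standby takes over
```

Real deployments implement this with protocols such as VRRP and a shared virtual IP, but the promotion logic follows the same shape.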
Benefits of Load Balancing
Users and customers depend on near-real-time ability to find information and conduct transactions. Lag time or unreliable and inconsistent responses—even during peak demand and usage times—can turn a customer away forever. And high spikes in compute need can cause havoc to an internal server or server system if the incoming demand—or “load”—is too high to be easily accommodated.
Advantages of using a load balancer include:
Application availability: Users both internal and external need to be able to rely on application availability. If an application or function is down, lagging, or frozen, precious time is lost—and a potential source of friction is introduced that might drive a customer to a competitor.
Application scalability: Imagine you run a ticketing company, and tickets for a popular performance go on sale at a certain date and time. Thousands of people, or more, may try to access your site to buy tickets at once. Without a load balancer, your site would be limited to whatever a single server can accommodate, which likely won't be enough under that demand. With a load balancer directing requests across the available compute resources, far more customers can get their desired tickets.
Application security: Load balancing also lets organizations scale their security solutions. One of the primary ways is by distributing traffic across multiple backend systems, which helps to minimize the attack surface and makes it more difficult to exhaust resources and saturate links. Load balancers can also redirect traffic to other systems if one system is vulnerable or compromised. In addition, load balancers can offer an extra layer of protection against DDoS attacks by rerouting traffic between servers if a particular server becomes vulnerable.
Application performance: By doing all of the above, a load balancer boosts application performance. By increasing security, by optimizing uptime, and by enabling scalability through spikes in demand, load balancers keep your applications working as designed—and the way you, and your customers, want them to.
If you found this guide helpful and want to stay updated with more insightful posts on software architecture and engineering, be sure to Follow me and Subscribe for more knowledge-packed content. 🔔💻
Happy learning, and may your systems be ever reliable! 🚀✨