In this article, we will look at one of the major architectural building blocks: the load balancer.
A load balancer is used in almost every large-scale software system.
We will see the quality attributes a load balancer can provide to our system.
We will also investigate the different types of load balancing solutions and how to use them when architecting large-scale software systems.
First, we’ll discuss what load balancers are and how they work.
A load balancer’s primary function is to distribute incoming requests evenly across multiple servers.
When thinking about the big picture of software system architecture, we find that the ideal strategy to accomplish both high availability and horizontal scalability is to launch many copies of our app on different hosts.
But without a load balancer, the client application running on our customers’ devices would need to know the address of each of those hosts and the number of application instances.
This makes it very difficult for us to modify the system, because the client application becomes tightly coupled to our system’s internal structure.
Most load balancing solutions provide an abstraction between the client application and our collection of servers, which is very useful even though the primary function of a load balancer is to distribute the workload across the servers in a cluster.
With this abstraction, all our computers and storage appear to be just one super-powered server.
Each of the many available load-balancing systems provides a unique degree of abstraction.
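To make this concrete, here is a minimal Python sketch of the round-robin strategy mentioned earlier; the server addresses are hypothetical, and a real load balancer performs this at the network level rather than in application code.

```python
import itertools

# Hypothetical pool of identical application servers behind the load balancer.
SERVERS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]

# Cycle endlessly through the pool so each request goes to the next server.
rotation = itertools.cycle(SERVERS)

def route(request_id: int) -> str:
    """Pick the next server in round-robin order for this request."""
    server = next(rotation)
    print(f"request-{request_id} -> {server}")
    return server

for i in range(5):
    route(i)  # requests 0..4 land on servers 1, 2, 3, 1, 2
```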
Now we will examine the various ways in which a load balancer improves our system. Specifically, we will see which quality attributes a load balancer can provide.
First and foremost, a load balancer provides us with horizontal scalability.
By hiding a group of servers behind a load balancer, we can scale our system horizontally, both up and down.
When the load on our system increases, we add more servers; when it decreases, we remove the unnecessary ones to save money.
In a cloud environment, it is even easier to provision more hardware on demand, and cloud auto-scaling policies can be used to adjust the infrastructure dynamically.
With auto-scaling, new application instances can be quickly and cheaply added or removed in the cloud, depending on factors such as the number of requests per second or the available network bandwidth.
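As a rough illustration, the following Python sketch shows the shape of such a scaling rule; the thresholds, pool limits, and utilization metric are illustrative assumptions, not the policy syntax of any particular cloud provider.

```python
# Illustrative auto-scaling rule: grow the pool when average utilization is
# high, shrink it when utilization is low. All thresholds are made up.
SCALE_OUT_ABOVE = 0.75
SCALE_IN_BELOW = 0.25
MIN_SERVERS, MAX_SERVERS = 2, 10

def desired_server_count(servers: int, avg_utilization: float) -> int:
    if avg_utilization > SCALE_OUT_ABOVE and servers < MAX_SERVERS:
        return servers + 1   # add an instance behind the load balancer
    if avg_utilization < SCALE_IN_BELOW and servers > MIN_SERVERS:
        return servers - 1   # remove an idle instance to save money
    return servers

print(desired_server_count(3, 0.90))  # -> 4 (scale out under heavy load)
print(desired_server_count(3, 0.10))  # -> 2 (scale in when mostly idle)
```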
The load balancer also guarantees us constant application availability, which is an important quality attribute.
Most load balancers can be configured with little effort to prevent them from sending requests to non-reachable servers.
By keeping tabs on which servers are up and running, load balancers may intelligently distribute work across the available machines while bypassing the ones that are down or moving at a glacial pace.
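A minimal sketch of such a health check in Python, assuming hypothetical backend addresses and using a plain TCP connection attempt as the probe (production load balancers usually support richer HTTP-level checks):

```python
import socket

# Hypothetical backend pool monitored by the load balancer.
SERVERS = [("10.0.0.1", 8080), ("10.0.0.2", 8080), ("10.0.0.3", 8080)]

def is_healthy(host: str, port: int, timeout: float = 1.0) -> bool:
    """Treat a server as healthy if it accepts a TCP connection in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def healthy_pool() -> list:
    """Route requests only to servers that passed the latest health check."""
    return [server for server in SERVERS if is_healthy(*server)]
```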
Next, we’ll examine the impact load balancers have on the system’s overall efficiency.
When it comes to performance, a load balancer may add a small amount of latency to each request, but this is usually a fair trade-off given the throughput gains it provides.
Since we may have an arbitrary number of backend servers thanks to the load balancer (within reason, of course), our throughput is substantially higher than if we were limited to a single server’s capacity.
The load balancer aids in the attainment of another crucial quality attribute, maintainability.
By being able to quickly add and remove servers from the pool, we can shut down servers for maintenance or to change their software versions without disturbing customers in any way.
When maintenance on a server is complete, it can be reinstated behind the load balancer and the next one taken down.
In this way, we can perform a rolling release without sacrificing the uptime guaranteed by our service level agreement.
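The rolling-release procedure described above can be sketched in a few lines of Python; `pool`, `deploy`, and `wait_until_healthy` are hypothetical placeholders for whatever deployment tooling is actually in use.

```python
# Rolling release sketch: take one server at a time out of rotation,
# update it, and put it back before touching the next one.
def rolling_update(pool: list, deploy, wait_until_healthy) -> None:
    for server in list(pool):
        pool.remove(server)          # load balancer stops sending it traffic
        deploy(server)               # install the new software version
        wait_until_healthy(server)   # rejoin only after health checks pass
        pool.append(server)          # server starts taking traffic again
```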
Now, let’s discuss the various load balancers available to us.
The Domain Name System (DNS) is used to implement a very basic load balancer.
DNS is a core component of the Internet that translates human-readable domain names like google.com, yahoo.com, etc. into numerical IP addresses used by network routers to fulfill user requests.
Such a directory service maps each domain name to the corresponding IP addresses.
Whenever a user or client application needs to establish a connection with our infrastructure, they will issue a DNS query to the DNS server, which will then return an IP address corresponding to our domain name.
After getting the server’s IP address, the client app can make a direct connection. However, a DNS record need not be associated with a single IP address; it can instead be set to return a list of IP addresses belonging to several servers.
By convention, most client applications that resolve a domain name simply select the first address in the returned list.
But the fact is that most DNS servers are implemented in such a way that they return the list of addresses for each domain in a different order on each client request.
By cycling through this list in round-robin fashion, the Domain Name System helps distribute the load across our servers.
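You can observe this with a few lines of Python’s standard library; example.com is just a placeholder domain, and how many addresses come back depends entirely on how its DNS records are configured.

```python
import socket

def resolve_all(domain: str) -> list:
    """Return every distinct IP address DNS gives us for a domain."""
    addresses = []
    for info in socket.getaddrinfo(domain, 80, proto=socket.IPPROTO_TCP):
        ip = info[4][0]
        if ip not in addresses:   # keep the order the resolver returned
            addresses.append(ip)
    return addresses

# A domain backed by several servers may return multiple addresses, and
# many DNS servers rotate the order between queries (round-robin).
print(resolve_all("example.com"))
```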
Although this approach to load balancing is quite straightforward and inexpensive (in fact, it’s practically free once you buy a domain name), it does come with a few limitations.
Our primary issue is that DNS does not check the status of our servers.
Therefore, the Domain Name System will continue to direct users to the non-responsive server, even if it goes down.
A DNS record’s time to live (TTL) specifies how long its associated list of IP addresses may be cached before it is refreshed.
This mapping table between domain names and IP addresses can be cached in several places, including the client’s local machine.
That extends the window of time in which requests are still being forwarded to a failing server before they are redirected elsewhere.
Another drawback of DNS-based load balancing is that its balancing strategy is limited to simple round-robin.
Round-robin does not account for the fact that some of our application instances may be running on more powerful servers than others, nor can it detect that one of our servers may be more overloaded than the others.
A third problem with DNS-based load balancing is that any client application can learn the IP addresses of all of our servers.
Not only does this compromise our system’s security by revealing its internal workings, but it also threatens the system’s availability.
The reason for this is that a malicious client application might theoretically select one IP address and deliver requests only to that server, causing it to become significantly more overloaded than the others.
Two more load balancing systems are available that are significantly more robust and sophisticated, and they can be used to overcome all these shortcomings.
Hardware load balancers and software load balancers are the two options here.
Software load balancers are just programs that can operate on any general-purpose computer, but hardware load balancers run on specialized devices designed and tuned exclusively for load balancing.
In contrast to DNS load balancing, all client-server communication in the case of software and hardware load balancers is sent through the load balancer.
To rephrase, our solution is much more secure because neither the individual IP addresses nor the total number of servers we have behind the load balancer is visible to the end users.
Furthermore, both hardware and software load balancers can actively detect when one of our servers has become unresponsive by sending periodic health checks to the server.
Finally, hardware and software load balancers are becoming increasingly capable of distributing workloads across our servers in a way that takes into account a wide variety of factors, including the type of hardware each application instance is running on, the current load on each server, the number of open connections, and so on.
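As a sketch of one such smarter policy, the following Python snippet prefers the server with the fewest open connections relative to its capacity; the addresses, capacities, and connection counts are invented for illustration.

```python
# Invented pool: each server has a relative capacity (hardware strength)
# and a count of currently open connections.
servers = {
    "10.0.0.1:8080": {"capacity": 4, "open_connections": 12},
    "10.0.0.2:8080": {"capacity": 1, "open_connections": 2},
    "10.0.0.3:8080": {"capacity": 2, "open_connections": 3},
}

def pick_server() -> str:
    """Weighted least-connections: lowest load per unit of capacity wins."""
    return min(
        servers,
        key=lambda s: servers[s]["open_connections"] / servers[s]["capacity"],
    )

target = pick_server()
servers[target]["open_connections"] += 1
print(target)  # -> 10.0.0.3:8080 (ratio 1.5 beats 3.0 and 2.0)
```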
In addition to balancing requests from external users, both software and hardware load balancers can be used internally to establish an abstraction between different services.
In the case of an e-commerce platform, for instance, it is possible to divide the front-end service responsible for responding to customers’ browser queries from the back-end services responsible for fulfilling orders and charging customers.
Separate deployments of individual services are possible in the form of several application instances hosted by a cluster of computers. The load balancer acts as a conduit for the communication between the services.
This allows us to scale each service independently, without impacting the others.
Load balancers, whether software or hardware, are preferable to DNS in terms of load balancing, monitoring, failure recovery, and security; however, they are typically collocated with the cluster of servers they manage.
All traffic between the client and the servers must pass through the load balancer, so placing it too far from the servers would introduce significant delays.
If the system is deployed across several physical locations, a single load balancer serving both sets of servers will inevitably compromise the performance of one of the data centers.
We still need a DNS solution to translate human-friendly domain names into IP addresses, as load balancers do not solve this problem on their own.
For this purpose, the Global Server Load Balancer (GSLB) exists as a fourth load-balancing option.
A GSLB (Global Server Load Balancer) is sort of a hybrid model between a DNS service and the software or hardware load balancer.
In most cases, a GSLB solution can function as a DNS server in the same way as other DNS servers do. However, it also has the capability of making smarter routing choices.
The GSLB can, on the one hand, determine the user’s location from the origin IP included in the request.
On the other hand, a GSLB can monitor our servers in the same way a regular software or hardware load balancer does.
We register all of our servers with our GSLB, and it always knows where they are and what their status is.
In a large, geographically distributed deployment, the servers registered with the GSLB are typically the load balancers sitting in each of our data centers.
When a user issues a DNS query, the GSLB can respond with the IP address of the load balancer closest to that user.
From then on, the user will use that IP address to access our system in that data center directly, without going through any intermediary servers.
The unique feature of GSLBs is that they may be set up to prioritize traffic based on criteria other than physical proximity.
Because of their ongoing connection with our data centers, they may be set up to direct traffic based on factors such as the current CPU load at each data center, the best-projected response time, and the bandwidth available between the user and a specific data center.
As a result of this fantastic feature, we can ensure that all users, regardless of where they happen to be located, receive optimal performance.
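The following Python sketch illustrates the kind of decision a GSLB might make when answering a DNS query; the data centers, addresses, latency and load figures, and the scoring formula are all invented for illustration.

```python
# Invented view of our data centers, as a GSLB might track them:
# the front-door load balancer address, measured round-trip time to the
# requesting user, and current CPU load.
DATA_CENTERS = {
    "eu-west":  {"lb_ip": "203.0.113.10", "rtt_ms": 40,  "cpu_load": 0.80},
    "us-east":  {"lb_ip": "203.0.113.20", "rtt_ms": 110, "cpu_load": 0.30},
    "ap-south": {"lb_ip": "203.0.113.30", "rtt_ms": 210, "cpu_load": 0.20},
}

def answer_dns_query() -> str:
    """Answer with the load balancer of the best-scoring data center."""
    def score(name: str) -> float:
        dc = DATA_CENTERS[name]
        # Lower is better: latency dominates, but heavy load nudges
        # traffic toward a slightly more distant data center.
        return dc["rtt_ms"] * (1 + dc["cpu_load"])
    best = min(DATA_CENTERS, key=score)
    return DATA_CENTERS[best]["lb_ip"]

print(answer_dns_query())  # -> 203.0.113.10 (40 * 1.8 = 72 is the best score)
```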
GSLBs also play a crucial role in disaster recovery.
Users can be redirected to alternative data centers in the event of a disaster or power outage at one of our facilities, increasing our availability.
Finally, to ensure that no single load balancer becomes a single point of failure in any given region, we can deploy multiple load balancers and register all of their addresses with the GSLB’s DNS service or any other DNS service.
Client applications can then acquire the list of all our load balancers and either choose one at random or use the first one in the list.
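A minimal sketch of that client-side behavior in Python, assuming the domain resolves to all of our registered load balancers:

```python
import random
import socket

def pick_load_balancer(domain: str) -> str:
    """Resolve the domain and pick one load balancer address at random."""
    infos = socket.getaddrinfo(domain, 443, proto=socket.IPPROTO_TCP)
    addresses = list({info[4][0] for info in infos})
    return random.choice(addresses)

# Each client spreads its traffic across whichever load balancers
# are registered in DNS, so none of them is a single point of failure.
print(pick_load_balancer("example.com"))
```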
The following are the types of load balancers:
Network Load Balancer
Application Load Balancer
Classic Load Balancer
Application Load Balancers route HTTP/HTTPS traffic. With an Application Load Balancer, requests from clients are forwarded to one or more open ports on each container instance in your cluster, with the load balancer deciding which port to use. Dynamic host port mapping is a feature of Application Load Balancers.
The Network Load Balancer’s only concern is to route the incoming TCP or UDP connection to its intended destination. When an HTTP request comes in, the Network Load Balancers don’t analyze it. In this sense, the Network Load Balancers are substantially less active than the Application Load Balancers. Therefore, the Network Load Balancers spend much less time forwarding requests.
NGINX
NGINX is a free, open-source, high-performance HTTP server and reverse proxy (load balancer). It is known for its high performance, stability, rich feature set, and simple configuration.
HAProxy
HAProxy is a free, open-source, highly reliable, high-performance HTTP/TCP load balancer.
HAProxy is specifically suited for high-traffic websites and powers a significant portion of the world’s most visited ones.
HAProxy is considered the de-facto standard open-source load balancer and is shipped with most mainstream Linux distributions.
HAProxy supports most Unix-style operating systems.
Popular cloud load balancing services include:
Microsoft Azure Load Balancer
AWS – Elastic Load Balancing (ELB)
GCP – Cloud Load Balancing
In this article, we covered the load balancer, one of the important software architecture building blocks, and the different quality attributes that a load balancer provides to our system. We also discussed four load balancing solutions: DNS load balancing, hardware load balancing, software load balancing, and Global Server Load Balancing.