Load balancer
Load balancing is a service, performed by a load balancer, that assigns workloads to a set of networked servers so that their computing resources are used as efficiently as possible.
Introduction
A load balancer can be used to increase the capacity of a server farm beyond that of a single server. It can also allow the service to continue despite server downtime caused by server failure or server maintenance.
A load balancer consists of a virtual server (also referred to as a vserver or VIP) which, in turn, consists of an IP address and port. This virtual server is bound to a number of physical services running on the physical servers in a server farm. Each physical service is identified by its physical server's IP address and port. A client sends a request to the virtual server, which in turn selects a physical server in the server farm and directs the request to it.
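The relationship between a virtual server and its bound physical services can be sketched as a small data model. This is only an illustration: the class names `Service` and `VirtualServer`, the `bind` and `select` methods, and the example addresses are assumptions, not part of any particular product. Plain rotation stands in for whichever load balancing method is configured.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Service:
    """A physical service: the real server's IP address and port."""
    ip: str
    port: int


@dataclass
class VirtualServer:
    """A virtual server (VIP): the IP address and port that clients connect to."""
    ip: str
    port: int
    services: List[Service] = field(default_factory=list)
    _next: int = 0  # index of the next service to hand out

    def bind(self, service: Service) -> None:
        """Bind a physical service to this virtual server."""
        self.services.append(service)

    def select(self) -> Service:
        """Pick a bound service (simple rotation, used here as a placeholder)."""
        if not self.services:
            raise RuntimeError("no services bound to this virtual server")
        service = self.services[self._next % len(self.services)]
        self._next += 1
        return service


# Hypothetical addresses taken from documentation and private ranges.
vip = VirtualServer("203.0.113.10", 80)
vip.bind(Service("10.0.0.1", 8080))
vip.bind(Service("10.0.0.2", 8080))
picked = [vip.select() for _ in range(3)]
```

Each client request addressed to `203.0.113.10:80` would be forwarded to whichever `Service` the call to `select()` returns.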
Different virtual servers can be configured for different sets of physical services, such as TCP and UDP services in general. Protocol- or application-specific virtual servers that may be supported include HTTP, FTP, SSL, SSL BRIDGE, SSL TCP, NNTP and DNS.
The load balancing methods manage the selection of an appropriate physical server in a server farm.
Persistence can be configured on a virtual server; once a server is selected, subsequent requests from the client are directed to the same server. Persistence is necessary in applications where state is maintained on the server, such as a shopping cart application.
Load balancers also monitor the services in a server farm. If a service fails, the load balancer continues to balance load across the remaining services that are UP. If all the servers bound to a virtual server fail, requests may be sent to a backup virtual server (if configured) or optionally redirected to a configured URL, for example a page on a local or remote server that provides information on the site maintenance or outage.
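The fallback order described above (remaining UP services, then a backup virtual server, then a redirect URL) could be expressed roughly as follows. The function name `choose_target`, its parameters, and the action labels are hypothetical, and the final "reject" case is an assumption about what happens when nothing is configured.

```python
from typing import List, Optional, Tuple


def choose_target(
    primary_up: List[str],
    backup_up: List[str],
    redirect_url: Optional[str],
) -> Tuple[str, Optional[str]]:
    """Return an (action, target) pair for one incoming request.

    primary_up   -- services on the virtual server currently marked UP
    backup_up    -- UP services on the backup virtual server (empty if none)
    redirect_url -- configured redirect URL, or None if not configured
    """
    if primary_up:
        # Normal case: balance across whatever is still UP.
        return ("forward", primary_up[0])
    if backup_up:
        # Every primary service is DOWN: use the backup virtual server.
        return ("backup", backup_up[0])
    if redirect_url is not None:
        # No server available: send the client to the outage page.
        return ("redirect", redirect_url)
    # Assumption: with nothing configured, the request cannot be served.
    return ("reject", None)
```

For brevity the sketch always takes the first available service; a real load balancer would apply its configured load balancing method at that point.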
Among the server types that may be load balanced are:
- Server farms
- Caches
- Firewalls
- Intrusion detection systems
- SSL offload or compression appliances
In Global Server Load Balancing (GSLB), the load balancer distributes load to a geographically distributed set of server farms based on health, load or proximity.
Load Balancing Methods
- Least connections
- Round robin
- Least response time
- Least bandwidth
- Least packets
- Token
- URL hashing
- Domain name hashing
- Source IP address
- Destination IP address
- Source IP and destination IP
- Round-trip time (RTT), used for GSLB
- Static proximity, used for GSLB
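Three of the methods listed above (round robin, least connections, and source IP address hashing) can be sketched in a few lines each. The function names and the choice of SHA-256 as the hash are assumptions made for illustration; actual products may compute these differently.

```python
import hashlib
import itertools
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield servers in a fixed rotation, one per request."""
    return itertools.cycle(servers)


def least_connections(active: Dict[str, int]) -> str:
    """Return the server with the fewest active connections.

    Ties go to the server that appears first in the mapping.
    """
    return min(active, key=lambda server: active[server])


def source_ip_hash(servers: List[str], client_ip: str) -> str:
    """Map a client IP address to a server by hashing the address.

    The same client IP always maps to the same server while the
    server list is unchanged.
    """
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["s1", "s2", "s3"]
rotation = round_robin(servers)
first_four = [next(rotation) for _ in range(4)]  # s1, s2, s3, s1
```

Round robin ignores server state entirely, least connections needs a live connection count per server, and hashing needs no shared state but redistributes clients whenever the server list changes.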
Persistence
After a load balancer has selected a physical server for a client's first request, some applications require all subsequent requests from that client to be sent to the same physical server, because the state information for that client is held there.
Common persistence types
- Source IP
- Cookie insert
- SSL session ID
- URL passive
- Custom server ID
- Rule
- Destination IP
- Source and destination IP
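As an illustration of the simplest type above, source IP persistence can be modelled as a table that maps a client address to the server chosen for it. The class name `SourceIPPersistence` and the idle timeout after which an entry expires are assumptions for this sketch; the injectable `clock` exists only to make the behaviour easy to demonstrate.

```python
import time
from typing import Callable, Dict, Tuple


class SourceIPPersistence:
    """Remember which server each client IP address was sent to."""

    def __init__(
        self,
        timeout_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.timeout_s = timeout_s  # assumed idle timeout for an entry
        self.clock = clock
        # client IP -> (server, time the entry was last used)
        self._table: Dict[str, Tuple[str, float]] = {}

    def lookup(self, client_ip: str, select: Callable[[], str]) -> str:
        """Return the persisted server, or call select() to choose a new one."""
        now = self.clock()
        entry = self._table.get(client_ip)
        if entry is not None and now - entry[1] < self.timeout_s:
            server = entry[0]  # entry still valid: stick to the same server
        else:
            server = select()  # no entry or expired: run load balancing
        self._table[client_ip] = (server, now)  # refresh the idle timer
        return server
```

The `select` callback stands for whatever load balancing method is configured; it is only invoked when no valid persistence entry exists for the client.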
Monitoring
Server monitoring checks the state of a server by periodically probing a specified destination and taking appropriate action based on the response. A monitor specifies the type of request sent to the server and the response expected from it. The load balancer sends these requests periodically, and each response must arrive within the configured response timeout. If the configured number of probes fail, the server is marked "DOWN" and the next probe is sent after the configured down time. The destination of the probe may differ from the server's IP address and port. A load balancer may support multiple monitors; when a service is bound to multiple monitors, its status is determined from the results of all of them.
Common monitor types
- Ping
- TCP connection
- Application, e.g., query and expected response check
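The probing behaviour described in this section can be sketched as a small state tracker together with a TCP connection probe. The names `tcp_probe`, `Monitor`, and `service_is_up` are hypothetical. Two points are assumptions rather than statements from the text: failures are counted consecutively, and a service bound to several monitors counts as UP only when every monitor reports UP.

```python
import socket
from typing import Callable, Iterable


def tcp_probe(host: str, port: int, timeout_s: float) -> bool:
    """Return True if a TCP connection opens within the response timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


class Monitor:
    """Track UP/DOWN state from a sequence of probe results."""

    def __init__(
        self,
        probe: Callable[[], bool],
        retries: int,
        interval_s: float,
        down_time_s: float,
    ) -> None:
        self.probe = probe
        self.retries = retries          # failed probes before marking DOWN
        self.interval_s = interval_s    # normal delay between probes
        self.down_time_s = down_time_s  # delay before re-probing a DOWN server
        self.failures = 0
        self.state = "UP"

    def run_once(self) -> float:
        """Send one probe and return the delay before the next probe."""
        if self.probe():
            self.failures = 0
            self.state = "UP"
            return self.interval_s
        self.failures += 1
        if self.failures >= self.retries:
            self.state = "DOWN"
            return self.down_time_s
        return self.interval_s


def service_is_up(monitors: Iterable[Monitor]) -> bool:
    """Assumed rule: the service is UP only if every bound monitor is UP."""
    return all(m.state == "UP" for m in monitors)
```

A TCP monitor for a hypothetical service would be created as `Monitor(lambda: tcp_probe("10.0.0.1", 8080, 2.0), retries=3, interval_s=5.0, down_time_s=30.0)`; a ping or application-level check would supply a different `probe` callable.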
