What Is Scalability?

Scalability means the ability of an application to handle & withstand an increased workload without sacrificing latency. In other words, it is the term we use to describe a system’s ability to cope with increased load.

For instance, if your app takes x seconds to respond to a single user request, it should take the same x seconds to respond to each of a million concurrent user requests.

The backend infrastructure of the app should not crumble under the load of a million concurrent requests. It should scale well when subjected to heavy traffic & maintain the latency of the system.

What Is Latency?

Latency is the amount of time a system takes to respond to a user request. Let’s say you send a request to an app to fetch an image & the system takes 2 seconds to respond to your request. The latency of the system is 2 seconds.
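As a rough illustration, latency can be measured by timing the gap between sending a request & receiving the response. The sketch below is a minimal example; `fetch_image` is a hypothetical handler standing in for a real endpoint.

```python
import time

def measure_latency(handler, *args):
    """Return the handler's response & the elapsed time in seconds."""
    start = time.perf_counter()
    response = handler(*args)
    return response, time.perf_counter() - start

# Hypothetical handler standing in for a real image-fetch endpoint;
# the 50 ms sleep simulates the time the system takes to respond.
def fetch_image(image_id):
    time.sleep(0.05)
    return f"image-{image_id}"

response, latency = measure_latency(fetch_image, 42)
print(f"{response} served in {latency:.3f}s")
```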

Minimum latency is what efficient software systems strive for. No matter how much the traffic load on a system builds up, the latency should not go up; that is what scalability means.

If the latency remains the same, we can say that the application scaled well with the increased load & is highly scalable.
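One informal way to check this is a toy load test: fire many concurrent requests & see whether per-request latency holds steady. A minimal sketch, where the hypothetical `handle_request` simulates its work with a 10 ms sleep:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(_):
    """Hypothetical endpoint; the 10 ms sleep simulates its work."""
    start = time.perf_counter()
    time.sleep(0.01)
    return time.perf_counter() - start

# Fire 100 concurrent requests; if the app scales, each request should
# still take roughly the same ~10 ms it takes in isolation.
with ThreadPoolExecutor(max_workers=100) as pool:
    latencies = list(pool.map(handle_request, range(100)))

print(f"max per-request latency: {max(latencies):.3f}s")
```

Because the simulated work here is pure waiting, latency stays flat as concurrency grows; a real endpoint contending for CPU, locks, or database connections may not be so lucky.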

Measuring Latency

Latency is measured as the time difference between the action the user takes on the website (an event such as the click of a button) & the system’s response to that event.

This latency is generally divided into two parts:

  1. Network Latency
  2. Application Latency
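The two parts simply add up to the end-to-end figure the user experiences. A back-of-the-envelope sketch, with assumed numbers:

```python
# Assumed figures for illustration, in seconds.
network_latency = 0.120      # time packets spend travelling on the wire
application_latency = 0.080  # time the server spends processing the request

# What the user experiences is the sum of the two.
total_latency = network_latency + application_latency
print(f"end-to-end latency: {total_latency:.3f}s")
```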

Network Latency

Network latency is the amount of time the network takes to send a data packet from point A to point B. The network should be efficient enough to handle the increased traffic load on the website. To cut down network latency, businesses use CDNs (Content Delivery Networks) & try to deploy their servers across the globe, as close to the end user as possible.
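Deploying close to the end user pays off because a request can be routed to whichever server answers fastest. The sketch below uses made-up round-trip times for hypothetical regions:

```python
# Assumed round-trip times (seconds) from one user to hypothetical regions.
measured_rtt = {
    "us-east": 0.180,
    "eu-west": 0.040,
    "ap-south": 0.250,
}

# Route the request to the deployment with the lowest network latency.
nearest = min(measured_rtt, key=measured_rtt.get)
print(f"route to {nearest} ({measured_rtt[nearest]:.3f}s round trip)")
```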

Application Latency