Cloud-based applications are great. They free users from hardware, obsolescence, maintenance, and so on. But what challenges do providers face? And how do they change the way software is designed?
Let’s talk about scalability and efficiency.
We’re all familiar with efficiency in one way or another. A Formula 1 car is efficient because it converts every bit of energy into speed and acceleration. The drawback is that its range of use is quite limited. A solar-powered car is also very efficient because it converts every bit of energy into motion. In this case, its range of use is limited by the tiny amount of energy available.
Efficiency is always about getting the maximum output from some constrained resource. The game is about maximizing that output given the restriction.
With scalability, we still address one or more performance dimensions, like speed, space, and time. Here, however, resources aren’t the constraint; they are the variable. The problem is ensuring that, given the necessary amount of resources, the system can produce the desired output.
For example, a train is a scalable solution. You can increase its capacity by adding more cars. And as it gets heavier, you can add more locomotives.
A more formal way to express this concept:
- Increasing the available resources enables the system to cope with higher loads while maintaining the same level of service.
If an order management system serves 10 customers with 4 CPUs guaranteeing a 10-ms latency, then 20 customers will be served by 8 CPUs and the latency will still be guaranteed to be 10 ms.
- Ideally, the amount of resources needed increases at most linearly with the load (less than linearly is better).
That is, there should be an economy of scale so that each additional unit of load costs no more than the last. Consider our hypothetical order management system: if the relation were quadratic, so that doubling the number of customers required four times the resources, the system would be uneconomic (though there are exceptions).
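The arithmetic behind these two points can be sketched in a few lines of Python. The function names are illustrative assumptions, and the baseline (10 customers on 4 CPUs) is simply the figure from the order management example; this is not a sizing model for a real system.

```python
import math

# Baseline from the example above: 10 customers served by 4 CPUs.
BASE_CUSTOMERS = 10
BASE_CPUS = 4


def cpus_linear(customers: int) -> int:
    """CPUs needed if resources grow in proportion to the load."""
    return math.ceil(BASE_CPUS * customers / BASE_CUSTOMERS)


def cpus_quadratic(customers: int) -> int:
    """CPUs needed if resources grow with the square of the load."""
    return math.ceil(BASE_CPUS * (customers / BASE_CUSTOMERS) ** 2)


# Compare the two growth models as the customer base doubles.
for customers in (10, 20, 40, 80):
    print(customers, cpus_linear(customers), cpus_quadratic(customers))
```

Under the linear model the cost per customer stays flat at 0.4 CPUs. Under the quadratic model it doubles every time the load doubles, which is why such a system quickly becomes uneconomic.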
How does this translate into software design?
Scalability and efficiency are not interchangeable. Scalable systems are designed to be scalable, and super-efficient systems are also designed to be that way. It doesn’t happen by chance.
An interesting consideration is that most successful scalable software is also efficient. This is because efficiency enables the provider to keep the resource usage low (which helps increase profitability and improve the user experience). For example, Google is obsessed with efficiency because there is only so much electrical energy available to a data center.
A truck will never run at 300 km/h and an F1 car will never scale: try as you might, you’ll never get your new Ikea wardrobe home with a McLaren F1.
On the other hand, cloud-based software needs to scale because of its very nature. Increasing the number of users, transactions, locations, features, and so on, is at the base of the economy of scale that makes software in the cloud convenient for users and providers.
Although there are various nuances to handling scalability, parallelism is always at the base. Instead of just one instance of the software, there are many instances cooperating and working in parallel. From a technical standpoint, the challenge is making sure the work is coordinated (but this is too large a topic to get into here).
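As a rough illustration of many instances working in parallel, here is a minimal Python sketch. It assumes a hypothetical `process_order` function and uses threads inside one process as a stand-in for separate service instances; a real cloud deployment would spread the work across machines and would need the coordination mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


def process_order(order_id: int) -> str:
    """Hypothetical unit of work; a real service would do I/O here."""
    return f"order-{order_id}-done"


def process_all(order_ids: list[int], workers: int) -> list[str]:
    """Spread the orders across `workers` parallel workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with order_ids.
        return list(pool.map(process_order, order_ids))


results = process_all(list(range(8)), workers=4)
```

In this sketch, handling a higher load means raising `workers` rather than rewriting `process_order`, which is the property a scalable design is after.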
As in the transportation example, software designed to live in the cloud must have scalability built in.
Figure: Development effort of increasing the load in a non-scalable system. Beyond a certain point, a slight increase in load requires a massive amount of development.
How do we recognize software with scalability built in?
An effective cloud-based service provider knows that scalability is about being in control of performance KPIs (key performance indicators). This enables the provider to monetize the investment in scalability (as you’ve probably guessed, designing scalable software isn’t cheap). It also enables the user to access a high-quality service.
Scalability must be specified as a requirement. Requirements can be documented, tested, and made part of the software lifecycle. When performance has become part of the fabric of the software, it can be commoditized and offered as a service. When a provider can offer various levels of service performance (for example, by offering latency SLAs), this is usually a sign that they have engineered the system well.
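One way to picture a performance requirement that can be tested is a check of measured latencies against an SLA limit. The sketch below is only an assumption about how such a check might look: the function names, the choice of the 99th percentile, and the sample values are all hypothetical.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


def meets_latency_sla(samples_ms: list[float], limit_ms: float,
                      pct: float = 99.0) -> bool:
    """True if the given percentile of observed latencies is within the limit."""
    return percentile(samples_ms, pct) <= limit_ms


# Hypothetical measurements, in milliseconds.
observed = [7.2, 8.1, 6.9, 9.4, 8.8, 7.5, 9.9, 8.3]
print(meets_latency_sla(observed, limit_ms=10.0))
```

A check like this can run in the test suite at each load level of interest, so that a regression in latency is caught the same way a functional bug would be.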
A final note about scalability is that it’s part of a set of non-functional requirements that includes availability and resilience. From an engineering standpoint, these requirements are often addressed together because redundancy and parallelism are often used to achieve the goal. A provider who can control the performance of their solution will usually be good at managing its availability as well.
To learn more about SaaS at ION, and what products are already available, speak to a member of our team.