The Scale Cube

Jul 11, 2019

The scale cube is a useful visualization of a three-dimensional scalability model, shown in the figure below:

This cube is described in Martin Abbott and Michael Fisher’s excellent book, The Art of Scalability (Addison-Wesley, 2015).

The scale cube defines three separate ways to scale an application: X, Y, and Z.

X-Axis scaling load balances requests across multiple instances. It is a common way to scale a monolithic application. Multiple instances of the application are run behind a load balancer. The load balancer distributes the requests among N identical instances of the application. This way of scaling improves the capacity and availability of the application.
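As a rough illustration of X-axis scaling, here is a minimal sketch of a round-robin load balancer. The class name `RoundRobinBalancer` and the instance names are made up for this example, and real load balancers also handle health checks and failover, which are left out here.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy X-axis load balancer: every instance can handle any request."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        # cycle() repeats the instance list endlessly, in order.
        self._next_instance = cycle(instances)

    def route(self, request):
        # The request content is ignored: instances are interchangeable.
        return next(self._next_instance)


# Hypothetical instance names, for illustration only.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.route({"path": "/orders"}) for _ in range(4)]
print(assignments)
```

Because the instances are identical, the balancer never needs to look inside the request. Adding capacity means adding another name to the list.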

Z-Axis scaling also runs multiple instances of the application, but here each instance is responsible for only a subset of the data. The data is partitioned among the N identical instances, and the load balancer routes each request to the right instance based on a request attribute. An application might, for example, route requests using userId.
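Routing on a request attribute could look something like the sketch below. It assumes requests are dictionaries carrying a `userId` string and uses a simple hash-modulo scheme. `shard_for` and `PartitionRouter` are hypothetical names, and production systems often prefer consistent hashing or a lookup table so that changing N does not reshuffle every user.

```python
import hashlib


def shard_for(user_id: str, shard_count: int) -> int:
    """Map a user ID to a partition index in a stable, deterministic way."""
    # hashlib is used instead of the built-in hash(), which is randomized
    # per process for strings and would give different results on restart.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class PartitionRouter:
    """Toy Z-axis router: each instance owns a subset of the users."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._instances = list(instances)

    def route(self, request):
        # Unlike X-axis scaling, the request attribute decides the target.
        index = shard_for(request["userId"], len(self._instances))
        return self._instances[index]


# Hypothetical instance names, for illustration only.
router = PartitionRouter(["shard-0", "shard-1", "shard-2"])
first = router.route({"userId": "alice"})
second = router.route({"userId": "alice"})
print(first == second)  # the same user always lands on the same instance
```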

Y-Axis scaling functionally decomposes an application into services (also known as microservices).

X- and Z-axis scaling improve the application’s capacity and availability. But neither of these approaches solves the problem of increasing development and application complexity.

Y-Axis scaling splits the application into multiple services. Each service performs a specific function. So, Y-Axis scaling decomposes a large monolithic application into small services, each of which can be scaled further by using X-Axis and Z-Axis scaling independently.
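To make the idea concrete, here is a small sketch of functional routing, where the request path decides which service handles it and each service has its own pool of instances. The `/orders` and `/catalog` services, their instance names, and the `route` function are all invented for this example.

```python
from itertools import cycle

# Hypothetical services, each with its own independently sized pool.
# The order service runs three instances, the catalog service only one.
SERVICE_POOLS = {
    "/orders": cycle(["orders-1", "orders-2", "orders-3"]),
    "/catalog": cycle(["catalog-1"]),
}


def route(path: str) -> str:
    """Pick a service by function (Y-axis), then an instance (X-axis)."""
    for prefix, pool in SERVICE_POOLS.items():
        if path.startswith(prefix):
            # Within a service, instances are interchangeable,
            # so a simple round-robin is enough for this sketch.
            return next(pool)
    raise LookupError(f"no service handles {path}")


print(route("/orders/42"))
print(route("/catalog/7"))
```

Each pool can grow or shrink without touching the others, which is the practical payoff of splitting by function first.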


Written by Karan Sharma
