Controlled Scaling

The scaling process needs to be controlled to preserve the overall stability of the application and its underlying infrastructure.

The system's reaction to monitoring information, measured against the desired performance of the system (the set point), needs to be controlled so that the infrastructure does not constantly seek the set point. This allows the infrastructure to operate more efficiently and in a far more stable manner. The simplest control scheme to introduce to a dynamic IaaS is hysteresis.

With this method of control, instead of aggressively seeking the set point for our end-user experience, we create a band around it. Action is taken only when the performance falls outside of this band.

In the above example, as our end-user experience improves in the form of decreased response time, we start to provide higher service quality than we really need and consume too much capacity. This means we are spending too much on the service, so we scale back resources to get as close as possible to our desired SLA (set point). When response time increases and service quality drops, we scale out once we reach our SLA scale-out threshold, adding resources to bring the service level back to our set point.
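The band-based decision described above can be sketched in a few lines of Python. This is only an illustration: the class name DeadBand, the field names, and the 150/200/250 ms values are assumptions chosen for the example and are not taken from any particular platform.

```python
from dataclasses import dataclass


@dataclass
class DeadBand:
    """Hypothetical band around a response-time set point (all values in ms)."""
    set_point_ms: float        # desired response time (the SLA set point)
    scale_in_below_ms: float   # lower edge: faster than this means over-provisioned
    scale_out_above_ms: float  # upper edge: slower than this means SLA at risk

    def decide(self, response_time_ms: float) -> str:
        """Return a scaling decision; act only when outside the band."""
        if response_time_ms > self.scale_out_above_ms:
            return "scale_out"   # service quality too low, add resources
        if response_time_ms < self.scale_in_below_ms:
            return "scale_in"    # service quality higher than needed, remove resources
        return "hold"            # inside the band, take no action


# Illustrative values only: set point 200 ms, band from 150 ms to 250 ms.
band = DeadBand(set_point_ms=200.0, scale_in_below_ms=150.0, scale_out_above_ms=250.0)
decisions = [band.decide(rt) for rt in (120.0, 180.0, 200.0, 240.0, 300.0)]
print(decisions)  # ['scale_in', 'hold', 'hold', 'hold', 'scale_out']
```

Readings of 180 ms and 240 ms both miss the set point, yet neither triggers an action, because only readings outside the band do.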

We might choose to bring our service level slightly above or below the desired set point based on our understanding of the service, how it responds to additional resources, and the cause of the increase itself.

By creating a dead band within the scaling model, we allow the service performance to fluctuate about the set point rather than aggressively seeking it. Aggressive seeking can result in the ping-pong effect, in which resources are repeatedly added and then removed.
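To make the ping-pong effect concrete, the following toy simulation counts scaling actions with and without a dead band. It rests on assumptions that are not part of the text above: response time is modelled simply as load divided by instance count, the load series is invented, and count_scaling_actions is a hypothetical helper. A real service will respond to added resources in a less tidy way.

```python
def count_scaling_actions(loads, set_point, half_band, instances=5):
    """Count scale-out/scale-in actions for a series of load samples.

    Toy model (an assumption): response time = load / instances.
    half_band = 0 means the controller seeks the set point aggressively.
    """
    actions = 0
    for load in loads:
        response_time = load / instances
        if response_time > set_point + half_band:
            instances += 1       # scale out
            actions += 1
        elif response_time < set_point - half_band and instances > 1:
            instances -= 1       # scale in
            actions += 1
    return actions


# Invented load samples that wobble around 1000 units.
loads = [1050, 950, 1040, 960, 1030, 970, 1020, 980]

aggressive = count_scaling_actions(loads, set_point=200, half_band=0)
banded = count_scaling_actions(loads, set_point=200, half_band=50)
print(aggressive, banded)  # 8 0
```

With no band, every sample triggers an action and the instance count bounces between 5 and 6. With a band of plus or minus 50 ms, the same load is absorbed without any scaling.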