Intelligent Scaling

Intelligent scaling eliminates the drawbacks of both fixed infrastructure and scale-everything dynamic infrastructure, at the cost of increased complexity in the system. An infrastructure that performs intelligent scaling must consider the current state of every component in the system to identify which components are responsible for degrading the KPI, or where the service is currently over-provisioned. The system must therefore monitor the KPI (the set point) as well as the details of the system itself (infrastructure and application monitoring).

When a KPI event occurs, the system performs an analysis to determine next steps.

The decision flowchart for intelligent scaling must perform the following tasks:

Identifying a KPI violation is a common requirement of all the scale-out modes. In all the other modes, however, the violation directly triggers the scaling event. In the intelligent scaling model, the KPI violation instead triggers a second phase, a causal analysis, to determine the cause of the violation.
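As a rough illustration of this two-phase flow, the following Python sketch compares an observed KPI value against its set point and, on a violation, hands off to an analysis step instead of scaling directly. The names `KpiSetPoint`, `detect_kpi_event`, and `on_kpi_sample`, the tolerance band, and the assumption that a KPI far better than its target hints at over-provisioning are all hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class KpiSetPoint:
    target: float     # hypothetical target, e.g. response time in ms
    tolerance: float  # allowed deviation as a fraction of the target


def detect_kpi_event(observed: float, set_point: KpiSetPoint) -> Optional[str]:
    """Return the kind of KPI event, or None if the KPI is within bounds."""
    upper = set_point.target * (1 + set_point.tolerance)
    lower = set_point.target * (1 - set_point.tolerance)
    if observed > upper:
        return "degraded"
    if observed < lower:
        # Assumption: a KPI far better than needed hints at over-provisioning.
        return "over_provisioned"
    return None


def on_kpi_sample(observed: float, set_point: KpiSetPoint,
                  analyze: Callable[[str], None]) -> None:
    """Trigger causal analysis (not a scaling action) when a KPI event occurs."""
    event = detect_kpi_event(observed, set_point)
    if event is not None:
        analyze(event)
```

In this sketch the `analyze` callback stands in for the causal analysis phase; a scale-everything design would call the orchestrator at that point instead.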

To determine the KPI violation root cause and whether to scale the service, you must analyze the infrastructure and application metrics. This can be done using the following techniques:

The analysis results in one of two outcomes: the violation is either performance-related or capacity-related, or it is caused by a problem with the service. In the case of a service problem, an alert is issued, and the scaling tasks take no further action.
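The two outcomes can be sketched as a small classification step. This is only a minimal illustration under stated assumptions: the saturation threshold of 0.85, the rule that a degraded KPI with no saturated resource points to a service problem, and the names `classify_violation` and `handle_outcome` are hypothetical. A real analysis would rely on richer infrastructure and application metrics.

```python
from typing import Callable, Dict

# tier name -> resource name -> utilization as a fraction between 0 and 1
Utilization = Dict[str, Dict[str, float]]


def classify_violation(event: str, utilization: Utilization,
                       saturation: float = 0.85) -> str:
    """Classify a KPI event as 'capacity' or 'service_problem'."""
    if event == "over_provisioned":
        return "capacity"
    # Assumed heuristic: a degraded KPI is capacity-related only if
    # at least one resource in some tier is saturated.
    saturated = any(value >= saturation
                    for resources in utilization.values()
                    for value in resources.values())
    return "capacity" if saturated else "service_problem"


def handle_outcome(outcome: str, alert: Callable[[str], None],
                   localize: Callable[[], None]) -> None:
    """Alert and stop on a service problem; otherwise continue to localization."""
    if outcome == "service_problem":
        alert("KPI violation not explained by capacity; check the service")
        return  # no further action by the scaling tasks
    localize()
```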

If the analysis determines that the root cause is a bottleneck or over-provisioning, the next task is to localize the cause.

During the localization phase, the service metrics are analyzed further to identify the root cause of the capacity issue: which tier needs additional resources, and which type of resource it needs.
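One simple way to picture localization is to pick the tier and resource type with the highest utilization. The sketch below assumes that utilization figures are already collected per tier and per resource; the function name `localize_bottleneck` and the metric layout are hypothetical, and a real system may need correlation or trend analysis rather than a single maximum.

```python
from typing import Dict, Tuple


def localize_bottleneck(
        utilization: Dict[str, Dict[str, float]]) -> Tuple[str, str, float]:
    """Return (tier, resource, utilization) for the most heavily used resource."""
    candidates = [(tier, resource, value)
                  for tier, resources in utilization.items()
                  for resource, value in resources.items()]
    if not candidates:
        raise ValueError("no utilization metrics available")
    # Highest utilization wins; on a tie, max() keeps the first one encountered.
    return max(candidates, key=lambda candidate: candidate[2])


# Hypothetical sample metrics for a two-tier service.
metrics = {
    "web": {"cpu": 0.45, "memory": 0.50},
    "db": {"cpu": 0.93, "memory": 0.60},
}
tier, resource, value = localize_bottleneck(metrics)
print(f"bottleneck: {tier}/{resource} at {value:.0%}")
```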

Are the following required?

Identify the location of the capacity issue by using techniques such as the following:

After the system knows that there is a performance-related or capacity-related issue, and where the issue is located, it can issue a scaling request to the orchestrator to resolve it. The solution might be to add web servers or another database node to a cluster. It might also be to remove capacity from the service to bring costs back in line.
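The scaling request itself can be pictured as a small, bounded change to one tier. The following sketch builds such a request as a plain dictionary. The function `build_scaling_request`, its field names, and the replica limits are hypothetical; the payload that an actual orchestrator expects will differ.

```python
from typing import Dict, Union


def build_scaling_request(tier: str, direction: str, current_replicas: int,
                          step: int = 1, min_replicas: int = 1,
                          max_replicas: int = 10) -> Dict[str, Union[str, int]]:
    """Build a request that scales one tier out or in, within fixed bounds."""
    if direction not in ("out", "in"):
        raise ValueError("direction must be 'out' or 'in'")
    delta = step if direction == "out" else -step
    # Clamp so that the request never leaves the allowed replica range.
    desired = max(min_replicas, min(max_replicas, current_replicas + delta))
    return {
        "tier": tier,
        "current_replicas": current_replicas,
        "desired_replicas": desired,
    }


# Add one web server to relieve a bottleneck in the web tier.
scale_out = build_scaling_request("web", "out", current_replicas=3)
# Remove one database node to bring costs back in line.
scale_in = build_scaling_request("db", "in", current_replicas=4)
```

Clamping to a minimum and maximum is a design choice in this sketch: it keeps a faulty analysis from scaling a tier to zero or growing it without limit.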