Leveraging vSAN for Highly Available Management Clusters: Use Case Architecture
   
Use Case Architecture
4.1 Two-Node vSAN Management Cluster Overview
This section describes an example design in which a Cloud Management Platform (CMP) is deployed on a cluster that uses vSAN to provide basic resiliency. The design elements for this architecture include the following:
The management cluster holds dedicated compute, network, and storage resources.
The management cluster is entirely located at a single site.
The cluster supports a full CMP suite of products, including optional CSP-related services.
vSAN powers the storage solution, vSphere HA is enabled for the VMs, and additional high availability features (vSphere FT and vSphere Data Protection) are enabled where required for the workloads.
The following figure illustrates the deployment for this scenario.
Figure 6. Example of Cloud Management Platform Deployment over a vSAN Cluster
 
Additional specific configurations, also represented in the preceding figure, include the following:
A minimal two-node cluster is implemented.
A witness host appliance for vSAN is deployed on another server in the local data center.
Two fault domains are defined, one for each of the two rack servers that host the nodes of the cluster. A third fault domain contains the witness host.
The Failures to Tolerate (FTT) policy is set to 1, so that vSAN maintains two copies of each object.
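The relationship between the FTT setting, the number of object copies, and the number of fault domains can be sketched as follows. This is a minimal illustration that assumes RAID-1 mirroring, where an FTT of n requires n + 1 copies and 2n + 1 fault domains (witness included). The function name mirroring_requirements is hypothetical and is not part of any VMware tool or API.

```python
def mirroring_requirements(ftt: int) -> dict:
    """Return copy and fault-domain counts for RAID-1 mirroring at a given FTT.

    Assumption: mirroring needs ftt + 1 data copies and 2 * ftt + 1 fault
    domains so that a majority of components survives ftt failures.
    """
    if ftt < 0:
        raise ValueError("ftt must be non-negative")
    return {
        "data_copies": ftt + 1,
        "min_fault_domains": 2 * ftt + 1,
    }


# The two-node design in this section: FTT = 1.
two_node = mirroring_requirements(1)
print(two_node)
```

For FTT = 1 the sketch yields two data copies and three fault domains, which corresponds to the two data nodes plus the witness host in this design.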
The availability of the vSAN architecture in this example is close to four 9s, or 99.99%. vSAN can provide higher availability when the FTT value is raised, which increases the number of copies per object and the number of fault domains required, and therefore the number of data nodes in the cluster.
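To put the availability figure in perspective, the following sketch converts an availability rate into expected downtime per year and evaluates a simplified quorum model. The model is an assumption made for illustration only: it treats the three fault domains as independent, gives each the same availability, and considers an object accessible when at least two of the three are up. The per-domain value of 0.99 is hypothetical and does not come from this design.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def annual_downtime_minutes(availability: float) -> float:
    """Expected downtime per year for a given availability rate (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def two_of_three_availability(a: float) -> float:
    """Probability that at least 2 of 3 independent fault domains are up.

    Simplifying assumption: identical, independent fault domains.
    """
    return 3 * a**2 * (1 - a) + a**3


# Four 9s, as quoted for this architecture.
downtime = annual_downtime_minutes(0.9999)

# Hypothetical per-domain availability, for illustration only.
quorum = two_of_three_availability(0.99)

print(f"downtime at 99.99%: {downtime:.2f} minutes per year")
print(f"2-of-3 availability at 0.99 per domain: {quorum:.6f}")
```

An availability of 99.99% corresponds to roughly 52.56 minutes of downtime per year. Real figures depend on the hardware metrics listed in this section, which the simplified model does not capture.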
Some of the metrics needed to compute the overall availability are variable and lie outside the scope of this architecture example because they are primarily linked to hardware resources. These variable metrics include the following:
Sizing of the resources presented to the virtual machines running the management workloads.
Rack (power supplies, cabling, and so on).
Top-of-the-rack network switches.
Host physical server and hardware components (including CPUs, memory, controller, and so on).
Storage device MTBF (traditional spindle-based disks and high-performance flash devices).
Storage device capacity and performance, which influence rebuild time.
FTT setting, which influences the required capacity of the management cluster.
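The last two items can be made concrete with a rough estimate. The sketch below assumes RAID-1 mirroring, where each additional copy consumes the full object size, and a constant resynchronization throughput. Both function names and the input values (4 TB of usable data, 200 MB/s) are hypothetical. The estimate ignores witness metadata, slack space, and other overheads.

```python
def raw_capacity_tb(usable_tb: float, ftt: int) -> float:
    """Raw capacity consumed under RAID-1 mirroring: one full copy per replica."""
    return usable_tb * (ftt + 1)


def rebuild_hours(data_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to re-copy data_tb at a sustained throughput in MB/s."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = (data_tb * 1_000_000) / throughput_mb_s  # 1 TB = 1,000,000 MB
    return seconds / 3600


# Hypothetical inputs, for illustration only.
print(raw_capacity_tb(4.0, ftt=1))            # raw TB needed for 4 TB usable
print(round(rebuild_hours(4.0, 200.0), 2))    # hours to rebuild 4 TB
```

With these inputs, 4 TB of usable data requires 8 TB of raw capacity at FTT = 1, and a full rebuild of one copy takes about 5.6 hours. Larger or slower devices lengthen the window during which the cluster runs with reduced protection.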