Scalability and Designing Physical Resources
With any virtual infrastructure design that is required to scale across hundreds or even thousands of hosts, provide petabytes of storage, and support large, complex networks, extensibility is a key factor. For a successful service provider deployment, scaling a large physical platform while maintaining control, compliance, and security is critical. Taking a predefined building block approach to this type of architecture from day one is therefore paramount in planning for scalability.
In addition, the configuration and assembly process for each system must be standardized, with all components installed identically. Standardizing the configuration of the physical components is critical to providing a manageable and supportable infrastructure because it eliminates variability, which, in turn, reduces the operational effort involved in patch management and helps provide a flexible building block solution that meets a service provider’s requirement for elasticity.
While the configuration and scaling are likely to be hardware-vendor dependent, this model must form part of the platform design and be linked to commercial growth estimates for the services offered.
For instance, the sample design in the following figure represents a possible building block scenario in which the service provider employs a traditional approach to storage for the compute payload clusters, while a hyper-converged architecture, with VMware vSAN™ ready compute nodes deployed, is used for the hardware associated with the management and edge clusters.
From a physical platform perspective, each “Payload vPod” is made up of 96 rackmount ESXi compute hosts, configured as four 24-node VMware vSphere® clusters split equally across three server cabinets, plus a two-node vSphere local management component cluster. Each Payload vPod also houses two 48-port 10-GbE “leaf” switches, two 48-port 8-Gb multilayer fabric switches, and two 1-GbE IPMI management switches for out-of-band connectivity. Each Payload vPod is designed to provide a fault domain for compute, network, and storage.
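As a simple illustration of how this composition arithmetic works, the following Python sketch models the Payload vPod bill of materials described above. The class and attribute names are illustrative only; the quantities come from the sample design.

from dataclasses import dataclass

@dataclass
class PayloadVPod:
    # Quantities from the sample design; names are hypothetical.
    compute_hosts: int = 96        # rackmount ESXi compute hosts
    hosts_per_cluster: int = 24    # four 24-node vSphere clusters
    mgmt_hosts: int = 2            # local management component cluster
    cabinets: int = 3              # compute hosts split equally across cabinets
    leaf_switches: int = 2         # 48-port 10-GbE leaf switches
    fabric_switches: int = 2       # 48-port 8-Gb multilayer fabric switches
    ipmi_switches: int = 2         # 1-GbE out-of-band management switches

    @property
    def payload_clusters(self) -> int:
        return self.compute_hosts // self.hosts_per_cluster  # 4 clusters

    @property
    def hosts_per_cabinet(self) -> int:
        return self.compute_hosts // self.cabinets           # 32 hosts per cabinet

vpod = PayloadVPod()
print(vpod.payload_clusters, vpod.hosts_per_cabinet)         # 4 32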
 
 
Figure 5. Sample Payload vPod Logical Architecture
 
 
The number of Payload vPods in this sample design can be scaled out accordingly, depending on hardware, software, and power limitations. The vSphere components in each Payload vPod are managed by a single vCenter Server instance.
In addition to the Payload vPod, this sample cloud-based architecture requires a Management vPod to house the Top Level Management (TLM), Cloud Management Platform (CMP), and VMware NSX® Edge™ components needed to support all the Payload vPods within the “VMware Cloud Provider Program Data Center Block.” In this sample architecture, the Cloud Platform Management Cluster and Edge Cluster are hosted on distributed vSAN datastores provided by the vSAN ready nodes.
Figure 6. Sample Management vPod Logical Architecture
 
In this sample design, there is a design constraint of 16 Payload vPods per availability zone, due to power limitations in the data center halls. In this architecture, this entity, made up of one Management vPod and 16 Payload vPods, is referred to as a VMware Cloud Provider Program Data Center Block.
This building block architecture is further represented in the data center layout figures that follow, demonstrating one VMware Cloud Provider Program Data Center Block located at a single physical data center across two availability zones.
 
Figure 7. Logical Data Center Layout – Single VMware Cloud Provider Program Data Center Block
 
Figure 8. Physical Data Center Layout – VMware Cloud Provider Program Data Center Block
 
The compute, storage, and network resources available from each component layer of this building block architecture, as specified in this sample design, are provided in the following table.
Table 1. Capacity Scalability of Building Block Architecture
Resource          | Host                                     | Cluster                                        | Payload vPod                        | VMware Cloud Provider Program Data Center Block
Memory            | 512 GB DDR3                              | 10.5 TB (24 nodes with 3 reserved for HA)      | 42 TB                               | 672 TB
CPU               | 2 x Intel E5 8-core, 3.1 GHz = 49.6 GHz  | 1,041.6 GHz (24 nodes with 3 reserved for HA)  | 4,166.4 GHz                         | 66,662.4 GHz
Fast Storage      | N/A                                      | Flexible Configuration                         | 180 TB                              | 2.9 PB
Standard Storage  | N/A                                      | Flexible Configuration                         | 300 TB                              | 4.8 PB
Network Bandwidth | 20 Gbps                                  | 420 Gbps                                       | 1,680 Gbps (80 Gbps MLAG to Spine)  | 10 Gbps to Internet
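The table values follow directly from the per-host specification and the HA reservation policy. As a sanity check, the following Python sketch recomputes the memory, CPU, and network figures. The variable names are illustrative, and the 21-host usable count reflects the 24 nodes with 3 reserved for HA noted in the table.

# Recompute Table 1 from the per-host specification.
HOSTS_PER_CLUSTER = 24
HA_RESERVED = 3
USABLE_HOSTS = HOSTS_PER_CLUSTER - HA_RESERVED    # 21 usable hosts per cluster
CLUSTERS_PER_VPOD = 4                             # four payload clusters per vPod
VPODS_PER_BLOCK = 16                              # 16 Payload vPods per Data Center Block

host_memory_gb = 512                              # 512 GB DDR3 per host
host_cpu_ghz = 2 * 8 * 3.1                        # 2 x 8-core at 3.1 GHz = 49.6 GHz
host_net_gbps = 20                                # 2 x 10 GbE per host

cluster_memory_tb = USABLE_HOSTS * host_memory_gb / 1024   # 10.5 TB
cluster_cpu_ghz = USABLE_HOSTS * host_cpu_ghz              # 1,041.6 GHz
cluster_net_gbps = USABLE_HOSTS * host_net_gbps            # 420 Gbps

vpod_memory_tb = CLUSTERS_PER_VPOD * cluster_memory_tb     # 42 TB
vpod_cpu_ghz = CLUSTERS_PER_VPOD * cluster_cpu_ghz         # 4,166.4 GHz
vpod_net_gbps = CLUSTERS_PER_VPOD * cluster_net_gbps       # 1,680 Gbps

block_memory_tb = VPODS_PER_BLOCK * vpod_memory_tb         # 672 TB
block_cpu_ghz = VPODS_PER_BLOCK * vpod_cpu_ghz             # 66,662.4 GHz

print(f"Cluster: {cluster_memory_tb} TB, {cluster_cpu_ghz:.1f} GHz, {cluster_net_gbps} Gbps")
print(f"vPod:    {vpod_memory_tb} TB, {vpod_cpu_ghz:.1f} GHz, {vpod_net_gbps} Gbps")
print(f"Block:   {block_memory_tb} TB, {block_cpu_ghz:.1f} GHz")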
 
There is no single solution to scaling the VMware Cloud Provider Program physical infrastructure platform. During the design phase, a number of factors play an important role in shaping the building blocks, including, but not limited to:
Expectations regarding services and provider growth
Hardware availability and lead times
Physical hardware scalability limitations (such as with blade system management tools)
Capital expenditure and hardware depreciation considerations
Data hall power, space, and cooling limitations