Introduction to Microsoft SQL AlwaysOn

Traditionally, running Microsoft SQL Server on the VMware vSphere® platform offered multiple database availability and disaster recovery options using VMware features, such as VMware vSphere High Availability and VMware vSphere Distributed Resource Scheduler™ (DRS), with Microsoft features, such as failover clustering. The Microsoft SQL Server high availability features have also been combined with vSphere to create a flexible availability and recovery solution, applying the most efficient and appropriate tools for each specific use case.

All of Microsoft SQL Server availability technologies are supported on the core vSphere platform, including:

It is important to recognize that SQL database availability involves more than the technologies just described. When architecting this type of solution, you must also consider all the design factors for local site and remote site availability options, backup and restore options, and best practices and deployment considerations.

While the objective of this paper is to address use cases available with the AlwaysOn Availability Groups mechanism, to provide a complete picture of the available options with AlwaysOn technology, the following table lists all of Microsoft SQL Server 2016 availability options and their potential ability to meet Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs).

Technology	Granularity	Storage Type	RPO	RTO
Log shipping	Per database	Non-shared storage architecture	Last shipped transaction log	DBA recovery only
Database mirroring	Per database	Non-shared storage architecture	None*	< 3 seconds or administrator recovery
AlwaysOn Failover Cluster Instances	Per instance	Shared storage only	None	~30 seconds
AlwaysOn Availability Groups	Per database	Non-shared storage architecture	None**	~3 seconds or administrator recovery

The potential RPO and RTO values shown in the previous table are typical examples of achievable values in real world implementations. However, architecture, hardware, and configuration are significant factors that can skew these estimations significantly.

The focus of this document is the SQL Server 2016 AlwaysOn Availability Groups feature, which provides flexible design choices for selecting an appropriate high availability and disaster recovery solution for applications, and supports availability at either the application database or instance level. The AlwaysOn Availability Groups feature was developed for applications that require high uptime, need protection against failures within a data center, and require adequate redundancy against failures across data centers. This solution architecture helps protect application databases from planned and unplanned downtime without the need for shared storage between nodes.

The Availability Groups feature provides an enterprise-level alternative to database mirroring. The availability group supports the failover environment for a set of read-write primary databases, and one to four sets of corresponding secondary databases. Additionally, you can make secondary databases available for read-only access or backup operations. With AlwaysOn Availability Groups automatic failover mode enabled, if the primary replica fails, a secondary replica automatically takes over servicing client requests and becomes the new primary replica.

The AlwaysOn Availability Groups feature provides support for the following functions:

This AlwaysOn Availability Group mechanism is also different from traditional SQL failover cluster Instances because a failover cluster can provide availability only at the entire Microsoft SQL Server instance level, not for individual databases.

As mentioned previously, the AlwaysOn Availability Groups feature is built on a non-shared disk architecture. Each availability group replica has its own copy of the database, which can be deployed on VMFS or RDMs. Log streams are used to replicate data from the primary to the secondary replica nodes synchronously or asynchronously, with up to two synchronous replicas allowed. Availability groups sit on top of WSFC technology to provide a failover policy or quorum management. WSFC is required by AlwaysOn Availability Groups, with each availability group replica also being a WSFC node, unlike AlwaysOn Failover Cluster Instances, for which there is no requirement to use shared disks.

While VMware does not support VMware vSphere vMotion® or DRS on clustered SQL Server virtual machines within a shared disk architecture (such as AlwaysOn Failover Cluster Instances), this restriction does not apply to AlwaysOn Availability Groups that are built on a non-shared disk architecture. VMware fully supports using vSphere HA, vSphere vMotion, and DRS with AlwaysOn Availability Groups.

With vSphere vMotion, a VMware ESXi™ host can be powered down for planned maintenance at any time without interruption to the client requests. vSphere HA can quickly reboot a Microsoft SQL Server virtual machine, which can rejoin the AlwaysOn Availability Group session if a hardware failure occurs.

In this planned maintenance scenario, vSphere vMotion is employed to proactively live migrate an availability group replica to a different host to allow the servicing of the hardware to take place without requiring an AlwaysOn Availability Group failover event. There is no disruption to the Microsoft SQL Server service during the vSphere vMotion operation and no interruption to the client’s application connections or any in-flight transactions. By coupling together vSphere vMotion with AlwaysOn Availability Groups technologies, you can eliminate the need to fail over the Microsoft cluster and reduce service interruptions for operational hardware maintenance or renewal.

In the unplanned failure scenario, the SQL Server environment could be vulnerable if further host failures were to occur during the time between the loss of a replica and its restoration because, depending on available bandwidth and network conditions, the resynchronization of the replica could take a significant period of time to complete. vSphere HA helps to alleviate this issue by restarting the failed replica virtual machine on another available host in the vSphere cluster. This facilitates a significantly quicker restore to full protection of the database, and reduces the amount of time spent by the availability group in the failover state. In the event of an unplanned physical host failure, you do not need to wait for the physical host to be serviced and brought back online to bring back the AlwaysOn Availability Group replica. Instead, vSphere HA automatically detects the host failure and immediately reboots the AlwaysOn Availability Group replica virtual machine onto a different available ESXi host.

With these integrated VMware mechanisms for high availability, it is perfectly possible, if not expected, to achieve better levels of uptime running AlwaysOn Availability Groups with vSphere than on physical hardware. This provides a high availability and disaster recovery solution for Tier 1 virtualized business critical databases to maximize continuous availability of the application it services and for business continuity during disaster scenarios. Furthermore, this solution architecture:

SQL Server 2016 AlwaysOn provides flexible design choices for selecting an appropriate solution for your application with protection against failures within a data center and adequate redundancy across data centers. For a further overview of the AlwaysOn design options for high availability and disaster recovery, see AlwaysOn Availability Groups (SQL Server) at https://technet.microsoft.com/en-us/library/hh510230.