AlwaysOn Availability Groups

AlwaysOn Availability Groups is a failover and disaster recovery feature of SQL Server that protects a selected set of user databases. The databases are collected into an availability group and fail over together. Each group supports one set of primary databases and up to four sets of secondary databases (the limit in the original SQL Server 2012 release; later versions allow more). The feature goes beyond database mirroring, but it is not a backup system: it is designed to keep databases available to users during a failure or disaster. It also protects only the databases in the group, not the entire SQL Server instance.

AlwaysOn Availability Groups requires Windows Server Failover Clustering (WSFC) but does not depend on SQL Server failover cluster instances. The system is built around primary and secondary availability replicas, which host the primary and secondary databases. An availability group consists of one primary replica, which hosts the primary databases, and one or more secondary replicas, each of which hosts a copy of those databases. Each replica must reside on a different node of the WSFC cluster and be hosted by a SQL Server instance on that node.

The primary replica accepts read-write connections. It also sends the transaction log records of each primary database to the secondary replicas. Each secondary replica writes the incoming log records to disk, a step called “hardening,” and then applies them to its secondary databases.

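A secondary replica does not accept writes until a failover occurs; at most it can be configured to allow read-only access. During a failover the roles swap: the target secondary takes over as the primary, and the former primary becomes a read-only secondary once it is available again. There are three forms of failover: automatic, planned manual, and forced manual. Each has advantages and disadvantages, especially in how it interacts with the way data is synchronized between primary and secondary replicas. Automatic and planned manual failover require a synchronized secondary, whereas a forced failover can lose data.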
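The role swap can be illustrated with a small toy model. This is only a sketch of the idea, not SQL Server's implementation: the `Replica` class, the `fail_over` function, and the node names are hypothetical, and the `synchronized` flag stands in for the condition that a secondary has hardened everything the primary has committed.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    role: str                  # "PRIMARY" or "SECONDARY"
    synchronized: bool = True  # assumption: True means no committed data is missing


def fail_over(replicas, target_name, forced=False):
    """Make the named secondary the new primary and demote the old primary."""
    target = next(r for r in replicas if r.name == target_name)
    if target.role == "PRIMARY":
        raise ValueError(f"{target_name} is already the primary")
    if not target.synchronized and not forced:
        # Automatic and planned manual failover need a synchronized target.
        raise RuntimeError(f"{target_name} is not synchronized; use forced=True")
    for replica in replicas:
        if replica.role == "PRIMARY":
            replica.role = "SECONDARY"  # the former primary becomes a secondary
    target.role = "PRIMARY"
    return target


# Hypothetical three-node group.
group = [
    Replica("node1", "PRIMARY"),
    Replica("node2", "SECONDARY"),
    Replica("node3", "SECONDARY", synchronized=False),
]
fail_over(group, "node2")               # planned failover to a synchronized target
fail_over(group, "node3", forced=True)  # forced failover accepts possible data loss
```

In this model, calling `fail_over(group, "node3")` without `forced=True` raises an error, which mirrors the rule that only a forced failover may target an unsynchronized secondary.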

Each secondary replica is updated in one of two availability modes: asynchronous-commit or synchronous-commit. In asynchronous-commit mode, the primary replica commits a transaction without waiting for the secondary replica to confirm that it has hardened the corresponding log records. In synchronous-commit mode, the primary waits for the secondary to confirm that the log has been hardened before it commits. Asynchronous-commit mode can lose data in a failover, while synchronous-commit mode adds latency to every transaction.
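The trade-off between the two modes can be seen in a minimal simulation. The sketch below is a toy model under stated assumptions, not how SQL Server moves log records: the `Primary` and `Secondary` classes, the `lost_on_failover` helper, and the transaction names are all hypothetical, and lists stand in for log files on disk.

```python
class Secondary:
    def __init__(self, name):
        self.name = name
        self.inbox = []     # log records received but not yet on disk
        self.hardened = []  # log records written to this replica's disk

    def receive(self, record):
        self.inbox.append(record)

    def harden_pending(self):
        """Write every received record to disk and return how many were written."""
        count = len(self.inbox)
        self.hardened.extend(self.inbox)
        self.inbox.clear()
        return count


class Primary:
    def __init__(self, secondary, synchronous):
        self.secondary = secondary
        self.synchronous = synchronous
        self.hardened = []

    def commit(self, record):
        self.hardened.append(record)    # harden the record locally
        self.secondary.receive(record)  # send the record to the secondary
        if self.synchronous:
            # Synchronous mode: block until the secondary has hardened the record.
            self.secondary.harden_pending()
        return record


def lost_on_failover(primary):
    """Records committed on the primary that the secondary has not hardened."""
    return [r for r in primary.hardened if r not in primary.secondary.hardened]


sync_primary = Primary(Secondary("s1"), synchronous=True)
async_primary = Primary(Secondary("s2"), synchronous=False)
for p in (sync_primary, async_primary):
    p.commit("tx1")
    p.commit("tx2")

sync_exposure = lost_on_failover(sync_primary)    # nothing is at risk
async_exposure = lost_on_failover(async_primary)  # both records are at risk
```

In the asynchronous case the records stay exposed until the secondary catches up, which here means calling `harden_pending()`; in the synchronous case every commit pays for that call up front.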

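Users of the database experience little interruption when they connect through an availability group listener. The listener is a virtual network name that directs client connections to whichever replica is currently the primary, so applications do not need to know which node that is. After a failover, it routes traffic to the secondary that has become the primary. Connections that were open during the failover are dropped and must be re-established.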
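The routing idea can be reduced to a few lines. This is a hedged sketch only: the `Listener` class, the `route` method, and the names `ag-listener`, `node1`, and `node2` are hypothetical, and a real listener works at the network level rather than through a Python dictionary.

```python
class Listener:
    def __init__(self, dns_name, roles):
        self.dns_name = dns_name  # the single name that clients connect to
        self.roles = roles        # replica name -> "PRIMARY" or "SECONDARY"

    def route(self):
        """Return the replica that should receive read-write connections."""
        for name, role in self.roles.items():
            if role == "PRIMARY":
                return name
        raise ConnectionError("no primary replica is online")


roles = {"node1": "PRIMARY", "node2": "SECONDARY"}
listener = Listener("ag-listener", roles)
before = listener.route()

# Simulate a failover: the roles swap, but the client-facing name does not change.
roles["node1"], roles["node2"] = "SECONDARY", "PRIMARY"
after = listener.route()
```

The client keeps using the same listener name throughout; only the replica behind that name changes.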

The availability group can also repair pages that become corrupt or damaged. If the damage is on the primary replica, the primary broadcasts a request for a copy of the damaged page to the secondary replicas and replaces the damaged page with the first good copy it receives. If the damage is on a secondary, the secondary requests a fresh copy of the page from the primary and replaces the page automatically.
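The repair flow can be sketched as a toy model. Everything here is an assumption made for illustration: the `ReplicaStore` class and `repair_page` function are hypothetical, and a CRC32 checksum stands in for whatever mechanism the database engine actually uses to detect a damaged page.

```python
import zlib


class ReplicaStore:
    def __init__(self, pages):
        # page id -> (data, checksum recorded when the page was written)
        self.pages = {pid: (data, zlib.crc32(data)) for pid, data in pages.items()}

    def is_corrupt(self, pid):
        data, stored = self.pages[pid]
        return zlib.crc32(data) != stored

    def corrupt(self, pid, garbage):
        """Simulate damage: overwrite the data but keep the old checksum."""
        _, stored = self.pages[pid]
        self.pages[pid] = (garbage, stored)


def repair_page(damaged, partners, pid):
    """Replace a damaged page with the first intact copy held by a partner."""
    for partner in partners:
        if not partner.is_corrupt(pid):
            damaged.pages[pid] = partner.pages[pid]
            return True
    return False  # no partner holds a good copy


pages = {1: b"alpha", 2: b"beta"}
primary = ReplicaStore(pages)
secondaries = [ReplicaStore(pages), ReplicaStore(pages)]

primary.corrupt(2, b"????")
repaired = repair_page(primary, secondaries, 2)
```

The same function covers the opposite direction: to repair a secondary, pass the secondary as `damaged` and a list containing only the primary as `partners`.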