vSphere 5.0 High Availability (HA)


·         vSphere 5.0 uses a new agent called FDM (Fault Domain Manager)

·         No more primary/secondary node concept as in its predecessors

·         New master/slave concept with an automated election process

·         vpxa (vCenter agent) dependency removed

·         HA talks directly to hostd instead of using vpxa as a translator

·         The FDM agent communicates with vCenter to retrieve information about the status of virtual machines; vCenter is used to display the protection status of virtual machines

·         HA is no longer dependent on DNS

·         The character limit that HA imposed on the hostname has been lifted (previously 26 characters)

·         If you add ESX/ESXi 4.1 or earlier hosts to a vSphere 5.0 vCenter Server, the new vSphere HA agent (FDM) will be installed on them

Master/Slave concept

·         One of the nodes in your cluster becomes the Master and the rest become Slaves

·         Master responsibilities

o   Monitors the availability of hosts and VMs in the cluster

o   Maintains the list of VMs running on each ESXi host

o   Restarts failed virtual machines after VM or host failures

o   Exchanges state with vCenter

o   Monitors the state of the slaves

·         Slave responsibilities

o   Monitor their running VMs, send status updates to the master, and perform restarts on request from the master

o   Monitor the master node's health

o   Participate in a master election if the master fails

Master-election algorithm

·         Takes 15 to 25 seconds (depending on the reason for the election)

·         Elects the participating host with the greatest number of mounted datastores

·         The Managed Object ID (MOID) is used as a tie-breaker
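The ordering above can be condensed into a one-line comparison. This is a hypothetical Python sketch: the host records are illustrative, and the assumption that the lexically greater MOID wins the tie is mine, not stated in the notes.

```python
# Hypothetical sketch of the master-election ordering: the host with the
# most mounted datastores wins; the MOID breaks ties (assumed: greater
# MOID wins). Host data is illustrative, not read from a real cluster.

def elect_master(hosts):
    """hosts: list of dicts with 'moid' (str) and 'datastores' (count)."""
    # Tuple comparison: datastore count first, then MOID as tie-breaker.
    return max(hosts, key=lambda h: (h["datastores"], h["moid"]))

cluster = [
    {"moid": "host-10", "datastores": 4},
    {"moid": "host-12", "datastores": 6},
    {"moid": "host-11", "datastores": 6},
]
print(elect_master(cluster)["moid"])  # host-12 wins the tie on MOID
```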

An election is held when:

·         vSphere HA is enabled initially

·         The master's host fails or enters maintenance mode

·         A Management Network partition occurs


·         Two different heartbeat mechanisms

o   Network heartbeat mechanism

o   Datastore heartbeat mechanism (new; used when the network is unavailable)

·         Network heartbeat mechanism

o   Heartbeats are sent between the slaves and the master every second

o   The election uses UDP; master-slave communication uses TCP

o   When a slave is not receiving any heartbeats from the master, it tries to determine whether it is itself isolated, or whether the master is isolated or has failed

o   Prior to vSphere 5.0, virtual machine restarts were always initiated, even if only the management network of the host was isolated and the virtual machines were still running

·         Datastore heartbeating

o   Adds a new level of resiliency and allows HA to distinguish between a failed host and an isolated or partitioned host

o   Prevents unnecessary restarts

o   Two different files are used: the poweron file and the host heartbeat (hb) file

o   The poweron file is used to determine isolation

o   The datastore heartbeat mechanism is only used when the master has lost network connectivity with the slaves

o   Two datastores are automatically selected by vCenter for this mechanism

o   For VMFS datastores, the master reads the VMFS heartbeat region (using the VMFS locking mechanism)

o   For NFS datastores, the master monitors a heartbeat file that is periodically touched by the slaves

o   A heartbeat file is created by each host on the datastore (host-<number>-hb)

o   Virtual machine availability is reported in a file created by each slave that lists its powered-on VMs (host-<number>-poweron)
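The NFS-style side of this mechanism can be sketched as follows. This is a hypothetical illustration only: the file names follow the convention in the notes, the mtime-based freshness check stands in for the "periodically touched" heartbeat file, the one-VM-per-line poweron format is an assumption, and the real VMFS path uses the heartbeat region rather than file timestamps.

```python
# Hypothetical sketch of the datastore-heartbeat check. File naming
# follows the notes (host-<number>-hb / host-<number>-poweron); the
# poweron file format (one VM name per line) is an assumption.
import os
import time

def slave_is_heartbeating(datastore_root, host_number, timeout=30):
    """NFS-style check: has the slave touched its hb file recently?"""
    hb_file = os.path.join(datastore_root, f"host-{host_number}-hb")
    return (time.time() - os.path.getmtime(hb_file)) < timeout

def powered_on_vms(datastore_root, host_number):
    """Read the slave's poweron file listing its powered-on VMs."""
    poweron = os.path.join(datastore_root, f"host-{host_number}-poweron")
    with open(poweron) as f:
        return [line.strip() for line in f if line.strip()]
```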

Locking mechanism

·         HA leverages the existing VMFS file system locking mechanism

·         The locking mechanism uses a so-called “heartbeat region”, which is updated as long as the lock on a file exists

·         A host needs to have at least one open file on the volume to update the heartbeat region

·         A per-host file is created on the designated heartbeat datastores to ensure this

·         HA simply checks whether the heartbeat region has been updated

Isolated vs Partitioned

·         A host is considered either isolated or partitioned when it loses network access to the master but has not failed

·         The isolation address is the IP address an ESXi host pings to check whether it is isolated when no heartbeats are received

·         By default, vSphere HA uses the default gateway as the isolation address

·         Isolated

o   Is not receiving heartbeats from the master

o   Is not receiving any election traffic

o   Cannot ping the isolation address

·         Partitioned

o   Is not receiving heartbeats from the master

o   Is receiving election traffic

o   (At some point a new master will be elected, at which point the state will be reported to vCenter)

·         When multiple hosts are isolated from the master but can still communicate amongst each other over the management network, this is called a network partition

·         When a network partition exists, a master election process is initiated

·         By default, the isolation response is triggered after roughly 30 seconds in vSphere 5.x
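The isolated-vs-partitioned distinction above boils down to three observations a host can make about itself. A minimal sketch, from the host's own point of view; the final "master failed" branch (network works, so the host elects itself master) is my inference, not stated in the bullets.

```python
# Hypothetical sketch of the isolated-vs-partitioned decision, using the
# three criteria from the notes: master heartbeats, election traffic,
# and reachability of the isolation address.

def classify_host_state(receives_master_heartbeats,
                        receives_election_traffic,
                        can_ping_isolation_address):
    if receives_master_heartbeats:
        return "connected"
    if receives_election_traffic:
        return "partitioned"   # still sees other hosts' election traffic
    if not can_ping_isolation_address:
        return "isolated"      # no HA traffic, isolation address unreachable
    return "master failed"     # assumption: network is fine, master is gone

print(classify_host_state(False, False, False))  # isolated
print(classify_host_state(False, True, True))    # partitioned
```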

Failed Master Host

·         Master Election Initiated

·         New Master Elected

·         The new master restarts all VMs on the protected list that are in a not-running state

Failed Slave Host

·         Master checks the network heartbeat

·         Master checks the datastore heartbeat

·         Master restarts the affected VMs

Isolation Responses

·         Power Off

·         Leave Powered On

·         Shut Down

Isolation Detection

·         Slaves hold a single-host election and ping the isolation address

·         Master pings the isolation address

·         Master restarts the affected VMs

Isolation of a slave

·         T0 – Isolation of the host (slave)

·         T10s – Slave enters “election state”

·         T25s – Slave elects itself as master

·         T25s – Slave pings “isolation addresses”

·         T30s – Slave declares itself isolated and “triggers” isolation response

Isolation of a master

·         T0 – Isolation of the host (master)

·         T0 – Master pings “isolation addresses”

·         T5s – Master declares itself isolated and “triggers” isolation response
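The two timelines above can be restated as data, which makes the key difference obvious: a slave waits through an election before responding, while a master responds within seconds. The timings are taken directly from the notes.

```python
# The slave and master isolation timelines from the notes, as data
# (seconds after the moment of isolation).
ISOLATION_TIMELINE = {
    "slave":  [(0, "host isolated"),
               (10, "enters election state"),
               (25, "elects itself master"),
               (25, "pings isolation addresses"),
               (30, "declares itself isolated, triggers isolation response")],
    "master": [(0, "host isolated"),
               (0, "pings isolation addresses"),
               (5, "declares itself isolated, triggers isolation response")],
}

def time_to_isolation_response(role):
    """Seconds from isolation until the isolation response triggers."""
    return max(t for t, _ in ISOLATION_TIMELINE[role])

print(time_to_isolation_response("slave"))   # 30
print(time_to_isolation_response("master"))  # 5
```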

Master declares a host dead when:

·         Master can’t communicate with it over the network

·         Host is not connected to master

·         Host does not respond to ICMP pings

·         Master observes no storage heartbeats

Results in:

·         Master attempts to restart all VMs from the host

·         Restarts occur on network-reachable hosts, including the master's own host

Master declares a host partitioned when:

·         Master can’t communicate with it over the network

·         Master can see its storage heartbeats

Results in:

·         One master exists in each partition

·         VC reports one master’s view of the cluster

·         Only one master “owns” any one VM

·         A VM running in the “other” partition will be:

o   monitored via the heartbeat datastores

o   restarted in the master's partition if it fails

·         When partition is resolved, all but one master abdicates

A host is isolated when:

·         It sees no vSphere HA network traffic

·         It cannot ping the isolation addresses

Results in:

·         The host invokes the (improved) isolation response

o   It first checks whether a master “owns” the VM

o   The response is applied if the VM is owned or its datastore is inaccessible

·         The master restarts those VMs that are powered off or that fail later

·         The master reports the host as isolated if both can access its heartbeat datastores; otherwise it declares the host dead
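The three master-side verdicts above (dead, partitioned, isolated) hinge on two observations: network reachability and storage heartbeats. A minimal sketch of that decision, with simplified boolean inputs; lumping "partitioned" and "isolated" into one branch reflects that the master needs extra information (such as the host's own isolation report) to tell them apart.

```python
# Hypothetical sketch of the master's host-state decision per the notes:
# dead      = no network comms, no ping response, no storage heartbeats
# alive but = no network comms, storage heartbeats still visible
# unreachable (partitioned, or isolated if the host reports itself so)

def master_view_of_host(network_reachable, responds_to_ping,
                        storage_heartbeats):
    if network_reachable:
        return "connected"
    if storage_heartbeats:
        return "partitioned or isolated"  # alive, unreachable over network
    if not responds_to_ping:
        return "dead"
    return "unknown"  # pingable but no HA traffic: keep checking

print(master_view_of_host(False, False, False))  # dead
print(master_view_of_host(False, False, True))   # partitioned or isolated
```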

Determine if a slave is alive

·         The master relies on heartbeats issued to the slave's heartbeat datastores

·         Each FDM opens a file on each of its heartbeat datastores for heartbeating purposes

·         The files contain no information; on VMFS datastores, a file has the minimum allowed file size

·         The files are named X-hb, where X is the (SDK API) moID of the host

·         The master periodically reads the heartbeats of all partitioned or isolated slaves

Determine the set of VMs running on a slave

·         An FDM writes a list of its powered-on VMs into a file on each of its heartbeat datastores

·         The master periodically reads these files for all partitioned or isolated slaves

·         Each poweron file contains at most 140 KB of information; on VMFS datastores, actual disk usage is determined by the file sizes supported by the VMFS version

·         The files are named X-poweron, where X is the (SDK API) moID of the host
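The naming convention above fits in a tiny helper. A sketch only; the moID value is illustrative (real moIDs come from the vSphere SDK).

```python
# Map a host's moID to its two per-host files on a heartbeat datastore,
# per the X-hb / X-poweron convention in the notes. moID is illustrative.

def heartbeat_files(moid):
    return {"heartbeat": f"{moid}-hb", "poweron": f"{moid}-poweron"}

print(heartbeat_files("host-42"))
# {'heartbeat': 'host-42-hb', 'poweron': 'host-42-poweron'}
```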

Protected-vm files are used

·         When recovering from a master failure

·         To determine whether a master is responsible for a given VM

FDMs create a directory (.vSphere-HA) in the root of each relevant datastore.

Within it, they create a subdirectory for each cluster using the datastore.

