Exchange

Introduction to Managed Availability: Local Monitoring Files and Overrides Part III

Now that you’ve finished Part I & Part II of my three part Managed Availability blog series, I will now provide some information about local .xml monitoring files and overrides of Managed Availability.

Local Managed Availability .xml monitoring files

Some HealthSets, such as the FEP HealthSet are local .xml files. Because FEP is the Forefront Endpoint Protection service, some of you may want to disable this HealthSet on the servers, because there is no use for it.
Browse to %ExchangeInstallationPath%\Microsoft\Exchange\V15\Bin\Monitoring\Config, search for FEPActiveMonitoringContext.xml and open the file with an editor, such as Notepad.
Change line 12 by replacing Enabled = True to Enabled = False
Restart the Microsoft Exchange Health Management service on the server where you modified the .xml file.

Overrides

With overrides, you can change the Managed Availability monitoring thresholds and define you own settings when Managed Availability in case of errors should take action.
There are two kinds of overrides:

  • Local overrides: are used to customize a component on a specific server or on components which aren’t globally available. For example, if you are running multiple data centers and would like to change only server components on a specific location for individual monitoring. Local overrides are managed with the *-SetMonitoringOverride set of cmdlets. They are stored in the registry under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\ExchangeServer\v15\ActiveMonitoring\Overrides\ and are automatically updated every 10 minutes. The Microsoft Exchange Health Management service reads the changes in the registry path above.
  • Global overrides: are used to customize a component for a whole Exchange organization. They are managed with the *-GlobalMonitoringOverride set of cmdlets. Global overrides are stored in Active Directory:

CN=Overrides,CN=Monitoring Settings,CN=FM,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=Xiopia,DC=local
You can set overrides for specific Exchange versions, such as CU6 with version “15.0.995.29”. This setting will then be effective until the Exchange version changes and will be set with the ApplyVersion parameter.
The other method is to set overrides for a specific timeframe. With Exchange 2013 CU6 you can set overrides for a maximum of 365 days with the Duration parameter.

Managed Availability and server reboots

Responders only execute in the event that a monitor is marked in an Unhealthy state and will try to recover that component. Managed Availability provides multi-stage recovery actions:

  1. Restart the application pool
  2. Restart the service
  3. Restart the server
  4. Take the server offline so that it no longer accepts traffic

There are several types of responders available:
Restart Responder, Rest AppPool Responder, Failover Responder, Bugcheck Responder, Offline Responder, Escalate Responder, and Specialized Component Responders.In this article, I will be primarily discussing Restart Responders.

Restart responders are subject to throttling policies. This means, the responder definition contains a section “ThrottlePolicyXML” which can be overridden if desired. For example, we use the “StoreServiceKillServer” responder. To view the definitions, use the following cmdlet via EMS:
(Get-WinEvent -LogName Microsoft-Exchange-ActiveMonitoring/ResponderDefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ?{$_.Name -like “StoreServiceKillServer”}

There are many parameters, such as ServiceName, CreatedTime, Enabled, MaxRetryAttempts, AlertMask, and so on. The following section from the restart responder definition is important:
 ThrottlePolicyXml :

LocalMaximumAllowedAttemptsInOneHour=”-1″ LocalMaximumAllowedAttemptsInADay=”1″ GroupMinimumMinutesBetweenAttempts=”600″ GroupMaximumAllowedAttemptsInADay=”4″ />

The thresholds are self-explanatory. The only difference is “Local” and “Group.” Local means one Exchange server and group means there are more than one Exchange servers in your organization. You have to check and configure the setting based on your needs.

To prevent a reboot, create a local or global override:
I was looking for a “*ForceReboot*”  by Managed Availability and found the following Requester:
(Get-WinEvent -LogName Microsoft-Exchange-ManagedAvailability/* | % {[XML]$_.toXml()}).event.userData.eventXml| ?{$_.ActionID -like “*ForceReboot*”} | ft RequesterName
ServiceHealthMSExchangeReplForceReboot

Add-GlobalMonitoringOverride -Identity Exchange\ServiceHealthMSExchangeRepIForceReboot -ItemType Responder -PropertyName Enabled -PropertyValue 0 –Duration 90:00:00:00

To check the configuration changes, use the following cmdlet:
(Get-WinEvent -LogName Microsoft-Exchange-ActiveMonitoring/responderdefinition | % {[XML]$_.toXml()}).event.userData.eventXml | ?{$_.Name -like “ServiceHealthMSExchangeRepIForceReboot “} | ft name,enabled
clip image002
This prevents the server from a force reboot in case of errors with the “ServiceHealthMSExchangeRepl” health set. Enabled must be “0” (instead of 1).

Inform Managed Availability about the repairing process

To inform Managed Availability (and your monitoring software too) that you are in a repairing process, use the following cmdlet and define using the Name Parameter for the appropriate Monitor:
Set-ServerMonitor –Server -Name Maintenance –Repairing $true

After repairing:
Set-ServerMonitor –Server -Name Maintenance –Repairing $false

To avoid automatic recovery actions, you should disable the managed service using Set-ServerComponentState:
Set-ServerComponentState –Component RecoveryActionsEnabled –Identity -State Inactive –Requester Functional

After finishing recovery you have to enable the RecoveryActionsEnabled component with the following cmdlet:
Set-ServerComponentState –Component RecoveryActionsEnabled –Identity -State Active –Requester Functional

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s