Health Checks

 

This page describes Health Checks in general and documents each one supplied with MonitorWang. A HealthCheck returns a result of "true" or "false" to indicate whether the test succeeded or failed. It also provides a numeric (long/int64/bigint) value in the ResultCount property, and the meaning of this value is specific to the test. For example, the MSMQ Queue Info HealthCheck bases its Result property on whether the queue exists, and its ResultCount is the number of items in the queue if it does exist. Consult the documentation for each HealthCheck below for details of the value placed in the ResultCount property; this can help if you want to display it in a Geckoboard dashboard widget.

Exception Handling

A Health Check is effectively "sandboxed" by its scheduler. Very simply, a Health Check's "Execute()" method is wrapped in a try/catch block; should it throw an exception, the exception is caught and its details are published. The result will be "null" because we don't know whether the check actually passed or failed. Detailed information about the exception is available in the Result.Check.CriticalFailureDetails property, and the Result.Check.CriticalFailure property is set to true to clearly indicate that a critical failure has occurred. The result message also contains a GUID. MonitorWang uses this GUID when it logs the error in the monitorwang.log file, so you can use it to locate the exception details and stack trace for the failure.

The Health Check will continue to execute in case this is just a transient problem.

So no explicit code is required to deal with exceptions; the MonitorWang infrastructure takes care of this for you. However, if you need to trap an exception but still want it reported and handled by MonitorWang, provide your own try/catch in your custom HealthCheck and rethrow the exception once you have done your processing. MonitorWang will then deal with it.
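
The sandboxing described above can be pictured with a short sketch. MonitorWang itself is a .NET application, so the Python below only illustrates the pattern; every name in it (run_sandboxed, CheckResult and its fields, custom_check) is hypothetical and is not MonitorWang's actual API.

```python
import logging
import uuid
from dataclasses import dataclass
from typing import Callable, Optional

log = logging.getLogger("monitorwang-sketch")  # hypothetical logger name


@dataclass
class CheckResult:
    # None means "unknown": the check threw before it could pass or fail.
    result: Optional[bool]
    result_count: int = 0
    critical_failure: bool = False
    critical_failure_details: str = ""
    message: str = ""


def run_sandboxed(execute: Callable[[], CheckResult]) -> CheckResult:
    """Run one check and turn any exception into a critical failure result."""
    try:
        return execute()
    except Exception as ex:
        incident_id = str(uuid.uuid4())  # ties the published result to the log entry
        log.exception("Health check failed, incident %s", incident_id)
        return CheckResult(
            result=None,
            critical_failure=True,
            critical_failure_details=repr(ex),
            message=f"Critical failure, incident {incident_id}",
        )


def custom_check() -> CheckResult:
    """A custom check that traps an exception, does its own work, then rethrows."""
    try:
        raise TimeoutError("queue server did not respond")
    except TimeoutError:
        # ... do your own processing here (release resources, etc.) ...
        raise  # hand the exception back to the sandbox


outcome = run_sandboxed(custom_check)
print(outcome.result, outcome.critical_failure, outcome.message)
```

Because the sandbox returns a result instead of letting the exception escape, the scheduler can simply call the check again on its next tick.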

Owl Energy Monitor

If you want to monitor the energy usage of your home, office or other property, then check out the page dedicated to the Owl Energy Monitor and how MonitorWang can access this data and display it on Geckoboard.

Sql Query based

This check allows you to provide the FROM clause of a Sql query. MonitorWang appends your "FROM..." to a "SELECT COUNT(*)" to create a scalar query statement; the row count it returns is then used to interpret success or failure.

Pre-requisites

The account identity that MonitorWang runs under will need to have the appropriate Sql LOGIN and SELECT permissions to the Sql entities in your FROM clause.

Configuration

  • The config\check.castle.config file stores the check configuration - the example one that ships with MonitorWang is called "MonitorWangDatabaseHasResults".
    • The ConnectionString property is now "smart" - this can be an actual connection string OR the name of a connection stored in config\data.connection.config. This means that you can reference a connection string by name rather than replicating it in every database component that requires it, which makes deployment easier.
    • Remember when writing your FROM query that you must encode any xml entities such as <, > and " (as &lt;, &gt; and &quot;). These characters are commonly used when writing queries.
    • InterpretZeroRowsAsAFailure controls how the row count is interpreted as a result. Set this to true and the check result will indicate a failure if the query returns no rows. Set this to false and the check result will indicate a failure if the query returns one or more rows.
  • Finally remember to associate your check with a scheduler in the config\binding.castle.config file
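
As a rough illustration of how the FROM clause and InterpretZeroRowsAsAFailure fit together, here is a minimal sketch in Python. It is not MonitorWang's code: the function names, the Orders table and the use of an in-memory SQLite database as a stand-in for your real Sql server are all assumptions made for the example.

```python
import sqlite3


def build_scalar_query(from_clause: str) -> str:
    """Prefix the configured FROM clause with SELECT COUNT(*)."""
    return "SELECT COUNT(*) " + from_clause.strip()


def interpret_row_count(row_count: int, interpret_zero_rows_as_a_failure: bool) -> bool:
    """Return True for a successful check result, False for a failure."""
    if interpret_zero_rows_as_a_failure:
        return row_count > 0   # no rows means failure
    return row_count == 0      # any rows mean failure


# Stand-in database with one failed order (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Id INTEGER, Status TEXT)")
conn.executemany("INSERT INTO Orders VALUES (?, ?)", [(1, "Failed"), (2, "Ok")])

query = build_scalar_query("FROM Orders WHERE Status = 'Failed'")
row_count = conn.execute(query).fetchone()[0]

# Rows here represent problems, so a non zero count should be a failure.
print(query, row_count, interpret_row_count(row_count, False))
```

In this sketch the row count doubles as the value you would expect to see in ResultCount.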

LogParser based

Microsoft LogParser is a fantastic utility for interrogating logging information. "The world is your database with Log Parser"...I kid you not - this is taken from the official docs. Basically it allows you to write SQL-esque style statements against lots of textual log data sources (listed below) - it's pretty cool, like an old-school LINQ to "Log Data" type thing.

Stuff you can do with it!....
  • Search the Eventlog for entries from specific applications
  • Search IIS logs for any status code (eg: 404's)
  • Search rss/atom feeds for a specific term
  • Search the file system for files that are larger/smaller than a specific size
  • Search firewall logs
  • Loads more things!!
Check out the LogParser documentation that is installed with LogParser (see Pre-requisites below) for more excellent examples of what you can do with it.

More LogParser resources (links pinched from Lizard Labs)...

Supported Input Formats

  • CSV
  • EVT (Event Log)
  • FS (File System)
  • IIS
  • IIS (W3C)
  • REG (Registry)
  • TEXTLINE
  • TSV
  • URLSCAN
  • W3C - Examples of log files in this format include,
    • Personal Firewall log files
    • Microsoft Internet Security and Acceleration Server (ISA Server) log files
    • Windows Media Services log files
    • Exchange Tracking log files
    • Simple Mail Transfer Protocol (SMTP) log files
  • XML

Unsupported Input Formats

I took a wild stab in the dark at what wouldn't be widely used/required and have not implemented Health Checks based on the following formats. If you're desperate for one of these and can't roll it yourself then drop me a request via the discussion pages and I'll see what can be done - to be honest though they are pretty simple to implement, just look at any of the checks in the Contrib.Checks.LogParser project source to see how simple!
  • ADS (Active Directory)
  • BIN
  • COM
  • ETW
  • HTTPERR
  • IISODBC
  • NCSA
  • NETMON
  • TEXTWORD

Pre-requisites

In order to use any LogParser based Health Check you first need to download and install LogParser onto the machine that will run these checks; LogParser cannot be redistributed so you must download and install it yourself - this is the link to download it from Microsoft. LogParser can query logs from other machines (remotely) but you do not need to install it on the remote machine being queried.

If you are new to LogParser then I recommend using one of the cool GUIs to help you get started with it; I use Visual LogParser but have become aware of Log Parser Lizard which also looks pretty good. Both are free/donation-ware.

Configuration

  • From the source code, build the MonitorWang.Contrib.Checks.LogParser project
    • This is my first attempt at a "Contrib" style drop-in configuration for an external dll containing checks
  • Copy MonitorWang.Contrib.Checks.LogParser.dll & Interop.MSUtil.dll to the MonitorWang Agent folder. Also copy the files from the "Config" folder, logparser.binding.castle.config & logparser.check.castle.config.
    • The binaries zip will have these files already installed in the correct location.
  • Update the logparser.binding.castle.config & logparser.check.castle.config as appropriate.
  • In the "check" file "enable" the checks you wish to run. Please consult the LogParser "Documentation" for help with writing the queries.
    • Remember when writing your FROM query that you must encode any xml entities such as <, > and " (as &lt;, &gt; and &quot;). These characters are commonly used when writing queries.
    • The parameters to each check are named (almost) the same as the original LogParser parameter - use the LogParser documentation to help explain each one.
  • The "binding" file contains bindings for ALL LogParser based checks - just find the ones that contain the checks you have enabled and make sure they bind to the correct "schedule" configuration; the default is "EveryMinute".
  • The ResultCount property will contain the number of rows returned by the query.
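
If you are unsure how a query should look once its xml entities are encoded, a small helper can do the encoding for you before you paste the query into the config file. This is only a sketch using Python's standard library; the query text is a made-up example and not one that ships with MonitorWang.

```python
from xml.sax.saxutils import escape, unescape

# Hypothetical query containing characters that are special in xml.
raw_query = 'SELECT * FROM System WHERE EventID > 1000 AND Message LIKE "%timeout%"'

# escape() handles &, < and > by default; the extra mapping also encodes double quotes.
encoded_query = escape(raw_query, {'"': "&quot;"})
print(encoded_query)

# Decoding the encoded text gives back the original query.
decoded_query = unescape(encoded_query, {"&quot;": '"'})
```

The encoded text is what goes into the xml config; the xml parser hands the decoded form back to the check at runtime.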

Windows Service State

Released in v1.0.6
This check will compare the state of a windows service to the state you expect the service to be in. If you have mission critical services that must be running then you can use this check to ensure that they are always "Running".

Features

  • Support for two states, "Running" and "Stopped" (you might also want to ensure a particular service is NOT running).
  • Local and Remote - check for services on the same server as MonitorWang or a remote server.
  • Multiple services - create a list of services that should be in the same state; this cuts down on configuration.

Pre-requisites

  • Permissions to query/use the Service Control Manager information on the target server

Configuration

  • Set the "ExpectedState" property; valid values are "Running" and "Stopped"
  • Set the "Server" property if you need to check services on a machine other than the one running the MonitorWang Agent. Leave it blank or remove the element to check against the localhost.
  • Add a new <item> to the "Services" property for each Windows service you wish to monitor. You can use the service display name or short name here.
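
The comparison this check performs can be sketched as follows. This is an illustration only: the function name is hypothetical, the actual service states are passed in as a plain dictionary instead of being read from the Service Control Manager, and treating an unknown service as being in the wrong state is an assumption rather than documented MonitorWang behaviour.

```python
from typing import Dict, List

VALID_STATES = ("Running", "Stopped")


def services_in_wrong_state(
    expected_state: str,
    services: List[str],
    actual_states: Dict[str, str],
) -> List[str]:
    """Return the configured services whose state differs from the expected one."""
    if expected_state not in VALID_STATES:
        raise ValueError(f"ExpectedState must be one of {VALID_STATES}")
    # Assumption: a service that cannot be found counts as being in the wrong state.
    return [name for name in services if actual_states.get(name) != expected_state]


# Example data standing in for what the Service Control Manager would report.
reported = {"MSMQ": "Running", "W3SVC": "Stopped"}
wrong = services_in_wrong_state("Running", ["MSMQ", "W3SVC", "Spooler"], reported)
print(wrong)  # ['W3SVC', 'Spooler']
```

An empty list corresponds to every listed service being in the expected state.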

Windows Service Startup

Released in v1.0.6
This check will compare the startup type of a windows service to the type you expect the service to have. Often after system maintenance or service redeployment the startup type can be reset so this check will ensure your services are always in the correct startup mode.

Features

  • Local and Remote - check for services on the same server as MonitorWang or a remote server.
  • Multiple services - create a list of services that should have the same startup type; this cuts down on configuration.

Pre-requisites

  • Permissions to query/use the Service Control Manager information on the target server

Configuration

  • Set the "ExpectedStartupType" property; valid values are "Auto", "Manual" & "Disabled"
  • Set the "Server" property if you need to check services on a machine other than the one running the MonitorWang Agent. Leave it blank or remove the element to check against the localhost.
  • Add a new <item> to the "Services" property for each Windows service you wish to monitor. You can use the service display name or short name here.

Url Ping

Released in v1.0.6
This check will ping a url to test that it responds and can optionally base its result on the response time. You can set up this check to either always publish a result or publish only if it fails. As a successful ping includes the response time you can use this to monitor variations over time (perfect in Geckoboard as a Linechart or Geck-o-meter via the MonitorWang Geckoboard Data Service Activity plugin!)

Features

  • Publish only failures or successful pings to track response times.
  • Monitor multiple urls, just add a new item to the list.
  • Set an optional response time threshold - if the response is slower than this (milliseconds) then publish this as a result failure.

Pre-requisites

  • Outbound http access from the MonitorWang server to the urls

Configuration

  • Add a new <item> to the "Urls" property for each url you wish to monitor.
  • Set the "FailIfResponseMillisecondsOver" property to the maximum number of milliseconds you expect the url to respond by. If the url takes longer than this the check will publish a failed result.
    • Leave this property blank, set to 0 (zero) or omit from the xml config to disable this feature.
  • If you are only interested in tracking whether the url is available or not then set the "PublishOnlyIfFailure" property to true (the default). If you want to track the response time then set this to false and every ping result will be published. Successful pings include the response time in the result "ResultCount" property, so you can publish these to a database and then visualise them with the MonitorWang Geckoboard Data Service Activity as a linegraph or Geck-o-meter.
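
How the two settings above interact can be sketched as a small decision function. This is a minimal sketch of the rules as described on this page, not MonitorWang's implementation; evaluate_ping and PingOutcome are hypothetical names, and the ping itself is left out so that only the decision logic is shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PingOutcome:
    success: bool      # the check result
    result_count: int  # response time in milliseconds
    publish: bool      # whether this result would be published


def evaluate_ping(
    responded: bool,
    response_ms: int,
    fail_if_response_milliseconds_over: Optional[int] = None,
    publish_only_if_failure: bool = True,
) -> PingOutcome:
    """Apply the response time threshold and the publishing rule to one ping."""
    # A blank (None) or zero threshold disables the response time test.
    threshold_enabled = bool(fail_if_response_milliseconds_over)
    too_slow = threshold_enabled and response_ms > fail_if_response_milliseconds_over
    success = responded and not too_slow
    # Failures are always published; successes only when tracking response times.
    publish = (not success) or (not publish_only_if_failure)
    return PingOutcome(success, response_ms, publish)


print(evaluate_ping(True, 850, fail_if_response_milliseconds_over=500))
print(evaluate_ping(True, 120, publish_only_if_failure=False))
```

The first call is a slow response that fails the threshold and is published; the second is a healthy response that is published only because response time tracking is switched on.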

FileInfo

Will return information about a specific file including,
  • Exists (= Check Result, exists = true, not exists = false)
  • CreationTimeUtc
  • LastAccessTimeUtc
  • LastWriteTimeUtc
  • Length (bytes)
  • Attributes
  • ResultCount property will contain the file size in bytes.

Pre-requisites

  • File system permission to access the file specified

Configuration

  • Configure the check for filename and binding file for schedule
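
To give a feel for the values listed above, here is a rough equivalent in Python. It is a sketch, not the MonitorWang check: the function name and dictionary keys simply mirror the list above, Attributes is left out, and reporting a ResultCount of 0 for a missing file is an assumption.

```python
import os
from datetime import datetime, timezone


def _utc(timestamp: float) -> datetime:
    """Convert a file system timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def file_info(path: str) -> dict:
    """Collect basic facts about a file; Exists doubles as the check result."""
    if not os.path.isfile(path):
        return {"Exists": False, "ResultCount": 0}  # assumption: 0 when missing
    stat = os.stat(path)
    return {
        "Exists": True,
        # st_ctime is the creation time on Windows only; on Unix it is the
        # time of the last metadata change.
        "CreationTimeUtc": _utc(stat.st_ctime),
        "LastAccessTimeUtc": _utc(stat.st_atime),
        "LastWriteTimeUtc": _utc(stat.st_mtime),
        "Length": stat.st_size,
        "ResultCount": stat.st_size,  # file size in bytes
    }


print(file_info(__file__ if "__file__" in globals() else "no-such-file.txt"))
```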

FolderInfo

Will return information about a specific folder including,
  • Exists (= Check Result, exists = true, not exists = false)
  • CreationTimeUtc
  • LastAccessTimeUtc
  • LastWriteTimeUtc
  • FileCount (immediate children)
  • FolderCount (immediate children)
  • Attributes
  • ResultCount property will contain the number of files

Pre-requisites

  • File system permission to access the folder specified

Configuration

  • Configure the check for folder name and binding file for schedule
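
The "immediate children" counts above can be illustrated with a short Python sketch. As with the other examples on this page it is not the MonitorWang check itself; the function name is hypothetical, the timestamps and Attributes are omitted, and returning zero counts for a missing folder is an assumption.

```python
import os


def folder_info(path: str) -> dict:
    """Count the immediate children of a folder; Exists doubles as the check result."""
    if not os.path.isdir(path):
        # Assumption: all counts are zero when the folder does not exist.
        return {"Exists": False, "FileCount": 0, "FolderCount": 0, "ResultCount": 0}
    file_count = 0
    folder_count = 0
    with os.scandir(path) as entries:  # scandir does not descend into subfolders
        for entry in entries:
            if entry.is_dir():
                folder_count += 1
            elif entry.is_file():
                file_count += 1
    return {
        "Exists": True,
        "FileCount": file_count,
        "FolderCount": folder_count,
        "ResultCount": file_count,  # number of files
    }


print(folder_info(os.getcwd()))
```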

MSMQ Queue Info

Will return information about a specific MSMQ queue including,
  • Exists (= Check Result, exists = true, not exists = false)
  • Number of messages
  • Datetime stamp of the oldest message
  • ResultCount property will contain the number of messages

Pre-requisites

  • Permission to access the MSMQ specified

Configuration

  • Configure the check for queue name and binding file for schedule

MSMQ Queue Not Empty

Slightly modified version of the MSMQ Queue Info check. Check result is based on number of queue items;
  • 0 = success
  • > 0 = failure

Pre-requisites

  • Permission to access the MSMQ specified

Configuration

  • Configure the check for queue name and binding file for schedule

WMI Process Not Running

This will simply return the number of processes that match the name specified in the check configuration
  • Check result = fail if zero processes running
  • ResultCount property will contain the number of matching running processes

Pre-requisites

  • ?

Configuration

  • Configure the check for the process name and binding file for schedule

Last edited May 14, 2011 at 4:51 AM by jimbobdog, version 34
