REALLY IMPORTANT NEWS!

11th June 2011

MonitorWang is getting a new name and some important updates in the next release.

If you are looking for the latest version of MonitorWang it is now located here: "Wolfpack" (a new codeplex project)

Based on feedback and the requirements of users and contributors, I've launched the next version under a new codeplex project named "Wolfpack". I've been hard at work porting it to .Net 4 and a VS2010 solution, adding some urgently needed tests and a couple of new Health Checks (CPU & Disk space); the source code has also been loaded into the Wolfpack codeplex Mercurial repository. The rename is aimed at providing a more serious image, as the current name has a few "issues" Stateside. When I originally named the project I really didn't expect it to take off as much as it has, and feedback that the name could be stopping its use is reason enough to make the change.

This MonitorWang project will remain as is to host the .Net 3.5 version but no new development will be done on this codebase and there will be no new releases here.

Thanks to all that have supported/downloaded MonitorWang - hopefully you'll all join the "Wolfpack" and continue to make this an even better monitoring solution! If you have any RSS feed subscriptions here then please consider updating them to the equivalent in Wolfpack and start "following" it too. The Wolfpack documentation page is the place to start your journey!

Cheers,

James

 

Project Description

MonitorWang is an extensible .Net Windows service based framework for running jobs to monitor your software and system. The data collected can be sent directly to Growl clients, WCF, SQL, SQLite and NServiceBus, and it exposes a native Geckoboard REST/JSON data feed so you can create instant dashboards for your system. It comes preloaded with some tasks but it's simple to implement your own!

Quick Start

The Documentation page has links to installing, configuring and creating your own custom plug-ins. You will also find a Roadmap link and details of how to stay up to date with project news. For more information about the idea behind the project and a high level view of how it works read on...

Background

In my day job our team often has to monitor and report on the state of our software - error logs, queues etc. Rather than solve this directly I saw an opening for an open source system/software monitoring framework - something relatively simple to extend and implement your own custom monitoring activities. Our specific requirement is to monitor the length of MSMQ queues (usually an "error" queue!) for our NServiceBus installations. So I started this open source project - odd name I know, all my OSS projects are called SomethingWang - this explains why.

What is it?

MonitorWang is a "system monitor" - however its focus is aimed squarely at monitoring the touch points your application has with its infrastructure. From my experience over the last 8 years working with large scale eCommerce and business systems, it is remarkable how many critical logs, queues and database tables accumulate or record error or abnormal conditions, only for them to go unnoticed until something blows up and becomes a real problem. Early detection using a monitoring system could save you many, many hours of "What went wrong? Get it working!" time.

MonitorWang can also be used to monitor for custom business activity - your KPIs, not just errors. For instance, the easy to use "Sql Scalar HealthCheck" allows you to quickly set up queries to detect these activities and data scenarios - combined with a Geckoboard account you can create a powerful business dashboard to visualise your system. Need to see how many high value orders have been placed in the last 3 hours or the number of new users? Simple, MonitorWang can do this "out-of-the-box" against your SqlServer (& other supported) databases!
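To make the "scalar query" idea concrete, here is a minimal sketch of the kind of count query described above. It is written in Python against an in-memory SQLite database purely for illustration; the Orders table, its columns and the scalar_count helper are all assumptions of mine and not part of MonitorWang, which is configured rather than coded for this check.

```python
import sqlite3
from datetime import datetime, timedelta


def scalar_count(conn, sql, params=()):
    """Run a query that returns a single number and hand it back as an int."""
    row = conn.execute(sql, params).fetchone()
    return int(row[0])


# Hypothetical schema and data, only here to give the query something to count.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Id INTEGER PRIMARY KEY, Total REAL, PlacedAt TEXT)")

now = datetime(2011, 6, 11, 12, 0, 0)  # fixed "now" so the example is repeatable
orders = [
    (500.0, now - timedelta(hours=1)),     # recent but not high value
    (1200.0, now - timedelta(hours=2)),    # counts
    (2500.0, now - timedelta(minutes=30)), # counts
    (3000.0, now - timedelta(hours=5)),    # high value but too old
]
conn.executemany(
    "INSERT INTO Orders (Total, PlacedAt) VALUES (?, ?)",
    [(total, placed.isoformat(sep=" ")) for total, placed in orders],
)

# The scalar query itself: high value orders placed in the last 3 hours.
HIGH_VALUE_ORDERS = "SELECT COUNT(*) FROM Orders WHERE Total >= ? AND PlacedAt >= ?"
cutoff = (now - timedelta(hours=3)).isoformat(sep=" ")
high_value_count = scalar_count(conn, HIGH_VALUE_ORDERS, (1000.0, cutoff))
```

The single number that comes back is what a dashboard widget or a notification would then be driven from.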

MonitorWang aims to provide a simple, extensible system you can easily adapt to monitor your mission critical applications, platforms and systems. It's designed so that you can easily create new plugins to detect and monitor scenarios and situations unique to your systems and business; however, the plugins provided should be able to cover many of the common "touch points" including...
  • IIS logs
  • Firewall logs
  • Event logs (including queries that join to those on remote machines)
  • Many other textual logging formats such as CSV, XML (including making an http call to retrieve the data, eg: RSS, Webservice)
  • FileSystem
  • Sql Server data (write queries to detect any sort of data condition eg: monitor for orders > £value or not despatched after N days)
  • MSMQ
  • Windows Services
  • Web service/site Ping
Finally it uses best of breed components to keep you notified of what your system is doing. Growl and Geckoboard provide excellent notification and visualisation mechanisms - MonitorWang provides support for these "out of the box" so you'll never miss a problem again! For those that don't know these components, Growl is a desktop system tray toaster style notification app that can even forward notifications to your smartphone and Geckoboard provides a browser-based dashboard experience with many custom visualisation widgets alongside native support for many enterprise systems like Basecamp, Pingdom, Google Analytics, Zendesk.

It's designed so that you can run MonitorWang instances ("Agents") across many servers collecting data and publishing to another "Server" instance where you would republish the data to a database and/or Growl - this is the Distributed System Monitoring role it was primarily aimed at.

Distributed MonitorWang

BTW: It's written in C# with Visual Studio 2008 and targets v3.5 of the .Net framework. It can be run on both x86 and x64 Windows operating systems.

It's a .Net Windows Service (based on the Topshelf framework) that has a plug-in architecture for performing any custom activities ("Health Checks") you wish on a set schedule. The data from each Health Check is then "published" via another set of extensible plugins ("Publishers"). You can also create general background activity plug-ins ("Activities") - these just start and stop with the service; a good example is the activity that creates a WCF ServiceHost to run a self-hosted WCF service. Finally, you can create plugins that control when a Health Check executes ("Schedulers" eg: at a fixed time/schedule, triggered by an external event etc).
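The flow described above (a Health Check produces a result, which is handed to each enabled Publisher) can be sketched roughly as follows. This is a language-neutral illustration in Python, not MonitorWang's code: the real plugins are C# classes implementing the project's interfaces, and every name here (Result, ListPublisher, run_check, error_queue_check) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """What a Health Check reports back (hypothetical shape)."""
    check: str
    ok: bool
    count: int = 0


class ListPublisher:
    """Stand-in Publisher that simply remembers what it was asked to publish."""

    def __init__(self, enabled=True):
        self.enabled = enabled
        self.received = []

    def publish(self, result):
        self.received.append(result)


def run_check(check, publishers):
    """Execute one Health Check and pass its result to every enabled Publisher."""
    result = check()
    for publisher in publishers:
        if publisher.enabled:
            publisher.publish(result)
    return result


def error_queue_check():
    """Stand-in Health Check: pretend three messages are sitting in an error queue."""
    pending = 3
    return Result("ErrorQueue", ok=(pending == 0), count=pending)


database = ListPublisher()
growl = ListPublisher(enabled=False)  # switched off, so it should receive nothing
run_check(error_queue_check, [database, growl])
```

A Scheduler would be whatever decides when run_check gets called; it is left out here to keep the sketch short.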
The full list of extensibility points is:
  • Startup Plugin - executes at the very beginning of an Agent starting up - use this for general initialisation or Agent configuration
  • Scheduler Plugin - defines when the associated Health Check plugin will execute
  • HealthCheck Plugin - the actual test/code to run on the schedule defined by its host Scheduler plugin
  • Publisher Plugin - these provide the means to communicate the result of a Health Check
  • Activity Plugin - these provide general service wide features and are usually messaging oriented - eg: create the NServiceBus handlers to receive NSB messages, create a WCF ServiceHost to allow results to be sent to MonitorWang, or retrieve data as in the case of the RESTful Geckoboard Data Service Activity.
  • RoleProfile Plugin - a "role profile" component forms the heart of MonitorWang. A default "Agent" profile is provided and this will load and execute all the components required by the MonitorWang service. You can provide a custom implementation for this core component by simply implementing an interface and passing the name (class name) on the command line switch at startup.
  • Publisher Filters - these provide a way to attach custom rules to whether a result is published. Filters can be global for either Publisher or Health Check or made specific to a single Publisher/HealthCheck combination.
  • Growl Notification Finalisers - these provide a way to attach custom logic to format the Growl notification (priority and message text). There are two built-in Finalisers that quickly allow you to set the priority based on the success/failure of a HealthCheck or the "Count" that a HealthCheck returns via simple xml configuration. Being able to set the priority is useful as you can configure Growl to forward notifications based on the priority (eg: only forward "Emergency" priority notifications to your iPhone). For instance you could create a WMI Process Check to detect a mission critical process that should always be running - if it failed ("count" = 0) you could set the notification to "Emergency". Finalisers can be chained together (order is undefined though). The built-in Finalisers also serve as a great example of how to create your own ones - a base class to inherit from takes care of some heavy lifting so you can concentrate on just writing your logic.
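As a rough illustration of the Finaliser idea in the last bullet, the sketch below sets a notification's priority from the "count" a check returns and chains a second finaliser onto it. It is Python for brevity and all names (Notification, count_finaliser, prefix_finaliser, apply_finalisers) are hypothetical; the real Finalisers are C# classes built on the provided base class and driven by xml configuration. Because the order of chained Finalisers is undefined, each one here touches a different field so the outcome does not depend on ordering.

```python
from dataclasses import dataclass


@dataclass
class Notification:
    text: str
    priority: str = "Normal"


def count_finaliser(emergency_count):
    """Raise the priority to Emergency when the check's count hits a given value."""
    def finalise(notification, result_count):
        if result_count == emergency_count:
            notification.priority = "Emergency"
        return notification
    return finalise


def prefix_finaliser(prefix):
    """Prepend some context (eg: the server name) to the message text."""
    def finalise(notification, result_count):
        notification.text = f"{prefix}: {notification.text}"
        return notification
    return finalise


def apply_finalisers(notification, result_count, finalisers):
    """Run each finaliser in turn over the same notification."""
    for finalise in finalisers:
        notification = finalise(notification, result_count)
    return notification


# A process check that found zero running instances ("count" = 0).
note = apply_finalisers(
    Notification("OrderProcessor.exe instances: 0"),
    0,
    [count_finaliser(0), prefix_finaliser("WEB01")],
)
```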

HealthCheck plugins provided (more info here); see this guide for a walk-thru on creating a new Health Check.

  • Reporting MSMQ queue state (queue length, oldest message datetime)
  • Reporting MSMQ queue is not empty
  • Detect if a process is running (local/remote - WMI)
  • Windows Service State (local/remote) - checks the services you specify are in the correct state (eg: running)
  • Windows Service Startup (local/remote) - checks the services you specify have the correct startup type (eg: auto/manual/disabled)
  • Url Ping - allows you to specify multiple urls to ping and optionally set a response time threshold. If the ping fails or breaches the threshold a failed result will be published. You can also keep track of response times - looks great graphing them with a Geckoboard widget!
  • File information - reports data about a specific file (exists, size etc)
  • Folder information - reports data about a specific folder (exists, number of files/sub folders)
  • LogParser based HealthCheck components (query the EventLog, IIS, XML, FileSystem, RSS/Atom feeds! etc)
  • SqlServer Scalar Query - write ad-hoc queries (a SELECT COUNT(*) FROM... statement) to detect custom data conditions in your SqlServer databases
  • Owl Energy Monitor HealthCheck - this allows you to monitor your home/office energy consumption and view the data in your Geckoboard.
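The Url Ping behaviour in the list above (fail if the ping fails or the response time breaches the threshold, and keep the timing either way) boils down to logic like this. It is a sketch only, in Python, with hypothetical names; the fetch and clock parameters are injected so the example runs without touching the network, which is my choice for the illustration and not how the real plugin is written.

```python
import time


def ping(url, fetch, threshold_ms=None, clock=time.perf_counter):
    """Time a single request and decide whether the result counts as a pass."""
    started = clock()
    try:
        fetch(url)
        reachable = True
    except Exception:
        reachable = False
    elapsed_ms = (clock() - started) * 1000.0
    within_threshold = threshold_ms is None or elapsed_ms <= threshold_ms
    return {"url": url, "ok": reachable and within_threshold, "response_ms": elapsed_ms}


def fake_clock(times):
    """Return a clock that yields pre-set readings, for a repeatable demo."""
    readings = iter(times)
    return lambda: next(readings)


def fetch_ok(url):
    return None


def fetch_down(url):
    raise ConnectionError("unreachable")


fast = ping("http://example.com/", fetch_ok, threshold_ms=500, clock=fake_clock([0.0, 0.120]))
slow = ping("http://example.com/", fetch_ok, threshold_ms=500, clock=fake_clock([0.0, 0.900]))
down = ping("http://example.com/", fetch_down, threshold_ms=500, clock=fake_clock([0.0, 0.010]))
```

The response_ms value is the figure you would graph over time.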

Publisher plugins provided,

  • MSSQL Database - save the monitoring data to MS SqlServer database.
  • SQLite Database - save your MonitorWang data in SQLite format.
  • WCF - transmit the monitoring data to a WCF service (also provided).
  • NServiceBus (NSB) - send the monitoring data as an NSB message (to an NSB Gateway/handler provided).
  • Growl - the awesome system tray notification app. Monitoring data can be sent to a Growl instance; from here you can forward it on to others (say your ops team) and even your iPhone via Prowl!

Activity plugins provided,

  • Geckoboard Data Service - A WCF REST Starter kit based JSON data feed that connects your MonitorWang data stored in a database to Geckoboard custom widgets. See the Activities page for more info.
  • WCF BasicHttp ServiceHost - provides self-hosting of the MonitorWang WCF service so that the publisher has somewhere to "publish" to!

Example Geckoboard dashboard with custom widgets powered by the MonitorWang Geckoboard Data Service

 

Scheduler plugins provided,

  • 24/7 Scheduler - provides total control over when your health checks execute across a week. You can set as many times per day as you like and choose which days they apply to, including shorthand configuration to set weekdays, weekends and every day of the week; these can all be combined to provide a completely custom schedule. Eg: you could configure a check to run at set times on weekdays, set additional times for Monday, Wednesday & Saturday and have no timings on Sunday.
  • Interval Scheduler - provides a simple interval based scheduler; executes the associated HealthCheck every N seconds.
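To show how the shorthand for the 24/7 Scheduler might combine, here is a small Python sketch that expands "weekdays"/"weekends"/"everyday" entries plus individual days into a per-day list of times. The key names and the dictionary layout are assumptions for the illustration; the real scheduler is configured through MonitorWang's own configuration and may use different names.

```python
from datetime import datetime

DAYS = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]

# Hypothetical shorthand keys, each standing for a group of days.
SHORTHAND = {
    "weekdays": DAYS[:5],
    "weekends": DAYS[5:],
    "everyday": DAYS,
}


def expand_schedule(config):
    """Merge shorthand and per-day entries into sorted 'HH:MM' times for each day."""
    schedule = {day: set() for day in DAYS}
    for key, times in config.items():
        for day in SHORTHAND.get(key, [key]):
            if day not in schedule:
                raise ValueError(f"unknown day or shorthand: {key}")
            schedule[day].update(times)
    return {day: sorted(times) for day, times in schedule.items()}


def is_due(schedule, when):
    """True if the check should fire at this minute."""
    return when.strftime("%H:%M") in schedule[DAYS[when.weekday()]]


# Set times on weekdays, extras on Monday, Wednesday & Saturday, nothing on Sunday.
config = {
    "weekdays": ["09:00", "17:00"],
    "monday": ["13:00"],
    "wednesday": ["13:00"],
    "saturday": ["10:00"],
}
schedule = expand_schedule(config)
```

With this layout the Interval Scheduler is the degenerate case: no table at all, just "run again N seconds after the last run".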

Creating your own activities to run any .Net code you like only requires implementing a simple interface. Custom schedules are also possible by implementing another interface.

I've also worked on making the deployment options flexible. Each Health Check, Activity & Publisher has an "Enabled" configuration setting to allow you to quickly reconfigure what Health Checks and Activities are running and what publishers are invoked. Using this method we can quickly deploy the service using a simple xcopy deployment.

Publishers

The WCF & NServiceBus (NSB) publishers are particularly powerful in that they can be used to transmit the data to another server. This allows MonitorWang to be installed on the server to be monitored ("Agent") and send the data to another server running MonitorWang enabled to capture this data ("Server") where it is republished to whatever publishers have been "Enabled"; it is possible to have multiple MonitorWang "Agents" publishing data via WCF/NSB to the same "Server". An "Agent" publishing to the Growl publisher is also capable of communicating with a Growl client on a remote server.

Remember, Growl is a system tray application - it only runs for the logged-in/interactive user!

Growl Deployment

The SQLite & SqlServer publishers save the data into a simple table called "AgentData" - this allows you to capture the current "state" of the running HealthChecks. It would be possible to write a simple "viewer" app to query the db and display the data on a big flat panel screen sitting in the office so that everyone could see the "system health"; however, I recommend that you check out Geckoboard, as MonitorWang has native support for this via the Geckoboard Data Service Activity. MonitorWang will publish to ALL "Enabled" publishers so you could set up the "Server" to republish to both a Database and Growl.
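For anyone tempted by the "viewer" idea, the sketch below shows the sort of query it might run: the latest row per agent and check. Only the table name "AgentData" comes from the text above; the columns (Agent, CheckName, Result, ResultCount, GeneratedOn) are guesses made up for the example, so check the real schema before borrowing any of this. Python and in-memory SQLite are used just to keep it runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column names below are assumptions; only the table name is taken from the docs.
conn.execute(
    """CREATE TABLE AgentData (
           Agent TEXT, CheckName TEXT, Result INTEGER,
           ResultCount INTEGER, GeneratedOn TEXT)"""
)


def publish(conn, agent, check, ok, count, generated_on):
    """Append one Health Check result, as a database publisher might."""
    conn.execute(
        "INSERT INTO AgentData VALUES (?, ?, ?, ?, ?)",
        (agent, check, int(ok), count, generated_on),
    )


def latest_state(conn):
    """Return the most recent row for each agent/check pair."""
    return conn.execute(
        """SELECT a.Agent, a.CheckName, a.Result, a.ResultCount
           FROM AgentData a
           WHERE a.GeneratedOn = (SELECT MAX(b.GeneratedOn)
                                  FROM AgentData b
                                  WHERE b.Agent = a.Agent
                                    AND b.CheckName = a.CheckName)
           ORDER BY a.Agent, a.CheckName"""
    ).fetchall()


publish(conn, "WEB01", "ErrorQueue", True, 0, "2011-06-11 10:00:00")
publish(conn, "DB01", "DiskSpace", True, 0, "2011-06-11 10:01:00")
publish(conn, "WEB01", "ErrorQueue", False, 7, "2011-06-11 10:05:00")
state = latest_state(conn)
```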

Rules & Thresholds

At the moment there are no "rules" or "thresholds" as such - only those baked directly into each HealthCheck (eg: only report something if its logic says so). I'm keen to keep HealthChecks dumb, have them provide a stream of monitoring data, and run a separate Activity for the "rules" - this Activity would interpret the monitor data and allow you to set thresholds for alerts. For instance you would have an "Agent" report an MSMQ queue length back to a "Server" which would save it to SQL; the "Rules" Activity would monitor the database and apply any alerting rules (eg: if Agent is X and Queue is Y and Queue Length > 5 then republish data to Growl as an emergency notification).
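Since the "Rules" Activity is only an idea at this point, the following is nothing more than a sketch of how the example rule above could be evaluated. It is Python, and every name in it (QueueReading, Rule, evaluate) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueReading:
    agent: str
    queue: str
    length: int


@dataclass(frozen=True)
class Rule:
    agent: str
    queue: str
    max_length: int
    priority: str = "Emergency"

    def matches(self, reading):
        """Agent is X and Queue is Y and Queue Length > threshold."""
        return (
            reading.agent == self.agent
            and reading.queue == self.queue
            and reading.length > self.max_length
        )


def evaluate(readings, rules):
    """Return (reading, priority) pairs that should be republished as alerts."""
    return [
        (reading, rule.priority)
        for reading in readings
        for rule in rules
        if rule.matches(reading)
    ]


rules = [Rule(agent="WEB01", queue="error", max_length=5)]
readings = [
    QueueReading("WEB01", "error", 7),  # breaches the rule
    QueueReading("WEB01", "error", 5),  # exactly on the limit, so no alert
    QueueReading("WEB02", "error", 9),  # different agent, rule does not apply
]
alerts = evaluate(readings, rules)
```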

However, new to version 1.0.4 is a feature called Result Publisher Filters - these allow you to intercept the call to each enabled publisher and decide whether to actually publish the result or abort it.
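In spirit, a Result Publisher Filter is a yes/no decision made just before each publisher is called. The Python sketch below illustrates that decision, including filters that are global to a publisher versus ones tied to a single Publisher/HealthCheck combination; the function names and the dictionary shape of a result are assumptions, and the real filters are C# plugins.

```python
def make_filter(predicate, publisher=None, check=None):
    """Build a filter; leaving publisher or check as None makes it apply to all of them."""
    def applies(publisher_name, check_name):
        return publisher in (None, publisher_name) and check in (None, check_name)
    return applies, predicate


def should_publish(filters, publisher_name, check_name, result):
    """Publish only if every filter that applies to this pairing says yes."""
    return all(
        predicate(result)
        for applies, predicate in filters
        if applies(publisher_name, check_name)
    )


filters = [
    # Global for the Growl publisher: only failures are worth a toast.
    make_filter(lambda r: not r["ok"], publisher="Growl"),
    # Specific to Growl + ErrorQueue: ignore small backlogs.
    make_filter(lambda r: r["count"] > 5, publisher="Growl", check="ErrorQueue"),
]

small_backlog = {"ok": False, "count": 2}
big_backlog = {"ok": False, "count": 9}
```

Here the database publisher has no filters attached, so it still receives every result, while Growl only hears about the big backlog.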

Getting Started with MonitorWang

The documentation should provide you with everything you need.

This guide should help you install and configure MonitorWang...

This guide should help you write your own custom HealthCheck component and get it reporting data...

Last edited Aug 23, 2013 at 10:44 AM by jimbobdog, version 116