Welcome back for another episode of the ABC's of NSM. What's NSM, you say? We'll go with Network and System Management, but you could throw Security in there as well. We'll work our way through the alphabet over the next several weeks, looking at tools and concepts along the way for all the administrators out there. By the way, you can thank Joe for the format & Don for the title (I couldn't for the life of me come up with one).

Today's letter N is for Nagios. Nagios is a web-based monitoring platform primarily focused on host and service availability. There are hundreds of community plug-ins available to extend Nagios, including one that's near and dear to my heart: integration with Cacti. Some of the features beyond simple availability monitoring include:

  • Alerts - Send notifications immediately via page or email to first-line operators
  • Escalations - If a problem isn't resolved by a specified notification, escalate to the next level
  • Remediation - Configure auto-response scripts to restart services, etc.
  • Planning - Schedule downtime for groups / services to suppress alerts during maintenance
  • Web views - Group hosts / services accordingly to quickly isolate problems
  • Multi-tenancy - Different users can be presented with restricted views for customer/role isolation
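
If you're wondering what one of those community plug-ins looks like under the hood, here's a rough sketch. A Nagios plug-in is just an executable that prints a one-line status and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). Everything else here is my own assumption for illustration: the disk-usage check, the function names evaluate and main, and the 80/90 percent thresholds are hypothetical, not taken from any shipped plug-in.

```python
"""Minimal Nagios-style check plug-in sketch (hypothetical example)."""
import shutil

# Standard plug-in exit codes that Nagios maps to service states.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(used_pct, warn=80.0, crit=90.0):
    """Return (exit_code, status_line) for a disk-usage percentage.

    The warn/crit defaults are illustrative assumptions.
    """
    if used_pct >= crit:
        return CRITICAL, f"DISK CRITICAL - {used_pct:.1f}% used"
    if used_pct >= warn:
        return WARNING, f"DISK WARNING - {used_pct:.1f}% used"
    return OK, f"DISK OK - {used_pct:.1f}% used"


def main(path="/"):
    """Check one filesystem, print the status line, return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # Anything we can't measure is reported as UNKNOWN, not OK.
        print(f"DISK UNKNOWN - {exc}")
        return UNKNOWN
    code, line = evaluate(100.0 * usage.used / usage.total)
    print(line)
    return code
```

To run it as an actual check, you'd finish the script with `raise SystemExit(main())` so the return value becomes the process exit code, then point a Nagios command definition at it.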

I won't lie to you: the initial configuration process is a little tedious and can seem overwhelming, but the payoff is worth it. Trust me.

 
