Anton639_262757
May 11, 2016

Wrong SNMP Trap email alert being triggered in user_alert.conf

I currently have two web pages hosted on the same server. I am using F5 to monitor those pages with the HTTPS health monitor, with a separate monitor for each page. My goal is to receive an email alert when one of the monitors fails, and I want the email to state exactly which monitor generated the alert so that I know immediately which page is down. I did the following in user_alert.conf:

alert WEBPAGE1 Monitor Fail " SNMP_TRAP: Pool /Common/Test_Pool member Server_Test (ip:port=10.100.X.X:0) state change green --> red ( Monitor /Common/WebPage1_Monitor from 10.10.X.X : connect: timeout search result false)" {
    snmptrap OID=".1.3.6.1.4.1.3375.2.4.0.200";
    email toaddress="anton639@email.com"
    fromaddress="F5_BIGIP "
    body="Webpage1 Monitor Fail"
}

alert WEBPAGE2 Monitor Fail " SNMP_TRAP: Pool /Common/Test_Pool member Server_Test (ip:port=10.100.X.X:0) state change green --> red ( Monitor /Common/WebPage2_Monitor from 10.10.X.X : connect: timeout search result false)" {
    snmptrap OID=".1.3.6.1.4.1.3375.2.4.0.201";
    email toaddress="anton639@email.com"
    fromaddress="F5_BIGIP "
    body="Webpage2 Monitor Fail"
}

My issue is that when I test by intentionally stopping webpage 2, I receive the email alert for webpage 1. I assume the SNMP trap text used to identify the event does not differentiate between the two monitors, so the first SNMP trap in the list is being sent. Is it possible to send an email alert for the specific health monitor that is failing even though the monitors are of the same type? What can I change in my configuration to achieve this? Your assistance will be appreciated.

2 Replies

  • I don't think you've provided the actual config that you used, because the example above would generate a syntax error due to the spaces in the name "WEBPAGE1 Monitor Fail":

    alert WEBPAGE1 Monitor Fail " SNMP_TRAP: Pool ....
    

    So I'll assume you haven't actually made that error on your real device. (If you had, you'd see alertd restarting constantly and messages in /var/log/ltm about an error on line 1 of /config/user_alert.conf.)

    From the text of the error message you're matching on, it appears to be a GTM alert that you want to act on.

    I've set up a quick test, and for me it appears to work: it sends the .201 and .202 traps (though I could just as easily have set it up to send email, as you have done).

    Here's what I've got in alert.conf:

    alert WEBPAGE1 "SNMP_TRAP: Pool /Common/gtm-pool member /Common/test1 .* state change" {
       snmptrap OID=".1.3.6.1.4.1.3375.2.4.0.201";
    }
    
    alert WEBPAGE2 "SNMP_TRAP: Pool /Common/gtm-pool member /Common/test2 .* state change" {
       snmptrap OID=".1.3.6.1.4.1.3375.2.4.0.202";
    }
    

    Then I have a GTM pool with two virtuals in it, and I bring those virtuals up and down to generate the messages. When that happens, I get the normal SNMP trap for that message, plus the .201 (or .202) OID.
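
    For the email case from the original question, the same kind of match pattern could be paired with an email action. This is only an untested sketch: the alert names WEBPAGE1_EMAIL and WEBPAGE2_EMAIL and the fromaddress value are placeholders, the pool and member names come from my test setup, and the "green --> red" suffix is taken from the message text in the question so that only failures (not recoveries) would match. Substitute your own values.

    alert WEBPAGE1_EMAIL "SNMP_TRAP: Pool /Common/gtm-pool member /Common/test1 .* state change green --> red" {
       snmptrap OID=".1.3.6.1.4.1.3375.2.4.0.201";
       email toaddress="anton639@email.com"
       fromaddress="root"
       body="Webpage1 Monitor Fail"
    }

    alert WEBPAGE2_EMAIL "SNMP_TRAP: Pool /Common/gtm-pool member /Common/test2 .* state change green --> red" {
       snmptrap OID=".1.3.6.1.4.1.3375.2.4.0.202";
       email toaddress="anton639@email.com"
       fromaddress="root"
       body="Webpage2 Monitor Fail"
    }

    The key point is that each pattern contains something unique to one object (here the member name), so only one alert can match a given log line.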

    If you want to monitor the LTM event instead, you would need to change the match pattern to the appropriate LTM log entry.
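
    As a rough illustration only: I'm assuming the LTM log entry for a failing pool member reads something like "Pool /Common/Test_Pool member ... monitor status down. [ /Common/WebPage2_Monitor: down ]". That wording is an assumption on my part, so check /var/log/ltm on your device for the exact text before relying on it. If it does look like that, a pattern keyed on the monitor name might be sketched as follows (the alert name, OID suffix and fromaddress are placeholders):

    alert WEBPAGE2_LTM_DOWN "Pool /Common/Test_Pool member .* monitor status down.*WebPage2_Monitor: down" {
       snmptrap OID=".1.3.6.1.4.1.3375.2.4.0.201";
       email toaddress="anton639@email.com"
       fromaddress="root"
       body="Webpage2 Monitor Fail"
    }

    Matching on the monitor name followed by its own status, rather than on the monitor name alone, matters if both monitors are listed in the same log line.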

  • Sorry for the late reply. I had modified my original config to post here and must have added the spaces by mistake. I will look at the config you replied with and see if I can find my error. Thanks.