External monitoring broken in TMOS 11.2.x?
Hi,
I discovered a strange problem with external monitors on 11.2.x. After an upgrade attempt from 10.2.x to 11.2.x, all my external monitors (Perl scripts) stopped working. Nothing at all was logged in /var/log/ltm, so I enabled debug logging for bigd (tmsh modify sys db bigd.debug value enable) and also traced bigd with strace. The debug log (/var/log/bigdlog) did not contain much information, but strace showed that the external monitor script could not be found or read (see the output below).
I opened a support case, but I am surprised to be the first one to run into this problem on 11.2.x, so I thought it would be a good idea to discuss it on DevCentral as well.
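For anyone who wants to reproduce the trace: the exact strace invocation used for the output further down is not shown, so the following is only a sketch. The helper name trace_bigd_exec, the default output path and the 256-character string limit are assumptions. It prints the command instead of running it, so it can be reviewed before attaching to bigd on a live system.

```shell
#!/bin/bash
# trace_bigd_exec [OUTFILE]: print (do not run) a strace command that attaches
# to bigd, follows its child processes and logs only execve() calls.
# The function name and the default output path are hypothetical.
trace_bigd_exec() {
    local out=${1:-/var/tmp/bigd.strace.txt}
    # -f               follow forked children (the spawned EAV processes)
    # -s 256           raise strace's default 32-character string limit
    # -e trace=execve  log only execve() calls
    # -o FILE          write the trace to FILE
    # -p PID           attach to the running bigd process
    echo "strace -f -s 256 -e trace=execve -o $out -p \$(pidof bigd)"
}
```

Raising the string limit matters because strace shortens long strings by default, which may be why the first argument in a trace appears cut off at 32 characters.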
Now my questions:
1.) Does this work for you? I mean external monitor scripts on 11.2.x. I see this problem on all systems and with almost all external scripts. It seems to work if I reload the external scripts 5-10 times, but that is too weird to run in production!
2.) Any idea why execve() reports ENOENT (see below)? The file is there, readable and runs fine from the CLI.
3.) Any idea how I can fix this?
==> /var/log/bigdlog
As you can see, bigd only logs "EAV failed", without giving any specific reason.
2012-09-28 17:11:02.931264: ID 6 :(_analyze_pings): visit DOWN, now=1348845062.930199 [ addr=::ffff:14x.xx.40.216:443 mon=/Common/citrix_web_interface_DORI_ctxprep1a fd=-1 pend=0 up_intvl=10 dn_intvl=10 timeout=31 time_until_up=0 immed=0 next_ping=1348845063.300199 last_ping=1348845053.316199 deadline=1348845062.916199 snd_cnt=27 rcv_cnt=0 ]
2012-09-28 17:11:03.331527: ID 6 :(_do_ping): time to ping, now=1348845063.331199 [ addr=::ffff:14x.xx.40.216:443 mon=/Common/citrix_web_interface_DORI_ctxprep1a fd=-1 pend=0 up_intvl=10 dn_intvl=10 timeout=31 time_until_up=0 immed=0 next_ping=1348845063.300199 last_ping=1348845053.316199 deadline=1348845072.930199 snd_cnt=27 rcv_cnt=0 ]
2012-09-28 17:11:03.332249: ID 6 :(_spawn_external_pinger): spawned EAV pid=13639 [ addr=::ffff:14x.xx.40.216:443 fd=9 ]
2012-09-28 17:11:03.333216: ID 6 :(_main_loop): rfd selected [ addr=::ffff:14x.xx.40.216:443 srcaddr=::%0:0 fd=9 pend=0 ]
2012-09-28 17:11:03.333394: ID 6 :(_recv_external_node_ping): reading [ addr=::ffff:14x.xx.40.216:443 ]
2012-09-28 17:11:03.333449: ID 6 :(recv_external_node_ping): EAV failed [ addr=::ffff:14x.xx.40.216:443 ]
2012-09-28 17:11:03.333498: ID 6 :(_kill_external_pinger): killing [ addr=::ffff:14x.xx.40.216:443 ]
==> /var/tmp/bigd.strace.
As you can see, execve() reports "ENOENT (No such file or directory)", although the file is there and works when run from the shell.
[root@LB:Active:Standalone] external_monitor_d grep execve /var/tmp/bigd.strace.11.2.1.txt
13618 execve("/config/filestore/files_d/Common_d/external_monitor_d/:Common:citrix_web_interface_monitor.pl_1", ["/config/filestore/files_d/Common", "::ffff:14x.xx.40.216", "443"], [/* 23 vars */]
13618 <... execve resumed> ) = -1 ENOENT (No such file or directory)
13619 execve("/config/filestore/files_d/Common_d/external_monitor_d/:Common:citrix_web_interface_monitor.pl_1", ["/config/filestore/files_d/Common", "::ffff:14x.xx.40.216", "443"], [/* 23 vars */]
13619 <... execve resumed> ) = -1 ENOENT (No such file or directory)
13620 execve("/config/filestore/files_d/Common_d/external_monitor_d/:Common:citrix_web_interface_monitor.pl_1", ["/config/filestore/files_d/Common", "::ffff:14x.xx.40.216", "443"], [/* 23 vars */]
13620 <... execve resumed> ) = -1 ENOENT (No such file or directory)
[root@LB:Active:Standalone] external_monitor_d ls -al /config/filestore/files_d/Common_d/external_monitor_d/:Common:citrix_web_interface_monitor.pl_1
-rwxr-xr-x 1 tomcat tomcat 15580 Sep 28 15:03 /config/filestore/files_d/Common_d/external_monitor_d/:Common:citrix_web_interface_monitor.pl_1
[root@LB:Active:Standalone] external_monitor_d perl -c /config/filestore/files_d/Common_d/external_monitor_d/:Common:citrix_web_interface_monitor.pl_1
/config/filestore/files_d/Common_d/external_monitor_d/:Common:citrix_web_interface_monitor.pl_1 syntax OK
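A note on the two checks above: perl -c hands the file to perl directly, so it never exercises the #! line of the script. execve() on a script, however, also returns ENOENT when the interpreter named on the #! line does not exist, for example when DOS line endings turn the interpreter name into "/usr/bin/perl\r". Whether that applies here is only a guess. A minimal sketch to test for it (check_shebang is a hypothetical helper name, not an F5 tool):

```shell
#!/bin/bash
# check_shebang FILE: report whether the interpreter named on the "#!" line
# exists and is executable, and whether that line ends in a carriage return.
# Returns 0 if the interpreter looks usable, non-zero otherwise.
check_shebang() {
    local file=$1 first interp
    # Read only the first line; tolerate a missing final newline.
    IFS= read -r first < "$file" || [ -n "$first" ] || {
        echo "empty or unreadable: $file"
        return 2
    }
    case $first in
        '#!'*) ;;
        *) echo "no #! line: $file"; return 1 ;;
    esac
    # With DOS line endings the trailing CR becomes part of the interpreter path.
    case $first in
        *$'\r') echo "CRLF line ending on #! line: $file"; return 1 ;;
    esac
    # Strip "#!", leading blanks and any interpreter argument.
    interp=${first#'#!'}
    interp=${interp#"${interp%%[![:space:]]*}"}
    interp=${interp%%[[:space:]]*}
    if [ -x "$interp" ]; then
        echo "interpreter ok: $interp"
        return 0
    fi
    echo "interpreter missing or not executable: $interp"
    return 1
}
```

Usage would be check_shebang followed by the full path of the monitor script under /config/filestore. If it reports a problem, running the script as ./script rather than through perl should reproduce the ENOENT from the shell.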
Thanks!
Kurt