Forum Discussion
After some frustrating experiences, I found that you cannot reliably run tcpdump from the alertd execution context - SELinux gets in the way and prevents access to the network devices.
It does work in some circumstances, but not for all releases/platforms/situations.
I had to build a hardware- and version-compatible repro to demonstrate and solve this problem when I first ran into it.
I solved it like this:
Have a startup script that creates a named pipe and blocks reading from it; when a message arrives on the pipe, the script runs the tcpdump.
This script runs in the root context and has permission to run tcpdump.
/config/startup/monitor_down_dump.sh
#!/bin/bash
NP=/var/run/monitor_down_tcpdump.pipe
if [ -e $NP ]; then
echo "$NP already exists; is this script already running?"
exit 1
fi
mkfifo $NP
read x < $NP
/bin/rm $NP
logger -p local0.info "$x"
# start a tcpdump
# THIS count VALUE MAY NEED TESTING AND TUNING
# (COUNT is a placeholder for the number of packets to capture)
tcpdump -c COUNT -nni 0.0:nnn -s0 -w /var/tmp/`uname -n`_`date +%F_%H:%M`.pcap
You also need a trigger script, run from your user_alert.conf, that pushes data into the named pipe.
This runs in the alertd context and does not have permission to run tcpdump, but can push a message down the named pipe.
/shared/monitor_down_trigger.sh
#!/bin/bash
NP=/var/run/monitor_down_tcpdump.pipe
echo "debug_triggered" > $NP
And your user_alert.conf snippet:
alert endb_mon_down "01070638:5: Pool /Common/pool_one member /Common/10.1.62.61:0 monitor status down." {
exec command="/shared/monitor_down_trigger.sh";
}
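One thing to watch with the trigger script: if the listener is not running, the pipe does not exist, and `echo ... > $NP` will normally create a regular file at that path, after which the startup script exits with "already exists". Below is a minimal sketch of a guarded trigger, assuming a hypothetical helper named send_trigger (it is not part of the scripts above). The sketch also dry-runs the whole pipe handoff in a scratch directory rather than /var/run, so it can be tried without alertd or tcpdump.

```shell
#!/bin/bash
# Sketch only: send_trigger is a hypothetical helper, not part of the original scripts.

send_trigger() {
    local pipe="$1" msg="$2"
    # -p is true only for a FIFO. Without this check, "echo > path" on a
    # missing path creates a regular file, and the listener's -e test
    # would then refuse to start.
    if [ ! -p "$pipe" ]; then
        echo "no listener pipe at $pipe; not triggering" >&2
        return 1
    fi
    echo "$msg" > "$pipe"
}

# Dry run in a scratch directory, standing in for /var/run.
workdir=$(mktemp -d)
pipe="$workdir/monitor_down_tcpdump.pipe"

# 1. No pipe yet: the helper declines and leaves no stray file behind.
if send_trigger "$pipe" "debug_triggered" 2>/dev/null; then
    missing_result="sent"
else
    missing_result="refused"
fi

# 2. Listener side: create the pipe and block on it in the background,
#    as the startup script does, saving what it reads to a file.
mkfifo "$pipe"
( read -r x < "$pipe"; echo "$x" > "$workdir/received" ) &
listener_pid=$!

# 3. Trigger side: push the message, then wait for the listener to finish.
send_trigger "$pipe" "debug_triggered"
wait "$listener_pid"
received=$(cat "$workdir/received")

echo "without pipe: $missing_result"
echo "listener received: $received"
rm -rf "$workdir"
```

In the real trigger script the dry-run part would be dropped and the call would be `send_trigger /var/run/monitor_down_tcpdump.pipe "debug_triggered"`. Note that a write to a FIFO still blocks until a reader has it open, so this guard only covers the missing-pipe case.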
For my implementation, the customer also had a cron task that checked every 10 minutes whether the script was still running, and restarted it if it had triggered or stopped. This may or may not be required.
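I don't have the customer's cron task to hand, so the following is only a sketch of what such a watchdog could look like. The helper name ensure_running, the use of pgrep, and the stale-pipe cleanup are my assumptions; the two default paths are the ones used in the scripts above.

```shell
#!/bin/bash
# Sketch of a cron watchdog. ensure_running and its behaviour are
# assumptions for illustration, not the customer's actual cron task.

ensure_running() {
    local pattern="$1" pipe="$2"
    shift 2
    # pgrep -f matches the pattern against full command lines.
    if pgrep -f "$pattern" > /dev/null 2>&1; then
        return 0    # listener is still waiting, or its tcpdump is still capturing
    fi
    # Listener is gone. If it was killed while waiting, its pipe is still
    # there and the listener would refuse to start, so clear it first.
    if [ -p "$pipe" ]; then
        rm -f "$pipe"
    fi
    # Restart the listener in the background, detached from cron's session.
    nohup "$@" > /dev/null 2>&1 &
}

LISTENER="${LISTENER:-/config/startup/monitor_down_dump.sh}"
NP="${NP:-/var/run/monitor_down_tcpdump.pipe}"
ensure_running "$(basename "$LISTENER")" "$NP" "$LISTENER"
```

Saved under a hypothetical name such as /shared/monitor_down_watchdog.sh, a root crontab entry of `*/10 * * * * /shared/monitor_down_watchdog.sh` would run it every 10 minutes. The watchdog's own file name must not contain the listener's file name, or pgrep -f will match the watchdog itself.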
- davidfisher, Jun 17, 2019 (Cirrus)
I was trying the script on v12.1.
Is this workaround required for all versions? Which version are you running?
- Simon_Blakely, Jun 17, 2019 (Employee)
I developed that solution on 12.1.2, and I expect it to be required for all later versions.
It's complex, but it is reliable. Just running tcpdump directly from user_alert.conf may work (for example, it initially worked on my development 12.1.2 VE), but not in all cases (it didn't work on a physical 12.1.2 vCMP guest in the lab).
The solution I documented above does provide results.
It isn't instant, but using alertd introduces a delay anyway.