Forum Discussion

Chris_15549
Nimbostratus
Feb 02, 2009

EAV script not reporting correctly

I'm having a problem with an EAV script. The script itself works: it connects to the host, and when I run it by hand it returns UP. When the BIG-IP runs it, however, the host is marked down, but only this one host. Two other machines in the same pool use the same check, and they are not marked down. I contacted F5 support, but they didn't seem to know what EAV was or how it worked. Our sales rep directed me here and thought someone might be able to help. Thanks in advance!

6 Replies

  • Deb_Allen_18
    Historic F5 Account
    If it is working correctly against two hosts but not a third, it's not likely to be an issue with the script, but go ahead and post your script so we can take a closer look.
  • Well, the reason I don't believe it's the host is that if I run the script by hand, it returns the correct response. It's only when the F5 runs it that the responses aren't interpreted correctly, for some reason. The hosts have the same config in the management GUI. I can post the script if that would help, but I don't think it would: I know the script works, and I've had it in production for over a year without any issues.
  • Deb_Allen_18
    Historic F5 Account
    Hey again -

    You might want to check out this article I wrote a while back re: troubleshooting external monitors as well, if you haven't already:

    LTM External Monitors: Troubleshooting
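A monitor that passes by hand but fails under bigd may be seeing a different environment. The sketch below is one way to approximate the bigd call from an interactive shell. It assumes bigd passes the member address in IPv4-mapped form (`::ffff:a.b.c.d`) as the first argument and the port as the second, and that any output on stdout means UP. The function name `run_like_bigd` and the `PATH` value are illustrative only, not F5 conventions.

```sh
#!/bin/sh
# Sketch: run an external monitor script roughly the way bigd might.
# Assumptions (not taken from the thread): $1 = ::ffff:<ip>, $2 = port,
# and a nearly empty environment.

run_like_bigd() {
    script="$1"
    ip="$2"
    port="$3"

    # env -i drops the caller's environment (SSH_AUTH_SOCK, custom
    # variables, and so on), so the script cannot rely on anything
    # inherited from the login shell.
    # The exit status is ignored on purpose: in this sketch only stdout
    # decides the result.
    out=$(env -i PATH=/bin:/usr/bin /bin/sh "$script" "::ffff:${ip}" "$port" 2>/dev/null) || true

    if [ -n "$out" ]; then
        echo "UP"
    else
        echo "DOWN"
    fi
}
```

For example, `run_like_bigd /path/to/monitor_script 10.0.0.1 9827` (path and address are placeholders) prints UP or DOWN. If it prints DOWN while a plain by-hand run prints UP, the difference is likely in the environment rather than in the script logic.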

  • Here is the script. It returns correctly for all the hosts; the LTM just doesn't interpret the result correctly.

    #!/bin/sh

    MEMBER="${1}"

    # Require at least one argument (the pool member address).
    [ ${#} -lt 1 ] && exit 255

    PIDFILE="/var/run/gprsd_proc.${MEMBER}.eav.${PORT}.pid"

    # Kill any previous instance of this monitor that is still running.
    if [ -f "${PIDFILE}" ]
    then
        kill -9 `cat "${PIDFILE}"` > /dev/null 2>&1
    fi

    echo "$$" > "${PIDFILE}"

    # Strip the IPv4-mapped IPv6 prefix from the member address.
    MEMBER=`echo $1 | sed 's/::ffff://'` 2>/dev/null

    log_info() {
        echo ${*} | logger -p local0.info

        # stock syslog-ng destinations -
        # local0.info - /var/log/ltm
        # local4.info - /var/log/ltm
        # local1.info - /var/log/messages
        # local3.info - /var/log/messages
        # this is for debug only! do not call log_info below unless
        # you want to see your syntax and the shell evaluation of
        # variables (variable expansion)
    }

    # GPRSD_PING holds the remote grep exit code: 1 means "Timeout" was not found.
    GPRSD_PING=`ssh -l gprsd $MEMBER '/usr/local/bin/gprsping -h localhost -p 9827 -c 1 -t 3 2>&1 |head -n1 |grep Timeout &> /dev/null; echo $?;'`
    STATUS=$?

    # For debug:
    # DOES NOT RUN host. It only serves to expand variables and offer
    # feedback. Only uncomment for troubleshooting, or durable storage may be
    # exhausted. Info will be appended to a log file.
    # log_info host "${HOST_2_RESOLV}" "${MEMBER}" 2>/dev/null 1>/dev/null

    if [ "${GPRSD_PING}" -eq 1 ]
    then
        echo "UP"
    fi

    rm -f "${PIDFILE}"
    exit "${STATUS}"


    Tailing /var/log/secure, I can see that the connection is hitting the host, getting accepted, and then disconnecting, from both the primary and the standby.

    Feb 2 16:05:14 gprsd-003 sshd[25018]: Accepted publickey for gprsd from xx.xx.xx.xx port 60305 ssh2
    Feb 2 16:05:14 gprsd-003 sshd[25018]: pam_unix(sshd:session): session opened for user gprsd by (uid=0)
    Feb 2 16:05:14 gprsd-003 sshd[25018]: pam_unix(sshd:session): session closed for user gprsd
    Feb 2 16:05:14 gprsd-003 sshd[25047]: Connection closed by xx.xx.xx.xx
    Feb 2 16:05:15 gprsd-003 sshd[25048]: Accepted publickey for gprsd from xx.xx.xx.xx port 53096 ssh2
    Feb 2 16:05:15 gprsd-003 sshd[25048]: pam_unix(sshd:session): session opened for user gprsd by (uid=0)
    Feb 2 16:05:15 gprsd-003 sshd[25048]: pam_unix(sshd:session): session closed for user gprsd
    Feb 2 16:05:18 gprsd-003 sshd[25074]: Accepted publickey for gprsd from xx.xx.xx.xx port 60332 ssh2
    Feb 2 16:05:18 gprsd-003 sshd[25074]: pam_unix(sshd:session): session opened for user gprsd by (uid=0)
    Feb 2 16:05:18 gprsd-003 sshd[25074]: pam_unix(sshd:session): session closed for user gprsd
    Feb 2 16:05:20 gprsd-003 sshd[25100]: Accepted publickey for gprsd from xx.xx.xx.xx port 53126 ssh2
    Feb 2 16:05:20 gprsd-003 sshd[25100]: pam_unix(sshd:session): session opened for user gprsd by (uid=0)
    Feb 2 16:05:20 gprsd-003 sshd[25100]: pam_unix(sshd:session): session closed for user gprsd
    Feb 2 16:05:23 gprsd-003 sshd[25126]: Accepted publickey for gprsd from xx.xx.xx.xx port 60359 ssh2
    Feb 2 16:05:23 gprsd-003 sshd[25126]: pam_unix(sshd:session): session opened for user gprsd by (uid=0)
    Feb 2 16:05:23 gprsd-003 sshd[25126]: pam_unix(sshd:session): session closed for user gprsd


  • If the request to the node is made over SSH and the connection to the node is being made, I'm not sure a tcpdump will help determine what's failing.

    Can you add debug to the script (even if this breaks the monitoring of the node by returning output)? If the script runs successfully from the command line but not when called by bigd, maybe there is something different in the environment settings? Although, if the same script runs successfully against other pool members, it seems like the issue might be with the specific node.

    Aaron
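
One way to add the kind of debug suggested above without breaking the monitor is to write everything to a file rather than stdout, since output on stdout is what marks the member UP. The following is a minimal sketch, not a tested fix for this thread: the log path and the helper names `debug_log`, `debug_env`, and `run_logged` are hypothetical. It records the arguments and environment the script sees, and keeps the stderr of a command (for example, an ssh error message that the posted script would discard).

```sh
#!/bin/sh
# Sketch: debug helpers for an external monitor script.
# DEBUG_LOG is a hypothetical location; pick a path with enough space
# and remove the debug calls once troubleshooting is done.
DEBUG_LOG="${DEBUG_LOG:-/var/tmp/gprsd_eav_debug.log}"

debug_log() {
    # Append to the log file only; nothing is written to stdout.
    echo "$(date '+%b %d %H:%M:%S') [$$] $*" >> "${DEBUG_LOG}"
}

debug_env() {
    # Record what the script was called with and who it runs as.
    debug_log "args: $*"
    debug_log "user=$(id -un) HOME=${HOME:-unset} PATH=${PATH:-unset}"
}

run_logged() {
    # Run a command, send its stderr to the log, and pass its stdout
    # and exit status back to the caller unchanged.
    debug_log "running: $*"
    if out=$("$@" 2>> "${DEBUG_LOG}"); then
        rc=0
    else
        rc=$?
    fi
    debug_log "exit=${rc} stdout=[${out}]"
    printf '%s\n' "${out}"
    return ${rc}
}

# Possible use inside the monitor script (shown as a comment only):
#   debug_env "$@"
#   GPRSD_PING=$(run_logged ssh -l gprsd "$MEMBER" '<remote command>')
#   STATUS=$?
```

Comparing the log entries written during a by-hand run with those written when bigd runs the script should show whether the user, `HOME`, `PATH`, or the ssh error output differs for the failing member.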