Forum Discussion

William_Benett1
Nimbostratus
Jul 11, 2007

Node status problems

I have an iRule that uses LB::status to check the state of a node. Based on that, I then load balance to it or not. I'm running 9.3.0 and I've run into an issue where the node shows up green for TMM but LB::status returns "down". What's up with that?

Here's the rule. Support bounced me over to DevCentral. It performs load balancing for a very weird application.
when RULE_INIT {
  array set ::session_table { }
  set ::lbcurrent 0
  set ::BSIFlowPort 49151
  set ::BSIFhighPort 49250
  set ::BSIFNODE1 "x.x.x.x"
  set ::BSIFNODE2 "x.x.x.y"
}
when CLIENT_ACCEPTED {
  set lport [TCP::local_port]
  switch -glob $lport {
    49* { if { $lport > $::BSIFlowPort && $::BSIFhighPort > $lport} {
          if { ! [expr {$lport & 1}]} {
            switch $::lbcurrent {
              0 {
                if { [LB::status node $::BSIFNODE1 up] == 0 } {
                  log local0. "chose node 2, node 1 status:\
                    [LB::status node $::BSIFNODE1]"
                  set tableentry "[IP::client_addr]:$lport"
                  set ::session_table($tableentry) $::BSIFNODE2
                  set ::lbcurrent 1
                  node $::BSIFNODE2
                } else {
                  log local0. "chose node 1, node 1 status:\
                    [LB::status node $::BSIFNODE1]"
                  set tableentry "[IP::client_addr]:$lport"
                  set ::session_table($tableentry) $::BSIFNODE1
                  set ::lbcurrent 1
                  node $::BSIFNODE1
                }
              }
              1 {
                if { [LB::status node $::BSIFNODE2 up] == 0 } {
                  log local0. "chose node 1, node 2 status:\
                    [LB::status node $::BSIFNODE2]" 
                  set tableentry "[IP::client_addr]:$lport"
                  set ::session_table($tableentry) $::BSIFNODE1
                  set ::lbcurrent 0
                  node $::BSIFNODE1
                } else {
                  log local0. "chose node 2, node 2 status:\
                    [LB::status node $::BSIFNODE2]"
                  set tableentry "[IP::client_addr]:$lport"
                  set ::session_table($tableentry) $::BSIFNODE2
                  set ::lbcurrent 0
                  node $::BSIFNODE2
                }
              }
            }
          } else {
            # odd port: look up the node already chosen for the matching even port
            set evenport [expr {$lport - 1}]
            set tableentry "[IP::client_addr]:$evenport"
            if { ! [info exists ::session_table($tableentry)] } {
              reject
            } else {
              log local0. "we got here with [TCP::local_port]"
              node $::session_table($tableentry) $lport
            }
          }
        }
      }
    default { discard }
  }
}
when CLIENT_CLOSED {
  # remove the session-table entry when the even-port connection closes
  if { ! [expr {[TCP::local_port] & 1}] } {
    set tableentry "[IP::client_addr]:[TCP::local_port]"
    if { [info exists ::session_table($tableentry)] } {
      unset ::session_table($tableentry)
    }
  }
}

3 Replies

  • Deb_Allen_18
    Historic F5 Account
    Let me do a little digging around & see what I can find out.

    In the meantime, a couple of observations:

    jfroggatt:

    For this test:
       if {[LB::status pool $LBpool member $LBmember $LBport] eq "up"} {
    you might consider using 'ne "down"' instead of 'eq "up"' if you ever want to be able to drain off connections or test the rest of the rule logic without a monitor applied. That way the status of DISABLED and UNCHECKED would pass the test.
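
    For example, something along these lines (just an untested sketch reusing your $LBpool / $LBmember / $LBport variables; the member selection and log line are only illustrative):
       if { [LB::status pool $LBpool member $LBmember $LBport] ne "down" } {
         # UP, DISABLED and UNCHECKED all pass this test; only a member the
         # monitor has explicitly marked down is skipped
         pool $LBpool member $LBmember $LBport
       } else {
         log local0. "member $LBmember:$LBport is marked down"
       }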

    wbenetti:

    As documented on the LB::status wiki page, the LB::status command didn't work as expected in 9.2.3 with this syntax:
    if {[LB::status node $::BSIFNODE1 up] == 0} {
    and we had to instead use the syntax as in the example above:
    if {[LB::status...] ne "up"} {
    Let me know if you've verified that the correct boolean is returned in all cases in 9.3.0, and I'll do some further testing & update the wiki if appropriate.
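
    If it helps, a tiny debug rule along these lines (sketch only, borrowing your node variable) would let you compare both forms against what the GUI shows on every connection:
       when CLIENT_ACCEPTED {
         # log the string form and the boolean "up" form side by side
         log local0. "node1 status: [LB::status node $::BSIFNODE1]\
           / up flag: [LB::status node $::BSIFNODE1 up]"
       }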

    Also, I seem to recall difficulty in getting the "node" command to function properly without a port parameter, so you might want to try adding it to see if you get a more consistent result that way.
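
    Something like this, for example (sketch only, using your node variable and the connection's local port as the server-side port):
       # explicit port on the node command, rather than omitting it
       node $::BSIFNODE1 [TCP::local_port]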

    HTH, and I'll be circling back on this one later this week.

    /deb

  • So I changed my code to test [LB::status node $::BSIFNODE1] eq "down" per the LB::status page, but it still seems to get the incorrect node status.

    Something I've noticed is that any time I touch the health checks, it stands a chance of flapping the node state.

    It's like LTM is detecting a dead node, but it's not always updating whatever the iRule polls with the LB::status check.

    I'm not sure what you're asking for in terms of verification. The values seem to be incorrect when polled, but the log files show the correct node state. I can reproduce it with this procedure:

    1. Down node 2 (add a bad route, actually disable the box, etc.) and wait for LTM to notice, then run the test. Sometimes the rule notices that node 2 is down and sends all traffic to node 1.
    2. Bring node 2 back up and check the log file. The iRule keeps sending traffic only to node 1, because the node 2 status is still reported as down or session_disable.
    3. If I then do one of a number of tasks in the node section (modify the node health checks, including changing from node default to node specific, or enable/disable the node), the iRule's next invocation picks up the correct status and it processes traffic correctly.

    On the topic of the "node" command, I have to say that I haven't experienced anything like that. Quite honestly, it seems to work great whenever my rule can determine the node's actual status.
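
    In case it helps, the check I'm running now looks roughly like this (stripped down from the full rule, with only the status test and the node selection shown):
      if { [LB::status node $::BSIFNODE1] eq "down" } {
        # LB::status says node 1 is down, so fail over to node 2
        log local0. "node 1 down per LB::status: [LB::status node $::BSIFNODE1]"
        node $::BSIFNODE2
      } else {
        node $::BSIFNODE1
      }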
  • Deb_Allen_18
    Historic F5 Account
    OK, thanks for the feedback, and for bringing the problem to our attention in the first place.

    Your Support case has been escalated for further investigation, so please be patient while we work together to try to solve your problem. I don't see your steps to reproduce logged in the case, so I'd recommend forwarding that info to the Support engineer working your case to help them toward a speedy resolution.

    It's understandable that changes to the monitor may cause temporary changes to perceived pool member status. Once you have resolved the LB::status issue, if you are still finding instability with monitoring, I'd recommend opening a new Support case to dig into it further.

    /deb