
Oleg_68900
Jan 18, 2009

Failsafe monitors

The question is very simple, but the introduction is a bit long...

 

 

Setup:

 

Two LTM v9 units (Active-Standby configuration) are cross-connected to two FWs (Active-Standby configuration; Juniper, NSRP). Nothing else is there: no switches, just 4 Ethernet cables.

 

Each LTM has 2 ports on the external VLAN, and each LTM port connects to a different FW (i.e., no STP is needed).

 

It is the same picture from the FW standpoint: each FW has 2 ports in a zone connected to the LTMs, and each FW port connects to a different LTM.

 

 

Obviously, the configuration above requires failsafe monitoring.

 

Namely, if the cable between the active LTM and the active FW is disconnected, either the LTM or the FW needs to fail over [since neither the standby LTM nor the standby FW will forward traffic].

 

 

(A) The LTM can monitor the external VLAN (and send ARP requests to the default gateway).

 

(B) The FW can monitor the LTM floating IP on the external VLAN.

 

 

Questions:

 

1. How long does an LTM failover take? How much does it depend on traffic/load, connection mirroring, etc.?

 

 

2. Suppose that on average an FW failover takes 3-5 sec and an LTM failover takes 5-10 sec.

 

The FW monitor timeout (B) is set to 4 sec.

 

During an LTM failover (automatic or forced), the LTM floating IP might be down long enough to cause an FW failover as well.

 

And even worse, when the LTM failover has completed, the FW might be unavailable because it is performing its own failover, which in turn causes another LTM failover. Not exactly a positive feedback loop, but it can definitely throw the system out of equilibrium...
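
To make the cascade concrete, here is a minimal Python sketch using the numbers from this post. It assumes (a simplification, not documented LTM or NSRP behavior) that the floating IP is unreachable for the whole failover window and that a monitor fires once the outage lasts at least as long as its timeout. The function name `monitor_trips` is hypothetical.

```python
def monitor_trips(outage_s: float, timeout_s: float) -> bool:
    """Return True if a peer monitor would declare failure during an outage.

    Assumption: the monitored IP is unreachable for the entire outage, and
    the monitor fires once the outage is at least as long as its timeout.
    """
    return outage_s >= timeout_s


FW_MONITOR_TIMEOUT_S = 4.0          # FW monitor (B) timeout from the post
LTM_FAILOVER_RANGE_S = (5.0, 10.0)  # estimated LTM failover time from the post

# Check the best and the worst case of the LTM failover window.
cascade = [monitor_trips(t, FW_MONITOR_TIMEOUT_S) for t in LTM_FAILOVER_RANGE_S]
print(cascade)
```

Under these assumptions, even the fastest LTM failover (5 sec) outlasts the 4 sec FW monitor timeout, so the FW monitor would fire in both cases.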

 

 

So both monitor timeouts should be chosen carefully.

 

I cannot find any references on how to calculate them properly. :-(

 

From common sense, at least one of them should be longer than the combined failover time (LTM + FW), plus a small constant to absorb fluctuations.
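
The common-sense rule above can be written as a small Python sketch. This is only my own rule of thumb, not a vendor guideline; the function name `min_safe_timeout` and the 2 sec margin are assumptions chosen for illustration, and the failover times are the worst-case estimates from this post.

```python
def min_safe_timeout(ltm_failover_s: float, fw_failover_s: float,
                     margin_s: float = 2.0) -> float:
    """Lower bound for the longer of the two monitor timeouts.

    Rule of thumb from the post: combined failover time (LTM + FW)
    plus a small constant. The default margin is an arbitrary assumption.
    """
    return ltm_failover_s + fw_failover_s + margin_s


# Worst-case estimates from the post: LTM 10 sec, FW 5 sec.
worst_case = min_safe_timeout(ltm_failover_s=10.0, fw_failover_s=5.0)
print(worst_case)  # 17.0
```

With these numbers, at least one of the two monitors would need a timeout of about 17 sec, which is well above the 4 sec FW timeout but still far below the 90 sec default mentioned below.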

 

The documentation mentions that the default timeout for the VLAN failsafe monitor is 90 sec, but that seems far too long to me.

 

Can anybody share the guidelines for this?

 

On the other hand, it might be that I have built the system in completely the wrong way, haven’t I?

 

My “weighted” experience with LTMs is ~2 hours total. :-)

 

 

2 Replies

  • I have several customers using a 1 second gateway failsafe interval with a 5-6 second timeout, and this works well. I would not recommend going below 5 seconds on the timeout, or you might be susceptible to false failures (flapping up and down).

     

     

    Denny
  • Danny,

     

    Thanks for the answer. :-)

     

     

    > a 1 second gateway failsafe interval with a 5-6 second timeout

     

    Just to make sure I got you right: in my configuration, that means I need to set up:

     

    On LTM side:

     

    Create a custom (Gateway ICMP) monitor with Interval = 1 and Timeout = 5 or 6

     

    Create a pool (with 1 IP = the LTM default gateway)

     

    Create a failsafe monitor based on the pool with Threshold = 1 and Action = Fail Over.

     

     

    Side note: Have you ever tried using VLAN monitors? How do we set the interval there (for ARP requests)?

     

     

    What are the monitor parameters on the router/FW side?

     

    And what are the average failover timings on both sides?