Forum Discussion

Stefan_Klotz
May 09, 2014

Force to standby with HA Group

Hi there,

 

We have LTM 4000s units running version 11.4.1 HF3 with two LACP channels (two interfaces each): one dedicated to heartbeat/sync/mirroring, the other a trunk carrying all the production VLANs. To cover the case where the production trunk fails while the heartbeat is still alive, we configured an HA group for the production trunk. We ran several tests, disabling different ports on the switches. The behavior was always as expected, in particular the automatic failover triggered by the HA group when the production trunk fails. What we then found, however, is that we cannot manually switch back. The active unit shows standby for a few seconds and then automatically returns to active. The other device stays standby the whole time (at least according to the WebGUI). At first we thought the problem might be that the heartbeat trunk was not part of the HA group, but we see the same issue with both trunks included. If we simply disable the Active Bonus, though, manual failover works fine. This is our setup:

 

sys ha-group ha_group {
    active-bonus 50
    trunks {
        channel_1.1_1.2_heartbeat {
            weight 10
        }
        channel_1.3_1.4_production {
            weight 90
        }
    }
}

 

Any idea what's the reason for this? Thank you!

 

Ciao Stefan :)
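
For readers following the arithmetic, here is a rough sketch of how the scores for the ha_group configuration in the post above might work out. It assumes a simplified model in which each trunk contributes its weight scaled by the fraction of member links that are up, and the active bonus is added only on the active unit; thresholds are ignored, and the function name `ha_score` is purely illustrative, not an F5 API.

```python
# Simplified, assumed model of HA group scoring (not F5 code):
# each trunk contributes weight * (links up / links total),
# and the active bonus is added only on the currently active unit.

def ha_score(trunks, active_bonus=0, is_active=False):
    """trunks: list of (weight, links_up, links_total) tuples."""
    score = sum(weight * up / total for weight, up, total in trunks)
    return score + (active_bonus if is_active else 0)

# Weights and bonus taken from the posted ha_group; two links per trunk.
healthy = [(10, 2, 2), (90, 2, 2)]
production_down = [(10, 2, 2), (90, 0, 2)]

active_healthy = ha_score(healthy, active_bonus=50, is_active=True)
standby_healthy = ha_score(healthy, active_bonus=50)
active_failed = ha_score(production_down, active_bonus=50, is_active=True)

print(active_healthy, standby_healthy, active_failed)
```

Under these assumptions a healthy active unit scores 150 against 100 on the healthy standby, and losing the production trunk drops the active unit to 60, which would explain the automatic failover seen in the tests.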

 

6 Replies

  • This is expected behavior:

     

    http://support.f5.com/kb/en-us/solutions/public/14000/500/sol14515.html

     

    Use 'Force Offline' and then 'Release Offline'

     

  • Things are getting stranger and stranger. We have now found that this behavior only seems to occur when we try to manually fail back from unit2 to unit1. The other way round, from unit1 to unit2, works fine without any issues. Unit1 has the lower management IP, in case that is relevant.

     

    Ciao Stefan :)

     

  • Rory

    I had this same issue in 11.2.0, and it persisted after an upgrade to 11.4.0 HF3.

     

    In my case we didn't have a dedicated heartbeat between units, so we added one on the premise that going through our switching gear may have been mucking things up (and it was a good idea in general). It didn't help. Rebuilding the trust relationship didn't help either.

     

    The resolution for us came after one of our C2100 blades packed it in and needed to be replaced. After configuring management on the replacement blade and rebuilding the trust between the VIPRIONs, the issue mysteriously disappeared.

     

    I suspect the hardware, since the C2100 blade on the other chassis also stopped POSTing about four months earlier, and replacing it in the same manner didn't resolve the issue. F5 couldn't find anything amiss with our configuration.

     

  • Kevin_K_51432
    Historic F5 Account

    Hi Stefan, the scoring/failover aspects of clustering can get complicated. This bug could be one possibility:

     

    http://support.f5.com/kb/en-us/solutions/public/14000/100/sol14155.html?sr=37247153

     

    Also, I see a warning in the manual section "What is an HA group?" stating that you must ensure this is configured on both devices. This tells me that the feature doesn't sync, and perhaps some customers are overlooking this. There also appears to be quite a bit of good information there for understanding scoring in general:

     

    https://support.f5.com/kb/en-us/products/big-ip_ltm/manuals/product/bigip-device-service-clustering-admin-11-5-0/8.html?sr=37247045

     

    Hope this offers some help.

     

    Kevin

     

  • To fail over manually when using HA groups, I change the score in the HA group section for one of the objects (pool, trunk, etc.) so that the active box's score drops below the standby box's score (adjust accordingly for your active bonus setting); the box then fails over.

     

  • Hi Cory, many thanks for this hint! I tested it yesterday and it works perfectly fine. Thank you!

     

    Ciao Stefan :)
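
The workaround from the last replies comes down to simple arithmetic, sketched below. This assumes that with all trunks healthy each unit scores the sum of its trunk weights, that the active unit additionally gets the active bonus, and that failover only happens once the active score is strictly below the standby score. The helper `reduction_needed` is a hypothetical name for illustration, not part of any F5 tool.

```python
import math

def reduction_needed(active_score, standby_score):
    """Smallest whole-number cut that puts the active score strictly below standby."""
    gap = active_score - standby_score
    return max(0, math.floor(gap) + 1)

# Values from the ha_group posted at the top of the thread.
ACTIVE_BONUS = 50
TRUNK_WEIGHTS = {
    "channel_1.1_1.2_heartbeat": 10,
    "channel_1.3_1.4_production": 90,
}

base = sum(TRUNK_WEIGHTS.values())   # score with all trunk members up
active = base + ACTIVE_BONUS         # active unit gets the bonus on top
standby = base

cut = reduction_needed(active, standby)
new_production_weight = TRUNK_WEIGHTS["channel_1.3_1.4_production"] - cut

print(cut, new_production_weight)
```

With the posted values the active unit would need to lose at least 51 points, for example by temporarily lowering the production trunk weight from 90 to 39 on the active unit only. Since an earlier reply suggests the HA group is configured per device rather than synced, the change would have to be made on that unit and reverted after the failover.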