Force to standby with HA Group

Question

Hi there,
We have an LTM 4000s running version 11.4.1 HF3 with two LACP channels (two interfaces each): one dedicated to heartbeat/sync/mirroring, the other a trunk carrying all the production VLANs. To cover the case where the production trunk fails while the heartbeat is still alive, we configured an HA group for the production trunk. We ran several tests, disabling different ports on the switches. The behavior was always as expected, in particular the automatic failover driven by the HA group when the production trunk fails. What we then found, however, is that it is not possible to switch back manually. The active unit shows standby for a few seconds and then automatically returns to active. The other device stays standby the whole time (at least as far as we can see in the WebGUI). At first we thought the problem might be that the heartbeat trunk was not part of the HA group, but even with both trunks included we see the same issue. If we simply disable the Active Bonus, manual failover works fine. We have set up the following:

sys ha-group ha_group {
    active-bonus 50
    trunks {
        channel_1.1_1.2_heartbeat {
            weight 10
        }
        channel_1.3_1.4_production {
            weight 90
        }
    }
}

Any idea what's the reason for this?
Thank you!
Ciao Stefan :)

cory_50405 · Answer

This is expected behavior:
http://support.f5.com/kb/en-us/solutions/public/14000/500/sol14515.html
Use 'Force Offline' and then 'Release Offline'.

stefan_klotz · Answer

Things are getting stranger and stranger. We have now found that this behavior only seems to occur when we try to fail back manually from unit2 to unit1. The other way round, from unit1 to unit2, works fine without any issues. Unit1 has the lower management IP, in case that is relevant.
Ciao Stefan :)

rory · Answer

I had this same issue in 11.2.0, and it persisted after an upgrade to 11.4.0 HF3.
In my case we didn't have a dedicated heartbeat between units. We added one on the premise that going through our switching gear may have been mucking things up (and it was a good idea in general). It didn't help. Rebuilding the trust relationship also didn't help.
The resolution for us came after one of our C2100 blades packed it in and needed to be replaced. After configuring management on the replacement blade and rebuilding the trust between the VIPRIONs, the issue mysteriously disappeared.
I suspect the hardware, since the C2100 blade on the other chassis also stopped POSTing about 4 months earlier, and replacing it in the same manner didn't resolve the issue. F5 couldn't find anything amiss with our configuration.

kevin_k_51432 · Answer

Hi Stefan,
The scoring / failover aspects of clustering can get complicated. This bug could be one possibility:
http://support.f5.com/kb/en-us/solutions/public/14000/100/sol14155.html?sr=37247153
Also, I see a warning in the manual section "What is an HA group?" which states that you must ensure this is configured on both devices. This tells me that the feature doesn't sync and that perhaps some customers are overlooking this? There also appears to be quite a bit of good information for understanding scoring in general:
https://support.f5.com/kb/en-us/products/big-ip_ltm/manuals/product/bigip-device-service-clustering-admin-11-5-0/8.html?sr=37247045
Hope this offers some help.
Kevin
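
To make the scoring a bit more concrete, here is a rough sketch of how the numbers in Stefan's HA group might add up. It assumes a simplified model: each trunk contributes its weight multiplied by the fraction of its members that are up, and the active bonus is added only on the active unit. Thresholds are ignored, and the names `Trunk`, `ha_score` and `trunks` are made up for illustration. This is not F5's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Trunk:
    name: str
    weight: int
    members_up: int
    members_total: int


def ha_score(trunks, active_bonus, is_active):
    """Simplified HA score: weight scaled by the share of members up,
    plus the active bonus on the active unit only (assumed model)."""
    score = sum(t.weight * t.members_up / t.members_total for t in trunks)
    if is_active:
        score += active_bonus
    return score


def trunks(production_members_up=2):
    """Trunks from the posted config; each LACP channel has 2 interfaces."""
    return [
        Trunk("channel_1.1_1.2_heartbeat", 10, 2, 2),
        Trunk("channel_1.3_1.4_production", 90, production_members_up, 2),
    ]


ACTIVE_BONUS = 50

# Both units healthy: only the bonus separates them.
healthy_active = ha_score(trunks(), ACTIVE_BONUS, is_active=True)
healthy_standby = ha_score(trunks(), ACTIVE_BONUS, is_active=False)

# Production trunk completely down on the active unit.
failed_active = ha_score(trunks(production_members_up=0), ACTIVE_BONUS, is_active=True)

print(healthy_active, healthy_standby, failed_active)
```

Under this model the healthy active unit leads by exactly the bonus, and losing the whole production trunk drops it below the standby unit, which matches the automatic failover Stefan observed. What happens when both units end up with equal scores after a forced standby is not covered here.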

sec-enabled_658 · Answer

To fail over manually when I use HA groups, I manually change the score in the HA group section for one of the objects (pool, trunk, etc.) so that the active box's score drops below the standby box's score (you have to adjust accordingly based on your active bonus setting). Then the box fails over.
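
As a rough illustration of that adjustment, the sketch below works out how far one object's weight would have to be lowered on the active box. It assumes the same simplified scoring as the thread implies (sum of weights plus active bonus on the active unit), that all members are up, and that the HA group is edited locally on the active box only. The helper `max_weight_for_failover` is hypothetical, not an F5 tool.

```python
def max_weight_for_failover(standby_score, active_bonus, other_weights):
    """Largest integer weight for the adjusted object that keeps the
    active unit's total score strictly below the standby unit's score."""
    return standby_score - active_bonus - sum(other_weights) - 1


# Values from the posted config: heartbeat trunk weight 10, active bonus 50,
# standby unit fully healthy with 10 + 90 = 100.
new_weight = max_weight_for_failover(
    standby_score=100, active_bonus=50, other_weights=[10]
)
active_score = 10 + new_weight + 50

print(new_weight, active_score)
```

With these numbers the production trunk's weight on the active box would need to go from 90 to 39 or lower. A negative result would mean that lowering this one object is not enough. Check the weight range your version allows before relying on this, and remember to restore the original weight afterwards.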
