Adding standby F5 to HA pair using network failover

Question

All,
Can anyone advise why, when I add a standby F5 LTM to an active/standby HA F5 LTM setup using network failover, the interfaces on the active F5 go offline after the devices "see each other"? It then seems to cycle, with the interfaces of the active or the passive unit going online and offline.
The devices were in an active/standby pair using a serial cable, but we had to switch to network failover because the devices moved. It could be that the connection between the two devices is not stable enough for HA heartbeat traffic; I'm just trying to find out whether this would cause the interfaces to go offline.
Here is the log from the active F5 (192.168.52.131):
Sep 2 04:03:19 tmm tmm[933]: 01340001:3: HA Connection with peer 192.168.52.132:47998 established.
Sep 2 04:03:19 tmm tmm[933]: 01340001:3: HA Connection with peer 192.168.52.132:47998 established.
Sep 2 04:03:28 MQ1LTM01 sod[997]: 010c0025:5: Toggle from active to standby to active.
Sep 2 04:03:28 MQ1LTM01 sod[997]: 010c0025:5: Toggle from active to standby to active.
Sep 2 04:03:28 sccp bcm56xxd[22311]: 012c0015:6: Link: 1.2 is DOWN
Sep 2 04:03:28 sccp bcm56xxd[22311]: 012c0015:6: Link: 1.2 is DOWN
Sep 2 04:03:28 sccp bcm56xxd[22311]: 012c0015:6: Link: 1.1 is DOWN
Sep 2 04:03:28 sccp bcm56xxd[22311]: 012c0015:6: Link: 1.1 is DOWN
Sep 2 04:03:28 MQ1LTM01 lacpd[990]: 01160010:6: Link 1.1 removed from aggregation
Sep 2 04:03:28 MQ1LTM01 lacpd[990]: 01160010:6: Link 1.1 removed from aggregation
Sep 2 04:03:32 tmm tmm[933]: 01340002:3: HA Connection with peer 192.168.52.132:47998 lost.
Sep 2 04:03:32 tmm tmm[933]: 01340002:3: HA Connection with peer 192.168.52.132:47998 lost.
Sep 2 04:03:35 sccp bcm56xxd[22311]: 012c0015:6: Link: 1.2 is UP
Sep 2 04:03:35 sccp bcm56xxd[22311]: 012c0015:6: Link: 1.2 is UP
Sep 2 04:03:35 sccp bcm56xxd[22311]: 012c0015:6: Link: 1.1 is UP
Sep 2 04:03:35 sccp bcm56xxd[22311]: 012c0015:6: Link: 1.1 is UP
Sep 2 04:03:41 MQ1LTM01 lacpd[990]: 01160009:6: Link 1.1 added to aggregation
Sep 2 04:03:41 MQ1LTM01 lacpd[990]: 01160009:6: Link 1.1 added to aggregation
The log from the standby F5 (192.168.52.132):
Sep 2 04:02:10 tmm tmm[933]: 01340002:3: HA Connection with peer 192.168.52.131:1028 lost.
Sep 2 04:02:11 sccp bcm56xxd[221]: 012c0015:6: Link: 1.1 is UP
Sep 2 04:02:11 tmm tmm[933]: 01340002:3: HA Connection with peer 192.168.52.131:1028 lost.
Sep 2 04:02:11 sccp bcm56xxd[221]: 012c0015:6: Link: 1.2 is UP
Sep 2 04:02:11 MQ1LTM02 lacpd[990]: 01160009:6: Link 1.2 added to aggregation
Sep 2 04:02:17 MQ1LTM02 lacpd[990]: 01160009:6: Link 1.1 added to aggregation
Sep 2 04:03:19 tmm tmm[933]: 01340001:3: HA Connection with peer 192.168.52.131:1028 established.
Sep 2 04:03:31 sccp bcm56xxd[221]: 012c0015:6: Link: 1.2 is DOWN
Sep 2 04:03:31 MQ1LTM02 lacpd[990]: 01160010:6: Link 1.2 removed from aggregation
Sep 2 04:03:31 sccp bcm56xxd[221]: 012c0015:6: Link: 1.1 is DOWN
Sep 2 04:03:31 MQ1LTM02 lacpd[990]: 01160010:6: Link 1.1 removed from aggregation
Sep 2 04:03:32 tmm tmm[933]: 01340002:3: HA Connection with peer 192.168.52.131:1028 lost.
Sep 2 04:03:37 sccp bcm56xxd[221]: 012c0015:6: Link: 1.2 is UP
Sep 2 04:03:37 sccp bcm56xxd[221]: 012c0015:6: Link: 1.1 is UP
Sep 2 04:03:37 MQ1LTM02 lacpd[990]: 01160009:6: Link 1.2 added to aggregation
Sep 2 04:03:39 MQ1LTM02 sod[1000]: 010c0019:5: Active
Sep 2 04:03:40 MQ1LTM02 lacpd[990]: 01160009:6: Link 1.1 added to aggregation
Thanks
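
For anyone wanting to line up the two logs, here is a rough Python sketch that measures how long after the "HA Connection ... established" message the first "Link: ... is DOWN" message appears. It is only an illustration: the function names (`parse`, `seconds_from_ha_up_to_link_down`) are made up for this post, the message codes are simply the ones visible in the logs above, and the year is an arbitrary placeholder because the log lines do not carry one.

```python
import re
from datetime import datetime

# Matches lines shaped like the ones posted above, e.g.
# "Sep 2 04:03:28 sccp bcm56xxd[22311]: 012c0015:6: Link: 1.2 is DOWN"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+(?P<proc>\w+)\[\d+\]:\s+"
    r"(?P<code>[0-9a-f]{8}):\d+:\s+(?P<msg>.*)$"
)

HA_ESTABLISHED = "01340001"  # code on the "HA Connection ... established" lines
LINK_STATE = "012c0015"      # code on the "Link: x.y is UP/DOWN" lines


def parse(lines, year=2000):
    """Turn raw log lines into (timestamp, process, code, message) tuples.

    The year is an assumption; the log lines themselves do not include one.
    Lines that do not match the expected shape are skipped.
    """
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append((ts, m["proc"], m["code"], m["msg"].strip()))
    return events


def seconds_from_ha_up_to_link_down(events):
    """Seconds between the first HA 'established' event and the first
    link-down event at or after it, or None if either is missing."""
    ha_up = next((ts for ts, _, code, _ in events if code == HA_ESTABLISHED), None)
    if ha_up is None:
        return None
    down = next(
        (ts for ts, _, code, msg in events
         if code == LINK_STATE and msg.endswith("is DOWN") and ts >= ha_up),
        None,
    )
    return None if down is None else (down - ha_up).total_seconds()


active_log = [
    "Sep 2 04:03:19 tmm tmm[933]: 01340001:3: HA Connection with peer 192.168.52.132:47998 established.",
    "Sep 2 04:03:28 MQ1LTM01 sod[997]: 010c0025:5: Toggle from active to standby to active.",
    "Sep 2 04:03:28 sccp bcm56xxd[22311]: 012c0015:6: Link: 1.2 is DOWN",
    "Sep 2 04:03:32 tmm tmm[933]: 01340002:3: HA Connection with peer 192.168.52.132:47998 lost.",
]

print(seconds_from_ha_up_to_link_down(parse(active_log)))
```

Run against the excerpts above, this makes it easy to see that on both units the links drop within seconds of the HA connection coming up, which is the pattern being asked about.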

chris_miller · Answer

What interfaces are you using for network failover? Just the management interface? Since the active unit is itself failing over, I imagine we're triggering one of our failsafe conditions, be it gateway, VLAN, or a service.

What do you have defined as far as failsafe goes: VLANs? Gateway? HA groups? Also, are you using preferred redundancy state at all?

mw1 · Answer

Thanks for the reply. The devices are still on 9.x code, so no HA groups, and there is currently no VLAN or gateway failsafe config in place. The device is currently using the main network trunk for network failover, not the management interface. I know this is not best practice, and I'm awaiting an engineer to get to the DC to rig a dedicated interface for network failover.

I am using preferred redundancy state (active for the active unit, 192.168.52.131, and standby for the other). I did read that the log message "Toggle from active to standby to active." is expected if no redundancy state is defined, but as I have one defined, I don't know whether this points to an issue. Do you know of any default failsafe conditions on 9.x code that would cause the interfaces to go offline?

Thanks

chris_miller · Answer

Looking at one of my 9.x boxes, I only see "Restart Service," "Restart All," and "Fail Over and Restart." Can you verify your trunks on each F5? I imagine you have LACP with spanning tree in pass-through for those interfaces? I've had boxes that go from active to standby to active all in one log entry but can't remember why that was...if you have a support contract, they might be able to shed some light there.

hamish · Answer

A system going active/standby/active is usually because of a detection of an active/active situation. That little toggle of active/standby/active will ensure that any ARP caches are updated with the correct info, thanks to the gratuitous ARPs that are sent as a system goes active.

On the subject of 'Preferred Active': I'd lose it. It doesn't really work very well in my experience. In fact there are at least two scenarios in which it causes problems (e.g. the 'preferred active' box always comes up active and then you get an active/active situation, stuff like that).

In v9, if you have network failover, there's only one network that can be used for the HA heartbeat. Make sure you don't have problems with that link. I would generally advise putting it on a dedicated point-to-point network (with or without switches, but only use switches if the two boxes are out of reach of a single cable). In v10 you can have multiple HB networks set up. MUCH nicer and MUCH more stable. Run (don't walk) to upgrade to v10 just for that reason, IMO.

Hamish

pradeep_more_73 · Answer

Hi All,
I have a question related to network failover. I have two LTM 3600s running v11.2, and I want to configure them for network failover across different sites in our network, with one device at one site and the other at a second site.
Can you please help me with how I should configure them and how to test failover conditions? Does there always need to be a connection between the two devices for them to act in an active/standby state, and how can that be achieved in this case?
