Forum Discussion

Murat_AYDIN_133
Nimbostratus
Jun 18, 2014

Failed to set up cluster after upgrading from v10.2.4 to v11.4.1

Hi,

We have two BIG-IP 11050 devices that were running as an active/standby pair on v10.2.4, with the basic network configuration below:

f5coolsube1.bigip
  internal server VLAN (vlan2): floating IP 10.23.249.45, self IP 10.24.249.46
  external VLAN (vlan1): floating IP 10.24.249.5, self IP 10.24.246.6
  management IP: 10.230.0.13

f5coolsube2.bigip
  internal server VLAN (vlan2): floating IP 10.23.249.45, self IP 10.24.249.47
  external VLAN (vlan1): floating IP 10.24.249.5, self IP 10.24.246.7
  management IP: 10.230.0.14
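As a quick sanity check on the addresses listed above, the sketch below tests whether each floating IP shares a subnet with its self IP. The /24 prefix length is an assumption (the post gives no netmasks), and `same_subnet` and `report` are hypothetical helper names. If a pair still lands in different subnets under the real masks, the addresses as posted may contain a typo worth ruling out before troubleshooting the device trust.

```python
import ipaddress

# ASSUMPTION: the post states no netmasks, so /24 is only a guess.
ASSUMED_PREFIX = 24

# (floating IP, self IP) pairs copied verbatim from the post.
PAIRS = {
    "f5coolsube1 internal (vlan2)": ("10.23.249.45", "10.24.249.46"),
    "f5coolsube1 external (vlan1)": ("10.24.249.5", "10.24.246.6"),
    "f5coolsube2 internal (vlan2)": ("10.23.249.45", "10.24.249.47"),
    "f5coolsube2 external (vlan1)": ("10.24.249.5", "10.24.246.7"),
}


def same_subnet(floating: str, self_ip: str, prefix: int = ASSUMED_PREFIX) -> bool:
    """Return True if both addresses fall in the same network of this prefix length."""
    network = ipaddress.ip_network(f"{floating}/{prefix}", strict=False)
    return ipaddress.ip_address(self_ip) in network


def report(pairs=PAIRS, prefix: int = ASSUMED_PREFIX) -> dict:
    """Map each VLAN label to whether its floating and self IPs share a subnet."""
    return {name: same_subnet(f, s, prefix) for name, (f, s) in pairs.items()}


if __name__ == "__main__":
    for name, ok in report().items():
        print(f"{name}: {'same subnet' if ok else 'different subnets'}")
```

Changing `ASSUMED_PREFIX` to the real prefix length is the only adjustment needed to rerun the check.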

 

We upgraded these devices to v11.4.1 and loaded the v10.2.4 UCS with the no-license option. Now we are having problems setting up the cluster. At first we couldn't add each device as a peer on the other device, and we got these ltm logs:

Jun 18 07:29:44 f5coolsube1 mcpd[6447]: 0107157a:3: Only the self device can be moved.
Jun 18 07:29:44 f5coolsube1 err mcpd[6447]: 0107157a:3: Only the self device can be moved.
Jun 18 07:29:44 f5coolsube1 devmgmtd[7458]: 015a0000:3: failed on .sys_device: 0107157a:3: Only the self device can be moved.
Jun 18 07:29:44 f5coolsube1 devmgmtd[7458]: 015a0000:3: mcp operation failed: 0107157a:3: Only the self device can be moved.
Jun 18 07:29:44 f5coolsube1 err devmgmtd[7458]: 015a0000:3: failed on .sys_device: 0107157a:3: Only the self device can be moved.
Jun 18 07:29:44 f5coolsube1 err devmgmtd[7458]: 015a0000:3: mcp operation failed: 0107157a:3: Only the self device can be moved.
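To make log excerpts like the one above easier to filter, here is a minimal Python sketch that splits one line into its fields. The regular expression is inferred only from the lines quoted in this thread, not from any F5 format specification, and `LogLine` and `parse_line` are hypothetical names. In these excerpts the digit after the message ID lines up with the severity keyword (3 next to `err`, 5 next to `notice`), so the sketch keeps it as a numeric level.

```python
import re
from typing import NamedTuple, Optional


class LogLine(NamedTuple):
    timestamp: str
    host: str
    severity: Optional[str]  # "err", "notice", ... when the line carries one
    daemon: str
    pid: Optional[int]
    msg_id: Optional[str]    # e.g. "0107157a"
    level: Optional[int]     # digit after the message ID
    text: str


# Pattern inferred from the lines quoted in the post (an assumption, not a spec).
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+"
    r"(?:(?P<sev>emerg|alert|crit|err|warning|notice|info|debug)\s+)?"
    r"(?P<daemon>[\w.-]+)(?:\[(?P<pid>\d+)\])?:\s+"
    r"(?:(?P<msg_id>[0-9a-f]{8}):(?P<level>\d):\s+)?"
    r"(?P<text>.*)$"
)


def parse_line(line: str) -> Optional[LogLine]:
    """Parse one log line; return None if it does not match the pattern."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return LogLine(
        timestamp=m["ts"],
        host=m["host"],
        severity=m["sev"],
        daemon=m["daemon"],
        pid=int(m["pid"]) if m["pid"] else None,
        msg_id=m["msg_id"],
        level=int(m["level"]) if m["level"] else None,
        text=m["text"],
    )


if __name__ == "__main__":
    sample = ("Jun 18 07:29:44 f5coolsube1 err mcpd[6447]: "
              "0107157a:3: Only the self device can be moved.")
    print(parse_line(sample))
```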

 

Then we reset the device trust, and the peers appeared to be added to each other, but we got the logs below and the devices went to the disconnected state:

Jun 18 07:30:06 f5coolsube1 mcpd[6447]: 01071436:5: CMI listener established at 10.24.249.46 port 6699
Jun 18 07:30:06 f5coolsube1 notice mcpd[6447]: 01071436:5: CMI listener established at 10.24.249.46 port 6699
Jun 18 07:30:06 f5coolsube1 mcpd[6447]: 01071434:5: No CMI peer devices configured
Jun 18 07:30:06 f5coolsube1 notice mcpd[6447]: 01071434:5: No CMI peer devices configured
Jun 18 07:30:06 f5coolsube1 mcpd[6447]: 01071436:5: CMI listener established at 10.24.249.46 port 6699
Jun 18 07:30:06 f5coolsube1 notice mcpd[6447]: 01071436:5: CMI listener established at 10.24.249.46 port 6699
Jun 18 07:30:06 f5coolsube1 mcpd[6447]: 01071434:5: No CMI peer devices configured
Jun 18 07:30:06 f5coolsube1 notice mcpd[6447]: 01071434:5: No CMI peer devices configured
Jun 18 07:30:06 f5coolsube1 mcpd[6447]: 01071436:5: CMI listener established at 10.24.249.46 port 6699
Jun 18 07:30:06 f5coolsube1 notice mcpd[6447]: 01071436:5: CMI listener established at 10.24.249.46 port 6699
Jun 18 07:30:06 f5coolsube1 mcpd[6447]: 01071434:5: No CMI peer devices configured
Jun 18 07:30:06 f5coolsube1 notice mcpd[6447]: 01071434:5: No CMI peer devices configured
Jun 18 07:30:07 f5coolsube1 sod[7675]: 010c0053:5: Active for traffic group /Common/traffic-group-1.
Jun 18 07:30:07 f5coolsube1 notice sod[7675]: 010c0053:5: Active for traffic group /Common/traffic-group-1.
Jun 18 07:30:07 f5coolsube1 sod[7675]: 010c0019:5: Active
Jun 18 07:30:07 f5coolsube1 notice sod[7675]: 010c0019:5: Active
Jun 18 07:30:07 f5coolsube1 logger: /usr/bin/tmipsecd --tmmcount 12 ==> /usr/bin/bigstart start racoon
Jun 18 07:30:07 f5coolsube1 notice logger: /usr/bin/tmipsecd --tmmcount 12 ==> /usr/bin/bigstart start racoon
Jun 18 07:31:57 f5coolsube1 mcpd[6447]: 0107143c:5: Connection to CMI peer 10.24.249.47 has been removed
Jun 18 07:31:57 f5coolsube1 notice mcpd[6447]: 0107143c:5: Connection to CMI peer 10.24.249.47 has been removed
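In the excerpt above, every message appears twice in a row: once plain and once with a severity keyword such as `notice` after the hostname. A small sketch like the following drops only the tagged copy that directly follows its plain twin, so genuine repeats (the CMI listener message really is logged three times) survive. The pattern is inferred from this thread alone, and `collapse_pairs` is a hypothetical helper name.

```python
import re

# Matches "<timestamp> <host> <severity> " directly in front of "<daemon>[pid]:".
# Inferred from the lines quoted in the post (an assumption, not a spec).
SEV_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+)"
    r"(?:emerg|alert|crit|err|warning|notice|info|debug)\s+"
    r"(?=[\w.-]+(?:\[\d+\])?:)"
)


def collapse_pairs(lines):
    """Drop a severity-tagged line when it directly follows its plain twin."""
    kept = []
    for line in lines:
        # Remove the severity keyword, if any, to get the plain form.
        plain = SEV_RE.sub(r"\1", line, count=1)
        is_tagged = plain != line
        if is_tagged and kept and kept[-1] == plain:
            continue  # adjacent duplicate of the previous plain line
        kept.append(line)
    return kept


if __name__ == "__main__":
    sample = [
        "Jun 18 07:30:07 f5coolsube1 sod[7675]: 010c0019:5: Active",
        "Jun 18 07:30:07 f5coolsube1 notice sod[7675]: 010c0019:5: Active",
    ]
    for line in collapse_pairs(sample):
        print(line)
```

Pairs that are not adjacent, or tagged lines with no plain twin, are left untouched on purpose.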

 

After these logs we reset the device trust again, and f5coolsube1 kept rebooting itself until we installed the old configuration.

Jun 18 07:34:44 f5coolsube1 overdog[6174]: 01140029:5: HA nic_failsafe tmm9 fails action is reboot.
Jun 18 07:34:44 f5coolsube1 overdog[6174]: 01140043:0: Ha feature nic_failsafe reboot requested.
Jun 18 07:34:44 f5coolsube1 overdog[6174]: 01140029:5: HA nic_failsafe tmm10 fails action is reboot.
Jun 18 07:34:44 f5coolsube1 overdog[6174]: 01140043:0: Ha feature nic_failsafe reboot requested.
Jun 18 07:34:44 f5coolsube1 overdog[6174]: 01140029:5: HA nic_failsafe tmm11 fails action is reboot.
Jun 18 07:34:44 f5coolsube1 overdog[6174]: 01140043:0: Ha feature nic_failsafe reboot requested.
Jun 18 07:34:44 f5coolsube1 overdog[6174]: 01140102:2: Overdog daemon requests reboot.
Jun 18 07:34:44 f5coolsube1 overdog[6174]: 01140104:5: Watchdog touch disabled.
Jun 18 07:34:44 f5coolsube1 notice overdog[6174]: 01140029:5: HA nic_failsafe tmm9 fails action is reboo
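To see how quickly the failure escalated, the timestamps quoted in this post can be lined up on a single timeline. The sketch below is only an illustration: the event labels are shorthand chosen here, the year comes from the post date (syslog timestamps carry none), and `parse_stamp` and `offsets` are hypothetical helper names. Month-name parsing with `%b` assumes an English or C locale.

```python
from datetime import datetime

# Year taken from the post date; the log lines themselves do not include one.
POST_YEAR = 2014

# Timestamps copied from the post; labels are shorthand, not log text.
EVENTS = [
    ("Jun 18 07:29:44", "peer add rejected (0107157a)"),
    ("Jun 18 07:30:06", "CMI listener established"),
    ("Jun 18 07:31:57", "connection to CMI peer removed"),
    ("Jun 18 07:34:44", "overdog requests reboot (nic_failsafe)"),
]


def parse_stamp(stamp: str, year: int = POST_YEAR) -> datetime:
    """Parse a syslog-style timestamp, supplying the missing year."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def offsets(events):
    """Return (label, seconds since the first event) for each event."""
    times = [parse_stamp(stamp) for stamp, _ in events]
    start = times[0]
    return [
        (label, int((t - start).total_seconds()))
        for t, (_, label) in zip(times, events)
    ]


if __name__ == "__main__":
    for label, seconds in offsets(EVENTS):
        print(f"+{seconds:>4d}s  {label}")
```

Read this way, the peer connection was dropped a little under two minutes after the listener came up, and the reboot request followed five minutes after the first error.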

 

Any advice on how to solve this issue?

 

4 Replies

  • We recently upgraded a pair of lesser appliances from 10.2.2 HF3 (Partitions) to 11.4.1 HF3. Our high-level procedure, which was reviewed and modified by F5 consulting services, was this:

    1. Disconnect the standby device.
    2. Rebuild the standby device with v10.2.2 media (Volumes).
    3. Renew the device certificate with an expiration of 10 years.
    4. Load volume 1.2 with 10.2.2 HF3 and load volume 1.3 with 11.4.1 HF3.
    5. Boot to volume 1.2 and import the config from the UCS file (fall-back volume).
    6. Boot to volume 1.3 and migrate the config from volume 1.2 (dropdown menu selection).
    7. Disable the interfaces on the active device; enable the interfaces on the newly rebuilt device (essentially a manual failover).
    8. Rebuild the device that is now standby using the same procedure, EXCEPT that when you boot to volume 1.3 you DO NOT migrate the config.
    9. When the offline/standby device boots to 11.4.1 HF3, build the network components manually to match the required configuration.
    10. Renew the device certificate with an expiration of 10 years.
    11. Build the device trust/peer list from scratch and create a device group for sync/failover.

    I've never seen anyone say they migrated from v10.2.2 or earlier to v11.x without having to completely rebuild the device trust and pair the devices up from scratch. Hope this helps. If you have specific questions about the device trust/device group/peer list process, what are they?

     

    • JG
      Cumulonimbus
      Thanks very much for sharing this. I wonder what the purpose of step 2 is. What specific benefits does that step bring? It seems that the strategy is to create a volume for each of v10.* and v11.4.1 with the default configuration. What is meant by "load" in step 4? Is that simply to apply v10.2.2 HF3 and install v11.4.1 (with HF3), each to another volume, without running "switchboot"? And is it a new feature in v11.4.1 that one can migrate the config of a previous version from a non-active boot volume?
    • Steve_M__153836
      Nimbostratus
      Apologies, for some reason I used some odd verbiage. "Load" should be "Install". For step 2, we had devices that were built with Partitions instead of Volumes, so we had to rebuild them from scratch. Otherwise it would just have been a matter of installing to other volumes. We wanted to create a vanilla 10.2.2 volume, a 10.2.2 HF3 volume to match our existing configuration (importing the config there as a fall-back volume), and then the 11.4.1 HF3 volume as our migration volume. When you boot into a v11 volume from a v10 volume via the GUI, you get no choice: the config is migrated to the v11 volume whether you want it or not. There are probably command-line ways around that, but we wanted it to migrate our config. Hope this clarifies my post a bit.
    • HR_38560
      Nimbostratus
      We have the same issue. Did you manage to fix it?