Saving UCS on vCMP guest hangs
OK, I have a vCMP-enabled Viprion 2400 which runs 4 guests (3 LTM, 1 LTM + APM). When I run tmsh save sys ucs, the config_save process hangs.
Now a bit more detail. We have 30 BIG-IP "boxes" across our production environment, most running 10.2.4 and some running 11.6; a few are VEs and a few are vCMP guests. So we have a mix of hardware platforms, including two new Viprion 2400 chassis with two blades in each, configured as a redundant pair.
I have a daily UCS backup job running from a remote Linux box. It's a bash script kicked off by cron every evening. It connects to each BIG-IP appliance via SSH, executes the "tmsh save sys ucs" command, copies the archive off with SCP, cleans up locally, and then does the same thing for an SCF file. This script ran with great success for almost a year, until I added the Viprions to the environment.
It appears that at some point saving the UCS file hangs, so when I try to save a UCS the day after something has gone wrong, it fails to create an archive. If I do it from the GUI I get no error whatsoever. For a while I can see the standard icon with the down pointer and the default message: "Receiving configuration data from your device". After a few minutes it just disappears with no errors (and no UCS file is created in /var/local/ucs). When I execute tmsh save sys ucs from the command line, I get the following message:
Waiting for process config_save (pid:23131) to complete.
Waiting for process config_save (pid:23131) to complete.
Waiting for process config_save (pid:23131) to complete.
Waiting for process config_save (pid:23131) to complete.
It looks like a previous config_save job has hung and never finished, so a new one is unable to start. In other words, something hangs in the process responsible for saving the UCS file, and that blocks every later job. There is nothing in the LTM or audit logs, only a reference that config_save is now executing...
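For anyone running a similar remote backup job, here is a minimal sketch of how the script could at least detect this state and avoid blocking forever. It is not my actual script and not an F5-recommended method: the function names, the root login, and the 600-second limit are all assumptions for illustration. It relies only on GNU timeout, ssh, grep, and sed on the Linux side.

```shell
#!/bin/sh
# Sketch only: function names, the root login and the 600-second
# default limit are assumptions, not part of the original backup script.

# Succeeds if the given tmsh output contains the
# "Waiting for process config_save" message.
is_config_save_stuck() {
    printf '%s\n' "$1" | grep -q 'Waiting for process config_save'
}

# Prints the pid reported in the first "Waiting for process config_save" line.
stuck_pid() {
    printf '%s\n' "$1" \
        | sed -n 's/.*config_save (pid:\([0-9][0-9]*\)).*/\1/p' \
        | head -n 1
}

# Saves a UCS on a remote BIG-IP ($1 = host, $2 = archive name),
# giving up after $3 seconds (default 600).
# GNU timeout exits with status 124 when the limit is reached.
save_ucs() {
    timeout "${3:-600}" ssh -o BatchMode=yes "root@$1" "tmsh save sys ucs $2" 2>&1
}
```

In the nightly job, the output of save_ucs could be passed to is_config_save_stuck, and a match (or an exit status of 124) logged together with the pid from stuck_pid before moving on to the next box. This only reports the hang sooner; it does not fix whatever is blocking config_save on the guest.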
Redeploying (essentially rebooting) the vCMP guest corrects the issue for some time, but then it reappears at random. You can never tell when. Sometimes it happens a few days later and sometimes I can go a month without an issue, but sooner or later it will happen. This only happens on the Viprions, never on the physical boxes or the VEs. Also, saving an SCF file never fails or hangs on these vCMP guests; only UCS is affected.
I have a case open with F5 support and they haven't been able to determine the root cause. The proposed solution was to provision the APM module and then remove it. According to them there was a bug in 11.4 (if I am not mistaken) which could cause saving a UCS to hang if the APM module had previously been provisioned on the system. I am on the latest 11.6 HF at the moment. Also, APM was never provisioned on 3 out of the 4 vCMP guests, so I doubt that it's going to work.
So I really wonder what the heck is going on and why config_save hangs. Has anyone come across a similar issue with Viprion and vCMP? Please share your thoughts and ideas!