Forum Discussion

alex100_194614
May 27, 2015

Saving UCS on vCMP guest hangs

OK, I have a vCMP-enabled Viprion 2400 which runs 4 guests (3 LTM, 1 LTM + APM). When I run tmsh save sys ucs, config_save hangs. Now a bit more detail: we have 30 BIG-IP "boxes" across our production environment, most running 10.2.4 and some running 11.6; a few are VE and a few are vCMP guests. So we have a mix of hardware platforms, including two new Viprion 2400 chassis with two blades each, configured as a redundant pair.

I have a daily UCS backup job running from a remote Linux box. It's a bash script kicked off from cron every evening which connects to the BIG-IP appliances via SSH, executes "tmsh save sys ucs", does an SCP, cleans up locally, and then does the same thing creating and copying an SCF file. This script ran with great success for almost a year, until I added the Viprions to the environment.

It appears that at some point saving the UCS file hangs, so when I try to save a UCS the next day, something goes wrong and it fails to create an archive. If I do it from the GUI I get no error whatsoever. For a while I see the standard icon with the down pointer and the default message "Receiving configuration data from your device"; after a few minutes it just disappears with no error (and no UCS file created in /var/local/ucs). When executing tmsh save sys ucs from the command line, I get the following message:

Waiting for process config_save (pid:23131) to complete.
Waiting for process config_save (pid:23131) to complete.
Waiting for process config_save (pid:23131) to complete.
Waiting for process config_save (pid:23131) to complete.

It looks like a previous config_save job hung and never finished, and now a new one cannot start. So something gets stuck in the process responsible for saving the UCS file, which prevents subsequent jobs from running. There is nothing in the LTM or audit logs, only a reference that config_save is now executing...
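For context, the nightly job I mentioned above boils down to something like the loop below. This is a simplified sketch, not the production script; the host list, user, and paths are placeholders:

    #!/bin/bash
    # Simplified sketch of the nightly backup loop (placeholder hosts and paths)
    HOSTS="bigip-a.example.net bigip-b.example.net"
    DEST=/backups/f5
    STAMP=$(date +%Y%m%d)

    for host in $HOSTS; do
        # create the UCS archive on the box, copy it off, then remove the remote copy
        ssh root@"$host" "tmsh save sys ucs /var/local/ucs/${host}-${STAMP}.ucs"
        scp "root@${host}:/var/local/ucs/${host}-${STAMP}.ucs" "$DEST/"
        ssh root@"$host" "rm -f /var/local/ucs/${host}-${STAMP}.ucs"

        # same idea for the SCF file (written under /var/local/scf by default)
        ssh root@"$host" "tmsh save sys config file ${host}-${STAMP}.scf"
        scp "root@${host}:/var/local/scf/${host}-${STAMP}.scf" "$DEST/"
        ssh root@"$host" "rm -f /var/local/scf/${host}-${STAMP}.scf"
    done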

 

Redeploying (essentially rebooting) the vCMP guest corrects the issue for some time, but then it reappears randomly; you can never tell when. Sometimes it happens a few days later and sometimes I can go a month without an issue, but sooner or later it happens. This only occurs on the Viprion, never on the physical appliances or the VEs. Also, saving an SCF file never fails or hangs on these vCMP guests; only UCS is affected.

 

I have a case open with F5 support and they have not been able to determine a root cause. The proposed solution was to provision the APM module and then remove it. According to them there was a bug in 11.4 (if I am not mistaken) which could cause the UCS save to hang if APM had previously been provisioned on the system. I am on the latest 11.6 HF at the moment. Also, APM was never provisioned on 3 out of the 4 vCMP guests, so I doubt this is going to work.
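If I understand the suggestion correctly, the workaround would amount to toggling APM provisioning, something along these lines (my reading of their proposal, not a procedure they gave me verbatim):

    # provision APM, let the provisioning change settle, then deprovision it again
    # note: provisioning changes restart services, so this needs a maintenance window
    tmsh modify sys provision apm level nominal
    tmsh modify sys provision apm level none
    tmsh save sys config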

 

So I really wonder what the heck is going on and why config_save hangs. Has anyone come across a similar issue with Viprion and vCMP? Please share your thoughts and ideas!

 

8 Replies

  • According to them there was a bug in 11.4 (if I am not mistaken) which could cause the UCS save to hang if APM had previously been provisioned on the system. I am on the latest 11.6 HF at the moment. Also, APM was never provisioned on 3 out of the 4 vCMP guests, so I doubt this is going to work.

     

    I understand you mean ID453545/sol16089. If it does not fix it (i.e., the problem comes back), could you ask support to also check ID521272?

     

    ID521272 AuthTokenWorker causes OutOfMemory if AuthTokens requested at high rate

     

  • 11.6.0 has a detailed statistics reporting engine for troubleshooting guest details. Maybe this will help?

     

    "Description

     

    As of BIG-IP 11.6.0, the vCMP hypervisor can view detailed guest performance statistics such as Disk usage, CPU usage and Network Throughput using Analytics, also called Application Visibility and Reporting (AVR). You can use AVR to view current and trending data regarding vCMP guest resource and network utilization. You can generate PDF reports and either download or email them from the BIG-IP system."

     

    https://support.f5.com/kb/en-us/solutions/public/15000/600/sol15684.html

     

    My two cents: if all four guests are running on the same blade, and the blade has a spinning disk (not SSD), you might get better results by running a staggered UCS save. That is, do guest 1, and only once that completes begin the UCS save on guest 2 (see the sketch below). This would avoid what appears to be a problem with initiating a save on guests 1 and 2 at the same time. (Probably not what you want to hear, but this should work until the concurrent-save issue is sorted out.)
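    Something like this would keep the saves strictly sequential, only moving on once config_save has finished on the previous guest (an illustrative sketch; guest names are placeholders):

        for guest in guest1.example.net guest2.example.net; do
            # wait until no config_save is still running on this guest before starting its save
            while ssh root@"$guest" "pgrep -f config_save >/dev/null"; do
                sleep 30
            done
            ssh root@"$guest" "tmsh save sys ucs nightly.ucs"
        done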

     

    • alex100_194614
      Thanks for the info on Analytics. I am already running the UCS save job in a staggered way; my script runs one job at a time, traversing down the inventory list. For now I have been able to find a workaround.
  • So, I was able to find a workaround for now by manually killing the config_save process. Previously I had tried to kill it with a plain kill, which did not seem to work; however, kill -9 does the trick (see the sketch below). Once the process is killed, things seem to go back to normal. I have not encountered any issues for a week, but I have a feeling it's just a matter of time until the problem recurs. All subsequent UCS save jobs are running OK so far. I will keep an eye on it for now.
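    For anyone hitting the same thing, the cleanup I do before retrying is roughly this (a rough sketch of my workaround, not an F5-recommended procedure):

        # look for a config_save left over from the previous (hung) save attempt
        pid=$(pgrep -f config_save)
        if [ -n "$pid" ]; then
            # a plain kill did nothing here; only SIGKILL cleared it
            kill -9 $pid
        fi
        # after that, the save runs normally again
        tmsh save sys ucs manual-backup.ucs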

     

  • So, I am back to the point where I started. The problem is back and is recurring on multiple vCMP guests. SOL16089 did not solve it, so I am reopening the case with F5 support. In the meantime I have made a few observations. The problem seems to resurface when configuration changes are made to the system: I had been working on the SNMP agent configuration on all guests, and suddenly the issue re-emerged on two of the four vCMP guests. However, I could be wrong and it could just be a coincidence. Analyzing the situation, I have come to the conclusion that it has to be related to the fact that I have a system with multiple blades. It seems to me it has something to do with inter-blade communication, but I could be wrong...

     

    I will post some updates as I progress towards the resolution.

     

  • Update:

     

    HF5 seemed to address this issue for the most part; however, we still had one single instance of the backup process going into a zombie state. Now we are on 11.6 HF6 and the issue with hanging UCS saves seems to be completely resolved. I have also noticed that UCS creation takes significantly less time after upgrading to the latest HF.