LTM nodes down: ICMP monitor fails, but ping from TMSH succeeds!

Question

Hi,
I have two VIPRION chassis (BIG-IP C4480) with 4 vCMP guests, version 12.1.2. One vCMP guest suddenly marked all nodes down, and the reason is that the ICMP health monitors are failing. If I change the health monitor on a specific node to TCP, the node comes up! If I ping the node from the tmsh CLI, it succeeds. So the node is up and reachable from the BIG-IP vCMP guest.
Beyond that, the node status shows the reason for the probe failure:
Offline (Enabled) /Common/icmp: sendto(): Bad file descriptor; No successful responses received before deadline. @2018/06/19 11:26:04.
What is the meaning of Bad file descriptor in this context?
I saw something similar in the previous version (though not in the node status) when I enabled monitor logging on the node, but monitor logging is not enabled now. One reason for the upgrade was to fix this issue.
On the host chassis I ran dmesg and saw the following messages:
SELinux: initialized (dev sda1, type ext2), uses xattr
linux-kernel-bde 0000:17:00.0: vpd r/w failed.  This is likely a firmware bug on this device.  Contact the card vendor for a firmware update.
linux-kernel-bde 0000:19:00.0: vpd r/w failed.  This is likely a firmware bug on this device.  Contact the card vendor for a firmware update.
EXT2-fs warning: maximal mount count reached, running e2fsck is recommended
Jun 19 03:11:06 slot1/LB01A notice pendsect[12368]: pendsect: /dev/sda no Pending Sectors detected
Should I run e2fsck?
This is not the first time, and this time it was worse: all nodes went down on 2 of the 4 vCMP guests! What is the reason for this? What can I do to fix it if it happens again?
I appreciate your comments.
Kind Regards,
LFR

leonardo_souza · Answer

"What is the meaning of Bad file descriptor in this context?"
In your case, it probably means that bigd was trying to write to a file descriptor that had already been closed.
This may be caused by the interval settings you have, or by a software bug.
Can you post the ICMP monitor settings you have?
There is this bug, but it does not apply to the version you have:
https://support.f5.com/csp/article/K48693281
"Should I run e2fsck?"
The physical disk is in the vCMP host; the vCMP guests only have virtual disks, which are basically files on the vCMP host disk.
If there were a problem with the disk, it would probably affect all vCMP guests and, within the guests, more than just the bigd process.
Anyway, you don't lose anything by checking.
You can use the platform diagnostics for that, which is the more user-friendly option:
https://support.f5.com/csp/article/K15442
However, there is also the smartctl command.
Don't forget to run the disk test on the vCMP host.
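
As a rough illustration of what "Bad file descriptor" (errno EBADF) means in general, the shell sketch below opens a descriptor, closes it, and then tries to write to it again. This is only an analogy: it does not reproduce bigd's sendto() call, and the scratch file and descriptor number 9 are arbitrary choices for the demo.

```shell
#!/bin/sh
# Illustration only: provoke EBADF by writing to a closed descriptor.
# bigd is not involved; fd 9 and the scratch file are arbitrary choices.
LC_ALL=C
export LC_ALL                      # keep the error text in English

scratch=$(mktemp)                  # temporary file to attach the descriptor to
exec 9>"$scratch"                  # open fd 9 for writing
echo "first write" >&9             # succeeds while fd 9 is open
exec 9>&-                          # close fd 9

# Writing again fails. Capture the shell's error message; "|| true" keeps
# the script going even if it is run under "set -e".
err=$( { echo "second write" >&9; } 2>&1 ) || true
echo "$err"

rm -f "$scratch"
```

The exact wording of the captured message depends on the shell, but it should contain "Bad file descriptor", the same errno text that appears in the monitor status.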

boneyard · Answer

"This is not the first time, and this time it was worse: all nodes went down on 2 of the 4 vCMP guests! What is the reason for this? What can I do to fix it if it happens again?"
Contact F5 support NOW! Issues like this are most likely due to hardware problems; you want F5 support to guide you through the correct diagnostic steps and, if needed, initiate a hardware replacement as soon as possible.

luis_ribeiro · Answer

Hi,
I opened a case with F5 support, and I need to upgrade.
There are bugs similar to my problem that have a temporary workaround.
The 2 most common bugs related to this error message:
https://cdn.f5.com/product/bugtracker/ID681499.html
https://cdn.f5.com/product/bugtracker/ID620079.html
In my case the problem is similar to this bug:
Bug ID 620079: Removing route-domain may cause monitors to fail.
The workaround described for this bug works in my case:
bigstart restart bigd

ICMP monitoring starts working again, and there is no impact on traffic processing. This only restarts the process responsible for monitoring.
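
The workaround above could be wrapped in a small check so that bigd is only restarted when this specific symptom is present. The sketch below is an illustration under assumptions: the helper name `needs_bigd_restart` is made up for this example, the sample text is the status line from the original post, and the commented-out part assumes that the output of `tmsh show ltm node` includes the failure reason on the affected guest, which should be verified before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the status text on stdin shows the
# "sendto(): Bad file descriptor" symptom from the original post.
needs_bigd_restart() {
    grep -qF 'sendto(): Bad file descriptor'
}

# Sample status line copied from the original post.
sample='Offline (Enabled) /Common/icmp: sendto(): Bad file descriptor; No successful responses received before deadline. @2018/06/19 11:26:04.'

if printf '%s\n' "$sample" | needs_bigd_restart; then
    verdict="restart bigd"
else
    verdict="leave bigd alone"
fi
echo "$verdict"

# On an affected vCMP guest the same check might be used like this
# (left commented out because it needs tmsh and bigstart):
# if tmsh show ltm node | needs_bigd_restart; then
#     bigstart restart bigd
# fi
```

With the sample line, the script prints "restart bigd". A status line without the error text would leave the verdict at "leave bigd alone".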
