Forum Discussion

Jordan_Bean_883
Nimbostratus
Feb 17, 2009

NAT not working, possible ARP issues

We have a pair of older 4U BIG-IPs running BIG-IP Kernel 4.2PTF-10 Build95 in active/passive mode. We previously used them without issues, with one interface as the external interface and the other as the internal interface. Failover, etc. worked fine.

 

Now we have both set up in single-arm mode, each homed to a different Cisco switch (we call them CORE1 and CORE2). We have EtherChannel set up, and the 200 Mbps link is a trunk. All connectivity seems to be working fine.

 

We recently added a second IP address to a server that's being load balanced and set up a NAT translation for it. It works for about 5-10 minutes, and then we get timeouts and "destination host unreachable" errors. From the ARP table in the switches, we see that the BIG-IP is responding to the ping requests. If we SSH into the BIG-IP and ping the address, we get "host down", even though other hosts on the same subnet can ping the IP. An "arp -a" shows the IP address with "(incomplete)" listed. If we swap the primary and secondary IPs on the server (so traffic is generated from the new IP), things start working.
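
To keep an eye on when the entry goes "(incomplete)", something like the following could be run against saved "arp -a" output. This is only a rough sketch: it assumes BSD-style lines of the form "host (ip) at mac ...", and the sample addresses and the interface name exp0 are made up for illustration, not taken from our boxes.

```python
import re
from typing import List

# Assumed line format (BSD-style `arp -a`), e.g.
#   ? (10.0.0.5) at 00:11:22:33:44:55 on exp0
#   ? (10.0.0.9) at (incomplete) on exp0
ARP_LINE = re.compile(r"\((?P<ip>\d{1,3}(?:\.\d{1,3}){3})\)\s+at\s+(?P<mac>\S+)")


def incomplete_entries(arp_output: str) -> List[str]:
    """Return the IPs whose ARP entry is unresolved ("(incomplete)")."""
    ips = []
    for line in arp_output.splitlines():
        match = ARP_LINE.search(line)
        # Strip () or <> so both "(incomplete)" and "<incomplete>" are caught.
        if match and match.group("mac").strip("()<>").lower() == "incomplete":
            ips.append(match.group("ip"))
    return ips


# Made-up sample output; the addresses and interface name are hypothetical.
SAMPLE = """\
? (10.0.0.5) at 00:11:22:33:44:55 on exp0
? (10.0.0.9) at (incomplete) on exp0
"""

if __name__ == "__main__":
    print(incomplete_entries(SAMPLE))
```

Running it from cron every minute and logging the result would show whether the entry drops out on the same 5-10 minute cycle as the timeouts.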

 

What is unique about this setup is that the servers having issues are blade servers. They are connected to a switch in the chassis, which is dual-homed into the same two routers as the BIG-IPs and runs STP.

 

As I write this, the passive LB is showing the node's ICMP as being down while the primary shows the node as up.

 

I definitely think this is an ARP problem, but I'm not sure what to do.

2 Replies

  • I'd be leery of using redundant NICs with BIG-IP. BIG-IP records the MAC address that traffic was received from and uses it for the response, as part of the "auto lasthop" functionality. You can get a bit more info from SOL3182.

     

     

    Aaron
  • Oh, the blade switch is running STP, so one of the links is inactive. Yesterday we put a VMware VM on the VLAN behind the LB and all seems fine. Something odd seems to be happening with ARP and the blade switch. We're going to contact the vendor.
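
For anyone following the "auto lasthop" point in the first reply, here is a toy model of the behaviour as described there: remember the MAC a client's traffic arrived from and send the reply back to that MAC. It is an illustration only, not F5's implementation; the class name, the per-client-IP keying, and the sample MACs are all assumptions.

```python
from typing import Callable, Dict


class LastHopTable:
    """Toy model: remember which MAC each client's traffic arrived from."""

    def __init__(self) -> None:
        self._last_hop: Dict[str, str] = {}

    def on_receive(self, client_ip: str, src_mac: str) -> None:
        # Record (or overwrite) the MAC the latest packet came from.
        self._last_hop[client_ip] = src_mac

    def reply_mac(self, client_ip: str, arp_lookup: Callable[[str], str]) -> str:
        # Prefer the recorded last hop; otherwise fall back to a normal lookup.
        recorded = self._last_hop.get(client_ip)
        return recorded if recorded is not None else arp_lookup(client_ip)


if __name__ == "__main__":
    table = LastHopTable()
    # Hypothetical host with two NICs: traffic first arrives from NIC A.
    table.on_receive("10.0.0.9", "00:11:22:33:44:aa")
    # Even if ARP would now answer with NIC B, the reply goes back to NIC A.
    print(table.reply_mac("10.0.0.9", lambda ip: "00:11:22:33:44:bb"))
```

In this model the reply keeps going to the recorded MAC until new traffic arrives from the other NIC, which is why redundant NICs can produce surprising results.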