
Eric_27158
Nov 24, 2010

GTM and DNS caching for UDP connections

Hey all, thanks for reading...

I've run into a situation where an outage of a GTM pool member causes problems with client-side DNS caching. For example, I have two nodes in a GTM pool, and both are running syslog. When one of them dies, it's removed from the pool and its IP is no longer handed out, but the DNS cache on the client side keeps holding onto an IP that's now unreachable. So our design problem is a balancing act (no pun intended). With a protocol like syslog, which uses UDP and sends more DNS requests the lower the TTL of the record is, do you lower the TTL to something like 5 seconds and hammer the F5 to death with DNS requests? Or is there another way to deal with the client-side DNS cache, or something that you all have determined is a "best practice" in situations like this?
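To put rough numbers on that balancing act, here is a minimal sketch, not a measurement. It assumes every client honors the TTL, re-resolves the name once each time the record expires, and does not share a resolver cache with other clients. The client count and the function names are made up for illustration.

```python
def dns_queries_per_second(clients: int, ttl_seconds: float) -> float:
    """Approximate steady-state query rate if every client re-resolves once per TTL."""
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return clients / ttl_seconds


def worst_case_stale_seconds(ttl_seconds: float) -> float:
    """Longest a TTL-honoring cache can keep using a record after fetching it."""
    return ttl_seconds


if __name__ == "__main__":
    clients = 1000  # hypothetical number of syslog senders
    for ttl in (5, 30, 300):
        rate = dns_queries_per_second(clients, ttl)
        stale = worst_case_stale_seconds(ttl)
        print(f"TTL {ttl:>3}s: ~{rate:.1f} queries/s, cached IP may be stale for up to {stale}s")
```

Under these assumptions, cutting the TTL by a factor of N multiplies the DNS load by N and shrinks the stale window by the same factor. A shared local DNS server in front of the clients would lower the load that actually reaches the GTM.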

 

 

Thanks and Happy Thanksgiving everyone...

Eric

8 Replies

  • No LTM available for use?

    Have you thought about just disabling the DNS cache on your client-side boxes (I'm guessing they are servers)? Here is an MS article on how to go about that:

    http://support.microsoft.com/kb/318803

  • Yes, the LTM is available... is there a way to handle this issue with it?

     

  • Well, it does allow us to come up with a more flexible solution for you. Would you mind describing your setup a little more? I understand you have two syslog boxes, and from the sounds of it, servers are logging to them in a round-robin fashion. Are they using publicly available IPs?

    The typical setup that I use is to give my servers an internal IP address, set them up as nodes, add them to the same pool, and then create a VIP using either a public or private IP address. If it is a public address, I then create a wide IP in the GTM for that service. If it is a private IP, I just go to my internal DNS servers, create an entry for whatever name I want to set up, and point it at the VIP that I created.

    So your servers would all be aimed at the LTM VIP that you created, and it would load balance across whatever servers you put in the pool assigned to it.
  • Thanks for the tip... but I think our use of the GTM is different enough that we cannot do this. More specifically, we are using the GTM basically as an LTM that does DNS-based load balancing. We don't really use the "global" portion of the load balancer, just the DNS stuff. We do this for one reason only - our LTM was designed to always do SNAT, which in the case of syslog is a problem since the original SrcIP is lost. RADIUS is an even bigger problem because a SrcIP + RADIUS key is required for authentication of the NAS. With either protocol, the same problem exists. So we've put the GTM in place of the LTM for cases like these, where we want to retain the original SrcIP of the session. So... with that requirement, is there some kind of best practice for DNS TTLs, or some non-SNAT workaround to avoid the issue altogether? Thanks again for your help, it's much appreciated.
  • I remember reading a post once about turning off SNAT for certain connections going through an LTM VIP. You might give this a read to see if it can be applied to your situation:

    http://devcentral.f5.com/Forums/tabid/1082223/asg/50/showtab/groupforums/afv/topic/aff/5/aft/24878/Default.aspx

    If it were HTTP traffic, then X-Forwarded-For would be the ticket... As far as a best practice for DNS TTLs goes, I have always used 30 seconds on my A records. I don't know if that is a best practice or not, but I haven't had any issues with it so far.
  • Thanks for your tips, naladar. Unfortunately, we have a bunch of TCP VIPs and RADIUS where the transaction requires the SrcIP to be intact and the response still has to work. Removing SNAT breaks the session if you're routing to the backend host. We are using the X-Forwarded-For field for HTTP/HTTPS, and you're right, that works great. As you can see, the GTM's ability to do DNS-style load balancing solves both of our issues but brings in the new issue of TTLs when a pool member fails: if a backend machine fails, the outage could be as long as the TTL for certain clients. I'll throw this question back out to the masses (hopefully others are reading!): what do people do with their DNS TTLs in cases where the GTM points to actual machines as pool members?
  • It really comes down to business requirements. How long an outage is acceptable? If DNS load balancing doesn't meet the requirements, then an LTM is a better candidate here, as being constrained by a DNS timeout on the LDNS (and on client machines) is tricky.
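One rough way to turn "how long is acceptable" into a TTL is sketched below. It assumes that clients and the LDNS honor the TTL and that the worst case is a client caching the dead member's IP just before the GTM marks it down. The detection time and both function names are hypothetical parameters for illustration, not GTM settings.

```python
def worst_case_outage_seconds(ttl_seconds: float, detection_seconds: float) -> float:
    """Upper bound on how long a client may keep sending to a dead pool member.

    detection_seconds: assumed time for health monitoring to mark the member down.
    ttl_seconds: TTL on the A record handed out for the name.
    """
    return detection_seconds + ttl_seconds


def max_ttl_for_budget(acceptable_outage_seconds: float, detection_seconds: float) -> float:
    """Largest TTL that keeps the worst case within the acceptable outage (0 if none fits)."""
    return max(0.0, acceptable_outage_seconds - detection_seconds)


if __name__ == "__main__":
    # Hypothetical numbers: 30 s TTL, 15 s to detect the failure.
    print(worst_case_outage_seconds(ttl_seconds=30, detection_seconds=15))
    # Hypothetical business requirement: at most 60 s of lost traffic per client.
    print(max_ttl_for_budget(acceptable_outage_seconds=60, detection_seconds=15))
```

If the TTL this leaves is too low to be practical, or if clients are known to ignore TTLs, that is the point where DNS-based load balancing stops meeting the requirement.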