Practical considerations for using Azure internal load balancer and BIG-IP

Background

I recently had a scenario that required me to do some testing and I thought it would be a good opportunity to share. A user told me that he wants to put BIG-IP in Azure, but he has a few requirements:

 

  1. He wants to use an Azure Load Balancer (ALB) to ensure HA for his BIG-IP pair. This makes failover times faster in Azure, compared to other options.
  2. He does not want to use an external Load Balancer. He has internet-facing firewalls that will proxy inbound traffic to BIG-IP, so there is no need to expose BIG-IP to the internet. He only needs internal BIG-IPs to provide the app services he requires.
  3. He does not want his traffic SNAT'd. He wants app servers to see the true client IP.
  4. Ideally he does not want to automate an update of Azure routes at time of failover.
  5. He would like to run his BIG-IP pair Active/Active, but could also run Active/Standby.

 

Quick side note: Why would we use ALBs when deploying BIG-IP? Isn't that like putting a load balancer in front of a load balancer? In this case we're using the ALB as a basic Layer 3/4 traffic disaggregator: it simply sends traffic to multiple BIG-IP appliances and provides failover in the case of a VM-level failure. However, we want our traffic to traverse the BIG-IPs when we need advanced app services: TLS handling, authentication, advanced health monitoring, sophisticated load balancing methods, etc.

 

Let's analyze this!

Firstly, I put together a quick demo to easily show him how to deploy BIG-IP behind an ALB. My demo uses an external (internet-facing) ALB and an internal ALB. It is based on the official template provided by F5, but additionally deploys an app server and configures the BIG-IPs with a route and an AS3 declaration (a minimal sketch of such a declaration follows below). If it weren't for the internal-only part, we would have met his requirements with this setup:

 

 

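For reference, here's a rough sketch of the kind of AS3 declaration a demo like this might push to the BIG-IPs: one HTTP virtual server, a pool containing the app server, SNAT enabled, and X-Forwarded-For inserted so the app still sees the client IP. The tenant/app names, addresses, and credentials below are placeholders I made up for illustration, not the values from the template.

```python
# Minimal sketch: post an AS3 declaration to a BIG-IP's AS3 endpoint.
# All names, IPs and credentials are placeholders (assumptions).
import requests
import urllib3

urllib3.disable_warnings()  # lab only: self-signed BIG-IP management certificate

BIGIP_MGMT = "https://10.0.0.11"   # BIG-IP management address (placeholder)
AUTH = ("admin", "<password>")

as3_declaration = {
    "class": "AS3",
    "action": "deploy",
    "declaration": {
        "class": "ADC",
        "schemaVersion": "3.0.0",
        "id": "internal-lb-demo",
        "demoTenant": {
            "class": "Tenant",
            "demoApp": {
                "class": "Application",
                "template": "http",
                "serviceMain": {
                    "class": "Service_HTTP",
                    "virtualAddresses": ["10.0.1.100"],  # VIP on the BIG-IP external subnet
                    "pool": "app_pool",
                    "snat": "auto",                      # or "none" to preserve the client IP
                    "profileHTTP": {"use": "xff_http"}
                },
                "xff_http": {
                    "class": "HTTP_Profile",
                    "xForwardedFor": True                # pass the client IP in the XFF header
                },
                "app_pool": {
                    "class": "Pool",
                    "monitors": ["http"],
                    "members": [{"servicePort": 80, "serverAddresses": ["10.0.2.10"]}]
                }
            }
        }
    }
}

resp = requests.post(f"{BIGIP_MGMT}/mgmt/shared/appsvcs/declare",
                     json=as3_declaration, auth=AUTH, verify=False)
resp.raise_for_status()
print(resp.json()["results"])
```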
Constraints

However, this user's case requires no internet-facing BIG-IP. And now we hit 2 problems:

  1. Azure will not allow 2x internal LBs in the same Availability Set; you'll get a conflict with the error NetworkInterfacesInAvailabilitySetUseMultipleLoadBalancersOfSameType. So the diagram above cannot be re-created with 2x internal Azure LBs.
  2. When using only 1x internal LB, you "should only have one inbound rule if that rule loadbalances across all ports and protocols." Since we want at least one rule for all ports (for outbound traffic from the servers), we cannot also have individual LB rules for our apps. Trying to do so will fail with the error message quoted above.

Alternatives

This leaves us with a few options. We can meet most of his requirements, but there is one constraint I have not been able to overcome: if your cluster of BIG-IPs is Active/Active, you will need to source-NAT (SNAT) traffic to ensure the response traffic traverses the same BIG-IP. I'll discuss three options and how they meet the requirements.

  1. Use a single Azure internal LB. At the BIG-IP, SNAT the traffic to the web server and send an XFF header in place of the true client IP. The default route can be the firewall, so server-initiated traffic like patch updates can still reach the internet. This can be Active/Active or Active/Standby, but you must SNAT if you do not want to be updating a UDR at time of failover.
  2. Or, don't SNAT traffic, and the web server sees the true source IP. You will need a UDR (User Defined Route) to point the default route and any client subnets at the active BIG-IP, and you will need to update this UDR automatically at time of failover (automated via F5's Cloud Failover Extension, or CFE). This can be Active/Standby only, as return traffic will follow the default route.
  3. Use a single Azure internal LB, but with multiple LB rules (see the sketch after this list). In our example we'll have 2x front-end IPs configured on the Azure LB, and these will be in different internal subnets. Then we'll have 2x back-end pools: the external self IPs for one rule, and the internal self IPs for the other. Finally we'll have 2x LB rules: the rule for the "internal" side of the LB may listen on ALL ports (for outbound traffic), and the "external" side of the LB might listen on 80 and 443 only.
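To make option 3 concrete, here's a rough sketch of what that single internal LB might look like when built with the azure-mgmt-network Python SDK. The subscription, resource group, VNET, subnet, and pool names are placeholders I've assumed for illustration, and you would still associate the BIG-IP NIC IP configurations with the back-end pools separately.

```python
# Rough sketch of option 3: one internal Standard LB, two front-end IPs,
# two back-end pools and two LB rules (one HA-ports, one port-specific).
# All resource names, subnets and IPs are placeholders (assumptions).
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import (
    LoadBalancer, LoadBalancerSku, FrontendIPConfiguration, BackendAddressPool,
    LoadBalancingRule, Probe, Subnet, SubResource,
)

SUB_ID = "<subscription-id>"
RG     = "bigip-rg"
LB     = "bigip-internal-lb"
LB_ID  = (f"/subscriptions/{SUB_ID}/resourceGroups/{RG}"
          f"/providers/Microsoft.Network/loadBalancers/{LB}")
SUBNET = (f"/subscriptions/{SUB_ID}/resourceGroups/{RG}"
          "/providers/Microsoft.Network/virtualNetworks/bigip-vnet/subnets/{name}")

client = NetworkManagementClient(DefaultAzureCredential(), SUB_ID)

lb = LoadBalancer(
    location="eastus",
    sku=LoadBalancerSku(name="Standard"),  # HA-ports rules require Standard SKU
    frontend_ip_configurations=[
        # Front end on the "external" subnet (traffic proxied in from the firewalls)
        FrontendIPConfiguration(name="fe-external",
                                subnet=Subnet(id=SUBNET.format(name="external")),
                                private_ip_allocation_method="Dynamic"),
        # Front end on the "internal" subnet (default route target for the app servers)
        FrontendIPConfiguration(name="fe-internal",
                                subnet=Subnet(id=SUBNET.format(name="internal")),
                                private_ip_allocation_method="Dynamic"),
    ],
    backend_address_pools=[
        BackendAddressPool(name="pool-external-selfips"),  # BIG-IP external self IPs
        BackendAddressPool(name="pool-internal-selfips"),  # BIG-IP internal self IPs
    ],
    probes=[Probe(name="tcp-probe", protocol="Tcp", port=443,
                  interval_in_seconds=5, number_of_probes=2)],
    load_balancing_rules=[
        # "External" side: app ports only (an equivalent rule on port 80 would be similar)
        LoadBalancingRule(
            name="rule-https",
            protocol="Tcp", frontend_port=443, backend_port=443,
            frontend_ip_configuration=SubResource(id=f"{LB_ID}/frontendIPConfigurations/fe-external"),
            backend_address_pool=SubResource(id=f"{LB_ID}/backendAddressPools/pool-external-selfips"),
            probe=SubResource(id=f"{LB_ID}/probes/tcp-probe"),
        ),
        # "Internal" side: HA ports (protocol All, port 0) for outbound/return traffic
        LoadBalancingRule(
            name="rule-ha-ports",
            protocol="All", frontend_port=0, backend_port=0,
            frontend_ip_configuration=SubResource(id=f"{LB_ID}/frontendIPConfigurations/fe-internal"),
            backend_address_pool=SubResource(id=f"{LB_ID}/backendAddressPools/pool-internal-selfips"),
            probe=SubResource(id=f"{LB_ID}/probes/tcp-probe"),
        ),
    ],
)

client.load_balancers.begin_create_or_update(RG, LB, lb).result()
```

The key point is that the HA-ports rule and the port-specific rule use different front-end IPs and different back-end pools, which is what lets them coexist on one internal LB.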

Advanced use cases (not pictured)

  1. Single NIC. If you did not want a 3-NIC BIG-IP, it would be possible to achieve option 3 above with a single-NIC or dual-NIC VM: use a 2-NIC BIG-IP (one NIC for mgmt, one for the data plane). Put your F5 pair behind a single internal Azure LB with only one LB rule, which has "HA ports" checked (all ports). We can then have the default route of the server subnet point to the Azure LB, which will be served by a 0.0.0.0/0 VIP on the F5. Because this only allows you one LB rule on the Azure LB, enable DSR on that rule. Designate an "alien subnet range" that doesn't exist in the VNET, but only on the BIG-IP. Create a route to this range and point the next hop at the front-end IP of that one LB rule. Then have your FW send traffic to the actual VIP on the F5 that's within the alien range (not the front-end IP), which will get routed to the Azure LB and on to the F5. I have tested this but see no real advantage and prefer multi-NIC VMs.
  2. Alien range. As mentioned above, an "alien IP range" - a subnet not within the VNET but configured only on the BIG-IP for VIPs - could exist. You can then create a UDR that points this alien range toward the front-end IP on the Azure LB (a minimal sketch of such a UDR follows this list). An alien range may help you overcome the limit on internal IPs on Azure NICs, but with a limit of 256 private IPs per NIC, I have not seen a case requiring this. An alien range might also allow app teams to manage their own addressing without bothering network admins. I would not advise going "around" network teams, however - cooperation is key. So I cannot find a great use for this in Azure, but I have written about how an alien range may help in AWS.
  3. DSR. Direct Server Return on the Azure LB means that the Azure LB will not perform Destination NAT on the traffic, so it arrives at the backend pool member with the true destination IP address. This can be handy when you want to create one VIP per application in your BIG-IP cluster, and not worry about multiple VIPs per application, or VIPs with /30 masks, or VIPs that use Shared Address lists. However, given that we have multiple options for configuring VIPs when Destination NAT is performed by the Azure LB (as it is by default), I generally don't recommend DSR on the Azure LB unless it's truly desired.
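As an aside, here's a minimal sketch of the "alien range" UDR from item 2, again using the Python SDK. The route table name, alien prefix, and front-end IP are assumptions for illustration only.

```python
# Minimal sketch: a route table entry that sends an "alien" prefix (not part of the
# VNET, configured only on the BIG-IPs) to the Azure LB front-end IP.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import Route

SUB_ID = "<subscription-id>"
client = NetworkManagementClient(DefaultAzureCredential(), SUB_ID)

client.routes.begin_create_or_update(
    resource_group_name="bigip-rg",
    route_table_name="client-subnet-rt",   # route table attached to the client/firewall subnet
    route_name="alien-vip-range",
    route_parameters=Route(
        address_prefix="192.168.100.0/24",  # "alien" VIP range that lives only on the BIG-IPs
        next_hop_type="VirtualAppliance",
        next_hop_ip_address="10.0.2.100",   # internal LB front-end IP
    ),
).result()
```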

 

Personally, I'd recommend in this case proceeding with option 3. I'd also point out:

  • I believe operating in the cloud requires automation, so you should not shy away from automated updates to UDRs when required. Given these can be configured by tools like F5's Cloud Failover Extension (CFE), they are a part of mature cloud operations (a minimal CFE declaration is sketched after this list).
  • I personally try to architect to allow for changes later. Making SNAT a requirement may be a limitation for app owners later on, so I try not to end up in a scenario where we enforce SNAT.
  • I personally like to see outbound traffic traverse BIG-IP, and not just because it allows apps to see true source IP. Outbound traffic can be analyzed, optimized, secured, etc - and a sophisticated device like BIG-IP is the place to do it.
  • Lastly, I recommend best practices we all should know:
      • Use template-based deployments for production so that you have Infrastructure as Code
      • Ideally keep your BIG-IP config off-box and deploy with AS3 templates
      • Get your app and dev teams configuring BIG-IP using declarative deployments to speed your deployment times
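For those curious what the automated UDR update in option 2 looks like, here's a minimal sketch of a CFE declaration being posted to a BIG-IP. The tag value, self IPs, and credentials are placeholders, and the full schema is in F5's CFE documentation.

```python
# Minimal sketch: post a Cloud Failover Extension (CFE) declaration so the tagged
# UDR's next hop follows the active BIG-IP at failover time.
# Addresses, tag value and credentials are placeholders (assumptions).
import requests
import urllib3

urllib3.disable_warnings()  # lab only: self-signed BIG-IP management certificate

BIGIP_MGMT = "https://10.0.0.11"   # management address of one BIG-IP (placeholder)
AUTH = ("admin", "<password>")

cfe_declaration = {
    "class": "Cloud_Failover",
    "environment": "azure",
    "externalStorage": {
        "scopingTags": {"f5_cloud_failover_label": "bigip-ha"}
    },
    "failoverAddresses": {
        "scopingTags": {"f5_cloud_failover_label": "bigip-ha"}
    },
    "failoverRoutes": {
        "enabled": True,
        "scopingTags": {"f5_cloud_failover_label": "bigip-ha"},
        "scopingAddressRanges": [{"range": "0.0.0.0/0"}],
        # internal self IPs of the two BIG-IPs; CFE points the UDR at whichever is active
        "defaultNextHopAddresses": {
            "discoveryType": "static",
            "items": ["10.0.3.11", "10.0.3.12"]
        }
    }
}

resp = requests.post(f"{BIGIP_MGMT}/mgmt/shared/cloud-failover/declare",
                     json=cfe_declaration, auth=AUTH, verify=False)
resp.raise_for_status()
print(resp.json())
```

With a declaration like this on both BIG-IPs, CFE re-points the tagged route's next hop to the newly active unit when a failover occurs.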

 

 

Conclusion

There are multiple ways to deploy BIG-IP when your requirements dictate an architecture that is not deployed by default. By keeping in mind my priorities of application services, operational support and high availability, you can decide on the best architecture for a given scenario.

 

Thanks for reading, and let me know if you have any questions.

