Forum Discussion

VB_95896
Nov 18, 2008

ASM Configurations Active-Active

Hi,

I have a small problem with two BIG-IP ASM units (4100 platform) running version 9.4.5 (plus a hotfix).

In an active-active configuration, I would like to configure two shared MAC masquerade addresses per VLAN (one for each unit).

However, as far as I know, the GUI does not allow configuring more than one shared MAC masquerade address per VLAN. And if I use that single shared MAC masquerade address on both units (on a given VLAN), there will be a conflict: both units are active, so both claim responsibility for the same shared MAC address.

Is there a way to configure two shared MAC masquerade addresses per VLAN (one belonging to unit-1, the other belonging to unit-2)?

Any info is welcome.

Thanks in advance,

VB

3 Replies

  • Hi VB,

    VLAN configuration isn't synced between the units, so couldn't you configure a different MAC masquerade address on each unit without affecting the other?
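
    From the command line, I'd expect it to look something like the lines below, one MAC per unit. I'm going from memory on the 9.x bigpipe syntax, so treat the exact keyword ("mac masq" vs. "mac masquerade") as an assumption and verify it on your unit before relying on it; the MAC addresses are just placeholders.

    # on unit-1 (keyword from memory; verify the bigpipe vlan syntax on your unit)
    b vlan external mac masq 02:01:d7:00:00:01
    b save

    # on unit-2, same VLAN, a different masquerade MAC
    b vlan external mac masq 02:01:d7:00:00:02
    b save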

    I'd be concerned about running active-active for LTM and even more so with ASM. There is a possibility of losing redundancy using an active-active config if either unit is loaded over 50% and one unit fails. With ASM, there are quite a few host processes which use CPU0. Some of the processes (some nice'd and some not) regularly use 100% of CPU0. This makes it difficult to gauge when the units are at 50% capacity.
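
    If it helps, a quick way to see which host processes are nice'd and how much CPU they're using is plain Linux ps from the shell (nothing F5-specific; NI is the nice value, so anything non-zero has been re-niced; if --sort isn't available on that build, pipe through sort instead):

    # top CPU consumers with their nice values
    ps -eo pid,ni,pcpu,pmem,comm --sort=-pcpu | head -20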

    Aaron
  • Hi,

    Thanks a lot for your answer.

    Actually, it works without MAC masquerade addresses. But I might test your scenario with two different MAC masquerade addresses (the benefit being to avoid relying on gratuitous ARP replies and the subsequent latency).

    I am more concerned with the risks of an Active-Active configuration.

    1) Monitoring the load seems difficult. It first requires defining MRL{conf} = "maximum required load under a given configuration". MRL{conf} would then have to be kept under 50%: before any configuration change, one would have to test (computation alone is never reliable enough...) that MRL{new_conf} stays below 50%. First problem: the test itself could crash the unit. Second problem: certain configuration changes cannot be forecast; as far as I understand, an unnoticed change in a web application could cause the related security policy to increase its resource needs. Hence the requirement for a (possibly large) security margin...

    2) A test showed that a config sync can produce a high load: in a scenario with two HTTP virtual servers, two active security policies (one blocking, the other not), and absolutely no traffic, a config sync took around 15 minutes and consumed up to 80% of CPU0 (which confirms your point). Given that before the sync the only difference between the two units was a single basic security policy, what should one expect with more advanced configurations?

    How should I interpret this test? Does it mean that, even in an active-standby configuration, one has to keep the load under 60% of the total CPU (CPU0 + CPU1)?

    More generally, I would like to be able to answer the following question:

    How does the processing power of a BIG-IP ASM (4100 platform, version 9.4.5, HF2) translate into:

    - maximum number of virtual servers/pools/nodes
    - maximum number of active/standby security policies (number of rollback versions, number of active attack signatures)
    - maximum number and scope of web applications (objects/parameters)
    - ...

    Any info is welcome,

    Thanks,

    VB

  • The first question I'd ask is why you want to use active-active at all? It adds a lot of complexity and potential for failure.

    Your questions are valid, but not easy for me to answer. Maybe an F5 pre-sales engineer would be able to help you more. You might also post in the Performance Testing forum; Mike Lowell has been very helpful there.

    1. It is difficult to get accurate CPU stats from LTM and ASM when there are nice'd processes. You can't tell just by looking at the performance graphs for CPU0 whether high CPU utilization is due to a real-time process or to processes that are set to give up CPU cycles when a higher-priority process needs them. You'd need to look at the individual processes to see which are nice'd and how much CPU they are using. I don't know of any F5-supplied/supported tools you can use to track per-process CPU/memory utilization. I've requested this previously and am planning to make a new request soon.
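
    In the meantime, a crude workaround is to sample it yourself from the shell. A minimal sketch using plain Linux ps in a loop (nothing F5-specific; the interval and log path are just examples):

    # append a timestamped per-process CPU/memory snapshot every 30 seconds
    while true; do
        date '+%Y-%m-%d %H:%M:%S'
        ps -eo pid,ni,pcpu,pmem,rss,comm --sort=-pcpu | head -15
        echo
        sleep 30
    done >> /var/tmp/proc_usage.log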

    Forecasting load/capacity is also very complex. It's highly dependent on the policy configuration, the characteristics of the web application (size of responses, server latency, etc.) and the client characteristics (number of requests per second, size of requests, client latency, etc.). I know F5 has been hesitant in the past to provide sizing numbers because there are so many variables to consider. Vendors seem to provide best-case-scenario numbers, which aren't very helpful in planning a real implementation.

    If you're doing load testing, I'd suggest using a test environment. The idea is that you want to trigger a failure so you know how far you can push the BIG-IPs. Or, worst case, you could test during a maintenance window.

    2. When a config sync is performed, the current configuration is saved to a UCS archive. That UCS is copied to the peer, unpacked and loaded. This is a complete config backup, so it doesn't matter how similar the peer configuration is: the entire configuration is saved and loaded. In versions older than 9.4.5, the UCS file could be enormous, as it included the ASM forensics. The forensics are no longer included in the UCS, so that shouldn't be a problem. Recently, I've noticed some new ASM-related perl scripts (/ts/tools/add*.pl) are run during the config installation process. I haven't dug into them to see exactly what they're doing, but they do seem to eat up a lot of CPU time. You can see what's done for the ASM configuration in the /ts/tools/ts_configsync.pl script. I am surprised, though, that it's taking 15 minutes to sync the configuration; you might want to open a case with F5 Support to troubleshoot this. On a test unit with no traffic, it took 10 seconds to save the config to a UCS. When installing a UCS, a backup is done first and then the new UCS is loaded. In total, this took 1 min 34 sec:

    [hoolio@test6400:Active] ts time b config save test.ucs
    Saving active configuration...

    real 0m10.355s
    user 0m4.200s
    sys 0m2.590s

    [hoolio@test6400:Active] ts time b config install test.ucs
    Saving active configuration...
    Current configuration backed up to /var/local/ucs/cs_backup.ucs.
    Installing full configuration on host test6400.example.net
    Installing...
    Installing ASM configuration...
    Reloading configuration... It may take a few minutes...
    Reading configuration from /defaults/config_base.conf.
    Reading configuration from /config/bigip_base.conf.
    Reading configuration from /config/bigip_sys.conf.
    Reading configuration from /usr/bin/monitors/builtins/base_monitors.conf.
    Reading configuration from /config/profile_base.conf.
    Reading configuration from /config/daemon.conf.
    Reading configuration from /config/bigip.conf.
    Reading configuration from /config/bigip_local.conf.
    Loading the configuration ...
    Loading ASM configuration...

    real 1m34.325s
    user 0m13.060s
    sys 0m7.420s

    It has been a best practice not to sync a UCS to an active unit while in production, as it does take some time to load the configuration. In the process, connections held by TMM *should be* maintained while the config is reloaded. In practice, I'd never sync the config to an active unit, as I've seen traffic stall, particularly while the ASM config is being loaded.
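
    For reference, the flow I'd follow is below. The commands are 9.x bigpipe as I remember it, so double-check the syntax on your units, and only push when the peer is standby (ideally in a maintenance window).

    # on the unit you changed, take a local backup first
    b config save /var/local/ucs/pre_sync_backup.ucs

    # then push the full configuration to the peer
    b config sync all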

    As far as keeping load under 50% on an active-active pair goes, you'd need to keep both CPUs under 50% independently of each other. There is no cross-CPU usage.
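
    If you want to spot-check that from the shell, something along these lines works on any Linux host; it reads the per-CPU counters from /proc/stat over a short window (nothing F5-specific; swap cpu0 for cpu1 to check the other CPU):

    # measure CPU0 utilization over a 5-second window
    # (field 5 of the cpu0 line in /proc/stat is the idle counter)
    S1=$(grep '^cpu0 ' /proc/stat); sleep 5; S2=$(grep '^cpu0 ' /proc/stat)
    printf '%s\n%s\n' "$S1" "$S2" | awk '
        NR==1 { for (i=2; i<=NF; i++) t1+=$i; idle1=$5 }
        NR==2 { for (i=2; i<=NF; i++) t2+=$i; idle2=$5
                busy = 100 * ((t2-t1) - (idle2-idle1)) / (t2-t1)
                printf "CPU0 busy over the last 5 seconds: %.1f%%\n", busy }'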

    Regarding sizing, you'd need to go to F5 for recommendations. If you do get more info on this, please reply here to share it.

    Hope this helps,

    Aaron