Forum Discussion

Jason_40733
Jul 24, 2012

LTM Object Limitations (VIPs, self-IPs, etc.)

We are deploying massive numbers of small pods in our infrastructure. Each pod is assigned a strict route domain, a VLAN, two self-IPs for the BIG-IPs, one floating self-IP, one virtual server, one pool, one node, and one default gateway.
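To get a feel for the scale, the per-pod inventory above can be tallied. This is only a rough sketch: it assumes each configured item counts as exactly one "object" and that a per-pod partition adds one more, which may not match how TMM accounts for them internally. The pod counts used are the break points reported later in this post.

```python
# Per-pod inventory as listed in the post. Treating each item as one
# "object" is an assumption made for illustration only.
OBJECTS_PER_POD = {
    "route_domain": 1,
    "vlan": 1,
    "self_ip_non_floating": 2,
    "self_ip_floating": 1,
    "virtual_server": 1,
    "pool": 1,
    "node": 1,
    "default_gateway": 1,
}


def total_objects(pods: int, partition_per_pod: bool = False) -> int:
    """Return the total configured object count for a number of pods."""
    per_pod = sum(OBJECTS_PER_POD.values())
    if partition_per_pod:
        per_pod += 1  # assumption: the partition itself counts as one object
    return pods * per_pod


if __name__ == "__main__":
    print(total_objects(1200, partition_per_pod=True))  # 12000
    print(total_objects(1400))                          # 12600
```

Under these assumptions both break points land at roughly 12,000 to 12,600 configured objects, which is at least consistent with a limit tied to total object count rather than to partitions alone.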

 

 

What we have discovered is that when using individual partitions, TMM starts to panic and have issues once we have deployed 1,200 pods. In addition, syncing the configs becomes problematic and results in one or both nodes going inoperative. (We suspect that when the configs are written, a file handle is opened for every single file, the data is written, and then the file handles are closed. This gives the config save or config sync some atomicity, but leads to significant I/O issues that cause a pause in traffic and, at high numbers of partitions, often a complete TMM bounce.)

When using a single partition, our scalability starts breaking at ~1,400 pods. Keep in mind that both of these break points are with zero traffic.

I've googled, searched DevCentral (is anyone else getting zero results from the search function every time?), opened a case with F5, and done my best to pillage all available resources. I'm hoping someone knows of these limitations or can point me to documentation that lays them out, along with best practices for large-scale implementations (object-wise, not traffic-wise).

We've heard that vCMP will allocate more memory for holding objects, so our VIPRIONs could scale higher based on our number of guests. Can anyone confirm whether this scaling is linear, or are there diminishing returns as we add vCMP guests?

Thank you everyone for any assistance,

Jason

2 Replies

  • Hi Jason,

    I suggest working with your F5 or partner SE to go over the architecture and sizing requirements.

    Aaron
  • Definitely a good idea, Aaron. We have already contacted them and are awaiting a meeting. We are also awaiting a response to a support ticket that has been open for many days. Our normal experience with support tickets is excellent, but this one seems to have fallen into a black hole: even repeated calls to the call center have not produced a response.

    I was hoping there might be some knowledge or documentation readily available on object definitions and object limits, or that someone might have crossed this bridge before. Our hypothesis is that vCMP will give us a linear increase in the number of objects available on the same hardware. But going live on a large project with only a hypothesis isn't the best of ideas.

    Based on our current testing, we're going to monitor what we "think" are objects (VLANs, self-IPs, pools, nodes, iRules, data groups, trunks, SNATs, NATs, SNAT pools, monitors, SSL certs, profiles... basically anything we configure) and stay 20% under the 'safe' value we roughly estimated through testing. Once we've gotten more testing and feedback, we'll post numbers here for anyone who runs into scalability issues in the future.

    Thanks again,

    Jason
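The 20% headroom rule described in the last reply can be sketched as a simple budget check. This is a minimal illustration, not anything from F5: the function name, the object categories passed in, and the safe value of 10,000 are all hypothetical placeholders, since the thread does not state the actual tested figure.

```python
from typing import Mapping, Tuple


def check_headroom(
    counts: Mapping[str, int],
    safe_value: int,
    headroom_pct: int = 20,
) -> Tuple[int, int, bool]:
    """Compare the total object count with a budget below the safe value.

    Returns (total, budget, within_budget). Integer arithmetic is used so
    the budget is exact for whole-number percentages.
    """
    total = sum(counts.values())
    budget = safe_value * (100 - headroom_pct) // 100
    return total, budget, total <= budget


if __name__ == "__main__":
    # Hypothetical counts per object type; real values would come from
    # whatever inventory of the running configuration is available.
    counts = {"vlans": 1000, "self_ips": 3000, "pools": 1000,
              "nodes": 1000, "virtuals": 1000}
    total, budget, ok = check_headroom(counts, safe_value=10_000)
    print(f"total={total} budget={budget} within_budget={ok}")
```

Summing every category into one total assumes all object types weigh equally against the limit. If testing shows some types cost more than others, per-type budgets would be the more faithful model.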