Forum Discussion

Dan_Sheehan_841
Nimbostratus
Sep 28, 2012

"Drain stopping" an Exchange 2010 CAS array w/o user getting prompts.

Greetings,

 

My team has established an Exchange 2010 load balanced presence using the template in our F5 3600 LTM HA Pair. Everything is working as we would expect, except for the ability to remove a CAS server from the load balancing pool without causing prompts to end users in Outlook.

 

When we take a CAS node offline using the "Forced Offline" function, as we used to do without issue in Exchange 2007 (where there was no CAS array), users are prompted with Exchange reconnection pop-ups in Outlook. When we tried the "Disabled" function instead, we waited 24 hours and there were still active sessions to the "Disabled" CAS node.

 

Right now we can't take a CAS offline in the middle of the day without causing Outlook connectivity pop-ups, so we are stuck with doing this maintenance in the middle of the night, which is not ideal.

 

FYI - our MAPI VIP is using the EX2010_rpc_persist_profile which has the timeout setting of 28800 to match the default 8 hour timeout Outlook and Exchange negotiate. The profile has inherited the other settings of Hash Algorithm: Default, Mask: None, Map Proxies: Enabled.

 

The BigIP OS version is BIG-IP 10.2.3 Build 123.0 Hotfix HF1.

 

Thanks for any assistance!

 

10 Replies

  • mikeshimkus_111
    Historic F5 Account
    Hi Dan, with MAPI, afaik there's no way to avoid users getting auth prompts when reconnecting to a different pool member. We have logic built into the persistence profiles to persist all connections from the same client to the same node because otherwise you would see non-stop auth prompts within the Outlook client.

     

     

    About the long-lived connections, you might try decreasing the timeout setting and also deleting any persistence records for that node; administratively downing a pool member will not terminate existing connections and new connections with a persistence record for the node will continue to be sent to that node as well. The command to do that would be something like:

     

     

    (tmos)del ltm persistence persist-records node-addr 10.133.20.81 mode source-address

     

     

    thanks

     

    Mike

     

  • Wow... thank you for the VERY quick reply and for your help.

     

    Thank you for clarifying that the MAPI sessions have to stay with the current node until it ends correctly, otherwise the client will get a prompt when reconnecting to a CAS member.

     

    With a persistence profile timeout setting of 8 hours, shouldn't all connections to the CAS member be removed 8 hours after putting the node into a Disabled state? We would be fine with an 8 hour delay, but even 24 hours after disabling a node we still saw connections to it, which made no sense to us.

     

    Or should we be using "Force Offline" instead of "Disabled", and if so, would "Force Offline" still cause users to get a prompt if we wait 8 hours after selecting it?

     

    What is the side effect of deleting the persistence records? I.e., will it cause any Outlook MAPI prompts?

     

    Thanks again for your help!

     

  • mikeshimkus_111
    Historic F5 Account
    The persistence profile timeout determines how long BIG-IP will persist new connections to that node. Any existing connections will not be removed until they are terminated by the client. So if you have someone who's left Outlook open, they are going to remain connected to the same node even though their persistence record has expired.

     

     

    Forcing the node offline will not kill the existing connections either, but it does prevent new connections from going to that node, persistence be damned. Again, your client left open will not disconnect.
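
A minimal sketch of the three member states as this reply describes them, for readers who want the rules side by side. The names `MemberState` and `member_accepts` are hypothetical and exist only for illustration. This models the reply's description rather than BIG-IP itself, and a later reply in this thread reports different behavior for Forced Offline.

```python
from enum import Enum


class MemberState(Enum):
    ENABLED = "enabled"
    DISABLED = "disabled"
    FORCED_OFFLINE = "forced offline"


def member_accepts(state: MemberState, *, existing_connection: bool,
                   has_persistence_record: bool) -> bool:
    """Return True if traffic is still sent to the member, per the reply above."""
    if existing_connection:
        # Established connections are left alone in every state.
        return True
    if state is MemberState.ENABLED:
        return True
    if state is MemberState.DISABLED:
        # New connections are accepted only if a persistence record points here.
        return has_persistence_record
    # FORCED_OFFLINE: new connections are refused even with a persistence record.
    return False
```

Under this model, an Outlook client that never closes its connection stays on the member in all three states, which matches the sessions still seen 24 hours after disabling.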

     

     

    Deleting the persistence records won't cause prompts for existing or new connections to the same node, only for new connections to a different node.

     

     

    The only way to truly kill all existing connections would be to delete them from tmsh using the delete sys connection command. You might disable the node, delete the persistence entries, and then right before you actually take the server offline delete the open connections to it. Those users will get prompted when they reconnect to another node.
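
The sequence above can be sketched as a small helper that only builds the command strings and runs nothing. The helper name `drain_commands` is hypothetical. The command syntax is an assumption based on later (v11 and up) tmsh, where the node is addressed by name (here assumed to equal its IP address); it may differ on 10.2.3, so check each command against your version before using it.

```python
def drain_commands(node_addr: str) -> list[str]:
    """Build the tmsh commands for the drain sequence, in order (nothing is executed)."""
    return [
        # 1. Disable the node so it stops taking unpersisted new connections.
        f"tmsh modify ltm node {node_addr} session user-disabled",
        # 2. Delete persistence records that still point at the node.
        f"tmsh delete ltm persistence persist-records node-addr {node_addr}",
        # 3. Right before maintenance, delete the remaining open connections.
        f"tmsh delete sys connection ss-server-addr {node_addr}",
    ]


if __name__ == "__main__":
    for command in drain_commands("10.133.20.81"):
        print(command)
```

Step 3 is the one that produces the prompts, so the earlier steps only reduce how many clients are still attached when it runs.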

     

     

    • Venezz_242243
      Nimbostratus
      "Forcing the node offline will not kill the existing connections either, but it does prevent new connections from going to that node, persistence be damned. Again, your client left open will not disconnect."

      Sorry for the harsh words, but this is just bullshit! If you force the node offline it WILL kill ALL existing connections. Of course Outlook 2010/2013 is able to switch the connection to other CAS servers, but first ALL connections are deleted. Every user gets the message "Outlook disconnected". As an F5 technician you should stop saying things that are not true. TY
  • FYI - I found this detail in our documentation of how we configured things differently from the default template settings:

     

    The CAS Array F5 VIPs in each datacenter will be modified to use the RPC persistence profile timeout of 8 hours instead of the F5 default 3 minutes, and the TCP protocol idle timeout (for both WAN and LAN optimized profiles) of 1 hour instead of the F5 default 5 minutes. These changes are necessary to support proper BlackBerry operations and overall MAPI client communications to the CAS Arrays.

     

     

    If I understand you correctly, neither "Disabled" nor "Forced Offline" will cause the MAPI sessions to time out after 8 hours, and essentially as long as the client and server keep the channel open the session will stay active (which explains why we saw sessions 24 hours after we disabled a CAS).

     

    Consequently, there is no effective way to drainstop a CAS from the MAPI CAS Array within the defined timeout period, and we essentially have to break the client connections to the back end, either by deleting them or by outright rebooting the server.

     

    If that's all correct, then that sucks, as we can't take any action in the middle of the day that might cause connection prompts for users, because the help desk will get calls.

     

    I appreciate your continued advice and responses.

     

  • Dan:

     

     

    I too am interested in the best procedure for patching and restarting Exchange 2010 CAS nodes behind F5 load balancers without the user being prompted. The last time I tried putting one of my three CAS nodes in disabled state, it interrupted all users. There has to be a way to do this and still maintain session or at least drain off all the users.

     

     

    thanks,

     

     

    Perry
  • It is my understanding that the CAS Array isn't going to cause issues, as one of the benefits of the CAS array is to allow a CAS to die and the RPC endpoint will move to another CAS without disrupting services.

     

    Setting your node to disabled should allow connections to "drain" over time. Unfortunately, if you've got long timeouts on your persistence profile, it's possible that extremely active/chatty clients may stay connected "forever". The whole point of persistence profiles is to keep connections from drifting between nodes, so it's kind of a catch-22. Our policy is to not perform CAS maintenance during the day, but we're a 24/7 shop so that's a little more realistic for us.

     

    John

     

  • @perrdiddy - I have not been able to find a successful way to "drainstop" our CASs so that we can perform maintenance on them without users noticing. I have all but given up hope.

     

    We work on them in the middle of the night so that most users are reconnected by the time they get to work in the morning. We have users working 7x24 365 across the nation as that is our business, so we always impact someone with this maintenance.

     

     

    @John Matlock - AFAIK our sessions have an 8 hour timeout for MAPI traffic, as that's what the client expects by default, so I don't know why they are staying connected for more than 24 hours, as in theory they should switch to another CAS at some point. We are a 7x24 shop as well, but we like to avoid staying up late and working in the middle of the night if at all possible.

     

     

    Essentially I don't think this is fixable until we go to Exchange 2013 where all the clients talk HTTPS to the CAS array exclusively (MAPI is dead except for internal Exchange server communication).

     

  • Agreed, this is one of the reasons they made the architectural changes they did in Exchange 2013: sessionless/stateless CASs solve a lot of issues. They've also substantially decreased the impact of moving databases between DAG members.

     

     

    As for sessions not timing out after 8 hours, the timer is reset every time a packet is matched to the persistence record. If the client checks in at least once every 8 hours, their persistence record will never expire. When I mentioned we were 24/7, I meant that we have a 3rd shift that works overnight and they do a lot of our maintenance. Obviously, this is a rare situation.
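
The timer reset described here can be sketched as a simplified idle-timer model. The function name `record_expiry` and the idea of feeding it a list of match times are assumptions for illustration; the 28800-second value is the profile timeout from the original post. Real BIG-IP behavior may differ by version.

```python
PERSIST_TIMEOUT = 28800  # seconds; the 8-hour profile timeout from the original post


def record_expiry(match_times, timeout=PERSIST_TIMEOUT):
    """Return the time (in seconds) at which the persistence record ages out.

    match_times: ascending times at which traffic matched the record;
    the first entry is the moment the record was created.
    """
    expires_at = match_times[0] + timeout
    for t in match_times[1:]:
        if t >= expires_at:
            break  # the record had already aged out before this traffic arrived
        expires_at = t + timeout  # a match resets the timer
    return expires_at


# A client that checks in every 6 hours for 2 days keeps the record alive
# until 8 hours after its final check-in.
chatty = list(range(0, 48 * 3600 + 1, 6 * 3600))
print(record_expiry(chatty) // 3600)  # hours until expiry
```

In this model a client that is silent for 9 hours loses its record after 8, while one that checks in every 6 hours never does, which is why a disabled CAS can keep receiving persisted traffic indefinitely.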

     

     

    John

     

  • Thanks John, that's probably the best answer I have received to date on why the persistence is never clearing and why we can't "drain stop" the CASs effectively.

     

     

    Dan.