Forum Discussion

johnfilo_45702
Jul 21, 2010

Long Persistent Sessions and no outage deployments

OK, my first post so go easy!!

My company is about to embark on deploying F5s in our environment, and I would appreciate some thoughts on how to solve a problem I have with long sessions and achieving no-outage deployments across a pair of backend J2EE WebSphere servers.

 

 

==== Start Scenario ====

The F5 is load balancing 2 backend J2EE servers, and we have session persistence to one of the two servers using the infamous JSESSIONID cookie. Session timeout for the persistent sessions is a massive 10 hours, mainly because of the call-centre style environment: the business don't want our users having to repeat the login throughout their working day (yes, I know they are long, but work with me!).

What we would like to achieve is no-outage deployments, whereby we can place one of the backend J2EE servers in offline mode, permit no new sessions or connection establishment on this server AND (the kicker) be able to transfer the existing persisted sessions to the other active server, so we have no outage from a client point of view.

 

 

** We don't want to wait for the persistent sessions to time out on the server we would like to update.

Once the new version of the webapp has been deployed, tested, etc., we would like to route all new sessions to the newly updated server and let the other server "gracefully" drain-stop all existing sessions whilst not allowing new sessions or connection establishment.

 

 

==== End Scenario ====

What I'm looking for, I guess, are some practical examples of how to achieve this, F5 setup recommendations, and hopefully constructive conversation with people who have done this before and are willing to share their lessons learned.

All comments are welcomed.

 

 

Thanks in advance.

John

8 Replies

  • Does your back end have the provision to know how to deal with a session that it hasn't seen before? An example is shared sessions in a clustered WebSphere environment. This will be a challenge if you don't have the ability to do this. If you do have this setup on the app server side, this requirement doesn't sound too tricky.

    -Matt
  • Are you currently using the JSESSIONID iRule here? I imagine there's some way to add an "active_members" type of check to tell it to use persistence unless that specific pool member is down.
  • Hi John,

    Welcome to the forums. That's a very well thought out and described scenario.

    Matt makes a good point. The other app servers would need to have access to the session details in order for this to work with minimal to no impact on the client experience.

     

     

    As far as transferring the TCP connection to another server, you should be able to use the reselect option for the "action on service down" pool setting, and possibly a custom "always fail" or reverse monitor to handle this:

    SOL10640: Pool member reselection options
    https://support.f5.com/kb/en-us/solutions/public/10000/600/sol10640.html
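    As a rough sketch of that pool setting in tmsh (the pool name is hypothetical, and the exact property syntax is worth checking against your LTM version - SOL10640 also covers the older bigpipe form):

    ```
    # Hypothetical pool "websphere_pool": when the monitor marks a member
    # down, reselect moves its active connections to another pool member
    # instead of resetting them.
    tmsh modify ltm pool websphere_pool service-down-action reselect
    ```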

     

     

    You can check this post for helpful info from Chris and Michael on action on service down:

    Action on service down question
    http://devcentral.f5.com/tabid/1082223/asg/44/showtab/groupforums/aff/32/afv/topic/aft/1172557/Default.aspx

     

     

    Or instead of adding a monitor to the pool member that will always fail when you want to take a specific pool member down, you could have a page, monitored by LTM on the application, that is changed before maintenance begins so that the monitor check fails. The "action on service down" setting should then be triggered, which could be to reselect a new pool member.
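    For example (everything here is a placeholder - the status page path, its content, and the probe details would be whatever the operators set up):

    ```
    # Hypothetical HTTP monitor, defined in the tmsh shell: probes a
    # status page that operators can edit; the member stays up only
    # while the page returns "RUNNING".
    create ltm monitor http app_status_monitor {
        send "GET /status.jsp HTTP/1.1\r\nHost: app\r\nConnection: close\r\n\r\n"
        recv "RUNNING"
        interval 5
        timeout 16
    }
    ```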

     

     

    On a slight tangent: the JSESSIONID persistence iRule should always be deployed with a OneConnect profile to ensure each HTTP request is persisted to the correct pool member. If you're not using SNAT on the virtual server, you can set the OneConnect source mask to a host mask of 255.255.255.255.
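    For reference, a minimal universal-persistence iRule along these lines might look like this (a sketch only - the canonical DevCentral version handles more edge cases, and the 36000-second timeout here just mirrors the 10-hour sessions in the scenario):

    ```
    when HTTP_REQUEST {
        # Persist on the JSESSIONID cookie if the client presents one
        if { [HTTP::cookie exists "JSESSIONID"] } {
            persist uie [HTTP::cookie "JSESSIONID"] 36000
        }
    }
    when HTTP_RESPONSE {
        # Learn the JSESSIONID the server hands out on its response
        if { [HTTP::cookie exists "JSESSIONID"] } {
            persist add uie [HTTP::cookie "JSESSIONID"] 36000
        }
    }
    ```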

     

     

    Aaron
  • L4L7 -

    Yes, WebSphere is configured to replicate sessions, and the application can deal with a session it hasn't seen before when a user is sent its way.

     

     

    Chris Miller -

    We haven't put the devices in yet; I'm going through the thought process of how I would set them up to support this scenario. I like the thinking, though - I'll read up on the JSESSIONID iRule. Thanks.

     

     

    hoolio -

    Thanks. The other app servers will have access to sessions created by any of the servers through WebSphere session replication. That part already works and is tested. I had read SOL10640 last night. It definitely has the means to do what I need in a normal running state, but I hadn't made the leap to using the LTM-monitored page as the trigger for placing the node in "maintenance" mode - I like that idea. Thanks for the FYI on the JSESSIONID iRule as well.

    Thanks to all who replied. I think I have enough now to go away and put a solution together. I will update this topic with my final solution also.
  • OK, I've just been reading up on "LTM - Action on Service Down" here - http://devcentral.f5.com/Tutorials/TechTips/tabid/63/articleType/ArticleView/articleId/179/LTM-Action-on-Service-Down.aspx

    Does this feature give you the flexibility to take different actions depending on the string returned from the monitored page?

     

     

    For example, let's say I defined 3 different status strings returned by the monitored page on the backend:

    RUNNING - Normal operating server.

    MAINTENANCE - Server in maintenance mode, so reselect another server from the pool, don't RST the client connection, and send no new sessions or connections to this server (I assume active TCP connections can complete?).

    DRAINING - Server will be placed in maintenance mode once all active sessions have cleared, so keep sending existing sessions to this server but don't allow any new sessions to be created.

    RUNNING and MAINTENANCE are easy, and I think "LTM - Action on Service Down" is perfect for this.
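    One way to sketch the monitor side of this (names and strings hypothetical; note the "receive disable string" option on HTTP monitors was not available on early LTM versions, so verify it exists on yours before relying on it):

    ```
    # Hypothetical status-string monitor: "RUNNING" keeps the member up;
    # "DRAINING" (where recv-disable is supported) marks it disabled, so
    # existing persisted sessions continue but no new ones are sent; any
    # other response, e.g. "MAINTENANCE", marks the member down.
    create ltm monitor http status_string_monitor {
        send "GET /status.jsp HTTP/1.1\r\nHost: app\r\nConnection: close\r\n\r\n"
        recv "RUNNING"
        recv-disable "DRAINING"
    }
    ```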

     

     

    The problem is that once I have upgraded the server that was in MAINTENANCE and added it back to the pool, I would then need to place the other server into DRAINING so that existing sessions complete and die a natural death, whilst all new sessions are directed to the newly upgraded server.

    Is this possible?

     

  • Hamish
    OK. Re-reading the requirements a few times, I originally thought no... but I wasn't considering the None option of action on service down. That would implement your draining scenario perfectly.

     

     

    I'm not sure you have a problem when you've finished upgrading the first server and then want to drain the second... Once the first is up and taking new sessions, place the second server into DOWN mode... AOSD with None will continue to send existing connections to it, just as it did for the first server.

     

     

    I'm not sure you'd want to leave it set to None in normal/normal operation... but it should work. You'll need to monitor elsewhere, and if the DRAINING server suddenly goes down, you'll have to set it to 'Reselect' or 'RST' so people with existing sessions recover. But overall I think it's doable.

     

     

     

    H
  • Hamish -

    Thanks for your thoughts. Not sure I want to run the AOSD as "None" all the time... I am trying to find a solution whereby our operators don't need to access the F5 GUI or CLI, hence my thinking around changing the string returned in the monitored page - the operator could do this through scripts on the backend server they already have access to.

     

     

    The plot thickens.....

     

     

    I guess I could have AOSD set to "Reselect" all the time, and have the member marked down by changing the string returned to the monitor to MAINTENANCE. This will reroute client sessions on the server side, so there's no impact to them. Do the upgrade, then put the server back into the pool by changing the returned string back to RUNNING so new sessions can be sent to it. Then, through the iControl interface for the other server, call LocalLB.PoolMember.set_session_enabled_state() and change its state to "STATE_DISABLED", knowing that this will not fully disable the pool member but will just stop new connections from being established, so my existing sessions complete naturally. Once all sessions have finished, change its returned string to MAINTENANCE, update the server, call LocalLB.PoolMember.set_session_enabled_state() to change its state back to "STATE_ENABLED", and change its returned string to RUNNING - hey presto, I have two updated servers with - in theory - no client session impact.
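    To sketch what that iControl call takes as arguments (names per the iControl API; the SOAP transport itself - e.g. the community bigsuds Python wrapper - is assumed and only shown in comments, and the pool/member details are made up):

    ```python
    def member_state(address, port, state):
        """One member/session_state entry, shaped as iControl expects."""
        return {"member": {"address": address, "port": port},
                "session_state": state}

    def drain_call_args(pool, address, port):
        """Arguments to disable NEW sessions on one member of one pool:
        existing persisted sessions keep flowing, new ones go elsewhere."""
        return {"pool_names": [pool],
                "session_states": [[member_state(address, port,
                                                 "STATE_DISABLED")]]}

    # With a hypothetical bigsuds connection `b`, the call would be:
    # b = bigsuds.BIGIP(hostname="ltm.example.com",
    #                   username="admin", password="admin")
    # b.LocalLB.PoolMember.set_session_enabled_state(
    #     **drain_call_args("websphere_pool", "10.1.1.21", 9080))
    ```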

     

     

    What do you think?