Tech Tips on DevCentral
   

Problems Overcome During a Major LTM Software/Hardware Upgrade

by smp

I recently completed a successful major LTM hardware and software migration which accomplished two high-level goals: 

· Software upgrade from v9.3.1HF8 to v10.1.0HF1
· Hardware platform migration from 6400 to 6900
 
I encountered several problems during the migration that would have stopped me in my tracks had I not (in most cases) already run into them during testing. This is a list of those issues and what I did to address them. I may not have all the documentation about these problems, or even fully understand all the details, but the bottom line is that the fixes worked. My hope is that someone else will benefit from this list when it counts the most (and you know what I mean).
 
 
Problem #1 – Unable to Access the Configuration Utility (admin GUI)
The first issue I had to resolve was apparent immediately after the upgrade finished. When I tried to access the Configuration utility, I was denied:
 
Access forbidden!
You don't have permission to access the requested object.
Error 403
 
I happened to find the resolution in SOL7448: Restricting access to the Configuration utility by source IP address. The SOL refers to bigpipe commands, which is what I used initially:
 
bigpipe httpd allow all add
bigpipe save
 
Since then, I’ve worked out the corresponding TMSH commands, since TMSH is F5’s long-term direction for managing the system:
 
tmsh modify sys httpd allow replace-all-with {all}
tmsh save / sys config
 
 
Problem #2 – Incompatible Profile
I encountered the second issue after the upgraded configuration was loaded for the first time:
 
[root@bigip2:INOPERATIVE] config # BIGpipe unknown operation error: 01070752:3: Virtual server vs_0_0_0_0_22 (forwarding type) has an incompatible profile.
 
By reviewing the /config/bigip.conf file, I found that my forwarding virtual servers had a TCP profile applied:
 
virtual vs_0_0_0_0_22 {
 destination any:22
 ip forward
 ip protocol tcp
 translate service disable
 profile custom_tcp
}
 
Apparently v9 did not care about this, but v10 would not load until I manually removed these TCP profile references from all of my forwarding virtual servers.
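
A scan along the following lines may help locate every affected virtual server before the first load attempt. This is a minimal sketch, not an F5 tool: find_forwarding_profiles is a hypothetical helper name, and the parsing assumes each virtual is written as in the excerpt above (one attribute per line, no nested braces inside the virtual).

```shell
# Sketch: list v9-style forwarding virtual servers that still reference a profile.
# Assumption: each virtual looks like "virtual NAME {" ... "}" with one
# attribute per line and no nested braces, as in the excerpt above.
find_forwarding_profiles() {
    awk '
        # Start of a virtual: remember its name and reset the flags.
        $1 == "virtual" && $NF == "{" { name = $2; fwd = 0; prof = ""; next }
        # "ip forward" marks a forwarding virtual.
        name != "" && $1 == "ip" && $2 == "forward" { fwd = 1 }
        # Collect every profile named on a "profile" line.
        name != "" && $1 == "profile" { for (i = 2; i <= NF; i++) prof = prof " " $i }
        # End of the virtual: report it if it forwards and has a profile.
        name != "" && $1 == "}" {
            if (fwd && prof != "") print name ":" prof
            name = ""
        }
    ' "$@"
}
```

Running find_forwarding_profiles /config/bigip.conf would print one line per offending virtual server, in the form "name: profile". Treat the output as a list of candidates to review by hand rather than as a definitive answer.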
 
 
Problem #3 – BIGpipe parsing error
Then I encountered a second problem while attempting to load the configuration for the first time:
 
BIGpipe parsing error (/config/bigip.conf Line 6870): 012e0022:3: The requested value (x.x.x.x:3d-nfsd {) is invalid (show | <pool member list> | none) [add | delete]) for 'members' in 'pool'
 
While examining this error, I noticed that the port number had been translated into a service name – “3d-nfsd”. Fortunately, during my initial v10 research I had come across SOL11293 - The default /etc/services file in BIG-IP version 10.1.0 contains service names that may cause a configuration load failure. I had already added a step to my upgrade process to stop the LTM from translating port numbers into service names, but it was not scheduled until after the configuration had been successfully loaded on the new hardware. I had to move this step up in the overall process flow:
 
bigpipe cli service number
b save
 
The corresponding TMSH commands are:
 
tmsh modify cli global-settings service number
tmsh save / sys config
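
To check whether a saved configuration already contains service names where port numbers are expected, a quick scan like the one below may be useful. It is a minimal sketch under stated assumptions: find_named_ports is a hypothetical helper name, and the pattern assumes members are written as an IPv4 address, a colon, and a port.

```shell
# Sketch: print "line:address:service" for every IPv4 address whose port
# is written as a service name instead of a number.
# Assumption: members appear as IPv4-address:port in the file being scanned.
# The port part must contain at least one letter, so "10.1.1.2:8080" is
# ignored while "10.1.1.1:3d-nfsd" is reported.
find_named_ports() {
    grep -noE '([0-9]{1,3}\.){3}[0-9]{1,3}:[A-Za-z0-9_.-]*[A-Za-z][A-Za-z0-9_.-]*' "$@"
}
```

For example, find_named_ports /config/bigip.conf lists each match with its line number. No output (and a non-zero exit status from grep) means nothing matched the pattern.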
 
 
Problem #4 – Command is not valid in current event context
This was the final error we encountered when trying to load the upgraded configuration for the first time:
 
BIGpipe rule creation error: 01070151:3: Rule [www.mycompany.com] error: line 28: [command is not valid in current event context (HTTP_RESPONSE)] [HTTP::host]
 
When we reviewed the iRule, it was obvious that it contained a statement that made no sense, since there is no Host header in an HTTP response. Apparently this didn’t bother v9, but v10 rejected it:
 
when HTTP_RESPONSE {
 switch -glob [string tolower [HTTP::host]] {
    <do some stuff>
 }
}
 
We simply removed that event from the iRule.
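
With many iRules, it may be worth searching for the same mistake ahead of time. The following is a rough sketch only: find_host_in_response is a hypothetical helper name, and the brace counting is naive (it assumes the opening brace is on the same line as "when HTTP_RESPONSE" and that braces do not appear in comments or strings).

```shell
# Sketch: report lines that use HTTP::host inside a "when HTTP_RESPONSE" block.
# Assumption: braces are balanced and never appear inside comments or strings.
find_host_in_response() {
    awk '
        # Return the number of opening braces minus closing braces on a line.
        function brace_delta(s,   t, o, c) {
            t = s; o = gsub(/[{]/, "", t)
            t = s; c = gsub(/[}]/, "", t)
            return o - c
        }
        # Entering an HTTP_RESPONSE event block.
        /when[ \t]+HTTP_RESPONSE/ && depth == 0 { in_resp = 1 }
        in_resp {
            if ($0 ~ /HTTP::host/) printf "%s:%d: %s\n", FILENAME, FNR, $0
            depth += brace_delta($0)
            # Depth back at zero means the event block has closed.
            if (depth <= 0) { in_resp = 0; depth = 0 }
        }
    ' "$@"
}
```

Called as find_host_in_response /config/bigip.conf, it prints the file name, line number, and text of each suspect line. Other request-only commands could be checked the same way by changing the pattern.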
 
 
Problem #5 – Failed Log Rotation
After I finished my first migration, I found that none of the logs in the /var/log directory were being rotated. The /var/log/secure log file held the best clue about the underlying issue:
 
warning crond[7634]: Deprecated pam_stack module called from service "crond"
 
I had to open a case with F5, who found that the PAM crond configuration file (/config/bigip/auth/pam.d/crond) had been pulled from the old unit:
 
#
# The PAM configuration file for the cron daemon
#
#
auth    sufficient      pam_rootok.so
auth    required        pam_stack.so service=system-auth
auth    required        pam_env.so
account required        pam_stack.so service=system-auth
session required        pam_limits.so
#session        optional        pam_krb5.so
 
I had to update the file from a clean unit (which I was fortunate enough to have at my disposal):
 
#
# The PAM configuration file for the cron daemon
#
#
auth       sufficient pam_rootok.so
auth       required   pam_env.so
auth       include    system-auth
account    required   pam_access.so
account    sufficient pam_permit.so
account    include    system-auth
session    required   pam_loginuid.so
session    include    system-auth
 
and restart crond:
 
bigstart restart crond
 
or in the v10 world:
 
tmsh restart sys service crond
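
Since other PAM service files may have been carried over from the old unit in the same way, a quick check for remaining pam_stack references might be worthwhile. This is a minimal sketch; find_pam_stack is a hypothetical helper name, and which files to scan is left to the reader.

```shell
# Sketch: report lines in PAM service files that still call the deprecated
# pam_stack module. Lines starting with "#" are skipped so a commented-out
# reference is not flagged.
find_pam_stack() {
    grep -nE '^[^#]*pam_stack\.so' "$@"
}
```

For the file discussed above, the call would be find_pam_stack /config/bigip/auth/pam.d/crond. Any output means the file still uses pam_stack and is worth comparing against a clean unit.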


Problem #6 – LTM/GTM SSL Communication Failure

This particular issue is the sole reason that my most recent migration took 10 hours instead of four. Even if you have a GTM, you are not likely to encounter it, since it was a result of our own configuration, but I am including it because it isn’t something you’ll see documented by F5.

One of the steps in my migration plan was to validate LTM/GTM communication with iqdump. When I got to that point, I found that iqdump was failing in both directions because of SSL certificate verification, despite my having installed the new Trusted Server Certificate on the GTM and the Trusted Device Certificates on both the LTM and GTM. After several hours of troubleshooting, I ran a tcpdump to see what was happening on the wire. I didn’t notice it at first, but when I looked at the trace again later, I saw that the hostname on the certificate the LTM was presenting was not correct. It was a small detail that could easily have been missed, but it was the key to identifying the root cause.
 
Having dealt with Device Certificates in the past, I knew that the Device Certificate file was /config/httpd/conf/ssl.crt/server.crt. When I looked in that directory, I found a number of certificates (and corresponding private keys in /config/httpd/conf/ssl.key) that should not have been there. I also found that these certificates and keys had been pulled from the configuration on the old hardware. So I removed the extraneous certificates and keys from these directories and restarted the httpd service (“bigstart restart httpd”, or “tmsh restart sys service httpd”). After I did that, the LTM presented the correct Device Certificate and LTM/GTM communication was restored. I'm still not sure to this day how those certificates got there in the first place...
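
A check like the one below could be added to a migration plan to catch stray files in these directories early. It is a minimal sketch: list_extra_certs is a hypothetical helper name, and it assumes that server.crt is the only file expected in the certificate directory, which may not hold on every system.

```shell
# Sketch: list files in a certificate or key directory other than the
# expected file name (server.crt unless a second argument is given).
# Assumption: anything else in the directory is a candidate for review.
list_extra_certs() {
    dir=$1
    expected=${2:-server.crt}
    for f in "$dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        [ "$(basename "$f")" = "$expected" ] || printf '%s\n' "$f"
    done
}
```

For example, list_extra_certs /config/httpd/conf/ssl.crt prints every file other than server.crt. For the key directory, pass the expected key file name as the second argument (server.key is an assumption here, since the article does not name it). To see which hostname a given certificate carries, "openssl x509 -noout -subject -in FILE" prints its subject.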



COMMENTS

posted @ Tuesday, June 29, 2010 10:20 AM by jpedley   

Great work! Another thing to worry about with iRules is datagroup naming. In 9.4.X, I have datagroups such as ::dg_name. This works fine in 9.4 but fails in 10.2. The fix is to add the $. You can also do this pre-migration, but be warned: if your datagroup name has a hyphen in it, 9.4 will truncate the variable name, and since the result doesn't match a valid datagroup, the iRule will abort.

Clients really don't like to see TCP resets...