This'll be a long post so get a cup of coffee or whatever your poison of choice is.
Sooooo, from all my digging around, Python-based external monitors seem to be something that nobody talks about. Either nobody has gotten them right, or people keep the solution to themselves, or they have realised the futility of Python for advanced external monitors.
I am unfortunately pretty stubborn when faced with an interesting challenge, so like a dog with a bone I persisted until I got everything working except the actual F5 monitoring part. Bear in mind I am no coder: I'm OK at Python, I know F5s relatively well, and this is the first external monitor I have written.
The shyte part is that there is, as far as I have seen, no information on using python for external monitors other than a couple of mentions here and there saying that it is possible.
As many of you know, F5's current RADIUS monitor will only mark a member up after a successful login against a user account. It has no means of establishing a connection and then simply testing for ANY valid RADIUS response, regardless of whether the account supplied is valid or not.
This effectively shuts out the use of the F5 Radius monitor for Two Factor Auth systems where security is stringent enough to disallow the use of a single factor auth account with a fixed password for monitoring purposes.
To try to solve this I wrote a Python script that will establish a connection to a RADIUS server regardless of the number of auth factors it requires, and then fire off a bogus (or not) authentication attempt to the RADIUS server.
It's pretty crude, but as long as the connection doesn't time out there will be some sort of response from the radius server which will result in the script writing to stdout so that the F5 will mark the member as up. If there is a timeout the script's exit code is zero with no output so that the member will be marked as down.
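To make that logic concrete, here is a minimal sketch of the "any valid reply means up" idea using only the standard library. It is not the script from my code share (that one uses the radius module); the function names and the default "monitor"/"bogus" credentials are made up for illustration, and it is written for Python 3, which may not match the interpreter on your BIG-IP. The packet layout and password hiding follow RFC 2865.

```python
import hashlib
import os
import socket
import struct


def encrypt_password(password, secret, authenticator):
    """Hide a User-Password value as described in RFC 2865 section 5.2."""
    # Pad with NULs to a multiple of 16 bytes (at least one block).
    padded = password + b"\x00" * (-len(password) % 16) or b"\x00" * 16
    result, previous = b"", authenticator
    for i in range(0, len(padded), 16):
        digest = hashlib.md5(secret + previous).digest()
        block = bytes(a ^ b for a, b in zip(padded[i:i + 16], digest))
        result += block
        previous = block  # each block is chained on the previous ciphertext
    return result


def build_access_request(user, password, secret, identifier=None):
    """Build a bare-bones Access-Request (code 1) with User-Name and User-Password."""
    if identifier is None:
        identifier = os.urandom(1)[0]
    authenticator = os.urandom(16)
    attrs = b""
    for attr_type, value in ((1, user),
                             (2, encrypt_password(password, secret, authenticator))):
        attrs += struct.pack("!BB", attr_type, len(value) + 2) + value
    header = struct.pack("!BBH", 1, identifier, 20 + len(attrs))
    return header + authenticator + attrs


def probe(host, port, secret, user=b"monitor", password=b"bogus", timeout=3.0):
    """Return True if the server sends back any recognisable RADIUS reply."""
    packet = build_access_request(user, password, secret)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (host, port))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # includes socket.timeout on Python 3
            return False
    # 2 = Access-Accept, 3 = Access-Reject, 11 = Access-Challenge.
    # The reply must also echo the identifier we sent.
    return len(reply) >= 20 and reply[0] in (2, 3, 11) and reply[1] == packet[1]


def monitor_output(is_up):
    """Text to write to stdout: something for up, nothing at all for down."""
    return "UP\n" if is_up else ""
```

In the monitor itself the last step would be something like sys.stdout.write(monitor_output(probe(ip, port, secret))). Note that an Access-Reject counts as "up" here on purpose, since the whole point is to avoid needing a working single-factor account.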
After importing the script it works like a charm when run from /config/filestore/files_d/Common_d/external_monitor_d/ with the various options. It happily handles the ::ffff: prefix on the IPv4 addresses and, for now, strips the routing domain tag in case it's appended to the address. But when it is executed as part of a monitor it fails.
For the life of me I can't figure out why, nor have I figured out how to do a detailed debug of the script and of the parameters the F5 hands to it at execution. So this is where you guys and gals come in and hopefully you can help.
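For anyone wondering what the address clean-up looks like, this is roughly the idea, as a sketch rather than the exact code in my script. The function name normalize_node_ip is made up, and I'm assuming the routing domain tag shows up as a %N suffix on the address.

```python
import ipaddress


def normalize_node_ip(raw):
    """Turn the address handed over by the F5 into a plain IP string."""
    # Assumption: a route domain, if present, is appended as "%<id>".
    addr = raw.strip().split("%", 1)[0]
    ip = ipaddress.ip_address(addr)
    # "::ffff:10.1.2.3" is an IPv4-mapped IPv6 address; unwrap it.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return str(ip.ipv4_mapped)
    return str(ip)
```

Going through the ipaddress module instead of slicing the string means a genuine IPv6 member address passes through untouched and garbage input raises a ValueError instead of ending up in a socket call.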
The issues I have are as follows:
I don't know if I'm invoking the python shebang correctly for F5.
Unable to debug the script and /var/log/monitors/Common_radius_monitor.log doesn't have anything relating to script runtime errors.
In spite of using a syslog handler to try to write to syslog-ng, I'm unable to output to /var/log/ltm, so there isn't anything useful in there (for now).
This entire thing might be a complete non-starter if the F5 is super strict about code execution and is shutting out my use of the six and radius modules stored in /config/eav/, but without a run-time debug I can't tell what its problem is.
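For reference, the syslog attempt is along these lines. Treat it as a sketch: the logger name is made up, and the /dev/log socket and the local0 facility are my assumptions about what syslog-ng on the box listens on and routes to /var/log/ltm, which I have not confirmed.

```python
import logging
import logging.handlers
import sys


def build_logger(name="radius_monitor", socket_path="/dev/log"):
    """Return a logger that writes to the local syslog socket if it can."""
    logger = logging.getLogger(name)
    if logger.handlers:  # already configured, do not stack handlers
        return logger
    logger.setLevel(logging.DEBUG)
    try:
        handler = logging.handlers.SysLogHandler(
            address=socket_path,
            # Assumption: local0 is the facility that lands in /var/log/ltm.
            facility=logging.handlers.SysLogHandler.LOG_LOCAL0,
        )
    except OSError:
        # Fall back to stderr, never stdout: anything written to stdout
        # would be read by the F5 as "member is up".
        handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

The one thing I am sure about is the fallback: debug output must not go to stdout, because any stdout output from the script marks the member up.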
Unfortunately because of the length of the code and this description I've had to add my script via code share:
External Radius Monitor using Python
I found part of the problem around the execution. I set up a raw text dump in the script and noticed that, because I had enabled monitor logging on my pool member in the test pool, the F5 was supplying two additional command line arguments that were breaking my code.
So for future awareness, if you enable monitor logging on the pool member you'll get two additional arguments, one stating "log", the second being the log file, which will be in /var/log/monitor/...
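The raw text dump is nothing clever. Something like the sketch below is enough to see exactly what the F5 hands to the script; the function name and whatever path you pass in (say, a file under /var/tmp) are placeholders, not anything F5-specific.

```python
import os
import sys
import time


def dump_invocation(path, argv=None, environ=None):
    """Append the arguments and environment of this run to a text file."""
    argv = sys.argv if argv is None else argv
    environ = os.environ if environ is None else environ
    with open(path, "a") as fh:
        fh.write("--- %s pid=%d ---\n"
                 % (time.strftime("%Y-%m-%d %H:%M:%S"), os.getpid()))
        for index, arg in enumerate(argv):
            fh.write("argv[%d]=%r\n" % (index, arg))
        for key in sorted(environ):
            fh.write("env %s=%r\n" % (key, environ[key]))
```

Call it as the very first thing in the script, before any argument parsing, so that you still get a dump when the parsing is what blows up. Remember to take it out again afterwards, since the environment may contain the shared secret.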
Aside from that I can see my code is working and connecting, but it is failing to generate an output that is acceptable to the F5. Once I have fixed this and built in some input sanitation I'll update my codeshare.
I think I have it all nailed down now. I switched over to using ENV variables entirely, cleaned up the script quite a bit and added additional capabilities and options.
I've updated the CodeShare and will update the notes on it and I'll update the answer field for this question tomorrow :)
This is not a non-starter: I got the script up and running as a working monitor. Using the environment variables turns out to be the cleanest way to hand information into the script at run time; however, I have added the ability to override the environment variables when running from the command line.
The remaining issue is that when the script runs as a monitor, valid credentials with a current dynamic token value result in a failed auth, while the exact same credentials run from the command line while the token is still valid result in a successful auth.
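The pattern for "environment first, command line wins" is roughly the sketch below. NODE_IP and NODE_PORT are, as far as I can tell, what the F5 sets for an external monitor; the RADIUS_* names are just example names you would define yourself in the monitor's Variables field, and the option names are made up for this illustration.

```python
import argparse
import os


def parse_settings(argv=None, environ=None):
    """Read settings from the environment, letting command line options override."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="RADIUS external monitor (sketch)")
    parser.add_argument("--ip", default=environ.get("NODE_IP"))
    parser.add_argument("--port", type=int,
                        default=int(environ.get("NODE_PORT", 1812)))
    # Hypothetical variable names, set by you on the monitor.
    parser.add_argument("--secret", default=environ.get("RADIUS_SECRET"))
    parser.add_argument("--user", default=environ.get("RADIUS_USER", "monitor"))
    parser.add_argument("--password", default=environ.get("RADIUS_PASSWORD", "bogus"))
    # parse_known_args ignores anything it does not recognise, so the
    # positional IP/port and the extra "log <file>" pair cannot break parsing.
    args, _ignored = parser.parse_known_args(argv)
    return args
```

From the shell you can then run the script with, for example, --ip 10.1.2.3 --port 1812 to test against a specific member, while the monitor run picks everything up from the environment.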
This happens whether one uses the Environment Variable or the Arguments field in the monitor. Either way the response is a valid radius response so it marks the pool members up as it should.
I feel that this problem is outside of the scope of this question and is likely due to some idiosyncrasy with the monitor variables/arguments so I'll close this query off, but if you have any comments or solutions feel free to post.