I recently had the opportunity to create a few custom BIG-IP health monitors for web sites hosted on a SharePoint 2010 farm.  The default HTTP monitor could not be used because the sites, as configured, require you to log in via NTLM.

Not having a default monitor to turn to in this situation, and having only tinkered with external monitors before, I began searching for a way to set up an external monitor that could log on to the SharePoint sites to perform the health check.  Naturally I turned to DevCentral and did a little digging around on the forums.  That is where I found a wonderful post by stp1978 that laid out the basics of what I needed to do.

I will try to write this post so that someone who has never set up an external monitor can follow along.  Who knows, there may also be someone out there looking for a way to monitor a SharePoint 2010 web site that uses NTLM.

The basic installation steps are:

1.  Prepare the script that will run.
2.  Create a service account so the BIG-IP can log on to the SharePoint farm.  This will be used by the monitor to log in to the various web sites.
3.  Copy the script over to your BIG-IP and change the permissions (0777) so that it can be executed.
4.  Log on to the BIG-IP GUI and create the external monitor.
5.  Apply the monitor to the pool.

If you are running a highly available pair in a sync group, it is ok to do this on the active unit and when you are done run a config sync.  This will copy the monitor and script over to the standby unit and you will be good to go if you have a failover event.  You don’t have to manually copy this over to the other unit.

The script (code supplied by stp1978)

#!/bin/sh
# This removes the IPv6/IPv4 compatibility prefix.  This has to be done because the LTM passes addresses in IPv6 format.
IP=`echo ${1} | sed 's/::ffff://'`
# The LTM passes the port of the pool member as the second argument.
PORT=${2}
PIDFILE="/var/run/`basename ${0}`.${IP}_${PORT}.pid"
# This will kill off the last instance of this monitor if it is hung and logs current PID
if [ -f $PIDFILE ]
then
    kill -9 `cat $PIDFILE` > /dev/null 2>&1
fi
echo "$$" > $PIDFILE
# This is the meat of the code, it is responsible for sending the request & checking for the expected response.
curl -fNs --ntlm -k -v --user '[email protected]:YourPassword' http://${IP}:${PORT}/_layouts/RecycleBin.aspx -H "Host: YourWebsite.com" | grep -i "deleted" > /dev/null 2>&1
# This part of the code will mark the node UP if the expected response was received.
if [ $? -eq 0 ]
then
    echo "UP"
fi
# Remove the PID file now that this run has finished.
rm -f $PIDFILE

The code above is commented well and explains what each step does, so I will not reiterate it here.  The parts that you will have to modify are, of course, your username, password and domain.  I created a service account in the domain and use it to log on to the site.  That way you don’t have to worry about the password expiring, and you can limit your security risk by giving the service account only enough access to reach the recycle bin on the SharePoint 2010 site in question.
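As a small illustration of the first few lines of the script, here is a sketch of how the arguments turn into the address and the PID file name.  It assumes the LTM calls the script with the pool member address as the first argument (in the ::ffff: form) and the port as the second.  The helper names strip_v4_prefix and pidfile_for, the monitor name myNTLMscript and the address 10.1.1.5 are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: strip the ::ffff: prefix the LTM puts in front of IPv4 addresses.
strip_v4_prefix() {
    echo "$1" | sed 's/::ffff://'
}

# Hypothetical helper: build the PID file path the same way the monitor script does.
# $1 = monitor script name, $2 = address as passed by the LTM, $3 = port
pidfile_for() {
    echo "/var/run/$1.$(strip_v4_prefix "$2")_$3.pid"
}

# Example values only; the real ones are supplied by the LTM at run time.
strip_v4_prefix "::ffff:10.1.1.5"
pidfile_for "myNTLMscript" "::ffff:10.1.1.5" 80
```

The same argument order is handy for testing by hand: running the monitor script from the BIG-IP command line with an address and a port as its two arguments should print UP when the login and the text check succeed, and nothing otherwise.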

You will also need to modify the URL string and the text that the BIG-IP searches for when it logs in and opens the page.  I thought it would be good to search for something simple that will likely never change.  In SharePoint 2010, your safest bet is probably to request RecycleBin.aspx and search for the word “deleted”.  That way it doesn’t matter what content gets changed or deleted on the site by the users; they can’t accidentally delete the recycle bin!
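The text check itself can be tried without a BIG-IP or a SharePoint server.  The sketch below feeds two made-up page bodies through the same case-insensitive search the monitor uses.  An external monitor marks the pool member up when the script writes something to standard output and down when it writes nothing, so the helper (check_body, a name invented for this example) prints UP only on a match.

```shell
#!/bin/sh
# Hypothetical helper: read a page body on stdin and print UP if it contains
# the expected text.  $1 = text to look for (case-insensitive, like grep -i
# in the monitor script).
check_body() {
    if grep -qi "$1"; then
        echo "UP"
    fi
}

# Made-up page bodies for illustration; a real check would pipe in curl output.
printf '%s\n' '<title>Recycle Bin</title> Items Deleted from this site' | check_body "deleted"
printf '%s\n' '<title>401 Unauthorized</title>' | check_body "deleted"
```

Only the first call prints UP; the second prints nothing, which is what the BIG-IP would treat as a failed check.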

A small suggestion at this point… I HIGHLY recommend that you use something like TextPad to edit the file.  Using WordPad can have unintended consequences and may even mess the file up so much that the monitor will not work correctly.  Also be sure not to put a file extension on the end of the file name, as it does not need one to work properly.
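One common way a Windows editor can damage a shell script is by saving it with Windows (CRLF) line endings.  That is only an assumption about what goes wrong here, but it is easy to check for.  The sketch below detects carriage returns in a file and strips them; has_crlf and strip_cr are helper names invented for this example, and the temporary file stands in for the monitor script.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file contains carriage return characters.
has_crlf() {
    grep -q "$(printf '\r')" "$1"
}

# Hypothetical helper: remove every carriage return from the file in place.
strip_cr() {
    tr -d '\r' < "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Demo on a throwaway file that was "saved" with Windows line endings.
demo=$(mktemp)
printf 'echo "UP"\r\n' > "$demo"
if has_crlf "$demo"; then
    echo "carriage returns found, stripping them"
    strip_cr "$demo"
fi
has_crlf "$demo" || echo "file is clean"
rm -f "$demo"
```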

Using a program like WinSCP, copy the script over to the BIG-IP into the /usr/bin/monitors folder.  Then right-click the file you just copied over and click Properties.  Edit the permissions on the file to allow root to execute it.  I just set the permissions on the file to 0777, as seen in the screenshot below.
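If you would rather set the permissions from a shell than from the WinSCP properties dialog, chmod does the same job.  The sketch below uses a temporary stand-in file so that it can be run anywhere; on the BIG-IP the target would be the script you copied into /usr/bin/monitors.

```shell
#!/bin/sh
# Stand-in file for illustration only; on the BIG-IP this would be the
# monitor script copied into /usr/bin/monitors.
script=$(mktemp)
printf '#!/bin/sh\necho "UP"\n' > "$script"

# Same mode as in the post: read, write and execute for everyone.
chmod 0777 "$script"

# Show the first ten characters of the listing, which is the permission string.
perms=$(ls -l "$script" | cut -c1-10)
echo "$perms"

rm -f "$script"
```

The listing should show -rwxrwxrwx.  Mode 0777 also lets every user modify the file; 0755 would still let root execute it if you prefer tighter permissions.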

Then log on to the BIG-IP GUI and create a new monitor.  Click create new monitor, select external monitor from the drop-down menu, give it a name, and then in the “External Program” field type the name of the file you copied over.  You don’t need to include the directory or a file extension, just the name.  Adjust the timing settings to your preference; I use 10/32 (interval/timeout), as seen in the screenshot below:

Then go and apply the monitor to your pool.  That’s it!  Now you have a fully functional external monitor that can check the health of your NTLM SharePoint 2010 web sites.

Thanks again to stp1978 for his hard work on this and for putting it out there in the community for others to utilize.


8 comments so far

  1. You have a flaw in this code: the IP address is first assigned correctly, then reassigned incorrectly. I’ve updated this code to be more generic and to resolve this issue. I will have this posted on my site shortly. Thanks for your excellent site, you are a wealth of information!

  2. Thanks for the heads up, Zachary. I am using this code in production, though, and have not had any real problems with it yet. Would you mind e-mailing me and letting me know which part in particular is not working for you?

  3. Sorry, I realized I never rounded back with you about this.

    The issue is that what you posted has the IP variable being assigned twice:
    IP=`echo ${1} | sed 's/::ffff://'`

    I updated the code and reposted it (with a link back to your site of course). I expanded upon it a bit as well, http://www.the-little-things.net/blog/2011/01/21/big-ip-sharepoint-2010-monitor/

    Thanks again for the initial script.

  4. It doesn’t work on version 9.3.0. Please help if you find any bugs.

  5. Thank you for this. It works perfectly across all our NTLM authenticated servers. Being able to run the curl command from the F5 to see the expected returned text helps. The above code worked for me, but I could not get Zachary’s code to work.

  6. Hello,
    I’m on v10.2.4 and use this script.
    For a lot of the pools monitored by this script, I very often see this in the logs:
    echo “EAV myNTLMscript: exceeded monitor interval, needed to kill myip:myport with PID xxx”
    Is this normal?
    I don’t really understand what it corresponds to or whether it is a problem. The monitor seems to be functional, but I have a lot of these log entries. Could you explain it to me?

  7. Maverick, I have not seen this issue myself, but I will certainly follow up and see if I can figure something out about this.

  8. Thank you for your answer. As I understand it, the first monitoring test is sent and then a second one is sent before we have received the answer to the first, which is why we get this log message.
    Correct me if it is something else.