Crawler Issue with SharePoint 2010 and Windows Server 2008 R2

I recently identified an issue crawling anonymous (non-SharePoint) web sites with SharePoint 2010: the crawl would always fail with the error “Access Denied”. I did, however, have one SharePoint 2010 farm that could crawl anonymous web sites without any issue. The task quickly became identifying the differences between the farms.

Using Fiddler, I was able to intercept all of the HTTP traffic from the SharePoint search crawler and inspect the outgoing and incoming headers. I immediately noticed that on the farm that was having issues, the crawler was attempting to pass an Authorization header during the first connection attempt to the web site. To me this was a huge red flag and indicated a problem. To ensure that I was not seeing something normal (which I was sure it wasn’t), I used Fiddler against the other farm that was crawling correctly. On that farm I did not see the Authorization header during any of the connections.

This made sense. HTTP connections usually start as anonymous, and if authentication is required the web server provides an appropriate response (a 401 challenge). It is then up to the client application (in this case the crawler) to respond with an Authorization header containing data appropriate for the web server. For some reason, my one SharePoint 2010 farm was trying to force the web site to take an Authorization header immediately. The web site had no understanding of the data and responded with “Access Denied”.
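The normal exchange described above can be sketched in a few lines of Python. This is only an illustration of the general HTTP challenge/response rule, not the crawler's actual code; the function name `should_send_authorization` is made up for this example.

```python
from typing import Mapping


def should_send_authorization(status: int, response_headers: Mapping[str, str]) -> bool:
    """Return True only when the server has actually asked for credentials.

    A well-behaved client makes its first request anonymously and adds an
    Authorization header only after a 401 response that carries a
    WWW-Authenticate challenge.
    """
    if status != 401:
        return False
    # Header names are case-insensitive, so normalise before checking.
    header_names = {name.lower() for name in response_headers}
    return "www-authenticate" in header_names
```

On the working farm the crawler behaved as if this check returned False for the anonymous site (the site answered 200, so no credentials were sent). On the problem farm the crawler skipped the check entirely and sent the header on the first request.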

The only difference I could identify between the two farms was the operating system. The one that was working was built on Windows Server 2008 and the problem farm was built on Windows Server 2008 R2. Since Windows Server 2008 R2 and Windows 7 are built on the same core code, I decided to do some testing on a couple of Windows 7 SharePoint development boxes I had. Both of them exhibited the exact same problem as the Windows Server 2008 R2 farm.

I contacted Microsoft support and provided them with Fiddler logs and all of the data I collected during my troubleshooting. Within 24 hours they were able to reproduce the exact same issue I was seeing. Today I received a call back from Microsoft and they informed me that this issue is not a bug in SharePoint 2010 but instead a problem with the operating system itself. I was told to expect a call back within 2 weeks, and hopefully by then they will have a hotfix to resolve this issue. Once I have the hotfix and can validate that it works, I will update this post.

On a side note, according to Microsoft I was the first to identify this issue. Maybe I can get the hotfix named after me. :)

July 21 Update
After several conversations with Microsoft, it appears that the issue is related to security changes made in Windows Server 2008 R2 and Windows 7. Microsoft is still determining what the long-term resolution will be; however, I do have a temporary workaround. The issue only seems to affect web servers running IIS that are configured with both Windows Integrated Authentication and Anonymous Authentication. If you disable Windows Integrated Authentication, the crawl will complete successfully. If I get any additional updates I will post them here.
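To check whether a particular site reacts this way, one option is to request the same URL twice, once anonymously and once with an unsolicited Authorization header, and compare the status codes. The sketch below is a rough probe under stated assumptions: the header value is a placeholder (this post does not record what the crawler actually sent), and the function names are invented for the example.

```python
import urllib.error
import urllib.request
from typing import Optional

# Placeholder token (assumption): not the value the crawler really sends.
PLACEHOLDER_AUTH = "NTLM AAAA"


def fetch_status(url: str, authorization: Optional[str] = None) -> int:
    """Return the HTTP status for a GET, optionally sending Authorization up front."""
    request = urllib.request.Request(url)
    if authorization is not None:
        request.add_header("Authorization", authorization)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the status code is still available.
        return error.code


def describe(anonymous_status: int, preauth_status: int) -> str:
    """Summarise the two probes in the terms used in this post."""
    if anonymous_status == 200 and preauth_status == 401:
        return "anonymous OK, unsolicited Authorization rejected (matches the symptom)"
    if anonymous_status == 200 and preauth_status == 200:
        return "anonymous OK, unsolicited Authorization ignored"
    return "inconclusive: anonymous=%d, preauth=%d" % (anonymous_status, preauth_status)
```

Calling `describe(fetch_status(url), fetch_status(url, PLACEHOLDER_AUTH))` with the address of the site being crawled gives a quick before-and-after comparison when Windows Integrated Authentication is toggled in IIS. A real crawler request may differ from this probe, so treat the result as a hint rather than proof.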

August 1 Update
Late last week I received a call from Microsoft support, and they are still trying to determine how they are going to proceed. The change in behavior in Windows Server 2008 R2 and Windows 7 was intentional; however, they are now seeing how it impacted SharePoint’s ability to search. I was told that the Microsoft teams responsible for Windows Server and Windows 7 need to decide whether they are going to fix the issue. Any changes they make to resolve this one specific issue could end up causing a whole mess of other problems. Stay tuned here for any future updates regarding this issue.

Final Update
There are a couple of options to resolve the issue:

  1. Turn off Windows Integrated Authentication on the site that you are trying to crawl.
  2. Install the August (or later) cumulative update for SharePoint 2010.

If you continue to experience issues after installing the cumulative update, it is recommended that you contact Microsoft Support and open a case to track and resolve your specific issue.

13 thoughts on “Crawler Issue with SharePoint 2010 and Windows Server 2008 R2”

    1. There isn’t a KB article yet, but I have been informed that as soon as a fix has been created and tested, a KB article will be posted.

  1. I have a SharePoint site with NT Authentication that indexes just fine. When I add client certificate authentication to the IIS site as optional, the crawler gets access denied client certificate required. 🙁

  2. I’m encouraged by your post, and I look forward to any updates. We’ve been burning out trying to get the crawl to work. The extra layers for us are that we use HTTPS and that we are using Federated STS for claims-based auth. Still, no Windows service account we’ve created is able to crawl the site, even though they can browse it.

    1. If you are using claims-based authentication, make sure you also have NTLM enabled on the web application or search crawling will also fail. SharePoint is a beast of a product with a lot of security dependencies. I am glad to see more blog posts and articles about SharePoint 2010 popping up that explain how people are doing more complex configurations.

  3. To work around the optional IIS client certificate authentication problem, I removed that type from IIS on one of the application servers. I then pointed the crawler to that application server using the crawler’s hosts file.

    Did Microsoft give up on the patch? Seems like they’ve been slipping ever since Ballmer took over.

  4. I had the same issue but was able to find a workaround.

    After I got the access denied error, I set up a crawl rule and tried different options in the “Specify Authentication” section, and it appears that selecting the last option, “Use cookie for crawling”, will prevent the authentication headers from being sent. To test this, I created a “cookie.txt” file with the word “test” in the file and selected this as the cookie to send for authentication. Now the full crawl of the anonymous public sites executes successfully. Hopefully this will buy us some time until an official hotfix from MS is released.

    hope this helps,
    sharepoint kb

  5. The tip from sharepoint kb didn’t help me!

    I got a “The filtering process has been terminated” message after 1 minute of crawling. Where must this cookie.txt file be placed?
