I recently identified an issue with crawling anonymous (non-SharePoint) web sites with SharePoint 2010, where the crawl would always fail with the error “Access Denied”. I did, however, have one SharePoint 2010 farm that could crawl anonymous web sites without any issue. The task quickly became identifying the differences between the farms.
Using Fiddler, I was able to intercept all of the HTTP traffic from the SharePoint search crawler and inspect the outgoing and incoming headers. I immediately noticed that on the farm that was having issues, the crawler was attempting to pass an Authorization header during the first connection attempt to the web site. To me this was a huge red flag and indicated a problem. To ensure that I was not seeing something normal (which I was sure it wasn’t), I ran Fiddler against the other farm that was crawling correctly. On that farm I did not see the Authorization header during any of the connections.
This made sense. HTTP connections usually start as anonymous, and if authentication is required the web server responds with a challenge (typically a 401 status with a WWW-Authenticate header). It is then up to the client application (in this case the crawler) to respond with an Authorization header containing data appropriate for the web server. For some reason, my one SharePoint 2010 farm was trying to force the web site to accept an Authorization header immediately. The web site had no understanding of the data and responded with “Access Denied”.
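The check described above, looking at whether the very first request of a connection already carries an Authorization header, can be sketched in Python. This is only an illustration: the `Exchange` structure, the `classify_first_exchange` function, the host name, and the sample data are all hypothetical and are not taken from the actual Fiddler captures. The 401 status on the problem sample is also an assumption, since all that was reported was “Access Denied”.

```python
from dataclasses import dataclass, field


@dataclass
class Exchange:
    """One captured request/response pair (hypothetical structure)."""
    request_headers: dict
    status: int
    response_headers: dict = field(default_factory=dict)


def has_header(headers, name):
    """Look up a header by name; HTTP header names are case-insensitive."""
    return any(key.lower() == name.lower() for key in headers)


def classify_first_exchange(exchanges):
    """Describe how a captured connection opened."""
    if not exchanges:
        return "no traffic captured"
    first = exchanges[0]
    # Credentials on the very first request: the behavior seen on the problem farm.
    if has_header(first.request_headers, "Authorization"):
        return "preemptive: credentials sent before any challenge"
    # Anonymous request answered with a challenge: the usual handshake.
    if first.status == 401 and has_header(first.response_headers, "WWW-Authenticate"):
        return "normal: anonymous request, then server challenge"
    return "anonymous: no authentication requested"


# Made-up sample data standing in for two captured sessions.
healthy = [Exchange({"Host": "example.test"}, 200)]
problem = [
    # Status code assumed; the token value is a placeholder.
    Exchange({"Host": "example.test", "Authorization": "<opaque token>"}, 401)
]

print(classify_first_exchange(healthy))
print(classify_first_exchange(problem))
```

Running the same classification over captures from a working farm and a failing farm would make the difference in the first request easy to spot.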
The only difference I could identify between the two farms was the operating system. The one that was working was built on Windows Server 2008 and the problem farm was built on Windows Server 2008 R2. Since Windows Server 2008 R2 and Windows 7 are built on the same core code, I decided to do some testing on a couple of Windows 7 SharePoint development boxes I had. Both of them exhibited the exact same problem as the Windows Server 2008 R2 farm.
I contacted Microsoft support and provided them with Fiddler logs and all of the data I collected during my troubleshooting. Within 24 hours they were able to reproduce the exact same issues I was seeing. Today I received a call back from Microsoft and they informed me that this issue is not a bug in SharePoint 2010 but instead is a problem with the actual operating system. I was told to expect a call back within 2 weeks and hopefully by then they will have a hotfix to resolve this issue. Once I have the hotfix and can validate that it is working I will update this post.
On a side note, according to Microsoft I was the first to identify this issue. Maybe I can get the hotfix named after me. :)
July 21 Update
After several conversations with Microsoft, it appears that the issue is related to security changes made in Windows Server 2008 R2 and Windows 7. Microsoft is still determining what the long-term resolution will be; however, I do have a temporary workaround. The issue only seems to affect web servers running IIS that are configured with both Windows Integrated Authentication and Anonymous Authentication. If you disable Windows Integrated Authentication, the crawl will complete successfully. If I get any additional updates I will post them here.
August 1 Update
Late last week I received a call from Microsoft support, and they are still trying to determine how they are going to proceed. The change in behavior in Windows Server 2008 R2 and Windows 7 was intentional; however, they are now seeing how it impacted SharePoint’s ability to search. I was told that the Microsoft teams responsible for Windows Server and Windows 7 need to decide if they are going to fix the issue. Any changes they make to resolve this one specific issue could end up causing a whole mess of other problems. Stay tuned here for any future updates regarding this issue.
There are a couple of options to resolve the issue:
- Turn off Windows Integrated Authentication on the site that you are trying to crawl.
- Install the August (or later) cumulative update for SharePoint 2010.
If you continue to experience issues after installing the cumulative update, it is recommended that you contact Microsoft Support and open a case to track and resolve your specific issue.