
I want to create a single robots.txt file and have it served for all sites on my IIS (7 in this case) instance.

I do not want to have to configure anything on any individual site.

How can I do this?

Tim Erickson

4 Answers


An alternative to the robots.txt file is the X-Robots-Tag HTTP header, as detailed here:

http://googleblog.blogspot.com/2007/07/robots-exclusion-protocol-now-with-even.html

This header can be applied server-wide in IIS by adding a custom HTTP response header:

IIS 6: right-click on the "Web Sites" folder > Properties > HTTP Headers

IIS 7: on the server home screen in IIS Manager, open "HTTP Response Headers" and choose "Add..."
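
For reference, a server-level custom header in IIS 7 ends up as a customHeaders entry in configuration; a rough sketch of what that section looks like (the noindex value is only an example, not a recommendation):

<system.webServer>
    <httpProtocol>
        <customHeaders>
            <!-- Example only: asks compliant crawlers not to index any response -->
            <add name="X-Robots-Tag" value="noindex" />
        </customHeaders>
    </httpProtocol>
</system.webServer>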

Unlike robots.txt, the X-Robots-Tag header appears to be proprietary to Google, and like robots.txt it is only useful against "compliant" search engine indexers.

Ben

It can be done using the Url Rewrite module for IIS.

Create these folders:

\Inetpub\wwwroot\allsites
\Inetpub\wwwroot\site1
\Inetpub\wwwroot\site2

Create two websites, one rooted at each site# path above. Inside each website, create a virtual directory called allsites pointing to \Inetpub\wwwroot\allsites
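
If you want to script the virtual directory creation (as noted in the comments below), a rough appcmd sketch, assuming the websites are actually named site1 and site2 and the folders are on the C: drive:

%windir%\system32\inetsrv\appcmd add vdir /app.name:"site1/" /path:/allsites /physicalPath:"C:\Inetpub\wwwroot\allsites"
%windir%\system32\inetsrv\appcmd add vdir /app.name:"site2/" /path:/allsites /physicalPath:"C:\Inetpub\wwwroot\allsites"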

Next, create these files. Each should have unique content to verify this is working during testing:

\Inetpub\wwwroot\allsites\robots.txt
\Inetpub\wwwroot\site2\robots.txt
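
For example, the two files only need contents that are distinguishable during testing; these directives are just placeholders:

# allsites\robots.txt (shared default)
User-agent: *
Disallow:

# site2\robots.txt (per-site override)
User-agent: *
Disallow: /private/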

Install the Url Rewrite module for IIS if you have not done so already.

Place this in the web.config of each website:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
        <rewrite>
            <rules>
                <clear />
                <rule name="Rewrite robots.txt">
                    <match url="^robots\.txt$" />
                    <conditions logicalGrouping="MatchAll" trackAllCaptures="false">
                        <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
                    </conditions>
                    <action type="Rewrite" url="/allsites/robots.txt" />
                </rule>
            </rules>
        </rewrite>
        <directoryBrowse enabled="true" />
    </system.webServer>
</configuration>

This rule matches a URL such as http://mysite/robots.txt and rewrites it to request http://mysite/allsites/robots.txt instead. However, it will ONLY do this if a robots.txt file doesn't exist on the filesystem at that location.

So you can put a common robots.txt in allsites, but override it for any site you want by placing a custom robots.txt in that website's root.

This is not a redirect. The remote web crawler will have no idea that IIS is doing this behind the scenes.

Update:

I haven't done this on my configuration, but the Url Rewrite module does support global rules which can be defined at the server level. So you would not need to define this for each site.

http://learn.iis.net/page.aspx/460/using-the-url-rewrite-module/

"Global and distributed rewrite rules. URL Rewrite uses global rules to define server-wide URL rewriting logic. These rules are defined within the applicationHost.config file, and they supercede rules configured at lower levels in the configuration hierarchy. The module also uses distributed rules to define URL rewrite logic specific to a particular configuration scope. This type of rule can be defined on any configuration level by using Web.config files."

  • The directoryBrowse element is optional... just easier to see what's happening during testing. – Casey Plummer Jul 29 '10 at 21:55
  • This is about the best I expected could be done, but still requires configuration changes on each website. I may have to go that route, but still would hope not to. – Tim Erickson Jul 29 '10 at 22:49
  • Added update about global rules, and you could probably script the virtual directory creation. – Casey Plummer Jul 30 '10 at 01:07
  • Wow - I didn't know you could do this with the URL Rewrite module. Nice answer :) – Pure.Krome Jul 30 '10 at 13:39
  • I have accepted this even though I have not tested its implementation as yet. It sounds right, though, and I had insufficient rep to simply upvote the answer. So unless and until it's proven otherwise, this is the right answer. Thanks, Casey! – Tim Erickson Jul 30 '10 at 17:23

Can you use symbolic links? Would that work?

http://www.howtogeek.com/howto/windows-vista/using-symlinks-in-windows-vista/
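
For reference, creating a per-site file symlink from an elevated command prompt would look something like this (paths are examples, and it would have to be repeated for each site):

mklink "C:\Inetpub\wwwroot\site1\robots.txt" "C:\Inetpub\wwwroot\allsites\robots.txt"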

Dale C. Anderson
  • No - that would require manual creation of a symlink for each existing and then each new added site - i.e. "configuration on an individual site". – Tim Erickson Jul 29 '10 at 15:57

Unfortunately, because the robots.txt file must be in the root of the site, there's no simple way I can think of to do what you want. If it were something one directory down, you could configure a virtual directory in each site, but that just isn't applicable to the robots.txt file.

Therefore, short of writing an app/service that xcopies a robots.txt file into each site on a periodic basis, you could configure a rewrite rule in each site that rewrites (not redirects) the ~/robots.txt request to serve a file from a virtual directory, or possibly a different URL altogether.

Ted