I want to create a single robots.txt file and have it served for all sites on my IIS (7 in this case) instance.
I do not want to have to configure anything on any individual site.
How can I do this?
An alternative to the robots.txt file is the X-Robots-Tag HTTP header, as detailed here:
http://googleblog.blogspot.com/2007/07/robots-exclusion-protocol-now-with-even.html
It can be applied server-wide on IIS by adding a custom HTTP header:
IIS 6: right-click on the "Web Sites" folder > Properties > HTTP Headers
IIS 7: on the server home screen, open "HTTP Response Headers" and choose "Add"
Unlike robots.txt, this header appears to be proprietary to Google; and like robots.txt, it is only effective against "compliant" search engine indexers.
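If you manage configuration in files rather than through the UI, the same header can be set in web.config (or in applicationHost.config to cover the whole server). A sketch — the noindex, nofollow value is just one choice of X-Robots-Tag directives, pick whatever suits your sites:

```
<configuration>
  <system.webServer>
    <httpProtocol>
      <customHeaders>
        <!-- Sent with every response; compliant crawlers will not index or follow -->
        <add name="X-Robots-Tag" value="noindex, nofollow" />
      </customHeaders>
    </httpProtocol>
  </system.webServer>
</configuration>
```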
It can be done using the Url Rewrite module for IIS.
Create these folders:
\Inetpub\wwwroot\allsites
\Inetpub\wwwroot\site1
\Inetpub\wwwroot\site2
Create two websites using the site# paths above. Inside each website, create a virtual directory called allsites pointing to \Inetpub\wwwroot\allsites.
Next, create these files. Give each unique content so you can verify this is working during testing:
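If you would rather script this than click through IIS Manager, the sites and virtual directories can also be created with appcmd (run from %windir%\system32\inetsrv). A sketch for site1 — the host name binding is a placeholder you would replace with your own:

```
appcmd add site /name:site1 /physicalPath:C:\Inetpub\wwwroot\site1 /bindings:http/*:80:site1.example.com
appcmd add vdir /app.name:"site1/" /path:/allsites /physicalPath:C:\Inetpub\wwwroot\allsites
```

Repeat the pair of commands for site2 (and any further sites).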
\Inetpub\wwwroot\allsites\robots.txt
\Inetpub\wwwroot\site2\robots.txt
Install the Url Rewrite module for IIS if you have not done so already.
Place this in the web.config of each website:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <clear />
        <rule name="Rewrite robots.txt">
          <match url="^robots\.txt$" />
          <conditions logicalGrouping="MatchAll" trackAllCaptures="false">
            <add input="{REQUEST_FILENAME}" matchType="IsFile" negate="true" />
          </conditions>
          <action type="Rewrite" url="/allsites/robots.txt" />
        </rule>
      </rules>
    </rewrite>
    <directoryBrowse enabled="true" />
  </system.webServer>
</configuration>
This rule matches a URL such as http://mysite/robots.txt and rewrites it to serve http://mysite/allsites/robots.txt instead. However, it will ONLY do this if a robots.txt file doesn't already exist on the filesystem at that location.
So you can put a common robots.txt in allsites, but override it for any site you want by placing a custom robots.txt in that site's root.
This is not a redirect; the remote web crawler will have no idea that IIS is doing this behind the scenes.
Update:
I haven't done this on my configuration, but the Url Rewrite module does support global rules which can be defined at the server level. So you would not need to define this for each site.
http://learn.iis.net/page.aspx/460/using-the-url-rewrite-module/
"Global and distributed rewrite rules. URL Rewrite uses global rules to define server-wide URL rewriting logic. These rules are defined within the applicationHost.config file, and they supercede rules configured at lower levels in the configuration hierarchy. The module also uses distributed rules to define URL rewrite logic specific to a particular configuration scope. This type of rule can be defined on any configuration level by using Web.config files."
Can you use symbolic links? Would that work?
http://www.howtogeek.com/howto/windows-vista/using-symlinks-in-windows-vista/
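For what it's worth, on Vista/Server 2008 and later this might look like the following, run from an elevated command prompt. Note it still requires one link per site, so each site's root folder gets touched once:

```
mklink C:\Inetpub\wwwroot\site1\robots.txt C:\Inetpub\wwwroot\allsites\robots.txt
mklink C:\Inetpub\wwwroot\site2\robots.txt C:\Inetpub\wwwroot\allsites\robots.txt
```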
Unfortunately, because the robots.txt file must be in the root of the site, there's no simple way I can think of to do what you want. If it were something one directory down, you could configure a virtual directory in each site, but that just isn't applicable to the robots.txt file.
Therefore, short of writing an app/service that xcopies a robots.txt file into each site on a periodic basis, you could configure a rewrite rule in each site that rewrites (not redirects) the ~/robots.txt request to serve a file from a virtual directory, or possibly a different URL altogether.
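The periodic-copy approach amounts to a simple loop. A sketch, shown with cp and stand-in directories under a temp root so it is self-contained; a real deployment would use xcopy or robocopy against the real site roots under \Inetpub\wwwroot from a scheduled task, and the site list here is an assumption:

```shell
# Stand-in directories in place of \Inetpub\wwwroot\...
root=$(mktemp -d)
mkdir -p "$root/allsites" "$root/site1" "$root/site2"
printf 'User-agent: *\nDisallow:\n' > "$root/allsites/robots.txt"

# Copy the common robots.txt into each site's root.
for site in "$root/site1" "$root/site2"; do
  cp "$root/allsites/robots.txt" "$site/robots.txt"
done

cat "$root/site1/robots.txt"
```

Any site that needs a custom robots.txt would simply be left out of the list.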