I want to prevent web crawlers from using an Apache site that is configured to forward all requests through ProxyPass. I have tried the BrowserMatchNoCase directive to set an environment variable block_spider, but when I change my user agent in my web browser to masquerade as a search bot, I am still allowed access to the site.

BrowserMatchNoCase "^bingbot" block_spider
BrowserMatchNoCase "^msnbot" block_spider
    <Proxy *>
      Order deny,allow
      Deny from env=block_spider
      Allow from all
    </Proxy>
RewriteEngine On
RewriteOptions Inherit

1 Answer


Well, this is embarrassing. I kept thinking that Order deny,allow meant the list was evaluated like an ACL or firewall rule, when it really was not. With that ordering the Allow directives are evaluated last, so "Allow from all" overrode everything I denied. Switching to Order Allow,Deny makes the Deny directives take precedence, so matching bots are actually blocked. The correct config is this:

    BrowserMatchNoCase "^bingbot" block_spider
    BrowserMatchNoCase "^msnbot" block_spider
    <Proxy *>
      Order Allow,Deny
      Allow from all
      Deny from env=block_spider
    </Proxy>
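
As a side note, Order, Allow, and Deny are the Apache 2.2 access-control directives; on Apache 2.4 the same idea is expressed with Require. Here is a minimal sketch of an equivalent 2.4 config, assuming mod_authz_core and mod_authz_host are loaded and the same BrowserMatchNoCase lines set block_spider:

    BrowserMatchNoCase "^bingbot" block_spider
    BrowserMatchNoCase "^msnbot" block_spider
    <Proxy "*">
      <RequireAll>
        # Allow everyone, then subtract requests where block_spider is set
        Require all granted
        Require not env block_spider
      </RequireAll>
    </Proxy>

Either way, you can verify the block with a spoofed user agent, e.g. curl -A "bingbot" http://your-site/ (substituting your own hostname) should now get a 403, while a normal request is proxied as before.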