
In the Drupal logs on my dev site there are "page not found" reports that are obviously from a bot trying out well-known URLs (e.g. /wp-login). But I have set up Apache basic auth and I am the only person who knows the password! If I go to those URLs in a browser, I get a 401, not a 404. What could possibly be happening?

I tried asking this on the Drupal Stack Exchange and they weren't having any of it, but I can't help thinking it's some Drupal weirdness.

EDIT:

The auth config is in /etc/apache2/ and is:

<Directory /var/www/html>
  AllowOverride All
  AuthType Basic
  AuthName "Authentication Required"
  AuthUserFile "/etc/htpasswd/.htpasswd"
  Require valid-user
  Order allow,deny
  Allow from all
</Directory>

When I go to a nonexistent URL in a browser, I get the Basic Auth popup and I see this in the Apache logs:

x.x.x.x - - [20/Feb/2019:23:44:15 +0100] "GET /horde3/imp/test.php HTTP/1.1" 401 920 "-" "Mozilla/5.0 (X11; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0"

But when the bot did it, it got a 404!

80.87.85.75 - - [20/Feb/2019:21:43:57 +0100] "GET /horde3/imp/test.php HTTP/1.1" 404 21401 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0"

I'm less convinced it's a Drupal issue now, after seeing this in the Apache logs.
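
To see how many other requests got past the auth, one option is to scan the access log for entries that have no authenticated user and yet were not answered with a 401. Below is a minimal Python sketch, assuming the log is in the combined format shown in the two lines above. The names `LOG_RE`, `requests_that_skipped_auth` and `SAMPLE` are made up for illustration, and the sample data is just those two log lines.

```python
import re

# Apache "combined" log format, as in the two log lines quoted above.
# The third field is the authenticated user ("-" when there is none).
LOG_RE = re.compile(
    r'^\s*(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def requests_that_skipped_auth(lines):
    """Return (ip, path, status) for entries with no auth user and a non-401 status."""
    hits = []
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip lines that do not look like combined-format entries
        if m["user"] == "-" and m["status"] != "401":
            hits.append((m["ip"], m["path"], int(m["status"])))
    return hits


# The two entries from the question, used here as sample input.
SAMPLE = [
    'x.x.x.x - - [20/Feb/2019:23:44:15 +0100] "GET /horde3/imp/test.php HTTP/1.1" 401 920 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0"',
    '80.87.85.75 - - [20/Feb/2019:21:43:57 +0100] "GET /horde3/imp/test.php HTTP/1.1" 404 21401 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0"',
]

for ip, path, status in requests_that_skipped_auth(SAMPLE):
    print(ip, path, status)
```

On a real log the same function could be fed `open(path_to_access_log)`. Redirects (for example a 301 from http to https) would also show up in the result, so those may need filtering out before drawing conclusions.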

naomi
  • could you please add the link to the question at the Drupal stack exchange? – LLub Feb 20 '19 at 22:13
  • It was the exact same question. I got 3 downvotes and then it was put on hold, so I deleted it. – naomi Feb 20 '19 at 22:18
  • Could there be any user agent based logic on the server side? Different pages presented to browsers compared to crawlers? – Natanael Feb 20 '19 at 22:18
  • Could you please also add more details, for example, in each situation (401, 404), what are the urls, i.e. do they all start with http: or https: ? – LLub Feb 20 '19 at 22:19
  • Someone suggested it might be that the basic auth is on https but not http. There is a redirect from http -> https in the vhost config, so I'd have thought it wouldn't matter. But I've added the basic auth lines to the http vhost as well and will see if that helps. – naomi Feb 20 '19 at 22:21
  • 404 can possibly be used for non-admin users due to privacy reasons, are you logged in with a specific user or role when 401 is returned? – LLub Feb 20 '19 at 22:22
  • @Natanael I don't think so - it's a standard Drupal install so presumably if Drupal did that then someone on the Drupal site would have spotted it – naomi Feb 20 '19 at 22:22
  • @Refineo They start with https (so ignore my comment above about http) – naomi Feb 20 '19 at 22:23
  • are you logged in with a specific user or role when 401 is returned vs bot that is likely on anonymous or guest privileges? – LLub Feb 20 '19 at 22:24
  • @Refineo the situation in which I get a 401 is when I use a fresh browser (no cookies, not logged in) and navigate to one of the URLs in the logs. I get the Apache auth popup – naomi Feb 20 '19 at 22:24
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/90004/discussion-between-refineo-and-naomi). – LLub Feb 20 '19 at 22:27
