Highest Voted 'googlebot' Questions - Server Fault Stack Exchange

8

votes

3 answers

AH01797: client denied by server configuration: /usr/share/doc

For quite a while (over a month now) I have been seeing lines like the following in the Apache logs: 180.76.15.138 - - [24/Jun/2015:16:13:34 -0400] "GET /manual/de/mod/module-dict.html HTTP/1.1" 403 396 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0;…

apache-2.4 googlebot

asked Jun 25 '15 at 05:07

matpen

387
2
4
10

3

votes

3 answers

fail2ban ignoreip DNS host example?

I would like to add ".googlebot.com" to the ignoreip list for fail2ban, since the ignoreip explanation mentions a DNS host as an accepted input. Is this a proper format? # "ignoreip" can be an IP address, a CIDR mask or a DNS host. Fail2ban will not #…

fail2ban googlebot

asked Dec 13 '13 at 04:34

giorgio79

1,747
9
25
36

3

votes

1 answer

Why is googlebot requesting robots.txt from my SSH server?

I run ossec on my server and periodically I receive a warning like this: Received From: myserver->/var/log/auth.log Rule: 5701 fired (level 8) -> "Possible attack on the ssh server (or version gathering)." Portion of the log(s): Nov 19 14:26:33…

ssh web-crawler robots.txt googlebot

asked Nov 19 '13 at 19:40

Brian

766
1
6
14

2

votes

1 answer

What's with random-character queries coming from googlebot, e.g., vvytnoxvontwusz.html?

One of my sites has been getting queries from googlebot, on the order of: example-log:66.249.79.216 - - [06/Apr/2016:15:36:56 -0700] "GET /vvytnoxvontwusz.html HTTP/1.1" 404 15136 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;…

apache-2.2 robots.txt googlebot

asked Apr 07 '16 at 00:43

Jim Miller

713
2
11
23

2

votes

1 answer

Google-Bot fell in love with my 404-page

Every day my access log looks something like this: 66.249.78.140 - - [21/Oct/2013:14:37:00 +0200] "GET /robots.txt HTTP/1.1" 200 112 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" 66.249.78.140 - - [21/Oct/2013:14:37:01…

http-status-code-404 robots.txt googlebot

asked Oct 21 '13 at 20:28

32bitfloat

253
2
3
9

2

votes

1 answer

Nginx Googlebot rewrite rules failing with 404

Our site is based on Angular, which makes it almost completely JavaScript-based, so we need to serve static HTML snapshots to Googlebot in order for it to crawl us. At the moment, we have this implementation in place: location / { #…

nginx rewrite googlebot

asked Jul 26 '13 at 14:45

Tyler Alex

21
1

2

votes

1 answer

Apache : with Googlebot connections, a single process takes all server memory

Following up on https://serverfault.com/questions/418735/unbelievable-issue-a-single-apache-process-takes-4-gb-of-memory, I am posting a new question because I was able to identify that it happens when the connecting client is Googlebot. By "it", I…

apache-2.2 php memory googlebot

asked Aug 17 '12 at 23:35

db_ch

638
5
14
20

2

votes

1 answer

How can a nameserver block Google bot?

Background: Our domain page.et is not accessible by Google's mobile-friendly checking tool and search console. The same seems to be true for all other .et domains I tested. The reason is not the robots.txt. Google bot does not even try to make a…

domain-name-system domain google googlebot

asked Dec 28 '21 at 13:57

Alex

476
13
35

1

vote

1 answer

Block googlebot on a specific page using nginx

We're currently being crawled at a greater rate than we can handle. I can't seem to get nginx to block Googlebot: server { location /ajax/sse.php { if ($http_user_agent ~* "Mozilla/5.0 (compatible; Googlebot/2.1;…

nginx http-status-code-403 googlebot

asked Mar 23 '17 at 18:33

Aidan Ewen

271
1
4
11

1

vote

0 answers

Enabling TLS/SSL with SNI on a subset of websites, without losing SEO ranking on the non-TLS sites

We run a number of LAMP servers on AWS with a few dozen websites on them, that customers pay us to design, build and host. They're Ubuntu 14.04 servers with Varnish, Apache and PHP. Currently, if a customer wanted to have SSL/TLS for their website,…

ssl https sni seo googlebot

asked Mar 01 '17 at 13:42

Martijn Heemels

7,438
6
39
62

1

vote

1 answer

How to prevent Google Favicon bot to call to my site?

I have a backend URL that I use only for myself, in Google Chrome. It is not open to the public. However, for some reason a bot called "Google Favicon", with an IP address located at Google, requests this URL, which I do not want. My guess is that Google gets this URL from my Google…

googlebot

asked Apr 20 '16 at 03:54

Paiboon Panusbordee

167
1
9

1

vote

1 answer

Allow Google To Bypass Firewall Nginx

So I am looking for a system which essentially returns a 401 for every visitor that doesn't have a certain cookie. I would like to make it so if the visitor/requester is google then it does not return the 401. So here is the following code that I…

nginx google googlebot

asked Feb 24 '16 at 00:15

Eddie Chrisman

11
2

1

vote

2 answers

block fake google bots

How can I block DDoS attacks from fake Google bots? I found two solutions on the net, but both also seem to block legitimate Google bots. # Block fake google when it's not coming from their IP range's (A fake googlebot) [F] => Failure RewriteCond…

.htaccess ddos googlebot

asked Nov 02 '15 at 19:22

Matthias Jaekle

111
2
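One way to tell a genuine Googlebot from a spoofed User-Agent, independent of the truncated rewrite rules quoted above, is forward-confirmed reverse DNS: resolve the client IP to a hostname, check the hostname's domain, then resolve that hostname back and confirm it returns the same IP. The following is a minimal Python sketch of that idea, not the asker's or any answer's method. The function name `is_verified_googlebot`, the helper names, and the suffix list are assumptions made for illustration; adjust the suffixes to whatever Google currently documents for its crawlers.

```python
import socket
from typing import Callable, List

# Assumed list of acceptable hostname suffixes for Google crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Return the PTR hostname for an IP address (raises OSError on failure)."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> List[str]:
    """Return the IPv4 addresses a hostname resolves to (IPv4 only)."""
    return socket.gethostbyname_ex(host)[2]


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Forward-confirmed reverse DNS check for a claimed Googlebot IP."""
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:
        return False  # no PTR record, so the claim cannot be confirmed
    if not host.endswith(GOOGLE_SUFFIXES):
        return False  # hostname is not under an accepted Google domain
    try:
        # The hostname must resolve back to the original IP address.
        return ip in forward(host)
    except OSError:
        return False
```

The lookup functions are passed in as parameters so the check can be exercised with stub resolvers instead of live DNS. In production the result would normally be cached per IP, since two DNS queries per request are expensive.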

1

vote

2 answers

Googlebot requesting pages of 1 site on another site

Problem: Using Prerender.io to index/store pages of one site, I keep getting requests for paths that only exist on my old site. Example: on Prerender I'll see that Googlebot requested http://www.new-site.com/old/site/path I have an old website…

seo googlebot

asked May 15 '15 at 11:17

Maruf

159
9

1

vote

1 answer

Trouble filtering googlebot from apache access log

Though it seems like it should be pretty straightforward, I have been unable to configure apache so that googlebot's requests are not stored in the access log. I've tried the following lines: SetEnvIfNoCase User-Agent googlebot…

logging apache-2.4 googlebot

asked Apr 07 '15 at 12:42

Jonathan Basile

123
5
