Questions tagged [googlebot]

30 questions
1
vote
1 answer

Google-bot trips on a perfectly normal robots.txt, then on a nonexistent robots.txt

I have two domain names pointing to the same virtual server. One of them, http://ilarikaila.com, is a working brochure website I made for a friend. I used the other one, http://teemuleisti.com, to test-drive the site before making it public – in…
Teemu Leisti
1
vote
1 answer

Googlebot can't access my site; Webmaster Tools replies "Unreachable robots.txt"

When I try to fetch my site as Googlebot in Webmaster Tools, it returns "Unreachable robots.txt". After investigating, I found that Googlebot can reach my server: tcpdump | grep google shows that Google can access my server with IP aa.bb.cc.xx or…
1
vote
1 answer

Googlebot repeatedly looks for files that aren't on my server

I'm hosting a site for a volunteer organization. I've moved the site to WordPress, but it wasn't always that way. I suspect at one point it was hacked badly. My Apache error log file has grown to 122 kB in just the past 18 hours. The large…
John
1
vote
1 answer

High CPU load caused by bot traffic

Googlebot crawls every 2 seconds, which creates a load of about 1.0-1.5 (1-minute average) on a KVM host and a VM (web server) until the bot stops around 4 AM. As the graph shows, there is not much traffic going out through the firewall's WAN…
0
votes
1 answer

Moved website to new server - updated DNS - web crawlers still hitting old site by IP

About ten days ago I moved a site - mostly a Joomla discussion board - to a new server at a different IP address. During a brief scheduled downtime I replicated the content over and completed DNS switchover (via Cloudflare) as usual, and most…
Ryan
0
votes
1 answer

Google bot cannot read my web site

From time to time I get a message from Googlebot saying that it cannot access my web site: Over the last 24 hours, Googlebot encountered 1 errors while attempting to retrieve DNS information for your site. The overall error rate for DNS…
Gravity
0
votes
0 answers

Apache duplicates every GET request made by Googlebot

System: Linux 3.10.47.core2.24 Apache: Most likely version 2.2 (can't check that) Server API: Apache 2.0 Handler Apache API Version: 20051115 In the logs, requests look like this: 94.*.*.* - - [26/Nov/2014:01:06:52 +0100] "GET…
0
votes
0 answers

nginx: serve a different HTML file for Googlebot

I have an Angular app served through nginx. For Googlebot I want to serve a different static HTML file so that it can index the site properly. Is the following nginx config correct? (I don't want to complicate the setup using PhantomJS, I want to explore…
0
votes
1 answer

apache rewrite syntax

I'm trying to block Googlebot and others from accessing some of my sites. The thing is, I have one box with a ton of virtual host files that do nothing more than proxy pass to other servers. I would like to block Googlebot and would like to avoid…
skeelime
0
votes
1 answer

Googlebot incrementing page id

So here is an example of a hit I'm getting from the googlebot: 66.249.73.171 - - [19/Feb/2013:16:12:39 -0500] "GET /eghm-blah.php?pid=2855 HTTP/1.1" 200 1684 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" My posts…
BOMEz
0
votes
1 answer

How to fix googlebot Server Connectivity

I get a 'Server Connectivity' error in Google Webmaster Tools. I suspect it is because of iptables rules that I've set to counter some DDoS attacks, though I'm not sure which rules could be relevant. It may also help to know that I use Varnish/nginx…
alfish
0
votes
2 answers

Will I be blocking the IP of some google related service?

On my sites I have created a script that sends me an email every time a new IP claiming to be Google visits the site. When I see the email I check (for example on whois.com) whether the IP that claims to be Google really is Google, and if not, I…
alebal
0
votes
0 answers

WAF (ModSecurity) / Plesk IP banned: is it Googlebot? Is it a false positive? Is it a malicious IP?

I was alerted by my Plesk server that an IP Address had been banned. Normally I don't check banned IPs, but this one happened to coincide with our site going down for 1 minute at the same time. Banned the following ip addresses on Mon Jul 27…
-2
votes
1 answer

Googlebot finds my original URI, although I have a working rewrite directive

I have: RewriteRule ^Article/([^/]*)$ /article.php?newsid=$1 [L] This means that the URL should be //example.com/Article/855563, but Google crawls //example.com/article.php?newsid=855563. Is there anything I can do to prevent this? Or to redirect…
-4
votes
1 answer

How Can I Encourage Google to scan New robots.txt File?

I just updated my robots.txt file on a new site; Google Webmaster Tools reports that it read my robots.txt 2 days before my last update. My previous robots.txt had a "disallow: all" row. Is there any way I can encourage Google to re-read my robots.txt as…