Every day my access log contains entries like this:
66.249.78.140 - - [21/Oct/2013:14:37:00 +0200] "GET /robots.txt HTTP/1.1" 200 112 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.78.140 - - [21/Oct/2013:14:37:01 +0200] "GET /robots.txt HTTP/1.1" 200 112 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.78.140 - - [21/Oct/2013:14:37:01 +0200] "GET /vuqffxiyupdh.html HTTP/1.1" 404 1189 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
or this:
66.249.78.140 - - [20/Oct/2013:09:25:29 +0200] "GET /robots.txt HTTP/1.1" 200 112 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.75.62 - - [20/Oct/2013:09:25:30 +0200] "GET /robots.txt HTTP/1.1" 200 112 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.78.140 - - [20/Oct/2013:09:25:30 +0200] "GET /zjtrtxnsh.html HTTP/1.1" 404 1186 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
The bot requests robots.txt twice and then tries to access a file (zjtrtxnsh.html, vuqffxiyupdh.html, ...) that cannot exist and must return a 404 error. The same procedure happens every day; only the nonexistent HTML filename changes.
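To see how regularly these probes occur, the log can be filtered for Googlebot requests that ended in a 404. A rough Python sketch; the regex assumes the combined log format shown in the excerpts above, and `googlebot_404_paths` is just an illustrative helper name:

```python
import re
from collections import Counter

# One combined-log-format line, as in the excerpts above.
# Captured groups: client IP, request path, status code, user agent.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_404_paths(lines):
    """Count request paths that a Googlebot user agent hit with a 404."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("status") == "404" and "Googlebot" in m.group("agent"):
            counts[m.group("path")] += 1
    return counts
```

Run over a few days of logs, this shows whether the random filename really changes daily while everything else stays constant.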
The content of my robots.txt:
User-agent: *
Disallow: /backend
Sitemap: http://mysitesname.de/sitemap.xml
The sitemap.xml is readable and valid, so there seems to be no reason why the bot should want to force a 404 error.
How should I interpret this behaviour? Does it point to a mistake I've made, or should I ignore it?
UPDATE
@malware I scanned my website with several online tools; nothing was found.
I have none of the standard apps like WordPress or phpMyAdmin on the server.
I receive a logwatch report every day, and there was no unauthorized SSH access or anything like that.
I have fail2ban set up.
I have restricted SSH access to public keys; root login is not allowed.
Every sudo command that logwatch reported was one I recognized as something I had done that day.
There is no file in my web directory that is new, that I did not create, or that looks weird (okay, I cannot guarantee that 100%, but everything looks okay).
I've run a full clamscan on the server without any findings.
The software packages are up to date.
What else can I do?
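One further check worth running: Google documents that a genuine Googlebot IP can be verified with forward-confirmed reverse DNS, i.e. the IP's PTR record should end in googlebot.com or google.com, and that hostname should resolve back to the same IP. A rough Python sketch of that check (the function names are my own; the lookups hit the network, so results depend on your resolver):

```python
import socket

# Domains Google says its crawler hostnames fall under.
GOOGLE_DOMAINS = (".googlebot.com", ".google.com")

def is_google_hostname(hostname: str) -> bool:
    """True if a reverse-DNS name falls under Google's crawler domains."""
    return hostname.rstrip(".").endswith(GOOGLE_DOMAINS)

def verify_googlebot(ip: str) -> bool:
    """Forward-confirmed reverse DNS check for a claimed Googlebot IP:
    the PTR record must point into googlebot.com/google.com, and that
    hostname's A records must include the original IP."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)        # reverse (PTR) lookup
    except OSError:
        return False
    if not is_google_hostname(hostname):
        return False
    try:
        _, _, addrs = socket.gethostbyname_ex(hostname)  # forward (A) lookup
    except OSError:
        return False
    return ip in addrs
```

If `verify_googlebot("66.249.78.140")` comes back True, the requests really are from Google and the robots.txt-plus-random-404 pattern is nothing to worry about; if it comes back False, the user agent is being spoofed and blocking the IP would be reasonable.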