Find a log entry that you suspect is the Googlebot and make a note of the IP address.
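If you are not sure which entries to look at, a small helper along these lines can list the addresses that claim to be a crawler. This is only a sketch: the function name list_claimed_bot_ips is made up for illustration, and it assumes the combined log format, where the client IP is the first field on each line.

```shell
# Hypothetical helper: list client IPs whose log lines mention a pattern
# (for example "Googlebot"), most frequent first.
# Assumes the combined log format, where the client IP is the first field.
list_claimed_bot_ips() {
    pattern=$1
    logfile=$2
    grep -i -- "$pattern" "$logfile" | awk '{ print $1 }' | sort | uniq -c | sort -rn
}
```

For example, list_claimed_bot_ips Googlebot /var/log/apache2/access.log, where the log path is an assumption based on a Debian/Ubuntu layout and may differ on your system.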
Next, do a reverse DNS lookup on that IP address with the following command:
host 66.249.64.156
Don't forget to substitute the IP address you recorded earlier for the one in this command.
If the result looks something like this, then you know it's the Googlebot. You want to make sure the name ends in googlebot.com:
156.64.249.66.in-addr.arpa domain name pointer crawl-66-249-64-156.googlebot.com.
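The same check can be scripted. The sketch below uses hypothetical function names and assumes the host output format shown above. It also adds a forward lookup on the returned name, a step Google's verification page recommends so that a faked reverse record alone is not enough to pass.

```shell
# Hypothetical helper: succeed if a PTR name belongs to the given domain.
# A trailing dot, as printed by `host`, is ignored.
ptr_in_domain() {
    name=${1%.}
    domain=$2
    case "$name" in
        *".$domain") return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: reverse lookup, domain check, then forward lookup.
# Assumes `host` prints "... domain name pointer NAME." and "NAME has address IP".
verify_crawler_ip() {
    ip=$1
    domain=$2
    ptr=$(host "$ip" | awk '/domain name pointer/ { print $NF; exit }')
    [ -n "$ptr" ] || return 1
    ptr_in_domain "$ptr" "$domain" || return 1
    # The name must resolve back to the same IP address.
    host "$ptr" | awk -v ip="$ip" '$NF == ip { found = 1 } END { exit !found }'
}
```

For example, verify_crawler_ip 66.249.64.156 googlebot.com && echo verified. Matching on ".googlebot.com" rather than a bare "googlebot.com" keeps names like fakegooglebot.com from passing.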
Next, go to your Apache2 Virtualhost and add these directives adapted for your site:
# Requires Apache 2.4 or later (SetEnvIfExpr); both conditions must match.
SetEnvIfExpr "%{REMOTE_ADDR} == '66.249.64.156' && %{HTTP_USER_AGENT} =~ /Googlebot/" do_not_log
CustomLog ${APACHE_LOG_DIR}/access.log combined env=!do_not_log
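The new directives only take effect once Apache has reloaded its configuration. A minimal sketch of a safe reload follows; the function name reload_if_valid is hypothetical, and the default commands (apache2ctl and systemctl reload apache2) are assumptions based on a Debian/Ubuntu layout.

```shell
# Hypothetical helper: reload Apache only if the configuration parses.
# APACHECTL and RELOAD_CMD are assumptions for a Debian/Ubuntu layout;
# override them if your system uses different names.
APACHECTL=${APACHECTL:-apache2ctl}
RELOAD_CMD=${RELOAD_CMD:-"systemctl reload apache2"}

reload_if_valid() {
    if "$APACHECTL" configtest; then
        # Left unquoted on purpose so the command and its arguments split.
        $RELOAD_CMD
    else
        echo "configuration test failed; not reloading" >&2
        return 1
    fi
}
```

Run reload_if_valid as root (or with sudo) after saving the Virtualhost file. Testing first means a typo in the directives does not take the site down on reload.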
You can repeat this process for the bingbot:
host 157.55.39.247
The result should end in search.msn.com, like this:
247.39.55.157.in-addr.arpa domain name pointer msnbot-157-55-39-247.search.msn.com.
So you would add the additional line in the Virtualhost file after the Googlebot line:
SetEnvIfExpr "%{REMOTE_ADDR} == '157.55.39.247' && %{HTTP_USER_AGENT} =~ /bing/" do_not_log
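Usually the Googlebot and the bingbot will each keep using the same IP address to check your pages, but if not, you may need to add additional entries. You may just want to match on a prefix such as "^66" out of convenience, at the cost of also matching every other address that starts with 66.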
https://support.google.com/webmasters/answer/80553
https://blogs.bing.com/webmaster/2012/08/31/how-to-verify-that-bingbot-is-bingbot/