Example of a Spoofed Search Engine Bot

While solving the recent search engine spoofing mystery, I came across two excellent examples of spoofed search engine bots. This article uses the examples to explain how to identify any questionable bots hitting your site.

Spoofed Search Engine Bot #1

The first example I have for you today was reported like this in my site’s access log (note that the requested domain has been changed to example.com):

TIME: February 20th 2016, 12:29am
REQUEST: http://example.com/info.php.suspected_
SITE: http://example.com/
REFERRER: http://www.googlebot.com/bot.html
QUERY STRING: undefined
REMOTE ADDRESS: 117.26.86.90
PROXY ADDRESS: 117.26.86.90
HOST: 90.86.26.117.broad.pt.fj.dynamic.163data.com.cn
REMOTE IDENTITY: undefined
USER AGENT: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Breakdown: Here we have a bot that reports itself as Googlebot:

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

According to my list of user agents for top search engines, this is one of the actual user agents used by Google. So if you were paying attention only to the user agent, the request might seem legit. But then again, look at the request:

http://example.com/info.php.suspected_

That may be something that Google is looking for, but doubtful; it looks more like something that a bad bot would want to find. So digging a little deeper into the logged data, we see that the referrer also looks legit. At first glance anyway.

http://www.googlebot.com/bot.html

If you understand what a referrer actually is, then you question why it’s reported this way. It’s as if the bot is trying to convince you that the request was the first one made when Googlebot clocked in this morning. Yep, boot up and head over to http://example.com/info.php.suspected_ first thing. Riiight.

So what is the dead giveaway that this search-engine bot is spoofed? After all, from the user agent and referrer, it looks like the real deal. And the IP address doesn’t really say anything without doing an actual lookup, and who is going to bother with that? No, the real key to identifying this request as bogus is the reported host name:

90.86.26.117.broad.pt.fj.dynamic.163data.com.cn

Yeah, that doesn’t even look like anything Google would be using. It’s a Chinese TLD, after all. To verify the illegitimacy of this bot, we can do a forward-reverse DNS lookup to get the following results:

Details of 117.26.86.90
IP Address : 117.26.86.90
Location   : China (95% accuracy)
Host Name  : 90.86.26.117.broad.pt.fj.dynamic.163data.com.cn

Bingo. It’s a fake bot. Nice try, but I think I will ban you using BBQ Pro. Next...
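For anyone who would rather script the lookup than do it by hand, here is a minimal Python sketch of a forward-confirmed reverse DNS check. It is not the tool used for the results above. The function names and the list of Google host suffixes are my own assumptions (check Google's current crawler documentation before relying on the list), and `socket.gethostbyname_ex` only covers IPv4.

```python
import socket

# Assumed list of domains that Google's crawler host names end with.
# Verify against Google's current documentation before relying on it.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def is_google_host(hostname):
    """Return True if the host name ends with one of the assumed Google suffixes."""
    return hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES)


def verify_googlebot(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for an IP that claims to be Googlebot.

    The resolver functions are parameters so the logic can be exercised
    without network access (hypothetical design choice for this sketch).
    """
    try:
        hostname = reverse(ip)[0]         # reverse lookup: IP -> host name
    except OSError:
        return False                      # no reverse record, so not verifiable
    if not is_google_host(hostname):
        return False                      # host name does not belong to Google
    try:
        addresses = forward(hostname)[2]  # forward lookup: host name -> IPv4 list
    except OSError:
        return False
    return ip in addresses                # must map back to the original IP
```

Calling `verify_googlebot("117.26.86.90")` performs live DNS queries, so the result depends on current DNS records. The offline part is deterministic: `is_google_host("90.86.26.117.broad.pt.fj.dynamic.163data.com.cn")` returns `False`, which matches the conclusion above.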

Spoofed Search Engine Bot #2

For the next example of a spoofed search-engine bot, we see the following report in our server’s access logs:

TIME: February 10th 2016, 10:52pm
REQUEST: http://example.com/wp-content/plugins/Login-wall-etgFB/login_wall.php?login=cmd&z3=aW5mb3MucGhw&z4=L3dwLWNvbnRlbnQvcGx1Z2lucy8%253d
SITE: http://example.com/
REFERRER: example.com
QUERY STRING: login=cmd&z3=aW5mb3MucGhw&z4=L3dwLWNvbnRlbnQvcGx1Z2lucy8%253d
REMOTE ADDRESS: 195.154.194.111
PROXY ADDRESS: 195.154.194.111
HOST: 195-154-194-111.rev.poneytelecom.eu
REMOTE IDENTITY: undefined
USER AGENT: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

This is pretty much the same sort of deal as before, only this time the spoofed request is coming from a well-known (and obnoxious) proxy/spam server:

195-154-194-111.rev.poneytelecom.eu

So in this case, the user agent reports as legit Googlebot, but there are two giant red flags:

  • The requested URI is typical of an exploit scan
  • And of course the host name isn’t something associated with Google
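To see why the requested URI smells like an exploit scan, it helps to decode the query string. The sketch below assumes the `z3` and `z4` values are Base64 (my inference from how they look, not something the log states), and that `%253d` is a double URL-encoded `=`. The function name `decode_params` is made up for this example.

```python
import base64
from urllib.parse import parse_qs, unquote

# Query string copied from the logged request above.
QUERY = "login=cmd&z3=aW5mb3MucGhw&z4=L3dwLWNvbnRlbnQvcGx1Z2lucy8%253d"


def decode_params(query):
    """URL-decode each parameter, then try to Base64-decode its value."""
    decoded = {}
    for name, values in parse_qs(query).items():
        # parse_qs decodes once ("%253d" -> "%3d"); unquote handles the second layer.
        value = unquote(values[0])
        try:
            decoded[name] = base64.b64decode(value, validate=True).decode("utf-8")
        except ValueError:
            # Not valid Base64 (or not valid UTF-8): keep the plain value.
            decoded[name] = value
    return decoded


print(decode_params(QUERY))
# {'login': 'cmd', 'z3': 'infos.php', 'z4': '/wp-content/plugins/'}
```

So the request passes `cmd` along with a file name and a plugin directory path, which is hardly something a search engine crawler would send.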

Looking up the IP address, we confirm the fakeness:

Details of 195.154.194.111
IP Address : 195.154.194.111
Location   : France (95% accuracy)
Host Name  : 195-154-194-111.rev.poneytelecom.eu

So the moral of the story: just because some bot claims to be Googlebot or some other legit bot, it doesn’t mean that it’s true. If in doubt, examine your logs and then do a forward/reverse lookup to reveal its true identity.
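If you have a pile of reports in the "FIELD: value" format shown in this article, a first pass over the logs can be automated. This is only a rough sketch: the function names are hypothetical, the Google suffix list is an assumption, and it trusts the logged HOST field rather than doing a fresh DNS lookup, so treat a hit as a reason to look closer, not as proof.

```python
# Assumed list of domains that Google's crawler host names end with.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def parse_report(text):
    """Turn 'FIELD: value' lines into a dict keyed by field name."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")  # split on the first ': ' only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def looks_spoofed(fields):
    """Flag requests whose user agent claims Googlebot but whose host is not Google's.

    A missing or empty HOST field is also flagged, since it cannot be verified.
    """
    claims_google = "googlebot" in fields.get("USER AGENT", "").lower()
    host = fields.get("HOST", "").lower()
    return claims_google and not host.endswith(GOOGLE_SUFFIXES)
```

Feeding either of the two reports above through `parse_report` and then `looks_spoofed` flags the request, because the user agent says Googlebot while the host name ends in `163data.com.cn` or `poneytelecom.eu`.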

Jeff Starr
About the Author
Jeff Starr = Web Developer. Security Specialist. WordPress Buff.