
Example of a Spoofed Search Engine Bot

While solving the recent search engine spoofing mystery, I came across two excellent examples of spoofed search engine bots. This article uses the examples to explain how to identify any questionable bots hitting your site.

Spoofed Search Engine Bot #1

The first example I have for you today was reported like this in my site’s access log (note that the requested domain has been changed to example.com):

TIME: February 20th 2016, 12:29am
REQUEST: http://example.com/info.php.suspected_
SITE: http://example.com/
REFERRER: http://www.googlebot.com/bot.html
QUERY STRING: undefined
REMOTE ADDRESS: 117.26.86.90
PROXY ADDRESS: 117.26.86.90
HOST: 90.86.26.117.broad.pt.fj.dynamic.163data.com.cn
REMOTE IDENTITY: undefined
USER AGENT: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Breakdown: Here we have a bot that reports itself as Googlebot:

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

According to my list of user agents for top search engines, this is one of the actual user agents used by Google. So if you were paying attention only to the user agent, the request might seem legit. But then again, look at the request:

http://example.com/info.php.suspected_

That may be something that Google is looking for, but it’s doubtful; it looks more like something that a bad bot would want to find. So digging a little deeper into the logged data, we see that the referrer also looks legit. At first glance anyway.

http://www.googlebot.com/bot.html

If you understand what a referrer actually is (the page that supposedly linked to the requested URL), then you question why it’s reported this way. It’s as if the bot is trying to convince you that the request was the first one made when Googlebot clocked in this morning. Yep, boot up and head over to http://example.com/info.php.suspected_ first thing. Riiight.

So what is the dead giveaway that this search-engine bot is spoofed? After all, from the user agent and referrer, it looks like the real deal. And the IP address doesn’t really say anything without doing an actual lookup, and who is going to bother with that? No, the real key to identifying this request as bogus is the reported host name:

90.86.26.117.broad.pt.fj.dynamic.163data.com.cn

Yeah, that doesn’t even look like anything Google would be using. It’s a Chinese TLD, after all. To verify the illegitimacy of this bot, we can do a forward-reverse DNS lookup to get the following results:

Details of 117.26.86.90
IP Address : 117.26.86.90
Location   : China (95% accuracy)
Host Name  : 90.86.26.117.broad.pt.fj.dynamic.163data.com.cn

Bingo. It’s a fake bot. Nice try, but I think I will ban you using BBQ Pro. Next.
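For anyone who would rather script the lookup than do it by hand, here is a minimal Python sketch of a forward-confirmed reverse DNS check. It is only an illustration, not the tool used for the results above. The function names are made up for this example, and the two allowed domain suffixes are an assumption based on the host names Google publishes for Googlebot. Note that `socket.gethostbyname_ex` only covers IPv4.

```python
import socket

# Assumption: real Googlebot host names end in one of these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Return the host name (PTR record) reported for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> list[str]:
    """Return the IPv4 addresses that a host name resolves to."""
    return socket.gethostbyname_ex(host)[2]


def is_real_googlebot(ip: str, reverse=reverse_lookup, forward=forward_lookup) -> bool:
    """Hypothetical helper: forward-confirmed reverse DNS check.

    1. Reverse lookup: IP -> host name.
    2. The host name must belong to a Google domain.
    3. Forward lookup: host name -> IPs, which must include the original IP.
    """
    try:
        host = reverse(ip).rstrip(".").lower()
    except OSError:  # no PTR record, DNS failure, etc.
        return False
    if not host.endswith(GOOGLE_SUFFIXES):
        return False
    try:
        return ip in forward(host)
    except OSError:
        return False
```

Calling `is_real_googlebot("117.26.86.90")` does live DNS queries, so the answer depends on what DNS returns at the time. The forward step matters because whoever controls an IP address also controls its reverse record, so a host name that merely ends in a Google domain proves nothing until it resolves back to the same IP.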

Spoofed Search Engine Bot #2

For the next example of a spoofed search-engine bot, we see the following report in our server’s access logs:

TIME: February 10th 2016, 10:52pm
REQUEST: http://example.com/wp-content/plugins/Login-wall-etgFB/login_wall.php?login=cmd&z3=aW5mb3MucGhw&z4=L3dwLWNvbnRlbnQvcGx1Z2lucy8%253d
SITE: http://example.com/
REFERRER: example.com
QUERY STRING: login=cmd&z3=aW5mb3MucGhw&z4=L3dwLWNvbnRlbnQvcGx1Z2lucy8%253d
REMOTE ADDRESS: 195.154.194.111
PROXY ADDRESS: 195.154.194.111
HOST: 195-154-194-111.rev.poneytelecom.eu
REMOTE IDENTITY: undefined
USER AGENT: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

This is pretty much the same sort of deal as before, only this time the spoofed request is coming from a well-known (and obnoxious) proxy/spam server:

195-154-194-111.rev.poneytelecom.eu

So in this case, the user agent reports as legit Googlebot, but there are two giant red flags:

  • The requested URI is typical of an exploit scan
  • And of course the host name isn’t something associated with Google
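To get a feel for why that URI smells like an exploit scan, it can help to decode the query string. The `z3` and `z4` values look like Base64 (that is an assumption on my part), and the trailing `%253d` looks like a double URL-encoded `=`. Here is a rough Python sketch; the function names are just for illustration.

```python
import base64
import binascii
from urllib.parse import parse_qsl, unquote


def fully_unquote(value: str, max_rounds: int = 5) -> str:
    """Undo repeated URL-encoding (e.g. %253d -> %3d -> =)."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value


def decode_query(query: str) -> dict[str, str]:
    """Hypothetical helper: URL-decode each value, then try Base64.

    Values that are not valid Base64 text are returned as-is.
    """
    result = {}
    for key, raw in parse_qsl(query, keep_blank_values=True):
        value = fully_unquote(raw)
        try:
            text = base64.b64decode(value, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            text = value  # not Base64 after all, keep the plain value
        result[key] = text
    return result


query = "login=cmd&z3=aW5mb3MucGhw&z4=L3dwLWNvbnRlbnQvcGx1Z2lucy8%253d"
print(decode_query(query))
# {'login': 'cmd', 'z3': 'infos.php', 'z4': '/wp-content/plugins/'}
```

So the request passes a file name and a plugins path to a script that accepts `login=cmd`. What the script actually does with them is anybody’s guess, but it is not something a search engine has any reason to send.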

Looking up the IP address, we confirm the fakeness:

Details of 195.154.194.111
IP Address : 195.154.194.111
Location   : France (95% accuracy)
Host Name  : 195-154-194-111.rev.poneytelecom.eu

So the moral of the story: just because some bot claims to be Googlebot or some other legit bot, it doesn’t mean that it’s true. If in doubt, examine your logs and then do a forward/reverse lookup to reveal the bot’s true identity.
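If your logs look like the reports shown in this post (one `FIELD: value` pair per line), a quick first pass can be scripted. The Python sketch below flags any record where the user agent claims Googlebot but the logged host name is not in a Google domain. The function names and the list of allowed suffixes are assumptions for illustration only.

```python
# Assumption: real Googlebot host names end in one of these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def parse_record(text: str) -> dict[str, str]:
    """Split a 'FIELD: value' log report into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that have no 'FIELD: value' shape
            fields[key.strip()] = value.strip()
    return fields


def looks_spoofed(record: dict[str, str]) -> bool:
    """Hypothetical check: user agent says Googlebot, host name disagrees."""
    claims_google = "googlebot" in record.get("USER AGENT", "").lower()
    host = record.get("HOST", "").rstrip(".").lower()
    return claims_google and not host.endswith(GOOGLE_SUFFIXES)


report = """\
REMOTE ADDRESS: 195.154.194.111
HOST: 195-154-194-111.rev.poneytelecom.eu
USER AGENT: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"""

print(looks_spoofed(parse_record(report)))  # True
```

This only catches the lazy fakes. A logged host name that does end in a Google domain still needs a forward lookup to confirm that it resolves back to the same IP address.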

About the Author
Jeff Starr = Creative thinker. Passionate about free and open Web.