Simple IP-Detection Bad for SEO
In general, Perishable Press enjoys generous ranking in Google’s search-engine results. The site’s many pages bring in lots of traffic for some great keywords, and a direct search for “Perishable Press” returns the first spot, complete with eight featured site links. And recently, after switching servers, traffic increased even further. Things were going well, and it seemed like the perfect opportunity to finally renovate and redesign the site. So I dove in.
And then approximately 24-48 hours after beginning work on the new design, BAM – suddenly Google cuts my traffic by 75% and removes most of my pages from the search results. For example, the home page was not among the search results for “perishablepress.com” – so obviously something bad had happened, and my long-standing, reputable website had been penalized by Google.
What happened
While designing the new site, I needed a way to detect the IP address of any request for the home page:
https://perishablepress.com/
During development, requests for that URL returned the root index.php file with the following PHP logic:
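<?php // IP-based WP-loading
// '123.456.789.0' is a placeholder for the developer's IP address.
// Matching requests get the new design in /perish/; all others get the live site in /press/.
if ($_SERVER['REMOTE_ADDR'] == '123.456.789.0') {
	define('WP_USE_THEMES', true);
	require_once("./perish/wp-blog-header.php");
} else {
	define('WP_USE_THEMES', true);
	require_once("./press/wp-blog-header.php");
} ?>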
The new design is happening via a second installation of WordPress in its own subdirectory, /perish/. The previous site also exists in its own subdirectory, /press/. So during development, I needed a way to load the new WordPress installation for my IP address, and the old WordPress installation for all other IPs. And that’s exactly what the above logic handles so elegantly.
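For anyone who wants to try something similar, here is a minimal sketch of the same idea with the decision pulled into a small function, so it can be checked on its own and extended to more than one developer IP. The function name select_wp_dir, the $devIps list, and the address 203.0.113.10 (a reserved documentation address) are assumptions for illustration and not part of the script I actually ran. The loader is only included if the file exists, so the snippet does nothing harmful outside a WordPress root.

```php
<?php
// Hypothetical helper: choose the WordPress subdirectory for a client IP.
// Returns 'perish' (new design) for listed developer IPs, 'press' otherwise.
function select_wp_dir(string $remoteAddr, array $devIps): string
{
    return in_array($remoteAddr, $devIps, true) ? 'perish' : 'press';
}

// Placeholder developer address; replace with your own.
$devIps = ['203.0.113.10'];

// REMOTE_ADDR is not set on the command line, so fall back to an empty string.
$dir = select_wp_dir($_SERVER['REMOTE_ADDR'] ?? '', $devIps);

// Load WordPress from the chosen directory, if it is actually there.
$header = __DIR__ . '/' . $dir . '/wp-blog-header.php';
if (is_file($header)) {
    define('WP_USE_THEMES', true);
    require_once $header;
}
```

Keeping the choice in one function means the define and require lines appear only once, and any address not on the list falls through to the live /press/ site.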
Google don’t like it
After implementing the IP-detection, I continued site development and everything was working great, until about 24-48 hours later when I noticed that my pages were being excluded from the search results, seriously decreasing traffic from Google. Just prior to the traffic drop, only three significant changes were made to the site:
- Installed new subdirectory WordPress
- Set up IP-based loading of WordPress
- Removed a canonical redirect of /press/ to root
The removal of the canonical redirect resulted in one page of duplicate content (on both /press/ and the home page), but that wouldn’t be reason for such drastic measures from Google. After much scrambling to determine the issue, it seemed apparent that Google had detected the IP-detection script that I was using to conditionally load WordPress for the site’s home page.
I don’t have any solid evidence to support this, but my best guess is that Google somehow detected the script, disapproved, and penalized my site by dropping it from the search results. I would have contacted someone at Google to verify this, but apparently they are too big to be bothered with us humans.
Why Google hates them
Since discovering/reasoning all this, I’ve removed the IP-detection script and will continue with the redesign live & in real-time. While we wait and see whether or not that in fact resolves the issue, it is interesting to consider why Google penalizes something as simple as an IP-detection script. Here’s what Google Webmaster Central has to say about cloaking, sneaky JavaScript redirects, and doorway pages:
Cloaking refers to the practice of presenting different content or URLs to users and search engines. Serving up different results based on user agent may cause your site to be perceived as deceptive and removed from the Google index.
Although it doesn’t mention IP addresses, the take-home message seems to imply that any form of cloaking – via user-agent, IP, referrer, etc. – is strictly forbidden. I get the logic behind this policy, but a quick message in the Webmaster Tools dashboard would have been so absolutely helpful and time-saving.
Here is an example of a simple message that would have saved significant time, energy, and resources:
We have detected you detecting us. Please stop or we will shut you down. – Love, Google
Something as simple and automated as that would alleviate much stress:
- You can’t just “contact” Google and ask them what’s up
- You’d know why your pages no longer appear in the search results
- You’d know that Google requires action
- You’d have a good idea of how to resolve the issue
- You’d know that Google has the “shoot first, you deal with it” mentality
And so even better than an “oh-by-the-way” message would be Google sending notification before killing your site. Why not give people a chance to resolve potential issues before just sending in the terminators to wipe them out?
Lesson learned, moving on
Moral of the story: If you need to serve different content to different users, use something stealthier than a simple PHP script to make it happen. If Google gets even a whiff of anything it doesn’t approve of, it will shut you down with absolutely zero notice.
40 responses to “Simple IP-Detection Bad for SEO”
Hey Jeff,
Don’t mean to pile on, but “What everybody else starting with Jan said.” :) No way for Google to know.
UNLESS your changes had some other effect and actually broke portions of your site. What I’d like to see is the before and after code. What did your code look like before you created your detection script? Was it just this, or something else?
define('WP_USE_THEMES', true);
require_once("./press/wp-blog-header.php");
-Mike
It is certainly good to know that Google can’t detect PHP logic. I don’t know why I went along with the idea that they could (I had read somewhere), but it does seem kind of silly now.
But, the site was down due to network outages in Seattle about three days prior to losing traffic. Checking in with Webmaster Tools reveals thousands of 403 crawl errors.
The basic logic of the detection code was a simple if/else statement, with me getting WP served from the /perish/ directory and everyone else getting the /press/ stuff, which included additional scripting to account for multiple/alternate themes.
I’ll post the code in its entirety here soon. It’s useful as-is, but may also provide some clues as to why this all happened in the first place.
Jeff,
Maybe this had something to do with it:
http://www.mattcutts.com/blog/type/googleseo/
Hi Gert,
Yes, I definitely think that was a factor in what happened, which seems now in hindsight to be a combination of changing servers, site restructuring, and the algorithm changes.
Either way, traffic is returned to normal levels, so it’s all good, as they say.
Jeff, was Google Analytics active on your blog while using the IP-based trick?
If yes, perhaps Google Analytics gathers some info about the pages you visit, and this could explain the problem you had… but I’ll bet my 2 cents on some other factor.
I wish, but no – I don’t use Google Analytics here at Perishable Press. BUT it would have been so useful to have that information. I bet you are right – looking back & zooming out, I’m pretty sure that the traffic drop was due to multiple factors all sort of happening around the same time: server change, structure change, site downtime, and algorithm changes.
I wonder how Google can detect cloaking.
Do they check a page from multiple IPs?
They can cross-check user-agents and IPs, but as discussed above, they would have no way of spoofing my IP address, and so probably didn’t detect the script.
Most likely the traffic drop was the result of recent site, server, and structural changes combined with a recent period of downtime and changes in the Google algorithm.
It makes sense why Google did that, because they would see that script as typical of what scam sites do (which I have seen myself). Scam sites will typically redirect you (like that script) hiddenly (is that even a word?) to a page that depends on whether you are a search engine or on what country you are from. I have typically seen this with fake news sites talking about a wonderful job opportunity in your particular area .. or .. phishing pharma sites that are trying to scam people.
So even though Google was wrong about your script, it does make sense why Google did that, and it makes me realize that Google is trying to prevent scam sites from being ranked in search engine results.
I know I did read something about this on Matt Cutts’ blog (he works for Google).
I love your redesign. Clean and classy
You are one of my very favorite destinations and the redesign is a pleasure to behold!!
Thanks so much for sharing your wisdom
Unh… SEO is such a pain!
The new site is beautimous, btw ;)
I, as many others before me, agree that your IP-based switcharoo was probably not the source of your agony.
Where I disagree is when everyone assumes that Google can’t see what you see from your browser. Anyone with a Google tool bar – or other similar tool bar – does open the door…
Whether they use it or how they use it is something I’m not privy to. However, there’s no doubt that it gives them some access while you’re browsing your site – even if only trivial stuff like meta keywords or descriptions.
Best,
Bob