Stop 404 Requests for Mobile Versions of Your Site

If you’ve been keeping an eye on your 404 errors recently, you will have noticed an increase in requests for nonexistent mobile files and directories, especially over the past year or so. The scripts and bots requesting these files from your server seem to be looking for a mobile version of your site. Unfortunately, they are wasting bandwidth and resources in the process. It has become common to see the following 404 errors constantly repeated in your log files:

  • http://domain.tld/apple-touch-icon.png
  • http://domain.tld/iphone
  • http://domain.tld/mobile
  • http://domain.tld/mobi
  • http://domain.tld/m

So some bot comes along, assumes that your site includes a mobile version, and then tries its hand at guessing the location. In the common request-set listed above, we see the bot looking first for an “apple-touch icon,” and then for mobile content in various directories. If this only happens once in a while, it’s no big deal. But these days I’ve been seeing many different bots requesting these nonexistent resources.

Even worse, these mobile-hungry bots can’t seem to remember where they’ve been – they typically request the same resources repeatedly, and in multiple locations within the directory structure. I frequently see hundreds of these types of requests in my weekly error-log analyses. Needless to say, this is an incredible waste of time, bandwidth, and server resources.

It would be so nice..

So what’s the best solution? Well, obviously the ideal scenario would involve bots and scripts stopping this malicious behavior. Here are just a few ideas for confused bot masters:

  • Stop programming your bots to “assume and guess.”
  • Perform the crawl using a recognized “mobile” user-agent.
  • Remember the results of your initial crawl to avoid repeat 404 requests.

Basically, the engineers programming this sort of behavior into their bots need to realize:

  • If I take the time to set up a mobile site, rest assured that I’ll tell you where it is.
  • From the server’s perspective, there is no difference between guessing for directories and scanning for exploits.
  • By constantly scanning websites for nonexistent directories, you are wasting everyone’s time, money, and resources.

Unfortunately, we both know that this sort of nefarious scumbaggery is not going to stop. So that means it is up to us as administrators to protect against this sort of maliciousness and resolve the issue ourselves.

Robots.txt vs Bad Bots

In a perfect world, 404 errors don’t exist and all bots obey robots.txt directives. But this isn’t a perfect world, and something like this, which should work, doesn’t:

User-agent: *
Disallow: /*/iphone/$
Disallow: /*/iphone$
Disallow: /*/mobile/$
Disallow: /*/mobile$
Disallow: /*/mobi/$
Disallow: /*/mobi$
Disallow: /*/m/$
Disallow: /*/m$

Unfortunately, very few bots obey robots.txt rules. Google does. Yahoo certainly doesn’t, and neither do other bad bots. It’s pretty much impossible to stop malicious requests using the robots.txt file. Fortunately, it’s easy to do with a few lines of HTAccess.. ;)

How to resolve the “I’m a confused bot that can’t find your mobile site” problem

First of all, note that there are probably many ways of dealing with this nonsense. It doesn’t make sense to block IPs or user-agents because they are always changing and/or easily spoofed. But we can either block any requests for nonexistent “mobile-ish” resources, or else redirect such requests to a common location, such as the Home Page. Let’s examine both of these techniques – they are quite similar.

Deny all requests for non-existent mobile content

To use this technique, you’ll need access to your site’s root HTAccess file. In it, place the following code (see note 1):

# BLOCK 404 MOBILE REQUESTS
<IfModule mod_rewrite.c>
 # Enable the rewrite engine (harmless if it is already enabled elsewhere in this file)
 RewriteEngine On
 RewriteCond %{REQUEST_URI} /iphone/?$ [NC,OR]
 RewriteCond %{REQUEST_URI} /mobile/?$ [NC,OR]
 RewriteCond %{REQUEST_URI} /mobi/?$ [NC,OR]
 RewriteCond %{REQUEST_URI} /m/?$ [NC]
 RewriteRule (.*) - [F,L]
</IfModule>

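Properly placed, this code will deny all requests for the following resources, with or without a trailing slash. Because the patterns are not anchored to the site root, the same names are also denied when requested deeper in the directory structure: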

  • http://domain.tld/iphone
  • http://domain.tld/mobile
  • http://domain.tld/mobi
  • http://domain.tld/m

Note that I’m not including the apple-touch icon here because it is better to actually create that file for users who would like to bookmark your site on an Apple device. Additional directories and/or files are easily added to the list by emulating this pattern:

RewriteCond %{REQUEST_URI} /whatever/?$ [NC,OR]

Replace the “whatever” string with, well, whatever you would like to match against. Then place the line above the other RewriteCond lines in your code; because it ends with the OR flag, it should not be the last condition.

I have tested this on Apache 2.2.14 and it works perfectly. Even so, I recommend testing it on your particular setup to be sure it works as advertised.
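
One thing to keep in mind: these rules deny matching requests whether or not the resource exists. If you ever do create a real /m/ or /mobile/ directory, a variation like the following adds two conditions so that only requests for missing files and directories are denied. Treat it as a sketch that extends the rules above rather than part of the tested setup; the -f and -d checks are standard mod_rewrite tests, but try it on your own server first:

# BLOCK 404 MOBILE REQUESTS (ONLY WHEN THE RESOURCE IS MISSING)
<IfModule mod_rewrite.c>
 RewriteEngine On
 # Continue only if the request does not map to an existing file or directory
 RewriteCond %{REQUEST_FILENAME} !-f
 RewriteCond %{REQUEST_FILENAME} !-d
 # Any one of the following patterns is then enough to trigger the rule
 RewriteCond %{REQUEST_URI} /iphone/?$ [NC,OR]
 RewriteCond %{REQUEST_URI} /mobile/?$ [NC,OR]
 RewriteCond %{REQUEST_URI} /mobi/?$ [NC,OR]
 RewriteCond %{REQUEST_URI} /m/?$ [NC]
 RewriteRule (.*) - [F,L]
</IfModule>

Conditions without the OR flag are combined with AND, so the rule fires only when the file is missing, the directory is missing, and at least one of the four patterns matches.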

Redirect all requests for non-existent mobile content

Rather than denying access to mobile-ish requests, you can instead redirect them to the page of your choice. Here is how to redirect such requests to your Home Page:

# REDIRECT 404 MOBILE REQUESTS
<IfModule mod_rewrite.c>
 # Enable the rewrite engine (harmless if it is already enabled elsewhere in this file)
 RewriteEngine On
 RewriteCond %{REQUEST_URI} /iphone/?$ [NC,OR]
 RewriteCond %{REQUEST_URI} /mobile/?$ [NC,OR]
 RewriteCond %{REQUEST_URI} /mobi/?$ [NC,OR]
 RewriteCond %{REQUEST_URI} /m/?$ [NC]
 RewriteRule (.*) http://domain.tld/ [R=301,L]
</IfModule>

Same as before, only here we change the RewriteRule in the last line to redirect to “http://domain.tld/”. Simply change the domain to that of your own and you’re all set.

As before, and always, test thoroughly.
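
If you would rather not hardcode the domain, mod_rewrite also accepts a root-relative target for external redirects. Here is a minimal sketch, assuming the usual behavior where Apache builds the redirect URL from the current scheme and host name. That assumption is worth verifying if your site sits behind a proxy or uses unusual UseCanonicalName settings:

# REDIRECT 404 MOBILE REQUESTS (NO HARDCODED DOMAIN)
<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{REQUEST_URI} /iphone/?$ [NC,OR]
 RewriteCond %{REQUEST_URI} /mobile/?$ [NC,OR]
 RewriteCond %{REQUEST_URI} /mobi/?$ [NC,OR]
 RewriteCond %{REQUEST_URI} /m/?$ [NC]
 # "/" is the site root; the R flag turns the rewrite into an external 301 redirect
 RewriteRule (.*) / [R=301,L]
</IfModule>

The upside is that the same block can be copied between sites without editing; the downside is that the redirect target depends on how the server sees its own host name.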

How does it work?

The logic for either of these methods goes something like this:

  1. If the request is for /iphone or /iphone/ OR…
  2. If the request is for /mobile or /mobile/ OR…
  3. If the request is for /mobi or /mobi/ OR…
  4. If the request is for /m or /m/
  5. Then either deny the request or redirect it to the Home Page (depending on which method you are using)

Pretty simple, but very effective for eliminating malicious mobile requests.
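
The four conditions can also be collapsed into one by using regex alternation, which removes the need for the OR flags. Here is a sketch of the deny variant, assuming the same four names. It should behave the same as the longer version, but test it as you would any HTAccess change:

# BLOCK 404 MOBILE REQUESTS (SINGLE CONDITION)
<IfModule mod_rewrite.c>
 RewriteEngine On
 # Matches /iphone, /mobile, /mobi, or /m, with or without a trailing slash
 RewriteCond %{REQUEST_URI} /(iphone|mobile|mobi|m)/?$ [NC]
 RewriteRule (.*) - [F,L]
</IfModule>

To redirect instead of deny, swap the RewriteRule line for the one used in the redirect method. To cover another directory, add its name to the list inside the parentheses.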

El Wrapz

This technique is useful for saving bandwidth and server resources, not just for non-existent mobile-ish requests, but also for any resource that you would like to block – just add a RewriteCond with the target character string of your choice. Hopefully this technique will help you run a cleaner, safer, and more secure website.

Note

1. If you are using WordPress, place the HTAccess rules before the permalink rules.