Tag: 404

Stop 404 Requests for Mobile Versions of Your Site

Posted on April 26, 2010 in Function by Jeff Starr

If you’ve been keeping an eye on your 404 errors recently, you will have noticed an increase in requests for nonexistent mobile files and directories, especially over the past year or so. The scripts and bots requesting these files from your server seem to be looking for a mobile version of your site. Unfortunately, they are wasting bandwidth and resources in the process. It has become common to see the following 404 errors constantly repeated in your log files:

  • http://domain.tld/apple-touch-icon.png
  • http://domain.tld/iphone
  • http://domain.tld/mobile
  • http://domain.tld/mobi
  • http://domain.tld/m

So some bot comes along, assumes that your site includes a mobile version, and then tries its hand at guessing the location. In the common request-set listed above, we see the bot looking first for an “apple-touch icon,” and then for mobile content in various directories. If this only happens once in a while, it’s no big deal. But these days I’ve been seeing many different bots requesting these nonexistent resources.

Even worse, these mobile-hungry bots can’t seem to remember where they’ve been – they typically request the same resources repeatedly, and in multiple locations within the directory structure. I frequently see hundreds of these types of requests in my weekly error-log analyses. Needless to say, this is an incredible waste of time, bandwidth, and server resources.
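To get a feel for how often this happens on your own server, the requests can be tallied straight from an access log. The following is only a minimal sketch in Python, assuming log lines in the Common Log Format; the function name `count_mobile_404s` and both patterns are illustrative and not taken from the original post.

```python
import re
from collections import Counter

# The five names from the list above; bots may request them at any depth
# in the directory structure, so the pattern is anchored at the end only.
MOBILE_PROBE = re.compile(r"/(?:apple-touch-icon\.png|iphone|mobile|mobi|m)/?$")

# Assumed Common Log Format fragment: "GET /path HTTP/1.1" 404
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3})')


def count_mobile_404s(lines):
    """Count 404 responses whose path ends in one of the mobile probe names."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match.group("status") != "404":
            continue
        # Ignore any query string before testing the path.
        path = match.group("path").split("?", 1)[0]
        probe = MOBILE_PROBE.search(path)
        if probe:
            hits[probe.group(0).strip("/")] += 1
    return hits
```

Calling `count_mobile_404s(open("access.log"))` on a log file (the file name is a placeholder) returns a `Counter` keyed by the probed name, which makes it easy to see which of the five requests dominates.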

Continue Reading

Pimp Your 404: Presentation and Functionality

Posted on November 2, 2009 in Function, Presentation by Jeff Starr

I have been wanting to write about 404 error pages for quite a while now. They have always been very important to me, with customized error pages playing an integral part in every well-rounded web-design strategy. Rather than try to re-invent the wheel with this, I think I will just go through and discuss some thoughts about 404 error pages, share some useful code snippets, and highlight some suggested resources along the way. In a sense, this post is nothing more than a giant “brain-dump” of all things 404 for future reference. Hopefully you will find it useful in pimping your own 404.

When requested page is not found by server, error message is returned; this is the essence of the 404 — Ancient Chinese proverb

Continue Reading

Best Practices for Error Monitoring

Posted on May 3, 2009 in Websites by Jeff Starr

Given my propensity to discuss matters involving error log data (e.g., monitoring malicious behavior, setting up error logs, and creating extensive blacklists), I am often asked about the best way to go about monitoring 404 and other types of server errors. While I consider myself to be a novice in this arena (there are far brighter people with much greater experience), I do spend a lot of time digging through log entries and analyzing data. So, when asked recently about my error monitoring practices, I decided to share my response here at Perishable Press, and hopefully get some good feedback concerning best practices for error monitoring. Here is my email response to the question:

Continue Reading

Plenty of Errors to Chew On..

Posted on November 6, 2007 in Perishable by Jeff Starr

Alrighty then! Looks like recent changes to site structure have really dropped a bomb on quite a few regular visitors out there. After switching over to the new default theme last night, I had set up an email notification system to alert me of all errors encountered at this domain (i.e., the main site and all test sites). Of course, I knew that at least a few errors would be inevitable, but I had no idea that I would receive nearly 300 of them!

After examining the nature of these errors, it appears that the bulk of them are the result of either Google showing confusion over the new image directory structure, or people visiting the site with a browser cache full of old theme files. Apparently, a few visitors were actually using some of the alternate themes that I had provided via the previous default theme. So, now that the alternate themes have been removed (temporarily, for a few months), visitors are experiencing errors when they visit the site. Uhh, not so good, especially for them.

Continue Reading

Eliminate 404 Errors for PHP Functions

Posted on August 27, 2007 in Function by Jeff Starr

Recently, I discussed the suspicious behavior exhibited by the Yahoo! Slurp crawler. As revealed by the site’s closely watched 404-error logs, Yahoo! had been requesting a series of nonexistent resources. Although a majority of the 404 errors were exclusive to the Slurp crawler, there were several instances of requests that were also coming from Google, Live, and even Ask. Initially, these distinct errors were misdiagnosed as existing URLs appended with various JavaScript functions. Here are a few typical examples of these frequently observed log entries:

http://perishablepress.com/press/category/websites/feed/function.opendir
http://perishablepress.com/press/category/websites/feed/function.array-rand
http://perishablepress.com/press/category/websites/feed/function.mkdir
http://perishablepress.com/press/category/websites/feed/ref.outcontrol

Fortunately, an insightful reader named Bas pointed out that the errors were actually PHP functions. Bas explains:

The two functions (array_rand and opendir) you define as javascript functions are PHP functions. Some servers generate clickable links to the php manual (which uses function.NAMEOFFUNCTION in their URL’s) in php scripting error messages. Maybe that’s also the cause of these problems.
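Requests of this kind are easy to pick out programmatically, since each one is an existing URL with a `function.<name>` or `ref.<name>` segment tacked on. Here is a rough Python sketch of that check; the pattern and the helper name `split_php_manual_suffix` are assumptions made for illustration, and this is not necessarily how the errors were eliminated on the site.

```python
import re

# Assumed pattern: a trailing "function.<name>" or "ref.<name>" segment,
# matching the shape of the log entries listed above.
PHP_MANUAL_SUFFIX = re.compile(r"/(?:function|ref)\.[a-z0-9_-]+/?$")


def split_php_manual_suffix(url):
    """Return (base_url, suffix) for a PHP-manual-style URL, else (url, None)."""
    match = PHP_MANUAL_SUFFIX.search(url)
    if match is None:
        return url, None
    # Keep the slash that precedes the suffix so the base URL stays intact.
    return url[: match.start() + 1], match.group(0).strip("/")
```

The base URL returned by the helper is the page that actually exists, so it could serve as a redirect target for the bogus request.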

Continue Reading

Suspicious Behavior from Yahoo! Slurp Crawler

Posted on August 13, 2007 in Websites by Jeff Starr

Most of the time, when I catch scumbags attempting to spam, scrape, leech, or otherwise hack my site, I stitch up a new voodoo doll and let the cursing begin. No, seriously, I just blacklist the idiots. I don’t need their traffic, and so I don’t even blink while slamming the doors in their faces.

Of course, this policy presents a bit of a dilemma when the culprit is one of the four major search engines. Slamming the door on Yahoo! would be unwise, but if their Slurp crawler continues behaving suspiciously, I may have no choice. Check out the following records, pulled directly from one of my error logs, where Yahoo! exhibits some extremely questionable behavior.

Continue Reading

Standards-Compliance Throwdown: MS-IE5/6 DNS/404 Error-Page Redesign

Posted on May 1, 2007 in Presentation, Standards by Jeff Starr

Screenshot: default IE 404 error page
Default DNS Error page for Internet Explorer

First of all, congratulations if you are geeky enough to understand the title of this article. Many would be like, "CSS, MS.. IE, error ..what..?" Whatever. If you get the title, you will get the point of this utterly pointless exercise. If that is the case, prepare for a delightful romp through geekland. Otherwise, save your precious time and stop reading here (exit strategy).

Well, okay, for the seriously unenlightened, let us explain the object of our present focus:

The default "DNS Error" page for Internet Explorer unfortunately remains a familiar sight for millions of Microsoft users. Typically, the default MS DNS Error page loads whenever a browser is unable to connect to the internet or other networked resource. Once loaded, the error page announces itself with a message that reads "The page cannot be displayed." The page then presents several options: refresh browser, retype address, check connection, check configuration, etc.     — Monzilla Media (i.e., me)

Still interested? Well, okay. Actually, it’s no big deal. Just a nice, standards-compliant, CSS-based redesign of that old, nappy Internet Explorer 404 Error page. You know the one. Whenever you can’t connect to the internet, it jumps up at you, sticks out its tongue and mocks you. Yes, we hate it, too. But alas, with the release of Internet Explorer 7 comes a ‘brand new’ 404 error page. Surely, it’s just a matter of time before that dumpy old 404 error page circa IE5/6 disappears forever. So, before that tragedy unfolds..

Continue Reading

Disobedient Robots and Company

Posted on January 1, 2007 in Perishable, Websites by Jeff Starr

In our never-ending battle against spammers, leeches, scrapers, and other online undesirables, we have implemented several powerful security measures to improve the operational integrity of our perpetual virtual existence. Here is a rundown of the new behind-the-scenes security features of Perishable Press:

  • Automated spambot trap, designed to identify bots (and/or stupid people) that disobey rules specified in the site’s robots.txt file.
  • Automated disobedient-robot identification (via reverse IP lookup), admin-notification (via email) and blacklist inclusion (via htaccess).
  • Automated inclusion of disobedient robot identification on our now public "Disobedient Robots" page.
  • Improved htaccess rules, designed to eliminate scum-sucking worms and other useless vermin.
  • Automated tracking tools, designed to keep a close eye on any suspicious or questionable activity.
  • Automated 404-error statistics, designed to optimize the elimination of 404 errors.
  • Plus a few other secret-agent tricks that we are not at liberty to discuss ;)

As you can see, we have been pretty busy around here. Fortunately, the new security features have been working flawlessly, reducing stolen bandwidth, potential spam, disobedient robots, and 404 errors. Hopefully, the end result of these new features will involve smoother site functionality and better browsing for everyone.
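For the curious, the core idea behind a robots.txt trap (the first item in the list above) can be sketched in a few lines of Python using the standard library's `urllib.robotparser`. This is only an illustration under stated assumptions: the `/trap/` directory, the sample rules, and the function name `find_disobedient` are hypothetical, and the reverse IP lookup, email notification, and htaccess blacklisting steps are not shown.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with a trap directory that obedient robots never visit.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""


def find_disobedient(requests, robots_txt=ROBOTS_TXT):
    """Return the sorted IPs that requested a path their user agent is disallowed from."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    offenders = set()
    for ip, user_agent, path in requests:
        # can_fetch() is False when robots.txt forbids this agent from the path.
        if not parser.can_fetch(user_agent, path):
            offenders.add(ip)
    return sorted(offenders)
```

Each request is passed in as an `(ip, user_agent, path)` tuple, so the same function works on data pulled from any log format once those three fields have been extracted.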