« Newer Posts • 12345 • Previous Posts »

Yahoo! Slurp in My Blackhole (Yet Again)

Yup, ‘ol Slurp is at it again, flagrantly disobeying specific robots.txt rules forbidding access to my bad-bot trap, lovingly dubbed the “blackhole.” As many readers know, this is not the first time Yahoo has been caught behaving badly. This time, Yahoo was caught trespassing five different times via three different IPs over the course of four different days. Here is the data recorded in my site’s blackhole log (I know, that sounds terrible): Continue reading »

Yahoo! in my Blackhole

Okay, I realize that the title sounds a bit odd, but nowhere near as odd as my recent discovery of Slurp ignoring explicit robots.txt rules and digging around in my highly specialized bot trap, which I have lovingly dubbed “the blackhole”. What is up with that, Yahoo!? — does your Slurp spider obey robots.txt directives or not? I have never seen Google crawling around that side of town, neither has MSN nor even Ask ventured into the forbidden realms. Has […] Continue reading »

Comprehensive Reference for WordPress No-Nofollow/Dofollow Plugins

Recently, while deliberating an optimal method for eliminating nofollow link attributes from Perishable Press, I collected, installed, tested and reviewed every WordPress no-nofollow/dofollow plugin that I could find. In this article, I present a concise, current, and comprehensive reference for WordPress no-nofollow and dofollow plugins. Every attempt has been made to provide accurate, useful, and complete information for each of the plugins represented below. Further, as this subject is a newfound interest of mine, it is my intention to keep […] Continue reading »

Stop WordPress from Leaking PageRank to Admin Pages

During the most recent Perishable Press redesign, I noticed that several of my WordPress admin pages had been assigned significant levels of PageRank. Not good. After some investigation, I realized that my ancient robots.txt rules were insufficient in preventing Google from indexing various WordPress admin pages. Specifically, the following pages have been indexed and subsequently assigned PageRank: Continue reading »

Eliminate 404 Errors for PHP Functions

Recently, I discussed the suspicious behavior recently observed by the Yahoo! Slurp crawler. As revealed by the site’s closely watched 404-error logs, Yahoo! had been requesting a series of nonexistent resources. Although a majority of the 404 errors were exclusive to the Slurp crawler, there were several instances of requests that were also coming from Google, Live, and even Ask. Initially, these distinct errors were misdiagnosed as existing URLs appended with various JavaScript functions. Here are a few typical examples […] Continue reading »

Suspicious Behavior from Yahoo! Slurp Crawler

[ Image: Black and white illustration of the upper half of a man's suspicious, paranoid face ]

Most of the time, when I catch scumbags attempting to spam, scrape, leech, or otherwise hack my site, I stitch up a new voodoo doll and let the cursing begin. No, seriously, I just blacklist the idiots. I don’t need their traffic, and so I don’t even blink while slamming the doors in their faces. Of course, this policy presents a bit of a dilemma when the culprit is one of the four major search engines. Slamming the door on […] Continue reading »

CSS Throwdown: Preload Images without JavaScript

Clean, easy, effective. You don’t need no stinking JavaScript to preload your images. Nope. Try some tasty CSS and (X)HTML instead! Here’s how to do it with only two easy steps.. Step 1 — Place this in your CSS file: div#preloaded-images { position: absolute; overflow: hidden; left: -9999px; top: -9999px; height: 1px; width: 1px; } Step 2 — Place this at the bottom of your (X)HTML document: <div id="preloaded-images"> <img src="https://perishablepress.com/image-01.png" width="1" height="1" alt="" /> <img src="https://perishablepress.com/image-02.png" width="1" height="1" alt="" […] Continue reading »

Search Engine Registration Notes

In his excellent book, Search Engine Optimization for Dummies, Peter Kent explains that many search engines actually get their search results from one (or more) of the larger search engines, such as Google or The Open Directory Project. Therefore, the author concludes that it may not be necessary to spend endless hours registering with thousands of the smaller search sites. Rather, the author provides a brief list of absolutely essential search sites with which it is highly recommended to register. […] Continue reading »

URL Character Codes

URLs frequently employ potentially conflicting characters such as question marks, ampersands, and pound signs. Fortunately, it is possible to encode such characters via their escaped hexadecimal ASCII representations. For example, we would write ? as %3F. Here are a few more URL character codes (case-insensitive), for easy copy/paste reference. Continue reading »

Embed QuickTime Notes Plus

This post contains random notes and code snippets for embedding QuickTime within HTML web pages. Simply copy, paste, and customize according to your needs. Happy embedding! Continue reading »

Invite Only: Traffic Control via Whitelist

Web developers trying to control comment-spam, bandwidth-theft, and content-scraping must choose between two fundamentally different approaches: selectively deny target offenders (the “blacklist” method) or selectively allow desirable agents (the “opt-in”, or “whitelist” method). Currently popular according to various online forums and discussion boards is the blacklist method. The blacklist method requires the webmaster to create and maintain a working list of undesirable agents, usually blocking their access via htaccess or php. The downside of blacklisting is that it requires considerable […] Continue reading »

Keep it Dark: Hiding and Filtering CSS

Hiding and filtering CSS rules for specifically targeted browsers is often a foregone conclusion when it comes to cross-browser design considerations. Rather than dive into some lengthy dialogue concerning the myriad situations and implications of such design hackery, our current scheduling restraints behoove us to simply cut to the chase and dish the goods. Having said that, we now consider this post a perpetually evolving repository of CSS filters.. Continue reading »

Embed External Content via iframe and div

By using an <iframe></iframe> within a <div></div>, it is possible to include external web content in most any web document. This method serves as an excellent alternative to actual frames, which are not as flexible and definitely not as popular. Indeed, with CSS, the placement, sizing, and styling of div’s provides endless possibilities for embedding external or even internal web content into pages that would otherwise require the use of frames, Flash, or JavaScript. This method works on any modern […] Continue reading »

Disobedient Robots and Company

In our never-ending battle against spammers, leeches, scrapers, and other online undesirables, we have implemented several powerful security measures to improve the operational integrity of our perpetual virtual existence. Here is a rundown of the new behind-the-scenes security features of Perishable Press. Continue reading »

Stop Bitacle from Stealing Content

If you have yet to encounter the content-scraping site, bitacle.org, consider yourself lucky. The scum-sucking worm-holes at bitacle.org are well-known for literally, blatantly, and piggishly stealing blog content and using it for financial gains through advertising. While I am not here to discuss the legal, philosophical, or technical ramifications of illegal bitacle behavior, I am here to provide a few critical tools that will help stop bitacle from stealing your content. Continue reading »

Theme Edits for IE7

This post is a working repository of code edits and other changes made to Perishable Press themes in order for them to function properly in Internet Explorer 7 (IE7). If supporting older versions of IE is your thang. Continue reading »

« Newer Posts • 12345 • Previous Posts »