Yup, ‘ol Slurp is at it again, flagrantly disobeying specific robots.txt rules forbidding access to my bad-bot trap, lovingly dubbed the “blackhole.” As many readers know, this is not the first time Yahoo has been caught behaving badly. This time, Yahoo was caught trespassing five different times via three different IPs over the course of four different days. Here is the data recorded in my site’s blackhole log (I know, that sounds terrible): Continue reading »
Hello! As many of you already know, the popular WordPress anti-spam plugin, Bad Behavior, caused some problems yesterday, and as a result many bloggers and users were locked out of their favorite sites, including this one. As for now, the problem seems to be fixed, however, the experience of being locked out of my own site has left a rather unpleasant taste in my mouth. Needless to say, I will be reconsidering the continued use of Bad Behavior as a […] Continue reading »
In this brief article I explain the atrocity that is UserCash and then provide the JavaScript needed to protect your site. Continue reading »
After implementing any of the hotlink-prevention techniques described in our previous article, you may find it necessary to disable hotlink-protection for a specific directory. By default, htaccess rules apply to the directory in which it is located, as well as all subdirectories contained therein. There are (at least) three ways to enable selective hotlinking: Place hotlink images in an alternate directory This method works great if your hotlink-protection rules are located in a directory other than the site root. Simply […] Continue reading »
In this brief tutorial, we are going to enable users to access any file or directory of a site that is password-protected via htaccess. There are many reasons for wanting to employ this technique, including: Share public resources from an otherwise private site Enable visitors to access content during site maintenance Testing and formatting of layout and design during development As a webmaster, I have used this technique on several occasions. This trick works great for allowing access to any […] Continue reading »
Okay, I realize that the title sounds a bit odd, but nowhere near as odd as my recent discovery of Slurp ignoring explicit robots.txt rules and digging around in my highly specialized bot trap, which I have lovingly dubbed “the blackhole”. What is up with that, Yahoo!? — does your Slurp spider obey robots.txt directives or not? I have never seen Google crawling around that side of town, neither has MSN nor even Ask ventured into the forbidden realms. Has […] Continue reading »
When I wrote my article, Stupid htaccess Tricks, a couple of years ago, hotlink-protection via htaccess was becoming very popular. Many webmasters and bloggers were getting tired of wasting bandwidth on hotlinked resources, and therefore turned to the power of htaccess to protect their content. At that time, there were only a couple of different hotlink-protection methods available on the internet, and the functional difference between them was virtually insignificant. All that was necessary for up-and-coming bloggers-slash-site-administrators to eliminate leaking […] Continue reading »
“Oh no, not again!” It looks like another one of my non-existent bank accounts has been blocked at Bank of America. But that’s cool, because I like, totally graduated from third grade. Knowing best for all grammar and words in email. Let’s examine yet another idiotic phishing attempt, shall we? First, let’s have a look at the full-meal deal (sans bank logos, links, and other forged minutia): Continue reading »
In our original htaccess blacklist article, we provide an extensive list of bad user agents. This so-called “Ultimate htaccess Blacklist” works great at blocking many different online villains: spammers, scammers, scrapers, scrappers, rippers, leechers — you name it. Yet, despite its usefulness, there is always room for improvement. Continue reading »
Keeping track of your access and error logs is a critical component of any serious security strategy. Many times, you will see a recorded entry that looks legitimate, such that it may easily be dismissed as genuine Google fare, only to discover upon closer investigation a fraudulent agent. There are many such cloaked or disguised agents crawling around these days, mimicking various search engines to hide beneath the radar. So it’s always a good idea to implement a procedure for […] Continue reading »
In the hellish battle against spam, many WordPress users have adopted a highly effective trinity of anti-spam plugins: Akismet Bad Behavior Spam Karma This effective triage of free WordPress plugins has served many a WP-blogger well, eliminating virtually 99% of all automated comment-related spam. When spam first became a problem for me, I installed this triple-threat arsenal of anti-spam plugins and immediately enjoyed the results. Although Spam Karma seemed a little invasive and resource-intensive, too much protection seemed far better […] Continue reading »
Insanity reigns in the blogosphere! Check out this sweet little spam comment that found its way to my moderation queue.. Cloth Hello to all, its my new pages about cloth cloth diaper You can buy here 24\7. Yes indeed, “Cloth diaper”!! Come on now, is the competition really that fierce in the cloth diaper industry that companies must turn to the slimy spam cartel for scummy comment links? “its my new pages about cloth” — WTF?!!! Dude! I can’t wait […] Continue reading »
Most of the time, when I catch scumbags attempting to spam, scrape, leech, or otherwise hack my site, I stitch up a new voodoo doll and let the cursing begin. No, seriously, I just blacklist the idiots. I don’t need their traffic, and so I don’t even blink while slamming the doors in their faces. Of course, this policy presents a bit of a dilemma when the culprit is one of the four major search engines. Slamming the door on […] Continue reading »
Figuratively speaking, hunting down and killing spammers, scrapers, and other online scum remains one of our favorite pursuits. Once we have determined that a particular IP address is worthy of banishment, we generally invoke the magical powers of htaccess to lock the gates. When htaccess is not available, we may summon the versatile functionality of PHP to get the job done. This method is straightforward. Simply edit, copy and paste the following code example into the top of any PHP […] Continue reading »
Of all the bizarre, nonsensical, and pointless spam we have received so far this year, this one takes the cake. It was delivered to our designated spam account earlier this month as a plain-text email, which opens with an explanation. Apparently, “Bob Diamond” is “an Hiring Manager” looking to advertise a couple of important items. The first ad seems remotely realistic, but the second ad.. it’s like, “teddy bear features” out of nowhere — you can’t be serious. Continue reading »
Web developers trying to control comment-spam, bandwidth-theft, and content-scraping must choose between two fundamentally different approaches: selectively deny target offenders (the “blacklist” method) or selectively allow desirable agents (the “opt-in”, or “whitelist” method). Currently popular according to various online forums and discussion boards is the blacklist method. The blacklist method requires the webmaster to create and maintain a working list of undesirable agents, usually blocking their access via htaccess or php. The downside of blacklisting is that it requires considerable […] Continue reading »