New book on WordPress Theme Development: WordPress Themes In Depth
blacklist
Tag Archive

Block Tough Proxies

If you want to block tough proxies like hidemyass.com, my previously posted .htaccess methods won’t work. Those methods will block quite a bit of proxy visits to your site, but won’t work on the stealthier proxies. Fortunately, we can use a bit of PHP to keep them out. Read more »

10 Characters for Your WordPress Blacklist

Quick WordPress tip for easily and quietly blocking a ton of comment spam. Akismet and other programs are good at catching most spam, but every now and then a bunch of weird, foreign-language spam will sneak past the filters and post live to your site. Here’s a good example of the kind of stuff that’s easy to block: Read more »

5G Firewall Beta

Update: Check out the new and improved 5G Blacklist 2013! (The beta version provided in this post is now for reference only.) Updating the 4G Blacklist, the new 5G Firewall is now open for beta testing. The new code is better than ever, providing wider protection with less code and fewer false positives. I’ve had much success with this new firewall, but more testing is needed to ensure maximum compatibility and minimal issues. Read more »

What a Malicious Server Scan Looks Like

Like most sites on the Web, Perishable Press is scanned constantly by malicious scripts looking for vulnerabilities and exploit opportunities. There is no end to the type and variety of malicious URL requests. It all depends on the script, the target, and the goal of the attack. Malicious scripts generally seek one of two things: Read more »

Latest Blacklist Entries

Recently cleared several megabytes of log files, detecting patterns, recording anomalies, and blacklisting gross offenders. Gonna break it down into three sections: User Agents Character Strings IP Addresses User Agents User-agents come and go, and are easily spoofed, but it’s worth a few lines of htaccess to block the more persistent bots that repeatedly scan your site with malicious requests. # Nov 2010 User Agents SetEnvIfNoCase User-Agent “MaMa ” keep_out SetEnvIfNoCase User-Agent “choppy” keep_out SetEnvIfNoCase User-Agent “heritrix” keep_out SetEnvIfNoCase User-Agent “Purebot” keep_out SetEnvIfNoCase User-Agent “PostRank” keep_out SetEnvIfNoCase User-Agent “archive.org_bot” keep_out SetEnvIfNoCase User-Agent “msnbot.htm)._” keep_out <limit GET POST PUT> Order Allow,Deny […] Read more »

2010 User-Agent Blacklist

Update: Check out the new and improved 2013 User Agent Blacklist! The 2010 User-Agent Blacklist blocks hundreds of bad bots while ensuring open-access for the major search engines: Google, Bing, Ask, Yahoo, et al. Blocking bad user-agents is an effective addition to any security strategy. It works like this: your site is getting hammered by rogue bots that waste valuable server resources and bandwidth. So you grab a copy of the 2010 UA Blacklist from Perishable Press, include it in your site’s root .htaccess file, and enjoy a more secure and better performing website. It’s that easy. Proven Security The […] Read more »

Protect Your Site with a Blackhole for Bad Bots

One of my favorite security measures here at Perishable Press is the site’s virtual Blackhole trap for bad bots. The concept is simple: include a hidden link to a robots.txt-forbidden directory somewhere on your pages. Bots that ignore or disobey your robots rules will crawl the link and fall into the trap, which then performs a WHOIS Lookup and records the event in the blackhole data file. Once added to the blacklist data file, bad bots immediately are denied access to your site. I call it the “one-strike” rule: bots have one chance to follow the robots.txt protocol, check the […] Read more »

2010 IP Blacklist

Update: Check out the new and improved 2013 IP Blacklist! Over the course of each year, I blacklist a considerable number of individual IP addresses. Every day, Perishable Press is hit with countless numbers of spammers, scrapers, crackers and all sorts of other hapless turds. Weekly examinations of my site’s error logs enable me to filter through the chaff and cherry-pick only the most heinous, nefarious attackers for blacklisting. Minor offenses are generally dismissed, but the evil bastards that insist on wasting resources running redundant automated scripts are immediately investigated via IP lookup and denied access via simple htaccess directive: […] Read more »

Stop 404 Requests for Mobile Versions of Your Site

If you’ve been keeping an eye on your 404 errors recently, you will have noticed an increase in requests for nonexistent mobile files and directories, especially over the past year or so. The scripts and bots requesting these files from your server seem to be looking for a mobile version of your site. Unfortunately, they are wasting bandwidth and resources in the process. It has become common to see the following 404 errors constantly repeated in your log files: http://domain.tld/apple-touch-icon.png http://domain.tld/iphone http://domain.tld/mobile http://domain.tld/mobi http://domain.tld/m So some bot comes along, assumes that your site includes a mobile version, and then tries […] Read more »

HTAccess Privacy for Specific IPs

Running a private site is all about preventing unwanted visitors. Here is a quick and easy way to allow access to multiple IP addresses while redirecting everyone else to a custom message page. To do this, all you need is an HTAccess file and a list of IPs for which you would like to allow access. Edit the following code according to the proceeding instructions and place into the root HTAccess file of your domain: # ALLOW ONLY MULTIPLE IPs <limit GET POST PUT> Order Deny,Allow Deny from all Allow from 123.456.789 Allow from 456.789.123 Allow from 789.123.456 </limit> ErrorDocument […] Read more »

Block Multiple IP Addresses with PHP

Let’s face it. There’s just as much scum on the Internet as there is out there in the “real world.” Maybe even more, who knows. From scammers and spammers to scrapers and crackers, the Web is just crawling with all sorts of pathetic scumbags. As predictably random as much of the malicious activity happens to be, it is virtually guaranteed that you will be hounded by at least a few persistent IP addresses that, for whatever reason, have latched on and just won’t let go. Like satanic parasites, they plague you night and day, haunting you and making your online […] Read more »

Best Practices for Error Monitoring

Given my propensity to discuss matters involving error log data (e.g., monitoring malicious behavior, setting up error logs, and creating extensive blacklists), I am often asked about the best way to go about monitoring 404 and other types of server errors. While I consider myself to be a novice in this arena (there are far brighter people with much greater experience), I do spend a lot of time digging through log entries and analyzing data. So, when asked recently about my error monitoring practices, I decided to share my response here at Perishable Press, and hopefully get some good feedback […] Read more »

4G Series: The Ultimate Referrer Blacklist, Featuring Over 8000 Banned Referrers

You have seen user-agent blacklists, IP blacklists, 4G Blacklists, and everything in between. Now, in this article, for your sheer and utter amusement, I present a collection of over 8000 blacklisted referrers. For the uninitiated, in teh language of teh Web, a referrer is the online resource from whence a visitor happened to arrive at your site. For example, if Johnny the Wonder Parrot was visiting the Mainstream Media website and happened to follow a link to your site (of all places), you would look at your access logs, notice Johnny’s visit, and speak out loud (slowly): “hmmm.. it looks […] Read more »

4G Series: The Ultimate User-Agent Blacklist, Featuring Over 1200 Bad Bots

As discussed in my recent article, Eight Ways to Blacklist with Apache’s mod_rewrite, one method of stopping spammers, scrapers, email harvesters, and malicious bots is to blacklist their associated user agents. Apache enables us to target bad user agents by testing the user-agent string against a predefined blacklist of unwanted visitors. Any bot identifying itself as one of the blacklisted agents is immediately and quietly denied access. While this certainly isn’t the most effective method of securing your site against malicious behavior, it may certainly provide another layer of protection. Even so, there are several things to consider before choosing […] Read more »

The Perishable Press 4G Blacklist

At last! After many months of collecting data, crafting directives, and testing results, I am thrilled to announce the release of the 4G Blacklist! The 4G Blacklist is a next-generation protective firewall that secures your website against a wide range of malicious activity. Like its 3G predecessor, the 4G Blacklist is designed for use on Apache servers and is easily implemented via HTAccess or the httpd.conf configuration file. In order to function properly, the 4G Blacklist requires two specific Apache modules, mod_rewrite and mod_alias. As with the third generation of the blacklist, the 4G Blacklist consists of multiple parts: Update […] Read more »

Yahoo! Slurp too Stupid to be a Robot

I really hate bad robots. When a web crawler, spider, bot — or whatever you want to call it — behaves in a way that is contrary to expected and/or accepted protocols, we say that the bot is acting suspiciously, behaving badly, or just acting stupid in general. Unfortunately, there are thousands — if not hundreds of thousands — of nefarious bots violating our websites every minute of the day. For the most part, there are effective methods available enabling us to protect our sites against the endless hordes of irrelevant and mischievous bots. Such evil is easily blocked with […] Read more »

Latest Tweets Book Giveaway! Win a free copy of Digging Into WordPress - includes updates and all bonus material: digwp.com/2014/11/book-giveawa…