Block Bad Bots with Blackhole Pro + Save 25% with code: CENTAURUS Get plugin »

Upload Large Files or Die Trying

I recently spent some time wrestling with various e-commerce/shopping-cart/membership plugins. One of them was of course the popular WP e-Commerce plugin, which uses a directory named “downloadables” to store your precious goods. I had some large files that needed to go into this folder, but the server’s upload limit stopped me from using the plugin’s built-in file uploader to do so. Read more »


One thing I love about Twitter is the instant feedback. For the past few weeks I’ve been seeing lots of 404 requests like this: At first I thought it was some skript kiddie getting creative, you know as a play on the robots.txt file, which is also located in the root of many websites. So it seemed interesting enough to tweet about: Read more »

Clean Up Malicious Links with HTAccess

I recently spent some time analyzing Perishable Press pages as they appear in the search results for Google, Bing, et al. Google Webmaster Tools provides a wealth of information about crawl errors, as well as the URLs of any pages that link to missing content. Combined with your site’s access/error logs, you have everything needed to track down 404 errors and clean up your listings in the search engine results. Read more »

5G Firewall Beta

Update: Check out the new and improved 6G Firewall 2016! Updating the 4G Blacklist, the new 5G Firewall is now open for beta testing. The new code is better than ever, providing wider protection with less code and fewer false positives. I’ve had much success with this new firewall, but more testing is needed to ensure maximum compatibility and minimal issues. Read more »

Canonical URLs and Subdomains with Plesk

I am in the process of migrating my sites from A Small Orange to Media Temple. Part of that process involves canonicalizing domain URLs to help maximize SEO strategy. At ASO, URL canonicalization required just a few htaccess directives: # enforce no www prefix <ifmodule mod_rewrite.c> RewriteCond %{HTTP_HOST} !^domain\.tld$ [NC] RewriteRule ^(.*)$ http://domain.tld/$1 [R=301,L] </ifmodule> When placed in the web-accessible root directory’s htaccess file, that snippet will ensure that all requests for your site are not prefixed with www. There’s also a force-www technique if that’s how you roll. Either way, the point is that on most shared hosting, URL […] Read more »

Latest Blacklist Entries

Recently cleared several megabytes of log files, detecting patterns, recording anomalies, and blacklisting gross offenders. Gonna break it down into three sections: User Agents Character Strings IP Addresses User Agents User-agents come and go, and are easily spoofed, but it’s worth a few lines of htaccess to block the more persistent bots that repeatedly scan your site with malicious requests. # Nov 2010 User Agents SetEnvIfNoCase User-Agent “MaMa ” keep_out SetEnvIfNoCase User-Agent “choppy” keep_out SetEnvIfNoCase User-Agent “heritrix” keep_out SetEnvIfNoCase User-Agent “Purebot” keep_out SetEnvIfNoCase User-Agent “PostRank” keep_out SetEnvIfNoCase User-Agent “archive.org_bot” keep_out SetEnvIfNoCase User-Agent “msnbot.htm)._” keep_out <limit GET POST PUT> Order Allow,Deny […] Read more »

How to Deal with Content Scrapers

Chris Coyier of CSS-Tricks recently declared that people should do “nothing” in response to other sites scraping their content. I totally get what Chris is saying here. He is basically saying that the original source of content is better than scrapers because: it’s on a domain with more trust. you published that article first. it’s coded better for SEO than theirs. it’s better designed than theirs. it isn’t at risk for serious penalization from search engines. If these things are all true, then I agree, you have nothing to worry about. Unfortunately, that’s a tall order for many sites on […] Read more »

2010 User-Agent Blacklist

Update: Check out the new and improved 2013 User Agent Blacklist! The 2010 User-Agent Blacklist blocks hundreds of bad bots while ensuring open-access for the major search engines: Google, Bing, Ask, Yahoo, et al. Blocking bad user-agents is an effective addition to any security strategy. It works like this: your site is getting hammered by rogue bots that waste valuable server resources and bandwidth. So you grab a copy of the 2010 UA Blacklist from Perishable Press, include it in your site’s root .htaccess file, and enjoy a more secure and better performing website. It’s that easy. Proven Security The […] Read more »

Protect Your Site with a Blackhole for Bad Bots

Update: Pro version of Blackhole for Bad Bots (WordPress security plugin) now available! Check out Blackhole Pro » One of my favorite security measures here at Perishable Press is the site’s virtual Blackhole trap for bad bots. The concept is simple: include a hidden link to a robots.txt-forbidden directory somewhere on your pages. Bots that ignore or disobey your robots rules will crawl the link and fall into the honeypot trap, which then performs a WHOIS Lookup and records the event in the blackhole data file. Once added to the blacklist data file, bad bots immediately are denied access to […] Read more »

htaccess Code for WordPress Multisite

For the upcoming Digging into WordPress update for WordPress 3.0, I have been working with WordPress’ multisite functionality. Prior to version 3.0, WordPress came in two flavors: “original” and “multisite” (MU). Most designers probably work with regular, one-blog installations of “regular” WordPress. The htaccess rules for all single-blog installations of WordPress haven’t changed. They are the same for WordPress 3.0 as they are for all previous versions. But now that multisite has merged with regular-flavored WordPress, we can stick with single-blog installs (which is how things are setup by default), or we can activate multisite functionality and create an unlimited […] Read more »

2010 IP Blacklist

Update: Check out the new and improved 2013 IP Blacklist! Over the course of each year, I blacklist a considerable number of individual IP addresses. Every day, Perishable Press is hit with countless numbers of spammers, scrapers, crackers and all sorts of other hapless turds. Weekly examinations of my site’s error logs enable me to filter through the chaff and cherry-pick only the most heinous, nefarious attackers for blacklisting. Minor offenses are generally dismissed, but the evil bastards that insist on wasting resources running redundant automated scripts are immediately investigated via IP lookup and denied access via simple htaccess directive: […] Read more »

htaccess Redirect to Maintenance Page

Redirecting visitors to a maintenance page or other temporary page is an essential tool to have in your tool belt. Using HTAccess, redirecting visitors to a temporary maintenance page is simple and effective. All you need to redirect your visitors is the following code placed in your site’s root HTAccess: # MAINTENANCE-PAGE REDIRECT <ifmodule mod_rewrite.c> RewriteEngine on RewriteCond %{REMOTE_ADDR} !^123\.456\.789\.000 RewriteCond %{REQUEST_URI} !/maintenance.html$ [NC] RewriteCond %{REQUEST_URI} !\.(jpe?g?|png|gif) [NC] RewriteRule .* /maintenance.html [R=302,L] </ifmodule> That is the official copy-&-paste goodness right there. Just grab, gulp and go. Of course, there are a few more details for those who may be unfamiliar […] Read more »

Stop 404 Requests for Mobile Versions of Your Site

If you’ve been keeping an eye on your 404 errors recently, you will have noticed an increase in requests for nonexistent mobile files and directories, especially over the past year or so. The scripts and bots requesting these files from your server seem to be looking for a mobile version of your site. Unfortunately, they are wasting bandwidth and resources in the process. It has become common to see the following 404 errors constantly repeated in your log files: http://domain.tld/apple-touch-icon.png http://domain.tld/iphone http://domain.tld/mobile http://domain.tld/mobi http://domain.tld/m So some bot comes along, assumes that your site includes a mobile version, and then tries […] Read more »

Is it Secret? Is it Safe?

Whenever I find myself working with PHP or messing around with server settings, I nearly always create a phpinfo.php file and place it in the root directory of whatever domain I happen to be working on. These types of informational files employ PHP’s handy phpinfo() function to display a concise summary of all of your server’s variables, which may then be referenced for debugging purposes, bragging rights, and so on. While this sort of thing is normally okay, I frequently forget to remove the file and just leave it sitting there for the entire world to look at. This of […] Read more »

HTAccess Privacy for Specific IPs

Running a private site is all about preventing unwanted visitors. Here is a quick and easy way to allow access to multiple IP addresses while redirecting everyone else to a custom message page. To do this, all you need is an HTAccess file and a list of IPs for which you would like to allow access. Edit the following code according to the proceeding instructions and place into the root HTAccess file of your domain: # ALLOW ONLY MULTIPLE IPs <limit GET POST PUT> Order Deny,Allow Deny from all Allow from 123.456.789 Allow from 456.789.123 Allow from 789.123.456 </limit> ErrorDocument […] Read more »

Disable Trace and Track for Better Security

The shared server on which I host Perishable Press was recently scanned by security software that revealed a significant security risk. Namely, the HTTP request methods TRACE and TRACK were found to be enabled on my webserver. The TRACE and TRACK protocols are HTTP methods used in the debugging of webserver connections. Although these methods are useful for legitimate purposes, they may compromise the security of your server by enabling cross-site scripting attacks (XST). By exploiting certain browser vulnerabilities, an attacker may manipulate the TRACE and TRACK methods to intercept your visitors’ sensitive data. The solution, of course, is disable […] Read more »

Latest Tweets #WordPress Popular Posts Shortcode:…