Tag Archive

Tools to check your site’s health

Perishable Press is now over 12 years old. It is a lot of work keeping everything updated, maintained, and well-secured. Fortunately there are a gazillion free online tools for checking your site’s health. Everyone has their favorites. In this quick article, I share mine. Read more »

xy.css moved to Perishable Press

Recently I’ve been implementing SSL on my domains and have been streamlining and updating some projects along the way. Consolidating properties is a great way to simplify workflow and boost productivity, so I’ve gone ahead and moved xyCSS from its own domain, xy.css, to its new home here at Perishable Press. Read more »

2013 User Agent Blacklist

The 2013 User Agent Blacklist blocks hundreds of the worst bots while ensuring open-access for normal traffic, major search engines (Google, Bing, et al), good browsers (Chrome, Firefox, Opera, et al), and everyone else. Compared to blocking threats by IP, blocking by user-agent is more effective as a general security strategy. Although it’s trivial to spoof any user agent, many bad requests continue to report user-agent strings that are known to be associated with malicious activity. For example, the notorious “httrack” user agent has been widely blocked since at least 2007, yet it continues to plague sites to this day. […] Read more »
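As a rough illustration of the user-agent approach described above, here is a minimal sketch for Apache 2.4. It is not the actual 2013 blacklist: the `bad_bot` variable name is a hypothetical choice, and “httrack” is the only pattern taken from the text. It assumes mod_setenvif and mod_authz_core are available and that .htaccess overrides are allowed.

```apache
# Flag any request whose User-Agent contains "httrack" (case-insensitive).
# "bad_bot" is a hypothetical variable name chosen for this sketch.
SetEnvIfNoCase User-Agent "httrack" bad_bot

# Allow everyone except requests flagged above (Apache 2.4 syntax).
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>
```

Because the match is on a self-reported string, a bot that changes its user agent slips past this rule, which is the spoofing caveat noted above.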

WP-Mix – A fresh mix of code snippets and tutorials

Wrapping up 2012, I finally launched xyCSS, which is all about responsive, grid-based design. To showcase xy.css, I used it to design WP-Mix, which also serves to house a growing collection of choice code snippets. Currently WP-Mix features over 100 snippets, tutorials, and other useful bits to help with WordPress development and web design in general. The topics are similar to those at Perishable Press (e.g., WordPress, PHP, JavaScript, CSS, etc.), but the posts are less involved and aimed at intermediate to advanced developers. Read more »

xy.css – Responsive Grid Design

For the past year or so, I’ve been heavy into responsive, grid-based design. In December, I “soft-launched” my new site, xyCSS, with a simple tweet: Bringing it all together: As implied (and explained), xy.css is a lightweight CSS template for creating semantic HTML5 designs on a responsive liquid matrix. Read more »

What I did in 2012

It’s been an amazing year across the board. Here is a quick recap of some of the things I did in 2012. I don’t keep a journal of every little detail, but here are some of the things I remember specifically setting out to do, sort of organized by month. Read more »

Notes on Switching Servers

Switching servers & migrating sites can be a HUGE deal (or not), depending on things like the number of sites to transfer, the size and complexity of the sites, who is hosting your sites, and your experience. I recently did this, switching from a 3-year run at ASO to my new home at Media Temple. Total of 24 properties, with WordPress running on around 10 sites. Past experience with VPS servers really had me paranoid about running out of memory. A few years ago, Perishable Press alone gobbled up 256MB of RAM at WiredTree, so add another 23 sites on top of that and needless […] Read more »

Canonical URLs and Subdomains with Plesk

I am in the process of migrating my sites from A Small Orange to Media Temple. Part of that process involves canonicalizing domain URLs to help maximize SEO strategy. At ASO, URL canonicalization required just a few htaccess directives:
    # enforce no www prefix
    <ifmodule mod_rewrite.c>
    RewriteCond %{HTTP_HOST} !^domain\.tld$ [NC]
    RewriteRule ^(.*)$ http://domain.tld/$1 [R=301,L]
    </ifmodule>
When placed in the web-accessible root directory’s htaccess file, that snippet will ensure that all requests for your site are not prefixed with www. There’s also a force-www technique if that’s how you roll. Either way, the point is that on most shared hosting, URL […] Read more »
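The excerpt mentions a force-www alternative without showing it. A minimal sketch of what such a rule could look like, assuming the same placeholder `domain.tld`, plain HTTP, and mod_rewrite in a root .htaccess file; this is an illustration, not necessarily the technique the article links to:

```apache
# enforce www prefix (sketch; domain.tld is a placeholder)
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Only redirect requests that arrive on the bare domain
    RewriteCond %{HTTP_HOST} ^domain\.tld$ [NC]
    RewriteRule ^(.*)$ http://www.domain.tld/$1 [R=301,L]
</IfModule>
```

`RewriteEngine On` is included so the block works on its own; it can be dropped if the engine is already enabled elsewhere in the same file.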

Recent Drama, News, and Other Stuff

Okay so it’s been a while. That’s a good thing because it means I’m busy. But it also sucks because life moves too fast to blog about everything that happens. Especially with web design: you get started blogging about your discoveries, and then you find yourself learning and doing too much to post or tweet about even just the big stuff. But now I have some time to write and share some of the awesome and insane things that have happened since my boring 2009 personal update. So much has happened since then but I will try to stay focused because […] Read more »

Latest Blacklist Entries

Recently cleared several megabytes of log files, detecting patterns, recording anomalies, and blacklisting gross offenders. Gonna break it down into three sections: User Agents, Character Strings, and IP Addresses.
User Agents
User-agents come and go, and are easily spoofed, but it’s worth a few lines of htaccess to block the more persistent bots that repeatedly scan your site with malicious requests.
    # Nov 2010 User Agents
    SetEnvIfNoCase User-Agent "MaMa " keep_out
    SetEnvIfNoCase User-Agent "choppy" keep_out
    SetEnvIfNoCase User-Agent "heritrix" keep_out
    SetEnvIfNoCase User-Agent "Purebot" keep_out
    SetEnvIfNoCase User-Agent "PostRank" keep_out
    SetEnvIfNoCase User-Agent "archive.org_bot" keep_out
    SetEnvIfNoCase User-Agent "msnbot.htm)._" keep_out
    <limit GET POST PUT>
    Order Allow,Deny
[…] Read more »

How to Deal with Content Scrapers

Chris Coyier of CSS-Tricks recently declared that people should do “nothing” in response to other sites scraping their content. I totally get what Chris is saying here. He is basically saying that the original source of content is better than scrapers because it’s on a domain with more trust; you published that article first; it’s coded better for SEO than theirs; it’s better designed than theirs; and it isn’t at risk for serious penalization from search engines. If these things are all true, then I agree, you have nothing to worry about. Unfortunately, that’s a tall order for many sites on […] Read more »

Country, Regional, and State Abbreviations

Creating dropdown menus for web forms is such a fun way to spend the afternoon. One of the funnest things for me is adding all of the regional, state, and country codes when they’re required. Here are a few lists to make my web-dev life a little easier. Here’s a quick jump menu: Country and Regional Abbreviations, US State Abbreviations, US States, Download as plain text file. Read more »

Lessons Learned after 5 Years of Blogging

This Fall, I celebrate five years of blogging. I have written tons of web development stuff at Perishable Press, lots of helpful WordPress stuff at Digging into WordPress, some creative/artistic stuff at Dead Letter Art, jQuery stuff at jQuery Mix, and some business-related web-design stuff at Monzilla Media. Plus a bunch of interviews, guest posts, and other blogging projects. So yeah, lots of blogging and writing during the past five years. And they just flew by. Despite what the haters may say, there are some tangible benefits to blogging. As I write, I continue to learn a great deal – […] Read more »

2010 User-Agent Blacklist

Update: Check out the new and improved 2013 User Agent Blacklist! The 2010 User-Agent Blacklist blocks hundreds of bad bots while ensuring open-access for the major search engines: Google, Bing, Ask, Yahoo, et al. Blocking bad user-agents is an effective addition to any security strategy. It works like this: your site is getting hammered by rogue bots that waste valuable server resources and bandwidth. So you grab a copy of the 2010 UA Blacklist from Perishable Press, include it in your site’s root .htaccess file, and enjoy a more secure and better performing website. It’s that easy. Proven Security The […] Read more »

Protect Your Site with a Blackhole for Bad Bots

Update: Pro version of Blackhole for Bad Bots (WordPress security plugin) now available! Check out Blackhole Pro » One of my favorite security measures here at Perishable Press is the site’s virtual Blackhole trap for bad bots. The concept is simple: include a hidden link to a robots.txt-forbidden directory somewhere on your pages. Bots that ignore or disobey your robots rules will crawl the link and fall into the honeypot trap, which then performs a WHOIS Lookup and records the event in the blackhole data file. Once added to the blacklist data file, bad bots are immediately denied access to […] Read more »

2010 IP Blacklist

Update: Check out the new and improved 2013 IP Blacklist! Over the course of each year, I blacklist a considerable number of individual IP addresses. Every day, Perishable Press is hit with countless spammers, scrapers, crackers and all sorts of other hapless turds. Weekly examinations of my site’s error logs enable me to filter through the chaff and cherry-pick only the most heinous, nefarious attackers for blacklisting. Minor offenses are generally dismissed, but the evil bastards that insist on wasting resources running redundant automated scripts are immediately investigated via IP lookup and denied access via a simple htaccess directive: […] Read more »
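For readers unfamiliar with the kind of directive referred to above, here is a hedged sketch of denying a single address on Apache 2.4. The address 203.0.113.45 comes from a range reserved for documentation and stands in for a real offender; it is not from the author’s blacklist, which is elided in the excerpt. The sketch assumes mod_authz_core and mod_authz_host are available.

```apache
# Block one address while allowing everyone else (Apache 2.4 syntax).
# 203.0.113.45 is a placeholder from the documentation range 203.0.113.0/24.
<RequireAll>
    Require all granted
    Require not ip 203.0.113.45
</RequireAll>
```

Older Apache 2.2 configurations express the same idea with the Order, Allow, and Deny directives.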
