I am in the process of migrating my sites from A Small Orange to Media Temple. Part of that process involves canonicalizing domain URLs to help maximize SEO strategy. At ASO, URL canonicalization required just a few htaccess directives:
# enforce no www prefix
<IfModule mod_rewrite.c>
RewriteCond %{HTTP_HOST} !^domain\.tld$ [NC]
RewriteRule ^(.*)$ http://domain.tld/$1 [R=301,L]
</IfModule>
When placed in the web-accessible root directory’s htaccess file, that snippet will ensure that all requests for your site are not prefixed with www. There’s also a force-www technique if that’s how you roll. Either way, the point is that on most shared hosting, URL canonicalization is simple.
Continue Reading
Running a private site is all about preventing unwanted visitors. Here is a quick and easy way to allow access to multiple IP addresses while redirecting everyone else to a custom message page.
To do this, all you need is an HTAccess file and a list of IPs for which you would like to allow access.
Edit the following code according to the proceeding instructions and place into the root HTAccess file of your domain:
# ALLOW ONLY MULTIPLE IPs
<Limit GET POST PUT>
Order Deny,Allow
Deny from all
Allow from 123.456.789
Allow from 456.789.123
Allow from 789.123.456
</Limit>
ErrorDocument 403 path/custom-message.html
<Files path/custom-message.html>
Order Allow,Deny
Allow from all
</Files>
To prepare this code for use on your site, do these three things:
- Edit the three IP addresses to suit your needs. Feel free to add more IPs or remove any that aren’t needed.
- Edit both instances of “
path/custom-message.html” to match the path and file name of the file that will contain your custom message. This may be anything, anywhere, with any functionality you desire.
- That’s it. Copy/paste into your site’s root htaccess file, upload, test, and get out!
Continue Reading
Recently a reader asked about how to password-protect a directory for every specified IP while allowing open access to everyone else. In my article, Stupid htaccess Tricks, I show how to password-protect a directory for every IP except the one specified, but not for the reverse case. In this article, I will demonstrate this technique along with a wide variety of other useful password-protection tricks, including a few from my Stupid htaccess Tricks article. Before getting into the juicy stuff, we’ll review a few basics of HTAccess password protection.
Continue Reading
Normally, when visitors post a comment to your site, specific types of client data are associated with the request. Commonly, a client will provide a user agent, a referrer, and a host header. When any of these variables is absent, there is good reason to suspect foul play. For example, virtually all browsers provide some sort of user-agent name to identify themselves. Conversely, malicious scripts directly posting spam and other payloads to your site frequently operate without specifying a user agent. In the Ultimate User-Agent Blacklist, we account for the “no-user-agent” case in the very first directive, preventing a host of anonymous visitors from hitting the site.
In addition to empty user-agent strings, malicious requests for site content frequently fail to provide any referrer information. Unless special privacy software is being used, the web page from which a visitor has arrived at your site will be specified in the header information for that request. Likewise, when a visitor posts a comment at your site, the referrer string for that post request will be the URL of that particular page. Thus, as with blank user-agent requests, no-referrer requests are frequently indicative of spam and other malicious behavior.
Another important piece of information provided by all legitimate clients is the host request header. The host header specifies the Internet host and port number of the requested resource. This information is required for all clients making HTTP/1.1 requests. Thus, requiring the host request-header field for all posts to your site safely eliminates illicit requests from hitting your server.
Continue Reading
Just like last year, this Spring I have been taking some time to do some general maintenance here at Perishable Press. This includes everything from fixing broken links and resolving errors to optimizing scripts and eliminating unnecessary plugins. I’ll admit, this type of work is often quite dull, however I always enjoy the process of cleaning up my HTAccess files. In this post, I share some of the changes made to my HTAccess files and explain the reasoning behind each modification. Some of the changes may surprise you! ;)
Continue Reading
Given my propensity to discuss matters involving error log data (e.g., monitoring malicious behavior, setting up error logs, and creating extensive blacklists), I am often asked about the best way to go about monitoring 404 and other types of server errors. While I consider myself to be a novice in this arena (there are far brighter people with much greater experience), I do spend a lot of time digging through log entries and analyzing data. So, when asked recently about my error monitoring practices, I decided to share my response here at Perishable Press, and hopefully get some good feedback concerning best practices for error monitoring. Here is my email response to the question:
Continue Reading
You have seen user-agent blacklists, IP blacklists, 4G Blacklists, and everything in between. Now, in this article, for your sheer and utter amusement, I present a collection of over 8000 blacklisted referrers.
For the uninitiated, in teh language of teh Web, a referrer is the online resource from whence a visitor happened to arrive at your site. For example, if Johnny the Wonder Parrot was visiting the Mainstream Media website and happened to follow a link to your site (of all places), you would look at your access logs, notice Johnny’s visit, and speak out loud (slowly): “hmmm.. it looks like the Mainstream Media website referred my good pal Johnny to my Alka-Seltzer sales page.” In such a bizarre case, the Mainstream Media website — or specific page — is referred to as (no pun intended) the referrer.
Continue Reading
As discussed in my recent article, Eight Ways to Blacklist with Apache’s mod_rewrite, one method of stopping spammers, scrapers, email harvesters, and malicious bots is to blacklist their associated user agents. Apache enables us to target bad user agents by testing the user-agent string against a predefined blacklist of unwanted visitors. Any bot identifying itself as one of the blacklisted agents is immediately and quietly denied access. While this certainly isn’t the most effective method of securing your site against malicious behavior, it may certainly provide another layer of protection.
Even so, there are several things to consider before choosing to implement an extensive user-agent blacklist on your site. First and most importantly is the transient nature of the user agent itself. On most systems, the user-agent variable is easy to change, making it possible for bot owners to use any user-agent name they wish. Once a bad bot makes the rounds, becomes known, and is blacklisted, the bot owner need only modify or change its declared user agent and they’re back in business. User-agent names are constantly invented, spoofed, or otherwise altered in order to operate beneath — or above — the virtual radar. Thus, a user-agent blacklist is a high-maintenance affair, requiring continuous cultivation in order to maintain relevancy and effectiveness.
Continue Reading
At last! After many months of collecting data, crafting directives, and testing results, I am thrilled to announce the release of the 4G Blacklist! The 4G Blacklist is a next-generation protective firewall that secures your website against a wide range of malicious activity. Like its 3G predecessor, the 4G Blacklist is designed for use on Apache servers and is easily implemented via HTAccess or the httpd.conf configuration file. In order to function properly, the 4G Blacklist requires two specific Apache modules, mod_rewrite and mod_alias. As with the third generation of the blacklist, the 4G Blacklist consists of multiple parts:
Update Feb 22, 2011: The 5G version of the blacklist is available now in beta.
Continue Reading
Last year, after much research and discussion, I built a concise, lightweight security strategy for Apache-powered websites. Prior to the development of this strategy, I relied on several extensive blacklists to protect my sites against malicious user agents and IP addresses. Unfortunately, these mega-lists eventually became unmanageable and ineffective. As increasing numbers of attacks hit my server, I began developing new techniques for defending against external threats. This work soon culminated in the release of a “next-generation” blacklist that works by targeting common elements of decentralized server attacks. Consisting of a mere 37 lines, this “2G” Blacklist provided enough protection to enable me to completely eliminate over 350 blacklisting directives from my site’s root htaccess file. This improvement increased site performance and decreased attack rates, however many bad hits were still getting through. More work was needed..
Continue Reading
With the imminent release of the next series of (4G) blacklist articles here at Perishable Press, now is the perfect time to examine eight of the most commonly employed blacklisting methods achieved with Apache’s incredible rewrite module, mod_rewrite. In addition to facilitating site security, the techniques presented in this article will improve your understanding of the different rewrite methods available with mod_rewrite.
Blacklist via Request Method
This first blacklisting method evaluates the client’s request method. Every time a client attempts to connect to your server, it sends a message indicating the type of connection it wishes to make. There are many different types of request methods recognized by Apache. The two most common methods are GET and POST requests, which are required for “getting” and “posting” data to and from the server. In most cases, these are the only request methods required to operate a dynamic website. Allowing more request methods than are necessary increases your site’s vulnerability. Thus, to restrict the types of request methods available to clients, we use this block of Apache directives:
Continue Reading
Beautify your default directory listings! Displaying index-less file views is a great way to share files, but the drab, bare-bones interface is difficult to integrate into existing designs. While there are many scripts available to customize the appearance and functionality of default directory navigation, most of these methods are either too complicated, too invasive, or otherwise insufficient for expedient directory styling. In this comprehensive tutorial, you will learn how to use the built-in functionality of Apache’s mod_autoindex module to style and enhance your default directory views with a smorgasbord of stylistic and functional improvements.
Continue Reading
Welcome to the Perishable Press “Blacklist Candidate” series. In this post, we continue our new tradition of exposing, humiliating and banishing spammers, crackers and other worthless scumbags..
From time to time on the show, a contestant places a bid that is so absurd and so asinine that you literally laugh out loud, point at the monitor, and openly ridicule the pathetic loser. On such occasions, even the host of the show will laugh and mock the idiocy. Of course, this same situation happens frequently here at Perishable Press, where the scumbags that manage to escape the 3G Blacklist are proving themselves to be increasingly desperate and pathetic. Such is the case with this month’s official Blacklist Candidate Number 2008-10-19:
Come on down — you’re the next cracker whore to get banished from the site!
Continue Reading
One of the most useful techniques in my HTAccess toolbox involves URL redirection using Apache’s RedirectMatch directive. With RedirectMatch, you get the powerful regex pattern matching available in the mod_alias module combined with the simplicity and effectiveness of the Redirect directive. This hybrid functionality makes RedirectMatch the ideal method for highly specific redirection. In this tutorial, we will explore the application of RedirectMatch as it applies to one of the most difficult redirect scenarios: redirecting all requests for a specific subdirectory (or any subordinate directory or file) to the root (or any parent) directory. We will explore how to accomplish this redirect using PHP in a subsequent article.
Continue Reading
Before Summer arrives, I need to post the conclusion to my seasonal article, Perishable Press HTAccess Spring Cleaning, Part 1. As explained in the first post, I recently spent some time to consolidate and optimize the Perishable Press site-root and blog-root HTAccess files. Since the makeover, I have enjoyed better performance, fewer errors, and cleaner code. In this article, I share some of the changes made to the blog-root HTAccess file and provide a brief explanation as to their intended purpose. Granted, most of the blog-root directives affected by the renovation involve redirecting broken/missing URLs, but there are some other gems mixed in as well. In sharing these deprecated excerpts, I hope to inspire others to improve their own HTAccess and/or configuration files. What an excellent way to wrap up this delightful Spring season! :)
Continue Reading
Welcome to the Perishable Press “Blacklist Candidate” series. In this post, we continue our new tradition of exposing, humiliating and banishing spammers, crackers and other worthless scumbags..
Just under the wire! Even so, this month’s official Blacklist-Candidate article may be the last monthly installment of the series. Although additional BC articles may appear in the future, it is unlikely that they will continue as a regular monthly feature. Oh sure, I see the tears streaming down your face, but think about it: this is actually good news. After implementing the 3G Blacklist, finding decent blacklist candidates is becoming increasingly difficult. This is a good thing because it indicates that the new blacklist strategy is working. Nonetheless, every now and then, some brainless bozo will magically appear and poke around where it shouldn’t be poking. Such is the case with this month’s Blacklist Candidate Number 2008-05-31:
Come on down! You’re the next worthless turd to get banished from the site!
Continue Reading
In the now-complete series, Building the 3G Blacklist, I share insights and discoveries concerning website security and protection against malicious attacks. Each article in the series focuses on unique blacklist strategies designed to protect sites transparently, effectively, and efficiently. The five articles culminate in the release of the next generation 3G Blacklist.
For the record, here is a quick summary of the entire Building the 3G Blacklist series:
Continue Reading
While developing the 3G Blacklist, I completely renovated the Perishable Press site-root and blog-root HTAccess files. Since the makeover, I have enjoyed better performance, fewer errors, and cleaner code. In this article, I share some of the changes made to the root HTAccess file and provide a brief explanation as to their intended purpose and potential benefit. In sharing this information, I hope to inspire others to improve their own HTAccess and/or configuration files. In the next article, I will cover some of the changes made to the blog-root HTAccess file. As always, suggestions and questions are always welcome — just drop a comment below! Have fun!! :)
Continue Reading
As you know, HTAccess files are powerful tools for manipulating site performance and functionality. Protecting your site’s HTAccess files is critical to maintaining a secure environment. Fortunately, preventing access to your HTAccess files is very easy. Let’s have a look..
Different Methods
If you search around the Web, you will probably find several different methods of protecting your HTAccess files. Here are a few examples, along with a bit of analysis:
Case-sensitive protection — As far as I know, this is the most widespread method of protecting HTAccess files. Very straightforward, this code will prevent anyone from accessing any file named precisely “.htaccess”. This is not ideal because the match is case sensitive. On certain systems, HTAccess files protected with this method may remain accessible via “HTACCESS”, for example.
Continue Reading
![[ 3G Stormtroopers ]](http://perishablepress.com/press/wp-content/images/2008/blacklist-3g/blacklist-release.gif)
After much research and discussion, I have developed a concise, lightweight security strategy for Apache-powered websites. Prior to the development of this strategy, I relied on several extensive blacklists to protect my sites against malicious user agents and IP addresses. Over time, these mega-lists became unmanageable and ineffective. As increasing numbers of attacks hit my server, I began developing new techniques for defending against external threats. This work soon culminated in the release of a “next-generation” blacklist that works by targeting common elements of decentralized server attacks. Consisting of a mere 37 lines, this “2G” Blacklist provided enough protection to enable me to completely eliminate over 350 blacklisting directives from my site’s root htaccess file. This improvement increased site performance and decreased attack rates, however many bad hits were still getting through. More work was needed..
The 3G Blacklist
Work on the 3G Blacklist required several weeks of research, testing, and analysis. During the development process, five major improvements were discovered, documented, and implemented. Using pattern recognition, access immunization, and multiple layers of protection, the 3G Blacklist serves as an extremely effective security strategy for preventing a vast majority of common exploits. The list consists of four distinct parts, providing multiple layers of protection while synergizing into a comprehensive defense mechanism. Further, as discussed in previous articles, the 3G Blacklist is designed to be as lightweight and flexible as possible, thereby facilitating periodic cultivation and maintenance. Sound good? Here it is:
Continue Reading
![[ 3G Stormtroopers ]](http://perishablepress.com/press/wp-content/images/2008/blacklist-3g/blacklist-series_05.gif)
In this continuing five-article series, I share insights and discoveries concerning website security and protecting against malicious attacks. Wrapping up the series with this article, I provide the final key to our comprehensive blacklist strategy: selectively blocking individual IPs. Previous articles also focus on key blacklist strategies designed to protect your site transparently, effectively, and efficiently. In the next article, these five articles will culminate in the release of the next generation 3G Blacklist.
Improving Site Security by Selectively Blocking Individual IPs
The final component of the 3G Blacklist establishes a vehicle through which individual IPs may be blocked. As previously discussed, the goal of the 3G Blacklist involves moving away from extensive lists of blacklisted user agents, IP addresses, and other identifying attributes. Nonetheless, a comprehensive security strategy benefits from the ability to quickly and easily neutralize threats by directly blocking individual IP addresses.
In my experience, an effective IP blacklist should be kept as current as possible. Unfortunately, once lists grow to over 100 entries, they become quite unmanageable. Periodic verification of blocked IPs becomes increasingly difficult as lists grow in size. Uncultivated blacklists that continue to grow in size will eventually do more harm than good. Beyond the burden placed on server resources, constantly changing IP ownership inevitably results in unintentional blocking of legitimate traffic.
Just as with individual blocking of user agents, blocking individual (or ranges of) IP addresses provides a secondary layer of protection against direct attacks, corrupt networks, and other case-specific situations. This level of granular control enables defensive fine-tuning for targeting and blocking specific machines. Known repeat offenders should remain permanently blacklisted, whereas short-term offenders should be blocked temporarily. Examples of known repeat offenders typically include:
Continue Reading
![[ 3G Stormtroopers ]](http://perishablepress.com/press/wp-content/images/2008/blacklist-3g/blacklist-series_04.gif)
In this continuing five-article series, I share insights and discoveries concerning website security and protecting against malicious attacks. In this fourth article, I build upon previous ideas and techniques by improving the directives contained in the original, 2G Blacklist. Subsequent articles will focus on key blacklist strategies designed to protect your site transparently, effectively, and efficiently. At the conclusion of the series, the five articles will culminate in the release of the next generation 3G Blacklist.
Improving the RedirectMatch Directives of the Original 2G Blacklist
In the first version (2G) of the next-generation blacklist, a collection of malicious attack strings were compiled from access logs, blocked via Apache’s RedirectMatch directive, tested for effectiveness and transparency (i.e., non-interference), and finally converted into the 2G Blacklist. This 2G Blacklist continues to serve as my primary line of defense here at Perishable Press, replacing several other lists, including a million individual IP addresses and even the Ultimate htaccess Blacklist itself. So far, the results have extraordinary.
Of course, as discussed in Building the 3G Blacklist, Part 3, static, “fix-it-and-forget-it”-type strategies ultimately fail when dealing with the perpetually evolving nature of the enemy (i.e., malicious attackers, crackers, and other scumbags). Thus, as stated in the original 2G article, the blacklist is a work in progress, designed for periodic updates as new attack patterns and trends are identified. In the remainder of this article, we consider various aspects of the previous blacklist and consider various improvements. Note that it is not necessary to assemble and transcribe any of the new blacklist directives presented herein. At the conclusion of this series, each of the different strategies will be combined into a single, comprehensive security strategy: The 3G Blacklist!
Continue Reading
![[ 3G Stormtroopers ]](http://perishablepress.com/press/wp-content/images/2008/blacklist-3g/blacklist-series_03.gif)
In this continuing five-article series, I share insights and discoveries concerning website security and protecting against malicious attacks. In this third article, I discuss targeted, user-agent blacklisting and present an alternate approach to preventing site access for the most prevalent and malicious user agents. Subsequent articles will focus on key blacklist strategies designed to protect your site transparently, effectively, and efficiently. At the conclusion of the series, the five articles will culminate in the release of the next generation 3G Blacklist.
Improving Site Security by Selectively Blocking Rogue User Agents
Several months ago, while developing improved methods for protecting websites against malicious attacks, I decided to remove the Ultimate htaccess Blacklist from the Perishable Press domain. In its place, I have been using the 2G Blacklist, individual IP blocking, and an extremely concise selection of blacklisted user-agents. So far, this new, streamlined strategy has proven just as (if not more) effective as the previous “ultimate-blacklist” method.
There are several reasons why using extensive lists of blocked user agents inevitably fails to provide worthwhile protection. First, creating and maintaining a current list of every nasty agent consumes far too much time. Once lists become significantly populated, efficient cultivation becomes increasingly difficult. Once an agent is added to the list, it typically remains there indefinitely, unless the required time is spent to periodically verify and edit each and every entry.
Continue Reading
![[ 3G Stormtroopers ]](http://perishablepress.com/press/wp-content/images/2008/blacklist-3g/blacklist-series_02.gif)
In this continuing five-article series, I share insights and discoveries concerning website security and protecting against malicious attacks. In this second article, I present an incredibly powerful method for eliminating malicious query string exploits. Subsequent articles will focus on key blacklist strategies designed to protect your site transparently, effectively, and efficiently. At the conclusion of the series, the five articles will culminate in the release of the next generation 3G Blacklist.
Improving Site Security by Preventing Malicious Query String Exploits
A vast majority of website attacks involves appending malicious query strings onto legitimate, indexed URLs. Any webmaster serious about site security is well-familiar with the following generalized access-log entries:
Continue Reading
![[ 3G Stormtroopers ]](http://perishablepress.com/press/wp-content/images/2008/blacklist-3g/blacklist-series_01.gif)
In this series of five articles, I share insights and discoveries concerning website security and protecting against malicious attacks. In this first article of the series, I examine the process of identifying attack trends and using them to immunize against future attacks. Subsequent articles will focus on key blacklist strategies designed to protect your site transparently, effectively, and efficiently. At the conclusion of the series, the five articles will culminate in the release of the next generation 3G Blacklist.
Improving Site Security by Recognizing and Exploiting Server Attack Patterns
Crackers, spammers, scrapers, and other attackers are getting smarter and more creative with their methods. It is becoming increasingly difficult to identify elements of an attack that may be effectively manipulated to prevent further attempts. For example, attackers almost exclusively employ decentralized botnets to execute their attacks. Such decentralized networks consist of large numbers of compromised computer systems that are remotely controlled to post data, run scripts, and request resources. Each compromised computer used in an attack is associated with a unique IP, thus making it impossible to prevent future attempts by blocking one or more addresses:
Continue Reading