Normally, when visitors post a comment to your site, specific types of client data are associated with the request. Commonly, a client will provide a user agent, a referrer, and a host header. When any of these variables is absent, there is good reason to suspect foul play. For example, virtually all browsers provide some sort of user-agent name to identify themselves. Conversely, malicious scripts directly posting spam and other payloads to your site frequently operate without specifying a user […] Continue reading »
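To make the idea concrete, here is a minimal HTAccess sketch (my own illustration, not the code from the full article) that denies any POST request arriving with a blank user agent or blank referrer; it assumes Apache with mod_rewrite enabled:

# deny POST requests that supply no user agent or no referrer
RewriteEngine On
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
RewriteCond %{HTTP_REFERER} ^$
RewriteRule .* - [F]

Legitimate browsers virtually always send both headers with form submissions, so a rule like this is aimed squarely at direct-posting scripts.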
When designing sites, it is often useful to identify different pages by adding an ID attribute to the <body></body> element. Commonly, the name of the page is used as the attribute value, for example: <body id="about"></body> In this case, “about” would be the body ID for the “About” page, which would be named something like “about.php”. Likewise, other pages would have unique IDs as well, for example: <body id="archive"></body> <body id="contact"></body> <body id="subscribe"></body> <body id="portfolio"></body> …again, with each ID associated […] Continue reading »
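Rather than typing the ID into every template, it can be derived from the filename. A quick PHP sketch (the derivation is my own suggestion, not from the article):

<?php // derive the body ID from the current filename: "about.php" becomes "about" ?>
<body id="<?php echo basename($_SERVER['SCRIPT_NAME'], '.php'); ?>">

This keeps the ID and the filename automatically in sync as pages are added.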
The other day, my server crashed and Perishable Press was unable to connect to the MySQL database. Normally, when WordPress encounters a database error, it delivers a specific error message similar to the following: Continue reading »
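Related to this, WordPress will serve a custom page instead of its stock message if a file exists at wp-content/db-error.php. A minimal sketch (the wording and 503 status are my own choices, not from the article):

<?php // wp-content/db-error.php: shown whenever WordPress cannot reach the database
header('HTTP/1.1 503 Service Unavailable');
header('Retry-After: 600'); // hint that clients should retry in ten minutes
?>
<h1>Temporarily unavailable</h1>
<p>The database is unreachable at the moment. Please try again shortly.</p>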
Given my propensity to discuss matters involving error log data (e.g., monitoring malicious behavior, setting up error logs, and creating extensive blacklists), I am often asked about the best way to go about monitoring 404 and other types of server errors. While I consider myself to be a novice in this arena (there are far brighter people with much greater experience), I do spend a lot of time digging through log entries and analyzing data. So, when asked recently about […] Continue reading »
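One lightweight way to start collecting this data (a rough sketch of the general idea, not the method from the article) is to route 404s through a small PHP logger via ErrorDocument 404 /404.php in HTAccess:

<?php // 404.php: append the missing URL, referrer, and client address to a log
$entry = sprintf("%s\t%s\t%s\t%s\n",
    date('c'),
    $_SERVER['REQUEST_URI'],
    isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '-',
    $_SERVER['REMOTE_ADDR']);
file_put_contents(dirname(__FILE__) . '/404.log', $entry, FILE_APPEND | LOCK_EX);
header('HTTP/1.1 404 Not Found');
?>
<h1>Page not found</h1>

Reviewing the resulting log periodically reveals broken links, probing scripts, and misbehaving bots.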
Importing and displaying external RSS feeds on your site is a great way to share your online activity with your visitors. If you are active on Flickr, Delicious, Twitter, or Tumblr, your visitors will enjoy staying current with your updates. Many social media sites provide exclusive feeds for user-generated content that may be imported and displayed on virtually any web page. In this article, you will learn three ways to import and display feed content on your WordPress-powered website — […] Continue reading »
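As a taste of what's ahead, one common approach in current WordPress is the built-in fetch_feed() function (the feed URL and item count below are placeholders):

<?php
// fetch and display the five latest items from an external feed
$feed = fetch_feed('http://example.com/feed/');
if (!is_wp_error($feed)) {
    foreach ($feed->get_items(0, 5) as $item) {
        printf('<a href="%s">%s</a><br />',
            esc_url($item->get_permalink()),
            esc_html($item->get_title()));
    }
}
?>

fetch_feed() caches results via the bundled SimplePie library, so repeated page loads don't hammer the remote server.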
You have seen user-agent blacklists, IP blacklists, 4G Blacklists, and everything in between. Now, in this article, for your sheer and utter amusement, I present a collection of over 8000 blacklisted referrers. Shortcut: skip the article and jump to Disclaimer and Download » Referrer Spam Sucks For the uninitiated, in the language of the Web, a referrer is the online resource from whence a visitor happened to arrive at your site. For example, if Johnny the Wonder Parrot was visiting the […] Continue reading »
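Structurally, each of the 8000+ entries reduces to something like the following (the referrer patterns here are illustrative placeholders, not entries from the actual list):

# flag requests whose referrer matches a blacklisted pattern, then deny them
SetEnvIfNoCase Referer casino-payouts\.example spam_ref
SetEnvIfNoCase Referer cheap-pills\.example spam_ref
Order Allow,Deny
Allow from all
Deny from env=spam_ref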
In addition to your choice collection of “Share This” links, you may also want to provide visitors with a link that enables them to quickly and easily send the URL permalink of any post to their friends via email. This is a great way to increase your readership and further your influence. Just copy & paste the following code into the desired location in your page template: <a href="mailto:?subject=Fresh%20Linkage%20@%20Perishable%20Press&amp;body=Check%20out%20<?php the_permalink(); ?>%20from%20Perishable%20Press" title="Send a link to this post via email" rel="nofollow">Share […] Continue reading »
As discussed in my recent article, Eight Ways to Blacklist with Apache’s mod_rewrite, one method of stopping spammers, scrapers, email harvesters, and malicious bots is to blacklist their associated user agents. Apache enables us to target bad user agents by testing the user-agent string against a predefined blacklist of unwanted visitors. Any bot identifying itself as one of the blacklisted agents is immediately and quietly denied access. While this certainly isn’t the most effective method of securing your site against […] Continue reading »
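In skeletal form, the technique looks like this (the agent names are illustrative stand-ins, not entries from the actual blacklist):

RewriteEngine On
# return 403 Forbidden to any client whose user-agent string matches the pattern
RewriteCond %{HTTP_USER_AGENT} (EmailCollector|HTTrack|libwww-perl) [NC]
RewriteRule .* - [F]

As the excerpt notes, this only stops bots honest enough to identify themselves; anything spoofing a browser user agent passes straight through.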
At last! After many months of collecting data, crafting directives, and testing results, I am thrilled to announce the release of the 4G Blacklist! The 4G Blacklist is a next-generation protective firewall that secures your site against a wide range of automated attacks and other malicious activity. Continue reading »
I really hate bad robots. When a web crawler, spider, bot — or whatever you want to call it — behaves in a way that is contrary to expected and/or accepted protocols, we say that the bot is acting suspiciously, behaving badly, or just acting stupid in general. Unfortunately, there are thousands — if not hundreds of thousands — of nefarious bots violating our sites every minute of the day. For the most part, there are effective methods available enabling […] Continue reading »
Last year, after much research and discussion, I built a concise, lightweight security strategy for Apache-powered websites. Prior to the development of this strategy, I relied on several extensive blacklists to protect my sites against malicious user agents and IP addresses. Unfortunately, these mega-lists eventually became unmanageable and ineffective. As increasing numbers of attacks hit my server, I began developing new techniques for defending against external threats. This work soon culminated in the release of a “next-generation” blacklist that works […] Continue reading »
In my recent article on blocking proxy servers, I explain how to use HTAccess to deny site access to a wide range of proxy servers. The method works great, but some readers want to know how to allow access for specific proxy servers while denying access to as many other proxies as possible. Fortunately, the solution is as simple as adding a few lines to my original proxy-blocking method. Specifically, we may allow any requests coming from our whitelist of […] Continue reading »
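In outline, the addition looks something like this (the whitelisted address is a placeholder from the documentation range, and the proxy-header checks are abbreviated from the original method):

RewriteEngine On
# let the trusted proxy through before the proxy-blocking conditions run
RewriteCond %{REMOTE_ADDR} !^203\.0\.113\.50$
# deny anything else that announces itself as a proxy
RewriteCond %{HTTP:Via} !^$ [OR]
RewriteCond %{HTTP:X-Forwarded-For} !^$
RewriteRule .* - [F]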
Canonical URLs are important for maintaining consistent linkage, reducing duplicate content issues, and increasing the overall integrity of your site. In addition to cleaning up trailing slashes and removing extraneous index.php and index.html strings, removing the www subdomain prefix is an excellent way to shorten links and deliver consistent, canonical URLs. Of course, an optimal way of removing (or adding) the www prefix is accomplished via HTAccess canonicalization: Continue reading »
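For reference, the standard www-removal pattern looks like this (a generic sketch assuming Apache with mod_rewrite; it works for any domain):

RewriteEngine On
# 301-redirect www requests to the bare domain, preserving the request path
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]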
With the imminent release of the next series of (4G) blacklist articles here at Perishable Press, now is the perfect time to examine eight of the most commonly employed blacklisting methods achieved with Apache’s incredible rewrite module, mod_rewrite. In addition to facilitating site security, the techniques presented in this article will improve your understanding of the different rewrite methods available with Apache mod_rewrite. Note: I changed the title of this post from “Eight Ways to Blacklist..” to “Eight Ways to […] Continue reading »
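As a preview of just one of the eight methods (the patterns below are illustrative, not from the article), mod_rewrite can also blacklist by query string, catching common injection attempts regardless of the client's user agent:

# block requests whose query string contains common attack signatures
RewriteCond %{QUERY_STRING} (base64_encode|eval\(|<script) [NC]
RewriteRule .* - [F]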
If, for whatever reason, you don’t want to use Feedburner to track your feed statistics, this article describes a relatively simple, “roll-your-own” alternative. Instead of redirecting your feed traffic through Feedburner, keep your original feed URLs and place the following code into a file named “feed_stats.php” (or whatever) and upload to your server: Continue reading »
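The actual code appears in the full post; as a rough sketch of the idea (log path and feed location are placeholders), feed_stats.php might record each request and then hand off to the real feed:

<?php // feed_stats.php: log who fetched the feed, then pass the request along
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '-';
$entry = date('Y-m-d H:i:s') . "\t" . $_SERVER['REMOTE_ADDR'] . "\t" . $ua . "\n";
file_put_contents(dirname(__FILE__) . '/feed_stats.log', $entry, FILE_APPEND | LOCK_EX);
// redirect to the feed itself; the target must not route back here, or it will loop
header('Location: http://example.com/?feed=rss2', true, 302);
exit;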