There are several ways to instruct Google to stay away from various pages on your site: robots.txt directives, nofollow attributes on links, meta noindex/nofollow directives, X-Robots noindex/nofollow directives, and so on. These directives all function in different ways, but they all serve the same basic purpose: control how Google crawls the various pages on your site. For example, you can use meta noindex to instruct Google not to index your sitemap, RSS feed, or any other page you wish. This […] Continue reading »
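As a rough illustration of the first of those methods, a robots.txt file asks compliant crawlers not to request certain paths. This is only a sketch; the paths below are placeholders, not directives from the article:

# Hypothetical robots.txt sketch: ask well-behaved crawlers to skip a few paths
User-agent: *
Disallow: /feed/
Disallow: /example-private-page/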
In my recent guest post at The Nexus, I discuss Google’s new nofollow policy and suggest several ways to deal with it. In that article, I explain how Google allegedly has changed the way it deals with nofollow links. Instead of transferring leftover nofollow juice to remaining dofollow links as they always have, Google now pours all that wonderful nofollow juice right down the drain. This shift in policy comes as a terrible surprise to many webmasters and SEO gurus, […] Continue reading »
Just like last year, this Spring I have been taking some time to do some general maintenance here at Perishable Press. This includes everything from fixing broken links and resolving errors to optimizing scripts and eliminating unnecessary plugins. I’ll admit, this type of work is often quite dull; however, I always enjoy the process of cleaning up my HTAccess files. In this post, I share some of the changes made to my HTAccess files and explain the reasoning behind each […] Continue reading »
One way to prevent Google from crawling certain pages is to use <meta /> elements in the <head></head> section of your web documents. For example, if I want to prevent Google from indexing and archiving a certain page, I would add the following code to the head of my document: Continue reading »
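The markup referred to here would be something along these lines; this is a minimal sketch, and the exact directive values depend on what you want to prevent:

<!-- Hypothetical example: ask Google not to index or archive this page -->
<head>
	<meta name="googlebot" content="noindex,noarchive" />
	...
</head>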
Given my propensity to discuss matters involving error log data (e.g., monitoring malicious behavior, setting up error logs, and creating extensive blacklists), I am often asked about the best way to go about monitoring 404 and other types of server errors. While I consider myself to be a novice in this arena (there are far brighter people with much greater experience), I do spend a lot of time digging through log entries and analyzing data. So, when asked recently about […] Continue reading »
You have seen user-agent blacklists, IP blacklists, 4G Blacklists, and everything in between. Now, in this article, for your sheer and utter amusement, I present a collection of over 8000 blacklisted referrers. Shortcut: skip the article and jump to Disclaimer and Download » Referrer Spam Sucks For the uninitiated, in teh language of teh Web, a referrer is the online resource from whence a visitor happened to arrive at your site. For example, if Johnny the Wonder Parrot was visiting the […] Continue reading »
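For readers unfamiliar with how a referrer blacklist is typically applied, here is a minimal Apache/HTAccess sketch using mod_rewrite; the domains shown are placeholders, not entries from the actual list:

# Hypothetical sketch: deny requests arriving via known spam referrers
RewriteEngine On
RewriteCond %{HTTP_REFERER} spam-domain\.example [NC,OR]
RewriteCond %{HTTP_REFERER} another-bad-referrer\.example [NC]
RewriteRule .* - [F,L]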
As discussed in my recent article, Eight Ways to Blacklist with Apache’s mod_rewrite, one method of stopping spammers, scrapers, email harvesters, and malicious bots is to blacklist their associated user agents. Apache enables us to target bad user agents by testing the user-agent string against a predefined blacklist of unwanted visitors. Any bot identifying itself as one of the blacklisted agents is immediately and quietly denied access. While this certainly isn’t the most effective method of securing your site against […] Continue reading »
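The general shape of a user-agent blacklist looks something like the following sketch; the agent names are illustrative stand-ins rather than the article's actual blacklist:

# Hypothetical sketch: block requests whose user-agent string matches a blacklisted agent
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (EmailSiphon|BadBot|EvilScraper) [NC]
RewriteRule .* - [F,L]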
At last! After many months of collecting data, crafting directives, and testing results, I am thrilled to announce the release of the 4G Blacklist! The 4G Blacklist is a next-generation protective firewall that secures your site against a wide range of automated attacks and other malicious activity. Continue reading »
I really hate bad robots. When a web crawler, spider, bot — or whatever you want to call it — behaves in a way that is contrary to expected and/or accepted protocols, we say that the bot is acting suspiciously, behaving badly, or just acting stupid in general. Unfortunately, there are thousands — if not hundreds of thousands — of nefarious bots violating our sites every minute of the day. For the most part, there are effective methods available enabling […] Continue reading »
Last year, after much research and discussion, I built a concise, lightweight security strategy for Apache-powered websites. Prior to the development of this strategy, I relied on several extensive blacklists to protect my sites against malicious user agents and IP addresses. Unfortunately, these mega-lists eventually became unmanageable and ineffective. As increasing numbers of attacks hit my server, I began developing new techniques for defending against external threats. This work soon culminated in the release of a “next-generation” blacklist that works […] Continue reading »
I have written previously on the fine art of preloading images without JavaScript using only CSS. These caching techniques have evolved in terms of effectiveness and accuracy, but may be improved further to allow for greater cross-browser functionality. In this post, I share a “CSS-only” preloading method that works better under a broader set of conditions. Previous image-preloading techniques target all browsers, devices, and media types. Unfortunately, certain browsers do not load images that are hidden directly (via the <img […] Continue reading »
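One common variation of the idea, offered here only as a sketch and not necessarily the article's final method, preloads images as backgrounds positioned off-screen rather than hiding them with display:none (which some browsers optimize away); the image paths are placeholders:

/* Hypothetical CSS-only preload: load images as off-screen backgrounds */
#preload-01 { background: url(/images/image-01.png) no-repeat -9999px -9999px; }
#preload-02 { background: url(/images/image-02.png) no-repeat -9999px -9999px; }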
Working a great deal with blacklists, I am frequently trying to isolate and identify problematic code. For example, a blacklist implementation may suddenly prevent a certain type of page from loading. In order to resolve the issue, the blacklist is immediately removed and tested for the offending directive(s). This situation is common to other coding languages as well, especially when dealing with CSS. Identifying problem code is more of an art form than a science, but fortunately, there are a […] Continue reading »
In my previous article on WordPress title tags, How to Generate Perfect WordPress Title Tags without a Plugin, we explore everything needed to create perfect titles for your WordPress-powered site. After discussing the functionality and implementation of various code examples, the article concludes with a “perfect” title-tag script that covers all the bases. Or so I thought… Some time after the article had been posted, Mat8iou chimed in with a couple of ways to improve the script by cleaning up […] Continue reading »
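For context, a plugin-free title tag generally follows this pattern; this is a generic sketch of the idea, not the article's actual script or Mat8iou's improvements:

<title><?php
// Generic sketch of a plugin-free WordPress title tag (illustrative only)
if ( is_home() ) {
	bloginfo( 'name' );
	echo ' - ';
	bloginfo( 'description' );
} elseif ( is_single() || is_page() ) {
	single_post_title();
	echo ' - ';
	bloginfo( 'name' );
} else {
	wp_title( '-', true, 'right' ); // prints "Page Title - "
	bloginfo( 'name' );
}
?></title>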
Ever wanted to provide automatic language translations of your web pages without installing another plugin? Here is a valid, SEO-friendly technique that takes advantage of Google’s free translation service. All you need is a PHP-enabled server and you’re good to go. Just copy and paste the following code into the desired location in your page template and enjoy the results. Once in place, this code will produce translation links for eight common languages for every page on your site. Grab, […] Continue reading »
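The approach amounts to printing links that hand the current page to Google's translation service. Here is a rough sketch of that idea; the URL format, language selection, and variable names are assumptions, not the article's exact code:

<?php
// Hypothetical sketch: print Google Translate links for the current page
$url = urlencode( 'http://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'] );
$languages = array(
	'de' => 'German',     'es' => 'Spanish',  'fr' => 'French',  'it' => 'Italian',
	'pt' => 'Portuguese', 'ja' => 'Japanese', 'ko' => 'Korean',  'zh-CN' => 'Chinese',
);
foreach ( $languages as $code => $name ) {
	echo '<a href="https://translate.google.com/translate?sl=en&amp;tl='
		. $code . '&amp;u=' . $url . '">' . $name . '</a> ';
}
?>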
I recently added OpenSearch functionality to Perishable Press. Now, OpenSearch-enabled browsers such as Firefox and IE 7 alert users with the option to customize their browser’s built-in search feature with an exclusive OpenSearch-powered search option for Perishable Press. The autodiscovery feature of supportive browsers detects the custom search protocol and enables users to easily add it to their collection of readily available site-specific search options. Now, users may search the entire Perishable Press domain with the click of a button. […] Continue reading »
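For readers curious how the autodiscovery part works, here is a rough sketch assuming a WordPress-style search URL and a description file named opensearch.xml (both assumptions, not details from the article):

<!-- Hypothetical autodiscovery link placed in the page head -->
<link rel="search" type="application/opensearchdescription+xml" title="Perishable Press" href="/opensearch.xml" />

<!-- Hypothetical opensearch.xml: minimal OpenSearch description -->
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
	<ShortName>Perishable Press</ShortName>
	<Description>Search the Perishable Press domain</Description>
	<Url type="text/html" template="https://perishablepress.com/?s={searchTerms}"/>
</OpenSearchDescription>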