Articles tagged as “url”

Here is a list of all articles tagged as “url”. If you enjoy the high-quality content that I provide here at Perishable Press, you may want to subscribe to our main content feed to stay current.

Stop 404 Requests for Mobile Versions of Your Site
If you’ve been keeping an eye on your 404 errors recently, you will have noticed an increase in requests for nonexistent mobile files and directories, especially over the past year or so. The scripts and bots requesting these files from your server seem to be looking for a mobile version of your site. Unfortunately, they are wasting bandwidth and resources in the process. It has become common to see the following 404 errors constantly repeated in your log files: http://domain.tld/apple-touch-icon.png http://domain.tld/iphone http://domain.tld/mobile http://domain.tld/mobi http://domain.tld/m So some bot comes along, assumes that your site includes a mobile version, and then tries its hand at guessing the ...
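As a rough illustration of the pattern described above, here is a minimal Python sketch that recognizes these guessed mobile locations. The article itself continues past this excerpt, so the function name, the exact-match rule, and the use of Python are all assumptions made for illustration. Only the five example paths come from the text.

```python
from urllib.parse import urlsplit

# Paths taken from the example 404 requests listed in the excerpt.
# Treating them as exact matches is an assumption of this sketch.
MOBILE_PROBE_PATHS = {"/apple-touch-icon.png", "/iphone", "/mobile", "/mobi", "/m"}


def is_mobile_probe(url: str) -> bool:
    """Return True if the URL's path is one of the guessed mobile locations."""
    # Ignore a trailing slash so "/mobile/" is treated like "/mobile".
    path = urlsplit(url).path.rstrip("/") or "/"
    return path in MOBILE_PROBE_PATHS
```

A request flagged this way could then be answered with a cheap response instead of a full 404 page, which is the general goal the excerpt describes.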
Fixing WordPress Infinite Duplicate Content Issue
Jeff Morris recently demonstrated a potential issue with the way WordPress handles multipaged posts and comments. The issue involves WordPress’ inability to distinguish between multipaged posts and comments that actually exist and those that do not. By redirecting requests for nonexistent numbered pages to the original post, WordPress creates an infinite amount of duplicate content for your site. In this article, we explain the issue, discuss the implications, and provide an easy, working solution. Understanding the “infinite duplicate content” issue Using the <!--nextpage--> tag, WordPress makes it easy to split your post content into multiple pages, and also makes it easy to paginate the display of your comment threads. For both paged posts ...
Protect WordPress Against Malicious URL Requests
A few months ago, many WordPress sites were attacked with some extremely malicious code. While searching for a good solution, I discovered the following gem of a plugin in the pastebin repository: This script checks for excessively long request strings (i.e., greater than 255 characters), as well as the presence of either “eval(” or “base64” in the request URI. These sorts of nefarious requests were implicated in the September 2009 WordPress attacks. To protect your site using this lightweight script, save the code ...
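The plugin code itself is not reproduced in this excerpt, so what follows is only a sketch of the three checks it is described as performing, written in Python rather than as a WordPress plugin. The function name and the case-insensitive matching are assumptions. The 255-character limit and the two substrings come from the description above.

```python
MAX_REQUEST_LENGTH = 255  # length threshold mentioned in the excerpt
SUSPICIOUS_TOKENS = ("eval(", "base64")  # substrings mentioned in the excerpt


def is_malicious_request(request_uri: str) -> bool:
    """Flag request URIs that are too long or contain a suspicious token."""
    if len(request_uri) > MAX_REQUEST_LENGTH:
        return True
    # Case-insensitive matching is an assumption of this sketch.
    lowered = request_uri.lower()
    return any(token in lowered for token in SUSPICIOUS_TOKENS)
```

A caller would typically reject a flagged request before any further processing takes place.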
Remove the WWW Prefix for all URLs via PHP
Canonical URLs are important for maintaining consistent linkage, reducing duplicate content issues, and increasing the overall integrity of your site. In addition to cleaning up trailing slashes and removing extraneous index.php and index.html strings, removing the www subdomain prefix is an excellent way to shorten links and deliver consistent, canonical URLs. Of course, an optimal way of removing (or adding) the www prefix is accomplished via HTAccess canonicalization: # universal www canonicalization via htaccess # remove www prefix for all urls - replace all domain and tld with yours # http://perishablepress.com/press/2008/04/30/universal-www-canonicalization-via-htaccess/ RewriteEngine On RewriteBase / RewriteCond %{HTTP_HOST} !^domain\.tld$ ...
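The PHP version promised by the title is cut off in this excerpt. To show the idea of www removal on its own, here is a minimal Python sketch that computes the canonical form of a URL. The function name is hypothetical, and the sketch only rewrites the string; it does not issue a redirect.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_www(url: str) -> str:
    """Return the URL with a leading "www." removed from its host."""
    parts = urlsplit(url)
    host = parts.netloc
    # Only a literal "www." label at the start of the host is removed.
    # Hosts carrying userinfo ("user@www...") are not handled by this sketch.
    if host.lower().startswith("www."):
        host = host[4:]
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

In a real deployment, a request whose URL differs from its canonical form would normally be answered with a 301 redirect to that form, as the HTAccess rules above do.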
How to Write Valid URL Query String Parameters
When building web pages, it is often necessary to add links that require parameterized query strings. For example, when adding links to the various validation services, you may find yourself linking to an accessibility checker, such as the freely available Cynthia service: WCAG Accessibility Check Another example is seen when linking your feed to a feed validation service: RSS Feed Validation And one final example showing a more complex query string: Bookmark at Delicious As is, however, these links won’t validate due to a number of issues. Let’s fix ‘em up with a few quick-and-easy changes. Replace ampersands with &amp; One of the reasons these links aren’t ...
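The ampersand fix can be sketched in a few lines of Python using the standard library's `html.escape`. The helper name and the sample URL are placeholders for illustration, not links from the article.

```python
from html import escape


def href_attribute(url: str) -> str:
    """Escape a URL so it can be placed inside an HTML attribute value."""
    # escape() turns "&" into "&amp;" and, with quote=True, also escapes
    # double and single quotes, which matters inside attribute values.
    return escape(url, quote=True)
```

For example, `href_attribute("http://domain.tld/check?url=http://example.com/&level=2")` yields the same URL with `&amp;` in place of the bare ampersand, so the attribute value passes validation.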
Redirect any Subordinate URL to its Parent Directory via PHP
Simple one for you today. After posting on how to use HTAccess to redirect subordinate URLs to the root (or parent) directory, I thought I would share an alternate way of accomplishing the same trick using PHP. Fortunately, using this PHP redirect technique doesn’t require access to or fiddling with your site’s HTAccess (or Apache configuration) file and it is very easy to implement. The scene, as discussed in greater detail in my previous article on this topic, involves a very ...
Redirect All Requests for a Nonexistent File to the Actual File
In my previous article on redirecting 404 requests for favicon files, I presented an HTAccess technique for redirecting all requests for nonexistent favicon.ico files to the actual file located in the site’s web-accessible root directory: # REDIRECT FAVICONZ RewriteCond %{THE_REQUEST} favicon.ico [NC] RewriteRule (.*) http://domain.tld/favicon.ico [R=301,L] As discussed in the article, this code is already in effect here at Perishable Press, as may be seen by clicking on any of the following links: http://perishablepress.com/press/favicon.ico http://perishablepress.com/press/2007/06/12/favicon.ico...
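The decision behind this redirect can be sketched in Python as follows. Note the differences from the HTAccess rule quoted above: this sketch matches only paths that end in /favicon.ico, and it deliberately skips the root-level file so that the real favicon is served rather than redirected to itself. The function name is hypothetical, and domain.tld is the placeholder used in the excerpt.

```python
from typing import Optional

CANONICAL_FAVICON = "http://domain.tld/favicon.ico"  # placeholder domain from the excerpt


def favicon_redirect(request_path: str) -> Optional[str]:
    """Return the redirect target for a misplaced favicon request, else None."""
    path = request_path.lower()  # mirrors the [NC] (no-case) flag
    if path.endswith("/favicon.ico") and path != "/favicon.ico":
        return CANONICAL_FAVICON
    return None
```

A server using this helper would send a 301 response whenever a target is returned, and serve the request normally otherwise.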
Stop the Madness: Redirect those Ridiculous Favicon 404 Requests
For the last several months, I have been seeing an increasing number of 404 errors requesting “favicon.ico” appended onto various URLs: http://perishablepress.com/press/favicon.ico http://perishablepress.com/press/2007/06/12/favicon.ico http://perishablepress.com/press/2007/09/25/absolute-horizontal-and-vertical-centering-via-css/favicon.ico http://perishablepress.com/press/2007/08/01/temporary-site-redirect-for-visitors-during-site-updates/favicon.ico http://perishablepress.com/press/2007/01/16/maximum-and-minimum-height-and-width-in-internet-explorer/favicon.ico When these errors first began appearing in the logs several months ago, I didn’t think too much of it — “just another idiot who can’t find my site’s favicon..” As time went on, however, the frequency and variety of these misdirected requests continued to increase. A bit frustrating perhaps, but not serious enough to justify immediate action. After all, what’s the worst that can happen? The idiot might actually find the blasted thing? Wouldn’t that be nice.. But no, the 404 favicon errors just won’t go away. Last week, ...
Unexplained Crawl Behavior Involving Tagged Query Strings
I need your help! I am losing my mind trying to solve another baffling mystery. For the past three or four months, I have been recording many 404 Errors generated from msnbot, Yahoo-Slurp, and other spider crawls. These errors result from invalid requests for URLs containing query strings such as the following: http://perishablepress.com/press/page/2/?tag=spam http://perishablepress.com/press/page/3/?tag=code http://perishablepress.com/press/page/2/?tag=email http://perishablepress.com/press/page/2/?tag=xhtml http://perishablepress.com/press/page/4/?tag=notes http://perishablepress.com/press/page/2/?tag=flash http://perishablepress.com/press/page/2/?tag=links http://perishablepress.com/press/page/3/?tag=theme http://perishablepress.com/press/page/2/?tag=press ..plus hundreds and hundreds more 1. The URL pattern is always the same: a different page number followed by a query string containing one of the tags used here at ...
Universal www-Canonicalization via htaccess
During my previous rendezvous involving comprehensive canonicalization for WordPress, I offered my personally customized technique for ensuring consistently precise and accurate URL delivery. That particular method targets WordPress exclusively (although the logic could be manipulated for general use), and requires a bit of editing to adapt the code to each particular configuration. In this follow-up tutorial, I present a basic www-canonicalization technique that accomplishes the following: requires or removes the www prefix for all URLs; absolutely no editing when requiring the www prefix; minimal amount of editing when removing the www prefix; minimal ...
What is My WordPress Feed URL?
For future reference, this article covers each of the many ways to access your WordPress-generated feeds. Several different URL formats are available for the various types of WordPress feeds — posts, comments, and categories — for both permalink and default URL structures. For each example, replace “http://domain.tld/” with the URL of your blog. Note: even though your blog’s main feed is accessible through many different URLs, there are clear benefits to using a single, consistent feed URL throughout your site. WordPress Post-Feed Formats ...
Permalink Evolution: Customize and Optimize Your Dated WordPress Permalinks
How to streamline and maximize the effectiveness of your WordPress URLs by using htaccess to remove extraneous post-date information: years, months, and days.. Recently, there has been much discussion about whether or not to remove the post-date information from WordPress permalinks. Way back during the WordPress 1.2/1.5 days, URL post-date inclusion had become very popular, in part due to reports of potential conflicts with post-name-only permalinks. Throw in the ...
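The htaccess rules themselves are not part of this excerpt. As a rough sketch of the underlying mapping, the Python below removes a year/month/day prefix from a permalink path. The assumed /YYYY/MM/DD/post-name/ layout, the pattern, and the function name are all illustrative guesses rather than the article's actual technique.

```python
import re

# Assumed dated layout: /YYYY/MM/DD/post-name/ (an assumption of this sketch).
DATED_PERMALINK = re.compile(r"^/(\d{4})/(\d{2})/(\d{2})/([^/]+)/?$")


def strip_post_date(path: str) -> str:
    """Map a dated permalink path to a post-name-only path."""
    match = DATED_PERMALINK.match(path)
    if match is None:
        return path  # not a dated permalink; leave it untouched
    return f"/{match.group(4)}/"
```

Old dated URLs would then be redirected permanently to the shorter form so that existing links keep working.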
Comprehensive URL Canonicalization via htaccess for WordPress-Powered Sites
Permalink URL canonicalization is automated via PHP in WordPress 2.3+; however, for those of us running sites on pre-2.3 versions or preferring to deal with rewrites directly via Apache, comprehensive WordPress URL canonicalization via htaccess may seem impossible. While there are several common methods that are partially effective, there has not yet been a complete, user-friendly solution designed specifically for WordPress. Until now.. In this article, I share my “secret” htaccess URL canonicalization formula. I originally developed this method in July ...
WordPress Lessons Learned, Part 1: Permalink Structure
While planning my current site renovation project, I considered changing the format of my permalinks. Reasons for modifying the permalink structure of a site include: Optimizing URLs for the search engines Simplifying URL structure for improved readability Removing the implication that your site content is somehow organized chronologically Removing other unwanted organizational implications (e.g., categorically, topically, etc.) Like many people who configured WordPress permalinks a couple of years ago, I chose to include the day, month, and year along with the blog URL and post title. For over two years now, Perishable Press has employed the following ...
Comprehensive Reference for WordPress NoNofollow/Dofollow Plugins
Recently, while deliberating an optimal method for eliminating nofollow link attributes from Perishable Press, I collected, installed, tested and reviewed every WordPress no-nofollow/dofollow plugin that I could find. As of the writing of this post, I have evaluated 15 dofollow plugins, all of which are freely available on the Internet. In this article, I present a concise, current, and comprehensive reference for WordPress no-nofollow and dofollow plugins. Every attempt has been made to provide accurate, useful, and complete information for each of the plugins represented below. Further, as this subject is a newfound interest of mine, it is my intention ...
Another Mystery Solved..
Recently, after researching comment links for an upcoming article, I realized that my default values were being submitted as the URL for all comments left without associated website information. During the most recent site redesign, I made the mistake of doing this in comments.php: ... ... Notice the value="[website]" attribute? It seemed like a good idea at the time — I even threw in a nice onfocus auto-highlighting snippet for good measure. I ran the form with this in place for around eight weeks before finally noticing multiple comments using this for their site URL: http://website Hmmm. Not ...
Harvesting cPanel Raw Access Logs
Harvesting Raw Logs For those of us using cPanel as the control panel for our websites, a wealth of information is readily available via cPanel ‘Raw Access Logs’. These logs are perpetually updated with data involving user agents, IP addresses, HTTP activity, resource access, and a whole lot more. Here is a quick tutorial on accessing and interpreting your cPanel raw access logs. Part One: Grab ‘em To grab a copy of your raw access logs, log into cPanel and click on the "Raw Access Logs" icon. Within the Raw Access Log interface, scroll through the list ...
URL Character Codes
URLs frequently employ potentially conflicting characters such as question marks, ampersands, and pound signs. Fortunately, it is possible to encode such characters via their escaped hexadecimal ASCII representations. For example, we would write "?" as "%3F". Here are a few more URL character codes (case-insensitive):
<     %3C
>     %3E
#     %23
%     %25
{     %7B
}     %7D
|     %7C
\     %5C
^     %5E
~     %7E
[     %5B
]     %5D
`     %60
;     %3B
/     %2F
?     %3F
:     %3A
@     %40
=     %3D
&     %26
$     %24
+     %2B
"     %22
space     %20
References: network-tools.com; URL Encoding
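Codes like these rarely need to be looked up by hand. As a small sketch, Python's standard `urllib.parse` module can encode and decode them. The wrapper names below are hypothetical. Passing `safe=""` tells `quote` to encode reserved characters such as "/" as well.

```python
from urllib.parse import quote, unquote


def encode_component(text: str) -> str:
    """Percent-encode every character that is not unreserved."""
    # safe="" means "/" is encoded too, which suits single URL components.
    return quote(text, safe="")


def decode_component(text: str) -> str:
    """Reverse percent-encoding; hex digits are accepted in either case."""
    return unquote(text)
```

One caveat: `quote` leaves unreserved characters such as letters, digits, "-", ".", "_", and "~" as they are, so "~" is not turned into "%7E" even though that code is valid.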
