Controlling the spidering, indexing, and caching of your (X)HTML-based web pages is possible with meta robots directives such as these: <meta name="googlebot" content="index,archive,follow,noodp"/>, <meta name="robots" content="all,index,follow"/>, and <meta name="msnbot" content="all,index,follow"/>. I use these directives here at Perishable Press, and they continue to serve me well for controlling how the “big bots” crawl and represent my (X)HTML-based content in search results. For other, non-(X)HTML types of content, however, using meta robots directives to control indexing and caching is not an option. An […] Continue reading »
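For reference, here are those three directives as they would sit in a page’s <head> section (a minimal sketch; the attribute values are taken verbatim from the excerpt above):

    <!-- In the <head>: per-bot and catch-all robots directives.
         "archive" is meant to permit a cached copy of the page;
         "noodp" tells Googlebot not to substitute Open Directory
         Project descriptions in search results. -->
    <meta name="googlebot" content="index,archive,follow,noodp" />
    <meta name="robots" content="all,index,follow" />
    <meta name="msnbot" content="all,index,follow" />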
During the most recent Perishable Press redesign, I noticed that several of my WordPress admin pages had been assigned significant levels of PageRank. Not good. After some investigation, I realized that my ancient robots.txt rules were insufficient to prevent Google from indexing various WordPress admin pages. Specifically, the following pages have been indexed and subsequently assigned PageRank: Continue reading »
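The corrected rules aren’t shown in the excerpt, but a minimal robots.txt sketch along these lines would keep compliant crawlers out of the standard WordPress admin paths (the paths here assume a default WordPress install):

    # Block compliant crawlers from the default WordPress admin areas
    User-agent: *
    Disallow: /wp-admin/
    Disallow: /wp-login.php

Keep in mind that Disallow only stops well-behaved robots from fetching these URLs; pages that were already crawled, or that are linked from elsewhere, can still linger in the index, which is one way admin pages end up with PageRank in the first place.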
Time is running out! The next Google PageRank (PR) update will soon be upon us. While it is difficult to predict how your site will perform overall, it seems likely that your highest-ranking pages will continue to rank well. The idea behind this article is to improve your site’s overall PageRank by totally beefing up your most popular pages. Of course, every page on your site is important. Ideally, you would want to apply these techniques to every […] Continue reading »
After studying Peter Kent’s excellent book, Search Engine Optimization for Dummies, I distilled several key methods for optimizing websites for the search engines. Although the book is written for people who are new to the world of search engine optimization (SEO), many of the principles presented throughout the book remain important, fundamental practices even for the most advanced SEO wizards. This article divulges these very useful SEO practices and organizes them into manageable chunks. Continue reading »
In his excellent book, Search Engine Optimization for Dummies, Peter Kent explains that many search engines actually get their search results from one (or more) of the larger search engines, such as Google or The Open Directory Project. The author therefore concludes that it may not be necessary to spend endless hours registering with thousands of smaller search sites. Rather, he provides a brief list of absolutely essential search sites with which registration is highly recommended. […] Continue reading »
Optimizing your website for the search engines involves many important aspects including keyword development, search engine registration, and SEO logging. This Perishable Press tutorial scopes yet another critical weapon in the SEO wars: establishing and evolving an effective link campaign. We will begin our article by focusing on incoming and outgoing link strategies, proceed with a few tips for internal links, and then conclude with some ideas for getting links. Continue reading »
About the Robots Exclusion Standard: The robots exclusion standard, or robots.txt protocol, is a convention to prevent cooperating web spiders and other web robots from accessing all or part of a website. The information indicating which parts should not be accessed is specified in a file called robots.txt in the top-level directory of the website. Notes on the robots.txt Rules: Rules of specificity apply, not inheritance. Always include a blank line between rules. Note also that not all robots […] Continue reading »
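Those two notes are easier to see in an example. Here is a minimal robots.txt sketch (example.com and the paths are placeholders) showing blank-line-separated records and specificity at work: a bot matching a named User-agent record obeys only that record and does not inherit rules from the wildcard record.

    # Must live at the site root: https://example.com/robots.txt
    User-agent: *
    Disallow: /private/

    # Blank line above separates the two records.
    # Googlebot matches this record, so it ignores the wildcard record
    # entirely; an empty Disallow means nothing is off-limits to it.
    User-agent: googlebot
    Disallow: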