Perishable Press

Stop WordPress from Leaking PageRank to Admin Pages

During the most recent Perishable Press redesign, I noticed that several of my WordPress admin pages had been assigned significant levels of PageRank. Not good. After some investigation, I realized that my ancient robots.txt rules were insufficient in preventing Google from indexing various WordPress admin pages. Specifically, the following pages have been indexed and subsequently assigned PageRank:

  • WP Admin Login Page
  • WP Lost Password Page
  • WP Registration Page
  • WP Admin Dashboard

Needless to say, it is important to stop WordPress from leaking PageRank to admin pages. Instead of wasting our hard-earned link equity on non-ranking pages, let’s redirect it to more important pages and posts. To accomplish this, we will attack the problem on three different fronts: admin links, robots.txt rules, and meta tags. Let’s take a quick look at each of these three methods.

1. Eliminate sitewide links to your admin pages

The simplest fix is to bookmark your login page and eliminate all links to your admin pages. Unless you are actively encouraging people to register with your blog, linking to the login/registration/password pages is pointless. If you must link to your admin pages, consolidate the links into one location by using is_home() or a similar conditional tag. The goal here is to eliminate sitewide links to your admin pages. Not only will this help stop the PR leakage, it will simplify your site as well. To add login and register links to your home page only, customize and insert the following code into your sidebar or other target location:

<?php if (is_home()) { ?>
	<ul>
		<?php wp_register(); ?>
		<li><?php wp_loginout(); ?></li>
	</ul>
<?php } ?>

2. Disallow search engines from your admin pages via robots.txt rules

Disallowing search engines from crawling areas that do not need indexing is a great way to conserve link equity and redistribute it to more critical parts of your site. Although Google seems to obey robots.txt rules, I remain unconvinced that every other search engine follows suit. Nonetheless, we are focusing on Google here, and if the other major engines play along, then more power to us. Just keep in mind that robots.txt is a voluntary convention, not an enforcement mechanism: adding rules is a good idea, but far from a complete solution. To formally disallow Google and all other (obedient) search engines from accessing any behind-the-scenes admin pages, add these rules to your site’s robots.txt file:

User-agent: *
Disallow: /wp-admin/
Disallow: /wp-login.php
Disallow: /wp-register.php
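If you would rather not maintain a physical robots.txt file, newer versions of WordPress (3.0 and up) serve a virtual one, and the rules above can be appended through the robots_txt filter. This is just a sketch, assuming the virtual file is active (i.e., no physical robots.txt exists at the site root); the function name is my own:

```php
<?php
// Append admin disallow rules to WordPress's virtual robots.txt.
// Note: the robots_txt filter only runs when no physical
// robots.txt file exists in the site root.
function pp_admin_robots_rules( $output, $public ) {
	// Only add rules when the site is visible to search engines.
	if ( '1' === $public ) {
		$output .= "Disallow: /wp-admin/\n";
		$output .= "Disallow: /wp-login.php\n";
		$output .= "Disallow: /wp-register.php\n";
	}
	return $output;
}
add_filter( 'robots_txt', 'pp_admin_robots_rules', 10, 2 );
```

Drop this into a plugin or your theme’s functions.php and the rules survive upgrades, unlike hand-edited files.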

3. Add noindex, nofollow meta tags to your admin pages

Perhaps the best way to prevent search engines from crawling your admin pages is to explicitly mark them with noindex, nofollow meta robots tags. In my experience, while only a few search engines obey rules specified via robots.txt, all of the four major search engines (Ask, Google, Live, and Yahoo!) seem to obey rules specified via meta tags. Further, while robots.txt rules disallow crawling in general, meta rules allow differentiation between crawling, indexing, archiving, caching, and much more. For our admin pages, we want to forbid all search engine activity — no crawling or indexing allowed. Implementing such meta tags throughout your admin area involves three different WordPress files and requires six insertions of the following code:

<meta name="googlebot" content="noindex,noarchive,nofollow" />
<meta name="msnbot" content="noindex,nofollow" />
<meta name="robots" content="noindex,nofollow" />

The previous set of meta tags explicitly forbids Google, MSN, and all other search engines from crawling and indexing the page. We are going to copy & paste these tags into each of the head elements located in the following files:

  • /wp-admin/admin-header.php (1x)
  • /wp-login.php (2x)
  • /wp-register.php (3x)

In each of the previous files, locate each instance of “<head>”. Copy & paste the three meta elements (provided above) somewhere within each of the head elements. For example, place the tags immediately after the <title> element. Upon completion, you will have added a total of six (6) sets of meta tags, according to the numbers specified in the list of files presented above. After uploading your files, check their source code in a browser to verify the tags have been added correctly.
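One caveat: edits to core files like wp-login.php are overwritten by every WordPress upgrade. As an alternative, the login_head action (which fires inside the head element of wp-login.php) lets you print the same meta tags from a plugin or your theme’s functions.php. A minimal sketch follows; the function name is my own, and note that this covers the login, registration, and lost-password screens served by wp-login.php, not the admin dashboard itself:

```php
<?php
// Print noindex/nofollow meta tags on the WordPress login screen.
// The login_head action fires inside the <head> of wp-login.php,
// which also serves the registration and lost-password forms.
function pp_noindex_login_page() {
	echo '<meta name="googlebot" content="noindex,noarchive,nofollow" />' . "\n";
	echo '<meta name="msnbot" content="noindex,nofollow" />' . "\n";
	echo '<meta name="robots" content="noindex,nofollow" />' . "\n";
}
add_action( 'login_head', 'pp_noindex_login_page' );
```

As before, view the source of your login page in a browser to verify the tags appear in the head element.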

If using all three of these methods seems like overkill, the third method alone should be sufficient. I currently employ the latter two techniques, and plan on removing sitewide admin links during the next redesign. Of course, if you really want to lock down your admin pages, nothing works better than htaccess, but we’ll save that for another article. ;)

About the Author: Jeff Starr = Fullstack Developer. Book Author. Teacher. Human Being.
15 responses
  1. Jeff Starr

    Yes, robots.txt rules are effective for Google and other compliant search engines, but they are far from universally obeyed or even acknowledged. Many crawlers completely disregard the robots.txt file and jump directly into any page they can find. Of course, with Google completely dominating the world of search, optimizing your pages for everyone else may seem like wishful thinking; however, we shouldn’t put all of our link equity into one engine (so to speak).

  2. Thanks, that was really helpful. Hope it will work for me.

  3. Jeff Starr

    Yes indeed, should work fine, Niklas! — Good luck!
