Stop Bitacle from Stealing Content
If you have yet to encounter the content-scraping site bitacle.org, consider yourself lucky. The scum-sucking worm-holes at bitacle.org are well-known for blatantly and piggishly stealing blog content and using it for financial gain through advertising. While I am not here to discuss the legal, philosophical, or technical ramifications of illegal bitacle behavior, I am here to provide a few critical tools that will help stop bitacle from stealing your content.
The .htaccess Finger
Perhaps the most straightforward and effective method for keeping the bitacle thieves away from your site is to block them at the server level. The following rules match bitacle’s IP address or user agent and return a 403 “Forbidden” response to any matching request. To make it happen, add this to your site’s root .htaccess file:
# Block bitacle by IP address or user agent
RewriteEngine On
RewriteBase /
RewriteCond %{REMOTE_ADDR} ^212\.22\.59\.251$ [OR]
RewriteCond %{HTTP_USER_AGENT} Bitacle
RewriteRule .? - [F]
For more information on .htaccess files and blocking unwanted requests, check out Stupid htaccess Tricks and How to Block Bad Bots.
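To see how the two conditions combine, here is a minimal Python sketch that mirrors the rule logic outside of Apache. It is only an illustration, not part of the .htaccess method itself: the function name `is_blocked` and the constant names are hypothetical, and the sketch assumes the same IP address and case-sensitive user-agent pattern used in the rules above.

```python
import re

# Same patterns as the two RewriteCond lines (names are hypothetical).
BLOCKED_IP = re.compile(r"^212\.22\.59\.251$")
# Unanchored and case-sensitive, like a RewriteCond pattern without [NC].
BLOCKED_AGENT = re.compile(r"Bitacle")


def is_blocked(remote_addr: str, user_agent: str) -> bool:
    """Return True if either condition matches, as with the [OR] flag."""
    ip_match = BLOCKED_IP.search(remote_addr) is not None
    agent_match = BLOCKED_AGENT.search(user_agent) is not None
    return ip_match or agent_match


if __name__ == "__main__":
    # A request from the listed IP is blocked regardless of user agent.
    print(is_blocked("212.22.59.251", "Mozilla/5.0"))
    # A request from any other IP is blocked if the agent contains "Bitacle".
    print(is_blocked("10.0.0.1", "Bitacle bot/1.1"))
    # An ordinary visitor matches neither condition.
    print(is_blocked("10.0.0.1", "Mozilla/5.0"))
```

Because the user-agent condition is a separate match, the block should keep working if bitacle moves to a different IP address, as long as the bot keeps identifying itself as Bitacle.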
The robots.txt Slap
Next up, another anti-bitacle method that instructs the bitacle bots to stay away from your site. This method uses a robots.txt file in your site’s root directory to deny bitacle agents crawl-access to all site contents. Keep in mind that robots.txt is a request rather than an enforcement mechanism, so it only works on bots that choose to obey it. Simply add the following lines to your site’s root robots.txt file:
User-agent: Bitacle bot/1.1
Disallow: /
User-agent: Bitacle bot
Disallow: /
User-agent: Bitacle *
Disallow: /
User-agent: Bitacle*
Disallow: /
User-agent: Bitacle
Disallow: /
For more information on robots.txt, check out Robots Notes Plus and Better Robots.txt Rules for WordPress.
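If you want to confirm how a well-behaved crawler would read these rules, the following is a minimal sketch using Python's standard `urllib.robotparser` module. The helper name `allowed` and the sample user-agent strings are assumptions made for illustration. It only shows how a compliant parser interprets the rules; it says nothing about what a bot that ignores robots.txt will do.

```python
from urllib.robotparser import RobotFileParser

# The same rules listed above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: Bitacle bot/1.1
Disallow: /
User-agent: Bitacle bot
Disallow: /
User-agent: Bitacle *
Disallow: /
User-agent: Bitacle*
Disallow: /
User-agent: Bitacle
Disallow: /
"""


def allowed(user_agent: str, path: str = "/") -> bool:
    """Hypothetical helper: may this user agent fetch the given path?"""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # The bitacle agent is denied everything under the site root.
    print(allowed("Bitacle bot/1.1"))   # False
    # Agents not named in the file are unaffected by these rules.
    print(allowed("Mozilla/5.0"))       # True
```

Swapping in your own robots.txt contents is a quick way to check that a new rule denies the agent you intended without locking out everyone else.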
Related WordPress Plugins
For more help on the anti-plagiarism front, check out Copyfight and Copyright Proof. These fine WordPress plugins come highly recommended and are definitely worth a look.
Other Essential Tools
Beyond the essential preventative methods discussed above, there are many other resources and tools now available for dealing with site scrapers, content thieves, and other worthless garbage. One worthwhile website is Copyscape, which provides an excellent tool for searching the web for stolen content. If you find that your content has indeed been plagiarized, read up on how to respond properly and effectively. Finally, try searching for terms such as “plagiarism tools”, “content scraping”, “copyright protection”, and “syndication theft”. Good luck!
References & Resources
- .htaccess made easy
- Stupid htaccess Tricks
- Robots Notes Plus
- Better Robots.txt Rules for WordPress
- How to Deal with Content Scrapers
- How to Protect Your Site Against Content Thieves
- Stop bitacle!
- Detect Copyright Theft
- Copyright and Plagiarism Blog
2 responses to “Stop Bitacle from Stealing Content”
Figured I’d let you know that Bitacle’s bot does not pay attention to robots.txt rules. The most effective way to stop them is to simply ban their user-agent and take some measures to ensure that your content can’t be easily spidered/stolen.
Also, you are at far greater risk by using a service such as FeedBurner.
Yeah, I have read elsewhere that bitacle ignores robots.txt rules, but I am paranoid enough to include them anyway. It may not be necessary, but it is the formally accepted method, and it definitely won’t hurt anything.
As for FeedBurner (and similar services), the benefits of their service currently outweigh the potential threat of content hounds like bitacle. Nonetheless, I definitely will be looking into it further and perhaps changing my mind if anything serious unfolds. Either way, I appreciate your comment and the heads up concerning our mutual enemy! ;)