
4G Series: The Ultimate User-Agent Blacklist, Featuring Over 1200 Bad Bots

As discussed in my recent article, Eight Ways to Blacklist with Apache’s mod_rewrite, one method of stopping spammers, scrapers, email harvesters, and malicious bots is to blacklist their associated user agents. Apache enables us to target bad user agents by testing the user-agent string against a predefined blacklist of unwanted visitors. Any bot identifying itself as one of the blacklisted agents is immediately and quietly denied access. While this certainly isn’t the most effective method of securing your site against malicious behavior, it may provide another layer of protection.
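As a minimal sketch of the technique, the following rules deny any request whose user-agent string contains one of three names. The three names are borrowed from the full list later in this post purely for illustration; a real blacklist would be longer and tuned to what shows up in your own logs.

```apache
# Minimal user-agent blacklist (illustrative sketch only)
<IfModule mod_rewrite.c>
RewriteEngine on
# Match any of the listed names anywhere in the user-agent string; [NC] ignores case
RewriteCond %{HTTP_USER_AGENT} EmailSiphon|HTTrack|WebZIP [NC]
# Leave the URL untouched (-) and answer with 403 Forbidden
RewriteRule ^(.*)$ - [F,L]
</IfModule>
```

Because the pattern is not anchored, each name matches as a substring, so short names can block far more than intended.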

With Great Power..

Please be aware that there are several things to consider before choosing to implement an extensive user-agent blacklist on your site. First and most important is the transient nature of the user agent itself. On most systems, the user-agent variable is easy to change, making it possible for bot owners to use any user-agent name they wish. Once a bad bot makes the rounds, becomes known, and is blacklisted, the bot owner need only change its declared user agent to be back in business. User-agent names are constantly invented, spoofed, or otherwise altered in order to operate beneath (or above) the virtual radar. Thus, a user-agent blacklist is a high-maintenance affair, requiring continuous cultivation in order to maintain relevancy and effectiveness.

Performance is another important issue to consider. While a well-maintained user-agent blacklist may hold a reasonable number of user agents, blacklists that simply accumulate new names will eventually grow painfully large and degrade server performance. Then you’re left with a never-ending blacklist of retired user agents that fails to protect your site while slowing things down to a virtual crawl (no pun intended). And despite your best intentions, we both know that taking time for periodic “blacklist maintenance” is a luxury that simply doesn’t exist, at least for most of us.

As if those reasons weren’t enough to persuade you against using an ultimate user-agent blacklist, here is another: the 4G Blacklist. Put simply, the 4G Blacklist is a more effective way to protect your site against a wide variety of spam, exploits, and malicious attacks. Unlike huge lists of banned user agents, the 4G Blacklist requires zero maintenance, consumes fewer resources, and may retain its effectiveness indefinitely.

But alas, for those of you who are still determined to get your hands on the latest “ultimate” user-agent blacklist, here you go..

The Ultimate User-Agent Blacklist

As you may recall, the original Ultimate HTAccess Blacklist was released here at Perishable Press a couple of years ago. Then, several months later, I added more bad user agents, compressed the list into single-line format, and released the Ultimate HTAccess Blacklist 2. This list contained over 300 bad bots and was generally well-received by the community, protecting many sites against a plethora of site rippers, grabbers, spammers, harvesters, bad bots, and other online scum. When used as a solid foundation on which to build and cultivate your own user-agent blacklist, the Ultimate HTAccess Blacklist can help to improve performance, increase security, and conserve precious resources.

Now, in this new and improved version of the Ultimate User-Agent Blacklist, I have integrated my recent collection [1] of actively malicious bad bots to more than quadruple the number of blocked user agents. This new list features a whopping 1211 blacklisted user agents, including three of my own creation [2] to be used exclusively for my diabolical and obsessive monitoring purposes (insert maniacal laughter here). Also, as with the second version of the user-agent blacklist, this new version is written in compressed, single-line format to facilitate usability and performance.

So, without further ado, here is the third incarnation of the Ultimate User-Agent Blacklist. Simply copy and paste the following code into the root HTAccess file of your site to enjoy a serious reduction in wasted bandwidth, stolen resources, and comment spam. Remember to back up your stuff before you meddle with things, and always test, test, test whenever implementing HTAccess directives.

# PERISHABLE PRESS ULTIMATE USER-AGENT BLACKLIST

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^$|\<|\>|\'|\%|\_iRc|\_Works|\@\$x|\<\?|\$x0e|\+select\+|\+union\+|1\,\1\,1\,|2icommerce|3GSE|4all|59\.64\.153\.|88\.0\.106\.|98|85\.17\.|A\_Browser|ABAC|Abont|abot|Accept|Access|Accoo|AceFTP|Acme|ActiveTouristBot|Address|Adopt|adress|adressendeutschland|ADSARobot|agent|ah\-ha|Ahead|AESOP\_com\_SpiderMan|aipbot|Alarm|Albert|Alek|Alexibot|Alligator|AllSubmitter|alma|almaden|ALot|Alpha|aktuelles|Akregat|Amfi|amzn\_assoc|Anal|Anarchie|andit|Anon|AnotherBot|Ansearch|AnswerBus|antivirx|Apexoo|appie|Aqua_Products|Arachmo|archive|arian|ASPSe|ASSORT|aster|Atari|ATHENS|AtHome|Atlocal|Atomic_Email_Hunter|Atomz|Atrop|^attach|attrib|autoemailspider|autohttp|axod|batch|b2w|Back|BackDoorBot|BackStreet|BackWeb|Badass|Baid|Bali|Bandit|Baidu|Barry|BasicHTTP|BatchFTP|bdfetch|beat|Become|Beij|BenchMark|berts|bew|big.brother|Bigfoot|Bilgi|Bison|Bitacle|Biz360|Black|Black.Hole|BlackWidow|bladder.fusion|Blaiz|Blog.Checker|Blogl|BlogPeople|Blogshares.Spiders|Bloodhound|Blow|bmclient|Board|BOI|boitho|Bond|Bookmark.search.tool|boris|Bost|Boston.Project|BotRightHere|Bot.mailto:craftbot@yahoo.com|BotALot|botpaidtoclick|botw|brandwatch|BravoBrian|Brok|Bropwers|Broth|browseabit|BrowseX|Browsezilla|Bruin|bsalsa|Buddy|Build|Built|Bulls|bumblebee|Bunny|Busca|Busi|Buy|bwh3|c\-spider|CafeK|Cafi|camel|Cand|captu|Catch|cd34|Ceg|CFNetwork|cgichk|Cha0s|Chang|chaos|Char|char\(32\,35\)|charlotte|CheeseBot|Chek|CherryPicker|chill|ChinaClaw|CICC|Cisco|Cita|Clam|Claw|Click.Bot|clipping|clshttp|Clush|COAST|ColdFusion|Coll|Comb|commentreader|Compan|contact|Control|contype|Conc|Conv|Copernic|Copi|Copy|Coral|Corn|core-project|cosmos|costa|cr4nk|crank|craft|Crap|Crawler0|Crazy|Cres|cs\-CZ|cuill|Curl|Custo|Cute|CSHttp|Cyber|cyberalert|^DA$|daoBot|DARK|Data|Daten|Daum|dcbot|dcs|Deep|DepS|Detect|Deweb|Diam|Digger|Digimarc|digout4uagent|DIIbot|Dillo|Ding|DISC|discobot|Disp|Ditto|DLC|DnloadMage|DotBot|Doubanbot|Download|Download.Demon|Download.Devil|Download.Wonder|Downloader|drag|DreamPassport|Drec|Drip|dsdl|dsok|DSurf|DTAAgent|DTS|Dual|dumb|DynaWeb|e\-collector|eag|earn|EARTHCOM|EasyDL|ebin|EBM-APPLE|EBrowse|eCatch|echo|ecollector|Edco|edgeio|efp\@gmx\.net|EirGrabber|email|Email.Extractor|EmailCollector|EmailSearch|EmailSiphon|EmailWolf|Emer|empas|Enfi|Enhan|Enterprise\_Search|envolk|erck|EroCr|ESurf|Eval|Evil|Evere|EWH|Exabot|Exact|EXPLOITER|Expre|Extra|ExtractorPro|EyeN|FairAd|Fake|FANG|FAST|fastlwspider|FavOrg|Favorites.Sweeper|Faxo|FDM\_1|FDSE|fetch|FEZhead|Filan|FileHound|find|Firebat|Firefox.2\.0|Firs|Flam|Flash|FlickBot|Flip|fluffy|flunky|focus|Foob|Fooky|Forex|Forum|ForV|Fost|Foto|Foun|Franklin.Locator|freefind|FreshDownload|FrontPage|FSurf|Fuck|Fuer|futile|Fyber|Gais|GalaxyBot|Galbot|Gamespy\_Arcade|GbPl|Gener|geni|Geona|Get|gigabaz|Gira|Ginxbot|gluc|glx.?v|gnome|Go.Zilla|Goldfire|Google.Wireless.Transcoder|Googlebot\-Image|Got\-It|GOFORIT|gonzo|GornKer|GoSearch|^gotit$|gozilla|grab|Grabber|GrabNet|Grub|Grup|Graf|Green.Research|grub|grub\-client|gsa\-cra|GSearch|GT\:\:WWW|GuideBot|guruji|gvfs|Gyps|hack|haha|hailo|Harv|Hatena|Hax|Head|Helm|herit|hgre|hhjhj\@yahoo|Hippo|hloader|HMView|holm|holy|HomePageSearch|HooWWWer|HouxouCrawler|HMSE|HPPrint|htdig|HTTPConnect|httpdown|http.generic|HTTPGet|httplib|HTTPRetriever|HTTrack|human|Huron|hverify|Hybrid|Hyper|ia\_archiver|iaskspi|IBM\_Planetwide|iCCra|ichiro|ID\-Search|IDA|IDBot|IEAuto|IEMPT|iexplore\.exe|iGetter|Ilse|Iltrov|Image|Image.Stripper|Image.Sucker|imagefetch|iimds\_monitor|Incutio|IncyWincy|Indexer|Industry.Program|Indy|InetURL|informant|InfoNav|InfoTekies|Ingelin|Innerpr|Inspect|InstallShield.DigitalWizard|Insuran\.|Intellig|Intelliseek|InterGET|Internet.Ninja|Internet.x|Internet\_Explorer|InternetLinkagent|InternetSeer.com|Intraf|IP2|Ipsel|Iria|IRLbot|Iron33|Irvine|ISC\_Sys|iSilo|ISRCCrawler|ISSpi|IUPUI.Research.Bot|Jady|Jaka|Jam|^Java|java\/|Java\(tm\)|JBH.agent|Jenny|JetB|JetC|jeteye|jiro|JoBo|JOC|jupit|Just|Jyx|Kapere|kash|Kazo|KBee|Kenjin|Kernel|Keywo|KFSW|KKma|Know|kosmix|KRAE|KRetrieve|Krug|ksibot|ksoap|Kum|KWebGet|Lachesis|lanshan|Lapo|larbin|leacher|leech|LeechFTP|LeechGet|leipzig\.de|Lets|Lexi|lftp|Libby|libcrawl|libcurl|libfetch|libghttp|libWeb|libwhisker|libwww|libwww\-FM|libwww\-perl|LightningDownload|likse|Linc|Link|Link.Sleuth|LinkextractorPro|Linkie|LINKS.ARoMATIZED|LinkScan|linktiger|LinkWalker|Lint|List|lmcrawler|LMQ|LNSpiderguy|loader|LocalcomBot|Locu|London|lone|looksmart|loop|Lork|LTH\_|lwp\-request|LWP|lwp-request|lwp-trivial|Mac.Finder|Macintosh\;.I\;.PPC|Mac\_F|magi|Mag\-Net|Magnet|Magp|Mail.Sweeper|main|majest|Mam|Mana|MarcoPolo|mark.blonin|MarkWatch|MaSagool|Mass|Mass.Downloader|Mata|mavi|McBot|Mecha|MCspider|mediapartners|^Memo|MEGAUPLOAD|MetaProducts.Download.Express|Metaspin|Mete|Microsoft.Data.Access|Microsoft.URL|Microsoft\_Internet\_Explorer|MIDo|MIIx|miner|Mira|MIRE|Mirror|Miss|Missauga|Missigua.Locator|Missouri.College.Browse|Mist|Mizz|MJ12|mkdb|mlbot|MLM|MMMoCrawl|MnoG|moge|Moje|Monster|Monza.Browser|Mooz|Moreoverbot|MOT\-MPx220|mothra\/netscan|mouse|MovableType|Mozdex|Mozi\!|^Mozilla$|Mozilla\/1\.22|Mozilla\/22|^Mozilla\/3\.0.\(compatible|Mozilla\/3\.Mozilla\/2\.01|Mozilla\/4\.0\(compatible|Mozilla\/4\.08|Mozilla\/4\.61.\(Macintosh|Mozilla\/5\.0|Mozilla\/7\.0|Mozilla\/8|Mozilla\/9|Mozilla\:|Mozilla\/Firefox|^Mozilla.*Indy|^Mozilla.*NEWT|^Mozilla.*MSIECrawler|Mp3Bot|MPF|MRA|MS.FrontPage|MS.?Search|MSFrontPage|MSIE\_6\.0|MSIE6|MSIECrawler|msnbot\-media|msnbot\-Products|MSNPTC|MSProxy|MSRBOT|multithreaddb|musc|MVAC|MWM|My\_age|MyApp|MyDog|MyEng|MyFamilyBot|MyGetRight|MyIE2|mysearch|myurl|NAG|NAMEPROTECT|NASA.Search|nationaldirectory|Naver|Navr|Near|NetAnts|netattache|Netcach|NetCarta|Netcraft|NetCrawl|NetMech|netprospector|NetResearchServer|NetSp|Net.Vampire|netX|NetZ|Neut|newLISP|NewsGatorInbox|NEWT|NEWT.ActiveX|Next|^NG|NICE|nikto|Nimb|Ninja|Ninte|NIPGCrawler|Noga|nogo|Noko|Nomad|Norb|noxtrumbot|NPbot|NuSe|Nutch|Nutex|NWSp|Obje|Ocel|Octo|ODI3|oegp|Offline|Offline.Explorer|Offline.Navigator|OK.Mozilla|omg|Omni|Onfo|onyx|OpaL|OpenBot|Openf|OpenTextSiteCrawler|OpenU|Orac|OrangeBot|Orbit|Oreg|osis|Outf|Owl|P3P|PackRat|PageGrabber|PagmIEDownload|pansci|Papa|Pars|Patw|pavu|Pb2Pb|pcBrow|PEAR|PEER|PECL|pepe|Perl|PerMan|PersonaPilot|Persuader|petit|PHP|PHP.vers|PHPot|Phras|PicaLo|Piff|Pige|pigs|^Ping|Pingd|PingALink|Pipe|Plag|Plant|playstarmusic|Pluck|Pockey|POE\-Com|Poirot|Pomp|Port.Huron|Post|powerset|Preload|press|Privoxy|Probe|Program.Shareware|Progressive.Download|ProPowerBot|prospector|Provider.Protocol.Discover|ProWebWalker|Prowl|Proxy|Prozilla|psbot|PSurf|psycheclone|^puf$|Pulse|Pump|PushSite|PussyCat|PuxaRapido|PycURL|Pyth|PyQ|QuepasaCreep|Query|Quest|QRVA|Qweer|radian|Radiation|Rambler|RAMP|RealDownload|Reap|Recorder|RedCarpet|RedKernel|ReGet|relevantnoise|replacer|Repo|requ|Rese|Retrieve|Rip|Rix|RMA|Roboz|Rogue|Rover|RPT\-HTTP|Rsync|RTG30|.ru\)|ruby|Rufus|Salt|Sample|SAPO|Sauger|savvy|SBIder|SBP|SCAgent|scan|SCEJ\_|Sched|Schizo|Schlong|Schmo|Scout|Scooter|Scorp|ScoutOut|SCrawl|screen|script|SearchExpress|searchhippo|Searchme|searchpreview|searchterms|Second.Street.Research|Security.Kol|Seekbot|Seeker|Sega|Sensis|Sept|Serious|Sezn|Shai|Share|Sharp|Shaz|shell|shelo|Sherl|Shim|Shiretoko|ShopWiki|SickleBot|Simple|Siph|sitecheck|SiteCrawler|SiteSnagger|Site.Sniper|SiteSucker|sitevigil|SiteX|Sleip|Slide|Slurpy.Verifier|Sly|Smag|SmartDownload|Smurf|sna\-|snag|Snake|Snapbot|Snip|Snoop|So\-net|SocSci|sogou|Sohu|solr|sootle|Soso|SpaceBison|Spad|Span|spanner|Speed|Spegla|Sphere|Sphider|spider|SpiderBot|SpiderEngine|SpiderView|Spin|sproose|Spurl|Spyder|Squi|SQ.Webscanner|sqwid|Sqworm|SSM\_Ag|Stack|Stamina|stamp|Stanford|Statbot|State|Steel|Strateg|Stress|Strip|studybot|Style|subot|Suck|Sume|sun4m|Sunrise|SuperBot|SuperBro|Supervi|Surf4Me|SuperHTTP|Surfbot|SurfWalker|Susi|suza|suzu|Sweep|sygol|syncrisis|Systems|Szukacz|Tagger|Tagyu|tAke|Talkro|TALWinHttpClient|tamu|Tandem|Tarantula|tarspider|tBot|TCF|Tcs\/1|TeamSoft|Tecomi|Teleport|Telesoft|Templeton|Tencent|Terrawiz|Test|TexNut|trivial|Turnitin|The.Intraformant|TheNomad|Thomas|TightTwatBot|Timely|Titan|TMCrawler|TMhtload|toCrawl|Todobr|Tongco|topic|Torrent|Track|translate|Traveler|TREEVIEW|True|Tunnel|turing|Turnitin|TutorGig|TV33\_Mercator|Twat|Tweak|Twice|Twisted.PageGetter|Tygo|ubee|UCmore|UdmSearch|UIowaCrawler|Ultraseek|UMBC|unf|UniversalFeedParser|unknown|UPG1|UtilMind|URLBase|URL.Control|URL\_Spider\_Pro|urldispatcher|URLGetFile|urllib|URLSpiderPro|URLy|User\-Agent|UserAgent|USyd|Vacuum|vagabo|Valet|Valid|Vamp|vayala|VB\_|VCI|VERI\~LI|verif|versus|via|Viewer|virtual|visibilitygap|Visual|vobsub|Void|VoilaBot|voyager|vspider|VSyn|w\:PACBHO60|w0000t|W3C|w3m|w3search|walhello|Walker|Wand|WAOL|WAPT|Watch|Wavefire|wbdbot|Weather|web.by.mail|Web.Data.Extractor|Web.Downloader|Web.Ima|Web.Mole|Web.Sucker|Web2Mal|Web2WAP|WebaltBot|WebAuto|WebBandit|Webbot|WebCapture|WebCat|webcraft\@bea|Webclip|webcollage|WebCollector|WebCopier|WebCopy|WebCor|webcrawl|WebDat|WebDav|webdevil|webdownloader|Webdup|WebEMail|WebEMailExtrac|WebEnhancer|WebFetch|WebGo|WebHook|Webinator|WebInd|webitpr|WebFilter|WebFountain|WebLea|Webmaster|WebmasterWorldForumBot|WebMin|WebMirror|webmole|webpic|WebPin|WebPix|WebReaper|WebRipper|WebRobot|WebSauger|WebSite|Website.eXtractor|Website.Quester|WebSnake|webspider|Webster|WebStripper|websucker|WebTre|WebVac|webwalk|WebWasher|WebWeasel|WebWhacker|WebZIP|Wells|WEP\_S|WEP.Search.00|WeRelateBot|wget|Whack|Whacker|whiz|WhosTalking|Widow|Win67|window.location|Windows.95\;|Windows.95\)|Windows.98\;|Windows.98\)|Winodws|Wildsoft.Surfer|WinHT|winhttp|WinHttpRequest|WinHTTrack|Winnie.Poh|wire|WISEbot|wisenutbot|wish|Wizz|WordP|Works|world|WUMPUS|Wweb|WWWC|WWWOFFLE|WWW\-Collector|WWW.Mechanize|www.ranks.nl|wwwster|^x$|X12R1|x\-Tractor|Xaldon|Xenu|XGET|xirq|Y\!OASIS|Y\!Tunnel|yacy|YaDirectBot|Yahoo\-MMAudVid|YahooSeeker|YahooYSMcm|Yamm|Yand|yang|Yeti|Yoono|yori|Yotta|YTunnel|Zade|zagre|ZBot|Zeal|ZeBot|zerx|Zeus|ZIPCode|Zixy|zmao|Zyborg [NC]
RewriteRule ^(.*)$ - [F,L]
</IfModule>
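Because many of the patterns are short and match anywhere in the user-agent string, a list this broad can also catch visitors you want to keep. One way to handle that, sketched here on the assumption that you have found a legitimate agent being denied in your logs, is to place a negated condition ahead of the blacklist. The `GoodBot` name is a placeholder, and the three-name pattern is only a stand-in for the full condition shown above.

```apache
<IfModule mod_rewrite.c>
RewriteEngine on
# Hypothetical exception: skip the blacklist when the user agent contains "GoodBot"
RewriteCond %{HTTP_USER_AGENT} !GoodBot [NC]
# Shortened stand-in for the full blacklist condition
RewriteCond %{HTTP_USER_AGENT} EmailSiphon|HTTrack|WebZIP [NC]
RewriteRule ^(.*)$ - [F,L]
</IfModule>
```

Consecutive RewriteCond lines are combined with an implicit AND, so a request is denied only when its user agent does not match the exception and does match the blacklist.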

And, for those of you who enjoy looking at long lists of bad robots, here is the same blacklist of 1211 banned user agents in uncompressed format:

[ Image: uncompressed view of the Ultimate User-Agent Blacklist, 1211 blocked user agents ]

I love this game :)

Notes

  • 1 Special thanks to “Mr. M” for graciously sharing his extensive user-agent list and granting permission to integrate it into this version of the blacklist. Thanks M! :)
  • 2 Free iPod Nano plus honorable mention in my next article for the first person to identify correctly the three “imaginary” (i.e., fake) user agents. Good luck! ;)

About the Author
Jeff Starr = Creative thinker. Passionate about free and open Web.

75 responses to “4G Series: The Ultimate User-Agent Blacklist, Featuring Over 1200 Bad Bots”

  1. I love this script!
    I’ve tested it with all the recent ‘goodbot’ useragents for Google, Yahoo, and Ask. The only false positive is for

    Mozilla/2.0 (compatible; Ask Jeeves/Teoma; +http://about.ask.com/en/docs/about/webmasters.shtml)

    Can this be made to work without breaking the matching rules?
