2G Blacklist: Closing the Door on Malicious Attacks
Since posting the Ultimate htaccess Blacklist and then the Ultimate htaccess Blacklist 2, I find myself dealing with a new breed of malicious attacks. It is no longer useful to simply block nefarious user agents because they are frequently faked. Likewise, blocking individual IP addresses is generally a waste of time because the attacks are coming from a decentralized network of zombie machines. Watching my error and access logs very closely, I have observed the following trends in current attacks:
- UAs are faked, typically using something generic like Mozilla/5.0
- Each attack may involve hundreds of compromised IP addresses
- Attacks generally target a large number of indexed (i.e., known) pages
- Attacks often utilize query strings appended to variously named PHP files
- The target URLs often include a secondary URL appended to a permalink
- An increasing number of attacks employ random character strings to probe for holes
Yet despite the apparent complexity of such attacks, they tend to look remarkably similar. Specifically, notice the trends in the following examples of (nonexistent) target URLs, or “attack strings,” as I like to call them:
(Note: in the following examples, perishablepress.com has been replaced with example.com. This was required to prevent endless 404 errors from googlebot constantly crawling plain-text URLs.)
https://example.com/press/tag/tutorial/menu.php?http://www.lexiaintl.org/templates/css/test.txt?
https://example.com/press/2008/01/13/like-spider-pig/home.php?p=http://www.rootsfestival.no/.s/n?
https://example.com/press/wp-content/images/2006/feed-collection/feed-icon_orange-28px.png%20alt=
https://example.com/press/2007/08/29/stop-wordpress-from-leaking-pagerank-to-admin-pages/admin/doeditconfig.php
https://example.com/press/2007/07/29/slideshow-code-for-dead-letter-art/http://cccryuan1918ssdf.nightmail.ru/babyboy/?
https://example.com/press/tag/upgrade/includes/db_connect.php?baseDir=http://www.stempel-immobilien.de/images/mambo??
https://example.com/press/2007/12/17/how-to-enable-php-error-logging-via-htaccess/indexhttp://hellinsoloradio.com/test.txt?
https://example.com/press/tag/php/components/com_webring/admin.webring.docs.php?component_dir=http://www.cartographia.org/ftp/files/source/SinG??
https://example.com/press/2007/11/14/easily-adaptable-wordpress-loop-templates/home.php?menu=http://www.zojoma.com/gjestebok/img/response?%0D?
https://example.com/press/2007/10/15/ultimate-htaccess-blacklist-2-compressed-version/1x2n6l6bx6nt//001mAFC(-~l-xAou6.oCqAjB4ukkmrntoz1A//0011C/uikqijg4InjxGu.k
https://example.com/press/2006/06/14/the-htaccess-rules-for-all-wordpress-permalinks//wordpress/wp-content/plugins/wordtube/wordtube-button.php?wpPATH=http://www.mecad.es/bo??
https://example.com/press/2007/10/15/ultimate-htaccess-blacklist-2-compressed-version/x%7b.//000Ooz,m4//000____::um,qymuxH%3bmJ.5G+D//001F00Dox%7b1rF9DrEtxmn7unwp%7dqDr/
https://example.com/press/2007/07/17/wp-shortstat-slowing-down-root-index-pages/members/plugins/payment/secpay/config.inc.php?config%5brhttp://www.algebramovie.com/images/test.txt???
Now imagine hundreds or even thousands of requests for each of these different URL variations, each targeting a different preexisting resource. So, for example, using the first attack string from our list, such an attack would generate the following log entries:
http://example.com/press/2007/01/29/fun-with-the-dos-command-prompt/menu.php?http://www.lexiaintl.org/templates/css/test.txt?
http://example.com/press/2006/11/01/addmysite-plugin-for-wordpress/menu.php?http://www.lexiaintl.org/templates/css/test.txt?
http://example.com/press/2006/11/20/add-rss-feed-link-icons/menu.php?http://www.lexiaintl.org/templates/css/test.txt?
http://example.com/press/2006/01/10/stupid-htaccess-tricks/menu.php?http://www.lexiaintl.org/templates/css/test.txt?
http://example.com/press/tag/tutorial/menu.php?http://www.lexiaintl.org/templates/css/test.txt?
[ + many more ]
Then, associated with each of these attacks is a unique (or semi-unique) IP address and (faked) user agent. Occasionally, such attacks are executed from a single machine or small network, in which case the user agent for each entry is typically randomized to evade user-agent-based blacklists. More typically, however, current spammer and cracker attacks employ a virtually “unblockable” array of user agents and IP addresses. In short, blacklisting methods that rely on either of these variables are becoming increasingly ineffective at stopping malicious attacks.
A Second Generation Blacklist
Given these observations, I have adopted a new strategy for dealing with this “new breed” of malicious attacks. Instead of targeting zillions of IP addresses and/or user agents for blacklisting, I am now identifying recurring attack-string patterns and blocking them via the RedirectMatch directive of Apache’s powerful Alias module, mod_alias:
<IfModule mod_alias.c>
RedirectMatch 403 attackstring1
RedirectMatch 403 attackstring2
RedirectMatch 403 attackstring3
</IfModule>
By blocking key portions of the actual strings used in an attack, we are targeting an “unfakable” variable and preventing its use in any capacity. For example, referring to our previously given collection of attack strings, we are able to block almost the entire collection with a single line of code:
RedirectMatch 403 http\:\/\/
Within the context of current server-exploitation techniques, that one line of code is an immensely powerful weapon for closing the door on malicious attacks. By focusing our blacklisting efforts directly on the attack vector itself, we employ a strategy that transcends the emergent complexity and variation inherent among intrinsic attack parameters. They can fake the user agents, the IP addresses, and just about everything else, but they can’t fake the (potential) targets of their attacks. Attack strings contain patterns that remain far more constant than previously targeted variables. And it gets even better..
Presenting the 2G Blacklist
For several months now, I have been harvesting key portions of malicious attack strings from my access logs and adding them to my new and improved “2G” blacklist. After the addition of each new string, I take as much time as possible to test the effectiveness of the block and ensure that it doesn’t interfere with normal functionality.
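To give a purely hypothetical example of how strings are harvested: suppose the access logs reveal repeated probes for a nonexistent script named evil_connector.php (a made-up name for illustration, not from my actual logs). Escape the literal dot and add the key portion of the string to the list:
# hypothetical example: blocking a newly harvested attack string
# logged request: /press/some-permalink/evil_connector.php?target=...
RedirectMatch 403 evil_connector\.php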
Although highly effective in its current state, the 2G Blacklist is a work in progress. As time goes on, this blacklisting method will certainly evolve to keep up with the rapidly changing arsenal of spammer and cracker attacks. To stay current with this and many other security measures, I encourage you to subscribe to Perishable Press.
As mentioned, this blacklist is designed for Apache servers equipped with the mod_alias module. You will need access to your site’s root htaccess file, into which you simply copy & paste the following code:
# 2G Blacklist from Perishable Press
<IfModule mod_alias.c>
RedirectMatch 403 \.inc
RedirectMatch 403 alt\=
RedirectMatch 403 http\:\/\/
RedirectMatch 403 menu\.php
RedirectMatch 403 main\.php
RedirectMatch 403 file\.php
RedirectMatch 403 home\.php
RedirectMatch 403 view\.php
RedirectMatch 403 about\.php
RedirectMatch 403 order\.php
RedirectMatch 403 index2\.php
RedirectMatch 403 errors\.php
RedirectMatch 403 config\.php
RedirectMatch 403 button\.php
RedirectMatch 403 middle\.php
RedirectMatch 403 threads\.php
RedirectMatch 403 contact\.php
RedirectMatch 403 display\.cgi
RedirectMatch 403 display\.php
RedirectMatch 403 include\.php
RedirectMatch 403 register\.php
RedirectMatch 403 db_connect\.php
RedirectMatch 403 doeditconfig\.php
RedirectMatch 403 send\_reminders\.php
RedirectMatch 403 admin_db_utilities\.php
RedirectMatch 403 admin\.webring\.docs\.php
RedirectMatch 403 keds\.lpti
RedirectMatch 403 r\.verees
RedirectMatch 403 pictureofmyself
RedirectMatch 403 remoteFile
RedirectMatch 403 mybabyboy
RedirectMatch 403 mariostar
RedirectMatch 403 zaperyan
RedirectMatch 403 babyboy
RedirectMatch 403 aboutme
RedirectMatch 403 xAou6
RedirectMatch 403 qymux
</IfModule>
A brief rundown of what we are doing here.. First, notice that the entire list is enclosed in a conditional <IfModule> container; this ensures that your site will not crash if for some reason mod_alias becomes unavailable. The list itself is elegantly simple. Each line targets a specific string of characters that, if matched in the URL, will return a 403 Forbidden HTTP error code. Nice, clean, and easy.
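One caveat: if a particular rule happens to collide with legitimate functionality in your setup, simply disable that line rather than discarding the list. For example, as a reader confirms in the comments below, Joomla requires index2.php for its backend:
# disable any rule that conflicts with your CMS
# (e.g., Joomla uses index2.php in its backend):
# RedirectMatch 403 index2\.php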
Wrap Up..
Although highly effective at stopping many attacks, this blacklist is merely another useful tool in the ongoing hellish battle against the evil forces of the nefarious online underworld. It is meant to complement existing methods, not replace them. Some frequently asked questions:
Is there still benefit from blocking individual IP addresses? As discussed elsewhere at Perishable Press, yes, crackers and attackers have their favorite locations and certain zombie machines are easier to manipulate than others.
Is there still benefit from blocking certain ranges of IPs? Yes. Especially for coordinated attacks, blocking ranges of IP addresses is often the most expedient way of securing a site against threats.
Is there still benefit from blocking certain user agents? Yes, many spammers, scrapers and crackers have yet to spoof this aspect of their game — there are many well-known and well-hated user agents that should be banned.
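For the curious, here is a minimal sketch of user-agent blocking via mod_rewrite; the two agents named here are merely common examples, so substitute the offenders from your own logs:
<IfModule mod_rewrite.c>
RewriteEngine On
# match any user agent containing either string (case-insensitive)
RewriteCond %{HTTP_USER_AGENT} (libwww-perl|BlackWidow) [NC]
# deny the request with a 403 Forbidden response
RewriteRule .* - [F,L]
</IfModule>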
It is in addition to these tools, then, that the 2G Blacklist provides another layer of security with solid defense against malicious attacks.
22 responses to “2G Blacklist: Closing the Door on Malicious Attacks”
I would like an explanation of how this line works:
redirectmatch 403 http\:\/\/
Plus, I wonder if a more powerful approach wouldn’t be possible: you are designing the 403 list to block on certain criteria, but wouldn’t it be stronger to block requests for URLs that do not match your site’s structure?
I mean, couldn’t you block URLs that differ from, let’s say:
// Posts
https://example.com/year/month/day/title/
// Categories
https://example.com/category/category-name/
// Etc
https://example.com/xxx/xxx/
Oh, secondary instances, I missed that, stupid me.
I’ll stay tuned for this Apache rules solution, and I’ll try to find something myself, even if I’ve never seen any attack string in my PHP error logs.
I found the raw logs in the ASO cPanel, and I must say I’m pretty impressed that you find the courage to look at all this not-really-written-for-humans gibberish.
Unless you use some kind of regex-based filter to clear out the recurrent, safe requests.
I wonder when your mighty plugin will be released :p
Hi Louis,
In the article, I present a series of examples whereby attack strings target various permalinks by appending secondary URLs. For example:
https://perishablepress.com/permalink/http://attack-url.tld/string...
https://perishablepress.com/permalink/http://attack-url.tld/test.txt?
https://perishablepress.com/permalink/http://attack-url.tld/result.php?
etc..
Thus, by using redirectmatch 403 http\:\/\/, we are able to match and block all such attempts. This works because it only matches secondary instances of “http://” in the URL. In my experience, many attacks are successfully blocked using this directive.
Your second concern is a good idea, and I have definitely tried to create a set of htaccess rules to do the job; however, I have so far been unable to create something that is at once specific enough to catch attacks yet general enough to ignore legitimate URLs. Additionally, for such a technique to benefit other users, it would need to accommodate a wide variety of alternate URL formats. So, until I find time to develop a stronger, more universal strategy, this method is the best I can offer. Hopefully, it is not all in vain ;)
Sounds good, Louis, you know that I will definitely post any new techniques as soon as I discover them. In the meantime, you may want to examine your site’s 404 or even Apache error logs to identify this type of attack — they generally will not generate any PHP errors.
Regards,
Jeff
Thanks for this great update to an already great article.
I have been plagued by many of the attacks mentioned. Because I run a vB board, I did have to modify some of the rules to be compatible. I also have a couple you may wish to add:
redirectmatch 403 f-.html
redirectmatch 403 ImpEvData.php
redirectmatch 403 check_proxy.html
Something I’m also concerned about is the thousands of 404s I get from agents similar to this: Java/1.
I have tried adding this rule, BrowserMatch “Java/1.0” 403, to mod_alias.c, but I’m not sure if I should or can. Can you advise on this?
One last comment and please don’t take offense but I find this blog really hard to read with gray text on a black background.
Again, brilliant work. I had a few spammers slip through the .htaccess, but luckily Akismet caught them.
Just a small note, though, for Joomla users: you will have to delete ‘redirectmatch 403 index2.php’ for the backend to work properly.
Hi Peter,
Thanks for the great feedback. You may want to try blocking any specific user agents (such as Java/1) using Apache’s SetEnvIfNoCase directive as follows:
# Block Java/1.0
SetEnvIfNoCase User-Agent "Java/1.0" keep_out
<Limit GET POST>
order allow,deny
allow from all
deny from env=keep_out
</Limit>
That should work — I block a number of nasties with that method. Also, regarding the low-contrast/readability issue, if you look at the lower-right corner of your browser, you should see a small sun icon. Click that, and the site should brighten up for you! ;)
@Don: Thanks for the heads up on the Joomla front. I may very well drop that rule from the next incarnation of the list. Cheers!
One of the other commands also mucks about with selecting the ‘visual/code’ tab when creating/editing a post.
I’m not sure which one it is, but if I find out I’ll let you know.
Thanks Don, we appreciate your efforts in adapting the blacklist for users of Joomla. Very cool!
‘We appreciate’? Is there more of you, or have you turned into royalty? :D
Nah, the above-mentioned point is in WordPress, not Joomla :p