Blocking the “ReallyLongRequest” Bandit

♦ Posted by Jeff Starr in .htaccess, Security

Updated June 26, 2024

While browsing server logs, I kept seeing these super long request URIs that begin with “YesThisIsAReallyLongRequest…” and then the request string just keeps going for like 1 kilobyte worth of characters. Not just a few times, but many. In other words, somebody is going around and repeatedly hitting servers with gigantic-size requests. Probably to test server response using other people’s servers. Ummm, yeah kinda malicious. So I did some research and then blocked the “ReallyLongRequest” Bandit.

Into the rabbit hole..

Here is an example of the log entries I kept finding (you have to scroll!):

[Tue July 10 02:24:12 2018] [core:error] [pid 28290] (36) File name too long: [client 123.456.789.0:50777] AH00036: access to /YesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHXYesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHX [...] failed (filesystem path [...])
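To check whether your own logs contain entries like this one, a quick grep is usually enough. Here is a minimal sketch; the function name and the log path in the example are assumptions, so adjust them for your server:

```shell
# Count log lines that contain the bandit's signature string.
# grep -c counts matching lines, so each request is counted once
# even though the string repeats many times within a single entry.
count_long_requests() {
  grep -c 'YesThisIsAReallyLongRequest' "$1"
}

# Example (hypothetical path; the location varies by server setup):
# count_long_requests /var/log/apache2/error.log
```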

While investigating my Apache server logs, I discovered hundreds if not thousands of these “really long” requests going back many months. So for who knows how long, these clowns have been hitting who knows how many sites with giant-size requests. How giant? Whatever it takes to crash your server? Maybe probing for vulnerabilities? At this point, all we know for sure is that someone is going around wasting server resources like a total bandit. To learn more, let’s dig a little deeper..

Follow the bandit..

Adding whitespace to the really long URI requests, we get the following message:

Yes This Is A Really Long Request URL but We Are Doing It On Purpose We Are Scanning For Research Purpose Please Have A Look At The User Agent THX

And then translating that mess to English, we get this:

Yes, this is a really long request URL, but we are doing it on purpose. We are scanning for research purposes. Please have a look at the user agent. Thanks.

Okay follow the white rabbit, I’ll play your little game. Let’s go ahead and check out the user agent:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.86 Safari/537.36 Scanning for research (researchscan.comsys.rwth-aachen.de)

Another clue! Lol. Okay so now tucked in the user agent, we find a URL:

researchscan.comsys.rwth-aachen.de

The first few attempts to follow that URL resulted in a timeout or server error. Thankfully, the Lord has taught me patience, so I kept on trying and finally got the page to load. Here is what it says:

COM SYS
Communication & Distributed Systems
RWTH Aachen University

Why am I receiving connection attempts from this machine?

These connections are part of an Internet-wide research study being conducted by computer scientists at RWTH Aachen University. The research involves making benign connection attempts to every public IP address. By measuring the entire public address space, we are able to analyze global patterns and trends in protocol deployment and security.

As part of this study, every public IP address receives a handful of packets per day on a selection of common ports. These consist of regular UDP probes and TCP connection attempts followed by RFC-compliant protocol handshakes with responsive hosts. We never attempt to exploit security problems, guess passwords, or change device configuration. We only receive data that is publicly visible to anyone who connects to a particular address and port.

Why are you collecting this data?

The data collected through these connections helps computer scientists study the deployment and configuration of network protocols and security technologies. For example, we use it to help web browser makers and other software developers understand the impact of proposed protocol changes and security improvements. In some cases, we are able to detect vulnerable systems and report the problems to the system operators.

Can I request that my server be excluded?

To have your host or network excluded from future scans conducted by RWTH Aachen University, please contact researchscan@comsys.rwth-aachen.de with your IP address or CIDR block. Alternatively, you can configure your firewall to drop traffic from the subnet we use for scanning: 137.226.113.0/26.

Ahhh, so that’s what they’re doing with the endless really long requests.

Of course, it’s up to you to believe or not believe whatever you read on the Internet. Personally, when it comes to security, I tend to be a little cynical. Not that I don’t think these guys are legit; it’s just that I don’t want people running malicious tests on my server. I work too hard to “trust” blindly just because some web page says so. Instead, I examine the facts and draw my conclusions after applying critical thinking skills and a healthy dose of wisdom.

Nuisance or malicious?

At this point, we know the following facts:

The bandit is hitting millions of servers with fabricated URI requests¹
Each URI request is suuuuuuper long, thus testing upper server limits²
Each request consumes energy, server memory, and other resources³
These long requests have been happening for over three years⁴

You make your own decision, but in my experience this behavior is a nuisance, unethical, and in fact malicious. Simply claiming “we are doing research” does NOT give you carte blanche powers to act unethically and leech precious resources from millions of servers for years at a time. Before I digress into a lengthy diatribe on ethical hacking, it’s time to put an end to this nonsense..

Stopping the bandit

For those who want to block the long-request bandit, there are several ways to go about it. Here are three ways to make it happen..

Block via IP Address

The first way to block the long requests is suggested by the bandit itself:

..you can configure your firewall to drop traffic from the subnet we use for scanning: 137.226.113.0/26

So yeah, let’s go ahead and do that. Open your site’s root .htaccess file (or implement via server config file), and add the following directives, depending on your version of Apache:

# Block long-request bandit - Apache 2.2
<IfModule !authz_core_module>
	Order Allow,Deny
	Allow from all
	Deny from 137.226.113.0/26
</IfModule>

# Block long-request bandit - Apache 2.4+
<IfModule authz_core_module>
	<RequireAll>
		Require all granted
		Require not ip 137.226.113.0/26
	</RequireAll>
</IfModule>

This blocks the bandit’s entire published range: a /26 covers the 64 addresses from 137.226.113.0 through 137.226.113.63. If we perform a quick whois lookup, we can verify that 137.226.113.7 is on the network belonging to RWTH Aachen University in Germany. So save the changes, upload to your server, and done.
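If you want to confirm that an address from your logs actually falls inside that subnet before you block it, the check is simple integer math. Here is a rough sketch; the helper names ip_to_int and in_cidr are made up for illustration, and it assumes well-formed IPv4 input:

```shell
# Pack a dotted-quad IPv4 address into a single integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1          # split the address on "." into four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside network $2 with prefix length $3.
in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "$2")
  mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

in_cidr 137.226.113.7 137.226.113.0 26 && echo "inside the blocked range"
in_cidr 137.226.113.64 137.226.113.0 26 || echo "outside the blocked range"
```

The second call shows where the /26 ends: 137.226.113.64 is the first address that the rules above do not cover.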

Block via Request URI

Another effective method of blocking the bandit’s crazy long requests is to use Apache’s RedirectMatch, for example something like this:

RedirectMatch 403 YesThisIsAReallyLongRequest
RedirectMatch 403 ScanningForResearchPurpose

Using two directives with different matching patterns is a little redundant, but it ensures that any variations of the ReallyLongRequestURL will be denied access. This method does not block query-string matches though, so for that we can use mod_rewrite..

Block via mod_rewrite

And just for giggles, here is how to break out the big guns and block the ReallyLongRequest string in both the request URI and the query string:

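<IfModule mod_rewrite.c>
    # Enable the rewrite engine (harmless if it is already on)
    RewriteEngine On
    # Matches every request method; narrow the pattern to target specific methods
    RewriteCond %{REQUEST_METHOD} .* [NC]
    # Match the signature strings in the raw request line or the query string
    RewriteCond %{THE_REQUEST}  (YesThisIsAReallyLongRequest|ScanningForResearchPurpose) [NC,OR]
    RewriteCond %{QUERY_STRING} (YesThisIsAReallyLongRequest|ScanningForResearchPurpose) [NC]
    # Respond with 403 Forbidden
    RewriteRule .* - [F,L]
</IfModule>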

That’s gonna do the job just fine. But if you have to choose one of these three methods, go with the first and just block the entire range of bandit IP addresses. The alternate methods are just for your information, and/or just in case blocking via IP is not possible in your environment. Also, you may want to check out more ways to block stuff with mod_rewrite.

Testing

To verify that the bandit’s long requests are in fact blocked, you can try requesting the super long request example provided above. Keep in mind that this only exercises the URI-based rules; the IP block applies to the bandit’s subnet, not to requests coming from your own machine. To test GET requests, you can use any browser. To test other request methods (like POST, HEAD, TRACE, TRACK, et al.), you can use cURL. For example, in Terminal you can run these simple commands:

$ curl -i -X GET https://example.com/
$ curl -i -X POST https://example.com/
$ curl -I https://example.com/
$ curl -i -X TRACK https://example.com/

Just replace example.com with your actual URL.
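To send the bandit-style request itself, you can build the path in the shell instead of pasting a kilobyte of text. The following is only a sketch: the function names long_path and check_block are made up for illustration, and the repetition count is yours to choose.

```shell
# Build a request path by repeating the bandit's message N times.
long_path() {
  msg='YesThisIsAReallyLongRequestURLbutWeAreDoingItOnPurposeWeAreScanningForResearchPurposePleaseHaveALookAtTheUserAgentTHX'
  out=''
  i=0
  while [ "$i" -lt "$1" ]; do
    out="$out$msg"
    i=$((i + 1))
  done
  printf '/%s' "$out"
}

# Print only the HTTP status code returned for the long request.
# $1 is the base URL without a trailing slash; $2 is the number of
# repetitions (defaults to 1).
check_block() {
  curl -s -o /dev/null -w '%{http_code}\n' "$1$(long_path "${2:-1}")"
}

# Example (replace example.com with your actual URL):
# check_block https://example.com
```

With either of the URI-based rules in place, the status code should come back as 403. A single repetition already contains both strings that the rules match, which keeps the test focused on the rules rather than on how the server handles oversized paths.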

Footnotes

¹ Based on their own claim: “every public IP address receives a handful of packets per day”
² As shown by the log entry example presented in this article
³ Millions of requests made every day for years at a time.. do the math
⁴ As evidenced via the Wayback Machine
