Examples of Nested Encoding

Posted by Jeff Starr in .htaccess, Security

Updated June 28, 2024

Typically, malicious scans use some sort of encoding to obscure their payloads. For example, instead of injecting a literal script, the attacker will run it through a PHP encoding function such as base64_encode(), utf8_encode(), or urlencode(). So if and when you need to decode some discovered payload, you can use whichever decoding function will do the job: base64_decode(), utf8_decode(), or urldecode(). Sounds straightforward, but let’s dig a little deeper.

Detection, translation

In order to detect a payload, you need to know what to look for. If the payload is not encoded, then you can grep for the usual hacky strings like “eval(”, “base64_”, and so forth. Obviously, unencoded payloads are simple to detect, which means that most malicious scripts are gonna take the time to encode, or obscure, the code beforehand.

So if an attacker encodes their payload one time, it’s still relatively straightforward to detect and translate the data. Searching a site for the single-encoded versions of the usual hacky strings requires a little more effort, but it’s still very doable. For example, instead of searching for eval(, you would search for eval%28, and so forth.
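To make the single-encoded search concrete, here is a minimal sketch in Python (not PHP, so treat it as an approximation of what urlencode() would produce for these strings). The TARGETS list and the single_encoded() helper are made up for illustration; only the two strings themselves come from the examples above.

```python
from urllib.parse import quote

# Hypothetical watch list; the two strings come from the examples above.
TARGETS = ["eval(", "base64_"]

def single_encoded(target: str) -> str:
    """Percent-encode everything except letters, digits, and _ . - ~"""
    return quote(target, safe="")

for target in TARGETS:
    print(target, "->", single_encoded(target))
# eval( -> eval%28
# base64_ -> base64_
```

Notice that base64_ comes out unchanged: URL encoding only touches characters outside the unreserved set, so purely alphanumeric strings need no extra search term at this level.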

Where things get interesting, and more challenging, is when an attacker encodes their payload multiple times. Looking at it from their perspective, if you want your 1337 code to remain undetected, why not encode it several times so it won’t be discovered in a typical investigation? And that’s precisely the strategy adopted by savvy attackers. Multiple-encoded payloads are easier to deliver and harder to discover.

Nested encoding example #1

To get a better idea of how nested encoding works, here is an example from a payload that was encoded multiple times and used in an actual XSS attack (the excerpt shown here takes three decoding passes to unwrap). Here is the original code that was discovered during investigation:

/%252Bresult:%252bchosen%252bnickname%252b%252522bubbae1%252522%25253b%252bsuccess%25253b

Obviously encoded. So, running this code through one pass of URL (percent) decoding, we get this:

/%2Bresult:%2bchosen%2bnickname%2b%2522bubbae1%2522%253b%2bsuccess%253b

Then decoding again:

/+result:+chosen+nickname+%22bubbae1%22%3b+success%3b

And finally a third time:

/+result:+chosen+nickname+"bubbae1";+success;

There it is. That’s just a small portion of the entire payload, but it illustrates the extent to which bad actors will go to deliver (and hide) their payloads.
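If you would rather not decode by hand, the passes above can be sketched as a loop that keeps percent-decoding until nothing changes. This is a rough Python sketch, not the decoder used for this post; peel_url_layers() and its max_passes cap are names I made up for the example. It uses unquote(), which leaves + characters alone, so the output matches the steps shown above.

```python
from urllib.parse import unquote

def peel_url_layers(value: str, max_passes: int = 10) -> list[str]:
    """Percent-decode repeatedly and return every intermediate form.

    Stops as soon as a pass changes nothing, or after max_passes
    (an arbitrary safety cap chosen for this sketch).
    """
    layers = [value]
    for _ in range(max_passes):
        decoded = unquote(layers[-1])
        if decoded == layers[-1]:
            break  # fully decoded, nothing left to peel
        layers.append(decoded)
    return layers

sample = ("/%252Bresult:%252bchosen%252bnickname%252b"
          "%252522bubbae1%252522%25253b%252bsuccess%25253b")

for depth, layer in enumerate(peel_url_layers(sample)):
    print(depth, layer)
```

The number of layers returned, minus one, tells you how many times the string was encoded, which is handy to know when deciding how deep your searches need to go.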

Nested encoding example #2

This example demonstrates why it’s necessary to search for multiple-encoded strings when investigating a hacked site. Let’s begin with one of the most common encoding methods used by attackers to obscure their code:

eval(base64_decode(

If this were included in an attack, it would be simple to detect, block, remove, whatever. But then what if it gets encoded one time via base64_encode()? That string would then look like this:

ZXZhbChiYXNlNjRfZGVjb2RlKA==

This string is going to be much harder to detect because it is basically just an alphanumeric string with some equal signs. But let’s say that you have a script that smartly searches for any occurrences of ZXZ. That only works for single-encoded scripts. To bypass detection, we can encode a second time:

WlhaaGJDaGlZWE5sTmpSZlpHVmpiMlJsS0E9PQ==

This code will remain hidden during investigations that only look for single-encoded known strings. In order to discover this payload, our grep script would need to look for something like Wlhaa. So let’s make the payload even more difficult to find by encoding it a third time:

V2xoYWFHSkRhR2xaV0U1c1RtcFNabHBIVm1waU1sSnNTMEU5UFE9PQ==

You can see where this is going: for each level of encoding, you need to search for another set of matching strings in order to find the payload. Doing the math, that’s three or more searches for each known target string. In my experience, most security scanners don’t dig that deep, which is something to keep in mind if you’re ever investigating a hacked website. Just for fun, let’s encode our example payload a fourth time:

VjJ4b1lXRkhTa1JoUjJ4YVYwVTFjMVJ0Y0ZOYWJIQklWbTF3YVUxc1NuTlRNRVU1VUZFOVBRPT0=

That there is our original eval(base64_decode( string encoded by base64_encode() four times. And that’s only one tiny portion of a typical payload. To see how this example works, you can decode it yourself using a free online base64 decoder tool; there are a million of them out there on the Internetz.
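Or, if you prefer to stay offline, here is a small Python sketch that peels the layers for you. It is a heuristic and rests on an assumption: it stops at the first layer that is not strictly valid base64 or does not decode to plain ASCII. The peel_base64_layers() name and the max_passes cap are my own, invented for this example.

```python
import base64
import binascii

def peel_base64_layers(value: str, max_passes: int = 10) -> list[str]:
    """Base64-decode repeatedly and return every intermediate form.

    Heuristic: stop when the current layer is not strictly valid base64,
    or when the decoded bytes are empty or not plain ASCII.
    """
    layers = [value]
    for _ in range(max_passes):
        try:
            raw = base64.b64decode(layers[-1], validate=True)
            text = raw.decode("ascii")
        except (binascii.Error, ValueError):
            break  # not another base64 layer
        if not text:
            break
        layers.append(text)
    return layers

quadruple = ("VjJ4b1lXRkhTa1JoUjJ4YVYwVTFjMVJ0Y0ZOYWJIQklWbTF3"
             "YVUxc1NuTlRNRVU1VUZFOVBRPT0=")

for depth, layer in enumerate(peel_base64_layers(quadruple)):
    print(depth, layer)
```

Fed the four-times-encoded string above, it returns five layers, with eval(base64_decode( as the last one. Real payloads often mix in other wrappers, so a loop like this is a starting point rather than a complete unpacker.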

Take-home lesson

If you are investigating a hacked site, expand your search to include known strings that may be encoded multiple times. Doing so will increase your workload, but it also will improve your accuracy and effectiveness.
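As a rough illustration of what expanding the search can look like, the Python sketch below builds one search string per encoding level for a known target. Function names and the default depth of 4 are arbitrary choices for this example. One assumption matters a lot: base64 works on 3-byte groups, so these needles only match when the target begins on a group boundary of the encoded data, for instance at the very start of the payload as in the example above. Each needle is also trimmed to whole groups, because the trailing characters depend on whatever follows the target.

```python
import base64

def nested_needles(target: str, depth: int = 4) -> list[str]:
    """Return one search string per base64 nesting level (1..depth)."""
    needles = []
    current = target
    for _ in range(depth):
        encoded = base64.b64encode(current.encode("ascii")).decode("ascii")
        # Keep only the output of complete 3-byte groups; the rest would
        # change depending on the bytes that come after the target.
        current = encoded[: (len(current) // 3) * 4]
        if not current:
            break  # target shorter than 3 bytes, no stable needle
        needles.append(current)
    return needles

def encoding_depth(text: str, target: str, depth: int = 4):
    """Return 0 for a literal hit, N for a hit at level N, else None."""
    if target in text:
        return 0
    for level, needle in enumerate(nested_needles(target, depth), start=1):
        if needle in text:
            return level
    return None

for level, needle in enumerate(nested_needles("eval(base64_decode("), 1):
    print(level, needle)
```

Running encoding_depth() against a file's contents then tells you not only that a known string is present, but how many layers deep it is buried.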

Likewise, if you develop software that detects exploits, payloads, and so forth, be mindful of nested encoding (or iterated encoding, or whatever you want to call it). It’s too easy to bypass certain site scanners by simply triple-encoding (or more) the payload.

I also recommend a good firewall to protect your site against encoded attacks. The 6G Blacklist is a lightweight .htaccess-based firewall that does an excellent job of stopping a wide range of malicious requests. Or if you are using WordPress, BBQ and BBQ Pro also provide strong protection against malicious attacks. As they say, an ounce of prevention is worth a pound of cure.

One response to “Examples of Nested Encoding”

Jim S Smith 2017/07/03 4:13 pm

Have seen a lot of:

eval(gzdecode(base64_decode( . . . )));

Have also seen a lot of custom encryption/decryption methods used too. One of the favorites seems to be “arcfour” (RC4).

Anyway,

Good practice should be: If you can NOT read the code normally, it should definitely be treated carefully, and with suspicion.

– Jim