This was a small exercise in curiosity. I actually did this months ago, when I was first starting out in the world of reverse engineering (RE). I had no real use for reversing this file; it was more an effort to learn about the tools involved, specifically the hex editor. After using computers in a casual sense for so long, it felt very satisfying to take a peek under the hood and see how things were represented in a hex editor.
One of the first games I started messing around with was a free MMORPG for mobile phones and PC called Warspear Online. When you install this game, there are very few files in the game’s directory.
It seems as if most of the game’s files are stored inside one PAK file. Opening it up in a hex editor you can see some of the file names and types included in the PAK file.
So, obviously, the first file I was interested in was badwords.txt. I really needed to appease my curiosity and my childish sense of humor. I wanted to see what these bad words were.
When we separate the badwords.txt entry and its header from the rest, we’re left with these bytes.
After briefly analyzing some of the patterns displayed in this PAK file, I felt like I could make some safe assumptions about these bytes.
95 0A 1D 00 - An offset or address maybe?
B2 0A 00 00 - A size?
B8 2D 00 00 - A size?
01 - A flag of some type?
62 61 64 77 6F 72 64 73 2E 74 78 74 - File name "badwords.txt"
I can’t go to the address 0x950A1D00 because it doesn’t exist; my file isn’t that large. So I tried reversing the byte order, reading the value as little-endian, and went to the offset 0x1D0A95 instead.
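As a rough sketch, the byte-order reversal can be reproduced in Python with the standard `struct` module. The field layout (offset, two sizes, one flag byte) is only the guess made above, and the variable names are made up for illustration.

```python
import struct

# The 13 bytes that precede the file name, copied from the hex editor.
raw = bytes.fromhex("95 0A 1D 00 B2 0A 00 00 B8 2D 00 00 01")

# "<" = little-endian, "I" = unsigned 32-bit, "B" = unsigned 8-bit.
# Assumed layout: offset, size, size, flag (a guess, not a documented format).
offset, size_a, size_b, flag = struct.unpack("<IIIB", raw)

# For contrast, the first four bytes read as big-endian.
(big_endian_offset,) = struct.unpack(">I", raw[:4])

print(hex(big_endian_offset))  # 0x950a1d00, far larger than the file
print(hex(offset))             # 0x1d0a95, a plausible offset
print(size_a, size_b, flag)    # 2738 11704 1
```

Read this way, the two candidate sizes come out as 2,738 and 11,704 bytes.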
I believe I’m in the right place now. I had just learned about magic numbers and I was honestly a bit shocked that I had never heard the term, nor even considered the concept, before. I knew magic numbers would be in play here and fortunately I got a clue right off the bat.
You can see how the address we’ve found begins with the hex bytes 42 5A 68, or in ASCII, BZh. After staring at this for a bit I noticed that just above it, at the address 0x1D0981, we have the exact same hex bytes. Was this the magic number?
Next up I went to my most valuable tool, Google, and searched for “BZh magic number” with some very nice results. This is from the Wikipedia article on file signatures.
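Rather than spotting the signature by eye, the same search can be scripted. This is a minimal sketch, not something from the original exercise: `find_bz2_headers` is a hypothetical helper, and it relies on the bzip2 header layout of "BZh", a block-size digit from 1 to 9, and then the six-byte block marker 31 41 59 26 53 59. It will miss a stream that compresses empty input, since that has no block marker.

```python
import bz2

BLOCK_MAGIC = bytes.fromhex("314159265359")  # marker at the start of a bzip2 block


def find_bz2_headers(data: bytes) -> list:
    """Return offsets where a non-empty bzip2 stream appears to start."""
    hits = []
    pos = data.find(b"BZh")
    while pos != -1:
        level = data[pos + 3:pos + 4]          # block-size digit, '1'..'9'
        marker = data[pos + 4:pos + 10]        # should be the block marker
        if level.isdigit() and level != b"0" and marker == BLOCK_MAGIC:
            hits.append(pos)
        pos = data.find(b"BZh", pos + 1)
    return hits


# Demo on a synthetic buffer: padding, a real stream, a decoy, another stream.
first = bz2.compress(b"hello")
demo = b"\x00" * 10 + first + b"BZhx" + bz2.compress(b"world")
print([hex(h) for h in find_bz2_headers(demo)])
```

Run against the bytes of the real PAK file, a scan like this would be expected to report 0x1D0981 and 0x1D0A95 among its hits.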
Now that I knew which compression method was used, I went back to Google to find out how to decompress a .bz2 file.
Thankfully this is a popular compression method supported by most unpackers like WinZip. Now to get our bytes. I went ahead and tried my luck using the first group of hex bytes that I assumed might be related to a size value (B2 0A 00 00, or 0xAB2 once the byte order is reversed).
At this point I’m feeling really good about my assumptions so far. The bytes selected end exactly at another BZh magic number. That would have me believe that is the beginning of another compressed file, which means we have the correct number of bytes.
Next I copied the selected bytes, put them into their own file, and saved it with a .bz2 file extension. I opened the .bz2 file with 7-Zip and extracted it. I had to manually change the extension to .txt, but when I did I was greeted with some very bad words. I did it!
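The copy-and-extract step can also be done without leaving Python, using the standard `bz2` module. The sketch below is an assumption-laden illustration: `carve_bz2` is a made-up helper, and it treats the first size field as the compressed length, which is what the selection in the hex editor suggested. The demo runs on a synthetic archive rather than the real PAK file.

```python
import bz2


def carve_bz2(pak: bytes, offset: int, size: int) -> bytes:
    """Slice `size` bytes at `offset` and decompress them as bzip2."""
    chunk = pak[offset:offset + size]
    if not chunk.startswith(b"BZh"):
        raise ValueError("no bzip2 signature at this offset")
    return bz2.decompress(chunk)


# Synthetic stand-in for the PAK file: padding, then two compressed entries.
payload = b"foo\nbar\n"
entry = bz2.compress(payload)
fake_pak = b"\x00" * 32 + entry + bz2.compress(b"next file")

text = carve_bz2(fake_pak, 32, len(entry))
print(text.decode("ascii"))

# Same sanity check as in the hex editor: the slice ends where the next
# entry's BZh signature begins.
print(fake_pak[32 + len(entry):][:3])  # b'BZh'
```

For the real file, the equivalent call would pass the PAK's bytes with offset 0x1D0A95 and size 0xAB2, then write the result out as a .txt file.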
BadWords.txt
dick
cunt
fuck
nigg
bitch
slut
cock
vagina
whore
penis
fags
faggot
asshole
The file is filled with exactly what you might guess: bad words. The game is made by Russian developers, so of course the majority of the bad words are in Russian. My favorite part was at the bottom though.
They’ve added all the known websites of gold sellers and exploiters to their filter list in an attempt to prevent these people from advertising their services in-game.