Mafia DTA file format. Reversing the encryption and packing algorithms. Creating a DTA unpacker in C++.
Skills: Asm, C++, basics of OllyDbg and hex editing
Time: ….
Tools: OllyDbg, IDA (if you have IDA…), hex editor (e.g. 010 Editor)
Affected: Mafia, Chameleon, Hidden & Dangerous 2
Content.
- Prologue
- Preparation
- Part I. Basics
- Part II. Code tracing. Unpacking data
- Part III. Structure analyzing. Coding the dta unpacker
Prologue.
This document is meant to be a detailed guide that shows the basic aspects of understanding an unknown file format, reversing the encryption and packing algorithms inside an executable file, and creating an unpacker in C++.
Second note: we will base our analysis on Chameleon. I don’t know what this game actually is, but the executable file and the DLL are quite small, so I can provide them with this tutorial. Mafia and Hidden & Dangerous 2 use the same (or nearly the same) DLLs, so it doesn’t matter which game is used.
Preparation.
I assume you have already taken a quick look at our target: Chameleon’s setup files, which consist of a .dta archive, the executable module ChameleonSetup.exe and rw_data.dll (there are other files, but they are not interesting in our case).
Examining the .dta gives no result, except for the file header, which equals “ISD1”.
As we can see, all data inside the archive is packed or even encrypted, and without detailed reversing of the executable files we are unable to get anything valuable.
Part I. Basics
So, shall we begin?
From this point on I will explain the major ways (you can call them basic steps) of retrieving, catching and tracking the data that an application loads. These steps apply to practically any application, and once you get the idea of how it goes, you will be able to reuse them in the future.
Of course, if you already know how to set a breakpoint on a “CreateFile” call, how to search for strings and so on, just skip these steps and move straight to the sections below.
Open “ChameleonSetup.exe” in OllyDbg. Let’s try to find the functions that are responsible for working with the .dta archive. That means we should find code that initializes the dta, or anything where “.dta” appears.
Choose “Search for -> All referenced text strings” in the context menu,
then scroll the string list to the top and choose “Search for text” from the context menu.
Uncheck “Case sensitive” and press OK. Now just set breakpoints (BP) on each address that references “.dta”.
You can limit yourself to the following 4 BPs:
Code:
Address=00405749, Text string=UNICODE "\isdata.dta"
Address=00405764, Text string=UNICODE "isdata.dta"
Address=00405951, Text string=UNICODE "\isdata.dta"
Address=0040596C, Text string=UNICODE "isdata.dta"
But to be sure of success, we need additional BPs in kernel32 to catch all file I/O calls.
Using the command box, enter the following lines one by one:
- BP CreateFileW
- BP ReadFile
- BP SetFilePointer
Let me make a small remark about each function.
CreateFileW - Creates or opens an object, and returns a handle that can be used to access that object.
Code:
HANDLE CreateFileW (
    LPCWSTR filename,           //[In] pointer to filename to be accessed.
    DWORD access,               //[In] access mode requested.
    DWORD sharing,              //[In] share mode.
    LPSECURITY_ATTRIBUTES sa,   //[In] pointer to security attributes.
    DWORD creation,             //[In] how to create the file.
    DWORD attributes,           //[In] attributes for newly created file.
    HANDLE hTemplate            //[In] handle to file with extended attributes to copy.
);
ReadFile - Reads data from the specified file or input/output (I/O) device. Reads occur at the position specified by the file pointer if supported by the device.
Code:
BOOL WINAPI ReadFile(
    __in HANDLE hFile,
    __out LPVOID lpBuffer,
    __in DWORD nNumberOfBytesToRead,
    __out_opt LPDWORD lpNumberOfBytesRead,
    __inout_opt LPOVERLAPPED lpOverlapped
);
SetFilePointer - Moves the file pointer of the specified file.
Code:
DWORD WINAPI SetFilePointer(
    __in HANDLE hFile,
    __in LONG lDistanceToMove,
    __inout_opt PLONG lpDistanceToMoveHigh,
    __in DWORD dwMoveMethod
);
BPs on these functions give us ample opportunity to catch nearly everything, e.g. we can get the data right after it is read.
What we have at this moment: breakpoints on the code that manipulates the .dta archive names. Even if these BPs never trigger, we will definitely catch the file access routines thanks to the BPs on the system functions.
Note: searching for strings with archive names or the archive extension is a common way of identifying “archive functions”. It works in about 70% of cases. In the other 30% we need breakpoints on the system functions.
Run the application by pressing F9. We will immediately break on CreateFileW.
Code:
7C8107F0 > $ 8BFF mov edi, edi
7C8107F2 . 55 push ebp
7C8107F3 . 8BEC mov ebp, esp
7C8107F5 . 83EC 58 sub esp, 0x58
7C8107F8 . 8B45 18 mov eax, dword ptr [ebp+0x18]
7C8107FB . 48 dec eax
7C8107FC . 0F84 46FF0100 je 7C830748
On the stack we have all the parameters the function was called with.
Code:
0012D110 7C801A53 /CALL to CreateFileW from kernel32.7C801A4E
0012D114 7FFDFC00 |FileName = "Chameleon\ISdata.dta"
0012D118 80000000 |Access = GENERIC_READ
0012D11C 00000001 |ShareMode = FILE_SHARE_READ
0012D120 00000000 |pSecurity = NULL
0012D124 00000003 |Mode = OPEN_EXISTING
0012D128 10000080 |Attributes = NORMAL|RANDOM_ACCESS
0012D12C 00000000 \hTemplateFile = NULL
0012D130 003C06D8
0012D134 100021D7 RETURN to rw_data.100021D7 from kernel32.CreateFileA
Wonderful! The application tries to open “ISdata.dta”, and the call came from rw_data (RETURN to rw_data.100021D7 from kernel32.CreateFileA).
Click the line with address 0012D134 in the stack window and use the context menu to follow the return pointer in the disassembler (or simply press Enter):
I think we can remove the BP on kernel32’s CreateFileW and just set a BP on the CreateFileA call in rw_data.
Code:
100021AE |. 53 push ebx ; /hTemplateFile
100021AF |. 894D 08 mov dword ptr [ebp+0x8], ecx ; |
100021B2 |. 8B7D 0C mov edi, dword ptr [ebp+0xC] ; |
100021B5 |. 0BFA or edi, edx ; |
100021B7 |. 68 80000010 push 10000080 ; |Attributes = NORMAL|RANDOM_ACCESS
100021BC |. 6A 03 push 0x3 ; |Mode = OPEN_EXISTING
100021BE |. 897D 0C mov dword ptr [ebp+0xC], edi ; |
100021C1 |. 8BBC24 6C010000 mov edi, dword ptr [esp+0x16C] ; |
100021C8 |. 53 push ebx ; |pSecurity
100021C9 |. 6A 01 push 0x1 ; |ShareMode = FILE_SHARE_READ
100021CB |. 68 00000080 push 0x80000000 ; |Access = GENERIC_READ
100021D0 |. 57 push edi ; |FileName
100021D1 |. FF15 10000110 call dword ptr [<&KERNEL32.CreateFileA>] ; \CreateFileA
You can restart the application and see how it works (you will break again at 100021D1 when “ISdata.dta” is accessed).
Now trace with F8 until the ReadFile call at 10002205 (you can also remove the BP from kernel32.ReadFile, because we have already found it).
Code:
0012D140 00000070 |hFile = 00000070 (window)
0012D144 0012D168 |Buffer = 0012D168
0012D148 00000004 |BytesToRead = 4
0012D14C 0012D16C |pBytesRead = 0012D16C
0012D150 00000000 \pOverlapped = NULL
Follow the buffer address 0012D168 (Buffer = 0012D168) in the dump and make one more step with F8. ReadFile has now been called, and the dump shows the four bytes 49 53 44 31 (“ISD1”).
It’s logical to assume that if we read something, we should check it. Below we stumble on the checking routine (it begins at 10002246).
Code:
10002246 |> \8B4424 14 mov eax, dword ptr [esp+0x14] ; move first dword to eax
1000224A |. C745 20 FFFFFFFF mov dword ptr [ebp+0x20], -0x1
10002251 |. 3D 49534430 cmp eax, 0x30445349 ; compare with ISD0
10002256 |. 75 05 jnz short 1000225D
10002258 |. 895D 20 mov dword ptr [ebp+0x20], ebx ; if ISD0, mov 0
1000225B |. EB 0E jmp short 1000226B
1000225D |> 3D 49534431 cmp eax, 0x31445349 ; compare with ISD1
10002262 |. 75 07 jnz short 1000226B
10002264 |. C745 20 01000000 mov dword ptr [ebp+0x20], 0x1 ; if ISD1, mov 1
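The same check can be mirrored in C++. This is only a minimal sketch: the function names are made up for this example, and the return values simply follow what the listing stores to [ebp+0x20] (-1 by default, 0 for ISD0, 1 for ISD1).

```cpp
#include <cstdint>

// Hypothetical helper: builds a dword from four file bytes the same way
// an x86 "mov eax, dword ptr [...]" would (little-endian).
std::uint32_t loadLE32(const unsigned char* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: maps the first dword of the file to the value
// the listing writes to [ebp+0x20].
int dtaVersionFromSignature(std::uint32_t firstDword)
{
    if (firstDword == 0x30445349u) return 0;   // "ISD0"
    if (firstDword == 0x31445349u) return 1;   // "ISD1"
    return -1;                                 // anything else
}
```

Note that the bytes 49 53 44 31 in the file become the constant 0x31445349 once they are loaded as a little-endian dword, which is why the compare constants look reversed.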
We have ISD1… Now trace until the RET (at 100023E5) and leave this function. We are still in rw_data, more exactly in rw_data.dtaCreate. OK, leave rw_data.dtaCreate completely and trace Chameleon until 004052F6.
Code:
004052F2 . 33C7 xor eax, edi
004052F4 . 52 push edx
004052F5 . 50 push eax
004052F6 . FF53 0C call dword ptr [ebx+0xC]
Why this call? Of course, as far as possible we should check every call in order to know what happens in every routine, and you can do that manually, but only the instruction at 004052F6 leads us to useful content. You can press F7 to trace into 10002440, or press F8 if you want to break directly on the kernel32 function SetFilePointer.
The current function calls SetFilePointer and sets the file pointer to the second dword, then sets the pointer back to zero… Never mind, because we should pay attention to ReadFile (100025BB).
Code:
100025B5 . 57 push edi ; /pOverlapped
100025B6 . 51 push ecx ; |pBytesRead
100025B7 . 6A 18 push 0x18 ; |BytesToRead = 18 (24.)
100025B9 . 55 push ebp ; |Buffer
100025BA . 52 push edx ; |hFile
100025BB . FF15 08000110 call dword ptr [<&KERNEL32.ReadFile>] ; \ReadFile
This reads 0x18 bytes from the beginning of “ISData.dta” into the buffer. Stack:
Code:
0012D240 0000004C |hFile = 0000004C (window)
0012D244 003C08C0 |Buffer = 003C08D0
0012D248 00000018 |BytesToRead = 18 (24.)
0012D24C 0012D264 |pBytesRead = 0012D264
0012D250 00000000 \pOverlapped = NULL
The instruction at 100025C3
Code:
100025C3 . /75 32 jnz short 100025F7 ; data read
denotes that the data has been read successfully, and the next instruction
Code:
cmp dword ptr [esp+0x10], 0x18
checks how many bytes have been read (in our case 0x18 bytes).
OK, we are probably close to the decrypting/unpacking routine. What we have right now: a piece of code that pushes some values onto the stack and calls a function. Let’s examine it more closely.
Code:
10002630 > \8B4424 74 mov eax, dword ptr [esp+0x74] ; mov some dword1
10002634 . 8D75 04 lea esi, dword ptr [ebp+0x4] ; esi: file in buffer + 0x4
10002637 . B9 05000000 mov ecx, 0x5
1000263C . 8D7C24 3C lea edi, dword ptr [esp+0x3C]
10002640 . F3:A5 rep movs dword ptr es:[edi], dword ptr [esi] ; copy 0x14 bytes from second dword to the stack
10002642 . 8B4C24 70 mov ecx, dword ptr [esp+0x70] ; mov some dword2
10002646 . 50 push eax ; to stack: dword1
10002647 . 51 push ecx ; to stack: dword2
10002648 . 8D5424 44 lea edx, dword ptr [esp+0x44] ; get buffer address
1000264C . 6A 14 push 0x14 ; size
1000264E . 52 push edx ; to stack: buffer address
1000264F . E8 5C690000 call 10008FB0
Our buffer on the stack contains 0x14 bytes from “ISData.dta” (starting from the second dword).
Code:
0012D290 F6 DD 75 DE F2 44 DC DE 82 DD 75 DE D2 21 DC DE цЭuЮтDЬЮ‚ЭuЮТ!ЬЮ
0012D2A0 4B D5 75 DE KХuЮ
Interesting. Let’s trace into “call 10008FB0”. Inside we have
Code:
10008FB0 /$ 8B4C24 08 mov ecx, dword ptr [esp+0x8]
10008FB4 |. 55 push ebp
10008FB5 |. 8BC1 mov eax, ecx
10008FB7 |. 56 push esi
10008FB8 |. C1E8 03 shr eax, 0x3
10008FBB |. 57 push edi
10008FBC |. 8D14C5 00000000 lea edx, dword ptr [eax*8]
10008FC3 |. 2BCA sub ecx, edx
10008FC5 |. 895424 14 mov dword ptr [esp+0x14], edx
10008FC9 |. 8BE9 mov ebp, ecx
10008FCB |. 8BC8 mov ecx, eax
10008FCD |. 48 dec eax
10008FCE |. 85C9 test ecx, ecx
10008FD0 |. 74 36 je short 10009008
10008FD2 |. 8B5424 10 mov edx, dword ptr [esp+0x10]
10008FD6 |. 8B7C24 1C mov edi, dword ptr [esp+0x1C]
10008FDA |. 53 push ebx
10008FDB |. 8B5C24 1C mov ebx, dword ptr [esp+0x1C]
10008FDF |. 8D14C2 lea edx, dword ptr [edx+eax*8]
10008FE2 |. 8D70 01 lea esi, dword ptr [eax+0x1]
10008FE5 |> 8B02 /mov eax, dword ptr [edx]
10008FE7 |. 8B4A 04 |mov ecx, dword ptr [edx+0x4]
10008FEA |. F7D0 |not eax
10008FEC |. F7D1 |not ecx
10008FEE |. 33C3 |xor eax, ebx
10008FF0 |. 33CF |xor ecx, edi
10008FF2 |. F7D0 |not eax
10008FF4 |. F7D1 |not ecx
10008FF6 |. 8902 |mov dword ptr [edx], eax
10008FF8 |. 894A 04 |mov dword ptr [edx+0x4], ecx
10008FFB |. 4E |dec esi
10008FFC |. 83EA 08 |sub edx, 0x8
10008FFF |. 85F6 |test esi, esi
10009001 |.^ 77 E2 \ja short 10008FE5
10009003 |. 8B5424 18 mov edx, dword ptr [esp+0x18]
10009007 |. 5B pop ebx
10009008 |> 8B4424 10 mov eax, dword ptr [esp+0x10]
1000900C |. 8D0C02 lea ecx, dword ptr [edx+eax]
1000900F |. 8BD5 mov edx, ebp
10009011 |. 4D dec ebp
10009012 |. 85D2 test edx, edx
10009014 |. 74 21 je short 10009037
10009016 |. 8D7C24 18 lea edi, dword ptr [esp+0x18]
1000901A |. 8D0429 lea eax, dword ptr [ecx+ebp]
1000901D |. 2BF9 sub edi, ecx
1000901F |. 8D75 01 lea esi, dword ptr [ebp+0x1]
10009022 |> 8A08 /mov cl, byte ptr [eax]
10009024 |. F6D1 |not cl
10009026 |. 8808 |mov byte ptr [eax], cl
10009028 |. 8A1407 |mov dl, byte ptr [edi+eax]
1000902B |. 32D1 |xor dl, cl
1000902D |. 4E |dec esi
1000902E |. F6D2 |not dl
10009030 |. 8810 |mov byte ptr [eax], dl
10009032 |. 48 |dec eax
10009033 |. 85F6 |test esi, esi
10009035 |.^ 77 EB \ja short 10009022
10009037 |> 5F pop edi
10009038 |. 5E pop esi
10009039 |. 5D pop ebp
1000903A \. C2 1000 retn 0x10
Voila, a function with loops, XORs, NOTs… Also, this function has lots of local calls…
Now we should trace it and, on every loop iteration, check our incoming buffer with the 0x14 encrypted bytes.
The first loop takes the third and fourth dwords from the buffer, NOTs them, XORs them with the dwords DE75DDF2 and DEDC644B (which were passed to the function) and NOTs the result. We can assume that these strange dwords are keys: key1 and key2.
Code:
10008FE5 |> /8B02 /mov eax, dword ptr [edx] ; 3rd dword from buffer
10008FE7 |. |8B4A 04 |mov ecx, dword ptr [edx+0x4] ; 4th dword from buffer
10008FEA |. |F7D0 |not eax ; not dword3
10008FEC |. |F7D1 |not ecx ; not dword4
10008FEE |. |33C3 |xor eax, ebx ; xor (not dword3) with key1
10008FF0 |. |33CF |xor ecx, edi ; xor (not dword4) with key2
10008FF2 |. |F7D0 |not eax ; not(xor (not dword3) with key1)
10008FF4 |. |F7D1 |not ecx ; not(xor (not dword4) with key2)
10008FF6 |. |8902 |mov dword ptr [edx], eax ; write result: dword3
10008FF8 |. |894A 04 |mov dword ptr [edx+0x4], ecx ; write result: dword4
10008FFB |. |4E |dec esi ; decrease counter
10008FFC |. |83EA 08 |sub edx, 0x8
10008FFF |. |85F6 |test esi, esi
10009001 |.^\77 E2 \ja short 10008FE5
But our buffer isn’t completely decrypted yet.
The second loop in the current function
Code:
10009022 |> /8A08 /mov cl, byte ptr [eax] ; mov byte from the end of buffer
10009024 |. |F6D1 |not cl ; not byte
10009026 |. |8808 |mov byte ptr [eax], cl ; write it back
10009028 |. |8A1407 |mov dl, byte ptr [edi+eax] ; get byte from the key
1000902B |. |32D1 |xor dl, cl
1000902D |. |4E |dec esi
1000902E |. |F6D2 |not dl
10009030 |. |8810 |mov byte ptr [eax], dl
10009032 |. |48 |dec eax
10009033 |. |85F6 |test esi, esi
10009035 |.^\77 EB \ja short 10009022
does the same thing as the previous loop, but decrypts the data byte by byte.
So, at the end our buffer contains
Code:
0012D290 04 00 00 00 B9 20 00 00 70 00 00 00 99 45 00 00 ...№ ..p...™E..
0012D2A0 B9 08 00 00 № ..
As a result, decryption comes down to a few simple steps:
- get encrypted byte
- NOT encrypted byte
- XOR by key
- NOT result
- save result
C++ example:
Code:
unsigned char keys[8];
((DWORD*)keys)[0] = key2;   // key bytes for bytes 0..3 of every 8-byte group
((DWORD*)keys)[1] = key1;   // key bytes for bytes 4..7 of every 8-byte group
for (unsigned int i = 0; i < size; i++)
    data[i] = (unsigned char)(~((~data[i]) ^ keys[i % 8]));
From this point, we can fully describe the parameters of our call at 1000264F
Code:
10002646 . 50 push eax ; to stack: key1
10002647 . 51 push ecx ; to stack: key2
10002648 . 8D5424 44 lea edx, dword ptr [esp+0x44] ; get buffer address
1000264C . 6A 14 push 0x14 ; size
1000264E . 52 push edx ; to stack: buffer address
1000264F . E8 5C690000 call 10008FB0 ; decrypt
Wait a minute! One reasonable question: how do we get the decryption keys? Basically we can forget about them, because they do not change between Chameleon archives. The keys for each archive in Mafia and H&D2 can be picked from the stack before the decryption function is called. More details can be found in the appendix.
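As a sanity check, the steps and the two dwords from the trace can be applied to the 0x14 encrypted header bytes captured in the dump at 0012D290. The sketch below is self-contained; `keyA`, `keyB`, `dtaDecrypt` and `sampleEncryptedHeader` are names chosen only for this example, with `keyA` being the dword that lands on bytes 0..3 of every 8-byte group (DE75DDF2 in the trace) and `keyB` the one for bytes 4..7 (DEDC644B).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decrypts in place following the listed steps: NOT, XOR with key byte, NOT.
// keyA covers bytes 0..3 of every 8-byte group, keyB covers bytes 4..7
// (names are for this sketch only).
void dtaDecrypt(std::vector<std::uint8_t>& data, std::uint32_t keyA, std::uint32_t keyB)
{
    std::uint8_t keys[8];
    for (int i = 0; i < 4; ++i) {
        keys[i]     = static_cast<std::uint8_t>(keyA >> (8 * i));
        keys[i + 4] = static_cast<std::uint8_t>(keyB >> (8 * i));
    }
    for (std::size_t i = 0; i < data.size(); ++i) {
        std::uint8_t b = static_cast<std::uint8_t>(~data[i]);   // NOT encrypted byte
        b = static_cast<std::uint8_t>(b ^ keys[i % 8]);          // XOR by key
        data[i] = static_cast<std::uint8_t>(~b);                 // NOT result, save
    }
}

// The 0x14 encrypted header bytes from the dump at 0012D290.
std::vector<std::uint8_t> sampleEncryptedHeader()
{
    return { 0xF6, 0xDD, 0x75, 0xDE, 0xF2, 0x44, 0xDC, 0xDE,
             0x82, 0xDD, 0x75, 0xDE, 0xD2, 0x21, 0xDC, 0xDE,
             0x4B, 0xD5, 0x75, 0xDE };
}
```

Calling `dtaDecrypt(buf, 0xDE75DDF2, 0xDEDC644B)` on the sample gives the same bytes as the decrypted dump (04 00 00 00 B9 20 00 00 ...). Since NOT(NOT(x) XOR k) equals x XOR k, the two NOTs cancel out and the whole routine behaves like a plain XOR with an 8-byte key; the same function therefore also encrypts.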
Let’s get back to the subject.
After decryption we have only 20 decrypted bytes in the DTA header:
Code:
struct DtaHeader
{
    char signature[4]; // “ISD1”
    DWORD d1; // 04 00 00 00 - 4
    DWORD d2; // B9 20 00 00 - 8377
    DWORD d3; // 70 00 00 00 - 112
    DWORD d4; // 99 45 00 00 - 17817
    DWORD d5; // B9 08 00 00 - 2233
};
Tracing further down, we stop at another SetFilePointer at 10002892
Code:
1000288C . 6A 00 push 0x0 ; /Origin = FILE_BEGIN
1000288E . 6A 00 push 0x0 ; |pOffsetHi = NULL
10002890 . 51 push ecx ; |OffsetLo
10002891 . 52 push edx ; |hFile
10002892 . FF15 04000110 call dword ptr [<&KERNEL32.SetFilePoint>; \SetFilePointer
Stack
Code:
0012D244 00000070 |hFile = 00000070 (window)
0012D248 000020B9 |OffsetLo = 20B9 (8377.)
0012D24C 00000000 |pOffsetHi = NULL
0012D250 00000000 \Origin = FILE_BEGIN
The function moves the file pointer to 0x20B9 (dword d2 in the DTA header). Next, ReadFile at 100028B5 loads 0x70 bytes (dword d3 in the DTA header), starting from 0x20B9.
Stack
Code:
0012D240 00000070 |hFile = 00000070 (window)
0012D244 003C08D0 |Buffer = 003C08D0
0012D248 00000070 |BytesToRead = 70 (112.)
0012D24C 0012D264 |pBytesRead = 0012D264
0012D250 00000000 \pOverlapped = NULL
Switching to our hex editor, we notice that offset 0x20B9 + 0x70 leads us to the end of the dta. If you don’t want to trace in Olly and wait for the unpacked data again, you can use a 010 Editor script (or make your own small tool) and decrypt this block yourself.
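Such a small tool could start from something like the sketch below. It works on a .dta image already loaded into memory, decrypts the 0x14 header bytes, then cuts out and decrypts the block at d2 with size d3. Two things here are assumptions and not facts from the trace: that this block is decrypted with the same NOT/XOR/NOT routine and the same two keys, and that the key restarts at the first byte of the block. All names (`Bytes`, `readLE32`, `decryptBlock`, `extractFileTable`, `keyA`, `keyB`) are invented for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Little-endian dword at byte offset `off`.
std::uint32_t readLE32(const Bytes& b, std::size_t off)
{
    return  static_cast<std::uint32_t>(b[off])
         | (static_cast<std::uint32_t>(b[off + 1]) << 8)
         | (static_cast<std::uint32_t>(b[off + 2]) << 16)
         | (static_cast<std::uint32_t>(b[off + 3]) << 24);
}

// NOT / XOR / NOT over a block; keyA covers bytes 0..3 of every 8-byte
// group, keyB bytes 4..7. Assumption: the key restarts at the block start.
void decryptBlock(Bytes& block, std::uint32_t keyA, std::uint32_t keyB)
{
    for (std::size_t i = 0; i < block.size(); ++i) {
        const std::uint32_t key = (i % 8 < 4) ? keyA : keyB;
        const std::uint8_t  k   = static_cast<std::uint8_t>(key >> (8 * (i % 4)));
        const std::uint8_t  n   = static_cast<std::uint8_t>(~block[i]);
        block[i] = static_cast<std::uint8_t>(~(n ^ k));
    }
}

// Returns the decrypted block that the header points to (d2 = offset, d3 = size).
Bytes extractFileTable(const Bytes& file, std::uint32_t keyA, std::uint32_t keyB)
{
    if (file.size() < 0x18)
        throw std::runtime_error("file is shorter than the 0x18-byte header");

    Bytes header(file.begin() + 4, file.begin() + 0x18);   // skip the signature
    decryptBlock(header, keyA, keyB);

    const std::size_t tableOffset = readLE32(header, 4);    // d2
    const std::size_t tableSize   = readLE32(header, 8);    // d3
    if (tableOffset > file.size() || tableSize > file.size() - tableOffset)
        throw std::runtime_error("block lies outside the file");

    const Bytes::const_iterator first =
        file.begin() + static_cast<std::ptrdiff_t>(tableOffset);
    Bytes table(first, first + static_cast<std::ptrdiff_t>(tableSize));
    decryptBlock(table, keyA, keyB);
    return table;
}
```

For the Chameleon archive from this tutorial the call would be `extractFileTable(file, 0xDE75DDF2, 0xDEDC644B)`, which should return 0x70 bytes if the assumptions hold.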
That’s all for this function. After returning to Chameleon, we have the DTA header and the block with the file records (the file table).
As we can see, the dta header contains the following data:
Code:
struct DtaHeader
{
    char signature[4];
    DWORD numOfFiles; // Number of files in archive
    DWORD ftOffset;   // File table offset
    DWORD ftSize;     // File table size
    DWORD extra1;
    DWORD extra2;
};
Let’s try to identify something in the file table. Each entry has a fixed length of 28 bytes:
50 00 01 00 18 00 00 00 3E 00 00 00 50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
It’s clear that the last 16 bytes are reserved for the file name. By comparing the other 3 entries, we can say that the second word (bytes 2-3) is the file name length.
Code:
typedef struct
{
    ubyte unknown1;
    ubyte unknown2;
    WORD fileNameSize; //File name length
    DWORD unknown3;
    DWORD unknown4;
    char fileName[16];
} FileTableEntry;
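To make the layout concrete, here is a small sketch that parses one decrypted 28-byte entry field by field instead of casting the raw bytes to a struct. The names `ParsedEntry` and `parseEntry` are made up for this example, and the two dwords are still treated as unknown at this point.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical in-memory form of one 28-byte file table entry.
struct ParsedEntry {
    std::uint8_t  unknown1;
    std::uint8_t  unknown2;
    std::uint16_t fileNameSize;   // bytes 2-3, little-endian
    std::uint32_t unknown3;       // bytes 4-7
    std::uint32_t unknown4;       // bytes 8-11
    std::string   fileName;       // bytes 12-27, trailing zeros dropped
};

ParsedEntry parseEntry(const unsigned char (&raw)[28])
{
    ParsedEntry e;
    e.unknown1     = raw[0];
    e.unknown2     = raw[1];
    e.fileNameSize = static_cast<std::uint16_t>(raw[2] | (raw[3] << 8));
    e.unknown3     = static_cast<std::uint32_t>(raw[4])
                   | (static_cast<std::uint32_t>(raw[5]) << 8)
                   | (static_cast<std::uint32_t>(raw[6]) << 16)
                   | (static_cast<std::uint32_t>(raw[7]) << 24);
    e.unknown4     = static_cast<std::uint32_t>(raw[8])
                   | (static_cast<std::uint32_t>(raw[9]) << 8)
                   | (static_cast<std::uint32_t>(raw[10]) << 16)
                   | (static_cast<std::uint32_t>(raw[11]) << 24);

    std::size_t n = 0;                         // name length up to the first zero
    while (n < 16 && raw[12 + n] != 0) ++n;
    e.fileName.assign(reinterpret_cast<const char*>(raw + 12), n);
    return e;
}
```

Fed with the sample entry above (50 00 01 00 18 00 00 00 3E 00 00 00 50 00 ...), this gives a name length of 1 and the one-character name "P".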
Part II. Code tracing. Unpacking data
In Part I we finished with decryption and the dta file header, and recovered some of the internal structure of the archive.
Let’s continue.
Don’t forget that we are working with an installer, and after the needed data has been decrypted, the installer reads the files from the archive itself.
Code:
004052F4 . 52 push edx ; push key1
004052F5 . 50 push eax ; push key2
004052F6 . FF53 0C call dword ptr [ebx+0xC] ; read dta header and file list
004052F9 . 84C0 test al, al ; data decrypted
004052FB . 74 38 je short 00405335
004052FD . 6A 00 push 0x0
004052FF . 68 3CD74000 push 0040D73C ; ASCII "idata.txt"
00405304 . E8 DB390000 call <jmp.&rw_data.w_data.> ; rw_data.dtaOpen, open "idata.txt"
We jump into rw_data again (rw_data.dtaOpen). The first system function called is CreateFileA, with the following parameters
Code:
0012D028 0040D73C |FileName = "idata.txt"
0012D02C 80000000 |Access = GENERIC_READ
0012D030 00000001 |ShareMode = FILE_SHARE_READ
0012D034 00000000 |pSecurity = NULL
0012D038 00000003 |Mode = OPEN_EXISTING
0012D03C 00000080 |Attributes = NORMAL
0012D040 00000000 \hTemplateFile = NULL
If we get EAX = -1 after execution, "idata.txt" will be read from the archive, otherwise from the hard disk.
Trace down until 1000377C
Code:
10003774 |> \8B8C24 80020000 mov ecx, dword ptr [esp+0x280]
1000377B |. 51 push ecx ; /Arg1
1000377C |. E8 8F020000 call 10003A10 ; \rw_data.10003A10
and step inside call 10003A10. It’s a really long function… If everything is OK (there are some checks at the beginning, like “is the needed file located in the archive” and so on), you will come to another SetFilePointer call
Code:
10003CFC |. A1 A4BC0110 ||mov eax, dword ptr [0x1001BCA4]
10003D01 |. 53 ||push ebx ; /Origin
10003D02 |. 53 ||push ebx ; |pOffsetHi
10003D03 |. 8B8C07 18010000 ||mov ecx, dword ptr [edi+eax+0x118] ; |
10003D0A |. 8B1407 ||mov edx, dword ptr [edi+eax] ; |
10003D0D |. 8B4C0E 04 ||mov ecx, dword ptr [esi+ecx+0x4] ; |
10003D11 |. 51 ||push ecx ; |OffsetLo
10003D12 |. 52 ||push edx ; |hFile
10003D13 |. FF15 04000110 ||call dword ptr [<&KERNEL32.SetFilePointer>] ; \SetFilePointer
which will set the file pointer to 0x44 (68.)
Code:
0012CD00 00000070 |hFile = 00000070 (window)
0012CD04 00000044 |OffsetLo = 44 (68.)
0012CD08 00000000 |pOffsetHi = NULL
0012CD0C 00000000 \Origin = FILE_BEGIN
OK, why 0x44? Take a look at the record for "idata.txt", because this value was taken from the file table
Code:
91 02 09 00 44 00 00 00 72 00 00 00 49 44 41 54 41 2E 54 58 54 00 00 00 00 00 00 00
And we can define another value: the structure offset
Code:
typedef struct
{
    ubyte unknown1;
    ubyte unknown2;
    WORD fileNameSize;  //File name length
    DWORD structOffset; //structure offset, contains additional data
    DWORD unknown3;
    char fileName[16];
} FileTableEntry;
Step over until the ReadFile call at 10003E1E
Code:
10003E1E |. FF15 08000110 ||call dword ptr [<&KERNEL32.ReadFile>] ; \ReadFile
Stack
Code:
0012CCFC 00000070 |hFile = 00000070 (window)
0012CD00 0012CDC0 |Buffer = 0012CDC0
0012CD04 00000020 |BytesToRead = 20 (32.)
0012CD08 0012CD28 |pBytesRead = 0012CD28
0012CD0C 00000000 \pOverlapped = NULL
We see that 32 bytes will be read from offset 68 and then decrypted at 10003E77
Code:
10003E77 |. E8 34510000 ||call 10008FB0 ; decrypt
Now we have another structure, which adds some more information about the file "idata.txt" inside the archive.
Continue tracing until the ReadFile call at 10003F96
Code:
10003F96 |. FF15 08000110 ||call dword ptr [<&KERNEL32.ReadFile>] ; \ReadFile
If you are attentive, you will notice that the BytesToRead parameter is calculated a few instructions earlier, and that the new data is read from the current file pointer (right after the previous 32 bytes). For our "idata.txt" we read 9 bytes (the value comes from the latest file structure + 0x1C) and decrypt them at 1000401B
Code:
1000401B |. E8 904F0000 ||call 10008FB0 ; decrypt
We got the full file name.
After that, the new file name is converted to uppercase and compared with the file name from the first data block (the file table).
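In an unpacker this comparison could be sketched as below. It assumes plain ASCII names and a simple equality test; whether the original routine compares the whole name or only the part that fits into the 16-byte table field was not checked in the trace. Both function names are invented for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Uppercases an ASCII string, one character at a time.
std::string toUpperAscii(std::string s)
{
    for (std::size_t i = 0; i < s.size(); ++i)
        s[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
    return s;
}

// True when the full name from the per-file record matches the
// (already uppercase) name stored in the file table entry.
bool matchesTableName(const std::string& fullName, const std::string& tableName)
{
    return toUpperAscii(fullName) == tableName;
}
```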
A few instructions later we read one dword after the file name, and then another byte at 100043AB. At this point we can’t guess the purpose of these values.
And… maybe you will not believe it, but that was only preparation before unpacking.
We decrypted the file metadata (file name, size, etc.) and some other values, and now we are getting closer to the unpacking routine.
Let’s summarize our data.
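One way to write the summary down is as a tiny C++ sketch of what is read for each file, in the order seen in the trace: a 32-byte block at the structure offset, then the full file name, then one dword and one byte of unknown purpose. The constant and function names are invented for this example.

```cpp
#include <cstdint>

// Pieces read one after another for each file, as traced above.
const std::uint32_t kExtraBlockSize = 0x20;  // 32 bytes, decrypted at 10003E77
const std::uint32_t kTrailingDword  = 4;     // purpose unknown so far
const std::uint32_t kTrailingByte   = 1;     // purpose unknown so far, read at 100043AB

// Offset of the first byte after the per-file record.
std::uint32_t recordEnd(std::uint32_t structOffset, std::uint32_t fileNameSize)
{
    return structOffset + kExtraBlockSize + fileNameSize
         + kTrailingDword + kTrailingByte;
}
```

For "idata.txt" (structure offset 0x44, 9-byte name) this gives 0x44 + 0x20 + 9 + 4 + 1 = 0x72, which is exactly the dword that follows the structure offset in its file table entry (91 02 09 00 44 00 00 00 72 00 00 00 ...). The first sample entry (offset 0x18, 1-byte name) gives 0x3E in the same way.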
Now just trace and trace, and soon you will inevitably land in this function
Code:
00403D60 /$ 6A FF push -0x1 ; read and unpack
00403D62 |. 68 5B8F4000 push 00408F5B ; SE handler installation
00403D67 |. 64:A1 0000000>mov eax, dword ptr fs:[0]
00403D6D |. 50 push eax
00403D6E |. 64:8925 00000>mov dword ptr fs:[0], esp
00403D75 |. 81EC 24080000 sub esp, 0x824
00403D7B |. 53 push ebx
00403D7C |. 56 push esi
00403D7D |. 57 push edi
00403D7E |. 68 48D74000 push 0040D748 ; UNICODE "Chameleon.exe"
00403D83 |. 6A 15 push 0x15
00403D85 |. E8 E6FBFFFF call 00403970
00403D8A |. 83C4 08 add esp, 0x8
00403D8D |. 33FF xor edi, edi
00403D8F |. 57 push edi
00403D90 |. 68 3CD74000 push 0040D73C ; ASCII "idata.txt"
00403D95 |. E8 4A4F0000 call <jmp.&rw_data.w_data.> ; check dta
Here we again check and retrieve data from the dta. Then we get the 5th dword from the “file extra data” block (for "idata.txt" it’s 0x00002BB4)
Code:
00403DB7 |> \55 push ebp
00403DB8 |. 57 push edi
00403DB9 |. 6A 02 push 0x2
00403DBB |. 56 push esi
00403DBC |. E8 1D4F0000 call <jmp.&rw_data.w_data.>
Subtract 2 from the received value and allocate memory of that size (0x00002BB4 - 2)
Code:
00403DC1 |. 83EB 02 sub ebx, 0x2 ; unpacked size - 2
00403DC4 |. 53 push ebx ; /size
00403DC5 |. FF15 B8A34000 call dword ptr [<&MSVCRT.malloc>] ; \malloc
This value is definitely the unpacked file size.
And now we are going to the main unpacking routine at 00403DD3, passing it the buffer size, the buffer address and 1
Code:
00403DD0 |. 53 push ebx ; size
00403DD1 |. 55 push ebp ; buffer
00403DD2 |. 56 push esi
00403DD3 |. E8 004F0000 call <jmp.&rw_data.w_data.> ; read and unpack
We jump into rw_data.dtaRead. Bla-bla-bla, instruction after instruction… Continue tracing… We should stop at SetFilePointer (100053B3) and check the offset value
Code:
001249E4 00000070 |hFile = 00000070 (window)
001249E8 00000072 |OffsetLo = 72 (114.)
001249EC 00000000 |pOffsetHi = NULL
001249F0 00000000 \Origin = FILE_BEGIN
0x72… where did we see 0x72? It was in the first data block, the file table, in the "idata.txt" entry.
Code:
typedef struct
{
    ubyte b1;
    ubyte flags;
    WORD fileNameSize;  //File name length
    DWORD structOffset; //structure offset, contains additional data
    DWORD dataOffset;   //data offset
    char fileName[16];
} FileTableEntry;
Read the data from 0x72 onto the stack
Code:
10005434 |. 50 push eax ; |Buffer
10005435 |. 57 push edi ; |hFile
10005436 |. FF15 08000110 call dword ptr [<&KERNEL32.ReadFile>] ; \ReadFile
Now the function has to decide what to do with the data: whether it is encrypted, packed or something else.
If the data is encrypted, we decrypt it before unpacking
Code:
10005629 |> \8A4424 1F mov al, byte ptr [esp+0x1F] ; get flag
1000562D |. 84C0 test al, al
1000562F |. 74 21 je short 10005652 ; data was crypted
10005631 |. A1 6CBC0110 mov eax, dword ptr [0x1001BC6C]
10005636 |. 8B4C24 10 mov ecx, dword ptr [esp+0x10] ; compressed size
1000563A |. 8B5428 44 mov edx, dword ptr [eax+ebp+0x44] ; key2
1000563E |. 8B4428 40 mov eax, dword ptr [eax+ebp+0x40] ; key1
10005642 |. 52 push edx
10005643 |. 50 push eax
10005644 |. 8D9424 89000000 lea edx, dword ptr [esp+0x89]
1000564B |. 51 push ecx
1000564C |. 52 push edx
1000564D |. E8 EE390000 call 10009040 ; decrypt before unpacking
At this moment all the data is stored on the stack (the size of the “extra data” is 0x80 bytes)
and they are ready for unpacking.
The main unpacking loop begins at 100056F6. I think this is some kind of dictionary coder.
A dictionary coder, also sometimes known as a substitution coder, is a class of lossless data compression algorithms which operate by searching for matches between the text to be compressed and a set of strings contained in a data structure (called the 'dictionary') maintained by the encoder. When the encoder finds such a match, it substitutes a reference to the string's position in the data structure.
Maybe it is an LZ77 variation, but without question this algorithm works quite well.
Code:
100056F6 |> /8B4424 30 /mov eax, dword ptr [esp+0x30]
100056FA |> |8A5424 17 mov dl, byte ptr [esp+0x17]
100056FE |. |84D2 |test dl, dl
10005700 |. |75 23 |jnz short 10005725
10005702 |. |66:0FB68434 8000>|movzx ax, byte ptr [esp+esi+0x80]
1000570B |. |66:0FB69434 8100>|movzx dx, byte ptr [esp+esi+0x81]
10005714 |. |C1E0 08 |shl eax, 0x8
10005717 |. |03C2 |add eax, edx
10005719 |. |C64424 17 10 |mov byte ptr [esp+0x17], 0x10
1000571E |. |894424 30 |mov dword ptr [esp+0x30], eax
10005722 |. |83C6 02 |add esi, 0x2
10005725 |> |F6C4 80 |test ah, 0x80
10005728 |. |75 15 |jnz short 1000573F ; just copy bytes to the result buffer
1000572A |. |8B5424 38 |mov edx, dword ptr [esp+0x38]
1000572E |. |8A8434 80000000 |mov al, byte ptr [esp+esi+0x80] ; extra data + 0x80 + counter
10005735 |. |46 |inc esi ; inc counter
10005736 |. |880411 |mov byte ptr [ecx+edx], al ; write byte
10005739 |. |41 |inc ecx
1000573A |. |E9 CD000000 |jmp 1000580C
1000573F |> |8A8434 81000000 |mov al, byte ptr [esp+esi+0x81] ; extra data + 0x81 + counter
10005746 |. |33DB |xor ebx, ebx
10005748 |. |8A9C34 80000000 |mov bl, byte ptr [esp+esi+0x80] ; extra data + 0x80 + counter
1000574F |. |8BD0 |mov edx, eax
10005751 |. |81E2 FF000000 |and edx, 0xFF ; get only byte
10005757 |. |C1E3 04 |shl ebx, 0x4
1000575A |. |C1EA 04 |shr edx, 0x4
1000575D |. |03DA |add ebx, edx
1000575F |. |895C24 64 |mov dword ptr [esp+0x64], ebx
10005763 |. |75 55 |jnz short 100057BA
10005765 |. |66:0FB69434 8200>|movzx dx, byte ptr [esp+esi+0x82]
1000576E |. |66:0FB6C0 |movzx ax, al
10005772 |. |C1E0 08 |shl eax, 0x8
10005775 |. |8D5C02 0F |lea ebx, dword ptr [edx+eax+0xF]
10005779 |. |33D2 |xor edx, edx
1000577B |. |81E3 FFFF0000 |and ebx, 0xFFFF
10005781 |. |895C24 4C |mov dword ptr [esp+0x4C], ebx
10005785 |. |8D43 01 |lea eax, dword ptr [ebx+0x1]
10005788 |. |85C0 |test eax, eax
1000578A |. |7E 25 |jle short 100057B1
1000578C |. |33C0 |xor eax, eax
1000578E |> |8B7C24 38 |/mov edi, dword ptr [esp+0x38]
10005792 |. |8A9C34 83000000 ||mov bl, byte ptr [esp+esi+0x83]
10005799 |. |03C1 ||add eax, ecx
1000579B |. |42 ||inc edx
1000579C |. |881C38 ||mov byte ptr [eax+edi], bl
1000579F |. |8B5C24 4C ||mov ebx, dword ptr [esp+0x4C]
100057A3 |. |8BC2 ||mov eax, edx
100057A5 |. |25 FFFF0000 ||and eax, 0xFFFF
100057AA |. |8D7B 01 ||lea edi, dword ptr [ebx+0x1]
100057AD |. |3BC7 ||cmp eax, edi
100057AF |.^|7C DD |\jl short 1000578E
100057B1 |> |83C6 04 |add esi, 0x4
100057B4 |. |8D4C19 01 |lea ecx, dword ptr [ecx+ebx+0x1]
100057B8 |. |EB 52 |jmp short 1000580C
100057BA |> |24 0F |and al, 0xF
100057BC |. |33D2 |xor edx, edx
100057BE |. |66:0FB6C0 |movzx ax, al
100057C2 |. |83C0 02 |add eax, 0x2
100057C5 |. |25 FFFF0000 |and eax, 0xFFFF
100057CA |. |894424 4C |mov dword ptr [esp+0x4C], eax
100057CE |. |8D78 01 |lea edi, dword ptr [eax+0x1]
100057D1 |. |85FF |test edi, edi
100057D3 |. |7E 2C |jle short 10005801
100057D5 |. |33C0 |xor eax, eax
100057D7 |. |EB 04 |jmp short 100057DD
100057D9 |> |8B5C24 64 |/mov ebx, dword ptr [esp+0x64]
100057DD |> |8BE9 | mov ebp, ecx
100057DF |. |2BEB ||sub ebp, ebx
100057E1 |. |03E8 ||add ebp, eax
100057E3 |. |03C1 ||add eax, ecx ; inc counter
100057E5 |. |8BDD ||mov ebx, ebp
100057E7 |. |8B6C24 38 ||mov ebp, dword ptr [esp+0x38] ; outbuffer address
100057EB |. |42 ||inc edx
100057EC |. |8A1C2B ||mov bl, byte ptr [ebx+ebp]
100057EF |. |881C28 ||mov byte ptr [eax+ebp], bl ; write byte
100057F2 |. |8BC2 ||mov eax, edx
100057F4 |. |25 FFFF0000 ||and eax, 0xFFFF
100057F9 |. |3BC7 ||cmp eax, edi
100057FB |.^|7C DC |\jl short 100057D9
100057FD |. |8B4424 4C |mov eax, dword ptr [esp+0x4C]
10005801 |> |8B6C24 60 |mov ebp, dword ptr [esp+0x60]
10005805 |. |83C6 02 |add esi, 0x2
10005808 |. |8D4C01 01 |lea ecx, dword ptr [ecx+eax+0x1]
1000580C |> |8B7C24 30 |mov edi, dword ptr [esp+0x30]
10005810 |. |8A5424 17 |mov dl, byte ptr [esp+0x17]
10005814 |. |8B4424 54 |mov eax, dword ptr [esp+0x54] ; size + 1
10005818 |. |D1E7 |shl edi, 1
1000581A |. |FECA |dec dl
1000581C |. |3BF0 |cmp esi, eax
1000581E |. |897C24 30 |mov dword ptr [esp+0x30], edi
10005822 |. |885424 17 |mov byte ptr [esp+0x17], dl
10005826 |.^\0F82 CAFEFFFF \jb 100056F6
Finally, after the unpacking loops have finished, the function checks how many bytes have been unpacked and then copies everything into the output buffer.
Code:
100058E6 |. F3:A5 rep movs dword ptr es:[edi], dword ptr [esi] ; copy data from temp buffer to the normal
We have finished with direct code tracing, and from now on we will concentrate on coding the dta unpacker.
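As a first step in that direction, here is how the main loop at 100056F6 reads to me when written as C++. Treat it strictly as a sketch of my reading of the listing, not as a verified unpacker: it was not run against real archives, it starts at the first byte of the input (the original indexes a stack buffer at esp+0x80 and loops up to “size + 1”, so the real starting position may differ), and the “unpacked size - 2” adjustment is not modelled. The names `Bytes` and `unpackBlock` are invented.

The scheme as read from the listing: a 16-bit flag word (high byte first) controls the next 16 items, most significant bit first. A 0 bit means “copy one literal byte”. A 1 bit means a 2-byte pair with a 12-bit offset and a 4-bit length: offset 0 is a run of one repeated byte, any other offset copies (length + 3) bytes from that far back in the output.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Sketch of the loop at 100056F6 (see the assumptions in the text above).
Bytes unpackBlock(const Bytes& in)
{
    Bytes out;
    std::size_t pos = 0;
    std::uint16_t flags = 0;
    int bitsLeft = 0;

    while (pos < in.size()) {
        if (bitsLeft == 0) {                        // reload 16 flag bits
            if (pos + 2 > in.size()) throw std::runtime_error("truncated flag word");
            flags = static_cast<std::uint16_t>((in[pos] << 8) | in[pos + 1]);
            bitsLeft = 16;
            pos += 2;
            if (pos >= in.size()) break;
        }
        if ((flags & 0x8000) == 0) {                // literal byte
            out.push_back(in[pos++]);
        } else {
            if (pos + 2 > in.size()) throw std::runtime_error("truncated pair");
            const unsigned hi = in[pos];
            const unsigned lo = in[pos + 1];
            const std::size_t offset = (hi << 4) | (lo >> 4);
            if (offset == 0) {                      // run of one repeated byte
                if (pos + 4 > in.size()) throw std::runtime_error("truncated run");
                const std::size_t run = (((lo << 8) + in[pos + 2] + 0xF) & 0xFFFF) + 1;
                out.insert(out.end(), run, in[pos + 3]);
                pos += 4;
            } else {                                // copy from earlier output
                const std::size_t len = (lo & 0x0F) + 3;
                if (offset > out.size()) throw std::runtime_error("bad back reference");
                const std::size_t start = out.size() - offset;
                for (std::size_t i = 0; i < len; ++i) {
                    const std::uint8_t b = out[start + i];   // byte by byte, overlap allowed
                    out.push_back(b);
                }
                pos += 2;
            }
        }
        flags = static_cast<std::uint16_t>(flags << 1);
        --bitsLeft;
    }
    return out;
}
```

For example, the hand-made input 10 00 61 62 63 00 30 (three literals "abc", then a pair with offset 3 and length nibble 0) unpacks to "abcabc" with this sketch.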
P.S. Parts of this tutorial were written at different times. If you find mismatches or mistakes, let me know.
P.P.S. Wait for Part3 and Part4...