We previously reported on a Microsoft Word vulnerability tracked as CVE-2012-0158; see [1]. In this post, prompted by a Virus Total result, we analyze a malware sample that exploits that vulnerability. Last month, the following email caught my attention:
Figure 1. E-mail in the inbox
This email came from an unknown address (**tuannguyen8@gmail.com) with a .DOC file attached (Lịch-các-ngày-lễ.doc). Because we suspected that the attachment carried a keylogger or some other malware, we uploaded it to Virus Total (https://www.virustotal.com) to check whether it was infected. As expected, 28 of 57 antivirus engines, including Kaspersky, Bitdefender, and ESET-NOD32, flagged it as a CVE-2012-0158 exploit. The exploit was credited to an author named Trần Duy Linh.
I started by working out how to find the shellcode contained in the .DOC file. I used Frank Boldewin’s OfficeMalscanner toolkit [2] to scan the file. The scan reported an OLE2 Compound Format document embedded in it.
Figure 2. Result returned by the OfficeMalscanner toolkit
We then scanned the embedded OLE2 file with the OfficeMalscanner toolkit:
Figure 3. Result of scanning the OLE file with the OfficeMalscanner toolkit.
The scan of the OLE file did not detect any malware, so we decided to look for the shellcode by hand.
We used 010 Editor [3] to examine the .DOC file. Because this file is not like the RTF file analyzed earlier, we searched for a NOP string (90 90 90 90), with which shellcode often starts. The search returned two offsets at which a NOP string began. The one at offset 0x6DD0 was particularly interesting.
Figure 4. Signs of shellcode in .DOC file.
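The same search can be scripted instead of done in a hex editor. The following is a minimal sketch, not the procedure we used: `find_byte_runs` is a hypothetical helper, the 4-byte minimum mirrors the `90 90 90 90` pattern above, and the buffer is synthetic rather than the real sample.

```python
def find_byte_runs(data, value=0x90, min_len=4):
    """Return (offset, length) for every run of `value` at least `min_len` bytes long."""
    runs = []
    i = 0
    n = len(data)
    while i < n:
        if data[i] == value:
            start = i
            # Consume the whole run so that each run is reported only once.
            while i < n and data[i] == value:
                i += 1
            if i - start >= min_len:
                runs.append((start, i - start))
        else:
            i += 1
    return runs


# Synthetic buffer standing in for the document bytes (not the real sample).
sample = b"\x00" * 8 + b"\x90" * 16 + b"\x60\xeb\x10" + b"\x90\x90" + b"\x41" * 4
for offset, length in find_byte_runs(sample):
    print(hex(offset), length)
```

On a real file, the bytes would come from `open(path, "rb").read()`. Long runs of 0x90 also occur in benign data, so each hit still needs a manual look.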
Just before the NOP sled, we noticed the 4-byte value 0x27583C30 (little-endian), the address of a JMP ESP instruction in MSCOMCTL.OCX on Windows XP SP3. The bytes right after the NOP sled matched the opcodes of familiar instructions (PUSHAD and JMP [offset]).
To confirm, we disassembled the bytes starting at offset 0x6E00 with an online disassembler [4]. The code began with PUSHAD and used its first 0x1F bytes to decode the following 0x167 bytes by XORing them with 0xCC.
Figure 5. The shellcode decodes itself by XOR with 0xCC
We extracted the 0x167 bytes starting at offset 0x6E1F into a .bin file and used FileInsight [5] to XOR them with 0xCC.
After loading the decoded bytes into IDA Pro and pressing the “C” key several times to convert them to code, we obtained the following:
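As a quick illustration of the byte order: a little-endian 32-bit value is stored least significant byte first, so the address 0x27583C30 appears in the file as the bytes 30 3C 58 27. A small sketch using Python's standard `struct` module:

```python
import struct

# The four bytes in the order they appear in the file.
raw = bytes.fromhex("303C5827")

# "<I" means: little-endian, unsigned 32-bit integer.
(address,) = struct.unpack("<I", raw)
print(hex(address))  # prints 0x27583c30
```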
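The same decoding step can be done with a few lines of script instead of a GUI tool. This is a minimal sketch: `xor_bytes` and `decode_region` are hypothetical helper names, and the demo buffer is synthetic. For the sample described here, the offset would be 0x6E1F and the size 0x167.

```python
def xor_bytes(data, key=0xCC):
    """XOR every byte of `data` with a single-byte key."""
    return bytes(b ^ key for b in data)


def decode_region(blob, offset, size, key=0xCC):
    """Cut `size` bytes out of `blob` at `offset` and XOR-decode them."""
    chunk = blob[offset:offset + size]
    if len(chunk) != size:
        raise ValueError("region extends past the end of the data")
    return xor_bytes(chunk, key)


# Synthetic demo: an encoded string embedded in padding (not the real sample).
plain = b"kernel32"
blob = b"\x00" * 4 + xor_bytes(plain) + b"\x00" * 4
print(decode_region(blob, 4, len(plain)))  # prints b'kernel32'
```

Because XOR with a fixed key is its own inverse, the same function both encodes and decodes.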
Figure 6. “kernel32” string stored in Stack.
From this, we could confirm the following about the shellcode:
- Start: offset 0x6E00 in the .DOC file.
- Size: 0x187 bytes.
- The first 0x1F bytes of the shellcode decode the rest by XOR with 0xCC.
Analyzing shellcode:
We then analyzed the shellcode more thoroughly. There were two likely ways to analyze it dynamically:
- Method 1: Extract the shellcode by hand, then either use a tool to convert it into an .exe file or write a program that jumps into it.
- Method 2: Change one byte in the .DOC file to 0xCC.
We chose the second method. We changed one of the 0x90 bytes (NOP) to 0xCC. We then debugged the .DOC file by loading Microsoft Office 2007 SP3 into IDA Pro on a virtual machine running Windows XP SP3. The debugger stopped at the byte that we had changed to 0xCC.
After setting a breakpoint on the RET instruction of the decoding routine, we continued debugging. The shellcode decoded itself with the XOR algorithm and started its main job.
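On x86, 0xCC is the one-byte INT 3 breakpoint instruction, which is why the debugger stops there. The patch itself can be sketched as follows. `patch_byte` is a hypothetical helper, and the buffer and offset in the demo are synthetic, not taken from the sample.

```python
def patch_byte(data, offset, expected, new):
    """Return a copy of `data` with the byte at `offset` replaced by `new`.

    Refuses to patch if the current byte is not `expected`, which guards
    against patching the wrong offset.
    """
    if data[offset] != expected:
        raise ValueError(
            "byte at 0x%X is 0x%02X, expected 0x%02X" % (offset, data[offset], expected)
        )
    patched = bytearray(data)
    patched[offset] = new
    return bytes(patched)


# Synthetic NOP sled followed by PUSHAD (0x60); replace one NOP with INT 3.
sled = b"\x90" * 8 + b"\x60"
patched = patch_byte(sled, 3, expected=0x90, new=0xCC)
print(patched.hex())
```

Working on a copy of the document keeps the original sample intact for later comparison.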
First, the shellcode parsed the PEB to get the base address of kernel32.dll.
After obtaining the base address of kernel32.dll, the shellcode used a decryption function to find the addresses of six APIs whose names were stored in encoded form, as follows:
The decryption function performed the following tasks:
- Locating the ENT (Export Name Table) of kernel32.dll.
- Walking through each exported API name and encoding it with the algorithm in the listing below.
- Returning the address of the function whose encoded name matches the input value.
This is the assembly code of the decryption function:
    Stack[00000650]:00121CDA decrypt_:                    ; CODE XREF: Stack[00000650]:00121D2Dp
    Stack[00000650]:00121CDA     pusha
    Stack[00000650]:00121CDB     mov ebp, [esp+24h]
    Stack[00000650]:00121CDF     mov eax, [ebp+3Ch]
    Stack[00000650]:00121CE2     mov edx, [ebp+eax+78h]
    Stack[00000650]:00121CE6     add edx, ebp
    Stack[00000650]:00121CE8     mov ecx, [edx+18h]       ; Get Number of function
    Stack[00000650]:00121CEB     mov ebx, [edx+20h]       ; Get Export Name Table (ENT)
    Stack[00000650]:00121CEE     add ebx, ebp
    Stack[00000650]:00121CF0
    Stack[00000650]:00121CF0 loc_121CF0:                  ; CODE XREF: Stack[00000650]:00121D0Dj
    Stack[00000650]:00121CF0     jecxz short loc_121D28
    Stack[00000650]:00121CF2     dec ecx
    Stack[00000650]:00121CF3     mov esi, [ebx+ecx*4]
    Stack[00000650]:00121CF6     add esi, ebp
    Stack[00000650]:00121CF8     xor edi, edi
    Stack[00000650]:00121CFA     xor eax, eax
    Stack[00000650]:00121CFC     cld
    Stack[00000650]:00121CFD
    Stack[00000650]:00121CFD loc_121CFD:                  ; CODE XREF: Stack[00000650]:00121D07j
    Stack[00000650]:00121CFD     lodsb
    Stack[00000650]:00121CFE     test al, al
    Stack[00000650]:00121D00     jz short loc_121D09
    Stack[00000650]:00121D02     rol edi, 13h
    Stack[00000650]:00121D05     add edi, eax
    Stack[00000650]:00121D07     jmp short loc_121CFD
    Stack[00000650]:00121D09 ; ---------------------------------------------------------------------------
    Stack[00000650]:00121D09
    Stack[00000650]:00121D09 loc_121D09:                  ; CODE XREF: Stack[00000650]:00121D00j
    Stack[00000650]:00121D09     cmp edi, [esp+28h]       ; Compare to cipher
    Stack[00000650]:00121D0D     jnz short loc_121CF0
    Stack[00000650]:00121D0F     mov ebx, [edx+24h]
    Stack[00000650]:00121D12     add ebx, ebp
    Stack[00000650]:00121D14     mov cx, [ebx+ecx*2]
    Stack[00000650]:00121D18     mov eax, edx
    Stack[00000650]:00121D1A     mov ebx, [eax+1Ch]
    Stack[00000650]:00121D1D     add ebx, ebp
    Stack[00000650]:00121D1F     mov eax, [ebx+ecx*4]
    Stack[00000650]:00121D22     add eax, ebp
    Stack[00000650]:00121D24     mov [esp+1Ch], eax
    Stack[00000650]:00121D28
    Stack[00000650]:00121D28 loc_121D28:                  ; CODE XREF: Stack[00000650]:loc_121CF0j
    Stack[00000650]:00121D28     popa
    Stack[00000650]:00121D29     retn
    Stack[00000650]:00121D2A ; ---------------------------------------------------------------------------
    Stack[00000650]:00121D2A
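Reading the listing, the inner loop (`rol edi, 13h` / `add edi, eax`) reduces each export name to a 32-bit value, which is then compared with the value passed on the stack. The names are therefore never stored or recovered in clear text. The sketch below reimplements that loop so that candidate API names can be matched against the values seen in the debugger. `rol32`, `name_hash`, and `resolve_by_hash` are hypothetical helper names, and the name list in the demo is illustrative, not a real export table.

```python
def rol32(value, count):
    """Rotate a 32-bit value left by `count` bits."""
    count %= 32
    value &= 0xFFFFFFFF
    return ((value << count) | (value >> (32 - count))) & 0xFFFFFFFF


def name_hash(name):
    """Mirror the inner loop: for each non-zero byte, rotate left 0x13, then add."""
    h = 0
    for byte in name:
        if byte == 0:  # lodsb / test al, al / jz: stop at the terminator
            break
        h = rol32(h, 0x13)
        h = (h + byte) & 0xFFFFFFFF
    return h


def resolve_by_hash(names, target):
    """Walk the name table from the last entry to the first, as the listing does."""
    for index in range(len(names) - 1, -1, -1):
        if name_hash(names[index]) == target:
            return index
    return None


# Illustrative name list (an assumption, not the real kernel32 export table).
names = [b"GetFileSize", b"ReadFile", b"GlobalAlloc"]
wanted = name_hash(b"ReadFile")
print(hex(wanted), resolve_by_hash(names, wanted))
```

In the shellcode, the index found this way is then used to look up the ordinal at `[edx+24h]` and finally the function address at `[edx+1Ch]`; the sketch stops at the index.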
We set a breakpoint right after the decryption function and kept tracing, which gave us the following six APIs in order:
- GetFileSize
- LoadLibrary
- SetFilePointer
- ReadFile
- GetModuleHandle
- GlobalAlloc
By encoding the API names and resolving them at run time, the shellcode made the analysis more difficult.
After obtaining these API addresses, the shellcode allocated memory and retrieved the handle of kernel32. We wondered why the author had called the decryption function repeatedly to get the API addresses before the shellcode allocated memory and read data from the .DOC file.
    debug016:00350157 loc_350157:                  ; CODE XREF: debug016:0035014Aj
    debug016:00350157     push 0
    debug016:00350159     push 0
    debug016:0035015B     push dword ptr [ebp-18h] ; offset 0x1A830
    debug016:0035015E     push dword ptr [ebp-4]   ; hFile
    debug016:00350161     call dword ptr [ebp-40h] ; call SetFilePointer
    debug016:00350164     push dword ptr [ebp-14h]
    debug016:00350167     push 40h
    debug016:00350169     call dword ptr [ebp-34h] ; call GlobalAlloc
    debug016:0035016C     mov [ebp-0Ch], eax       ; Allocate 7B2 bytes
    debug016:0035016F     push 0
    debug016:00350171     lea eax, [ebp-1Ch]
    debug016:00350174     push eax
    debug016:00350175     push dword ptr [ebp-14h] ; size=0x7B2
    debug016:00350178     push dword ptr [ebp-0Ch] ; Buffer
    debug016:0035017B     push dword ptr [ebp-4]   ; hFile
    debug016:0035017E     call dword ptr [ebp-3Ch] ; call ReadFile
    debug016:00350181     mov eax, [ebp-0Ch]
    debug016:00350184
    debug016:00350184 JMP_To_Dropper_:             ; Jump to dropper in .DOC
    debug016:00350184     jmp eax
During the analysis, we found that the shellcode executed a second piece of code stored in the .DOC file (we call it shellcode2): it copied 0x7B2 bytes from file offset 0x1A830 into the allocated buffer and then jumped straight into that buffer.
Because we ran into errors when debugging shellcode2 in IDA Pro, we extracted it with 010 Editor and, for convenience, named it the “Dropper”. We saved the Dropper's bytes into a .BIN file and started to analyze it.
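The extraction mirrors what the shellcode does with SetFilePointer and ReadFile: seek to a file offset, then read a fixed number of bytes. Below is a minimal sketch of that step. `read_region` is a hypothetical helper, the two constants are the values observed above, and the demo runs on an in-memory stand-in rather than the real document.

```python
import io

SECOND_STAGE_OFFSET = 0x1A830  # file offset passed to SetFilePointer
SECOND_STAGE_SIZE = 0x7B2      # byte count passed to ReadFile


def read_region(stream, offset, size):
    """Seek to `offset` and read exactly `size` bytes from a binary stream."""
    stream.seek(offset)
    data = stream.read(size)
    if len(data) != size:
        raise ValueError("stream is too short for the requested region")
    return data


# In-memory stand-in for the document: padding, a marker region, and a tail.
fake_doc = io.BytesIO(
    bytes(SECOND_STAGE_OFFSET) + b"\xAA" * SECOND_STAGE_SIZE + b"tail"
)
dropper = read_region(fake_doc, SECOND_STAGE_OFFSET, SECOND_STAGE_SIZE)
print(len(dropper))
```

With a real file, the stream would be `open(path, "rb")`, and the returned bytes could be written to a .BIN file for separate analysis.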