Tool to aid the creation of RegEx offset finders with AutoIt code
This tool converts code copied from OllyDbg into AutoIt string syntax, replacing each instruction's argument bytes with a wildcard of matching width (.{8} for four bytes, .{2} for one byte, and so on). It turns a dump like this:
CPU Disasm
Address Hex dump Command Comments
00604B30 /. 53 PUSH EBX
00604B31 |. 8B5C24 08 MOV EBX,DWORD PTR SS:[ARG.1]
00604B35 |. 56 PUSH ESI
00604B36 |. 8B7424 10 MOV ESI,DWORD PTR SS:[ARG.2]
00604B3A |. 57 PUSH EDI
00604B3B |. 56 PUSH ESI ; /Arg2 => [ARG.2]
00604B3C |. 53 PUSH EBX ; |Arg1 => [ARG.1]
00604B3D |. E8 EE221E00 CALL 007E6E30 ; \ElementClient.007E6E30
00604B42 |. 84C0 TEST AL,AL
00604B44 |. 74 24 JE SHORT 00604B6A
00604B46 |. 85F6 TEST ESI,ESI
00604B48 |. 74 18 JE SHORT 00604B62
00604B4A |. 8BCE MOV ECX,ESI
00604B4C |. E8 EFFD1E00 CALL 007F4940 ; [ElementClient.007F4940
00604B51 |. 84C0 TEST AL,AL
00604B53 |. 74 0D JE SHORT 00604B62
00604B55 |. 8B06 MOV EAX,DWORD PTR DS:[ESI]
00604B57 |. 6A 01 PUSH 1
00604B59 |. 6A 00 PUSH 0
00604B5B |. 6A 00 PUSH 0
00604B5D |. 8BCE MOV ECX,ESI
00604B5F |. FF50 1C CALL DWORD PTR DS:[EAX+1C]
Into something like this to put into an AutoIt regex search string:
Code:
$search &= '.*?' & _
'53' & _ ; PUSH EBX
'8B5C24.{2}' & _ ; MOV EBX,DWORD PTR SS:[ARG.1]
'56' & _ ; PUSH ESI
'8B7424.{2}' & _ ; MOV ESI,DWORD PTR SS:[ARG.2]
'57' & _ ; PUSH EDI
'56' & _ ; PUSH ESI ; /Arg2 => [ARG.2]
'53' & _ ; PUSH EBX ; |Arg1 => [ARG.1]
'E8.{8}' & _ ; CALL 007E6E30 ; \ElementClient.007E6E30
'84C0' & _ ; TEST AL,AL
'74.{2}' & _ ; JE SHORT 00604B6A
'85F6' & _ ; TEST ESI,ESI
'74.{2}' & _ ; JE SHORT 00604B62
'8BCE' & _ ; MOV ECX,ESI
'E8.{8}' & _ ; CALL 007F4940 ; [ElementClient.007F4940
'84C0' & _ ; TEST AL,AL
'74.{2}' & _ ; JE SHORT 00604B62
'8B06' & _ ; MOV EAX,DWORD PTR DS:[ESI]
'6A.{2}' & _ ; PUSH 1
'6A.{2}' & _ ; PUSH 0
'6A.{2}' & _ ; PUSH 0
'8BCE' & _ ; MOV ECX,ESI
'FF50.{2}' ; CALL DWORD PTR DS:[EAX+1C]
Tool isn't particularly fancy as I kind of just threw it together to suit my needs at the time, but it might be useful for anyone who wants to make similar offset finders.
Basically, all you need to do is open up OllyDbg, find the section of code you're interested in, and copy it to the clipboard.
Now run this little program, paste your code into the top box and hit the 'Apply' button. It will convert the dump into AutoIt syntax, with the argument bytes replaced by wildcards as described above.
If an instruction has two arguments, for example a 4-byte argument followed by a 1-byte argument, it merges them into a single wildcard, in this case .{10}.
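The width arithmetic is just two hex characters per argument byte. Here is a minimal sketch of it; `_Wildcard` is a made-up helper for illustration and is not a function in the tool itself:

```autoit
; Hypothetical helper (not part of the tool): builds the wildcard that stands
; in for an instruction's argument bytes. Each byte is two hex characters.
Func _Wildcard($iArg1Bytes, $iArg2Bytes = 0)
    Local $iHexChars = 2 * ($iArg1Bytes + $iArg2Bytes)
    If $iHexChars = 0 Then Return ''
    Return '.{' & $iHexChars & '}'
EndFunc

ConsoleWrite(_Wildcard(4) & @CRLF)    ; .{8}
ConsoleWrite(_Wildcard(1) & @CRLF)    ; .{2}
ConsoleWrite(_Wildcard(4, 1) & @CRLF) ; .{10}
```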
Once it's converted into AutoIt syntax, you can test the regex on your .exe by putting the full path and file name in the text entry box below the second code window and hitting the 'Test' button. It reports how many times the pattern matched, shows the addresses of up to the first five matches, and then lists every match in an array window.
Once you've found a decent pattern, just copy the text from the lower box and paste it into your AutoIt program.
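For reference, this is roughly how a pasted pattern might be used in your own script. It's only a sketch: the hex text in `$sData` is made up and stands in for the "0x..." string you get from a file read in binary mode (FileOpen flag 16), and all variable names are my own.

```autoit
; Made-up bytes for illustration; in a real script this would be the
; "0x..." text of the executable read in binary mode.
Global $sData = '0x9090538B5C240856E8EE221E0084C0'
Global $sSearch = '538B5C24.{2}56E8.{8}84C0'
Global $iOffset = -1

Global $aHits = StringRegExp($sData, $sSearch, 3)
If IsArray($aHits) Then
    ; StringInStr is 1-based and the text starts with "0x", so subtract 3
    ; before halving to turn a character position into a byte offset.
    $iOffset = Int((StringInStr($sData, $aHits[0]) - 3) / 2)
    ConsoleWrite('First match at file offset 0x' & Hex($iOffset, 4) & @CRLF)
EndIf
```

With the sample data above this prints a file offset of 0x0002, because the pattern starts two bytes in. Bear in mind the regex runs over hex text, so in principle a match can start halfway through a byte; longer patterns make that less likely.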
For anyone interested in the workings, it basically just uses a regular expression to extract the opcode, arguments and comments for each line of code:
Code:
^[0-9A-F]{8}[\^|\.\>\s\\\/\$]+(([0-9A-F]{2,8}):)?([0-9A-F]{2,8})?\s?([0-9A-F]{8})?([0-9A-F]{4})?([0-9A-F]{2})?[\s:]([0-9A-F]{8})?([0-9A-F]{4})?([0-9A-F]{2})?\s+(.+)
And saves all the components of the instruction to an array of matches. Then it uses some fairly simple logic to reconstruct a syntactically suitable string for AutoIt.
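To see which capture groups fill in, you can run the expression on a single line. This is a small sketch, assuming the copied line still has OllyDbg's column padding (more than one space before the mnemonic); the sample line and variable names are just for illustration.

```autoit
Global $sRegex = "^[0-9A-F]{8}[\^|\.\>\s\\\/\$]+(([0-9A-F]{2,8}):)?([0-9A-F]{2,8})?\s?([0-9A-F]{8})?([0-9A-F]{4})?([0-9A-F]{2})?[\s:]([0-9A-F]{8})?([0-9A-F]{4})?([0-9A-F]{2})?\s+(.+)"

; Sample line with column padding (assumed, see above)
Global $sLine = "00604B3D  |.  E8 EE221E00   CALL 007E6E30"

; Flag 2 returns the full match in [0] followed by the capture groups
Global $aM = StringRegExp($sLine, $sRegex, 2)
If IsArray($aM) Then
    ConsoleWrite('opcode:   ' & $aM[3] & @CRLF)   ; E8
    ConsoleWrite('32bit arg: ' & $aM[4] & @CRLF)  ; EE221E00
    ConsoleWrite('mnemonic: ' & $aM[10] & @CRLF)  ; CALL 007E6E30
    ; Keep the opcode, replace the argument with a wildcard of the same width
    ConsoleWrite('pattern:  ' & $aM[3] & '.{' & StringLen($aM[4]) & '}' & @CRLF) ; E8.{8}
EndIf
```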
As I said, it's not perfect: it might not be the exact formatting you want, and the comment tab spacing isn't great, but it definitely saved me a lot of time. Hopefully it'll save you some time too.
Possible future addition:
Enter the address you want an offset finder for, and the program could keep growing the search string one instruction at a time until only one match is returned, giving you the shortest pattern that is still unique.
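A rough sketch of how that could work is below. None of this exists in the tool yet: `_ShortestUniquePattern`, `$aPieces` and the sample data are all hypothetical, and it assumes you already have the per-instruction pattern fragments in order, starting at the target address.

```autoit
; Hypothetical sketch, not part of the tool.
; $sData:   hex text of the executable ("0x...")
; $aPieces: pattern fragments, one per instruction, in order
Func _ShortestUniquePattern($sData, $aPieces)
    Local $sPattern = '', $aHits
    For $i = 0 To UBound($aPieces) - 1
        $sPattern &= $aPieces[$i]
        $aHits = StringRegExp($sData, $sPattern, 3)
        ; Stop as soon as the pattern matches exactly once
        If IsArray($aHits) And UBound($aHits) = 1 Then Return $sPattern
    Next
    Return '' ; never became unique with the instructions supplied
EndFunc

; Made-up data: two sequences that start the same and differ at the third instruction
Global $sData = '0x538B5C24085690538B5C241057'
Global $aPieces[3] = ['53', '8B5C24.{2}', '56']
ConsoleWrite(_ShortestUniquePattern($sData, $aPieces) & @CRLF) ; 538B5C24.{2}56
```

Here '53' and '538B5C24.{2}' both match twice, so the loop only stops once the third fragment is added.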
Code:
#include <ButtonConstants.au3>
#include <GUIConstantsEx.au3>
#include <WindowsConstants.au3>
#include <EditConstants.au3>
#include <Array.au3>
Opt("GUIOnEventMode", 1) ;0=disabled, 1=OnEvent mode enabled
$hGui = GUICreate("Offset Finder maker by dumbfck", 600, 800)
GUISetBkColor(0x192127)
GUISetOnEvent(-3, "_Quit")
$btnApply = GUICtrlCreateButton("Apply", 150, 770, 100, 25)
GUICtrlSetOnEvent($btnApply, "btnApply_Click")
$btnTest = GUICtrlCreateButton("Test", 350, 770, 100, 25)
GUICtrlSetOnEvent($btnTest, "btnTest_Click")
$btnBrowse = GUICtrlCreateButton("Browse", 520, 743, 70, 22)
GUICtrlSetOnEvent($btnBrowse, "btnBrowse_Click")
$txtFileName = GUICtrlCreateInput("C:\Program Files\Perfect World Entertainment\Perfect World International\element\elementclient.exe", 10, 745, 500, 20)
GUISetFont (6, 300, 1,"courier")
$txtIn = GUICtrlCreateEdit("", 10, 30, 580, 350)
$txtOut = GUICtrlCreateEdit("", 10, 390, 580, 350)
GUISetState(@SW_SHOW, $hGui)
Global Static $testString
; Loop forever
While 1
Sleep(10)
WEnd
Func btnBrowse_Click()
$fileName = FileOpenDialog("Select .exe file to test on", "C:\Program Files\Perfect World Entertainment\Perfect World International\element\elementclient.exe", "Executables (*.exe)", 1 )
GUICtrlSetData($txtFileName, $fileName)
EndFunc
Func btnTest_Click()
$fileName = GUICtrlRead($txtFileName)
$file = FileOpen($fileName, 16)
$data = FileRead($file, FileGetSize($fileName))
FileClose($file)
If Not $data Then
MsgBox(0, "Error", "Could not read test file")
Return ; nothing to search, so stop here
EndIf
$offsets = StringRegExp($data, $testString, 3)
If Not IsArray($offsets) Then
MsgBox(16, 'Error', 'Could not find pattern')
Return ; stay in the GUI so another pattern can be tried
Else
$max = UBound($offsets)
if $max > 5 Then
$maxDisplayed = 5
Else
$maxDisplayed = $max
EndIf
$addressString = ""
for $i = 0 to $maxDisplayed-1
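; $data is searched as "0x..." text: hex digits start at character 3 and each byte is 2 characters
$funcAddr = 0x400000 + Int((StringInStr($data, $offsets[$i]) - 3) / 2)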
$addressString &= Hex($funcAddr) & @CRLF
Next
if $max > 5 Then
$addressString &= "... more matches not shown"
EndIf
MsgBox(0, 'Found', 'Pattern matched ' & $max & " times at addresses:" & @CRLF & $addressString)
_ArrayDisplay($offsets)
EndIf
EndFunc
Func btnApply_Click()
$inStr = GUICtrlRead($txtIn)
$newString = "$search &= '.*?' & _" & @CRLF
$testString = ''
$inArray = StringSplit(StringStripWS($inStr, 2), @CRLF)
for $i = 0 to UBound($inArray)-1
$outLine = " "
$testLine = ""
#cs
[0-9A-F]{8} 8 hex digits - The address field of the Olly disasm - This gets discarded
[\^|\.\>\s\\\/\$]+ one or more of the following: ^ | . > whitespace \ / $
These are the chars Olly uses to show loops / jumps etc - This gets discarded
(([0-9A-F]{2,8}):)? Optionally match 2-8 hex digits followed by a : ($matches[1])
Extra brackets are to match the value without the : at the end ($matches[2])
([0-9A-F]{2,8})? Optionally match another 2-8 hex digits - If there is no previous match containing a : this will be the full opcode
Otherwise, it will be the second part of the opcode after the : ($matches[3])
\s? Optional space after opcode - discard
([0-9A-F]{8})? if exists, matches if first argument after opcode is 8 digits ($matches[4])
([0-9A-F]{4})? if exists, matches if first argument after opcode is 4 digits ($matches[5])
([0-9A-F]{2})? if exists, matches if first argument after opcode is 2 digits ($matches[6])
[\s:] discard any whitespace or a : character
([0-9A-F]{8})? if exists, matches if second argument after opcode is 8 digits ($matches[7])
([0-9A-F]{4})? if exists, matches if second argument after opcode is 4 digits ($matches[8])
([0-9A-F]{2})? if exists, matches if second argument after opcode is 2 digits ($matches[9])
\s+ Discard one or more whitespace
(.+) Match everything else at the end - This is the mnemonic, e.g., MOV EBX,[ECX] ($matches[10])
matches[0] full match... don't do anything with it
matches[1] first part of opcode followed by and including ':' *always remove*
matches[2] first part of opcode if it contains a ':' *always show*
matches[3] full opcode or second part of opcode if it contains a ':' *always show*
matches[4] 32bit first argument - replace with .{8}
matches[5] 16bit first argument - replace with .{4}
matches[6] 8bit first argument - replace with .{2}
matches[7] 32bit second argument - replace with .{8}
matches[8] 16bit second argument - replace with .{4}
matches[9] 8bit second argument - replace with .{2}
matches[10] Comment - add after a ';'
#ce
$regex = "^[0-9A-F]{8}[\^|\.\>\s\\\/\$]+(([0-9A-F]{2,8}):)?([0-9A-F]{2,8})?\s?([0-9A-F]{8})?([0-9A-F]{4})?([0-9A-F]{2})?[\s:]([0-9A-F]{8})?([0-9A-F]{4})?([0-9A-F]{2})?\s+(.+)"
$matches = StringRegExp($inArray[$i], $regex, 2)
if IsArray($matches) Then
$testLine &= $matches[2] & $matches[3]
$outLine &= "'"
; Should only get matches in [4] OR [5] OR [6], should never get them in more than one of these.
If $matches[4] Then
if $matches[7] Then
$testLine &= '.{16}' ; 32 bit first arg + 32 bit 2nd arg = 64 bit (impossible?)
ElseIf $matches[8] Then
$testLine &= '.{12}' ; 32 bit first arg + 16 bit 2nd arg = 48 bit (impossible?)
ElseIf $matches[9] Then
$testLine &= '.{10}' ; 32 bit first arg + 8 bit 2nd arg = 40 bit
Else
$testLine &= '.{8}' ; else there is just one 32 bit argument
EndIf
EndIf
If $matches[5] Then
if $matches[7] Then
$testLine &= '.{12}' ; 16 bit first arg + 32 bit 2nd arg = 48 bit (impossible?)
ElseIf $matches[8] Then
$testLine &= '.{8}' ; 16 bit first arg + 16 bit 2nd arg = 32 bit (impossible?)
ElseIf $matches[9] Then
$testLine &= '.{6}' ; 16 bit first arg + 8 bit 2nd arg = 24 bit
Else
$testLine &= '.{4}' ; else there is just one 16 bit argument
EndIf
EndIf
If $matches[6] Then
if $matches[7] Then
$testLine &= '.{10}' ; 8 bit first arg + 32 bit 2nd arg = 40 bit (impossible?)
ElseIf $matches[8] Then
$testLine &= '.{6}' ; 8 bit first arg + 16 bit 2nd arg = 24 bit (impossible?)
ElseIf $matches[9] Then
$testLine &= '.{4}' ; 8 bit first arg + 8 bit 2nd arg = 16 bit
Else
$testLine &= '.{2}' ; else there is just one 8 bit argument
EndIf
EndIf
$testString &= $testLine
$outLine &= $testLine
if $i <> UBound($inArray)-1 Then
$outLine &= "' & _"
Else
$outLine &= "'"
EndIf
if StringLen($outLine) < 18 then
$outLine &= ' ' ; Add another tab to straighten out comments
EndIf
$outLine &= ' ; ' & $matches[10]
$newString &= $outLine & @CRLF
EndIf
Next
GUICtrlSetData($txtOut, $newString)
EndFunc
Func _Quit()
Exit
EndFunc