#1 |
|
Senior Member
Join Date: Aug 2004
Location: Ghent, Belgium
Posts: 1,056
|
Another assembly-related Code Spotlight. A few days ago I noticed that there aren't any simple public-domain x86 binary decoders. Such a decoder can be useful for a disassembler, a debugger, a code optimizer, or many low-level tricks, so I decided to write one myself. The whole theory behind the x86 instruction format can be found on the sandpile.org site. Instead of using big tables with the encoding format of every instruction, I decided to keep things compact and determine the format purely from the binary code. This is also how an x86 processor does it in hardware...
Code:
Enjoy! Nicolas 'Nick' Capens |
|
|
|
|
|
#2 |
|
New Member
Join Date: Jan 2003
Posts: 21
|
Nice piece of code. Did you miss the B0-BF instructions?
|
|
|
|
|
|
#3 |
|
Senior Member
Join Date: Aug 2004
Location: Ghent, Belgium
Posts: 1,056
|
That's correct. It was noted on flipCode.com as well, but I forgot to update it here. I'll do it right away...
|
|
|
|
|
|
#4 |
|
DevMaster Staff
Join Date: Sep 2003
Location: Hell
Posts: 1,109
|
I think you're forgetting to check for the following prefixes:
2Eh, 36h, 3Eh, 26h, 64h, 65h (the segment overrides)
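A minimal sketch of that prefix check, not the code from the first post. The function names `is_segment_override` and `skip_segment_overrides` are made up for illustration, and the sketch only covers the six bytes listed above; the other legacy prefixes (66h, 67h, F0h, F2h, F3h) would need the same treatment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the byte is one of the six segment-override prefixes. */
static int is_segment_override(uint8_t b)
{
    switch (b) {
    case 0x26: /* ES */
    case 0x2E: /* CS */
    case 0x36: /* SS */
    case 0x3E: /* DS */
    case 0x64: /* FS */
    case 0x65: /* GS */
        return 1;
    default:
        return 0;
    }
}

/* Hypothetical helper: counts the segment-override bytes at the start of
 * the buffer, so the caller can find the byte that follows them. */
static size_t skip_segment_overrides(const uint8_t *code, size_t size)
{
    size_t n = 0;
    assert(code != NULL);
    while (n < size && is_segment_override(code[n]))
        n++;
    return n;
}
```

The decoder would then continue at `code + skip_segment_overrides(code, size)` and add the skipped count to the instruction length.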
|
|
|
|
|
#5 |
|
DevMaster Staff
Join Date: Sep 2003
Location: Hell
Posts: 1,109
|
Also, the SIB byte is not present if the mod field of the ModR/M byte is 11 (register operand), so changing this:
if( (modRM & 0x07) == 0x04 ) func++;
to this:
if( (modRM & 0x07) == 0x04 && (modRM & 0xc0) != 0xc0 ) func++;
seems to have fixed that.
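To put that fix in context, here is a sketch of how the bytes following the opcode could be counted with 32-bit addressing. It is an illustration under that assumption (no 67h prefix, no 16-bit addressing), not Nick's code, and `modrm_tail_length` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the number of bytes taken by ModR/M + optional SIB + optional
 * displacement, assuming 32-bit addressing. p points at the ModR/M byte.
 * Returns 0 if the buffer is too short. */
static size_t modrm_tail_length(const uint8_t *p, size_t size)
{
    uint8_t mod, rm;
    size_t len = 1; /* the ModR/M byte itself */

    assert(p != NULL);
    if (size < 1)
        return 0;
    mod = (uint8_t)(p[0] >> 6);
    rm = (uint8_t)(p[0] & 0x07);

    if (mod != 3) {            /* mod == 11 means register operand: no SIB */
        if (rm == 4) {         /* SIB byte follows */
            if (size < 2)
                return 0;
            len++;
            if (mod == 0 && (p[1] & 0x07) == 5)
                len += 4;      /* SIB base 101 with mod 00: disp32 */
        } else if (mod == 0 && rm == 5) {
            len += 4;          /* mod 00, rm 101: disp32 only */
        }
        if (mod == 1)
            len += 1;          /* disp8 */
        else if (mod == 2)
            len += 4;          /* disp32 */
    }
    return len <= size ? len : 0;
}
```

With a ModR/M byte of C4h (mod 11, rm 100) this returns 1, which is exactly the case the original check got wrong.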
|
|
|
|
|
#6 |
|
Senior Member
Join Date: Aug 2004
Location: Ghent, Belgium
Posts: 1,056
|
Thanks a lot for reporting those bugs, bladder!
|
|
|
|
|
|
#7 |
|
New Member
Join Date: Oct 2005
Posts: 1
|
Is there a final version with the fixes?
I'd appreciate a link or post. Also, is there a verified x86 decoder out there? Or is this one pretty much bug-free after these posts? Thanks.
|
|
|
|
|
#8 |
|
New Member
Join Date: Feb 2007
Posts: 1
|
There are still some bugs in this code. An alternative decoder is available at http://hack-expo.void.ru/groups/blt/text/disasm.txt
|
|
|
|
|
|
#10 | |
|
Senior Member
Join Date: Aug 2004
Location: Ghent, Belgium
Posts: 1,056
|
Quote:
|
|
|
|
|
|
|
#11 | |
|
Senior Member
Join Date: Sep 2005
Location: .nl
Posts: 505
|
Quote:
Can you (or someone else) read Russian? It mentions LDE (it looks similar to it), but I'm curious what exactly the text says about it.
|
|
|
|
|
|
#12 |
|
Senior Member
Join Date: Jan 2003
Posts: 868
|
Basically, the author describes how he wrote the universal disassembler LDE and the numerous modifications of it for different tasks that were developed shortly after. He gives some examples and points out that he only had to check the length of the processed instructions, since everything else is already covered by the checks for EB, E8, E9, 7x, 0F 8x, and so on. He then says that in some cases one would want to know more than just the length, and a long passage follows explaining the disassembler internals.
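The opcodes named there are the relative jumps and calls. As a rough sketch of what such a check can look like (my own illustration, not taken from the Russian text; `rel_branch_target` is a made-up name, and 32-bit mode without a 66h prefix is assumed):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* If code[] starts with EB, E8, E9, 7x or 0F 8x, stores the branch target
 * in *target and returns 1. Returns 0 for anything else or if the buffer
 * is too short. address is the address of the first instruction byte. */
static int rel_branch_target(const uint8_t *code, size_t size,
                             uint32_t address, uint32_t *target)
{
    size_t opcode_len, disp_len;
    uint32_t disp;

    assert(code != NULL && target != NULL);
    if (size < 1)
        return 0;

    if (code[0] == 0xEB || (code[0] & 0xF0) == 0x70) {
        opcode_len = 1; disp_len = 1;   /* JMP rel8, Jcc rel8 */
    } else if (code[0] == 0xE8 || code[0] == 0xE9) {
        opcode_len = 1; disp_len = 4;   /* CALL rel32, JMP rel32 */
    } else if (code[0] == 0x0F && size >= 2 && (code[1] & 0xF0) == 0x80) {
        opcode_len = 2; disp_len = 4;   /* Jcc rel32 */
    } else {
        return 0;
    }
    if (size < opcode_len + disp_len)
        return 0;

    if (disp_len == 1) {
        disp = code[opcode_len];
        if (disp & 0x80)
            disp |= 0xFFFFFF00u;        /* sign-extend the 8-bit offset */
    } else {
        disp = (uint32_t)code[opcode_len]
             | ((uint32_t)code[opcode_len + 1] << 8)
             | ((uint32_t)code[opcode_len + 2] << 16)
             | ((uint32_t)code[opcode_len + 3] << 24);
    }
    /* The displacement is relative to the end of the instruction. */
    *target = address + (uint32_t)(opcode_len + disp_len) + disp;
    return 1;
}
```

Knowing the instruction length is enough here because the displacement is always the last field of these instructions.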
|
|
|
|
|
|
#13 |
|
Senior Member
Join Date: Sep 2005
Location: .nl
Posts: 505
|
great, thanks!
|
|
|
|
|
|
#14 |
|
New Member
Join Date: Aug 2007
Posts: 3
|
I'm really sorry to post in such an old thread, but can anyone explain to me how this decoder works? It seems to be magic, as there is nothing about it in the Intel manuals at all, and nothing that I could find online.
Thank you! |
|
|
|
|
|
#15 |
|
Senior Member
Join Date: Sep 2005
Location: .nl
Posts: 505
|
Since Nick, the author, is here often, he can probably give you the right answer, but in short: there are (obviously) some patterns in the encoding of x86 instructions, and this decoder makes smart use of them. When I wrote my own equivalent, somewhere in the previous century, I used information from
http://www.sandpile.org/
|
|
|
|
|
#16 |
|
New Member
Join Date: Aug 2007
Posts: 3
|
I've heard that there is a pattern, but I have been unable to find it myself by studying the opcode encodings. Also, I've used sandpile, but from what I found, they don't have anything on it either (just opcode maps and the like).
The thing I'm having trouble with is determining whether there is a ModR/M byte based on the opcode bits. Thanks again! Alboin
Last edited by Alboin : 08-31-2007 at 11:00 AM.
|
|
|
|
|
#17 | |
|
Senior Member
Join Date: Aug 2004
Location: Ghent, Belgium
Posts: 1,056
|
Quote:
It's most likely not optimal at all, but it gave me this reasonably compact result. A table-based approach might actually be more practical and faster if you need to do some more serious decoding.
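For readers wondering what "from the opcode bits" can look like, here is a sketch for the ModR/M question only. It is not the code from the first post: the conditions below were read off the one-byte opcode map for 32-bit mode, the name `has_modrm` is made up, and two-byte (0F xx) opcodes are not covered.

```c
#include <assert.h> /* only needed for the checks in the usage note */
#include <stdint.h>

/* Returns 1 if a one-byte opcode (32-bit mode) is followed by a ModR/M byte. */
static int has_modrm(uint8_t op)
{
    if (op < 0x40)
        return (op & 0x04) == 0;    /* 00-03, 08-0B, ..., 38-3B: ALU r/m forms */
    if (op == 0x62 || op == 0x63)   /* BOUND, ARPL */
        return 1;
    if (op == 0x69 || op == 0x6B)   /* IMUL with immediate */
        return 1;
    if (op >= 0x80 && op <= 0x8F)   /* groups, TEST, XCHG, MOV, LEA, POP r/m */
        return 1;
    if ((op & 0xFE) == 0xC0)        /* C0, C1: shifts by imm8 */
        return 1;
    if ((op & 0xFC) == 0xC4)        /* C4-C7: LES, LDS, MOV r/m, imm */
        return 1;
    if ((op & 0xFC) == 0xD0)        /* D0-D3: shifts by 1 / CL */
        return 1;
    if ((op & 0xF8) == 0xD8)        /* D8-DF: x87 escape opcodes */
        return 1;
    if ((op & 0xF6) == 0xF6)        /* F6, F7, FE, FF: unary groups */
        return 1;
    return 0;
}
```

For example, `assert(has_modrm(0x8B));` (MOV r32, r/m32) holds, while `assert(!has_modrm(0xB8));` (MOV EAX, imm32) shows that the B0-BF range mentioned earlier carries an immediate but no ModR/M byte.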
|
|
|
|
|
|
#18 | |
|
New Member
Join Date: Aug 2007
Posts: 3
|
Quote:
Thanks once again. Alboin |
|
|
|
|
|
|
#19 |
|
Senior Member
Join Date: Sep 2005
Location: .nl
Posts: 505
|
Sure, a table can be faster. Identifying an opcode then takes more or less a few mov reg32, [offset table+reg32*4]-like instructions, which can (and should) be faster than many bitwise operations and comparisons.
|
|
|
|
|
|
#20 |
|
New Member
Join Date: Dec 2007
Posts: 2
|
For speed, I'd suggest replacing large logical chains like
Code:
with a bitmap lookup table that could look like this (sorry the macros look horrible, but they make the tables most readable):
Code:
Note: on a 64-bit machine, when writing 64-bit code, you'd use 64-bit types for even more speed.
The advantages of using a lookup table like this should be pretty obvious. Making all table entries 32 bits wide allows aligned access (which a byte array would pay a severe penalty for lacking), while the whole table needs just 256 bits (32 bytes) and thus should be pretty cache-friendly under heavy random access. Note that a cache miss on a 256*4-byte table, or an alignment penalty on a 256-byte one, might be far more costly than the 4 simple ALU instructions needed to extract the corresponding bit from the table.
Tables are, however, unlikely to outperform short ALU chains; you'd have to test this in practice. Having many branches (compares) might hurt performance a lot, though.
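A sketch of such a 256-bit table, written without the macros from the post (which are not reproduced here). The table contents are my own reading of the one-byte opcode map for 32-bit mode and answer "does this opcode have a ModR/M byte?"; the names `modrm_bitmap` and `has_modrm_bitmap` are made up.

```c
#include <assert.h> /* only needed for the checks in the usage note */
#include <stdint.h>

/* One bit per one-byte opcode: bit (op & 31) of word (op >> 5) is set
 * when the opcode is followed by a ModR/M byte. 8 * 32 bits = 32 bytes. */
static const uint32_t modrm_bitmap[8] = {
    0x0F0F0F0Fu, /* 00-1F: 00-03, 08-0B, 10-13, 18-1B */
    0x0F0F0F0Fu, /* 20-3F: 20-23, 28-2B, 30-33, 38-3B */
    0x00000000u, /* 40-5F: INC/DEC/PUSH/POP reg, none */
    0x00000A0Cu, /* 60-7F: 62, 63, 69, 6B */
    0x0000FFFFu, /* 80-9F: 80-8F */
    0x00000000u, /* A0-BF: none */
    0xFF0F00F3u, /* C0-DF: C0-C1, C4-C7, D0-D3, D8-DF */
    0xC0C00000u  /* E0-FF: F6, F7, FE, FF */
};

/* Shift, mask, index, load: the "4 simple ALU instructions" per lookup. */
static int has_modrm_bitmap(uint8_t op)
{
    return (int)((modrm_bitmap[op >> 5] >> (op & 31)) & 1u);
}
```

For example, `assert(has_modrm_bitmap(0x8B));` and `assert(!has_modrm_bitmap(0x90));` both hold. A second table of the same shape would be needed for the two-byte 0F xx opcodes.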
|
|
|
|
|
#21 |
|
New Member
Join Date: Dec 2007
Posts: 2
|
A general problem you'll almost certainly run into with any disassembler or instruction length decoder is ambiguous code streams. Imagine a simple code stream where, for protection reasons, the programmer emitted something like this:
Code:
Now if you disassemble in a linear fashion and decode the jno, the disassembler will be desynchronized and will disassemble the following code incorrectly (the same goes for a decoder that only counts instruction lengths).
Why is this important? It can be of real practical importance if, for example, you want to analyze a remote function and thus need to know where it starts and ends. In my particular case, I wanted to relocate subfunctions of a program that might be protected in that way. To state the problem more concretely, trying to _automatically_ relocate code like
Code:
seems virtually impossible to me unless you do something fishy, such as relocating it to two different functions, where one has the relocated call and the other keeps the extended op and the follow-up code. This is more than impractical, though. Any suggestions?
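The listings from this post are not preserved, so the following is only a stand-in that shows the same kind of desynchronization. The byte sequence (a jno that jumps over a junk E8 byte), the toy length decoder, and all names are assumptions made for the demo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy length decoder: knows only the opcodes used in this demo.
 * Returns 0 for anything it does not know. */
static size_t toy_length(uint8_t op)
{
    if ((op & 0xF0) == 0x70) return 2;      /* Jcc rel8 */
    if (op == 0xE8) return 5;               /* CALL rel32 */
    if (op == 0x90 || op == 0xC3) return 1; /* NOP, RET */
    return 0;
}

/* Linear sweep: records the offset of every instruction it decodes and
 * returns how many offsets were stored in starts[]. */
static size_t linear_sweep(const uint8_t *code, size_t size,
                           size_t *starts, size_t max_starts)
{
    size_t offset = 0, count = 0;
    assert(code != NULL && starts != NULL);
    while (offset < size && count < max_starts) {
        size_t len = toy_length(code[offset]);
        if (len == 0)
            break;
        starts[count++] = offset;
        offset += len;
    }
    return count;
}

/* Returns 1 if offset was recorded as an instruction start. */
static int is_start(const size_t *starts, size_t count, size_t offset)
{
    size_t i;
    for (i = 0; i < count; i++)
        if (starts[i] == offset)
            return 1;
    return 0;
}
```

For the bytes `71 01 E8 90 90 90 90 C3`, the sweep reports instructions at offsets 0, 2 and 7: it decodes the junk E8 as a CALL and swallows the four NOPs as its operand. The jno at offset 0 actually targets offset 3, which the sweep never sees as an instruction start.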
|
|
|