Dynamic recompilation is certainly not a new concept, as the
number of emulators that successfully employ it shows, but many
beginners find it difficult to grasp and later to implement.
This tutorial is my attempt to lay down the basics of dynamic
recompilation so that anyone who reads it will come away with
a working understanding of how it operates.
Dynamic recompilation is the name for an
emulation technique in which programs written for a completely different
CPU architecture (from here on called the "target architecture")
are decoded and made to run on the current CPU architecture (from
here on called the "client architecture"). This is done
by decoding individual CPU opcodes for the target architecture CPU
and then translating each one into its equivalent opcode or set of
opcodes for the client architecture CPU. Of course it's a bit more
complex than this, but explaining the specifics is beyond
the scope of this tutorial. To get the most out of this
technique you need a solid understanding of assembly language,
so that you can choose the most efficient
representation for the client architecture.
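The decode-and-translate step described above can be sketched in a few lines. This is only an illustration, not part of the emitter built in this tutorial: the two-byte target instruction format and the names GuestOpcode, GUEST_LOAD_IMM, GUEST_ADD_IMM, GUEST_HALT and TranslateBlock are all hypothetical, and the target CPU's single accumulator is assumed to live in EAX on the client. The function only produces x86 bytes; it does not execute them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical target-architecture opcodes, invented for this sketch.
// Every target instruction is two bytes: an opcode and an 8-bit immediate.
enum GuestOpcode : std::uint8_t {
    GUEST_LOAD_IMM = 0x01, // accumulator = imm8
    GUEST_ADD_IMM  = 0x02, // accumulator += imm8
    GUEST_HALT     = 0xFF  // end of block (immediate ignored)
};

// Append a 32-bit value in little-endian byte order, as x86 expects.
static void EmitDword(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Translate one block of target code into x86 machine code, keeping the
// target accumulator in EAX. Returns the generated bytes.
std::vector<std::uint8_t> TranslateBlock(const std::vector<std::uint8_t>& guest) {
    assert(guest.size() % 2 == 0 && "target instructions are two bytes each");
    std::vector<std::uint8_t> host;
    for (std::size_t pc = 0; pc < guest.size(); pc += 2) {
        const std::uint8_t opcode = guest[pc];
        const std::uint8_t imm = guest[pc + 1];
        switch (opcode) {
        case GUEST_LOAD_IMM:      // mov eax, imm32  (B8 id)
            host.push_back(0xB8);
            EmitDword(host, imm);
            break;
        case GUEST_ADD_IMM:       // add eax, imm32  (05 id)
            host.push_back(0x05);
            EmitDword(host, imm);
            break;
        case GUEST_HALT:          // ret             (C3)
        default:                  // unknown opcode: end the block as well
            host.push_back(0xC3);
            return host;
        }
    }
    host.push_back(0xC3);         // a block must always end with ret
    return host;
}
```

Each target opcode maps to a fixed client byte sequence here; a real recompiler also has to deal with flags, memory access and branches, which is where most of the extra complexity comes from.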
First, a recompiler has three basic components,
which are as follows:
enum ASSEMBLER_HELP_VALUES
{
    SIB    = 4, ///< SIB Value Passed To SibSB
    DISP32 = 5  ///< DISP32 Value Passed To ModRM
};

typedef enum
{
    EAX = 0,
    EBX = 3,
    ECX = 1,
    EDX = 2,
    ESI = 6,
    EDI = 7,
    EBP = 5,
    ESP = 4
} X86RegisterType;

////////////////////////////////////////////////////////////////////////////////
class X86Emitter
{
public:
    ////////////////////////////////////////////////////////////////////////////
    // Constructors

    /** The default constructor. */
    X86Emitter(unsigned long bufferSize = 256);

    ////////////////////////////////////////////////////////////////////////////
    // Destructors

    /** The default destructor. */
    ~X86Emitter();

    ////////////////////////////////////////////////////////////////////////////
    // Block Execution

    /** This function will execute the block in memory. */
    void ExecuteBlock();

    ////////////////////////////////////////////////////////////////////////////
    // Move Opcode Emitters

    /** Move 16 bits of data from an x86 CPU register to memory. */
    void Mov16RtoM(unsigned int to, X86RegisterType from);

    /** Move 16 bits of data from memory into an x86 CPU register. */
    void Mov16MtoR(X86RegisterType to, unsigned int from);

    ////////////////////////////////////////////////////////////////////////////
    // Add Opcode Emitters

    /** Adds an immediate value to an x86 CPU register. */
    void Add32ItoR(X86RegisterType to, unsigned int from);

    ////////////////////////////////////////////////////////////////////////////
    // Return Opcode Emitter

    /** This opcode must end every recompiled block. */
    void Ret();

protected:
    ////////////////////////////////////////////////////////////////////////////
    // Emitter Functions

    /** Emits 8 bits of data to our code generation buffer. */
    void EmitByte(unsigned char byte);

    /** Emits 16 bits of data to our code generation buffer. */
    void EmitWord(unsigned short word);

    /** Emits 32 bits of data to our code generation buffer. */
    void EmitDword(unsigned long dword);

    ////////////////////////////////////////////////////////////////////////////
    // Emitter Modifier Functions

    /** Standard opcode emitter helper. Emits a ModR/M byte: mod selects the
        addressing mode, reg holds a register number or opcode extension, and
        rm holds a register number or an addressing form such as DISP32. */
    void ModRM(unsigned char mod, unsigned char reg, unsigned char rm);

    /** Standard opcode emitter helper. Emits a SIB byte. */
    void SibSB(unsigned char scale, unsigned char index, unsigned char base);

    ////////////////////////////////////////////////////////////////////////////
    // Variables

    unsigned char* mBuffer;        ///< The pointer to the allocated code generation buffer.
    unsigned long  mBufferSize;    ///< The size of the buffer in bytes.
    unsigned long  mBufferPointer; ///< The current index into the allocated buffer.
};
As you can no doubt tell, writing a dynamic recompiler, even a
simple one like the one we're creating, takes a lot of work. This
class is basic enough to let us write a value to a CPU register,
add to it, and then read it back, but even this requires a hefty
amount of code. In the constructor of our class we allocate our
code generation buffer. The default size for this buffer is 256
bytes, which should be enough to hold everything that we need for
our simple demo.
Our simple demo should result in output similar to the following:
Moving Value 14 To Register EAX...
Adding Value 14 To Register EAX...
Expected Result :: 28
Result Found :: 28
This is about as simple as dynamic recompilation programs will ever get.
This program is merely designed to test our dynamic recompiler implementation
and to ensure that we don't have any nasty errors hanging about
in the basics of our dynamic recompiler. The names of the functions
in our class should be familiar to you if you've met the prerequisite
of being familiar with assembly language, as they are named after very
standard assembly language instructions. For those who don't know, though,
the MOV opcode on x86 CPUs copies a value into one of the CPU registers
(e.g. EAX, EBX, ECX) or into memory, and the ADD opcode as used here adds
an immediate value to one of the CPU registers, an immediate value
being one that is encoded in the instruction itself rather than read
from memory or another register. I found all the implementation
details of these opcodes, including the emitted hex values, in two
documents (the NASM manual and the Intel P4 processor manuals).
Basically, implementing opcodes requires
research and accuracy, and it can often be a slow and somewhat boring
process. Since I'm writing this tutorial, though, I've done the hard
work for you, so here's the implementation of this simple emitter
class:
#include "X86Emitter.hpp"
#include <cstddef> // NULL

X86Emitter::X86Emitter(unsigned long bufferSize)
    : mBuffer( NULL ),
      mBufferSize( bufferSize ),
      mBufferPointer( 0 )
{
    mBuffer = new unsigned char[bufferSize];
}

////////////////////////////////////////////////////////////////////////////////
X86Emitter::~X86Emitter()
{
    if(mBuffer != NULL)
    {
        delete [] mBuffer;
        mBuffer = NULL;
    }
}

////////////////////////////////////////////////////////////////////////////////
void X86Emitter::ExecuteBlock()
{
    // This assumes a 32-bit x86 build and a buffer the operating system lets
    // us execute. Where heap memory is marked no-execute, the buffer has to
    // come from an executable mapping instead of new[].
    if(mBufferPointer > 0)
    {
        // Treat the start of the buffer as a function taking no arguments.
        void (*BlockFunction)() = (void(*)())mBuffer;
        BlockFunction();
    }
}

////////////////////////////////////////////////////////////////////////////////
void X86Emitter::Mov16RtoM(unsigned int to, X86RegisterType from)
{
    EmitByte( 0x66 );           // operand-size prefix: 16-bit operand
    EmitByte( 0x89 );           // mov r/m16, r16
    ModRM(0, from, DISP32);
    EmitDword( to );            // destination address
}

void X86Emitter::Mov16MtoR(X86RegisterType to, unsigned int from)
{
    EmitByte( 0x66 );           // operand-size prefix: 16-bit operand
    EmitByte( 0x8B );           // mov r16, r/m16
    ModRM(0, to, DISP32);
    EmitDword( from );          // source address
}

////////////////////////////////////////////////////////////////////////////////
void X86Emitter::Add32ItoR(X86RegisterType to, unsigned int from)
{
    if(to == EAX)
    {
        EmitByte( 0x05 );       // short form: add eax, imm32
    }
    else
    {
        EmitByte( 0x81 );       // add r/m32, imm32 (opcode extension /0)
        ModRM(3, 0, to);
    }
    EmitDword( from );
}

////////////////////////////////////////////////////////////////////////////////
void X86Emitter::Ret()
{
    EmitByte( 0xC3 );
}

////////////////////////////////////////////////////////////////////////////////
void X86Emitter::EmitByte(unsigned char byte)
{
    mBuffer[ mBufferPointer++ ] = byte;
}

void X86Emitter::EmitWord(unsigned short word)
{
    *((unsigned short*)(mBuffer + mBufferPointer)) = word;
    mBufferPointer += 2;
}

void X86Emitter::EmitDword(unsigned long dword)
{
    // unsigned long is 32 bits wide on the 32-bit targets assumed here.
    *((unsigned long*)(mBuffer + mBufferPointer)) = dword;
    mBufferPointer += 4;
}

////////////////////////////////////////////////////////////////////////////////
void X86Emitter::ModRM(unsigned char mod, unsigned char reg, unsigned char rm)
{
    // mod: bits 7-6, reg (or opcode extension): bits 5-3, r/m: bits 2-0.
    EmitByte((mod << 6) | (reg << 3) | (rm));
}

void X86Emitter::SibSB(unsigned char scale, unsigned char index, unsigned char base)
{
    // scale: bits 7-6, index: bits 5-3, base: bits 2-0.
    EmitByte((scale << 6) | (index << 3) | (base));
}
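To make the bit layout behind the ModRM helper concrete, here is a small stand-alone sketch. It assumes the standard x86 layout of the ModR/M byte (mod in bits 7-6, reg or opcode extension in bits 5-3, r/m in bits 2-0). The free function MakeModRM and the Reg enum are hypothetical names used only for this illustration; they are not part of the emitter class.

```cpp
#include <cassert>
#include <cstdint>

// Register numbers as the x86 instruction encoding defines them.
enum Reg : std::uint8_t { EAX = 0, ECX = 1, EDX = 2, EBX = 3 };

// Pack the three ModR/M fields into one byte:
//   bits 7-6: mod, bits 5-3: reg (or opcode extension), bits 2-0: r/m.
inline std::uint8_t MakeModRM(std::uint8_t mod, std::uint8_t reg, std::uint8_t rm) {
    assert(mod < 4 && reg < 8 && rm < 8); // each field must fit its bit width
    return static_cast<std::uint8_t>((mod << 6) | (reg << 3) | rm);
}

// Worked examples:
//   MakeModRM(0, ECX, 5) == 0x0D  reg = ECX, r/m = 5: a bare 32-bit address
//                                 follows (32-bit mode), as in mov cx, [addr]
//   MakeModRM(3, 0, EDX) == 0xC2  mod = 3: r/m names a register; reg = 0 is
//                                 the /0 extension of opcode 0x81, giving
//                                 add edx, imm32
```

Note that each field is three bits wide, which is why the middle field is shifted left by 3. Shifting it by 4 would still produce the right byte for EAX (register 0) but would corrupt the encoding for every other register.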
The test program that drives the emitter follows:

#include "X86Emitter.hpp"
#include <cstdlib>
#include <iostream>
using namespace std;

// The 16-bit variable that the generated code reads from and writes back to.
static unsigned short value = 14;

int main(int argc, char** argv)
{
    X86Emitter CodeEmitter;

    // This is the code to do the moving of the value and the adding.
    // The pointer casts assume a 32-bit build.
    cout<<"Moving Value 14 To Register EAX..."<<endl;
    CodeEmitter.Mov16MtoR(EAX, (unsigned int)&value);
    cout<<"Adding Value 14 To Register EAX..."<<endl;
    CodeEmitter.Add32ItoR(EAX, value);
    cout<<"Expected Result :: 28"<<endl;

    // Now we move the value back into our variable.
    CodeEmitter.Mov16RtoM((unsigned int)&value, EAX);
    CodeEmitter.Ret();

    // Now We Run Our Program And Check The Output.
    CodeEmitter.ExecuteBlock();
    cout<<"Result Found :: "<< value <<endl;
    return EXIT_SUCCESS;
}
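It can help to know exactly which bytes the test program above should place in the code generation buffer, since a wrong byte is the usual cause of a crash or a wrong result. The sketch below rebuilds the same four instructions into a std::vector and formats them as hex without executing anything. The names Bytes, BuildDemoBlock and ToHex are hypothetical, and the variable's address is a made-up 0x00403000; a real run will show whatever address the linker assigned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Append a 32-bit value in little-endian byte order.
static void PutDword(Bytes& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Build the sequence the demo emits when it uses EAX:
// load 16 bits, add an immediate, store 16 bits, return.
Bytes BuildDemoBlock(std::uint32_t valueAddress, std::uint32_t immediate) {
    Bytes code;
    // mov ax, [valueAddress]   66 8B /r with ModR/M 0x05 (reg = EAX, disp32)
    code.push_back(0x66); code.push_back(0x8B); code.push_back(0x05);
    PutDword(code, valueAddress);
    // add eax, immediate       05 id (short form for EAX)
    code.push_back(0x05);
    PutDword(code, immediate);
    // mov [valueAddress], ax   66 89 /r with ModR/M 0x05
    code.push_back(0x66); code.push_back(0x89); code.push_back(0x05);
    PutDword(code, valueAddress);
    // ret
    code.push_back(0xC3);
    assert(code.size() == 20); // 7 + 5 + 7 + 1 bytes
    return code;
}

// Format the bytes as space-separated upper-case hex pairs.
std::string ToHex(const Bytes& code) {
    static const char digits[] = "0123456789ABCDEF";
    std::string text;
    for (std::size_t i = 0; i < code.size(); ++i) {
        if (i != 0) text += ' ';
        text += digits[code[i] >> 4];
        text += digits[code[i] & 0x0F];
    }
    return text;
}
```

With the made-up address and an immediate of 14, ToHex(BuildDemoBlock(0x00403000, 14)) gives "66 8B 05 00 30 40 00 05 0E 00 00 00 66 89 05 00 30 40 00 C3". The whole block is 20 bytes, comfortably inside the default 256-byte buffer.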
If everything went well then we should get the output shown above. If
something went wrong then it's probably time to go back and debug.
Anyway, I hope that I have made the topic of dynamic recompilation
at least a little less scary through this tutorial, and hopefully
I have shown you that, while a recompiler is not simple to make, it isn't
necessarily that difficult either.
Credits: I give major credit for this document to GoldFinger, the author of GoldRec. Some of the source code in here is borrowed from his dynamic recompiler, which I think is an excellent one, and certainly something to look out for.