Revision as of 19:00, 21 February 2016
Time to create an overview of how you can make managed languages work on bare hardware.
Steps:
- define native ABI for Java
- create bytecode-to-native compiler (uses objectweb asm)
- compile compiler
- compile managed os to bytecode (uses regular javac)
- compile bytecode to native assembly
- create runtime for things that have to be non-native
- assemble os and runtime (uses yasm)
- package the final kernel binary (uses binutils)
Compiler
In bytecoded languages there are several steps before code can be run. Typically you have "the" compiler, which converts your source files into some portable binary, and an interpreter that reads those binaries and runs the instructions in them. Modern interpreters turn bytecode into native code to avoid the if (instruction == ...) dispatch, which takes several cycles even when the operation you actually want to execute would cost just one CPU instruction.
In the case of an OS, we need to take this a step further. We could run an interpreter, but that's slow. We could compile to native code on boot, but that needs just as much OS as we actually want to run. Instead, the appropriate solution is to compile to native code in advance, so we can run the code directly from the start.
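To make the dispatch cost concrete, here is a minimal sketch of an interpreter loop. The four-instruction stack machine and the name MiniInterpreter are made up for this illustration and have nothing to do with real Java bytecode numbering; the point is only that every instruction goes through a decode-and-branch step before the one operation that matters.

```java
// Hypothetical four-instruction stack machine, only to illustrate dispatch overhead.
final class MiniInterpreter {
    static final int PUSH = 0, ADD = 1, MUL = 2, HALT = 3;

    static int run(int[] code) {
        int[] stack = new int[16];
        int sp = 0; // number of values currently on the operand stack
        int pc = 0; // index of the next instruction
        while (true) {
            int instruction = code[pc++];
            // Every instruction pays for this decode-and-branch step
            // before the single add or multiply that does the real work.
            switch (instruction) {
                case PUSH: stack[sp++] = code[pc++]; break;
                case ADD: sp--; stack[sp - 1] += stack[sp]; break;
                case MUL: sp--; stack[sp - 1] *= stack[sp]; break;
                case HALT: return stack[sp - 1];
                default: throw new IllegalStateException("bad opcode " + instruction);
            }
        }
    }
}
```

Running the program PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT through run() computes (2 + 3) * 4. A native translation of the same program would need no loop and no switch at all.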
The entirety of a language is still a whole lot to deal with, but for our example it suffices to deal with just integers. That's right, no objects yet!
Java bytecode uses a stack for operations and a list of locals. These need not be in the same place, but as the x86 only has one hardware stack, we'll use it as local storage, call stack, and operand stack all at once. A few tricks make this compiler easier to write and, in turn, make the result difficult to interface with C. Locals in Java terms include the function arguments, so without further work the locals would be split around the return address. Since the caller doesn't know how much storage is needed (it doesn't even get the number of arguments for free), the called function has to fix this. We also can't put things past the top of the stack, because an interrupt would overwrite them later. So we copy all arguments to the other side of the return address and then make some room for locals, so that every slot can be indexed as EBP - 4 - 4 * slot_number, where for instance slots 0 and 1 would be arguments and slots 2 and up would be true locals.
Because the number of arguments is convoluted to obtain on the caller's side, we do callee cleanup using RET imm instead of the regular RET; the immediate is the number of argument bytes to drop. Sizing the locals is a convoluted issue as well, so we just reserve room for 8, because you're not meant to copy this code anyway.
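The addressing can be sketched as two small helpers. The class and method names are hypothetical, and the sketch assumes the caller pushed the arguments in Java order, so the first argument ends up highest in memory. The listing addresses slot n as [ebp - (4 + 4 * n)], because [ebp] itself holds the saved EBP.

```java
// Hypothetical helper that spells out the frame addressing described above.
final class FrameLayout {
    static final int SLOT_SIZE = 4; // every slot is one 32-bit word

    // Where the caller left argument `index` (0-based, Java order), relative to EBP.
    // [ebp] is the saved EBP and [ebp + 4] the return address, so arguments start at +8.
    // Assumes first-to-last pushing, which puts the first argument at the highest address.
    static String callerArgument(int index, int argumentCount) {
        return "[ebp + " + (8 + SLOT_SIZE * (argumentCount - 1 - index)) + "]";
    }

    // Where Java local slot `slot` lives after the copy, relative to EBP.
    static String localSlot(int slot) {
        return "[ebp - " + (SLOT_SIZE + SLOT_SIZE * slot) + "]";
    }
}
```

For a two-argument method, the first argument is read from [ebp + 12] and copied to [ebp - 4], the second goes from [ebp + 8] to [ebp - 8], and the first true local is slot 2 at [ebp - 12].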
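As a rough sketch of what this convention produces, the helper below emits the frame setup and teardown for a method with a given number of one-word arguments. FrameSketch and its methods are hypothetical names, and the argument order assumes the caller pushed first-to-last. Note that RET imm takes a byte count, so with 4-byte slots the immediate is four times the number of arguments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the callee-cleanup frame described above.
final class FrameSketch {
    static final int RESERVED_LOCALS = 8; // fixed room for true locals, as in the text

    static List<String> prologue(int argumentCount) {
        List<String> out = new ArrayList<>();
        out.add("push ebp");
        out.add("mov ebp, esp");
        // Copy the arguments below the saved EBP, first argument first,
        // so that Java slot 0 ends up at [ebp - 4].
        for (int i = argumentCount - 1; i >= 0; i--) {
            out.add("push dword [ebp + " + (8 + 4 * i) + "]");
        }
        out.add("sub esp, " + (4 * RESERVED_LOCALS));
        return out;
    }

    static List<String> epilogue(int argumentCount) {
        List<String> out = new ArrayList<>();
        out.add("leave"); // drops locals and operand stack, restores EBP
        out.add("ret " + (4 * argumentCount)); // callee cleanup: immediate is in bytes
        return out;
    }
}
```

The caller then only has to push its arguments and issue a call; it never needs to adjust ESP afterwards, which is what keeps the method call handling in the compiler so short.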
package nl.combuster.minijava;

import org.objectweb.asm.*;
import org.objectweb.asm.tree.*;
import java.util.*;
import java.io.*;

public class Compiler
{
    public static byte[] readEntireFile(String filename)
    {
        File file = new File(filename);
        // try-with-resources closes the stream even when reading fails
        try (DataInputStream input = new DataInputStream(new FileInputStream(file)))
        {
            byte bytes[] = new byte[(int)file.length()];
            // readFully keeps reading until the buffer is full; a single read() may stop early
            input.readFully(bytes);
            return bytes;
        }
        catch (IOException e)
        {
            throw new RuntimeException("Unable to read file " + file, e);
        }
    }

    public static void writeOutput(String filename, List<String> lines)
    {
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(new File(filename))))
        {
            for (String string : lines)
            {
                writer.write(string);
                writer.newLine();
            }
        }
        catch (IOException e)
        {
            throw new RuntimeException("Unable to write output file " + filename, e);
        }
    }

    public static String decorate(String classname, String method, String signature)
    {
        // make all these assembly-friendly names. Note that the
        // constructor is for instance called <init>
        return classname.replace("/","_") + "__" + method.replace("<","_").replace(">","_");
    }

    // compare the two topmost stack values and jump on the given condition
    public static void jumpgroup(List<String> output, String jumpcode, LabelNode dest)
    {
        output.add("pop edx"); // second operand
        output.add("pop ecx"); // first operand
        output.add("cmp ecx, edx");
        output.add(jumpcode + " .l" + dest.getLabel());
    }
    public static void main(String args[])
    {
        if (args.length != 2) throw new RuntimeException("Usage: compiler input-file output-file");

        ClassNode node = new ClassNode();
        ClassReader reader = new ClassReader(readEntireFile(args[0]));
        reader.accept(node, 0);

        List<String> outputdata = new LinkedList<String>();
        outputdata.add("section .text");

        for (MethodNode method : node.methods)
        {
            method.visitCode();
            String methodname = decorate(node.name, method.name, method.desc);
            if ((method.access & Opcodes.ACC_NATIVE) != 0) continue;

            // prologue
            System.out.println("; attributes: " + method.attrs);
            outputdata.add("global " + methodname);
            outputdata.add(methodname + ":");
            outputdata.add("push ebp");
            outputdata.add("mov ebp, esp");

            // localVariables is debug information and may be absent, so it is only reported
            int locals = (method.localVariables == null) ? 0 : method.localVariables.size();
            outputdata.add("; locals: + " + locals);

            // count arguments from the descriptor: method.parameters is only
            // filled in for classes compiled with javac -parameters
            int arguments = Type.getArgumentTypes(method.desc).length;
            if ((method.access & Opcodes.ACC_STATIC) == 0) arguments++; // hidden "this"
            outputdata.add("; params: + " + arguments);

            // copy params so that they correspond with java indexing and join with the local numbering.
            // The caller pushed the first argument first, so it sits at the highest address;
            // copy it first so that slot 0 lands at [ebp - 4]
            for (int i = arguments - 1; i >= 0; i--)
            {
                outputdata.add("push dword [ebp + " + (8 + 4 * i) + "]");
            }

            // no frame size checking: always reserve 8 slots (32 bytes) for true locals
            outputdata.add("sub esp, 32");

            Iterator<AbstractInsnNode> iterator = method.instructions.iterator();
            while (iterator.hasNext())
            {
                AbstractInsnNode insn = iterator.next();
                int opcode = insn.getOpcode() & 0xff;
                outputdata.add(" ; " + opcode + " = " + insn.getClass().getSimpleName());
                // debugging aid: show the current opcode in the corner of the VGA text screen
                //outputdata.add("mov byte [" + (0xb8000 + 156) + "], '0' + " + (opcode % 10));
                //outputdata.add("mov byte [" + (0xb8000 + 154) + "], '0' + " + (opcode / 10)%10);
                //outputdata.add("mov byte [" + (0xb8000 + 152) + "], '0' + " + (opcode / 100));
                if (insn instanceof LabelNode)
                {
                    LabelNode labelinsn = (LabelNode)insn;
                    outputdata.add(".l" + labelinsn.getLabel());
                }
                else if (insn instanceof LineNumberNode)
                {
                    // ignore these
                }
                else if (insn instanceof VarInsnNode)
                {
                    // copy a variable
                    VarInsnNode varinsn = (VarInsnNode) insn;
                    switch(varinsn.getOpcode())
                    {
                        case Opcodes.ILOAD:
                        case Opcodes.ALOAD:
                            // todo: verify offset
                            outputdata.add("push dword [ebp - " + (4 + 4 * varinsn.var) + "]");
                            break;
                        case Opcodes.ISTORE:
                        case Opcodes.ASTORE:
                            outputdata.add("pop dword [ebp - " + (4 + 4 * varinsn.var) + "]");
                            break;
                        default:
                            outputdata.add("Can't deal with varinsnnode: " + varinsn.getOpcode());
                    }
                }
                else if (insn instanceof MethodInsnNode)
                {
                    MethodInsnNode methodinsn = (MethodInsnNode) insn;
                    String calledmethod = decorate(methodinsn.owner, methodinsn.name, methodinsn.desc);
                    outputdata.add("extern " + calledmethod);
                    outputdata.add("call " + calledmethod);
                    if (!methodinsn.desc.endsWith("V"))
                    {
                        // not a void return value
                        outputdata.add("push eax");
                    }
                    /*switch (methodinsn.getOpcode())
                    {
                        default:
                            outputdata.add("Can't deal with methodcall: " + methodinsn.getOpcode());
                    }*/
                }
                else if (insn instanceof IntInsnNode)
                {
                    IntInsnNode intinsn = (IntInsnNode)insn;
                    switch(intinsn.getOpcode())
                    {
                        case Opcodes.BIPUSH:
                            outputdata.add("push " + intinsn.operand);
                            break;
                        default:
                            outputdata.add("Can't deal with intinsnnode: " + intinsn.getOpcode());
                    }
                }
                else if (insn instanceof JumpInsnNode)
                {
                    JumpInsnNode jmpinsn = (JumpInsnNode)insn;
                    switch(jmpinsn.getOpcode())
                    {
                        // Java ints are signed, so use the signed conditions (jl/jge/jg/jle)
                        case Opcodes.IF_ICMPEQ: jumpgroup(outputdata, "je", jmpinsn.label); break; // 159
                        case Opcodes.IF_ICMPNE: jumpgroup(outputdata, "jne", jmpinsn.label); break; // 160
                        case Opcodes.IF_ICMPLT: jumpgroup(outputdata, "jl", jmpinsn.label); break; // 161
                        case Opcodes.IF_ICMPGE: jumpgroup(outputdata, "jge", jmpinsn.label); break; // 162
                        case Opcodes.IF_ICMPGT: jumpgroup(outputdata, "jg", jmpinsn.label); break; // 163
                        case Opcodes.IF_ICMPLE: jumpgroup(outputdata, "jle", jmpinsn.label); break; // 164
                        case Opcodes.GOTO: // 167
                            outputdata.add("jmp .l" + jmpinsn.label.getLabel());
                            break;
                        default:
                            outputdata.add("Can't deal with jumpinsnnode: " + jmpinsn.getOpcode());
                    }
                }
                else if (insn instanceof LdcInsnNode)
                {
                    LdcInsnNode ldcinsn = (LdcInsnNode) insn;
                    if (ldcinsn.cst instanceof Integer)
                    {
                        outputdata.add("push " + ldcinsn.cst.toString());
                    }
                    else
                    {
                        outputdata.add("Can't deal with data in ldcinsnnode (" + ldcinsn.getOpcode() +"): " + ldcinsn.cst.getClass().getSimpleName());
                    }
                }
                else if (insn instanceof IincInsnNode)
                {
                    IincInsnNode incinsn = (IincInsnNode) insn;
                    switch (incinsn.getOpcode())
                    {
                        case Opcodes.IINC:
                            outputdata.add("add dword [ebp - " + (4 + 4 * incinsn.var) + "], " + incinsn.incr);
                            break;
                        default:
                            outputdata.add("Can't deal with iincinsnnode: " + incinsn.getOpcode());
                    }
                }
                else if (insn.getOpcode() >= Opcodes.ICONST_M1 && insn.getOpcode() <= Opcodes.ICONST_5) // 2...8
                {
                    outputdata.add("push " + (insn.getOpcode() - Opcodes.ICONST_M1 - 1));
                }
                else if (insn.getOpcode() == Opcodes.IADD) // 96
                {
                    outputdata.add("pop edx");
                    outputdata.add("add [esp], edx");
                }
                else if (insn.getOpcode() == Opcodes.IMUL) // 104
                {
                    outputdata.add("pop eax");
                    outputdata.add("pop ecx");
                    outputdata.add("imul ecx"); // edx:eax = eax * ecx
                    outputdata.add("push eax");
                }
                else if (insn.getOpcode() == Opcodes.I2B) // 145
                {
                    // a Java byte is signed, so sign-extend the low byte instead of masking it
                    outputdata.add("movsx eax, byte [esp]");
                    outputdata.add("mov [esp], eax");
                }
                else if (insn.getOpcode() == Opcodes.RETURN) // 177
                {
                    outputdata.add("xor eax, eax");
                    // a return can appear anywhere in the method, so jump to the shared epilogue
                    outputdata.add("jmp .return");
                }
                else if (insn instanceof FrameNode)
                {
                    outputdata.add("; framenode");
                }
                else
                {
                    outputdata.add("Can't deal with " + insn.getClass().getSimpleName() + ", fix it (" + insn.getOpcode() + ")");
                }
            }

            // epilogue, stdcall to save complexity on decoding methodinsns
            outputdata.add(".return:");
            outputdata.add("leave");
            // RET imm takes a byte count, and every argument occupies 4 bytes
            outputdata.add("ret " + (arguments * 4));
        }

        writeOutput(args[1], outputdata);
    }
}
Yes, that's a compiler in roughly 256 lines of code. The result is Intel-syntax assembly, because there are already decent assemblers out there that deal with object formats. Of course you can write your own later as well.