I Am In Control of These Electrons: Word-aligned bytecode

Thinking about bytecode interpreters again... specifically variable-size instruction sets, and the detail that bugs me is that full-size word operands often end up not word-aligned. On some architectures (x86), the CPU will read an unaligned word, no problem. Otherwise the word needs to be read byte by byte, and this inefficiency bothers me.
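To make that cost concrete, here is a minimal sketch in C of the byte-by-byte fallback. It assumes the operand is stored little-endian; the name load_u32_unaligned is hypothetical and only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: build a 32-bit value from four separate byte loads.
   This works at any address, but costs four loads plus shifts and ORs
   instead of a single aligned word load. Assumes little-endian storage. */
uint32_t load_u32_unaligned(const uint8_t *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

With a one-byte opcode followed directly by a four-byte operand, the operand starts at offset 1, so a call like load_u32_unaligned(code + 1) is the kind of read a conventional variable-size format ends up doing.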

So here's an idea for a bytecode format that keeps all word arguments aligned, separate from the instructions, but still interleaved with them. Assume a machine with 32-bit words, so four byte-sized instructions fit in each word.

+--+--+--+--+ +-+-+-+-+ +-+-+-+-+ +--+--+--+--+
|i1|i2|i3|i4| |  w1   | |  w2   | |i5|i6|i7|i8|
+--+--+--+--+ +-+-+-+-+ +-+-+-+-+ +--+--+--+--+

Of the first four instructions, i2 and i4 each require a word-sized argument, so w1 belongs to i2 and w2 to i4. Each word argument advances the program counter as it is read. When the word-sized batch of instructions is complete, the pc points to the next batch of instructions. Here's the general idea for the interpreter loop:

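while (...) {
  u32 ins = word[pc++];                  // fetch a batch of four byte-sized instructions
  while (ins) {                          // zero remainder: only noops left in this batch
    switch (ins & 0xff) {                // decode the low byte first
      case 0: break;                     // noop
      case i2: arg = word[pc++]; break;  // word argument follows the batch
      case i4: arg = word[pc++]; break;
      case branch:                       // jump: fetch the target batch, skip the shift
        pc = word[pc]; ins = word[pc++]; continue;
      ...
    }
    ins >>= 8;                           // move on to the next instruction in the batch
  }
}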
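For something that actually compiles, here is a minimal sketch of the same loop in C with a toy instruction set. Only the noop-is-zero convention and the loop shape come from the pseudocode above; the opcodes (OP_LOAD, OP_ADD, OP_JMP, OP_HALT), the PACK4 macro, the single accumulator, and demo_program are all assumptions made up for illustration.

```c
#include <stdint.h>

/* Hypothetical opcodes; only "0 is noop" comes from the loop above. */
enum { OP_NOP = 0, OP_LOAD = 1, OP_ADD = 2, OP_JMP = 3, OP_HALT = 4 };

/* Pack four opcodes into one word. The first opcode goes in the low byte
   because the loop decodes with (ins & 0xff) and then shifts right. */
#define PACK4(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Run a program until OP_HALT (or an unknown opcode); return the accumulator. */
uint32_t run(const uint32_t *word) {
    uint32_t pc = 0, acc = 0;
    for (;;) {
        uint32_t ins = word[pc++];              /* fetch a batch of four */
        while (ins) {
            switch (ins & 0xff) {
            case OP_NOP:  break;
            case OP_LOAD: acc = word[pc++];  break;   /* aligned word argument */
            case OP_ADD:  acc += word[pc++]; break;   /* aligned word argument */
            case OP_JMP:                        /* target is a word index */
                pc = word[pc];
                ins = word[pc++];
                continue;                       /* restart decode, no shift */
            case OP_HALT: return acc;
            default:      return acc;           /* unknown opcode: stop */
            }
            ins >>= 8;                          /* next opcode in the batch */
        }
    }
}

/* Demo: load 40, add 2, jump over a batch, add 100, halt. */
const uint32_t demo_program[] = {
    PACK4(OP_LOAD, OP_ADD, OP_JMP, OP_NOP),    /* word 0: batch */
    40,                                        /* word 1: argument of LOAD */
    2,                                         /* word 2: argument of ADD */
    6,                                         /* word 3: target of JMP */
    PACK4(OP_LOAD, OP_HALT, OP_NOP, OP_NOP),   /* word 4: skipped by the jump */
    0,                                         /* word 5: skipped argument */
    PACK4(OP_ADD, OP_HALT, OP_NOP, OP_NOP),    /* word 6: jump lands here */
    100                                        /* word 7: argument of ADD */
};
```

Calling run(demo_program) returns 142. Every argument fetch is a plain indexed read of an aligned word, and because the inner loop stops as soon as the remaining bits are zero, trailing noops work as free padding when a batch has fewer than four instructions.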


Friday, June 15, 2012
