YaK:: WebLog #535 Topic : 2007-06-27 06.22.48 matt : analyzing Ruby 1.9's bytecode

analyzing Ruby 1.9's bytecode

Some of my initial thoughts on how to approach binary code analysis for Ruby 2.0 and its x86 JIT.


Unless I'm reading incorrectly, it appears as though YARV (Yet Another Ruby VM) has been chosen as the runtime for Ruby 1.9. Many Ruby applications are data-driven and browser-based. As such, there is a general perception that Ruby apps are *not* compute bound, but I/O bound most of the time. That being said, more and more little pieces of Ruby are being rewritten in C and given a Ruby-language binding. Some folks are even introducing security issues by doing sensitive calculations in browser-side JavaScript to ease the load on the Ruby-based server side.

Whatever the reality is, a VM is coming, and anyone using Ruby professionally will most likely need to deploy this new compiler and VM. The good news is that it will be relatively easy to do whole-program code analysis with the compiled code.

At my previous company, we used Mono and had great success with it for what we were doing. This didn't come for free, though: we had to pay close attention to bugs, commits, etc. and deploy the relevant fixes accordingly. One area with continual issues was regressions in the IL compiler and the x86 optimizing JIT compiler. Learning from that experience, I think there are two steps to ensuring the quality of the Ruby VM/JIT:

1. analyze the bytecode for the canary in the coal mine of compiler (and programmer) screw-ups: the dead store (explained below).

2. analyze the x86 JIT code for dead stores.

A dead store is the assignment of a value to a location (register or memory location) that is clobbered by another assignment before ever being read.

a = 4;
a = c;

mov eax, 4
mov eax, ecx
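For the bytecode side, here is a minimal sketch of how the same pair of assignments can be inspected. It assumes an MRI build where RubyVM::InstructionSequence is available (1.9 and later); the exact instruction names and listing format vary between Ruby versions, so treat the output as something to read by eye rather than a stable format.

```ruby
# Compile (without running) the two-assignment example and dump its bytecode.
# RubyVM::InstructionSequence is MRI-specific; other Ruby implementations
# do not provide it.
source = "a = 4\na = c\n"

iseq = RubyVM::InstructionSequence.compile(source)
listing = iseq.disasm   # human-readable bytecode listing, as a String

puts listing
```

In the listing, the two stores to the local `a` show up as two setlocal-style instructions with no read of `a` between them, which is the pattern step 1 would look for.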

In my experience, a dead store means one of the following (in order of likelihood): a compiler bug that generated bad or non-working code; a programming bug (like using the ++ operator on an Integer in Java without assigning the result); superfluous code in the source that could be deleted; or a sloppy compiler that generates working but suboptimal code.

Dead stores are also relatively easy to detect, as they don't require inter-function value tracking. They are also the check in FindBugs and FxCop that yields the most true positives that turn out to be real bugs, which makes it the most valuable one. Of course, IntelliJ and ReSharper warn of this kind of issue and many more in the IDE as you type, anyway.
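To show why no inter-function tracking is needed, here is a rough sketch of a detector that works on one straight-line run of instructions at a time. The instruction format (:store, :load, :label, :branch) and the name dead_stores are hypothetical and made up for illustration; this is not real YARV bytecode or x86, and a real tool would first have to map actual instructions onto reads and writes.

```ruby
# Hypothetical, simplified instruction format (not real YARV or x86):
#   [:store, loc]         writes a register or memory location
#   [:load,  loc]         reads a location
#   [:label] / [:branch]  ends the current straight-line run
def dead_stores(instructions)
  pending = {}  # loc => index of the latest store that has not been read yet
  dead = []     # indexes of stores that were overwritten before any read

  instructions.each_with_index do |(op, loc), index|
    case op
    when :store
      dead << pending[loc] if pending.key?(loc)
      pending[loc] = index
    when :load
      pending.delete(loc)
    when :label, :branch
      pending.clear  # be conservative: forget everything across control flow
    end
  end

  dead
end

program = [
  [:store, :a],  # a = 4   (overwritten below before being read)
  [:load,  :c],
  [:store, :a],  # a = c
  [:load,  :a],
]

p dead_stores(program)  # => [0]
```

The sketch only reports a store when it sees the overwrite inside the same straight-line run, so a store that is still pending at a branch or at the end of the code is never flagged. That trades missed cases for fewer false positives.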

Hopefully my work in this area will be sponsored so that people can deploy Ruby 1.9 and take advantage of the JIT without the issues I ran into previously with other bytecode/JIT compilers.

(unless otherwise marked) Copyright 2002-2014 YakPeople. All rights reserved.
(last modified 2007-06-27)