title WebLog #535 Topic: 2007-06-27 06.22.48 matt: analyzing Ruby 1.9's bytecode
user matt
ip 65.57.245.11
vol 1
lock ********

Some of my initial thoughts on how to approach binary code analysis for Ruby 2.0 and its x86 JIT.
/more

Unless I'm misreading, it appears that /link http://eigenclass.org/hiki/non-synthetic-benchmarks-for-yarv YARV (Yet Another Ruby VM) has been chosen as the runtime for Ruby 1.9. Many Ruby applications are data-driven and browser-based. As such, there is a general perception that Ruby apps are *not* compute bound, but I/O bound most of the time. That being said, more and more small pieces of Ruby are being rewritten in C and given a Ruby-language binding. Some folks are even introducing security issues by moving sensitive calculations into browser-side JavaScript to ease the load on the Ruby-based server side.

Whatever the reality is, a VM is coming, and anyone using Ruby professionally will most likely need to deploy this new compiler and VM. The good news is that it will be relatively easy to do whole-program code analysis on the compiled code.

At my previous company, we used /link http://mono-project.com mono and had great success with it for what we were doing. This didn't come for free, though: we had to pay close attention to bugs, commits, etc. and deploy the relevant fixes accordingly. One recurring source of trouble was regressions in the IL compiler and the x86 optimizing JIT compiler.

Learning from that experience, I think there are two steps to ensuring the quality of the Ruby VM/JIT:

1. analyze the /link http://www.atdot.net/yarv/insnstbl.html bytecode for the canary in the coal mine of compiler (and programmer) screw-ups, a dead store (explained below).
2. analyze the x86 JIT code for dead stores.
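
To make the check named in both steps concrete, here is a minimal sketch of dead-store detection over a single straight-line block of code. The `Insn` struct and the `dead_stores` method are hypothetical names made up for illustration; they are not YARV's instruction encoding or any existing tool's API. The sketch also assumes no branches, calls, or aliasing, which a real analysis of bytecode or x86 code would have to handle.

```ruby
# Hypothetical three-address instruction: writes +dest+, reads +srcs+.
# An illustrative format only, not YARV's actual instruction encoding.
Insn = Struct.new(:dest, :srcs)

# Returns the indices of stores that are overwritten before being read.
# Handles one straight-line block only: no branches, calls, or aliasing.
# A store still unread at the end of the block is not reported, since
# code after the block might read it.
def dead_stores(insns)
  pending = {} # location => index of its latest write not yet read
  dead = []
  insns.each_with_index do |insn, i|
    # Reads are processed before the write, so `a = a + 1` keeps the
    # earlier store to `a` alive.
    insn.srcs.each { |loc| pending.delete(loc) }
    next if insn.dest.nil?
    dead << pending[insn.dest] if pending.key?(insn.dest)
    pending[insn.dest] = i
  end
  dead
end

# a = 4; a = c   (equivalently: mov eax, 4 / mov eax, ecx)
example = [
  Insn.new(:a, []),   # a = 4
  Insn.new(:a, [:c])  # a = c, clobbers the first store
]
p dead_stores(example) # prints [0]
```

Because the scan only needs to remember the last unread write per location, it stays local to one function, which is part of what makes this check cheap compared with analyses that track values across calls.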

A /link http://findbugs.sourceforge.net/api/edu/umd/cs/findbugs/detect/DeadLocalStoreProperty.html dead store is the assignment of a value to a location (register or memory location) that is clobbered by another assignment before ever being read.

/pre(
a = 4;
a = c;
/pre)

/pre(
mov eax, 4
mov eax, ecx
/pre)

In my experience, a dead store means one of the following (in order of likelihood): a compiler bug that generated bad/non-working code, a programming bug (like using the ++ operator on an Integer in Java without assigning the result), superfluous code in the source that could be deleted, or a sloppy compiler that generates working but suboptimal code.

Dead stores are also relatively easy to detect, as they don't require inter-function value tracking. They are also the most valuable check in /link http://findbugs.sf.net findbugs and /link http://blogs.msdn.com/fxcop FxCop: the one that yields the most true positives that are real bugs. Of course, /link http://jetbrains.net IntelliJ and Resharper warn about this kind of issue, and many more, in the IDE as you type.

Hopefully my work in this area will be sponsored so that people can deploy Ruby 1.9 and take advantage of the JIT without the issues I ran into previously with other bytecode/JIT compilers.
/dis