Dynamic recompilation

In computer science, dynamic recompilation (sometimes abbreviated to dynarec or the pseudo-acronym DRC) is a feature of some emulators and virtual machines, where the system may recompile some part of a program during execution. By compiling during execution, the system can tailor the generated code to reflect the program's run-time environment, and potentially produce more efficient code by exploiting information that is not available to a traditional static compiler.
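
As a loose illustration of exploiting information that only exists at run time, the Python sketch below generates and compiles a function after a value (here an exponent) has become known. The name specialize_power is hypothetical, and using Python source as the "target code" is an assumption made for brevity; a real dynamic recompiler would emit machine code.

```python
def specialize_power(exponent):
    """Generate and compile a function that raises x to a fixed exponent.

    The exponent is only known at run time, so the multiplication loop is
    unrolled into straight-line code when the function is generated.
    """
    if exponent < 1:
        raise ValueError("this sketch only handles exponents >= 1")
    body = " * ".join(["x"] * exponent)      # e.g. "x * x * x"
    source = f"def power(x):\n    return {body}\n"
    namespace = {}
    # compile() and exec() stand in for the code emitter of a real recompiler.
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["power"]


cube = specialize_power(3)   # the exponent is fixed only now, at run time
print(cube(5))               # prints 125
```

A static compiler that never sees the value 3 would have to keep a general loop; the generated function contains only the multiplications that are actually needed.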

Uses

Most dynamic recompilers are used to convert machine code between architectures at runtime, a task often needed in the emulation of legacy gaming platforms. In other cases, a system may employ dynamic recompilation as part of an adaptive optimization strategy to execute a portable program representation such as Java bytecode or .NET Common Language Runtime bytecode. Full-speed debuggers also use dynamic recompilation to reduce the space overhead incurred by most deoptimization techniques, and to support other features such as dynamic thread migration.
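
The adaptive optimization idea can be sketched in miniature as follows. This is only an analogy under stated assumptions: the class AdaptiveFunction and the threshold HOT_THRESHOLD are hypothetical, re-evaluating the source string on every call stands in for interpretation, and a cached Python code object stands in for compiled native code.

```python
HOT_THRESHOLD = 3   # assumed value; real systems tune this empirically


class AdaptiveFunction:
    """Evaluate an expression in x, compiling it once it becomes 'hot'."""

    def __init__(self, expr_source):
        self.expr_source = expr_source
        self.calls = 0          # profile counter for the slow path
        self.compiled = None    # filled in once the threshold is reached

    def __call__(self, x):
        if self.compiled is not None:
            # Fast path: reuse the code object compiled earlier.
            return eval(self.compiled, {}, {"x": x})
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            # The expression is hot: compile it once for all later calls.
            self.compiled = compile(self.expr_source, "<hot>", "eval")
        # Slow path: the source string is parsed again on this call.
        return eval(self.expr_source, {}, {"x": x})


f = AdaptiveFunction("x * 2 + 1")
print([f(i) for i in range(5)])   # prints [1, 3, 5, 7, 9]
```

Code that runs only once or twice never pays the compilation cost, which is the trade-off adaptive strategies are meant to balance.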

Tasks

The main tasks a dynamic recompiler has to perform are:

  • Reading in machine code from the source platform
  • Emitting machine code for the target platform

A dynamic recompiler may also perform some auxiliary tasks:

  • Managing a cache of recompiled code
  • Updating elapsed cycle counts on platforms with cycle count registers
  • Managing interrupt checks
  • Providing an interface to virtualized support hardware, for example a GPU
  • Optimizing higher level code structures to run efficiently on the target hardware (see below)
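
The main tasks and two of the auxiliary ones (the code cache and cycle counting) can be sketched with a toy example. Everything here is an assumption made for illustration: the three-instruction guest instruction set, the names translate_block and run, the cost of one cycle per guest instruction, and the use of generated Python functions in place of real target machine code.

```python
# Toy guest program: count r0 down to zero, adding 2 to r1 on each pass.
PROGRAM = [
    ("addi", "r1", 2),    # 0: r1 += 2
    ("addi", "r0", -1),   # 1: r0 -= 1
    ("jnz", "r0", 0),     # 2: if r0 != 0, jump to instruction 0
    ("halt",),            # 3: stop
]


def translate_block(program, pc):
    """Read guest instructions from pc up to the next branch or halt and
    emit an equivalent Python function (the 'target code' of this sketch).
    The function returns (next guest pc or None, guest cycles consumed)."""
    lines = ["def block(regs):"]
    cycles = 0
    while True:
        op = program[pc]
        cycles += 1                     # assumed cost: one cycle per instruction
        if op[0] == "addi":
            lines.append(f"    regs[{op[1]!r}] += {op[2]}")
            pc += 1
        elif op[0] == "jnz":
            lines.append(
                f"    return ({op[2]} if regs[{op[1]!r}] != 0 else {pc + 1}), {cycles}"
            )
            break
        elif op[0] == "halt":
            lines.append(f"    return None, {cycles}")
            break
        else:
            raise ValueError(f"unknown guest opcode: {op[0]}")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["block"]


def run(program, regs):
    cache = {}                          # guest pc -> translated block
    pc, elapsed = 0, 0
    while pc is not None:
        block = cache.get(pc)
        if block is None:               # cache miss: translate this block once
            block = cache[pc] = translate_block(program, pc)
        pc, cycles = block(regs)        # execute the translated code
        elapsed += cycles               # keep the guest cycle counter current
    return elapsed, len(cache)


regs = {"r0": 3, "r1": 0}
print(run(PROGRAM, regs), regs)   # prints (10, 2) {'r0': 0, 'r1': 6}
```

The loop body is translated once and then executed three times from the cache, which is where a recompiler gains over an interpreter that decodes every instruction on every pass.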

Example

Suppose a program is being run in an emulator and needs to copy a null-terminated string. The program is compiled originally for a very simple processor. This processor can only copy a byte at a time, and must do so by first reading it from the source string into a register, then writing it from that register into the destination string. The original program might look something like this:

beginning:
    mov A,[first string pointer]    ; Put location of first character of source string
                                    ; in register A
    mov B,[second string pointer]   ; Put location of first character of destination string
                                    ; in register B
loop:
    mov C,[A]            ; Copy byte at address in register A to register C
    mov [B],C            ; Copy byte in register C to the address in register B
    inc A                ; Increment the address in register A to point to
                         ; the next byte
    inc B                ; Increment the address in register B to point to
                         ; the next byte
    cmp C,#0             ; Compare the data we just copied to 0 (string end marker)
    jnz loop             ; If it wasn't 0 then we have more to copy, so go back
                         ; and copy the next byte
end:                     ; If we didn't loop then we must have finished,
                         ; so carry on with something else.

The emulator might be running on a processor which is similar, but extremely good at copying strings, and the emulator knows it can take advantage of this. It might recognize the string copy sequence of instructions and decide to rewrite them more efficiently just before execution, to speed up the emulation.

Say there is an instruction on our new processor called movs, specifically designed to copy strings efficiently. Our theoretical movs instruction copies 16 bytes at a time, without having to load them into register C in between, but stops if it copies a 0 byte (which marks the end of a string) and sets the zero flag when it does. It also expects the addresses of the strings to be in registers A and B, and increments A and B by 16 every time it executes, ready for the next copy.

Our new recompiled code might look something like this:

beginning:
    mov A,[first string pointer]    ; Put location of first character of source string
                                    ; in register A
    mov B,[second string pointer]   ; Put location of first character of destination string
                                    ; in register B
loop:
    movs [B],[A]            ; Copy 16 bytes at address in register A to address
                            ; in register B, then increment A and B by 16
    jnz loop                ; If the zero flag isn't set then we haven't reached
                            ; the end of the string, so go back and copy some more.
end:                        ; If we didn't loop then we must have finished,
                            ; so carry on with something else.

There is an immediate speed benefit simply because the processor doesn't have to load so many instructions to do the same task, but also because the movs instruction is likely to be optimized by the processor designer to be more efficient than the sequence used in the first example. For example, it may make better use of parallel execution in the processor to increment A and B while it is still copying bytes.
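
How an emulator might recognize the copy loop is not specified above; one simple possibility is a literal pattern match over the decoded instruction stream, sketched below in Python. The tuple representation and the function name recompile are assumptions, and movs remains the hypothetical instruction from this example.

```python
# The byte-at-a-time loop body from the example, as (mnemonic, operands) pairs.
BYTE_COPY_LOOP = [
    ("mov", "C,[A]"),
    ("mov", "[B],C"),
    ("inc", "A"),
    ("inc", "B"),
    ("cmp", "C,#0"),
    ("jnz", "loop"),
]

# Replacement sequence built around the hypothetical movs instruction.
STRING_COPY = [
    ("movs", "[B],[A]"),
    ("jnz", "loop"),
]


def recompile(code):
    """Return a copy of code with every occurrence of the byte copy loop
    replaced by the movs-based sequence; other instructions pass through."""
    out = []
    i = 0
    n = len(BYTE_COPY_LOOP)
    while i < len(code):
        if code[i:i + n] == BYTE_COPY_LOOP:
            out.extend(STRING_COPY)     # matched: emit the faster sequence
            i += n
        else:
            out.append(code[i])         # no match: copy the instruction as is
            i += 1
    return out


original = [
    ("mov", "A,[first string pointer]"),
    ("mov", "B,[second string pointer]"),
] + BYTE_COPY_LOOP
for instruction in recompile(original):
    print(*instruction)
```

A real recompiler would need more care than this sketch shows. The original loop leaves 0 in register C and the rewritten one does not touch C, so the substitution is only valid if later code never reads that value, and the matcher would also have to tolerate different register assignments and instruction orderings.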

Applications

General purpose

  • Many Java virtual machines feature dynamic recompilation.
  • Apple's Rosetta for Mac OS X on x86 allows PowerPC code to be run on the x86 architecture.
  • Later versions of the Mac 68K emulator used in classic Mac OS to run 680x0 code on the PowerPC hardware.
  • Psyco, a specializing compiler for Python.
  • The HP Dynamo project, an example of a transparent binary dynamic optimizer.[1]
  • DynamoRIO, an open-source successor to Dynamo that works with the IA-32, x86-64, ARM and AArch64 instruction sets.[2][3]
  • The Vx32 virtual machine employs dynamic recompilation to create OS-independent x86 architecture sandboxes for safe application plugins.
  • Microsoft Virtual PC for Mac, used to run x86 code on PowerPC.
  • QEMU, an open-source full system emulator.
  • FreeKEYB, an international DOS keyboard and console driver with many usability enhancements, used self-modifying code and dynamic dead code elimination to minimize its in-memory image based on its user configuration (selected features, languages, layouts) and actual runtime environment (OS variant and version, loaded drivers, underlying hardware). It automatically resolved dependencies, dynamically relocated and recombined code sections at byte-level granularity, and optimized opstrings based on semantic information provided in the source code, relocation information generated by special tools during assembly, and profile information obtained at load time.[4][5]
  • OVPsim,[6] a freely available full system emulator.
  • VirtualBox uses dynamic recompilation.
  • Valgrind, a programming tool for memory debugging, memory leak detection, and profiling, uses dynamic recompilation.

Gaming

  • MAME uses dynamic recompilation in its CPU emulators for MIPS, SuperH, PowerPC and even the Voodoo graphics processing units.
  • Wii64, a Nintendo 64 emulator for the Wii.
  • WiiSX, a Sony PlayStation emulator for the Nintendo Wii.
  • Mupen64Plus, a multi-platform Nintendo 64 emulator.[7]
  • Yabause, a multi-platform Saturn emulator.[8]
  • The backwards compatibility functionality of the Xbox 360 (i.e. running games written for the original Xbox) is widely assumed to use dynamic recompilation.
  • PPSSPP, a Sony PlayStation Portable emulator, has recompilers for both x86 and ARM.
  • PSEmu Pro, a Sony PlayStation emulator.
  • UltraHLE, the first Nintendo 64 emulator to fully run commercial games.
  • PCSX2,[9] a Sony PlayStation 2 emulator, has a recompiler called "microVU", the successor of "SuperVU".
  • Dolphin, a Nintendo GameCube and Wii emulator, has a dynarec option.
  • GCemu,[10] a Nintendo GameCube emulator.
  • NullDC, a Sega Dreamcast emulator for x86.
  • GEM,[11] a Nintendo Game Boy emulator for MSX, uses an optimizing dynamic recompiler.
  • DeSmuME,[12] a Nintendo DS emulator, has a dynarec option.
  • Soywiz's Psp,[13] a Sony PlayStation Portable emulator, has a dynarec option.
  • RPCS3, a Sony PlayStation 3 emulator, recompiles both PPU and SPU code for the Cell processor to x86-64.
  • Decaf-emu, a Wii U emulator, uses dynamic recompilation (JIT) from 32-bit PowerPC to x86-64 code using the libbinrec library (the library itself can run on any hardware architecture).

References

  1. "HP Labs' technical report on Dynamo".
  2. "DynamoRIO". http://www.dynamorio.org/home.html
  3. "DynamoRIO source repository". https://github.com/DynamoRIO/dynamorio
  4. Paul, Matthias R.; Frinke, Axel C. (1997-10-13) [first published 1991], FreeKEYB - Enhanced DOS keyboard and console driver (User Manual) (v6.5 ed.) (NB. FreeKEYB is a Unicode-based dynamically configurable successor of K3PLUS supporting most keyboard layouts, code pages, and country codes. K3PLUS was an extended keyboard driver for DOS widely distributed in Germany at its time, with adaptations to a handful of other European languages available. It did support a sub-set of the FreeKEYB features already, but was statically configured and did not support dynamic dead code elimination.)
  5. Paul, Matthias R.; Frinke, Axel C. (2006-01-16), FreeKEYB - Advanced international DOS keyboard and console driver (User Manual) (v7 preliminary ed.)
  6. "OVPsim".
  7. "Mupen64Plus".
  8. "SH2".
  9. "PCSX 2".
  10. petebernert. "GCemu". SourceForge.
  11. "Gameboy Emulator for MSX | The New Image". GEM. Retrieved 2014-01-12.
  12. "DeSmuME v0.9.9".
  13. Carlos Ballesteros Velasco (2013-07-28). "Soywiz's PSP Emulator: Release : Soywiz's Psp Emulator 2013-07-28 (r525)". Pspemu.soywiz.com. Retrieved 2014-01-12.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.