Binary translation

In computing, binary translation is a form of binary recompilation in which sequences of instructions are translated from a source instruction set to a target instruction set. In some cases, such as instruction set simulation, the target instruction set may be the same as the source instruction set, which makes it possible to provide testing and debugging features such as instruction tracing, conditional breakpoints, and hot spot detection.

The two main types are static and dynamic binary translation. Translation can be done in hardware (for example, by circuits in a CPU) or in software (for example, by run-time engines, static recompilers, or emulators).

Motivation

Binary translation is motivated by the lack of a binary for a target platform, the lack of source code to compile for the target platform, or some other difficulty in compiling the source for the target platform.

Statically recompiled binaries can potentially run faster than their emulated counterparts, because the emulation overhead is removed. This is similar to the difference in performance between interpreted and compiled programs in general.

Static binary translation

A translator using static binary translation aims to convert all of the code of an executable file into code that runs on the target architecture without having to run the code first, unlike dynamic binary translation, which translates code as it runs. This is very difficult to do correctly, since not all of the code can be discovered by the translator. For example, some parts of the executable may be reachable only through indirect branches, whose targets are known only at run time.
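The code discovery problem can be illustrated with a minimal sketch. It assumes an invented toy instruction set (the opcodes `br`, `jmp`, `jmp_ind`, `add`, and `halt` are hypothetical, as are `PROGRAM` and `discover`) and follows only control flow that is visible statically, which is one common way a static translator might look for code.

```python
# Toy program: address -> (opcode, operand). The instruction set is
# invented for illustration and does not correspond to any real ISA.
PROGRAM = {
    0: ("br", 3),          # conditional branch: target 3 or fall through to 1
    1: ("jmp_ind", None),  # indirect jump: target is in a register at run time
    2: ("halt", None),
    3: ("add", 1),
    4: ("jmp", 2),         # direct jump: target is visible in the instruction
    5: ("add", 2),         # reachable only through the indirect jump at 1
    6: ("halt", None),
}


def discover(program, entry=0):
    """Return the addresses reachable via statically known control flow."""
    seen = set()
    worklist = [entry]
    while worklist:
        pc = worklist.pop()
        if pc in seen or pc not in program:
            continue
        seen.add(pc)
        op, arg = program[pc]
        if op == "jmp":
            worklist.append(arg)
        elif op == "br":
            worklist.append(arg)     # branch taken
            worklist.append(pc + 1)  # branch not taken
        elif op in ("halt", "jmp_ind"):
            pass  # halt ends the path; an indirect target is unknown here
        else:
            worklist.append(pc + 1)  # ordinary instruction falls through
    return seen


found = discover(PROGRAM)
missed = set(PROGRAM) - found
print(sorted(found), sorted(missed))
```

In this sketch the instructions at addresses 5 and 6 are never found, because the only path to them goes through the indirect jump whose target cannot be read from the binary.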

One such static binary translator uses peephole superoptimizer technology (developed by Sorav Bansal and Alex Aiken at Stanford University) to translate efficiently between potentially many source and target pairs, with relatively low development costs and high performance of the target binary. In experiments with PowerPC-to-x86 translation, some binaries even outperformed native versions, but on average they ran at two-thirds of native speed. [1]

Examples of static binary translation

Honeywell provided a program called the Liberator for their Honeywell 200 series of computers; it could translate programs for the IBM 1400 series of computers into programs for the Honeywell 200 series. [2]

In 2014, an ARM architecture version of the 1998 video game StarCraft was generated by static recompilation and additional reverse engineering of the original x86 version. [3] [4] The Pandora handheld community developed the required tools [5] on their own and achieved such translations successfully several times. [6] [7]

Also in 2014, a successful x86-to-x64 static recompilation was generated for the procedural terrain generator of the video game Cube World. [8]

Another example is the NES-to-x86 statically recompiled version of the video game Super Mario Bros., which was generated using LLVM in 2013. [9]

In 2004, Scott Elliott and Phillip R. Hutchinson at Nintendo developed a tool to generate C code from Game Boy binaries, which could then be compiled for a new platform and linked against a hardware library for use in airline entertainment systems. [10]

In 1995, Norman Ramsey at Bell Communications Research and Mary F. Fernandez at the Department of Computer Science, Princeton University, developed the New Jersey Machine-Code Toolkit, which provided the basic tools for static assembly translation. [11]

Dynamic binary translation

Dynamic binary translation (DBT) looks at a short sequence of code, typically on the order of a single basic block, then translates it and caches the resulting sequence. Code is translated only as it is discovered, and where possible, branch instructions are made to point to already translated and saved code (memoization).
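The translate-and-cache cycle can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any real translator: the guest instruction set (`addi`, `dec`, `jnz`, `halt`) is invented, the "target code" is generated Python source, and the names `translate_block`, `run`, and `GUEST` are hypothetical.

```python
# Invented guest program: address -> (opcode, operand).
GUEST = {
    0: ("addi", 3),   # acc += 3
    1: ("dec", None), # n -= 1
    2: ("jnz", 0),    # if n != 0 jump to 0, else fall through
    3: ("halt", None),
}


def translate_block(program, pc):
    """Translate one basic block starting at pc into a Python function.

    The returned function mutates the guest state and returns the next
    guest address, or None when the guest halts.
    """
    lines = ["def block(s):"]
    while True:
        op, arg = program[pc]
        pc += 1
        if op == "addi":
            lines.append(f"    s['acc'] += {arg}")
        elif op == "dec":
            lines.append("    s['n'] -= 1")
        elif op == "jnz":  # a branch ends the basic block
            lines.append(f"    return {arg} if s['n'] != 0 else {pc}")
            break
        elif op == "halt":
            lines.append("    return None")
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
    namespace = {}
    exec("\n".join(lines), namespace)  # compile the generated source
    return namespace["block"]


def run(program, state, entry=0):
    """Execute the guest, translating each block at most once."""
    cache = {}          # guest address -> translated block
    translations = 0
    pc = entry
    while pc is not None:
        block = cache.get(pc)
        if block is None:  # first visit: translate and remember
            block = translate_block(program, pc)
            cache[pc] = block
            translations += 1
        pc = block(state)
    return translations


state = {"acc": 0, "n": 4}
count = run(GUEST, state)
print(state, count)
```

The loop body runs four times here but is translated only once. Unlike the scheme described above, this sketch does not patch branches to point directly at translated code; it returns to the dispatch loop and looks the next block up in the cache, which is simpler but slower.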

Dynamic binary translation differs from simple emulation in that it eliminates the emulator's main read-decode-execute loop, which is a major performance bottleneck. It pays for this with a large overhead at translation time. This overhead is expected to be amortized as translated code sequences are executed multiple times.
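The amortization argument can be made concrete with a simple break-even estimate. The symbols are assumptions introduced only for this illustration: $c_t$ is the one-time cost of translating a block, $c_x$ is the cost of one execution of the translated block, $c_i$ is the cost of interpreting the block once, and $n$ is the number of times the block executes.

```latex
% Translation pays off once its total cost drops below the cost of
% interpreting the block n times (assuming c_i > c_x):
c_t + n\,c_x < n\,c_i
\quad\Longleftrightarrow\quad
n > \frac{c_t}{c_i - c_x}

% Illustrative numbers only: c_t = 1000, c_i = 50, c_x = 5 gives
n > \frac{1000}{50 - 5} \approx 22.2
```

With these made-up costs, translation becomes worthwhile from the 23rd execution of the block onward; a block that runs only a few times would have been cheaper to interpret.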

More advanced dynamic translators employ dynamic recompilation, in which the translated code is instrumented to find out which portions are executed a large number of times, and these portions are then optimized aggressively. This technique is reminiscent of a JIT compiler, and in fact such compilers (e.g. Sun's HotSpot technology) can be viewed as dynamic translators from a virtual instruction set (the bytecode) to a real one.
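The instrumentation step can be sketched with per-block execution counters. This is a hedged illustration only: the class `HotSpotProfiler`, the function `profile`, and the threshold value are hypothetical, and real translators use more elaborate heuristics.

```python
from collections import Counter

HOT_THRESHOLD = 3  # assumed value, chosen only for this example


class HotSpotProfiler:
    """Counts block executions and flags blocks that become hot."""

    def __init__(self, threshold=HOT_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()
        self.optimized = set()

    def record(self, block_addr):
        """Count one execution.

        Returns True exactly once per block, at the moment its count
        reaches the threshold and it is selected for optimization.
        """
        self.counts[block_addr] += 1
        if (block_addr not in self.optimized
                and self.counts[block_addr] >= self.threshold):
            self.optimized.add(block_addr)
            return True
        return False


def profile(trace, threshold=HOT_THRESHOLD):
    """Replay a trace of block addresses; return promoted blocks and counts."""
    profiler = HotSpotProfiler(threshold)
    promoted = [addr for addr in trace if profiler.record(addr)]
    return promoted, profiler.counts


promoted, counts = profile([0, 8, 8, 8, 8, 16, 8, 0])
print(promoted, dict(counts))
```

In a translator, the point where `record` returns True is where the block would be handed to a more expensive optimizing pass, with the optimized version then replacing the cached one.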

Examples of dynamic binary translation in software

Examples of dynamic binary translation in hardware

See also

References

  1. Bansal, Sorav; Aiken, Alex (December 2008). "Binary Translation Using Peephole Superoptimizers" (PDF). Proceedings of the 8th USENIX conference on Operating systems design and implementation. pp. 177–192.
  2. Honeywell Series 200 Summary Description (PDF). Honeywell. February 1966. p. 11. For example, the instruction repertoire of Series 200 processors is similar enough to those of several other processing systems, viz., the IBM 1400 series, to allow automated, one-time translation of programs written for these competitive systems to a form suitable for execution on higher-performance Series 200 systems.
  3. Steinlechner, Peter (2014-03-10). "Starcraft für ARM-Handheld kompiliert" (in German). golem.de. Retrieved 2014-03-25.
  4. notaz (2014-03-04). "StarCraft". repo.openpandora.org. Retrieved 2014-03-26.
  5. notaz (2014-03-01). "ia32rtools/". GitHub. Retrieved 2015-01-09.
  6. notaz (2014-03-04). "Starcraft". openpandora.org. Retrieved 2014-03-29. The "no source, no port" rule is not completely true, you can get something similar (but not the same) as a port through static recompilation. Similar stuff was done several times by M-HT for some DOS games. The game was also converted for Android with somewhat similar approach.
  7. M-HT. "Warcraft: Orcs & Humans". repo.openpandora.org.
  8. Kærlev, Mathias (2014-04-14). "Practical and Portable X86 Recompilation". Retrieved 2014-08-08. but then the idea of somehow using the original x86 machine code presented itself. However, for our open server, we need to support x86-64 as well, and in that case, we absolutely need emulation or recompilation. […] Static recompilation to assembler seemed like a much better option, but to keep it portable, we would need to write backends for x86, x86-64, and possibly ARM/PowerPC.
  9. Kelley, Andrew (2013-07-07). "Statically Recompiling NES Games into Native Executables with LLVM and Go". Retrieved 2013-08-08. This article presents original research regarding the possibility of statically disassembling and recompiling Nintendo Entertainment System games into native executables.
  10. US 7765539, Elliott, Scott & Hutchinson, Phillip, "System and method for trans-compiling video games", issued 2010.
  11. Ramsey, Norman; Fernandez, Mary F. (1995). "The New Jersey Machine-Code Toolkit". TCON'95: Proceedings of the USENIX 1995 Technical Conference. USENIX Association, Berkeley, CA, USA. p. 24.
  12. Wharton, John Harrison (1994-08-01). "Gary Kildall, industry pioneer, dead at 52: created first microcomputer languages, disk operating systems". Microprocessor Report. 8 (10). MicroDesign Resources Inc. (MDR). Archived from the original on 2016-11-18. Retrieved 2016-11-18. […] Ironically, many of the techniques Gary pioneered are being rediscovered now, ten years later. Apple and DEC are touting binary recompilation as a "new" technology for porting existing software to the PowerPC or Alpha architecture. Actually, DRI introduced an 8080-to-8086 binary recompiler in the early 1980s. […]
  13. Carlson, Jim; Huck, Jerry (2003). Itanium Rising: Breaking Through Moore's Second Law of Computing Power. Prentice Hall PTR. ISBN 978-0-13046415-6. Retrieved 2015-01-09.
  14. "HP ARIES Dynamic Binary Translator". HP. Archived from the original on 2015-01-10. Retrieved 2015-01-09.
  15. Stokes, Jon. "Transmeta Crusoe Explored". Ars Technica. Retrieved 2015-01-09.
  16. Hughes, Rob (2000-01-20). "Transmeta's Crusoe Microprocessor". geek.com. Archived from the original on 2007-09-27.
  17. "Transmeta Crusoe Processor Frequently Asked Questions FAQ". Transmeta. 2007. Archived from the original on 2007-01-10.

Further reading