Can machine code be translated to a different architecture?

11

1

So this is kind of related to a question about running a Windows server on ARM. The premise of my question is: can machine code be translated from one architecture to another in order to execute a binary on an architecture different from the one it was compiled to run on?

QEMU and other emulators can translate the instructions on the fly, and therefore run an executable on a computer it wasn't compiled for. Why not do this translation ahead of time, instead of on the fly in order to speed up the process? From my somewhat limited knowledge of assembly, most of the instructions like MOV, ADD and others should be portable across architectures.

Anything that doesn't have a direct mapping could be mapped to some other set of instructions, since all machines are Turing complete. Would doing this be too complicated? Would it not work at all for some reason I'm unfamiliar with? Would it work, but yield no better results than using an emulator?

Kibbee

Posted 2011-08-26T20:52:28.997

Reputation: 1 372

The technique has likely fallen into disfavor because (in addition to its flakiness) it isn't needed much. Portability/standardization is (slightly) better these days (if only because Wintel has taken over the world), and, where cross-machine emulation is really needed (eg, for a phone emulator in an app development environment), direct emulation provides a more reliable and accurate result. Plus, processors are fast enough that the cost of emulation is not as severe a problem as in the past. – Daniel R Hicks – 2011-08-27T11:59:46.783

Answers

6

The short answer: you can't translate a compiled, linked executable. While technically possible, it's highly impractical (see below). However, if you have the assembly source file (containing the instructions and labels), it is very possible to do. (Although if you somehow have the assembly source, then unless the program was written in assembly, you should have the original source code as well, and you'd be better off simply compiling that for the other architecture.)


The long answer:

QEMU and other emulators can translate the instructions on the fly, and therefore run an executable on a computer it wasn't compiled for. Why not do this translation ahead of time, instead of on the fly in order to speed up the process?

I know it might seem easy in principle, but in practice, it's nearly impossible for a few main reasons. To start, different instruction sets use largely different addressing modes, different opcode structures, different word sizes, and some don't even have the instructions you need.

Let's say you need to replace instruction XYZ with two instructions, ABC and DEF. You've now effectively shifted all of the relative/offset addresses in the program from that point on, so you would need to go through the entire program and update the offsets (both before and after the change). Now suppose one of the offsets changes significantly enough that you need a different addressing mode, which might change the size of the address. That again forces you to re-scan the entire file and re-compute all of the addresses, and so on and so forth.

When you write assembly programs you might use labels, but the CPU doesn't: when the file is assembled, all of the labels are resolved to relative, absolute, or offset locations. You can see why this quickly becomes a non-trivial task, and next to impossible in practice. Replacing a single instruction might require you to pass through the entire program hundreds of times before moving on.
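To make the offset problem concrete, here is a toy sketch in Python. The "ISA" and mnemonics (XYZ, ABC, DEF) are invented for illustration; the point is that replacing one instruction with two forces every relative branch that crosses the change to be re-patched:

```python
# Toy model of static translation: a "program" is a list of (mnemonic,
# operand) pairs, and a JMP operand is a relative offset in instructions.
# The ISA and all mnemonics here are invented for illustration only.

def new_pos(p, insert_index, growth):
    """Where old instruction index p lands after 'growth' extra
    instructions are inserted just after insert_index."""
    return p + growth if p > insert_index else p

def translate(program, insert_index, replacement):
    """Replace the single instruction at insert_index with 'replacement'
    (a list of instructions) and re-patch every relative JMP."""
    growth = len(replacement) - 1
    out = []
    for i, (op, arg) in enumerate(program):
        if i == insert_index:
            out.extend(replacement)
            continue
        if op == "JMP":
            # Recompute the offset from the shifted source and target.
            arg = (new_pos(i + arg, insert_index, growth)
                   - new_pos(i, insert_index, growth))
        out.append((op, arg))
    return out

program = [
    ("JMP", 3),   # skip ahead to RET
    ("XYZ", 0),   # no direct equivalent on the target ISA
    ("ADD", 1),
    ("RET", 0),
]

# XYZ has to become two target instructions, ABC and DEF:
result = translate(program, 1, [("ABC", 0), ("DEF", 0)])
print(result)
# [('JMP', 4), ('ABC', 0), ('DEF', 0), ('ADD', 1), ('RET', 0)]
```

Note that even this toy ignores the hard part the answer describes: in a real binary there are no `(mnemonic, operand)` tuples, only raw bytes, so you first have to correctly identify every instruction boundary and every branch target.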

From my somewhat limited knowledge of assembly, most of the instructions like MOV, ADD and others should be portable across architectures.

Yes, but look at the issues I outlined above. What about the machine's word size? Address length? Does it even have the same addressing modes? Again, you can't just "find and replace" instructions. Every segment of a program has a specifically defined address. Jumps to other labels are replaced with literal or offset memory addresses when a program is assembled.

Anything that doesn't have a direct mapping could be mapped to some other set of instructions, since all machines are Turing Complete. Would doing this be too complicated? Would it not work at all for some reason I'm unfamiliar with? Would it work, but yield no better results than using an emulator?

You're 100% correct that it is both possible and that the result would run a lot faster. However, writing a program to accomplish this is incredibly difficult and impractical, if only because of the issues I outlined above.

If you had the actual assembly source code, it would be trivial to translate the machine code to another instruction set architecture. Machine code itself, however, is assembled, so without the assembly source (which contains various labels used to compute memory addresses), it becomes incredibly difficult. Again, changing a single instruction might change memory offsets in the entire program, and require hundreds of passes to re-compute the addresses.

Doing this for a program with a few thousand instructions could require many thousands of passes. For relatively small programs this may be feasible, but remember that the number of passes grows rapidly with the number of machine instructions in the program. For any program of decent size, it's near impossible.

Breakthrough

Posted 2011-08-26T20:52:28.997

Reputation: 32 927

Essentially what one has to do is "decompile" or "disassemble" the source object code. For relatively straightforward code (especially code generated by certain compilers or code generation packages where there is a known "style") the re-insertion of labels and the like is fairly simple. Certainly, however, newer highly-optimizing compilers would generate code that is far harder to "grok" this way. – Daniel R Hicks – 2011-08-26T23:34:08.513

@DanH if you have the source object code, you pretty much have the assembly source (not the machine code). The object file contains named (read: labeled) sequences of machine code to be linked together. The problem comes when you link the object code files into an executable. Those smaller segments can be handled (or reverse engineered) much more easily than an entire linked executable. – Breakthrough – 2011-08-30T14:38:15.133

Certainly, certain object file formats make the job a bit easier. Some may even contain debugging info, allowing you to restore most of the labels. Others are less helpful. In some cases much of this info is preserved even in the linked file format, in other cases not. There are a tremendous number of different file formats. – Daniel R Hicks – 2011-08-31T00:36:21.330

2

Yes, what you suggest can be and has been done. It's not too common, and I don't know of any current systems that use the technique, but it's definitely well within the realm of technical feasibility.

It used to be done a lot to enable porting code from one system to another, before anyone had achieved even the crude "portability" we have now. It required complex analysis of the "source" and could be stymied by self-modifying code and other oddball practices, but it was still done.

More recently, systems like the IBM System/38 -- iSeries -- System i have taken advantage of the portability of intermediate code (similar to Java bytecodes) stored with compiled programs to enable portability between incompatible instruction set architectures.
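The intermediate-code idea can be sketched in a few lines: the compiler ships an architecture-neutral form, and each machine finishes the lowering to its own instruction set. Everything below is invented for illustration (a tiny stack IR and two fake backends), not the actual System/38 format:

```python
# Minimal sketch of the intermediate-code idea (all names invented):
# programs are shipped as an architecture-neutral stack IR, and each
# machine lowers that IR to its own instruction set locally.

IR_PROGRAM = [("push", 2), ("push", 3), ("add", None)]  # computes 2 + 3

# Two hypothetical backends lowering the same IR to different ISAs.
LOWERINGS = {
    "fake_x86": {
        "push": "PUSH {0}",
        "add": "POP EBX; POP EAX; ADD EAX, EBX; PUSH EAX",
    },
    "fake_arm": {
        "push": "MOV R0, #{0}; PUSH {{R0}}",
        "add": "POP {{R0, R1}}; ADD R0, R0, R1; PUSH {{R0}}",
    },
}

def lower(ir, target):
    """Translate the neutral IR into (fake) assembly for 'target'."""
    table = LOWERINGS[target]
    return [table[op].format(arg) for op, arg in ir]

def interpret(ir):
    """Reference semantics of the IR, independent of any ISA."""
    stack = []
    for op, arg in ir:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
    return stack[-1]

print(interpret(IR_PROGRAM))          # 5 on every architecture
print(lower(IR_PROGRAM, "fake_arm"))
```

This is exactly why the approach sidesteps the offset-patching problem from the other answer: the shipped form still has its "labels" (structured operations), so nothing has to be reverse engineered out of raw bytes.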

Daniel R Hicks

Posted 2011-08-26T20:52:28.997

Reputation: 5 783

Agree that this has been done, usually with much older (simpler) instruction sets. There was an IBM project in the 1970s to convert old 7xx binary programs to System/360. – sawdust – 2011-08-26T23:01:18.447

1

Machine code itself is architecture specific.

Languages that allow for easy portability across multiple architectures (Java is probably the most well known) tend to be very high level, requiring interpreters or frameworks to be installed on a machine in order for them to work.

These frameworks or interpreters are written for each specific system architecture they'll run on and so are not, in and of themselves, any more portable than a "normal" program.

music2myear

Posted 2011-08-26T20:52:28.997

Reputation: 34 957

2Compiled languages are portable too, not just interpreted languages, it is the compiler that is architecture specific as it is what ultimately translates the code to what the platform it is on can recognize. The only difference is that compiled languages are translated at compile time and interpreted languages are translated line by line as needed. – MaQleod – 2011-08-26T22:46:24.840

1

Absolutely, it's possible. What is machine code? It's just the language that a particular computer understands. Think of yourself as the computer, trying to understand a book written in German. You can't do it, because you don't understand the language. Now, if you were to take a German dictionary and look up the word "Kopf", you would see it translates to the English word "head". The dictionary you used is what's called an emulation layer in the computer world. Easy, right?

Well, it gets more difficult. Take the German word "Schadenfreude" and translate it to English. You will see there is no equivalent word in the English language, but there is a definition. The same problem exists in the computer world: translating things that don't have an equivalent word. This makes direct ports difficult, as the developers of the emulation layer have to interpret what that word means and make the host computer understand it. Sometimes it just doesn't work the way one would expect. We have all seen funny translations of books, phrases, etc. on the internet, right?
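The dictionary analogy maps directly onto what a simple translation layer looks like in code. A minimal sketch in Python, with made-up mnemonics: one "word" translates one-to-one, another has no single equivalent and must expand into a whole phrase, and a missing entry is where the port fails:

```python
# The "dictionary" from the analogy: a made-up source ISA on the left,
# a made-up target ISA on the right. "Kopf" -> "head" is the easy
# one-to-one case; the phrase expansion is the "Schadenfreude" case.
DICTIONARY = {
    "MOVE": ["MOV"],                 # direct one-to-one mapping
    "INCR": ["ADD #1"],              # close equivalent, small rewrite
    "POLY": ["MUL", "ADD", "MOV"],   # no equivalent word: expands to a phrase
}

def emulate(source_instructions):
    """Translate each source instruction via the dictionary, or fail
    loudly when the 'word' simply isn't in it."""
    target = []
    for ins in source_instructions:
        if ins not in DICTIONARY:
            raise ValueError(f"no translation for {ins!r}")
        target.extend(DICTIONARY[ins])
    return target

print(emulate(["MOVE", "POLY"]))   # ['MOV', 'MUL', 'ADD', 'MOV']
```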

Keltari

Posted 2011-08-26T20:52:28.997

Reputation: 57 019

1

The process you describe is called static recompilation, and it has been done, just not in a generally applicable way. In other words, it's more than possible: it's been done many times, though it has required manual work.

There are many historical examples worth researching, though they do less to illustrate the modern concerns. I've found two examples that should make even the thorough sceptics question anyone who claims that everything hard is impossible.

First, this author did a full static translation of both the architecture and the platform for an NES ROM: http://andrewkelley.me/post/jamulator.html

He makes some very good points, but concludes that JIT is still more practical for that situation. And that is probably the situation most people have in mind: taking no shortcuts, demanding full cycle accuracy, and using essentially no ABI at all. If that were all there was to it, we could throw the concept in the trash and call it a day, but it isn't and never was. How do we know this? Because the successful projects didn't use that approach.

Now for the less obvious possibility: leverage the platform you already have. StarCraft on a Linux ARM handheld? Yup, the approach works when you don't constrain the task to exactly what you'd do dynamically. By using Winelib, the Windows platform calls are all native; all we have to worry about is the architecture.

http://www.geek.com/games/starcraft-has-been-reverse-engineered-to-run-on-arm-1587277/

I'd bet dollars to donuts that the slowdown is almost negligible, considering that the Pandora ARM handheld is only a bit stronger than the Pi. The tools he used are in this repository:

https://github.com/notaz/ia32rtools

That developer decompiled very manually. I believe the process could be automated to take significantly less work, but for now it remains a labor of love. Don't let anyone tell you something isn't possible; don't even let me tell you it isn't practical. It could become practical, as soon as someone innovates a way to make it so.

J. M. Becker

Posted 2011-08-26T20:52:28.997

Reputation: 593

0

Theoretically, yes, this can be done. The bigger problem that comes into play is translating an application written for one operating system (or kernel) to another. There are significant differences between the low-level operations of the Windows, Linux, OS X, and iOS kernels, which all applications for those platforms have to use.

Once again, theoretically, one could write an application that could decompose a program along with all of the machine code of the operating system it was compiled to run on, and then recompile all of that machine code for another device. However, that would be highly illegal in just about every case, and extremely difficult to write. In fact, the gears in my head are starting to seize up just thinking about it.

UPDATE

A couple of comments below seem to disagree with my response; however, I think they are missing my point. To my knowledge, there is no application that can take a sequence of executable bytes for one architecture, decompose it at the byte level (including all necessary calls to external libraries and to the underlying OS kernel), reassemble it for another system, and save the resulting executable. In other words, there is no application that could take something as simple as Notepad.exe, decompose the small 190 KB file that it is, and 100% reassemble it into an application that could run on Linux or OS X.

It is my understanding that the asker wanted to know: if we can virtualize software or run applications through programs like Wine or Parallels, why can't we simply re-translate the byte-code for different systems? The reason is that if you want to fully reassemble an application for another architecture, you must decompose all of the byte-code it takes to run it before reassembling it. There is more to every application than just the .exe file on, say, a Windows machine. All Windows applications use low-level Windows kernel objects and functions to create menus and text areas, resize windows, draw to the display, send and receive OS messages, and so on.

All of that byte-code must be disassembled if you want to reassemble the application and get it to run on a different architecture.

Applications like Wine interpret Windows binaries at the byte level. They recognize calls to the kernel and translate those calls to related Linux functions, or they emulate the Windows environment. But that isn't a byte-for-byte (or opcode-for-opcode) retranslation; it is more of a function-for-function translation, and that is quite a bit different.
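The function-for-function idea can be sketched in a few lines. To be clear, this is not how Wine is implemented, just an illustration of the distinction: instead of translating opcodes, the layer recognizes a named API call and dispatches it to a native stand-in:

```python
# Illustration only: a function-for-function compatibility layer, in the
# spirit of (but not the implementation of) Wine. The Windows API names
# are real; the "native" translations are simplified stand-ins.
import os

def _native_message_box(title, text):
    # A real layer would call the host GUI toolkit; print stands in here.
    print(f"[{title}] {text}")

def _native_get_current_dir():
    return os.getcwd()

# The layer is essentially a big table: API name -> native function.
API_TABLE = {
    "MessageBoxA": _native_message_box,
    "GetCurrentDirectoryA": _native_get_current_dir,
}

def call_windows_api(name, *args):
    """Recognize a call to the 'Windows' API and dispatch it natively."""
    try:
        return API_TABLE[name](*args)
    except KeyError:
        raise NotImplementedError(f"{name} is not provided by this layer")

call_windows_api("MessageBoxA", "Hello", "translated at the function level")
```

The contrast with the static-recompilation answers above is the whole point: this table has a few dozen to a few thousand entries (one per API function), not one entry per machine instruction in the binary.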

RLH

Posted 2011-08-26T20:52:28.997

Reputation: 3 525

It isn't theoretical at all, and there are plenty of applications that run foreign binaries on different operating systems. Have you heard of Wine? It runs Windows binaries on other OSes, such as Linux, Solaris, Mac OS X, BSD, and others. – Keltari – 2011-08-26T23:18:09.933

The difference in operating systems can easily be finessed on most systems by using a hypervisor to run multiple operating systems (or to run a "layer" such as Wine on one system emulating another). AFAIK, all "modern" non-embedded processors are "virtualizable", so this requires no instruction set emulation/translation. – Daniel R Hicks – 2011-08-27T12:03:35.313

0

It seems all the experts are missing this point: the "translation" is complex but well suited to a computer (it requires no intelligence, just labour). But after translation, the program still needs OS support; for example, GetWindowVersion does not exist on Linux. This is normally supplied by the emulator (which is very large). So you could "pre-translate" a simple program, but you would have to link it against a huge library for it to run independently. Imagine every Windows program coming with its own kernel.dll + user.dll + shell.dll...

qak

Posted 2011-08-26T20:52:28.997

Reputation: 11

It's not just laborious, it requires intelligence. For example, say you see some computation whose result determines the address you jump to, which may be in the middle of something that appears to be a single instruction. – David Schwartz – 2014-07-06T09:44:40.847