Interpreted language

An interpreted language is a programming language for which most implementations execute instructions directly, without first compiling the program into machine-language instructions. The interpreter executes the program directly, translating each statement into a sequence of one or more subroutines and then into another language (often machine code).
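As a rough illustration of "translating each statement into a sequence of one or more subroutines", the sketch below interprets a made-up three-statement language. The language, the statement names (set, add, print) and every function name are assumptions invented for this example; no real interpreter is organized exactly this way.

```python
# Toy language with three statements; every name here is hypothetical.

def do_set(env, out, name, value):
    """Subroutine behind 'set NAME VALUE'."""
    env[name] = int(value)

def do_add(env, out, name, value):
    """Subroutine behind 'add NAME VALUE'."""
    env[name] += int(value)

def do_print(env, out, name):
    """Subroutine behind 'print NAME'; collects output instead of writing it."""
    out.append(env[name])

# Each statement keyword maps to the subroutine that carries it out.
SUBROUTINES = {"set": do_set, "add": do_add, "print": do_print}

def interpret(source):
    """Execute the program statement by statement, with no separate compile step."""
    env, out = {}, []
    for line in source.splitlines():
        if not line.strip():
            continue  # skip blank lines
        keyword, *args = line.split()
        SUBROUTINES[keyword](env, out, *args)  # dispatch to the subroutine
    return out

program = """
set x 2
add x 40
print x
"""
print(interpret(program))  # [42]
```

Nothing is translated ahead of time here: each statement is read, matched to a subroutine, and carried out before the next one is looked at.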

The terms interpreted language and compiled language are not well defined because, in theory, any programming language can be either interpreted or compiled. In modern programming language implementation, it is increasingly popular for a platform to provide both options.

Interpreted languages can also be contrasted with machine languages. Functionally, execution and interpretation mean the same thing: fetching the next instruction or statement from the program and executing it. Although interpreted bytecode is also identical to machine code in form and has an assembler representation, the term "interpreted" is sometimes reserved for "software-processed" languages (run by a virtual machine or emulator) on top of the native (i.e. hardware) processor.
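The fetch-and-execute cycle described above can be sketched as a small software-processed stack machine. The opcode names and the instruction encoding are assumptions made up for this example and do not correspond to any real virtual machine.

```python
# Hypothetical stack-machine bytecode; the opcodes are invented for this sketch.
PUSH, ADD, MUL, HALT = range(4)

def run(code):
    """Fetch the next instruction, execute it, and repeat until HALT."""
    stack = []
    pc = 0  # program counter: index of the next instruction
    while True:
        op = code[pc]  # fetch
        pc += 1
        if op == PUSH:  # the operand follows the opcode
            stack.append(code[pc])
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")

# Encodes (2 + 3) * 4
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
print(run(program))  # 20
```

A hardware processor performs the same loop in circuitry; here the loop is itself a program running on top of the native processor.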

In principle, programs in many languages may be compiled or interpreted, emulated or executed natively, so this designation is applied solely based on common implementation practice, rather than representing an essential property of a language.

Many languages have been implemented using both compilers and interpreters, including BASIC, C, Lisp, and Pascal. Java and C# are compiled into bytecode, a virtual-machine-friendly interpreted language. Lisp implementations can freely mix interpreted and compiled code.

The distinction between a compiler and an interpreter is not always well defined, and many language processors do a combination of both.

Historical background

In the early days of computing, language design was heavily influenced by the decision to use compiling or interpreting as a mode of execution. For example, Smalltalk (1980), which was designed to be interpreted at run-time, allows generic objects to dynamically interact with each other.

Initially, interpreted languages were compiled line-by-line; that is, each line was compiled as it was about to be executed, and if a loop or subroutine caused certain lines to be executed multiple times, they would be recompiled every time. This has become much less common. Most so-called interpreted languages use an intermediate representation, which combines compiling and interpreting.

The intermediate representation can be compiled once and for all (as in Java), each time before execution (as in Ruby), or each time a change in the source is detected before execution (as in Python).
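The difference between re-translating a line every time it runs and translating it once into an intermediate representation can be sketched with Python's built-in compile and exec functions. This is only an illustration on CPython; the variable names are arbitrary, and other implementations may handle the details differently.

```python
source = "total = total + i"

# Line-by-line style: the source text is translated again on every iteration.
env = {"total": 0}
for i in range(5):
    env["i"] = i
    exec(source, env)  # a string argument is compiled each time it is passed

# Intermediate-representation style: translate once, execute the result repeatedly.
code = compile(source, "<sketch>", "exec")  # code object holding bytecode
env2 = {"total": 0}
for i in range(5):
    env2["i"] = i
    exec(code, env2)  # no re-translation, only execution

print(env["total"], env2["total"])  # 10 10
```

Both loops compute the same result; the second one pays the translation cost once rather than on every pass through the loop.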

Advantages

Interpreting a language gives implementations some additional flexibility over compiled implementations. Features that are often easier to implement in interpreters than in compilers include:

  • platform independence (Java's bytecode, for example)
  • reflection and reflective use of the evaluator (e.g. a first-order eval function)
  • dynamic typing
  • smaller executable program size (since implementations have flexibility to choose the instruction code)
  • dynamic scoping

Furthermore, source code can be read and copied, giving users more freedom.
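Two of the features listed above, dynamic typing and reflective use of the evaluator, can be illustrated in a few lines of Python. The function name describe is hypothetical and exists only for this sketch.

```python
def describe(value):
    """Dynamic typing: one function accepts any type, inspected at run time."""
    return f"{type(value).__name__}: {value!r}"

# Reflective use of the evaluator: build an expression as text, then evaluate it.
expression = " + ".join(str(n) for n in [1, 2, 3])
result = eval(expression)  # evaluates the string "1 + 2 + 3" at run time

print(describe(result))      # int: 6
print(describe(expression))  # str: '1 + 2 + 3'
```

Because the evaluator is available while the program runs, code can be assembled and executed on the fly, which is harder to offer in an implementation that discards the translator after compilation.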

Disadvantages

Disadvantages of interpreted languages are:

  • Without static type checking, which is usually performed by a compiler, programs can be less reliable, because type checking eliminates a class of programming errors. Type checking can still be done with additional stand-alone tools; see TypeScript, for instance.
  • Interpreters can be susceptible to code injection attacks.
  • Execution is slower than direct native machine code execution on the host CPU. A technique used to improve performance is just-in-time (JIT) compilation, which converts frequently executed sequences of interpreted instructions to host machine code. JIT is most often combined with compilation to bytecode, as in Java.
  • Source code can be read and copied (e.g. JavaScript in web pages), or more easily reverse engineered through reflection in applications where intellectual property has a commercial advantage. In some cases, obfuscation is used as a partial defense against this.
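The code injection risk mentioned in the list above arises when untrusted text reaches the evaluator. The sketch below contrasts Python's eval with ast.literal_eval from the standard library; the function names unsafe_parse and safer_parse and the sample payload are assumptions for illustration, and the payload is deliberately harmless.

```python
import ast

def unsafe_parse(text):
    """Passes untrusted text straight to the evaluator: open to injection."""
    return eval(text)

def safer_parse(text):
    """Accepts only Python literals such as numbers, strings, lists and dicts."""
    return ast.literal_eval(text)

# Stands in for hostile input; a real attack would call something destructive.
payload = "__import__('os').getcwd()"

print(type(unsafe_parse(payload)))  # the injected call actually ran

try:
    safer_parse(payload)
except ValueError as exc:
    print("rejected:", exc)

print(safer_parse("[1, 2, 3]"))  # [1, 2, 3]
```

Restricting input to literals is one partial defense; it does not make evaluating untrusted input safe in general.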

Litmus tests

Several criteria can be used to determine whether a particular language is likely to be called compiled or interpreted by its users:

  • If a subroutine can be invoked prior to where it's defined in the source code, the entire source is likely being compiled to an intermediate representation before execution. Examples: Perl, Java
  • If an intermediate representation (e.g. bytecode) is typically created and invoked directly as a separate step when executing the code, the language is likely to be considered compiled. Examples: Java, C
  • If a syntax error in the source code doesn't prevent prior statements from being executed, it's likely an interpreted paradigm. Examples: Unix shell languages

These are not definitive. Compiled languages can have interpreter-like properties and vice versa.
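The third test can be tried against CPython, which is commonly called interpreted. In this sketch (the variable names are arbitrary), a syntax error on the last line prevents the earlier statements from running, because the whole source is compiled to bytecode before execution begins.

```python
executed = []

source = """
executed.append("first statement")
executed.append("second statement")
this is not valid syntax
"""

try:
    # exec compiles the entire string before running any of it.
    exec(source, {"executed": executed})
except SyntaxError:
    pass

print(executed)  # []
```

By this criterion CPython behaves like a compiled implementation, which illustrates why the tests are not definitive.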

List of frequently used interpreted languages

Languages usually compiled to bytecode

Many languages are first compiled to bytecode. Sometimes, bytecode can also be compiled to a native binary using an ahead-of-time (AOT) compiler, or executed natively by a hardware processor.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.