Designer | Sun Microsystems |
---|---|
Bits | 32-bit |
Introduced | 1994 |
Version | 20.0.1 [1] |
Type | Stack and register–register |
Encoding | Variable |
Branching | Compare and branch |
Endianness | Big |
Open | Yes |
Registers | |
General-purpose | Per-method operand stack (up to 65535 operands) plus per-method local variables (up to 65535) |
A Java virtual machine (JVM) is a virtual machine that enables a computer to run Java programs as well as programs written in other languages that are also compiled to Java bytecode. The JVM is detailed by a specification that formally describes what is required in a JVM implementation. Having a specification ensures interoperability of Java programs across different implementations so that program authors using the Java Development Kit (JDK) need not worry about idiosyncrasies of the underlying hardware platform.
The JVM reference implementation is developed by the OpenJDK project as open source code and includes a JIT compiler called HotSpot. The commercially supported Java releases available from Oracle are based on the OpenJDK runtime. Eclipse OpenJ9 is another open source JVM for OpenJDK.
The Java virtual machine is an abstract (virtual) computer defined by a specification. It is a part of the Java runtime environment. The garbage collection algorithm used and any internal optimization of the Java virtual machine instructions (their translation into machine code) are not specified. The main reason for this omission is to not unnecessarily constrain implementers. Any Java application can be run only inside some concrete implementation of the abstract specification of the Java virtual machine. [2]
Starting with Java Platform, Standard Edition (J2SE) 5.0, changes to the JVM specification have been developed under the Java Community Process as JSR 924. [3] As of 2006 [update] , changes to the specification to support changes proposed to the class file format (JSR 202) [4] are being done as a maintenance release of JSR 924. The specification for the JVM was published as the blue book, [5] whose preface states:
We intend that this specification should sufficiently document the Java Virtual Machine to make possible compatible clean-room implementations. Oracle provides tests that verify the proper operation of implementations of the Java Virtual Machine.
One of Oracle's JVMs is named HotSpot; the other, inherited from BEA Systems, is JRockit. Oracle owns the Java trademark and may allow its use to certify implementation suites as fully compatible with Oracle's specification.
One of the organizational units of JVM byte code is a class. A class loader implementation must be able to recognize and load anything that conforms to the Java class file format. Any implementation is free to recognize other binary forms besides class files, but it must recognize class files.
The class loader performs three basic activities in this strict order:
In general, there are three types of class loader: bootstrap class loader, extension class loader and System / Application class loader.
Every Java virtual machine implementation must have a bootstrap class loader that is capable of loading trusted classes, as well as an extension class loader or application class loader. The Java virtual machine specification does not specify how a class loader should locate classes.
The JVM operates on specific types of data as specified in Java Virtual Machine specifications. The data types can be divided [6] into primitive types (integers, Floating-point, long etc.) and Reference types. The earlier JVM were only 32-bit machines. long
and double
types, which are 64-bits, are supported natively, but consume two units of storage in a frame's local variables or operand stack, since each unit is 32 bits. boolean
, byte
, short
, and char
types are all sign-extended (except char
which is zero-extended) and operated on as 32-bit integers, the same as int
types. The smaller types only have a few type-specific instructions for loading, storing, and type conversion. boolean
is operated on as 8-bit byte
values, with 0 representing false
and 1 representing true
. (Although boolean
has been treated as a type since The Java Virtual Machine Specification, Second Edition clarified this issue, in compiled and executed code there is little difference between a boolean
and a byte
except for name mangling in method signatures and the type of boolean arrays. boolean
s in method signatures are mangled as Z
while byte
s are mangled as B
. Boolean arrays carry the type boolean[]
but use 8 bits per element, and the JVM has no built-in capability to pack booleans into a bit array, so except for the type they perform and behave the same as byte
arrays. In all other uses, the boolean
type is effectively unknown to the JVM as all instructions to operate on booleans are also used to operate on byte
s.) However, newer JVM releases, such as the OpenJDK HotSpot JVM, support 64-bit architecture. Consequently, you can install a 32-bit or 64-bit JVM on a 64-bit operating system. The primary advantage of running Java in a 64-bit environment is the larger address space. This allows for a much larger Java heap size and an increased maximum number of Java Threads, which is needed for certain kinds of large applications; however there is a performance hit in using 64-bit JVM compared to 32-bit JVM.
The JVM has a garbage-collected heap for storing objects and arrays. Code, constants, and other class data are stored in the "method area". The method area is logically part of the heap, but implementations may treat the method area separately from the heap, and for example might not garbage collect it. Each JVM thread also has its own call stack (called a "Java Virtual Machine stack" for clarity), which stores frames. A new frame is created each time a method is called, and the frame is destroyed when that method exits.
Each frame provides an "operand stack" and an array of "local variables". The operand stack is used for operands to run computations and for receiving the return value of a called method, while local variables serve the same purpose as registers and are also used to pass method arguments. Thus, the JVM is both a stack machine and a register machine. In practice, HotSpot eliminates every stack besides the native thread/call stack even when running in Interpreted mode, as its Templating Interpreter technically functions as a compiler.
The JVM has instructions for the following groups of tasks:
The aim is binary compatibility. Each particular host operating system needs its own implementation of the JVM and runtime. These JVMs interpret the bytecode semantically the same way, but the actual implementation may be different. More complex than just emulating bytecode is compatibly and efficiently implementing the Java core API that must be mapped to each host operating system.
These instructions operate on a set of common abstracted data types rather the native data types of any specific instruction set architecture.
A JVM language is any language with functionality that can be expressed in terms of a valid class file which can be hosted by the Java Virtual Machine. A class file contains Java Virtual Machine instructions (Java byte code) and a symbol table, as well as other ancillary information. The class file format is the hardware- and operating system-independent binary format used to represent compiled classes and interfaces. [7]
There are several JVM languages, both old languages ported to JVM and completely new languages. JRuby and Jython are perhaps the most well-known ports of existing languages, i.e. Ruby and Python respectively. Of the new languages that have been created from scratch to compile to Java bytecode, Clojure, Groovy, Scala and Kotlin may be the most popular ones. A notable feature with the JVM languages is that they are compatible with each other, so that, for example, Scala libraries can be used with Java programs and vice versa. [8]
Java 7 JVM implements JSR 292: Supporting Dynamically Typed Languages [9] on the Java Platform, a new feature which supports dynamically typed languages in the JVM. This feature is developed within the Da Vinci Machine project whose mission is to extend the JVM so that it supports languages other than Java. [10] [11]
A basic philosophy of Java is that it is inherently safe from the standpoint that no user program can crash the host machine or otherwise interfere inappropriately with other operations on the host machine, and that it is possible to protect certain methods and data structures belonging to trusted code from access or corruption by untrusted code executing within the same JVM. Furthermore, common programmer errors that often led to data corruption or unpredictable behavior such as accessing off the end of an array or using an uninitialized pointer are not allowed to occur. Several features of Java combine to provide this safety, including the class model, the garbage-collected heap, and the verifier.
The JVM verifies all bytecode before it is executed. This verification consists primarily of three types of checks:
The first two of these checks take place primarily during the verification step that occurs when a class is loaded and made eligible for use. The third is primarily performed dynamically, when data items or methods of a class are first accessed by another class.
The verifier permits only some bytecode sequences in valid programs, e.g. a jump (branch) instruction can only target an instruction within the same method. Furthermore, the verifier ensures that any given instruction operates on a fixed stack location, [12] allowing the JIT compiler to transform stack accesses into fixed register accesses. Because of this, that the JVM is a stack architecture does not imply a speed penalty for emulation on register-based architectures when using a JIT compiler. In the face of the code-verified JVM architecture, it makes no difference to a JIT compiler whether it gets named imaginary registers or imaginary stack positions that must be allocated to the target architecture's registers. In fact, code verification makes the JVM different from a classic stack architecture, of which efficient emulation with a JIT compiler is more complicated and typically carried out by a slower interpreter. Additionally, the Interpreter used by the default JVM is a special type known as a Template Interpreter, which translates bytecode directly to native, register based machine language rather than emulate a stack like a typical interpreter. [13] In many aspects the HotSpot Interpreter can be considered a JIT compiler rather than a true interpreter, meaning the stack architecture that the bytecode targets is not actually used in the implementation, but merely a specification for the intermediate representation that can well be implemented in a register based architecture. Another instance of a stack architecture being merely a specification and implemented in a register based virtual machine is the Common Language Runtime. [14]
The original specification for the bytecode verifier used natural language that was incomplete or incorrect in some respects. A number of attempts have been made to specify the JVM as a formal system. By doing this, the security of current JVM implementations can more thoroughly be analyzed, and potential security exploits prevented. It will also be possible to optimize the JVM by skipping unnecessary safety checks, if the application being run is proven to be safe. [15]
A virtual machine architecture allows very fine-grained control over the actions that code within the machine is permitted to take. It assumes the code is "semantically" correct, that is, it successfully passed the (formal) bytecode verifier process, materialized by a tool, possibly off-board the virtual machine. This is designed to allow safe execution of untrusted code from remote sources, a model used by Java applets, and other secure code downloads. Once bytecode-verified, the downloaded code runs in a restricted "sandbox", which is designed to protect the user from misbehaving or malicious code. As an addition to the bytecode verification process, publishers can purchase a certificate with which to digitally sign applets as safe, giving them permission to ask the user to break out of the sandbox and access the local file system, clipboard, execute external pieces of software, or network.
Formal proof of bytecode verifiers have been done by the Javacard industry (Formal Development of an Embedded Verifier for Java Card Byte Code [16] )
For each hardware architecture a different Java bytecode interpreter is needed. When a computer has a Java bytecode interpreter, it can run any Java bytecode program, and the same program can be run on any computer that has such an interpreter.
When Java bytecode is executed by an interpreter, the execution will always be slower than the execution of the same program compiled into native machine language. This problem is mitigated by just-in-time (JIT) compilers for executing Java bytecode. A JIT compiler may translate Java bytecode into native machine language while executing the program. The translated parts of the program can then be executed much more quickly than they could be interpreted. This technique gets applied to those parts of a program frequently executed. This way a JIT compiler can significantly speed up the overall execution time.
There is no necessary connection between the Java programming language and Java bytecode. A program written in Java can be compiled directly into the machine language of a real computer and programs written in other languages than Java can be compiled into Java bytecode.
Java bytecode is intended to be platform-independent and secure. [17] Some JVM implementations do not include an interpreter, but consist only of a just-in-time compiler. [18]
At the start of the Java platform's lifetime, the JVM was marketed as a web technology for creating Rich Web Applications. As of 2018 [update] , most web browsers and operating systems bundling web browsers do not ship with a Java plug-in, nor do they permit side-loading any non-Flash plug-in. The Java browser plugin was deprecated in JDK 9. [19]
The NPAPI Java browser plug-in was designed to allow the JVM to execute so-called Java applets embedded into HTML pages. For browsers with the plug-in installed, the applet is allowed to draw into a rectangular region on the page assigned to it. Because the plug-in includes a JVM, Java applets are not restricted to the Java programming language; any language targeting the JVM may run in the plug-in. A restricted set of APIs allow applets access to the user's microphone or 3D acceleration, although applets are not able to modify the page outside its rectangular region. Adobe Flash Player, the main competing technology, works in the same way in this respect.
As of June 2015 [update] according to W3Techs, Java applet and Silverlight use had fallen to 0.1% each for all web sites, while Flash had fallen to 10.8%. [20]
Since May 2016, JavaPoly allows users to import unmodified Java libraries, and invoke them directly from JavaScript. JavaPoly allows websites to use unmodified Java libraries, even if the user does not have Java installed on their computer. [21]
With the continuing improvements in JavaScript execution speed, combined with the increased use of mobile devices whose web browsers do not implement support for plugins, there are efforts to target those users through transpilation to JavaScript. It is possible to either transpile the source code or JVM bytecode to JavaScript.
Compiling the JVM bytecode, which is universal across JVM languages, allows building upon the language's existing compiler to bytecode. The main JVM bytecode to JavaScript transpilers are TeaVM, [22] the compiler contained in Dragome Web SDK, [23] Bck2Brwsr, [24] and j2js-compiler. [25]
Leading transpilers from JVM languages to JavaScript include the Java-to-JavaScript transpiler contained in Google Web Toolkit, Clojurescript (Clojure), GrooScript (Apache Groovy), Scala.js (Scala) and others. [26]
Java is a high-level, class-based, object-oriented programming language that is designed to have as few implementation dependencies as possible. It is a general-purpose programming language intended to let programmers write once, run anywhere (WORA), meaning that compiled Java code can run on all platforms that support Java without the need to recompile. Java applications are typically compiled to bytecode that can run on any Java virtual machine (JVM) regardless of the underlying computer architecture. The syntax of Java is similar to C and C++, but has fewer low-level facilities than either of them. The Java runtime provides dynamic capabilities that are typically not available in traditional compiled languages.
In computer programming, a P-code machine is a virtual machine designed to execute P-code, the assembly language or machine code of a hypothetical central processing unit (CPU). The term "P-code machine" is applied generically to all such machines, as well as specific implementations using those machines. One of the most notable uses of P-Code machines is the P-Machine of the Pascal-P system. The developers of the UCSD Pascal implementation within this system construed the P in P-code to mean pseudo more often than portable; they adopted a unique label for pseudo-code meaning instructions for a pseudo-machine.
Java Platform, Standard Edition is a computing platform for development and deployment of portable code for desktop and server environments. Java SE was formerly known as Java 2 Platform, Standard Edition (J2SE).
In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them previously to have been compiled into a machine language program. An interpreter generally uses one of the following strategies for program execution:
Bytecode is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references that encode the result of compiler parsing and performing semantic analysis of things like type, scope, and nesting depths of program objects.
A Java class file is a file containing Java bytecode that can be executed on the Java Virtual Machine (JVM). A Java class file is usually produced by a Java compiler from Java programming language source files containing Java classes. If a source file has more than one class, each class is compiled into a separate class file. Thus, it is called a .class file because it contains the bytecode for a single class.
A Java compiler is a compiler for the Java programming language.
javac is the primary Java compiler included in the Java Development Kit (JDK) from Oracle Corporation. Martin Odersky implemented the GJ compiler, and his implementation became the basis for javac.
HotSpot, released as Java HotSpot Performance Engine, is a Java virtual machine for desktop and server computers, developed by Sun Microsystems which was purchased by and became a division of Oracle Corporation in 2010. Its features improved performance via methods such as just-in-time compilation and adaptive optimization. It is the de facto Java Virtual Machine, serving as the reference implementation of the Java programming language.
Application virtualization software refers to both application virtual machines and software responsible for implementing them. Application virtual machines are typically used to allow application bytecode to run portably on many different computer architectures and operating systems. The application is usually run on the computer using an interpreter or just-in-time compilation (JIT). There are often several implementations of a given virtual machine, each covering a different set of functions.
Java is a set of computer software and specifications that provides a software platform for developing application software and deploying it in a cross-platform computing environment. Java is used in a wide variety of computing platforms from embedded devices and mobile phones to enterprise servers and supercomputers. Java applets, which are less common than standalone Java applications, were commonly run in secure, sandboxed environments to provide many features of native applications through being embedded in HTML pages.
The Java programming language and Java software platform have been criticized for design choices including the implementation of generics, forced object-oriented programming, the handling of unsigned numbers, the implementation of floating-point arithmetic, and a history of security vulnerabilities in the primary Java VM implementation, HotSpot. Software written in Java, especially its early versions, has been criticized for its performance compared to software written in other programming languages. Developers have also remarked that differences in various Java implementations must be taken into account when writing complex Java programs that must work with all of them.
The Java language has undergone several changes since JDK 1.0 as well as numerous additions of classes and packages to the standard library. Since J2SE 1.4, the evolution of the Java language has been governed by the Java Community Process (JCP), which uses Java Specification Requests (JSRs) to propose and specify additions and changes to the Java platform. The language is specified by the Java Language Specification (JLS); changes to the JLS are managed under JSR 901. In September 2017, Mark Reinhold, chief Architect of the Java Platform, proposed to change the release train to "one feature release every six months" rather than the then-current two-year schedule. This proposal took effect for all following versions, and is still the current release schedule.
In software development, the programming language Java was historically considered slower than the fastest third-generation typed languages such as C and C++. In contrast to those languages, Java compiles by default to a Java Virtual Machine (JVM) with operations distinct from those of the actual computer hardware. Early JVM implementations were interpreters; they simulated the virtual operations one-by-one rather than translating them into machine code for direct hardware execution.
Eclipse OpenJ9 is a high performance, scalable, Java virtual machine (JVM) implementation that is fully compliant with the Java Virtual Machine Specification.
Dalvik is a discontinued process virtual machine (VM) in the Android operating system that executes applications written for Android. Dalvik was an integral part of the Android software stack in the Android versions 4.4 "KitKat" and earlier, which were commonly used on mobile devices such as mobile phones and tablet computers, and more in some devices such as smart TVs and wearables. Dalvik is open-source software, originally written by Dan Bornstein, who named it after the fishing village of Dalvík in Eyjafjörður, Iceland.
The Da Vinci Machine, also called the Multi Language Virtual Machine, was a Sun Microsystems project aiming to prototype the extension of the Java Virtual Machine (JVM) to add support for dynamic languages.
The Java Development Kit (JDK) is a distribution of Java technology by Oracle Corporation. It implements the Java Language Specification (JLS) and the Java Virtual Machine Specification (JVMS) and provides the Standard Edition (SE) of the Java Application Programming Interface (API). It is derivative of the community driven OpenJDK which Oracle stewards. It provides software for working with Java applications. Examples of included software are the Java virtual machine, a compiler, performance monitoring tools, a debugger, and other utilities that Oracle considers useful for Java programmers.
Java bytecode is the instruction set of the Java virtual machine (JVM), the language to which Java and other JVM-compatible source code is compiled. Each instruction is represented by a single byte, hence the name bytecode, making it a compact form of data.
GraalVM is a Java Development Kit (JDK) written in Java. The open-source distribution of GraalVM is based on OpenJDK, and the enterprise distribution is based on Oracle JDK. As well as just-in-time (JIT) compilation, GraalVM can compile a Java application ahead of time. This allows for faster initialization, greater runtime performance, and decreased resource consumption, but the resulting executable can only run on the platform it was compiled for.