
Local variable storage and operand stack

The JVM uses the common system of stack frames, where each stack frame contains local variables. The stack frame also contains an 'operand stack', which is the stack that JVM instructions manipulate. The JVM has instructions which load a local variable onto the operand stack (e.g. aload for object references, iload for ints, etc.), or store a value from the operand stack into a local variable (e.g. astore, istore).
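To make the load/store pattern above concrete, here is a minimal sketch of a single frame in Java. The class ToyFrame and its method names are hypothetical and only mirror the JVM mnemonics iload, istore and iadd; this is a toy model, not how any real JVM lays out frames.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of one stack frame: a local variable array plus an operand stack.
// Illustrative only; ToyFrame is a made-up name, not a JVM API.
final class ToyFrame {
    final int[] locals;                                  // local variable slots
    final Deque<Integer> operands = new ArrayDeque<>();  // operand stack

    ToyFrame(int maxLocals) {
        locals = new int[maxLocals];
    }

    // local variable -> operand stack
    void iload(int index) {
        operands.push(locals[index]);
    }

    // operand stack -> local variable
    void istore(int index) {
        locals[index] = operands.pop();
    }

    // pop two ints, push their sum
    void iadd() {
        int right = operands.pop();
        int left = operands.pop();
        operands.push(left + right);
    }

    // Mirrors the shape of: static int add(int a, int b) { int c = a + b; return c; }
    static int add(int a, int b) {
        ToyFrame f = new ToyFrame(3);
        f.locals[0] = a;          // arguments arrive in the first local slots
        f.locals[1] = b;
        f.iload(0);
        f.iload(1);
        f.iadd();
        f.istore(2);              // c = a + b
        f.iload(2);
        return f.operands.pop();  // stands in for ireturn
    }

    public static void main(String[] args) {
        System.out.println(ToyFrame.add(2, 3));
    }
}
```

Every arithmetic step goes through the operand stack; the locals are only touched by the load and store operations.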

" The word size must be large enough to hold a value of type byte, short, int, char, float, returnAddress, or reference. Two words must be large enough to hold a value of type long or double. An implementation designer must therefore choose a word size that is at least 32 bits, but otherwise can pick whatever word size will yield the most efficient implementation. The word size is often chosen to be the size of a native pointer on the host platform.

...the local variables and operand stack-- are defined in terms of words....

As they run, Java programs cannot determine the word size of their host virtual machine implementation. " -- [1]
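One visible consequence of the "two words for long or double" rule is that such values occupy two consecutive local variable slots. The sketch below computes the first slot index of each parameter of a static method (for an instance method, slot 0 would hold the receiver and everything shifts by one). The class SlotLayout and its string-based type names are hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: where do a static method's parameters land in the local variable array?
// Assumes the simplified rule "long and double take two slots, everything else one".
final class SlotLayout {
    static List<Integer> slotIndices(String... paramTypes) {
        List<Integer> indices = new ArrayList<>();
        int next = 0;
        for (String type : paramTypes) {
            indices.add(next);
            boolean twoSlots = type.equals("long") || type.equals("double");
            next += twoSlots ? 2 : 1;
        }
        return indices;
    }

    public static void main(String[] args) {
        // static void f(int a, long b, int c): a -> 0, b -> 1 (and 2), c -> 3
        System.out.println(slotIndices("int", "long", "int")); // prints [0, 1, 3]
    }
}
```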


Safety: Bytecode verification

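The JVM guarantees that the bytecode it runs cannot crash the VM itself with a segfault or an invalid pointer dereference; dereferencing null raises a NullPointerException inside the program rather than bringing down the process. This simple idea is an innovative one, particularly in light of the JVM's high performance.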

The bytecode is syntactically capable of expressing type errors (including, I think, errors that could cause a segfault if executed, e.g. attempting to dereference something that is not a valid pointer). However, the JVM verifies the bytecode before executing it, ensuring that it typechecks. The low-level expressiveness of the bytecode allows relatively good performance, while the verification allows an entity to call bytecode within the JVM while trusting that the JVM itself will not crash, even if the bytecode was created by a malicious actor (of course, if calling into native code supplied by a malicious actor is permitted, anything could happen).
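As a rough illustration of "verify before executing", the sketch below simulates only the types on the operand stack for a straight-line instruction sequence and rejects anything ill-typed without ever running it. The class ToyVerifier is hypothetical and handles just four real mnemonics (iconst_0, aconst_null, iadd, pop); the real verifier also tracks local variables, control flow, and much more.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy type-level simulation of straight-line bytecode.
// Illustrative only: not the JVM's verification algorithm.
final class ToyVerifier {
    enum Type { INT, REF }

    // Returns true if every instruction finds operands of the right type.
    static boolean verify(String... code) {
        Deque<Type> stack = new ArrayDeque<>();
        for (String op : code) {
            switch (op) {
                case "iconst_0":
                    stack.push(Type.INT);
                    break;
                case "aconst_null":
                    stack.push(Type.REF);
                    break;
                case "iadd":
                    // needs two ints on top of the stack
                    if (stack.size() < 2) return false;
                    if (stack.pop() != Type.INT || stack.pop() != Type.INT) return false;
                    stack.push(Type.INT);
                    break;
                case "pop":
                    if (stack.isEmpty()) return false;
                    stack.pop();
                    break;
                default:
                    return false; // unknown opcode in this toy
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(verify("iconst_0", "iconst_0", "iadd"));    // prints true
        System.out.println(verify("iconst_0", "aconst_null", "iadd")); // prints false
    }
}
```

The second program is the kind of thing the bytecode format can express but a verifier refuses: adding an int to a reference.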

One might think that this would significantly slow down initialization, but in fact, a 40% speedup of the verifier only translated to a 5% speedup of large program startup time [2], suggesting that verification is only a small portion of total startup time.

CLR has a similar feature. LLVM does not.


Memory pools

I read somewhere (todo) that upon startup, the JVM pre-allocates a bunch of memory (a "memory pool" from which it can sub-allocate to programs later). This is because when a program needs memory later, sub-allocating from a memory pool will be faster than going back to the OS and asking for more memory. However, the cost is that the JVM grabs a bunch of memory initially which it may or may not actually need. I read somewhere that this is a big contributor to the perception that the JVM is a memory hog (not entirely inaccurate, if it is 'hogging' more memory than it needs). Again, like the prioritization of throughput over latency, this is a design decision that makes the JVM good for long-running servers on beefy machines, but not so good for underpowered desktop clients.
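The gap between memory reserved by the JVM and memory actually in use can be observed from inside a program with the standard java.lang.Runtime methods totalMemory(), freeMemory() and maxMemory(). The sketch below is only a rough probe: the class name HeapReport is made up, and the numbers depend on the JVM implementation, the garbage collector, and flags such as HotSpot's -Xms (initial heap) and -Xmx (maximum heap).

```java
// Rough probe of heap reserved by the JVM versus heap in use.
// HeapReport is a hypothetical name; the Runtime calls are standard.
final class HeapReport {
    // Returns { used, reserved, max } in bytes.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        long reserved = rt.totalMemory();        // heap currently held by the JVM
        long used = reserved - rt.freeMemory();  // part of it occupied by objects
        long max = rt.maxMemory();               // ceiling the heap may grow to
        return new long[] { used, reserved, max };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        long mib = 1024 * 1024;
        System.out.println("used     MiB: " + s[0] / mib);
        System.out.println("reserved MiB: " + s[1] / mib);
        System.out.println("max      MiB: " + s[2] / mib);
    }
}
```

For a trivial program like this one, "reserved" is typically much larger than "used", which is the effect described above.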

The issue with JSR/RET

Apparently the JSR and RET instructions have been deprecated, because they make code harder to verify.
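For context, and as far as I understand it, the main use of JSR/RET was compiling 'finally' blocks: older compilers could emit the cleanup code once as a subroutine and jump to it from every exit path, whereas current javac emits a separate copy of the cleanup code on each path. A subroutine reached from several places with differently-typed locals is the sort of thing that complicates verification. The sketch below is ordinary Java source with two exit paths through one finally block; the class FinallyDemo is hypothetical.

```java
// Two exit paths (normal return and caught exception) share one finally block.
// With JSR/RET the cleanup could be a single shared subroutine;
// without it, a compiler can duplicate the cleanup on each path.
final class FinallyDemo {
    static int parse(String text, StringBuilder log) {
        try {
            return Integer.parseInt(text);   // exit path 1: normal return
        } catch (NumberFormatException e) {
            return -1;                       // exit path 2: after the handler
        } finally {
            log.append("cleanup;");          // runs on both paths
        }
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        System.out.println(parse("42", log));   // prints 42
        System.out.println(parse("oops", log)); // prints -1
        System.out.println(log);                // prints cleanup;cleanup;
    }
}
```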


Stack maps

In the process of verification, 'stack maps' must be created; a stack map says, at a given point in the code, for each position in the stack, what data type the data at that position is. In other words, we're doing type inference on all temporary variables created during the program. This step used to be the responsibility of the JVM verifier, but in version 7 it changed to be the responsibility of whoever generates the bytecode classfile, presumably in order to save time for the JVM at class load time.
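To illustrate why shipping stack maps with the classfile can make the verifier's job cheaper, here is a toy checker that makes one linear pass over the code and merely compares its running stack state against frames declared at branch targets, instead of inferring those frames itself. Everything here is a simplification under stated assumptions: ToyStackMapChecker, the mini instruction set ("iconst", "aconst", "ifeq N", "goto N", "ireturn") and the string encoding of stack types ("I" for int, "A" for reference) are made up for this sketch, and real frames describe local variables as well.

```java
import java.util.HashMap;
import java.util.Map;

// Toy single-pass check against declared stack frames (hypothetical format).
final class ToyStackMapChecker {
    // frames maps an instruction index to the declared stack types at that point.
    static boolean check(String[] code, Map<Integer, String> frames) {
        String stack = "";         // current stack types, top of stack at the end
        boolean reachable = true;  // can control fall through to this instruction?
        for (int pc = 0; pc < code.length; pc++) {
            String declared = frames.get(pc);
            if (declared != null) {
                // fall-through state must agree with the declared frame
                if (reachable && !declared.equals(stack)) return false;
                stack = declared;
                reachable = true;
            } else if (!reachable) {
                return false;      // code after goto/ireturn needs a declared frame
            }
            String[] parts = code[pc].split(" ");
            String op = parts[0];
            switch (op) {
                case "iconst": stack += "I"; break;
                case "aconst": stack += "A"; break;
                case "ifeq":
                case "ireturn":
                    if (!stack.endsWith("I")) return false;   // needs an int on top
                    stack = stack.substring(0, stack.length() - 1);
                    break;
                case "goto": break;
                default: return false;                        // unknown instruction
            }
            if (op.equals("ifeq") || op.equals("goto")) {
                // state at the jump must match the frame declared at the target
                int target = Integer.parseInt(parts[1]);
                if (!stack.equals(frames.get(target))) return false;
            }
            if (op.equals("goto") || op.equals("ireturn")) reachable = false;
        }
        return true;
    }

    public static void main(String[] args) {
        String[] code = { "iconst", "ifeq 4", "iconst", "goto 5", "iconst", "ireturn" };
        Map<Integer, String> frames = new HashMap<>();
        frames.put(4, "");   // empty stack at the else-branch
        frames.put(5, "I");  // one int on the stack at the join point
        System.out.println(check(code, frames)); // prints true
    }
}
```

The price is visible in the example too: whoever produces the code must also produce the frames map, which is the burden on bytecode-emitting tools.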

Some people opined that this was a mistake, because the cost of this minor speedup was making it much harder to write bytecode-emitting tools, which is worse for the JVM ecosystem.

JVM stack map format

JVM limits

JVM implementations and variants


embedded: todo cp info from proj-oot-lowEndTargets-embeddableJvmAndJavaImplementations; todo some of these may be Java, not JVM