Table of Contents for Programming Languages: a survey
Other intermediate target languages designed for implementing a single high-level language
CPython bytecode
Two stacks:
- main stack
- block stack: "Per frame, there is a stack of blocks, denoting nested loops, try statements, and such."
Stack ops:
- various rots, dups
- in-place binary arithmetic on top0, top1 (doesn't consume top1)
Arithmetic:
- arithmetical negation (-6 -> 6), boolean negation (-6 -> False), inversion (-6 -> 5), unary positive (does nothing by default but can be overridden by __pos__ in custom classes)
instructions:
quoted and paraphrased from https://docs.python.org/release/3.3.2/library/dis.html
General instructions:
- NOP
- POP_TOP (pop), ROT_TWO, ROT_THREE, DUP_TOP, DUP_TOP_TWO
Unary operations:
Unary operations take the top of the stack, apply the operation, and push the result back on the stack.
- UNARY_POSITIVE (+TOS), UNARY_NEGATIVE, UNARY_NOT (not TOS), UNARY_INVERT (~TOS), GET_ITER (iter(TOS))
Binary operations:
Binary operations remove the top of the stack (TOS) and the second top-most stack item (TOS1) from the stack. They perform the operation, and put the result back on the stack.
- BINARY_POWER (TOS1 TOS), BINARY_MULTIPLY, BINARY_FLOOR_DIVIDE (TOS1 TOS), BINARY_TRUE_DIVIDE (TOS1 / TOS), BINARY_MODULO, BINARY_ADD, BINARY_SUBTRACT, BINARY_SUBSCR(TOS1[TOS]), BINARY_LSHIFT (TOS1 TOS), BINARY_AND, BINARY_XOR, BINARY_OR
In-place operations:
In-place operations are like binary operations, in that they remove TOS and TOS1, and push the result back on the stack, but the operation is done in-place when TOS1 supports it, and the resulting TOS may be (but does not have to be) the original TOS1.
INPLACE_POWER, INPLACE_MULTIPLY, INPLACE_FLOOR_DIVIDE, INPLACE_TRUE_DIVIDE, INPLACE_MODULO, INPLACE_ADD, INPLACE_SUBTRACT, INPLACE_LSHIFT, INPLACE_RSHIFT, INPLACE_AND, INPLACE_XOR, INPLACE_OR, STORE_SUBSCR (TOS1[TOS] = TOS2), DELETE_SUBSCR (del TOS1[TOS]),
Miscellaneous opcodes:
PRINT_EXPR
- BREAK_LOOP, CONTINUE_LOOP(target) (target is the address to jump to (which should be a FOR_ITER instruction))
- SET_ADD(i) (Calls set.add(TOS1[-i], TOS)), LIST_APPEND(i) (list.append(TOS[-i], TOS)), MAP_ADD(i) (dict.setitem(TOS1[-i], TOS, TOS1)). For all of the SET_ADD, LIST_APPEND and MAP_ADD instructions, while the added value or key/value pair is popped off, the container object remains on the stack so that it is available for further iterations of the loop.
- RETURN_VALUE (Returns with TOS to the caller of the function), YIELD_VALUE (Pops TOS and yields it from a generator), YIELD_FROM (Pops TOS and delegates to it as a subiterator from a generator)
- IMPORT_STAR (Loads all symbols not starting with '_' directly from the module TOS to the local namespace. The module is popped after loading all names. This opcode implements 'from module import *')
- POP_BLOCK (Removes one block from the block stack. Per frame, there is a stack of blocks, denoting nested loops, try statements, and such)
- POP_EXCEPT (Removes one block from the block stack. The popped block must be an exception handler block, as implicitly created when entering an except handler. In addition to popping extraneous values from the frame stack, the last three popped values are used to restore the exception state.)
- END_FINALLY (Terminates a finally clause. The interpreter recalls whether the exception has to be re-raised, or whether the function returns, and continues with the outer-next block)
- LOAD_BUILD_CLASS (builtins.__build_class__() onto the stack. It is later called by CALL_FUNCTION to construct a class)
- SETUP_WITH(delta) (This opcode performs several operations before a with block starts. First, it loads __exit__() from the context manager and pushes it onto the stack for later use by WITH_CLEANUP. Then, __enter__() is called, and a finally block pointing to delta is pushed. Finally, the result of calling the enter method is pushed onto the stack. The next opcode will either ignore it (POP_TOP), or store it in (a) variable(s) (STORE_FAST, STORE_NAME, or UNPACK_SEQUENCE).)
- WITH_CLEANUP (Cleans up the stack when a with statement block exits. TOS is the context manager’s __exit__() bound method. (see https://docs.python.org/release/3.3.2/library/dis.html#opcode-WITH_CLEANUP for details)
- STORE_LOCALS (Pops TOS from the stack and stores it as the current frame’s f_locals. This is used in class construction)
Miscellaneous opcodes with arguments:
All of the following opcodes expect arguments. An argument is two bytes, with the more significant byte last.
- STORE_NAME(namei) (name = TOS; namei is the index of name in the attribute co_names of the code object. The compiler tries to use STORE_FAST or STORE_GLOBAL if possible)
- DELETE_NAME(namei) (del name)
- UNPACK_SEQUENCE(count) (Unpacks TOS into count individual values, which are put onto the stack right-to-left)
- UNPACK_EX(counts) (Implements assignment with a starred target: Unpacks an iterable in TOS into individual values, where the total number of values can be smaller than the number of items in the iterable: one the new values will be a list of all leftover items)
- STORE_ATTR(namei) (TOS.name = TOS1)
- DELETE_ATTR(namei) (del TOS.name)
- STORE_GLOBAL(namei) (STORE_NAME, but stores the name as a global)
- DELETE_GLOBAL(namei)
- LOAD_CONST(consti) (Pushes co_consts[consti] onto the stack), LOAD_NAME(namei) (Pushes the value associated with co_names[namei] onto the stack)
- BUILD_TUPLE(count) (Creates a tuple consuming count items from the stack), BUILD_LIST(count), BUILD_SET(count), BUILD_MAP(count) (Pushes a new dictionary object onto the stack. The dictionary is pre-sized to hold count entries)
- LOAD_ATTR(namei) (getattr(TOS, co_names[namei]))
- COMPARE_OP(opname) (Performs a Boolean operation. The operation name can be found in cmp_op[opname])
- IMPORT_NAME(namei) (Imports the module co_names[namei]. TOS and TOS1 are popped and provide the fromlist and level arguments of __import__(). The module object is pushed onto the stack. The current namespace is not affected: for a proper import statement, a subsequent STORE_FAST instruction modifies the namespace)
- IMPORT_FROM(namei) (Loads the attribute co_names[namei] from the module found in TOS)
- JUMP_FORWARD(delta) (Increments bytecode counter by delta)
- POP_JUMP_IF_TRUE(target) (If TOS is true, sets the bytecode counter to target. TOS is popped)
- POP_JUMP_IF_FALSE(target)
- JUMP_IF_TRUE_OR_POP(target) (If TOS is true, sets the bytecode counter to target and leaves TOS on the stack. Otherwise (TOS is false), TOS is popped)
- JUMP_IF_FALSE_OR_POP(target) (If TOS is false, sets the bytecode counter to target and leaves TOS on the stack. Otherwise (TOS is true), TOS is popped)
- JUMP_ABSOLUTE(target) (Set bytecode counter to target)
- FOR_ITER(delta) (TOS is an iterator. Call its __next__() method. If this yields a new value, push it on the stack (leaving the iterator below it). If the iterator indicates it is exhausted TOS is popped, and the byte code counter is incremented by delta)
- LOAD_GLOBAL(namei) (Loads the global named co_names[namei] onto the stack)
- SETUP_LOOP(delta): Pushes a block for a loop onto the block stack. The block spans from the current instruction with a size of delta bytes.
- SETUP_EXCEPT(delta): Pushes a try block from a try-except clause onto the block stack. delta points to the first except block.
- SETUP_FINALLY(delta): Pushes a try block from a try-except clause onto the block stack. delta points to the finally block.
- STORE_MAP: Store a key and value pair in a dictionary. Pops the key and value while leaving the dictionary on the stack.
- LOAD_FAST(var_num), STORE_FAST(var_num), DELETE_FAST(var_num): local co_varnames[var_num]
- LOAD_CLOSURE(i): Pushes a reference to the cell contained in slot i of the cell and free variable storage. The name of the variable is co_cellvars[i] if i is less than the length of co_cellvars. Otherwise it is co_freevars[i - len(co_cellvars)].
- LOAD_DEREF(i) (Loads the cell contained in slot i of the cell and free variable storage. Pushes a reference to the object the cell contains on the stack), STORE_DEREF(i)
- DELETE_DEREF(i)
- RAISE_VARARGS(argc): Raises an exception. argc indicates the number of parameters to the raise statement, ranging from 0 to 3. The handler will find the traceback as TOS2, the parameter as TOS1, and the exception as TOS.
- CALL_FUNCTION(argc): Calls a function. The low byte of argc indicates the number of positional parameters, the high byte the number of keyword parameters. Pops all function arguments, and the function itself off the stack, and pushes the return value.
- MAKE_FUNCTION(argc): Pushes a new function object on the stack. TOS is the qualified name of the function; TOS1 is the code associated with the function. The function object is defined to have argc default parameters, which are found below TOS1.
- MAKE_CLOSURE(argc): Creates a new function object, sets its __closure__ slot, and pushes it on the stack. TOS is the qualified name of the function, TOS1 is the code associated with the function, and TOS2 is the tuple containing cells for the closure’s free variables. The function also has argc default parameters, which are found below the cells.
- BUILD_SLICE(argc): Pushes a slice object on the stack. argc must be 2 or 3. If it is 2, slice(TOS1, TOS) is pushed; if it is 3, slice(TOS2, TOS1, TOS) is pushed. See the slice() built-in function for more information.
todo
Prefixes any opcode which has an argument too big to fit into the default two bytes. ext holds two additional bytes which, taken together with the subsequent opcode’s argument, comprise a four-byte argument, ext being the two most-significant bytes.
CALL_FUNCTION_VAR(argc)
Calls a function. argc is interpreted as in CALL_FUNCTION. The top element on the stack contains the variable argument list, followed by keyword and positional arguments.
CALL_FUNCTION_KW(argc)
Calls a function. argc is interpreted as in CALL_FUNCTION. The top element on the stack contains the keyword arguments dictionary, followed by explicit keyword and positional arguments.
CALL_FUNCTION_VAR_KW(argc)
Calls a function. argc is interpreted as in CALL_FUNCTION. The top element on the stack contains the keyword arguments dictionary, followed by the variable-arguments tuple, followed by explicit keyword and positional arguments.
- HAVE_ARGUMENT: This is not really an opcode. It identifies the dividing line between opcodes which don’t take arguments < HAVE_ARGUMENT and those which do >= HAVE_ARGUMENT.
Note: as of Python 3.6, "The Python interpreter now uses a 16-bit wordcode instead of bytecode which made a number of opcode optimizations possible. (Contributed by Demur Rumed with input and reviews from Serhiy Storchaka and Victor Stinner in bpo-26647 and bpo-28050.)" [1].
(rationale: "The change is to have all instructions take an argument. This removes the branch on each instruction on whether to load oparg. It then also aligns instructions to always be 2 bytes rather than 1 or 3 by having arguments only take up 1 byte. In the case that an argument to an instruction is greater than 255, it can chain EXTENDED_ARG up to 3 times. In practice this rarely occurs, mostly only for jumps, & abarnert measured stdlib to be ~5% smaller
The rationale is that this offers 3 benefits: Smaller code size, simpler instruction iteration/indexing (One may now scan backwards, as peephole.c does in this patch), which between the two results in a small perf gain (the only way for perf to be negatively impacted is by an increase in EXTENDED_ARGs, when I post benchmarking I'll also post a count of how many more EXTENDED_ARGs are emitted)
This also means that if I want to create something like a tracer that tracks some information for each instruction, I can allocate an array of codesize/2 bytes, then index off of half the instruction index. This isn't currently done in peephole.c, nor does this include halving jump opargs
" ) -- [2]