proj-plbook-plChSingleIntermedLangs

Difference between revision 56 and current revision

No diff available.

Table of Contents for Programming Languages: a survey

Other intermediate target languages designed for implementing a single high-level language


CPython bytecode

Two stacks:

Stack ops:

Arithmetic:

instructions:

quoted and paraphrased from https://docs.python.org/release/3.3.2/library/dis.html

General instructions:

Unary operations:

Unary operations take the top of the stack, apply the operation, and push the result back on the stack.

Binary operations:

Binary operations remove the top of the stack (TOS) and the second top-most stack item (TOS1) from the stack. They perform the operation, and put the result back on the stack.

In-place operations:

In-place operations are like binary operations, in that they remove TOS and TOS1, and push the result back on the stack, but the operation is done in-place when TOS1 supports it, and the resulting TOS may be (but does not have to be) the original TOS1.

INPLACE_POWER, INPLACE_MULTIPLY, INPLACE_FLOOR_DIVIDE, INPLACE_TRUE_DIVIDE, INPLACE_MODULO, INPLACE_ADD, INPLACE_SUBTRACT, INPLACE_LSHIFT, INPLACE_RSHIFT, INPLACE_AND, INPLACE_XOR, INPLACE_OR, STORE_SUBSCR (TOS1[TOS] = TOS2), DELETE_SUBSCR (del TOS1[TOS]),

Miscellaneous opcodes:

PRINT_EXPR

Miscellaneous opcodes with arguments:

All of the following opcodes expect arguments. An argument is two bytes, with the more significant byte last.

todo

    Prefixes any opcode which has an argument too big to fit into the default two bytes. ext holds two additional bytes which, taken together with the subsequent opcode’s argument, comprise a four-byte argument, ext being the two most-significant bytes.

CALL_FUNCTION_VAR(argc)

    Calls a function. argc is interpreted as in CALL_FUNCTION. The top element on the stack contains the variable argument list, followed by keyword and positional arguments.

CALL_FUNCTION_KW(argc)

    Calls a function. argc is interpreted as in CALL_FUNCTION. The top element on the stack contains the keyword arguments dictionary, followed by explicit keyword and positional arguments.

CALL_FUNCTION_VAR_KW(argc)

    Calls a function. argc is interpreted as in CALL_FUNCTION. The top element on the stack contains the keyword arguments dictionary, followed by the variable-arguments tuple, followed by explicit keyword and positional arguments.

Note: as of Python 3.6, "The Python interpreter now uses a 16-bit wordcode instead of bytecode which made a number of opcode optimizations possible. (Contributed by Demur Rumed with input and reviews from Serhiy Storchaka and Victor Stinner in bpo-26647 and bpo-28050.)" [1].

(rationale: "The change is to have all instructions take an argument. This removes the branch on each instruction on whether to load oparg. It then also aligns instructions to always be 2 bytes rather than 1 or 3 by having arguments only take up 1 byte. In the case that an argument to an instruction is greater than 255, it can chain EXTENDED_ARG up to 3 times. In practice this rarely occurs, mostly only for jumps, & abarnert measured stdlib to be ~5% smaller

The rationale is that this offers 3 benefits: Smaller code size, simpler instruction iteration/indexing (One may now scan backwards, as peephole.c does in this patch), which between the two results in a small perf gain (the only way for perf to be negatively impacted is by an increase in EXTENDED_ARGs, when I post benchmarking I'll also post a count of how many more EXTENDED_ARGs are emitted)

This also means that if I want to create something like a tracer that tracks some information for each instruction, I can allocate an array of codesize/2 bytes, then index off of half the instruction index. This isn't currently done in peephole.c, nor does this include halving jump opargs

" ) -- [2]