RPython
RPython is a restricted subset of Python, with restrictions on dynamic typing, reflection, and metaprogramming to enable type inference at compile time.
RPython was built as part of the PyPy project, but this section is only about RPython considered as a language in itself. The PyPy project has two goals: (1) a reimplementation of Python in RPython (see the relevant section of Implementation Case Studies), and (2) a generic toolkit for programming language implementation based on RPython (see PyPy (seen as a toolkit for creating programming languages)).
RPython restrictions
Notes on some of RPython's restrictions (some of these are direct quotes from the documentation):
- each variable must have a single type (at each 'control flow point'), except that you can "mix None (basically with the role of a null pointer) with many other types: wrapped objects, class instances, lists, dicts, strings, etc. but not with int, floats or tuples."
- all module globals are considered constants (in the sense that their binding may not change at runtime)
- global (i.e. prebuilt) lists and dictionaries must not be modified; however global instances may be modified
- for loops restricted to builtin types
- ('range' is supported)
- no run-time definition of classes or functions
- "generators are supported, but their exact scope is very limited. you can’t merge two different generator in one control point."
- Exceptions are supported, except that exceptions implicitly raised locally (not bubbling up from a called function, nor explicitly raised or re-raised) are ignored unless there is an exception handler for them.
- (int, float, bool are supported)
- strings: in some cases (esp. slicing), negative indices are not supported. No implicit str-to-unicode cast.
- % is supported but the format string must be known at translation time, and the only "formatting specifiers are %s, %d, %x, %o, %f, plus %r but only for user-defined instances. Modifiers such as conversion flags, precision, length etc. are not supported. Moreover, it is forbidden to mix unicode and strings when formatting."
- only fixed-length tuples. Note: "There is no general way to convert a list into a tuple, because the length of the result would not be known statically. (You can of course do t = (lst[0], lst[1], lst[2]) if you know that lst has got 3 items.)"
- lists: in some cases (esp. slicing), negative indices are not supported. Slice assignment cannot change the length of a list. +, +=, in, *, *=, ==, !=, append, index, insert, extend, reverse, pop are supported on lists
- dicts: dicts must have a hashable, unique key type. No custom hash functions or custom equality (there is a library function for custom hash functions though).
- no sets
- functions: No keyword arguments. *args may be used in a declaration, but not to call a function with a dynamic number of arguments. A function variable must have an unchanging signature.
- Class attributes are unchanging after startup. Class methods must have an unchanging signature.
- Single inheritance (plus mixins).
- (classes are first-class objects)
- supported special methods: __init__, __del__, __len__, __getitem__, __setitem__, __getslice__, __setslice__, and __iter__. To handle slicing, __getslice__ and __setslice__ must be used; using __getitem__ and __setitem__ for slicing isn’t supported.
- "__del__ should only contain simple operations; for any kind of more complex destructor, consider using instead rpython.rlib.rgc.FinalizerQueue."
- "int, float, str, ord, chr... are available as simple conversion functions. Note that int, float, str... have a special meaning as a type inside of isinstance only."
- builtin functions: range, enumerate, reversed, bool, int, float, chr, unichr, unicode, bytearray, hasattr, tuple, list, zip, min, max.
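A few of the restrictions above can be illustrated with ordinary Python that stays inside them. This is only a sketch: the names Counter, STATE, bump, find_name and first_three are hypothetical, and the code has been written to run as plain Python; it has not been checked against the RPython translator.

```python
# Plain Python written in the restricted style described above.
# All names here are made up for illustration.

class Counter(object):
    def __init__(self):
        self.value = 0          # this attribute always holds an int


# A prebuilt global instance. Its attributes may change at run time,
# whereas a global list or dict must not be modified.
STATE = Counter()


def bump():
    STATE.value += 1
    return STATE.value


def find_name(names, wanted):
    # 'result' holds either a str or None; mixing None with str is allowed,
    # but mixing None with an int would not be.
    result = None
    for name in names:          # for loop over a builtin list
        if name == wanted:
            result = name
    return result


def first_three(lst):
    # Tuples need a statically known length, so the tuple is built
    # item by item instead of calling tuple(lst).
    return (lst[0], lst[1], lst[2])
```

Mutable global state lives in the attributes of STATE rather than in a module-level list or dict, and every variable keeps one type (optionally mixed with None) along each control path.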
alternate list from [1] (todo combine with above):
- "variables should contain values of at most one type as described in Object restrictions at each control flow point, that means for example that joining control paths using the same variable to contain both a string and a int must be avoided."
- "all module globals are considered constants. Their binding must not be changed at run-time. Moreover, global (i.e. prebuilt) lists and dictionaries are supposed to be immutable: modifying e.g. a global list will give inconsistent results. However, global instances don’t have this restriction, so if you need mutable global state, store it in the attributes of some prebuilt singleton instance."
- "for loops restricted to builtin types"
- "generators very restricted."
- "range does not necessarily create an array, only if the result is modified"
- "run-time definition of classes or functions is not allowed."
- "generators are supported, but their exact scope is very limited. you can’t merge two different generator in one control point."
- "exceptions fully supported"
- integer, float, boolean work
- strings: "a lot of, but not all string methods are supported and those that are supported, not necessarily accept all arguments. Indexes can be negative. In case they are not, then you get slightly more efficient code if the translator can prove that they are non-negative. When slicing a string it is necessary to prove that the slice start and stop indexes are non-negative. There is no implicit str-to-unicode cast anywhere. Simple string formatting using the % operator works, as long as the format string is known at translation time; the only supported formatting specifiers are %s, %d, %x, %o, %f, plus %r but only for user-defined instances. Modifiers such as conversion flags, precision, length etc. are not supported."
- tuples: "no variable-length tuples...Each combination of types for elements and length constitute a separate and not mixable type."
- lists: "lists are used as an allocated array....if you use a fixed-size list, the code is more efficient....Negative or out-of-bound indexes are only allowed for the most common operations..."
- dicts: "dicts with a unique key type only, provided it is hashable"
- "sets are not directly supported in RPython. Instead you should use a plain dict and fill the values with None. Values in that dict will not consume space."
- "function declarations may use defaults and *args, but not keywords."
- "function calls may be done to a known function or to a variable one, or to a method....If you need to call a function with a dynamic number of arguments, refactor the function itself to accept a single argument which is a regular list."
- "A number of builtin functions can be used. The precise set can be found in rpython/annotator/builtin.py (see def builtin_xxx())....int, float, str, ord, chr... are available as simple conversion functions. Note that int, float, str... have a special meaning as a type inside of isinstance only."
- note: the current set is: range, enumerate, reversed, hasattr, zip, min, max. Plus conversions: bool, int, float, chr, unichr, bytearray, tuple, list
- "methods and other class attributes do not change after startup...single inheritance is fully supported...classes are first-class objects too"
- "The only special methods that are honoured are __init__, __del__, __len__, __getitem__, __setitem__, __getslice__, __setslice__, and __iter__. To handle slicing, __getslice__ and __setslice__ must be used; using __getitem__ and __setitem__ for slicing isn’t supported. Additionally, using negative indices for slicing is still not support, even when using __getslice__. Note that the destructor __del__ should only contain simple operations; for any kind of more complex destructor, consider using instead rpython.rlib.rgc.FinalizerQueue."
- "Exceptions are by default not generated for simple cases...Code with no exception handlers does not raise exceptions...By supplying an exception handler, you ask for error checking. Without, you assure the system that the operation cannot fail. This rule does not apply to function calls: any called function is assumed to be allowed to raise any exception...Exceptions explicitly raised or re-raised will always be generated"
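Three of the workarounds quoted above (a dict standing in for a set, a single list argument instead of a dynamic number of arguments, and an exception handler used to request error checking) might look roughly like this. The function names are hypothetical, and the code is ordinary Python written in the restricted style, not output verified by the translator.

```python
def unique_words(words):
    # Emulate a set with a plain dict whose values are all None.
    seen = {}
    for w in words:
        seen[w] = None
    return seen


def total(values):
    # Takes one regular list rather than a dynamic number of arguments.
    result = 0
    for v in values:
        result += v
    return result


def safe_get(items, index):
    # Writing the handler is what asks for the IndexError check; per the
    # notes above, without a handler the operation is assumed not to fail.
    try:
        return items[index]
    except IndexError:
        return -1
```

Membership tests such as `"a" in unique_words(["a", "b", "a"])` then work on the dict keys, just as they would on a set.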
Integer Types: how RPython deals with the need for unboxed integers
"Starting with Python 2.4, integers mutate into longs on overflow. In contrast, we need a way to perform wrap-around machine-sized arithmetic by default, while still being able to check for overflow when we need it explicitly. Moreover, we need a consistent behavior" when RPython is run as RPython vs. when it is run as Python. To get control over this, RPython uses:
- ovfcheck: "This special function should only be used with a single arithmetic operation as its argument, e.g. z = ovfcheck(x+y). Its intended meaning is to perform the given operation in overflow-checking mode."
- intmask: "This function is used for wrap-around arithmetic."
- r_uint: A class for machine-sized unsigned arithmetic.
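To make the intended semantics concrete, here is a pure-Python emulation of the three helpers. It is a sketch under stated assumptions: a 64-bit machine word is assumed, and intmask_sketch, ovfcheck_sketch and r_uint_sketch are hypothetical stand-ins written for this note, not the actual RPython library functions (in particular, the real r_uint is a class, whereas the stand-in only wraps a value).

```python
WORD_BITS = 64                        # assumption: 64-bit machine word
MAX_INT = 2 ** (WORD_BITS - 1) - 1
MIN_INT = -2 ** (WORD_BITS - 1)
WORD_MASK = (1 << WORD_BITS) - 1


def intmask_sketch(n):
    # Wrap an arbitrary Python integer to a signed machine-sized integer.
    n &= WORD_MASK
    if n > MAX_INT:
        n -= 1 << WORD_BITS
    return n


def ovfcheck_sketch(result):
    # Check the result of a single arithmetic operation, e.g.
    # z = ovfcheck_sketch(x + y), and raise if it does not fit in a word.
    if result > MAX_INT or result < MIN_INT:
        raise OverflowError("result does not fit in a machine word")
    return result


def r_uint_sketch(n):
    # Wrap an arbitrary Python integer to an unsigned machine-sized value.
    return n & WORD_MASK
```

With these definitions, intmask_sketch(MAX_INT + 1) wraps around to MIN_INT, while ovfcheck_sketch(MAX_INT + 1) raises OverflowError.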