PyPy? is a reimplementation of Python in Python.
RPython is a restricted subset of Python, with restrictions on dynamic typing, reflection, and metaprogramming to enable type inference at compile time. The RPython compiler is written in Python. The PyPy? Python compiler/interpreter is written in a mixture of Python (for slow initialization) and RPython (for the fast part) (i think?).
I think it provides an extension API called CPyExt?, not sure though.
RPython:
Perl6 source code is parsed and the parse tree is annotated by firing "action methods" during parsing. The annotated AST is called a QAST. The QAST is then compiled to a virtual machine bytecode. Various virtual machines are planned to be supported, include Parrot, JVM, and MoarVM?. The compilation steps from source code to bytecode are implemented in a subset of Perl6 called 'NQP' (Not Quite Perl).
The Perl6 object model is being reimplemented in the 6model project.
Links:
Squeak runs on a VM.
The VM is implemented in Slang, a subset of Smalltalk that can be efficiently optimized.
Generally the implementation of a high-level language in a restricted 'core' version of that same language define the core by producing a statically typed variant and disallowing various metaprogramming constructs.
(todo move most of the above here)
We'll refer to the top of the stack as TOP0, and to the second position on the stack as TOP1, and to the third position as TOP2, etc.
stacks:
cpython: block stack
stack ops: cpython:
arithmetic:
stack-oriented
primitive types: int, long, short, byte, char, float, double, bool, reference
invokedynamic Dynalink (Dynamic Linker Framework)
Links:
Android has a non-canonical JVM interpreter called 'Dalvik'.
It doesn't support invokedynamic.
DLR for dynamic languages (built on top of CLR)
alternate, open implementation: Mono
https://en.wikipedia.org/wiki/List_of_CIL_instructions
Some issues with LLVM:
Two stacks:
Stack ops:
Arithmetic:
Links:
Parrot started out as a runtime for Perl6. Then it refocused on being an interoperable VM target for a variety of languages. However, it hasn't been very successful (due to not enough volunteers being motivated to spend enough hours hacking on it), and even Perl6 is moving away from it.
Even if unsuccessful, it is still of interest because it is one of the few VMs designed with interoperation between multiple HLLs in mind.
Note that multiple core Parrot devs claim that all core Parrot devs hate Parrot's object model: http://whiteknight.github.io/2011/09/10/dust_settles.html http://www.modernperlbooks.com/mt/2012/12/the-implementation-of-perl-5-versus-perl-6.html
It's register-based. It provides garbage collection.
It has a syntactic-sugar IR language called PIR (which handles register allocation and supports named registers), an assembly-language called PASM, an AST serialization format called PAST, and a bytecode called PBT.
Its objects are called PMCs (Polymorphic Containers).
Its set of opcodes are extensible (a program written in Parrot can define custom opcodes). Parrot itself contains a lot of opcodes: http://docs.parrot.org/parrot/devel/html/ops.html
At one point there was an effort called M0 to redefine things from a small, core set of opcodes but i don't know what happened to it; this appears to be the list of M0 opcodes: https://github.com/parrot/parrot/blob/m0/src/m0/m0.ops . I dunno if the M0 project is still ongoing, see http://leto.net/dukeleto.pl/2011/05/what-is-m0.html https://github.com/parrot/mole http://reparrot.blogspot.com/2011/07/m0-roadmap-goals-for-q4-2011.html http://gerdr.github.io/on-parrot/rethinking-m0.html . The repo seems to be at https://github.com/parrot/parrot/tree/m0 . There is also an IL that compiles to M0: https://github.com/parrot/m1/blob/master/docs/pddxx_m1.pod .
There was an earlier effort for some sort of core language called L1 http://wknight8111.blogspot.com/2009/06/l1-language-of-parrot-internals.html . Not sure what happened with that either.
Links:
MoarVM? is a VM built for Perl6's Rakudo implementation (the most canonical Perl6 implementation as of this writing).
Links:
From http://wiki.squeak.org/squeak/2267 , the operations available in Slang are:
"&" "
| " and: or: not |
Links:
The Smalltalk-80 Bytecodes Range Bits Function 0-15 0000iiii Push Receiver Variable #iiii 16-31 0001iiii Push Temporary Location #iiii 32-63 001iiiii Push Literal Constant #iiiii 64-95 010iiiii Push Literal Variable #iiiii 96-103 01100iii Pop and Store Receiver Variable #iii 104-111 01101iii Pop and Store Temporary Location #iii 112-119 01110iii Push (receiver, true, false, nil, -1, 0, 1, 2) [iii] 120-123 011110ii Return (receiver, true, false, nil) [ii] From Message 124-125 0111110i Return Stack Top From (Message, Block) [i] 126-127 0111111i unused 128 10000000 jjkkkkkk Push (Receiver Variable, Temporary Location, Literal Constant, Literal Variable) [jj] #kkkkkk 129 10000001 jjkkkkkk Store (Receiver Variable, Temporary Location, Illegal, Literal Variable) [jj] #kkkkkk 130 10000010 jjkkkkkk Pop and Store (Receiver Variable, Temporary Location, Illegal, Literal Variable) [jj] #kkkkkk 131 10000011 jjjkkkkk Send Literal Selector #kkkkk With jjj Arguments 132 10000100 jjjjjjjj kkkkkkkk Send Literal Selector #kkkkkkkk With jjjjjjjj Arguments 133 10000101 jjjkkkkk Send Literal Selector #kkkkk To Superclass With jjj Arguments 134 10000110 jjjjjjjj kkkkkkkk Send Literal Selector #kkkkkkkk To Superclass With jjjjjjjj Arguments 135 10000111 Pop Stack Top 136 10001000 Duplicate Stack Top 137 10001001 Push Active Context 138-143 unused 144-151 10010iii Jump iii + 1 (i.e., 1 through 8) 152-159 10011iii Pop and Jump 0n False iii +1 (i.e., 1 through 8) 160-167 10100iii jjjjjjjj Jump(iii - 4) *256+jjjjjjjj 168-171 101010ii jjjjjjjj Pop and Jump On True ii *256+jjjjjjjj 172-175 101011ii jjjjjjjj Pop and Jump On False ii *256+jjjjjjjj 176-191 1011iiii Send Arithmetic Message #iiii 192-207 1100iiii Send Special Message #iiii 208-223 1101iiii Send Literal Selector #iiii With No Arguments 224-239 1110iiii Send Literal Selector #iiii With 1 Argument 240-255 1111iiii Send Literal Selector #iiii With 2 Arguments
Links:
Links:
Links:
Links:
"OS threads, blocking IO, and other system things"
asynchronous I/O and some other stuff.
Used by nodejs, Rust language, Luvit, Julia, pyuv, MoarVM?.
For more inspiration about the sorts of instructions that might go into a VM, one might look at popular RISC CPU instruction sets.
My purpose in including this section is NOT to teach the reader the basics of assembly language and computer architecture; i assume that the reader already knows that. I just want to give you more food for thought about 'minimal' programming languages.
RISC is "Reduced Instruction Set Computer", in constrast to CISC, "Complex Instruction Set Computer". The difference between RISC and CISC is not that RISC necessarily has fewer instructions (although it often does), but rather that RISC instructions are less complex and typically can be executed within a single data memory cycle ( http://en.wikipedia.org/wiki/Reduced_instruction_set_computing#Instruction_set ). Note that this means that RISC instruction sets sometimes eschew operations which access main memory and also do something else, preferring to provide only load/store operations and not other ways of accessing main memory. In more detail:
" what exactly is a RISC processor? This turns out to be quite hard to answer. Here is a list of possible criteria that have been used in the past.
Instructions are conceptually simple — that is, no baroque things like `evaluate polynomial', or `edit string', both of which were found in the VAX.
Instructions are uniform length — as opposed, to say, the VAX or M68000 which have a wide range of instruction lengths.
Instructions use one, or very few, formats — again, unlike the VAX or M68000.
The instruction set is orthogonal — that is, there are no special rules about what operations are permitted with particular addressing modes (which would complicate the life of a compiler writer).
There is one, or very few, addressing modes.
The architecture is load-and-store — that is, only load and store operations access memory — all operate instructions (e.g. arithmetic) only operate on registers.
The architecture supports two (or perhaps a few more) datatypes — integer and floating point usually." -- http://euler.mat.uson.mx/~havillam/ca/CS323/0708.cs-323004.htmlAs of this writing, ARM is the most commerical successful RISC architecture. Other often-noted ones are SPARC, PowerPC?, and MIPS. Of these, some say that MIPS is the prototypical, most elegant example of RISC:
"MIPS is the cleanest successful RISC. PowerPC?