Table of Contents for Programming Languages: a survey
This chapter is about the LLVM IR programming language. For information about the other parts of LLVM (the language implementation toolkit), see LLVM.
" LLVM IR is designed to host mid-level analyses and transformations that you find in the optimizer section of a compiler. It was designed with many specific goals in mind, including supporting lightweight runtime optimizations, cross-function/interprocedural optimizations, whole program analysis, and aggressive restructuring transformations, etc. The most important aspect of it, though, is that it is itself defined as a first class language with well-defined semantics. To make this concrete, here is a simple example of a .ll file:
define i32 @add1(i32 %a, i32 %b) {
entry:
  %tmp1 = add i32 %a, %b
  ret i32 %tmp1
}

define i32 @add2(i32 %a, i32 %b) {
entry:
  %tmp1 = icmp eq i32 %a, 0
  br i1 %tmp1, label %done, label %recurse

recurse:
  %tmp2 = sub i32 %a, 1
  %tmp3 = add i32 %b, 1
  %tmp4 = call i32 @add2(i32 %tmp2, i32 %tmp3)
  ret i32 %tmp4

done:
  ret i32 %b
}
This LLVM IR corresponds to this C code, which provides two different ways to add integers:
unsigned add1(unsigned a, unsigned b) {
  return a+b;
}

// Perhaps not the most efficient way to add two numbers.
unsigned add2(unsigned a, unsigned b) {
  if (a == 0) return b;
  return add2(a-1, b+1);
}
As you can see from this example, LLVM IR is a low-level RISC-like virtual instruction set. Like a real RISC instruction set, it supports linear sequences of simple instructions like add, subtract, compare, and branch. These instructions are in three address form, which means that they take some number of inputs and produce a result in a different register. LLVM IR supports labels and generally looks like a weird form of assembly language.
Unlike most RISC instruction sets, LLVM is strongly typed with a simple type system (e.g., i32 is a 32-bit integer, i32** is a pointer to pointer to 32-bit integer) and some details of the machine are abstracted away. For example, the calling convention is abstracted through call and ret instructions and explicit arguments. Another significant difference from machine code is that the LLVM IR doesn't use a fixed set of named registers, it uses an infinite set of temporaries named with a % character. " -- http://www.aosabook.org/en/llvm.html
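The "three address form" and the infinite set of %-named temporaries described above can be illustrated with a toy lowering pass. This is a sketch, not LLVM's actual implementation; the `lower` function and the tuple AST format are made up for illustration:

```python
# Toy illustration of three-address form with fresh, SSA-style
# temporaries, in the spirit of LLVM IR's %-named values.
# NOT LLVM's implementation -- just a sketch of the idea.

def lower(expr, counter=None, out=None):
    """Lower a nested tuple AST like ('add', '%a', ('mul', '%b', '%c'))
    into three-address instructions, each defining a fresh temporary."""
    if counter is None:
        counter, out = [0], []
    if isinstance(expr, str):          # a leaf: already a value
        return expr, out
    op, lhs, rhs = expr
    l, _ = lower(lhs, counter, out)
    r, _ = lower(rhs, counter, out)
    counter[0] += 1
    tmp = "%%tmp%d" % counter[0]       # a fresh name, never reassigned
    out.append("%s = %s i32 %s, %s" % (tmp, op, l, r))
    return tmp, out

result, instrs = lower(('add', '%a', ('mul', '%b', '%c')))
for i in instrs:
    print(i)
# prints:
#   %tmp1 = mul i32 %b, %c
#   %tmp2 = add i32 %a, %tmp1
```

Each instruction takes some inputs and produces its result in a brand-new temporary, which is what makes the subsequent analyses cheap.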
comparing LLVM to alternatives:
" Java and .NET virtual machines. These systems provide a JIT compiler, runtime support, and a very well defined bytecode format. This means that any language that can compile to the bytecode format (and there are dozens of them3) can take advantage of the effort put into the optimizer and JIT as well as the runtime. The tradeoff is that these implementations provide little flexibility in the choice of runtime: they both effectively force JIT compilation, garbage collection, and the use of a very particular object model. This leads to suboptimal performance when compiling languages that don't match this model closely, such as C (e.g., with the LLJVM project).
A second success story is perhaps the most unfortunate, but also most popular way to reuse compiler technology: translate the input source to C code (or some other language) and send it through existing C compilers. This allows reuse of the optimizer and code generator, gives good flexibility, control over the runtime, and is really easy for front-end implementers to understand, implement, and maintain. Unfortunately, doing this prevents efficient implementation of exception handling, provides a poor debugging experience, slows down compilation, and can be problematic for languages that require guaranteed tail calls (or other features not supported by C).
A final successful implementation of this model is GCC. GCC supports many front ends and back ends, and has an active and broad community of contributors. GCC has a long history of being a C compiler that supports multiple targets with hacky support for a few other languages bolted onto it. As the years go by, the GCC community is slowly evolving a cleaner design. As of GCC 4.4, it has a new representation for the optimizer (known as "GIMPLE Tuples") which is closer to being separate from the front-end representation than before. Also, its Fortran and Ada front ends use a clean AST.
While very successful, these three approaches have strong limitations to what they can be used for, because they are designed as monolithic applications " -- http://www.aosabook.org/en/llvm.html
Some issues with LLVM:
http://llvm.org/docs/LangRef.html
http://llvm.org/docs/LangRef.html#instruction-reference
Terminator Instructions (control flow between basic blocks):
Terminator Instructions, added later:
Binary Operations:
Bitwise Binary Operations:
Vector Operations:
Aggregate Operations:
Memory Access and Addressing Operations:
Conversion Operations (all take a destination type):
Conversion Operations, added later:
Other Operations:
Other operations, added later:
Variable Argument Handling Intrinsics:
Accurate Garbage Collection Intrinsics:
Code Generator Intrinsics: llvm.returnaddress, llvm.frameaddress, llvm.stacksave, llvm.stackrestore, llvm.prefetch, llvm.pcmarker, llvm.readcyclecounter,
Code Generator Intrinsics, added later: llvm.addressofreturnaddress, llvm.localescape, llvm.localrecover, llvm.readregister, llvm.writeregister, llvm.get.dynamic.area.offset, llvm.clearcache, llvm.instrprof.increment, llvm.instrprof.increment.step, llvm.instrprof.value.profile, llvm.thread.pointer
Standard C Library Intrinsics: llvm.memcpy, llvm.memmove, llvm.memset, llvm.sqrt, llvm.powi, llvm.sin, llvm.cos, llvm.pow, llvm.exp, llvm.exp2, llvm.log, llvm.log10, llvm.log2, llvm.fma, llvm.fabs, llvm.copysign, llvm.floor, llvm.ceil, llvm.trunc, llvm.rint, llvm.nearbyint, llvm.round,
Standard C Library Intrinsics, added later: llvm.minnum, llvm.maxnum
Bit Manipulation Intrinsics: llvm.bswap, llvm.ctpop, llvm.ctlz, llvm.cttz,
Bit Manipulation Intrinsics, added later: llvm.bitreverse, llvm.fshl, llvm.fshr
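A sketch of the semantics of a few of these bit-manipulation intrinsics, specialized to 32 bits. Note the real llvm.ctlz/llvm.cttz intrinsics take an extra flag controlling behavior when the input is zero; here zero is simply defined to return 32. Function names are made up for illustration:

```python
# Reference semantics (a sketch, not LLVM's codegen) for some
# bit-manipulation intrinsics, specialized to 32-bit values.

def ctpop32(x):
    """llvm.ctpop: count the set bits."""
    return bin(x & 0xFFFFFFFF).count("1")

def ctlz32(x):
    """llvm.ctlz: count leading zeros (defined here as 32 for x == 0)."""
    x &= 0xFFFFFFFF
    return 32 - x.bit_length()

def cttz32(x):
    """llvm.cttz: count trailing zeros (defined here as 32 for x == 0)."""
    x &= 0xFFFFFFFF
    if x == 0:
        return 32
    return (x & -x).bit_length() - 1   # isolate lowest set bit

def bswap32(x):
    """llvm.bswap: reverse the byte order of a 32-bit value."""
    x &= 0xFFFFFFFF
    return ((x >> 24) | ((x >> 8) & 0xFF00) |
            ((x << 8) & 0xFF0000) | ((x << 24) & 0xFFFFFFFF))

print(hex(bswap32(0x12345678)))   # 0x78563412
```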
Arithmetic with Overflow Intrinsics: llvm.sadd.with.overflow, llvm.uadd.with.overflow, llvm.ssub.with.overflow, llvm.usub.with.overflow, llvm.smul.with.overflow, llvm.umul.with.overflow,
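Each of these with-overflow intrinsics returns a pair: the wrapped result and an overflow flag. A sketch of the signed and unsigned 32-bit add variants (function names are made up; the modeling via Python's unbounded ints is mine):

```python
# Sketch of llvm.sadd.with.overflow.i32 / llvm.uadd.with.overflow.i32
# semantics: return (wrapped result, overflow flag).

def sadd_with_overflow_i32(a, b):
    full = a + b
    wrapped = ((full + 2**31) % 2**32) - 2**31   # wrap into signed i32 range
    return wrapped, wrapped != full

def uadd_with_overflow_i32(a, b):
    full = a + b
    return full % 2**32, full >= 2**32

print(sadd_with_overflow_i32(2**31 - 1, 1))   # (-2147483648, True)
```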
Specialised Arithmetic Intrinsics: llvm.fmuladd,
Specialised Arithmetic Intrinsics, added later: llvm.canonicalize
Experimental Vector Reduction Intrinsics, added later: llvm.experimental.vector.reduce.: add, fadd, mul, fmul, and, or, xor, smax, smin, umax, umin, fmax, fmin
Half Precision Floating Point Intrinsics: llvm.convert.to.fp16, llvm.convert.from.fp16,
Debugger Intrinsics: llvm.dbg.declare, llvm.dbg.value
Debugger Intrinsics, added later: llvm.dbg.addr
Exception Handling Intrinsics: llvm.eh.typeid.for, llvm.eh.sjlj.setjmp, llvm.eh.sjlj.longjmp, llvm.eh.sjlj.lsda, llvm.eh.sjlj.callsite,
Exception Handling Intrinsics, added later: llvm.eh.begincatch, llvm.eh.endcatch, llvm.eh.exceptionpointer
Trampoline Intrinsics: llvm.init.trampoline, llvm.adjust.trampoline
Masked Vector Intrinsics, added later: llvm.masked.: load, store, gather, scatter, expandload, compresstore
Memory Use Markers: llvm.lifetime.start, llvm.lifetime.end, llvm.invariant.start, llvm.invariant.end
Memory Use Markers, added later: llvm.launder.invariant.group, llvm.strip.invariant.group
Constrained Floating-Point Intrinsics, added later: llvm.experimental.constrained.: fadd, fsub, fmul, fdiv, frem, fma
Constrained libm-equivalent Intrinsics, added later: llvm.experimental.constrained.: sqrt, pow, powi, sin, cos, exp, exp2, log, log10, log2, rint, nearbyint
General Intrinsics: llvm.var.annotation, llvm.ptr.annotation.*, llvm.annotation.*, llvm.trap, llvm.debugtrap, llvm.stackprotector, llvm.stackprotectorcheck (later removed), llvm.objectsize, llvm.expect, llvm.donothing,
General Intrinsics, added later: llvm.codeview.annotation, llvm.stackguard, llvm.ssa_copy, llvm.type.test, llvm.type.checked.load, llvm.experimental.deoptimize, llvm.experimental.guard, llvm.load.relative, llvm.sideeffect
Stack map intrinsics, added later: llvm.experimental.stackmap, llvm.experimental.patchpoint
Element Wise Atomic Memory Intrinsics, added later: llvm.memcpy.element.unordered.atomic, llvm.memmove.element.unordered.atomic, llvm.memset.element.unordered.atomic
" The LLVM instruction set defines a register based virtual machine with an interesting twist: it has an infinite number of registers. In keeping with its design point as a compiler intermediate representation, LLVM registers enable static single assignment form. A register is used for exactly one value and never reassigned, making it easy for subsequent processing to determine whether values are live or can be eliminated. " -- http://codingrelic.geekhold.com/2010/07/virtual-instruction-sets-opcode.html
todo: llva: "There has been work on things like virtual ISA's using LLVM (see http://llvm.org/pubs/2003-10-01-LLVA.html), but the result of this research was, IMHO, that it's not currently a worthwhile endeavor. You also can do things like restrict bitcode in ways that make it portable (like, for example, PNaCl), but this is closer to the LLVA work than anything else (It's essentially a portability layer). It actually still requires porting, just porting to the single portability layer. "
PNaCl is (mostly) a subset of LLVM that Google wants to use as a portable low-level browser 'assembly language'. As a (mostly) subset that is still complete enough to write programs in, i figure it might be useful as a description of a 'simpler' LLVM, esp. for someone first learning LLVM (like me). See the PNaCl section in [1].