proj-plbook-plChLlvmLang

Table of Contents for Programming Languages: a survey

LLVM IR

This chapter is about the LLVM IR programming language. For information about other parts of LLVM, the language implementation toolkit, see LLVM.

" LLVM IR is designed to host mid-level analyses and transformations that you find in the optimizer section of a compiler. It was designed with many specific goals in mind, including supporting lightweight runtime optimizations, cross-function/interprocedural optimizations, whole program analysis, and aggressive restructuring transformations, etc. The most important aspect of it, though, is that it is itself defined as a first class language with well-defined semantics. To make this concrete, here is a simple example of a .ll file:

define i32 @add1(i32 %a, i32 %b) {
entry:
  %tmp1 = add i32 %a, %b
  ret i32 %tmp1
}

define i32 @add2(i32 %a, i32 %b) {
entry:
  %tmp1 = icmp eq i32 %a, 0
  br i1 %tmp1, label %done, label %recurse

recurse:
  %tmp2 = sub i32 %a, 1
  %tmp3 = add i32 %b, 1
  %tmp4 = call i32 @add2(i32 %tmp2, i32 %tmp3)
  ret i32 %tmp4

done:
  ret i32 %b
}

This LLVM IR corresponds to this C code, which provides two different ways to add integers:

unsigned add1(unsigned a, unsigned b) {
  return a+b;
}

// Perhaps not the most efficient way to add two numbers.
unsigned add2(unsigned a, unsigned b) {
  if (a == 0) return b;
  return add2(a-1, b+1);
}

As you can see from this example, LLVM IR is a low-level RISC-like virtual instruction set. Like a real RISC instruction set, it supports linear sequences of simple instructions like add, subtract, compare, and branch. These instructions are in three address form, which means that they take some number of inputs and produce a result in a different register. LLVM IR supports labels and generally looks like a weird form of assembly language.

Unlike most RISC instruction sets, LLVM is strongly typed with a simple type system (e.g., i32 is a 32-bit integer, i32** is a pointer to pointer to 32-bit integer) and some details of the machine are abstracted away. For example, the calling convention is abstracted through call and ret instructions and explicit arguments. Another significant difference from machine code is that the LLVM IR doesn't use a fixed set of named registers, it uses an infinite set of temporaries named with a % character. " -- http://www.aosabook.org/en/llvm.html

comparing LLVM to alternatives:

" Java and .NET virtual machines. These systems provide a JIT compiler, runtime support, and a very well defined bytecode format. This means that any language that can compile to the bytecode format (and there are dozens of them) can take advantage of the effort put into the optimizer and JIT as well as the runtime. The tradeoff is that these implementations provide little flexibility in the choice of runtime: they both effectively force JIT compilation, garbage collection, and the use of a very particular object model. This leads to suboptimal performance when compiling languages that don't match this model closely, such as C (e.g., with the LLJVM project).

A second success story is perhaps the most unfortunate, but also most popular way to reuse compiler technology: translate the input source to C code (or some other language) and send it through existing C compilers. This allows reuse of the optimizer and code generator, gives good flexibility, control over the runtime, and is really easy for front-end implementers to understand, implement, and maintain. Unfortunately, doing this prevents efficient implementation of exception handling, provides a poor debugging experience, slows down compilation, and can be problematic for languages that require guaranteed tail calls (or other features not supported by C).

A final successful implementation of this model is GCC. GCC supports many front ends and back ends, and has an active and broad community of contributors. GCC has a long history of being a C compiler that supports multiple targets with hacky support for a few other languages bolted onto it. As the years go by, the GCC community is slowly evolving a cleaner design. As of GCC 4.4, it has a new representation for the optimizer (known as "GIMPLE Tuples") which is closer to being separate from the front-end representation than before. Also, its Fortran and Ada front ends use a clean AST.

While very successful, these three approaches have strong limitations to what they can be used for, because they are designed as monolithic applications " -- http://www.aosabook.org/en/llvm.html

Some issues with LLVM:

LLVM ISA

Instructions

http://llvm.org/docs/LangRef.html

http://llvm.org/docs/LangRef.html#instruction-reference

Terminator Instructions (control flow between basic blocks):

Terminator Instructions, added later:

Binary Operations:

Bitwise Binary Operations:

Vector Operations:

Aggregate Operations:

Memory Access and Addressing Operations:

Conversion Operations (all take a destination type):

Conversion Operations, added later:

Other Operations:

Other operations, added later:

Intrinsics

Variable Argument Handling Intrinsics:

Accurate Garbage Collection Intrinsics:

Code Generator Intrinsics: llvm.returnaddress, llvm.frameaddress, llvm.stacksave, llvm.stackrestore, llvm.prefetch, llvm.pcmarker, llvm.readcyclecounter,

Code Generator Intrinsics, added later: llvm.addressofreturnaddress, llvm.localescape, llvm.localrecover, llvm.readregister, llvm.writeregister, llvm.get.dynamic.area.offset, llvm.clearcache, llvm.instrprof.increment, llvm.instrprof.increment.step, llvm.instrprof.value.profile, llvm.thread.pointer

Standard C Library Intrinsics: llvm.memcpy, llvm.memmove, llvm.memset, llvm.sqrt, llvm.powi, llvm.sin, llvm.cos, llvm.pow, llvm.exp, llvm.exp2, llvm.log, llvm.log10, llvm.log2, llvm.fma, llvm.fabs, llvm.copysign, llvm.floor, llvm.ceil, llvm.trunc, llvm.rint, llvm.nearbyint, llvm.round,

Standard C Library Intrinsics, added later: llvm.minnum, llvm.maxnum

Bit Manipulation Intrinsics: llvm.bswap, llvm.ctpop, llvm.ctlz, llvm.cttz,

Bit Manipulation Intrinsics, added later: llvm.bitreverse, llvm.fshl, llvm.fshr

Arithmetic with Overflow Intrinsics: llvm.sadd.with.overflow, llvm.uadd.with.overflow, llvm.ssub.with.overflow, llvm.usub.with.overflow, llvm.smul.with.overflow, llvm.umul.with.overflow,

Specialised Arithmetic Intrinsics: llvm.fmuladd,

Specialised Arithmetic Intrinsics, added later: llvm.canonicalize

Experimental Vector Reduction Intrinsics, added later: llvm.experimental.vector.reduce.: add, fadd, mul, fmul, and, or, xor, smax, smin, umax, umin, fmax, fmin

Half Precision Floating Point Intrinsics: llvm.convert.to.fp16, llvm.convert.from.fp16,

Debugger Intrinsics: llvm.dbg.declare, llvm.dbg.value

Debugger Intrinsics, added later: llvm.dbg.addr

Exception Handling Intrinsics: llvm.eh.typeid.for, llvm.eh.sjlj.setjmp, llvm.eh.sjlj.longjmp, llvm.eh.sjlj.lsda, llvm.eh.sjlj.callsite,

Exception Handling Intrinsics, added later: llvm.eh.begincatch, llvm.eh.endcatch, llvm.eh.exceptionpointer

Trampoline Intrinsics: llvm.init.trampoline, llvm.adjust.trampoline

Masked Vector Intrinsics, added later: llvm.masked.: load, store, gather, scatter, expandload, compresstore

Memory Use Markers: llvm.lifetime.start, llvm.lifetime.end, llvm.invariant.start, llvm.invariant.end

Memory Use Markers, added later: llvm.launder.invariant.group, llvm.strip.invariant.group

Constrained Floating-Point Intrinsics, added later: llvm.experimental.constrained.: fadd, fsub, fmul, fdiv, frem, fma

Constrained libm-equivalent Intrinsics, added later: llvm.experimental.constrained.: sqrt, pow, powi, sin, cos, exp, exp2, log, log10, log2, rint, nearbyint

General Intrinsics: llvm.var.annotation, llvm.ptr.annotation.*, llvm.annotation.*, llvm.trap, llvm.debugtrap, llvm.stackprotector, llvm.stackprotectorcheck (later removed), llvm.objectsize, llvm.expect, llvm.donothing,

General Intrinsics, added later: llvm.codeview.annotation, llvm.stackguard, llvm.ssa_copy, llvm.type.test, llvm.type.checked.load, llvm.experimental.deoptimize, llvm.experimental.guard, llvm.load.relative, llvm.sideeffect

Stack map intrinsics, added later: llvm.experimental.stackmap, llvm.experimental.patchpoint

Element Wise Atomic Memory Intrinsics, added later: llvm.memcpy.element.unordered.atomic, llvm.memmove.element.unordered.atomic, llvm.memset.element.unordered.atomic

" The LLVM instruction set defines a register based virtual machine with an interesting twist: it has an infinite number of registers. In keeping with its design point as a compiler intermediate representation, LLVM registers enable static single assignment form. A register is used for exactly one value and never reassigned, making it easy for subsequent processing to determine whether values are live or can be eliminated. " -- http://codingrelic.geekhold.com/2010/07/virtual-instruction-sets-opcode.html

Extensions, proposals, and simplifications

todo: llva: "There has been work on things like virtual ISA's using LLVM (see http://llvm.org/pubs/2003-10-01-LLVA.html), but the result of this research was, IMHO, that it's not currently a worthwhile endeavor. You also can do things like restrict bitcode in ways that make it portable (like, for example, PNaCl), but this is closer to the LLVA work than anything else (It's essentially a portability layer) It actually still requires porting, just porting to the single portability layer. "

PNaCL

PNaCl is (mostly) a subset of LLVM that Google wants to use as a portable low-level browser 'assembly language'. As a (mostly) subset which is still complete enough to write programs in, I figure it might be useful as a description of a 'simpler' LLVM, especially for someone first learning LLVM (like me). See the PNaCl section in [1].

Links