Table of Contents for Programming Languages: a survey

Chapter : Type systems

"A type system is a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute." – Benjamin Pierce

note: other sorts of compile-time analysis are called "static analysis"

note: success typing does not prove the absence of (bad) behavior, but rather its presence (less conservative, looks for proofs of bad behavior rather than proofs of correctness)

"The type of a function specifes (partially) what it does. Although weak as a specifcation language, static types have compensating virtues: they are

As a result, static type checking is by far the most widely used verifcation technology today.

Every type system excludes some "good" programs, and permits some "bad" ones.

 For example, a language that lacks polymorphism will reject this "good" program:

f :: [Int] -> [Bool] -> Int
f is bs = length is + length bs

Why? Because the length function cannot be applied to both a list of Ints and a list of Bools. The solution is to use a more sophisticated type system in which we can give length a polymorphic type. Conversely, most languages will accept the expression 'speed + distance', where speed is a variable representing speed and distance represents distance, even though adding a speed to a distance is as much nonsense as adding a character to a boolean.


The type-system designer wants to accommodate more good programs and exclude more bad ones, without going overboard and losing the virtues mentioned above. "

-- Oleg Kiselyov, Simon Peyton Jones, Chung-chieh Shan,
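The quoted example is in Haskell; the same point can be sketched in TypeScript (a rough analogue; the names `f` and `len` are ours, not the quote's):

```typescript
// Without polymorphism, a length function fixed to one element type
// cannot serve both a number[] and a boolean[]:
function lengthOfNumbers(xs: number[]): number { return xs.length; }

// With parametric polymorphism (generics), one len works for any element type:
function len<T>(xs: T[]): number { return xs.length; }

// The "good" program from the quote, now accepted by the type checker:
function f(is: number[], bs: boolean[]): number {
  return len(is) + len(bs);
}
```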

type system criteria

provably sound and/or complete

tractable typechecking

tractable type inference


ability for the typical programmer to simulate it (including type inference, if that is provided) in their head

sources of complexity

trying to hit all the criteria above

also, side effect of academia: in academia it's better for your career to do something:

so we get (a) a proliferation of (b) overly abstract work that (c) uses intimidatingly powerful math.

type systems and verification

there are other opinions: "...the philosophy that verification should be based on explicit types in the original program source. This intricately ties the process of authoring software with that of verifying it. The result is needlessly constraining both the developer and the verifier. It only lets you (easily) write code within what it can prove, and only lets you prove properties that can be expressed by the types that you use.

Other approaches say, we'll let people write code however they feel most comfortable doing (more or less), using types whose expressivity is defined not by the limits of the verifier but only by various concerns over language complexity, mental overhead, performance and teamwork, while we let the verifier prove whatever types it can. If it proves that the variable x is of type "an integer which is always equal to the current time in seconds modulo seven, except on Saturdays when it's always zero", then that's great! " --

Type system features

first-class functions; second-order types; dependent types; union types, intersection types; etc

Simply typed lambda calculus


Russell's paradox

and foundational crisis in math, etc

Pure type theory

Henk: A Typed Intermediate Language, sections 3 and 4, contain a 5-page tutorial on Pure Type Systems (PTS) and the lambda cube. It assumes you know lambda calculus and maybe a little typed lambda calculus.

PTS 'expressions' include both terms and types. The syntax of PTS is:

E ::=
      K                  Constant
    | x                  Variable
    | EE                 Application
    | lambda x:E . E     Abstraction
    | Pi x:E . E         Quantification

Constants, variables, applications, and lambda abstractions operate just as in lambda calculus. Pi-abstraction is the key idea of PTS. Note that the type assigned to the bound variable in lambda abstraction is a subexpression (syntactic nonterminal 'E') (because terms and types are both expressions in PTS).

The meaning of Pi is:

" Pi x:A . B is the type of functions from values of type A to values of type B, in which the result type B may perhaps depend on the value of the argument, x. " -- [1]

For example (from [2]), consider the polymorphic function that takes two arguments and returns its second argument (k x y = y; note that the classic K combinator returns its first argument, K x y = x, so this example is really K's mirror image). The PTS expression of this combinator (the term) is written:

(lambda A:* . lambda B:* . lambda x:A . lambda y:B . y)

where '*', as in Haskell, denotes the 'kind' called 'type' (the kind of basic types, eg Int is of kind type; a 'kind' is a type system for types; in Haskell, the polymorphic type constructor 'List' is of kind * -> *, because List is like a function that takes another type and returns a type, eg List Int is a list of Ints). Note that this polymorphic function first takes two type arguments, which 'specialize' it to the types A, B of its two arguments x,y, and then takes its two actual arguments x,y.

The PTS-notation K combinator's type is written:

(Pi A:* . Pi B:* . A -> B -> B)


A -> B is an abbreviation for Pi _:A . B (a non-dependent Pi: the result type B does not mention the bound variable)

so, putting these together and expanding the '->'s, the PTS expression of the typing judgement of the K combinator is:

|- (lambda A:* . lambda B:* . lambda x:A . lambda y:B . y) : (Pi A:* . Pi B:* . Pi _:A . Pi _:B . B)
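The type-level lambdas in the term above correspond to generics in a programming language; here is a hedged TypeScript transcription (our sketch, matching the second-projection example):

```typescript
// lambda A:* . lambda B:* . lambda x:A . lambda y:B . y, as a generic function.
// The two type parameters A, B play the role of the two type-level lambdas.
const k = <A, B>(x: A) => (y: B): B => y;

// Specializing A = number, B = string corresponds to applying the type-level lambdas:
const result = k<number, string>(1)("two"); // evaluates to "two"
```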

=== Lambda cube ===
todo explain:

"In general, a PTS may have more than three levels (or even
an in nite number), but in this paper we study a particular
family of PTSs called the lambda cube. The type system of
the lambda cube ensures that no more than three levels are
required, apart from a solitary constant, 2, at the fourth
level, as we shall see."
-- [ Henk: A Typed Intermediate Language]

=== System F ===

Also known as the (Girard–Reynolds) polymorphic lambda calculus or the second-order lambda calculus [].

=== Hindley–Milner ===

=== Category theory ===

=== gradual typing ===

* (recc. by )

=== don't know where to put this ===
gives some examples that should be included somewhere:

The basic feature of the any type is that you can implicitly convert from any type to any, and you can implicitly convert from any to any other type. For example, ...

var answer : string = "42";
var a : any = answer;
var the_answer : string = a;
document.body.innerHTML = the_answer;

On the other hand, a gradual type system acts like a static type system when the any type is not involved. For example, the following program tries to implicitly convert from a number to a string, so the type system rejects this program.

var answer : number = 42;
var the_answer : string = answer;


Consider the following [legal in typescript] example...

function f(x:string):string { return x; }
var g : (any)=>string = f;
var h : any = g;
document.body.innerHTML = h("42");

The first conversion is interesting because, if g is called with an argument of type any, then the argument needs to be implicitly converted to the string that f expects. This is an implicit down-cast, and doesn't follow the contra-variance rule for functions that one sees in the subtyping rules for object-oriented languages. Indeed, in a gradually typed system, assignment compatibility is co-variant in the parameter type of a function, at least, with respect to the any type. The second conversion, from (any)=>string to any is not so surprising, it's just up-casting from (any)=>string to any. Interestingly, there is a third implicit conversion in this program. Can you see it? It's in the call to h. The fact that we're calling h implies that h needs to be a function (or something callable), so there's essentially an implicit conversion here from any to (string)=>any. 

Next let's look at implicit conversions involving object types. Like function types, object types are also structural. Consider the following well-typed program in TypeScript, in which an object of type {x:number; y:any} is implicitly converted to {x:any; y:string}, then {x:number}, and finally to any.

var o : {x:number; y:any;} = {x:1, y:"42"};
var p : {x:any; y:string;} = o;
var q : {x:number;} = p;
var r : any = p;
document.body.innerHTML = r.y;

The assignment of o to p shows structural changes within an object type, both to and from any. The next conversion, to {x:number}, shows that the type system allows implicit narrowing of object types. Thus, the rules governing implicit conversion are quite close to the consistent-subtyping relation described in Gradual Typing for Objects. This relation combines the consistency relation that governs the static behavior of any (sometimes called compatibility) with the traditional subtyping relation of structural type systems that allows the implicit narrowing of object types. Getting back to the above example, similar to the function call at type any, TypeScript allows member access on things of type any.

The next example is not well-typed in TypeScript.

var o : {x:number; y:any; } = {x:1, y:"42"};
var q : {x:number;} = o;
var r : {x:number; y:any;} = q;
document.body.innerHTML = r.y;


which shows that TypeScript doesn't allow implicit widening (again in line with the consistent-subtyping relation).


Consider the following example.

function f(o: {x:number;}):string { return "42"; };
var g : (o: {x:number; y:number;})=>string = f;
var h : (o: {x:number;})=>string = g;
document.body.innerHTML = h({x:1,y:2});

The conversion from f to g should be OK, because it only requires an argument of type {x:number; y:number;} to be up-cast (narrowed) to {x:number;}. However, the conversion from g to h should not be OK because it requires an argument of type {x:number;} to be implicitly down-cast (widened) to {x:number; y:number;}. Surprisingly, the tsc compiler does not give a type error for the above example! So what I said above about TypeScript disallowing implicit widening is not quite true. In many cases it disallows widening, but here we see an exception to the rule. I don't like exceptions in language design because they increase the complexity of the language. So on this one point, TypeScript differs from the design in Gradual Typing for Objects.


see also for a more formal discussion

== notation ==
great post:

i skipped the first few sections and started reading at section "Operational Semantics".

It covers big-step and small-step operational semantics, and also the common notation for type systems, each with a very concise and easy example.

'x : A' is often notation for 'term x is of type A'. Note that the thing on the left side is a term (not necessarily a variable), so "1 + 2 : Integer" is another example.

note: the following may, especially, have errors!

The [ 'turnstile' symbol] (latex '\vdash'), used as 'premise \vdash conclusion', represents "this implies that, within some formal deductive system" or "from this you may deduce that, within some formal deductive system". If multiple premises are needed to reach a conclusion, they are comma-separated, e.g. 'x : A, y : B \vdash z : C' means 'if it is known that x is of type A, and also that y is of type B, then the system deduces that z is of type C'. (Why don't we just use a forward arrow, which is often used for logical implication? In mathematical logic, formulas sometimes contain an 'implication' symbol, which is considered a 'boolean connective' within predicate logic, meaning it is treated as a function that takes two bools and returns a bool; since that symbol can occur within formulas, we use a different symbol to talk about implication between formulas.)

(note: actually the comma operator is not quite what i said above; the stuff on the right side of a comma is allowed to overwrite previous type assertions, e.g. 'x : A, y : B, x : C' is equivalent to just 'y : B, x : C'; that is, you can think of it like assignment to a mutable variable in an imperative programming language: 'x : A, y : B, x : C' works like 'x = A; y = B; x = C')

Often the [ capitalized version of Greek letter Gamma] ( latex \Gamma ) is used to represent 'the currently active set of assumptions'. For example, instead of just 'x : A \vdash z : C', you might see '\Gamma, x : A  \vdash z : C', where the former means 'if it is known that x is of type A, then the system deduces that z is of type C', and the latter says that, 'if it is known that all of the assumptions in the set \Gamma are true, and also that x is of type A, then the system deduces that z is of type C'.

'A[t/x]' is sometimes used as notation for the formula A with t substituted for x ( , Chapter 1, page 20). Sometimes i've seen this written '[t/x]A', also.

Putting a bunch of stuff above and below a line is used to indicate "if the stuff above the line is true, then the system deduces that the stuff below the line is true". For example,

\Gamma \vdash x : Integer
--------------------------
\Gamma \vdash -x : Integer

means: if the system already knows that (assuming the set of assertions in \Gamma) x is of type Integer, then it can deduce that (assuming the set of assertions in \Gamma) -x is also of type Integer.

== Chapter : typing and correctness/safety ==

dynamic vs static vs semidynamic

=== tradeoffs ===
expressive type system -> complex

no type inference -> simpler but less concise

type inference -> concise but complex

dynamic typing -> less strict (less statically safe)

want: safe, expressive, simple, concise (unsolved problem)

=== non-safety tradeoffs regarding types ===
types also help a ton with IDEs, see below

they can also make code more self-documenting

they can also help enforce architectural constraints in teams

they can also help performance

they can also lengthen compile-time

=== nonstrict type coercion ===

nullable pointer types

=== pre- and post-conditions ===

=== parametric polymorphism (generics) ===
See the excellent introduction

==== types as OOP class members ====
in some languages, instead of or in addition to special syntax to define a class parameterized by a type, one can simply declare a type as a member of the class.

e.g. in Scala, a class or trait can simply declare an abstract type member, as in: abstract class Buffer { type T; val element: T }

notes on differences between Abstract Type Members versus Generic Type Parameters:

* (interaction with Swift 'protocols')
*!topic/swift-language/3PtydRXR0ao (interaction with Swift 'protocols')

==== ... ====
and type functions

and C++ templates

* [ What's Wrong with C++ Templates?] is an excellent article that compares C++ templates to Java interfaces and ML modules

* what java interfaces and ML functors and haskell typeclasses have that templates don't:
** typechecked whether or not they are instantiated, and typechecked 'in general', not just for what it was instantiated with
** you can specify the constraint that the type in the template must support certain operations

see "The Generic Dilemma"

if you have interfaces but not generics (like Go), what you are missing is the ability to write a function with a signature like:

KeyIntersection(Map T1 T2, Map T1 T2) : Map T1 T2

instead you must write something like:

KeyIntersection(Mappable, Mappable) : Mappable

the difference is that in the interface version, the compiler can't infer that the types of the keys and values in the result are the same as the types of the keys and values in the arguments.
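A hedged sketch of that signature in TypeScript (the name keyIntersection and the choice to keep the first map's values are our assumptions):

```typescript
// With generics, the result's key and value types are tied to the arguments':
function keyIntersection<K, V>(a: Map<K, V>, b: Map<K, V>): Map<K, V> {
  const out = new Map<K, V>();
  for (const [key, value] of a) {
    if (b.has(key)) out.set(key, value); // keep the first map's value (a choice)
  }
  return out;
}
```

With only an untyped Mappable interface, both parameters and the result would share one opaque type, and the compiler could no longer relate their key and value types.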

to substitute: covariant arrays (old C# and its probs)

=== ad-hoc polymorphism ===
see also typeclasses

=== subtyping ===

=== co- and contra- variance ===
array prob; fn prob

to substitute for generics: covariant arrays (old C# and its probs)

=== immutable ===

=== unique ===

=== implicit type conversion ===

=== success typing ===

== Chapter : types ==

=== primitive data types ===

multiple literals for the same value:
 hex literals
 scientific notation

bool, int, real, char, string, 

complex # e.g. python's notation 1.j

usually also data constructors for list or table or tuple or something like that (not technically literals since there is variable substitution in them)


Python list: [1,2,3]

Octave list: [1 2 3]

Python dict: {'a': 1, 'b': 2, 'c':3}

Python set: {1,2,3}

in Apple Swift, lists and associative arrays have the same syntax as Python, except they both use []

different languages have many different names for [ associative arrays]: dict, hash, map, table, associative array, association list/property list (lisp)

some languages distinguish between 'lists', 'arrays', and 'vectors'

tuples (sequences whose length is fixed and statically known, and whose elements may each have a different type)

(note somewhere: "statically known" means "known by the compiler at compile time")

=== user created types ===
* type synonyms
* structs
* operations on types (e.g. sum types (see below), applications of type constructors (see below))
* creation of OOP classes
* perl6's subset types ( ) -- define a subset type T by statements of the form "a value v is a member of type T iff (v is a member of type T2 AND p(v))" where p is an arbitrary predicate that is coded in the language
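A rough TypeScript analogue of such subset types (our sketch, not Perl 6's mechanism): a branded type whose only way in is a constructor that applies the predicate p at runtime.

```typescript
// Even is a "subset" of number: v is an Even iff (v is a number AND v % 2 === 0).
type Even = number & { readonly __brand: "Even" };

function asEven(v: number): Even | null {
  return v % 2 === 0 ? (v as Even) : null; // p(v), coded in the language
}
```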

=== function types ===

In some systems (e.g. Haskell), non-functional value types are equivalent to zeroary functions.

todo currying notation in function types:
* association to the right
* example: the type of '+' is Int -> (Int -> Int); by right associativity we can write this as just Int -> Int -> Int
* example: the type of a function 'applyToInt' which takes a function from Int to Int, and an Int, and produces an Int, is (Int -> Int) -> (Int -> Int); by right associativity we can write this as just (Int -> Int) -> Int -> Int
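The same currying conventions, sketched in TypeScript (the function names are ours):

```typescript
// The curried type of '+': Int -> Int -> Int
const add = (n: number) => (m: number): number => n + m;

// applyToInt : (Int -> Int) -> Int -> Int
const applyToInt = (f: (n: number) => number) => (n: number): number => f(n);
```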

=== generics, e.g. parametric polymorphism ===
e.g. a List of Ints.

one can conceptualize these in terms of an abstract function from types to types called List; now "a List of Ints" is written "List(Int)".

"List" is called a type constructor, a 'higher order type', a functor or a template. There are subtle differences between these phrases:

A type constructor is an operator which yields a type.

A higher order type is a type which is also a function on types, just as (in a language that supports first-class functions) a function is a value which is also a function on values.

A functor is a mathematical concept from the part of mathematics called 'category theory'. Functors are sometimes used to model higher-order types.

A template is a metaprogramming technique used to achieve parametric polymorphism.

==== more on functors ====
A functor is a thing that can be mapped over. For example:

    map abs [-1, 2, 3] => [1, 2, 3]

Arrays are the simplest example, and mapping over an array looks just like mapping over an iterable. However, functors retain shape, whereas iterables don't. For example, if we had a tree (represented visually):

    oak = -1
          / \
         2   3

If we used oak as an iterable, we would lose the structure of the tree:

    map abs iter(oak) => [1, 2, 3]

However, if the tree belongs to the functor typeclass (i.e. implements the functor interface), then:

    map abs oak => 1
                  / \
                 2   3

-- william ting,

" The name "Functor" can be confusing; it's a thing you apply a function to, not a function itself." -- Josh Triplett

==== Kinds ====
Just as the type system with function types keeps track of the signature of functions, one might ask for a system to keep track of the signature of higher-order types. The typical answer is a meta-type-system, a type system for types, where the meta-type of a type is called a "kind".

The typical notation is:
* the "kind" of a type like Int is "*"
* the "kind" of a type like List is "* -> *"
* the "kind" of a type like Pair is "* -> * -> *" (we are using currying notation, like with function types above)

(todo example of (* -> *) -> * ? wikipedia says "See Pierce (2002), chapter 32 for an application.")

=== sum and product types ===

composites of simpler types

==== the simple dual conception ====

* a sum type says that a value is EXACTLY ONE OF the component types
** e.g. "a List is either an Empty or a Cons(item, List)"
* a product type says that a value is ALL OF the component types
** e.g. "an Immutable Int is both Immutable and an Int"

in this conception, a product type is like an AND, and a sum type is like the way we use the word 'or' in English (but not like logical OR)

==== the traditional conception ====
* a sum type says that a value is EXACTLY ONE OF the component types
** same as in the simple dual conception
* a product type says that a value is a STRUCTURE which is the mathematical DIRECT PRODUCT of values of the component types
** e.g. "a Pair of Ints is a 2-tuple whose first element is an Int and whose second element is an Int"

== subtypes ==

== values vs. references ==

== Types as a partial order ==
defn partial order

defn join, meet

should this be moved to the 'formal stuff' section?

== Types as semilattices or lattices ==
defn semilattice, lattice

should this be moved to the 'formal stuff' section?

== Any ==

i.e. a type that contains all values; a greatest element in the algebra of types

== Bottom ==
i.e. a type that contains no values; a least element in the algebra of types

usage to indicate divergence (e.g. Haskell)

== Type ==

i.e. the type of all types

theoretical problem with a type of all types (Girard's paradox)

== operations on types ==

instanceOf (sometimes = elem)

subclassOf (sometimes = <=)


kindOf(t) (get the kind of a type)


are types sets?
  if so, are set-theoretic operations implemented for types?

are types just sets? if so, what about the algebra of type constructors?

synonym, sum, product, struct

type constructors

== unsorted ==

=== phantom types, GADTs ===
==== phantom types ====
One thing that people like in Haskell is phantom types.
Note that afaict phantom types can be done in any language with
generics, not just Haskell:


* phantom types in Rust:
* phantom types in C#: 
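Supporting the claim that any language with generics can do this, here is a phantom-type sketch in TypeScript (our example, not from a library). One caveat: TypeScript's typing is structural, so the phantom parameter must appear somewhere in the shape, here via an optional branded field, or the two instantiations would be interchangeable.

```typescript
// Unit never appears in the runtime payload; it only tags the value at the type level.
type Length<Unit> = { readonly value: number; readonly __unit?: Unit };

const meters = (n: number): Length<"m"> => ({ value: n });
const feet = (n: number): Length<"ft"> => ({ value: n });

// const mixed: Length<"m"> = feet(3); // rejected: "ft" is not assignable to "m"

function double<U>(a: Length<U>): Length<U> {
  return { value: 2 * a.value };
}
```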

==== GADTs ====

Haskell also provides GADTs, also called 'first-class phantom types',
see eg

i dont understand what GADTs are good for over and above phantom
types; some links i dont understand that might bear on this question

==== covariant and contravariant ====

Let's say you have a generic class parameterized on type X, for example G = IEnumerable<A>, or G = Action<A>, or G = Array<A>. Let's say you replace A with a subtype of A, A_sub, so now you have for example G_2 = IEnumerable<A_sub>, or G_2 = Action<A_sub>, or G_2 = Array<A_sub>. Is G_2 a subtype of G, or vice versa, or neither?

To restate this, let's say you have a parameterized type, T. T has the form of a type expression with at least one type variable in it; let's say the variable is X (and to make this exposition simpler, let's assume there's no other type variables in T). Given an assignment X=A, T evaluates to a particular type G. Let's say that type A_sub is a subclass of A. Given the assignment A_sub, T evaluates to a particular type G_2. The question is, is G_2 a subclass of G, or vice versa, or neither?

Well, in order for A_sub to be a subtype of A, it has to be safe to provide an instance of A_sub wherever an A was expected. This is the [ "Liskov substitution principle"] (more precisely, the Liskov substitution principle states that any property provable about objects of some type should also be provable about objects of any subtype of that type; of course, this has to be conditioned on what universe of properties you are considering, because if the values in the types are truly different, then there are some properties that could be stated about them that would be different; Liskov and Wing included in the universe of relevant properties ones like 'immutable', which is broader than the properties about argument and return types that we are considering here). What are the conditions for this to be true? You can look at it in terms of preconditions and postconditions. If A_sub is a subtype of A, then it must accept equal or weaker preconditions than A, and it must promise equal or stronger postconditions.

A (pure) function that takes no inputs, or only inputs of types other than X, and which provides an output of type X, cannot have any preconditions involving X, because no values of type X were provided. But it has at least one postcondition involving X, namely, it provides something of type X in its output. If T is the signature of a function of such a type, then G_2 is a subclass of G, because as we go from X=A to X=A_sub, the function goes from guaranteeing an A output to guaranteeing an A_sub output, and a guarantee of an A_sub is more informative than a guarantee of an A, so this is a stronger postcondition. So, as the type parameter gave us more information, the postcondition gave us more information too. So in this case, G_2 can be a subclass of G. We call this co-variant, because as we go from superclass to subclass in the type parameter, we also go from superclass to subclass in T.

For example, consider a class MakeA<X> with a single function MakeA() that returns some arbitrary value of type X. If Cat is a subtype of Animal, and you have a calling function that expects a MakeA<Animal>, you can substitute a MakeA<Cat>, because it will return a Cat, and the caller expects to get back an Animal, and a Cat is an Animal. But if you have a calling function that expects a MakeA<Cat>, you cannot substitute a MakeA<Animal>, because it might return some Animal that is not a Cat, while the caller expects to get back a Cat. So MakeA<Cat> is a subtype of MakeA<Animal>. So this is covariant, because it goes in the same direction as Cat <= Animal.

Now consider a (pure) function that takes an input of type X, and gives no outputs of type X. Everything is similar to the above, but reversed. The function can't have any postconditions depending on X (todo: i dont think this is the right way to phrase this, because if the function is eg len() then a postcondition is that the length is what it says it is), but it does have a precondition (namely, that the input is of type X). As X passes from A to A_sub, the precondition strengthens. This is the opposite direction from what we'd need for G_2 to be a subclass of G; so in fact G can be a subclass of G_2. We call this contra-variant, because as we go from superclass to subclass in the type parameter, we go the other way around in T, from subclass to superclass.

For example, consider a class Size<X> with a single function size(x) that returns a number that tells you how big x is. If Cat is a subtype of Animal, and you have a calling function that expects a Size<Cat>, you can substitute a Size<Animal>, because a Size<Animal> knows how to deal with any Animal input it is given, including cats. But if you have a calling function that expects a Size<Animal>, you cannot substitute a Size<Cat>, because the caller expects to be able to pass in any animal, not just Cats. So Size<Animal> is a subtype of Size<Cat>. So this is contravariant, because it goes in the opposite direction of Cat <= Animal.

Now consider a (pure) function that takes an input of type X, and also gives an output of type X. Now as X passes from A to A_sub, the preconditions and the postconditions both strengthen. This is compatible with neither (G being a subclass of G_2) nor (G_2 being a subclass of G). So we call this invariant.

Now an interface or class has not one function signature, but a set of them. So we say that an interface or class is covariant if every function signature in it is covariant, and the same for contravariant.

For example, consider an interface for an Array<X> with elements of type X in the array. It has functions to put Xs into the array, and other functions to get Xs out of the array. The former are contravariant, and the latter are covariant, so in total, the interface is invariant. For example, if Cat is a subtype of Animal, and you have a calling function that expects an Array<Animal>, you can't provide it with an Array<Cat>, because an Array<Cat> might only know how to put Cats into the array, and the caller expects to be able to put any Animal in there.
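The MakeA / Size / Array story can be sketched in TypeScript (the interface names are this article's, not a real API; variance checking of function types assumes --strictFunctionTypes):

```typescript
interface Animal { name: string }
interface Cat extends Animal { purrs: boolean }

type MakeA<X> = () => X;         // X only in output position: covariant
type Size<X> = (x: X) => number; // X only in input position: contravariant

const makeCat: MakeA<Cat> = () => ({ name: "Tom", purrs: true });
const makeAnimal: MakeA<Animal> = makeCat; // ok: MakeA<Cat> <= MakeA<Animal>

const sizeAnimal: Size<Animal> = a => a.name.length;
const sizeCat: Size<Cat> = sizeAnimal;     // ok: Size<Animal> <= Size<Cat>
// const bad: Size<Animal> = (c: Cat) => 0; // rejected under strictFunctionTypes
```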

todo the above is a little bit verbose..

== dynamically typed languages with static typing modules ==
Typed Racket

== Substructural type systems ==

"Substructural type systems are a family of type systems analogous to substructural logics where one or more of the structural rules are absent or allowed under controlled circumstances. Such systems are useful for constraining access to system resources such as files, locks and memory by keeping track of changes of state that occur and preventing invalid states.[1][2]


Several type systems have emerged by discarding some of the structural rules (exchange, weakening and contraction):

    Linear type systems (allow exchange, not weakening or contraction): Every variable is used exactly once.
    Affine type systems (allow exchange and weakening, not contraction): Every variable is used at most once.
    Relevant type systems (allow exchange and contraction, not weakening): Every variable is used at least once.
    Ordered type systems (discard exchange, contraction and weakening): Every variable is used exactly once in the order it was introduced.
" --

== todo ==

dynamic languages can be thought of as "unitype" languages:

"dynamic types (sets of values)" and "static types (sets of expressions)" --

"Some comments on reddit were deleted, so the arguments cannot be inferred from that discussion. In short, in a lazy language we have an equation
fst (x,y) = x (where (x,y) is of product type), which does not hold true in a language with strict semantics (evaluation of 'y' may not terminate). Sum types have dual behaviour, as was illustrated here in other comments, but an example which gives clear evidence of it seems to be trickier to construct." --

==== design choice: implicit type coercion? ====

eg scala (yes), golang (no [], except numeric constants (including but not limited to literals) [] ), C (yes)

== random comparisons ==
* [ Ralf Vogler. Type system of Go/C#]

== typed lambda calculi ==

== Simply typed lambda calculus (\lambda^\to) ==
"a typed interpretation of the lambda calculus with only one type constructor: \to that builds function types. It is the canonical and simplest example of a typed lambda calculus." -- []

"The term simple type is also used to refer to extensions of the simply typed lambda calculus such as products, coproducts or natural numbers (System T) or even full recursion (like PCF). In contrast, systems which introduce polymorphic types (like System F) or dependent types (like the Logical Framework) are not considered simply typed. The former are still considered simple because the Church encodings of such structures can be done using only \to and suitable type variables, while polymorphism and dependency cannot." -- []


== pure type systems ==
"The lambda cube is generalized further by pure type systems." -- []


== lambda cube ==

== dependent typing ==

=== Dependent typing systems of the lambda cube ===

==== First order dependent type theory (lambda Pi) ====
"The system \lambda \Pi of pure first order dependent types, corresponding to the logical framework LF, is obtained by generalising the function space type of the simply typed lambda calculus to the dependent product type." --


==== Second order dependent type theory (lambda Pi 2) ====
"The system \lambda \Pi 2 of second order dependent types is obtained from \lambda \Pi by allowing quantification over type constructors. In this theory the dependent product operator subsumes both the \to operator of simply typed lambda calculus and the \forall binder of System F." --

==== Higher order dependently typed polymorphic lambda calculus (lambda Pi omega) ====
"The higher order system \lambda \Pi \omega extends \lambda \Pi 2 to all four forms of abstraction from the lambda cube: functions from terms to terms, types to types, terms to types and types to terms. The system corresponds to the Calculus of constructions whose derivative, the calculus of inductive constructions is the underlying system of the Coq proof assistant." --


== Martin-Löf Type Theory ==

variant: the system of [ NuPRL], which is still predicative []

* [ Programming in Martin-Lof’s Type Theory: An Introduction] by Bengt Nordstrom, Kent Petersson, Jan M. Smith

== System T ==
"a finite type extension of primitive recursive arithmetic" -- []

"The functions definable in System T are exactly the ones which can be proved terminating in Peano Arithmetic" -- []

proof-theoretic ordinal is epsilon_0 []


== system F (second-order lambda calculus) ==
Also known as:
* (Girard–Reynolds) polymorphic lambda calculus []
* second-order lambda calculus []
* F2 []

"a typed lambda calculus that differs from the simply typed lambda calculus by the introduction of a mechanism of universal quantification over types. System F thus formalizes the notion of parametric polymorphism in programming languages" -- []

"forms a theoretical basis for languages such as Haskell and ML" -- []

"Under the Curry–Howard isomorphism, System F corresponds to the fragment of second-order intuitionistic logic that uses only universal quantification. System F can be seen as part of the lambda cube, together with even more expressive typed lambda calculi, including those with dependent types."  -- []



== Hindley–Milner ==
"...type inference for System F is impossible. A restriction of System F known as "Hindley–Milner", or simply "HM", does have an easy type inference algorithm and is used for many statically typed functional programming languages such as Haskell 98 and ML. Over time, as the restrictions of HM-style type systems have become apparent, languages have steadily moved to more expressive logics for their type systems. As of 2008, GHC, a Haskell compiler, goes beyond HM, and now uses System F extended with non-syntactic type equality, for example." --


== system Fw ==
also called "higher-order polymorphic lambda calculus".

"While System F corresponds to the first axis of Barendregt's lambda cube, System Fω or the higher-order polymorphic lambda calculus combines the first axis (polymorphism) with the second axis (type operators); it is a different, more complex system." --

"System Fw...has higher-kinds and polymorphism, but no dependent types" []

Fw is the extension of second-order lambda calculus to higher-order kinds [].

The question that begs to be asked is this: is * the only kind? The answer is "yes" for F2, and "no" for Fw. In Fw, type variables can range over type constructors as well as over types; indeed, this is precisely what distinguishes it from F2. For example, in Fw we might write: (lambda m : * -> * . lambda x : m Int....) Tree. Here, the first lambda abstracts over m, whose kind is * -> *. The second lambda abstracts over values of type m Int. The whole abstraction is applied to a type constructor Tree, whose kind is presumably * -> *. ...Is the extra power of Fw required...? Clearly this depends on the source language, but the extra power is certainly required for Haskell, whose type system explicitly includes higher-kinded type variables..., most particularly to support constructor classes (Jones [ A system of constructor classes: overloading and implicit higher-order polymorphism]).
-- [ Henk: A Typed Intermediate Language]

[ A system of constructor classes: overloading and implicit higher-order polymorphism] provides an example of a constructor class:

class Functor f where
  map :: (a -> b) -> (f a -> f b)

instance Functor List where
  map f []       = []
  map f (x : xs) = f x : map f xs

instance Functor Tree where
  map f (Leaf x)  = Leaf (f x)
  map f (l :^: r) = map f l :^: map f r

instance Functor Opt where
  map f (Just x) = Just (f x)
  map f Nothing  = Nothing

Note that the type parameter 'f' accepted by the Functor class is itself a type constructor ('List' is a type constructor that takes an argument giving the type of items in the list, eg List Int is a list of Ints).
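To make the constructor-class idea concrete, here is a small runnable sketch using the Prelude's Functor class (modern Haskell spells the method 'fmap' rather than 'map', and uses Maybe where the quote above uses Opt):

```haskell
-- The Prelude already declares:  class Functor f where fmap :: (a -> b) -> f a -> f b
-- 'f' is instantiated to type constructors of kind * -> *, such as [] and Maybe.
main :: IO ()
main = do
  print (fmap (+ 1) [1, 2, 3 :: Int])       -- the list instance
  print (fmap show (Just (42 :: Int)))      -- the Maybe instance
  print (fmap show (Nothing :: Maybe Int))
```

The same fmap works at both containers precisely because the class parameter ranges over type constructors, not just types.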


== system F-sub (F<:) ==

"extends system F with subtyping, bringing it a lot closer to actual programming languages from the ML family." --

== system FC ==

used by the GHC Haskell compiler

"For many years, GHC's intermediate language was essentially: System Fw, plus algebraic data types (including existentials). But that is inadequate to describe GADTs and associated types. So in 2006 we extended GHC to support System FC, which adds equality constraints and coercions" -- [3]


== system U ==

and Girard's paradox

and Self-Representation in Girard’s System U

== Lambda-mu calculus ==

"the lambda-mu calculus is an extension of the lambda calculus introduced by M. Parigot.[1] It introduces two new operators: the μ operator (which is completely different both from the μ operator found in computability theory and from the μ operator of modal μ-calculus) and the bracket operator. Proof-theoretically, it provides a well-behaved formulation of classical natural deduction.

One of the main goals of this extended calculus is to be able to describe expressions corresponding to theorems in classical logic. According to the Curry–Howard isomorphism, lambda calculus on its own can express theorems in intuitionistic logic only, and several classical logical theorems can't be written at all. However with these new operators one is able to write terms that have the type of, for example, Peirce's law.

Semantically these operators correspond to continuations, found in some functional programming languages. ... μ-variables can be used to name or freeze arbitrary subterms, allowing us to later abstract on those names. " -- [4]

== Intuitionistic type theory ==

== Predicativity and impredicativity ==

"we cannot have a layered language with impredicativity in two layers: Girard proved this leads to inconsistency, with his analysis of System U; see the discussion by Coquand here. This is why in Luo’s Extended Calculus of Constructions (ECC), which is part of the type theory of Coq, only the base level of the universe hierarchy is impredicative, and the other levels are predicative." -- [5]


== Canonical typeclasses vs implicits ==

Modular implicits. This paper discusses Haskell's typeclasses, and notes that in any one program there may be only one instance given for a pair (type, typeclass) (the "canonicality" restriction). It then discusses Scala's implicits, and notes that they can serve a similar function to typeclasses, but without the canonicality restriction, at the cost of having to explicitly annotate functions that require an implicit. The annotation lets the compiler distinguish places where the implicit argument should be searched for at the site of the definition of the calling function (which is what happens when there is no explicit annotation) from places where it should be searched for at the site of use (which is what happens when there IS an explicit annotation). They describe the addition of a variant of the latter system (explicit 'implicits') to ML.

See also , particularly .
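A minimal Haskell sketch of the canonicality restriction (the class and type here are invented for illustration):

```haskell
-- A made-up class standing in for any typeclass.
class Describe a where
  describe :: a -> String

data Point = Point Int Int

instance Describe Point where
  describe (Point x y) = "(" ++ show x ++ "," ++ show y ++ ")"

-- A second instance for the same (type, typeclass) pair is rejected by GHC
-- with a "Duplicate instance declarations" error, so every call site agrees
-- on which 'describe' is meant:
--
-- instance Describe Point where
--   describe _ = "a point"

main :: IO ()
main = putStrLn (describe (Point 1 2))
```

Scala-style implicits relax exactly this: two candidate instances may coexist, and annotations control which one is picked where.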



tel 777 days ago

I think the biggest advantage of parametric polymorphism comes from program analysis, not just increased capability. The notion of parametricity depends crucially on parametric polymorphism and is super useful. Parametricity is the property that says greater polymorphism means lesser variation in implementation. Highly parametric functions can be almost completely described by their types alone, meaning the type provides incredibly high value as documentation.

The most common simple example is reasoning about the following pair of functions

    f :: a -> a           g :: Int -> Int

While (f > g) obviously, we can completely characterize the behavior of `f` while we know almost nothing at all about `g`. Up to non-termination, `f` must be `id` while `g` can be any recursive function on integers.

                g a = 3
    f a = a     g a = a * 2
                g a = if (a == 0) then 1 else a * g (a - 1)

Even granting the triviality of this example, we can already notice how much better documentation `f`'s type provides than `g`'s. We also can produce a list of properties and laws about `f`'s behavior that we cannot easily do with `g`. This means that I'm able to reason about `f` better and more confidently.


So we can take it up a notch. Consider Functors in Haskell. In order to avoid the full generality, let's just say Functor is the container interface and types like [], Set, and Map instantiate it. This gives them the function

    fmap :: (a -> b) -> f a -> f b

where `f` is specialized to the particular container of interest

    fmap :: (a -> b) -> [a]     -> [b]    
    fmap :: (a -> b) -> Set a   -> Set b  
    fmap :: (a -> b) -> Map k a -> Map k b      -- for some key type k

Now what can we reason about `fmap`s knowing only their type? Well, since the type `a` contained inside each container is unknown (or, to be more clear, any implementation of fmap must work for any type `a` and thus it's "unknown" in that we can't take advantage of information about it) then the only thing `fmap` is capable of doing is applying the first function to those types.

Furthermore, since every mention of `a` vanishes in the result of the function we know that each value in the input container must either be transformed by the input function or dropped.

    -- a specific example, not a general function
    fmap f [a, b, c] = [f a, f b, f c]

In this way, parametricity has pruned down the possibilities of functions that can implement `fmap` a great deal. If `a` or `b` were specialized to specific types then this would not be the case as we'd have a large variety of specific functions that could apply to each one of them.

    notFmap :: (Int -> Char) -> [Int] -> [Char]
    notFmap f [n1, n2, n3] = [f (n1 + 2), head (show n2), chr n3] ++ "hello world!"

Indeed, in order to ensure that type-checking `fmap`s behave correctly we say they must follow a few laws. First, they respect function composition and second they respect identity

    fmap g (fmap f x) == fmap (g . f) x
    fmap id x         == id x

And we only ever need to check the first one because the second one is already guaranteed by a more sophisticated parametricity argument.

    fmap f []     = []
    fmap f (x:xs) = f x : fmap f xs


The take away is that while you can definitely write something that "maps over containers" without using parametric polymorphism, you have a harder time taking advantage of parametricity to have the types provide guarantees about the safe behavior of functions. Types (and in particular type inference) can sometimes make things more expressive, but their primary power lies in making things easier to understand and reason about.

shachaf 775 days ago

The last part is backwards -- when F is a functor, the only law we need to check is the identity law, but checking the composition law isn't enough. For example,

  fmap _ _ = []

satisfies fmap f . fmap g = fmap (f . g), but not fmap id = id.

tel 775 days ago

Whoops, that's correct.

andolanra 777 days ago

I'm going to note that even if Go added some kind of parametric polymorphism at this point, it still wouldn't have the same advantages as parametric polymorphism in Haskell and related languages. In Go, you can branch on types at runtime, so in a hypothetical generic Go, you could write something which appears to only ever return its argument, but does something special if the argument is of a particular type:

    func<T> id(t T) T {
      switch t := t.(type) {
        case int: return t + 1;
        default:  return t;
      }
    }
The ability to examine types at runtime can enable some interesting programs, but it also means that you don't get guarantees like you do in languages like Haskell. An example of this biting someone in practice is this blog post[1], where the author reasoned that because a library function expected an io.Reader, all it would do is use the Read method as exposed by Reader. In reality, though, the function was able to branch on the type and determine whether it also exposed a Close method, and called that if it could. That kind of feature destroys the useful static guarantees provided by parametric polymorphism in other languages like ML or Haskell.


tel 777 days ago

As a second addendum, while andolanra's comment is completely true, you can actually do type dispatch in Haskell using the Typeable machinery; however, the type system requires that you mark your types when this occurs.

    what :: Typeable a => a -> a

Here `what` is no longer guaranteed to be `id`. Instead, we can translate it like this

    what :: (a -> TypeRep) -> a -> a

where `TypeRep` is, as it sounds, a representation of the type of `a`, returned dynamically. This means that we can do careful coercions to, say, be `id` unless `a` is `Int` in which case we add `1`.

Suffice to say that the degree of guarantee breaking that this kind of code invites makes Typeable a very specialized library that's almost never recommended without a "unless you really know what you're doing" kind of warning.

" -- [6]




latent vs manifest types:

latent typing: types associated with values rather than variables


variable lifetimes

sometimes the word 'extent' is used synonymously with 'lifetime' [7]


variable scope: sometimes the word 'visibility' is used synonymously with 'scope' [8]

"The term "manifest typing" is often used with the term latent typing to describe the difference between the static, compile-time type membership of the object and its run-time type identity." [9]


CoC: Calculus of Constructions
CIC: Calculus of Inductive Constructions. "The Martin-Lof type theory can be seen as a fragment of CIC" [10]
MLTT: Martin-Löf's Type Theory; see
HoTT: Homotopy type theory; see

" Contemporary provers (built upon consistent Martin-Löf Type Theory, 1972) like Agda, Coq, Lean, F*, Idris are based on Barendregt and Coquand' CoC? with different flavors of infinity universe hierarchies and Calculus of Inductive Constructions. Some of them are automated and some are trying to be and general purpose programming languages with proving facilities. " [11]


"if the expression a can be rewritten into b, then we say that b is a reduct of a (alternatively, a reduces to b...)" [12]

confluence or Church-Rosser property of a rewriting system: for any object A in the system, if an object A can be rewritten (possibly after many steps) in multiple ways B,C, then there is some other object D such that both B and C can be rewritten (possibly after many steps) to D [13]

You can also say that just one particular object A within a rewriting system is confluent, without claiming that the whole system is confluent (if all objects in the system are confluent, then the system is confluent). [14]

normalization properties of rewrite systems:

the 'substitution' notation u[v/x] (replace x with v in u)
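The notation can be illustrated on a toy term language (a made-up, first-order datatype, so there are no binders and hence no variable-capture issues to worry about):

```haskell
-- A toy first-order term language: variables, integer literals, addition.
data Term = Var String | Lit Int | Add Term Term
  deriving (Eq, Show)

-- subst u v x computes u[v/x]: replace every occurrence of variable x in u with v.
subst :: Term -> Term -> String -> Term
subst (Var y)   v x | y == x    = v
                    | otherwise = Var y
subst (Lit n)   _ _ = Lit n
subst (Add a b) v x = Add (subst a v x) (subst b v x)

main :: IO ()
main =
  -- (x + y)[3/x]  ==>  3 + y
  print (subst (Add (Var "x") (Var "y")) (Lit 3) "x")
```

With binders (as in the lambda calculus) substitution must additionally avoid capturing free variables, which this sketch deliberately sidesteps.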

static/dynamic typing/untyped: dynamic typing: the values at runtime carry 'type tags' or signatures that identify their type. Static typing: at compile time, each variable is assigned a type (and each expression can be assigned a type). untyped: types are not known at compile time, and they are not identified at runtime, either.

gradual typing: some parts of the program can be statically typed while others are dynamically typed

a sound type system: one that provides some sort of guarantee [16]

structural/nominal types: structural "For example, a record with fields called x, y, and z might be automatically considered a subtype of one with fields x and y. With nominal typing, there would be no such assumed relationship unless it were declared somewhere." [17]

"Duck Typing..the dynamic type analogue of structural typing. It means that rather than checking a tag to see whether a value has the correct general type to be used in some way, the runtime system merely checks that it supports all of the operations performed on it." [18]


a very introductory link:

some notes on it:

one way of thinking of a type: a set of values

Statically typed languages are often said to assign types to variables and expressions (and hence to values, by virtue of which variable/expression they're in), while dynamically typed languages are often said to assign types to values. Dynamically typed languages often carry along a 'type tag' with each value to indicate its type. Many dynamically typed languages are still strict about types and can still have runtime type errors, eg many dynamically typed languages will give an error if you try to add an integer and a string, because at runtime the language examines the type tag of the integer and the string, and compares it to the type expected by the addition function, before executing the addition.

However some languages (both static and dynamic) are not very strict (note: the word "strict" has another definition in programming languages; i am using it here informally) about some or all types, which means that they will allow you to create a value of one type, and then use that value as if it were a very different type. If you have to explicitly mark when you are doing this in the source code, that is "explicit coercion"; if the language permits it otherwise, that is "implicit coercion". For example, in C, you can assign an integer to a variable expecting a float; this will be converted automatically (implicit coercion). In C, you can also implicitly use a character in a string as if it were an integer. In C, you can also explicitly convert an integer to a pointer.

In C, when you convert an integer to a floating point, the language transforms the internal representation of the value. But when you use a character in a string as an integer, or convert an integer to a pointer, the language just uses the same raw internal representation, and just interprets that representation as a different type. This is often dangerous (in the example of converting integers to pointers, it allows you to easily create invalid pointers, leading to a program crash if these are dereferenced) and non-portable (because often different implementations will have different internal representations of values of one or the other type).
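For contrast, a quick Haskell sketch: Haskell performs no implicit numeric coercion, so the conversion must be written out, and it transforms the representation rather than reinterpreting raw bits:

```haskell
average :: Double -> Double -> Double
average a b = (a + b) / 2

main :: IO ()
main = do
  let n = 3 :: Int
  -- print (average n 4.0)            -- rejected at compile time: Int is not Double
  print (average (fromIntegral n) 4.0)  -- explicit, representation-changing conversion
```

The commented-out line is the kind of implicit coercion C would accept silently.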

Statically typed languages (usually) do 'typechecking' at compile time, that is, they check to make sure that no type errors are possible at runtime. A type error is (mostly) when you try to assign a value to a variable (here the definition of 'variable' can be stretched to include function arguments as well as an invisible 'variable' that holds the values being returned by functions) that is incompatible with that variable's type. Typechecking mostly involves looking at the type of an expression that is going to be assigned to a variable, and then looking at the type ascribed to that variable, and making sure that any value of the former type is also in the latter type.

"Many statically typed languages require type declarations." 'Type declarations' are when the programmer tells the language the type of a variable explicitly in the source code via a 'type annotation', eg 'int i' or "int add(int x, int y)". However some statically typed languages infer some or all of the type declarations.
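Haskell is an example of the latter: annotations are optional in most positions, and the compiler infers the missing types. A small sketch:

```haskell
-- No type annotation here: GHC infers  add :: Num a => a -> a -> a
add x y = x + y

-- An annotated version for comparison; both forms typecheck the same way.
double :: Int -> Int
double n = add n n

main :: IO ()
main = print (double 21)
```

Either style is fully statically checked; inference only removes the obligation to write the annotation, not the check itself.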

Gradual typing means that some parts of the program's source code are statically typed, and some are not; the values of these other variables and expressions are dynamically typed (or even untyped). TypeScript is an example [19] [20].

Some languages also have tools to annotate source code with types, but do not have a standard way to typecheck these; for example Python 3. Some languages are dynamic but there exist tools that allow static typechecking for them (for example, Flow for Javascript).

Table of some behavior of some languages' type systems:

Language 	Static? 	Implicit Conversions? 	Rules Enforced? 	Memory-Safe?
C 		Strong 		Depends 		Weak 			Weak
Java 		Strong 		Depends 		Strong 			Strong
Haskell 	Strong 		Strong 			Strong 			Strong
Python 		Weak 		Depends 		Weak 			Strong
JavaScript 	Weak 		Weak 			Weak 			Strong

Generics, eg "we might create our own list type, MyList, that can contain any type of data that we need to store. We want to be able to create a MyList of integers, a MyList of strings, etc., without making any changes to the source code of MyList. The compiler should enforce those types: if we have a MyList of integers, but accidentally try to insert a string into it, then the compiler should reject the program."

(a workaround for the lack of generics in Go: "create a MyList of "empty interfaces": the MyList can hold objects, but the compiler simply doesn't know what their types are. When we retrieve objects from a MyList, we have to tell the compiler what its type is. If we say "I'm retrieving a string" but the actual value is an integer, we get a runtime error just like we would in a dynamic language.")
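In a language with parametric polymorphism the MyList example is a one-liner; a Haskell sketch, where the type parameter 'a' plays the role of the element type and the compiler enforces it:

```haskell
-- 'a' is a type parameter: MyList Int, MyList String, etc.
data MyList a = Nil | Cons a (MyList a)
  deriving (Eq, Show)

toList :: MyList a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs

ints :: MyList Int
ints = Cons 1 (Cons 2 Nil)

-- Cons "oops" ints   -- rejected at compile time: String is not Int

main :: IO ()
main = print (toList ints)
```

No casts are needed on retrieval, and the ill-typed insertion (commented out) is caught before the program runs.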

Haskell's type system also encodes information about whether a function can do I/O into its type.
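Briefly, what that looks like: a function whose type mentions IO may perform I/O; a function without IO in its type cannot.

```haskell
-- The IO in the type records that this action may perform I/O.
greet :: String -> IO ()
greet name = putStrLn ("hello, " ++ name)

-- A pure function: no IO in its type, so it cannot call putStrLn itself.
shout :: String -> String
shout s = s ++ "!"

main :: IO ()
main = greet (shout "world")
```

The separation is checked statically: moving a putStrLn into shout would be a type error.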

Some type systems have a type Any; any value of any other type is also a value of type Any.

" With more powerful type systems, we can specify constraints at finer levels of granularity. Here are some examples, but don't get bogged down if the syntax doesn't make sense.

In Go, we can say "the add function takes two integers and returns an integer":

func add(x int, y int) int { return x + y }

In Haskell, we can say "the add function takes any type of numbers, and returns a number of that same type":

add :: Num a => a -> a -> a
add x y = x + y

In Idris, we can say "the add function takes two integers and returns an integer, but its first argument must be smaller than its second argument":

add : (x : Nat) -> (y : Nat) -> {auto smaller : LT x y} -> Nat
add x y = x + y "

Haskell has no equivalent of the Idris type above, and Go has no equivalent of either the Idris type or the Haskell type.

" Some static languages' type systems

Here's a very rough list of some languages' type system power in increasing order. This is meant to provide a rough sense of type system power and shouldn't be taken as an absolute. The languages grouped together here can be very different from each other. Each type system has its own quirks and most of them are very complex.

    C (1972), Go (2009): These aren't very powerful at all, with no support for generic types. We can't define a MyList type that can be used as a "list of ints", "list of strings", etc. Instead, it will be a "list of unspecified values". The programmer must manually say "this is a string" each time one is removed from the list, which may fail at runtime.
    Java (1995), C# (2000): These both have generic types, so we can say MyList<String> to get a list of strings, with the compiler being aware of and enforcing that constraint. Items retrieved from the list will have type String, enforced at compile time as usual, so runtime errors are less likely.
    Haskell (1990), Rust (2010), Swift (2014): These all share several advanced features, including generics, algebraic data types (ADTs), and type classes or something similar (respectively: type classes, traits, and protocols). Rust and Swift have more mass appeal than Haskell, with strong organizations pushing them (Mozilla and Apple, respectively).
    Agda (2007), Idris (2011): These support dependent types, allowing types like "a function that takes an integer x, and an integer y, where y is greater than x". Even the "y is greater than x" constraint is enforced at compile time: at runtime, y will never be less than or equal to x, no matter what happens. Very subtle but important properties of systems can be verified statically in these languages. Only a small minority of programmers learn them, but some become very excited about them.

There's a clear trend toward more powerful systems over time, especially when judging by language popularity rather than languages simply existing. The notable exception is Go, which explains why many static language advocates shun it as a step backwards. Group two (Java and C#) are mainstream languages that are mature and widely used. Group three looks poised to enter the mainstream, with big backers like Mozilla (Rust) and Apple (Swift). Group four (Idris and Agda) are nowhere near mainstream use, but that may change eventually; group three wasn't anywhere near mainstream use just ten years ago. "

some examples of errors that a static type system might help to catch at compile time are:

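For instance (a generic, hedged illustration in Haskell; these examples are invented, not drawn from any particular source):

```haskell
main :: IO ()
main = do
  print (length [True, False])  -- fine
  -- Each of the following would be rejected before the program ever runs:
  -- print (length True)        -- Bool is not a list
  -- print (1 + "two")          -- no way to add a number and a String
  -- print (words 3)            -- words expects a String, not a number
```

In a dynamically typed language, mistakes of this shape typically surface only when the offending line is actually executed.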

" "A type is a collection of possible values" is not quite true. It arises because sometimes programs evaluate to a value and type safety says that a program's type cannot change under evaluation. But, truly, a type classifies the program itself. For instance, the type getChar : [Teletype] Char classifies the action of getting a character by its side effect. ...if we could only imagine types by listing their inhabitants then we'd lose many interesting types. ...((some imaginable types have bad properties:)) consider a type Nope a b which can be introduced by both NopeLeft? :: a -> Nope a b and NopeRight? :: b -> Nope a b and eliminated by nope :: (a -> b -> r) -> Nope a b -> r. This is a satisfactory way to define a type, but it lets us write 'coerce'. "



@seanparsons seanparsons commented 7 days ago

In Haskell those types actually support more powerful functionality (like foldMap and traverse, which are my go-to examples), which is often much more of a compelling reason to have a static type system than "Make sure this is an Int".



some benefits of static typing:



naasking 3 days ago [-]

Decent overview of many concepts, but the opening line isn't strictly true:

> A type is a collection of possible values.

A type is a proposition, and "This binding has one of these possible values" is merely one type of proposition. So a type is much more powerful than a simple set-based interpretation, even though this is how most people think about it.

For instance, in Haskell you can encode a region calculus that ensures prompt reclamation of file handles and other scarce resources [1]. The relations between nested regions to ensure correctness seems awkward to interpret as a set, but it's straightforward if you just think of types as a relation.



rntz 3 days ago [-]

(Static) types may be seen as propositions, but this is just one interpretation of them. We can also interpret types as sets (although this doesn't work out very well except in very simple languages), as domains (most commonly), as cpos, as relations over closed terms (PERs), and so forth.

The types-as-propositions interpretation is a particularly useful one because it lets us take ideas from the field of formal logic and apply them to programming. But it is only one interpretation, and not a definition of what types are.

I don't think it's possible to give "type" a good, short definition. For me John Reynolds' definition comes closest: "Type structure is a syntactic discipline for enforcing levels of abstraction." But that's rather a mouthful, and not a good introduction to what types are.

> A type is a proposition, and "This binding has one of these possible values" is merely one type of proposition.

The type that an expression or variable has does not correspond to the proposition that that expression has a certain set of possible values! You're mixing the object-level (the language & its types) with the meta-level (assertions about the language). This is a total misunderstanding of Curry-Howard. Rather, the values an expression might take on can be interpreted as proofs of the corresponding proposition.


eli_gottlieb 3 days ago [-]

>The types-as-propositions interpretation is a particularly useful one because it lets us take ideas from the field of formal logic and apply them to programming. But it is only one interpretation, and not a definition of what types are.

Ehhh, it kind of is a definition. Think of it this way: a type system is a topology we lay over programs, with open sets representing observable (using finite computation) properties and closed sets representing deniable (using finite computation) properties. The finest such topology possible is the Scott topology, which can distinguish every single observable value and deniable property from every other: it encodes all the information you can have about a program without solving the Halting Problem.

Type systems, then, are coarser topologies over programs than that one, which group observably different expressions and traces into the same open sets, which are then called "types". When those types are guaranteed not to contain _|_, the divergent program, they can also be considered topological spaces, with the programs in them acting as points in the topological space.


naasking 3 days ago [-]

> (Static) types may be seen as propositions, but this is just one interpretation of them. We can also interpret types as sets (although this doesn't work out very well except in very simple languages), as domains (most commonly), as cpos, as relations over closed terms (PERs), and so forth.

Right, I wasn't advancing the propositional definition as the only one, merely as one that sufficiently captures the generality, because the article's current definition of types-as-sets is insufficiently general.


catnaroek 3 days ago [-]

Under HoTT's take on propositions as “types where any two inhabitants are equal”, “types as propositions” is even less general than “types as sets”.

As I stated in a comment on that gist, what types denote varies from one type system to another:

reply "




Peaker 125 days ago [-]

I can try to explain why HM type inference ((HM = Hindley-Milner)) and unification are related.

If you take just the lambda calculus where you have:

1) Lambdas that bind variables

2) Usages of bound variables

3) Function applications

The inference rules for the first 2 are simple & straight-forward and require no unification.

For 1 (lambdas) - you create a fresh type variable (e.g: 'a') for the parameter type and infer the lambda body (e.g: 'T'), and the lambda's type is then 'a -> T'. While inferring the lambda body, you also pass the information that the parameter bound by the lambda has-type 'a'.

For 2 (variable usage), you just use the known type information that was passed down by the lambda inference.

For 3 (application), you need unification. Function application is between two subexpressions (e.g: 'f' and 'x'). You infer both of these recursively, and get 2 types (e.g: 'fType' and 'xType'). But you also know that 'fType' must look like: 'xType -> resType' because it's being applied to 'x'. You also know that 'resType' is the result of the entire application. So you have to unify 'fType' with 'xType -> resType'.

This is why inference relates to unification: You have 2 sources of information for 'fType' and 'xType' -- the recursive inference AND the fact they're being applied together.

This unification, by the way, is the only way that type information is learned for parameter types. If all parameter types are always specified (as in, e.g: C++) then formally, there's no type inference at all. So what C++ calls "local type inference" is formally just type checking. "
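The application rule described above can be sketched as a tiny unifier. This is a minimal, illustrative implementation (the type language and all names are invented for the sketch), using the eager occurs check discussed in the notes on Damas-Milner below:

```haskell
import qualified Data.Map as Map

-- A tiny type language: type variables, function types, and one base type.
data Ty = TVar String | TArr Ty Ty | TInt
  deriving (Eq, Show)

type Subst = Map.Map String Ty

-- Apply a substitution to a type.
apply :: Subst -> Ty -> Ty
apply s (TVar v)   = maybe (TVar v) (apply s) (Map.lookup v s)
apply s (TArr a b) = TArr (apply s a) (apply s b)
apply _ TInt       = TInt

-- The occurs check: does variable v occur in type t?
occurs :: String -> Ty -> Bool
occurs v (TVar w)   = v == w
occurs v (TArr a b) = occurs v a || occurs v b
occurs _ TInt       = False

-- Unify two types, extending substitution s, or fail with Nothing.
unify :: Subst -> Ty -> Ty -> Maybe Subst
unify s a b = go (apply s a) (apply s b)
  where
    go (TVar v) t
      | t == TVar v = Just s
      | occurs v t  = Nothing          -- eager occurs check: no infinite types
      | otherwise   = Just (Map.insert v t s)
    go t (TVar v)   = go (TVar v) t
    go (TArr a1 b1) (TArr a2 b2) = do
      s' <- unify s a1 a2
      unify s' b1 b2
    go TInt TInt = Just s
    go _ _       = Nothing

main :: IO ()
main = do
  -- The application rule: given f : fType and x : xType, invent a fresh
  -- resType and unify fType with (xType -> resType).
  let fType   = TArr TInt (TVar "r")
      xType   = TInt
      resType = TVar "res"
  print (unify Map.empty fType (TArr xType resType))
  -- The occurs check in action: a ~ (a -> Int) has no finite solution.
  print (unify Map.empty (TVar "a") (TArr (TVar "a") TInt))
```

A real Damas-Milner implementation adds let-generalization and fresh-variable bookkeeping, but the core information flow is exactly this unification step.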


book recommendation: "Introduction to Functional Programming through Lambda Calculus" by Michaelson, rec. by [21]


type inference

"standard type systems such as Hindley-Milner and inferencing algorithms such as Damas-Milner, sometimes known as Algorithm W. "


pedagogic HM type inference implementations:

OCaml's type checker vs. Damas-Milner type inference:

" Damas-Milner type inference uses an "occurs check" to prevent the type equation solver from unifying a type variable with a compound type expression containing the same variable. The "obviously correct" way to perform this check is eagerly - as soon as possible.

However, performance-wise, delaying the occurs check can speed up the inference process - and this is exactly what OCaml does. The downside is that the resulting algorithm is quite involved, because the solver can now run into type expressions containing cycles, so naively recursively walking type expressions can cause an infinite loop. " [23]


" eduardoleon commented 7 days ago • edited

What types denote varies from one type system to another. For example, in the simply typed lambda calculus, types denote sets. In a language with general recursion, types as collections of expressions denote domains. In homotopy type theory, types denote higher groupoids.



union types

discriminated union types

"sum types"


" AndrasKovacs? commented 7 days ago • edited

@twblalock in any programming language, we usually verify statically unknown data with runtime checks, and do some error throwing/handling if our data is not OK. We do the same in Idris/Agda, except we have runtime checks that return proofs of properties. If the data is OK, the checks return proofs of OK-ness, and we pass the proofs along to functions that require them. If the data is not OK, the checks return proofs of not-OK-ness.

For example in Idris one would generally use a comparison function with a type like the following:

compare : (x : Nat) -> (y : Nat) -> LT x y \/ LT y x \/ (x = y)

where the return type is a tagged union with three possible cases, each returning a proof. This function itself can be considered verified, since its return type specifies all relevant properties of the semantics (given that the definitions of LT x y and x = y also have the correct meaning). In contrast the non-verified comparison function has type:

data Ordering = LT | EQ | GT


compare : Nat -> Nat -> Ordering

It's non-verified because it has implementations with the wrong semantics:

compare x y = LT -- ignores the arguments

In contrast, all implementations of the verified compare must actually compare the arguments. This can be generalized to many more properties; at the extreme end we have things like the verified CompCert C compiler which has all the non-concurrent ANSI C syntax and semantics specified in types.

Note that it's usually possible to specify erasable or "irrelevant" proofs that don't appear in generated code, therefore ensuring that our verified code has the same runtime performance as unverified code. In some cases it's necessary or more convenient to use proofs that appear at runtime though. With dependent typing it's common to use data structures that store "normal" data but also act as proofs because they have certain structural properties or invariants.



static type systems generally require the compiler to be able to prove that there are no type errors in the program. This can constrain the expressiveness of the language, and can also make it harder to write programs (since some ways of writing the program may allow the compiler to see how to prove this, while other mostly equivalent ways might not).


"heterogeneous data structures are usually much easier in dynamically typed languages" -- [24]


" Important difference here is that there are different types of conversion. I made up those terms myself, I suppose there are official names for them (in Category theory or similar):

Symmetric, defined for all values conversion. Example: Integer to Float. Any Integer can be converted to Float and there are no two Integer values which can be converted to the same Float value.

1 -> 1.0
1.0 -> 1

Asymmetric, defined for all values conversion. Example: Float to Integer. Any Float can be converted to Integer, but there is no one-to-one correspondence. For example we can take round-down as type conversion

1.0 -> 1
1.1 -> 1
1 -> 1.0

Not defined for all values conversion. Example: String to Integer. Some strings can be converted to Integer, but some can not.

StringToInt :: String -> IntOrNothing

"1" -> 1 "a" -> Nothing " --


"As a marketing term ((type safety)) tends to refer to the property of a language where type information reduces the likelihood of writing "unsafe" programs of some form or another. Typical notions of "unsafety" which are avoided include: null pointers, memory safety, implicit coercions, array out-of-bounds mistakes, encapsulation breaking, and more." [25]


"type erasure"

when type information that was available at compile-time is not retained at run-time, so there is no way to dynamically (at run-time) introspect on types


"Often, a good way to detect when internalized types are static (and thus not tags or classes) is that they are erased at runtime---at least by default[^typeable]. This means that there is no way for the program's behavior to change dynamically according to types. Essentially, this outlaws reflection and, in simple contexts, is usually A Good Thing."


" To invent a distinctive word, a "formal type" is a description of the statics of a language. It's assigned to a fragment of abstract syntax and it indicates the kind of value that the abstract syntax represents. For example, both 3 and 3+3 considered as abstract, static syntax might be given the same type, Natural, because they each take values as Natural numbers. For the remainder of this section, the word "formal" may be elided if the meaning is clear.

Formal types give rise to the technical notion of "type safety". A language and its type system together are type safe if the dynamics of the language respect two properties: preservation and progress. Language dynamics preserve a type if as a program "evolves" at runtime its type does not change. For instance, we might consider evaluation of the abstract syntax 3+3 as stepping like

3+3 → 2+4 → 1+5 → 0+6 → 6

If this evolution does not change the type then the dynamics preserve the type. A dynamics has progress if any abstract syntax either reflects a value of its type or can be reduced. In order to talk about this we thus have to give a language a notion of base values which every program attempts to compute. For instance, we can talk about the values of Natural as the natural numbers 0, 1, 2, 3, .... Thus progress entails that any program typed as a Natural either is one of those numbers or can continue to evolve.

When a language---its statics, type system, dynamics, and values together---has preservation and progress then it is type safe. What this means is that given the type of a program we know something very clear about the eventual value the program will compute. " -- [26]
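The stepping relation in the quote can be made runnable. In this toy sketch (the representation is invented), a state is either a pair (m, n) standing for m+n, or a plain int (a value); preservation shows up as every intermediate state still denoting a Natural, and progress as every non-value being able to step:

```python
# Toy dynamics for the quote's example: 3+3 steps to 2+4, 1+5, 0+6, then 6.
# A state is either a pair (m, n) standing for m+n, or a plain int (a value).

def is_value(state):
    return isinstance(state, int)

def step(state):
    """One small step: shift a unit from the left addend, or finish."""
    m, n = state
    return n if m == 0 else (m - 1, n + 1)

def evaluate(state):
    """Run to a value, collecting the trace of intermediate states."""
    trace = [state]
    while not is_value(state):   # progress: any non-value can step
        state = step(state)
        trace.append(state)
    return trace

evaluate((3, 3))  # [(3, 3), (2, 4), (1, 5), (0, 6), 6]
```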


" Advanced forms of formal types can involve their own language for merely describing the types themselves. This often involves the notion of "type constructors" such as List which are applied to other types like Integer to form complex types like List Integer, i.e. lists of integers. Usually these languages are described as "type languages" while the language being described by the types is the "term language". So, [1,2,3] is a value in the term language which has the formal type List Integer which is written in the type language.

As the complexity of the type system itself grows it may begin to resemble formal language all on its own and it then becomes desirable for the "types to have types". This second level of typing is sometimes called "kinds". The third level is sometimes called "sorts".

It can be desirable for the type languages at each higher level to be simpler and simpler so that there isn't an infinite stack of typing levels to handle. This provides a tradeoff on the expressiveness of the type language. It's also often desirable for these languages to not be Turing Complete as it's important to be able to say two types are equivalent (and thus their programs take the same values). Turing Completeness would cause equivalence to be undecidable in general and mean that type checking could sometimes fail to terminate.

On the other hand, some languages have embraced unification of the types and terms. In other words, the very language which the types describe is identical to the language of the types. This produces an infinite hierarchy of types and all the complexities involved therein. Typically these languages have "dependent types" which allow information to bleed between the layers of types. It's also typical that these languages restrict Turing Completeness at every layer for all the reasons discussed previously.

As a final note, the complexity of the type language is usually related to the "interestingness" of types. Very boring type languages will have only very weak notions of "type safety" while very complex type languages will usually include both boring notions of type safety and extremely interesting ones. At the most complex end, dependently typed languages typically have types which can express arbitrary mathematical theorems, and the programs which inhabit these types can be seen as proofs of validity of the theorem.

More practically, complex type languages can capture information about complex abstraction patterns being used in the term language. Typically, a term language with a weak type language will trade off simplicity and speed of analysis for loss of abstraction power. " -- [27]

" ...many languages express notions of reflection which allows a language to get ahold of the type of a value at runtime. For instance, in Javascript we can write

> typeof(3) "number"

to reflect on the type of a value like 3.

The existence of reflection implies that the runtime of a language maintains fragments of some type language attached to each value. In formal terms these are sometimes called "tags" to differentiate from formal types, but in dynamic language communities they are referred to simply as types. " -- [28]
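Python maintains the same kind of runtime tag; a tiny sketch (the helper name is invented):

```python
def tag_of(v):
    """Return the runtime tag attached to a value, like JavaScript's typeof."""
    return type(v).__name__

tag_of(3)         # 'int'
tag_of("number")  # 'str'
```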


a literature review of empirical studies on whether type systems aid productivity. "The summary of the summary is that most studies find very small effects, if any. However, the studies probably don’t cover contexts you’re actually interested in. ":



Row polymorphism

Let's say you had structural subtyping. Subtyping implies the rule of subsumption; if a term x has type T1, and T1 is a subtype of T2, then x also has type T2. In sequent notation,

CONTEXT |- x:T1    T1 <: T2
---------------------------
    CONTEXT |- x:T2

Structural subtyping means that you apply types to records as follows. The 'type' of a record is a list of fields that the record is guaranteed to have, together with the types of those fields. So, consider these two types:

T1 = {a: Number, b: Number, c: Number}
T2 = {a: Number, b: Number}

Anyplace where you need a T2, you could substitute a T1, because the only reason you would need a T2 is to access its fields 'a' and 'b', and T1 has both of those. So, with structural subtyping, T1 is a subtype of T2 (todo: ok, but what is the formal defn of structural subtyping?).

Now consider a function with type signature f: T2 -> T2. Let's say that this function takes something of type T2, mutates it, and returns the mutated result. What happens if you give it something of type T1? With structural subtyping, it would be legal for the implementation to simply upcast this to T2 (stripping any fields which aren't in T2), mutate that, then return the result. For example, with

let f x = x with {sum: x.a + x.b}
let answer = f {a: 1, b: 2, c: 100}

then under structural subtyping the 'answer' might be:

{a: 1, b: 2, sum: 3}

Note that field 'c' has been stripped! But this is legal because the only guarantee the type system gave for the return type was that it would be of type T2. But what we may have wanted was:

{a: 1, b: 2, c: 100, sum: 3}

Row polymorphism is a way to make the type system guarantee that these 'extra' fields aren't stripped. Row polymorphism represents records as 'rows' with an implicit polymorphic variable (we'll call it "rho") which contains any extra fields. So the type of "f" in the example above would be:

f :: {a: Number, b: Number | rho} -> {a: Number, b: Number, sum: Number | rho}
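At the value level, the behavior that row polymorphism guarantees is simply that extra fields are copied through unchanged. A dynamically typed Python sketch of that behavior (the row variable rho has no runtime counterpart here; it corresponds to "whatever other keys the dict happens to have"):

```python
def f(x):
    """Add a 'sum' field while passing every other field through unchanged,
    which is what the row-polymorphic type of f guarantees."""
    return {**x, "sum": x["a"] + x["b"]}

answer = f({"a": 1, "b": 2, "c": 100})
# keeps the extra field 'c': {'a': 1, 'b': 2, 'c': 100, 'sum': 3}
```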



Typed lambda calculus

(i think the following is only one of many typed lambda calculi? is this one "System F"? i'm not sure)

eg the polymorphic identity fn:

λ(a : *) → λ(x : a) → x

(so, the first argument is a type, 'a'; the second is a value in that type; then, that value is returned)

the type of the previous function is:

∀(a : *) → ∀(x : a) → a
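For comparison, a sketch of the same polymorphic identity using Python's typing module; here the type parameter is implicit at the value level, rather than an explicit λ(a : *) argument as above:

```python
from typing import TypeVar

a = TypeVar("a")

def identity(x: a) -> a:
    """The polymorphic identity: for any type a, takes an a and returns it."""
    return x

identity(3)     # 3
identity("hi")  # 'hi'
```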

The first parts of the annah tutorial can serve as a quick tutorial to typed lambda calculus (if you already know lambda calculus). The rest of it is a tutorial for 'annah', a typed lambda calculus-based (and Boehm-Berarducci encoding-based) language.


Henk: A Typed Intermediate Language section 3.3 gives the typical usage of the words: term, type, kind, sort.

(todo explain more)

" Unfortunately it is very hard to avoid using the word \type" for multiple purposes. In particular, note the di erence between sort Type and kind * ((* is usually pronounced 'type')). For example, both Int : * and Tree : * -> *are of sort Type; but only Int is of kind type (Int : *). "


should note somewhere the metamathematical roots of type theory in set-theoretic foundations and Russell's paradox


"...type "soundness" means that the type checker will always complain when your program is invalid. You can't write an invalid program with a sound type system." [29]


my question: is there a duality between (union or discriminated union?) types and subclasses? That is, is saying "Y is a subclass of X, and Z is a subclass of X" like saying one of "X is a union type of Y and Z" (this one would be type erasing, i guess) or "X is a discriminated union type of Y and Z"?



Issues with typeclasses

" Typeclasses aren't a pure win, they have downsides too:

There are probably more issues in the literature, but these are ones I've been banging my head against recently.

Typeclasses are a great idea and a useful addition to the PLT toolbox, but including them in a language isn't an automatic improvement: it's more of a design choice to steer the language down a particular path, when other languages may prefer other paths. Similar to using strict/non-strict evaluation, or exceptions, or macros.

" chriswarbo, -- [30]


Haskell typeclasses vs ML modules


" 1. Each datatype or newtype declaration has an effect -- namely, it creates a fresh type constant, different from all others.
2. The typeclass mechanism in Haskell makes this effect visible.

      newtype Foo = Foo Int
      newtype Bar = Bar Int
      instance Show Foo where
          show _ = "hi"
      instance Show Bar where
          show _ = "bye"

   Note that you can't just substitute Foo for Bar because they can have different Show instances.
3. As a result, if you permit type declarations in expression context, then evaluation order becomes observable. This would make Haskell impure (i.e., (\x -> e) e' would not equal [e'/x]e).
4. Because of this, type declarations are only permitted at toplevel scope, and not in expression contexts.
5. A module is basically the same thing as a top-level namespace.
6. First-class modules let you stick a module into any place an expression could appear.
7. This reintroduces the problem that the toplevel/expression distinction was invented to solve. " -- [33]

"Global coherence of type class instances is anti-modular -- most global properties are.

That said, you can have all the same syntax as Haskell type classes and as long as you don't ask for coherence, instance visibility being restricted by module boundaries is completely okay. You can also play around with named instances and a syntax for explicitly passing a named instances. You get something that looks a lot like Scala implicits, really.

But, without coherence, the current implementations of things like (Hash)Map and (Hash)Set become subtly incorrect. The structure of the values reflects the type class instance(s) that were previously used, and will only be correctly navigated if the same type class instance is used in the future. In this past, using incorrect instances of Typeable this could be used to write unsafeCoerce and violate memory safety of the language." -- [34]

Methods to extend ML modules to support typeclass-like stuff:

Methods to extend Haskell typeclasses to support ML modules-like stuff:


ML modules



" 1ML — Core and modules united

This is an amazing paper that I've been anticipating ever since I read the abstract about half a year ago.

    ML is two languages in one: there is the core, with types and expressions, and there are modules, with signatures, structures and functors. Modules form a separate, higher-order functional language on top of the core. There are both practical and technical reasons for this stratification; yet, it creates substantial duplication in syntax and semantics, and it reduces expressiveness. For example, selecting a module cannot be made a dynamic decision. Language extensions allowing modules to be packaged up as first-class values have been proposed and implemented in different variations. However, they remedy expressiveness only to some extent, are syntactically cumbersome, and do not alleviate redundancy.
    We propose a redesign of ML in which modules are truly first-class values, and core and module layer are unified into one language. In this "1ML", functions, functors, and even type constructors are one and the same construct; likewise, no distinction is made between structures, records, or tuples. Or viewed the other way round, everything is just ("a mode of use of") modules. Yet, 1ML does not require dependent types, and its type structure is expressible in terms of plain System Fω, in a minor variation of our F-ing modules approach. We introduce both an explicitly typed version of 1ML, and an extension with Damas/Milner-style implicit quantification. Type inference for this language is not complete, but, we argue, not substantially worse than for Standard ML.
    An alternative view is that 1ML is a user-friendly surface syntax for System Fω that allows combining term and type abstraction in a more compositional manner than the bare calculus.

Basically, it unifies module and core languages, without dependent types, while supporting type inference without requiring more type annotations than OCaml (i.e. only on module or higher-rank types). It leaves applicative functors for future work.

An extended technical report and a prototype interpreter are available here. " -- [37]

"What Andreas did is that he made modules into first-class objects while keeping the type system decidable; in essence, if I understand his paper correctly, he makes sure that types are not "expanded" when type-checking, i.e. abstract types cannot be instantiated by module signatures (which might contain yet other abstract types, which would need to be instantiated, and so on, exploding the type-checking system)." -- [38]


Linear types:

" In a nutshell, a linear type lets you prove that a variable is used exactly once....due to the aliasing properties of an imperative language like C# (especially for heap structures with possible cycles between them), the simple and elegant model of strict linearity is hard to make work.

Linear types themselves also aren’t very commonly seen in the wild, and are mostly useful for their mathematical and proof properties. If you go looking, you will find examples, however. More than real syntax in real languages, linear types have been hugely influential on subsequent innovations in type systems that also impacted us, such as affine and uniqueness types. " -- [39]. Paper: Philip Wadler’s 1990 Linear types can change the world!


Effect typing:

Type and Effect Systems.




Separation Logic... "is a formal technique for proving the disjointness of different parts of the heap" [40]. Concurrent form: . Midori team's proof technique:

--- Uniqueness

--- :

" keyof and Lookup Types

Many libraries take advantage of the fact that objects are (for the most part) just a map of strings to values. Given what TypeScript knows about each value’s properties, there’s a set of known strings (or keys) that you can use for lookups.

That’s where the keyof operator comes in.

interface Person { name: string; age: number; location: string; }

let propName: keyof Person;

The above is equivalent to having written out

let propName: "name"


This keyof operator is actually called an index type query. It’s like a query for keys on object types, the same way that typeof can be used as a query for types on values.

The dual of this is indexed access types, also called lookup types. Syntactically, they look exactly like an element access, but are written as types:

interface Person { name: string; age: number; location: string; }

let a: Person["age"];

This is the same as saying that a gets the type of the age property in Person. In other words:

let a: number;

When indexing with a union of literal types, the operator will look up each property and union the respective types together.

// Equivalent to the type 'string | number'
let nameOrAge: Person["name" | "age"];

This pattern can be used with other parts of the type system to get type-safe lookups, serving users of libraries like Ember.

function get<T, K extends keyof T>(obj: T, propertyName: K): T[K] { return obj[propertyName]; }

let x = { foo: 10, bar: "hello!" };

let foo = get(x, "foo"); has type 'number' let bar = get(x, "bar"); has type 'string'

let oops = get(x, "wargarbl");  // error!


" Mapped Types

Mapped types are definitely the most interesting feature in TypeScript 2.1.

Let’s say we have a Person type:

interface Person { name: string; age: number; location: string; }

Much of the time, we want to take an existing type and make each of its properties entirely optional. With Person, we might write the following:

interface PartialPerson { name?: string; age?: number; location?: string; }

Notice we had to define a completely new type.

Similarly, we might want to perform a shallow freeze of an object:

interface FrozenPerson { readonly name: string; readonly age: number; readonly location: string; }

Or we might want to create a related type where all the properties are booleans.

interface BooleanifiedPerson { name: boolean; age: boolean; location: boolean; }

Notice all this repetition – ideally, much of the same information in each variant of Person could have been shared.

Let’s take a look at how we could write BooleanifiedPerson with a mapped type.

type BooleanifiedPerson = { [P in "name" | "age" | "location"]: boolean };

Mapped types are produced by taking a union of literal types, and computing a set of properties for a new object type. They’re like list comprehensions in Python, but instead of producing new elements in a list, they produce new properties in a type.

In the above example, TypeScript uses each literal type in "name" | "age" | "location", and produces a property of that name (i.e. properties named name, age, and location). P gets bound to each of those literal types (even though it’s not used in this example), and gives the property the type boolean.

Right now, this new form doesn’t look ideal, but we can use the keyof operator to cut down on the typing:

type BooleanifiedPerson = { [P in keyof Person]: boolean };

And then we can generalize it:

type Booleanify<T> = { [P in keyof T]: boolean };

type BooleanifiedPerson = Booleanify<Person>;

With mapped types, we no longer have to create new partial or readonly variants of existing types either.

// Keep types the same, but make every property optional.
type Partial<T> = { [P in keyof T]?: T[P]; };

// Keep types the same, but make each property read-only.
type Readonly<T> = { readonly [P in keyof T]: T[P]; };

Notice how we leveraged TypeScript 2.1’s new indexed access types here by writing out T[P].

So instead of defining a completely new type like PartialPerson, we can just write Partial<Person>. Likewise, instead of repeating ourselves with FrozenPerson, we can just write Readonly<Person>! "
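The list-comprehension analogy from the quote can be made literal. This Python sketch computes, at the value level, the same mapping that the type-level Booleanify<T> performs, using get_type_hints as a stand-in for keyof (class and function names are invented for illustration):

```python
from typing import get_type_hints

class Person:
    name: str
    age: int
    location: str

def booleanify(cls):
    """Value-level analogue of `type Booleanify<T> = { [P in keyof T]: boolean }`:
    map every field name of cls to the type bool."""
    return {field: bool for field in get_type_hints(cls)}

booleanify(Person)
# {'name': <class 'bool'>, 'age': <class 'bool'>, 'location': <class 'bool'>}
```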


not sure if this is right but, following whateveracct's and my correction to it:

" A <: B means "A is a subtype of B" F[_] refers to a unary type constructor. If F[_] is "covariant", it means that if A <: B, then F[A] <: F[B] " If F[_] is "contravariant", it means that if A <: B, then F[B] <: F[A]

You can see that the 'co' and the 'contra' prefixes make sense, because in the 'covariant' definition, the ordering of A vs. B matches F[A] vs. F[B], whereas with contravariant, the ordering of F[A] vs. F[B] is flipped.

" Examples of covariant type constructors are List and Functions in their output. An example of a contravariant type constructor is Functions in their input type."

And consider the Substitutability principle [41]: if A <: B then you should be able to put an A anywhere a B is demanded.

For an example of covariance, dogs are animals, so say Dog <: Animal. Consider a function type Ft1, the type of functions whose input type is Int and whose output type is Dog; and a function type Ft2, the type of functions whose input type is Int and whose output type is Animal. If you need a function of type Ft2, it's okay to provide a function whose output type is Dog, because anything that outputs only Dogs is always outputting Animals, so we say Ft1 <: Ft2. So say the type constructor T is defined so that T(TypeX) = the type of functions whose input type is Int and whose output type is TypeX. So we have a covariant relationship for function outputs: because Dog <: Animal, T(Dog) <: T(Animal).

For an example of contravariance, consider a type constructor T2 defined so that T2(TypeX) = the type of functions whose input type is TypeX and whose output type is Int. If you need something of type T2(Dog), that is, you need a function which can accept Dogs as input, it's okay to provide something of type T2(Animal), that is, a function that can accept any Animal as input, because if it can handle any Animal it will certainly be able to handle a Dog, because Dogs are Animals. That is, because Dog <: Animal, T2(Animal) <: T2(Dog).
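The contravariant case above can be made runnable (class and function names invented): a handler that accepts any Animal is a safe substitute wherever a Dog-handler is demanded.

```python
class Animal:
    pass

class Dog(Animal):
    pass

def weigh_any_animal(a: Animal) -> int:
    """Has type Animal -> int, i.e. T2(Animal) in the notation above."""
    return 10

def apply_to_dog(handler) -> int:
    """Demands a T2(Dog): a function that can accept a Dog as input."""
    return handler(Dog())

# Contravariance: since Dog <: Animal, T2(Animal) <: T2(Dog), so passing
# the Animal-handler where a Dog-handler is needed is safe.
result = apply_to_dog(weigh_any_animal)  # 10
```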


" generics is orthogonal to polymorphism. Generics enables you to declare a parameter to a function as being of type T, where T isn't specified at the time the function is written. " -- [42]


great concise definitions and examples:


" FWIW, not doing type inference across across function boundaries is a semi-artificial limitation. Haskell can do it just fine despite being statically typed, and although Haskell has very different implementation strategies, that's not related to type inference. C++ now sort of does it with `auto`, but only does 'forward reasoning' - i.e. it can deduce the type of an operation's result from the types of its operands, but not vice versa. Rust has true type inference, but it eschews global type inference (across functions) mainly because of a tendency to cause hairy/confusing errors in larger programs. (There are also compilation time concerns, but they're not the primary reason AFAIK.) I suppose you can say that dynamically typed languages are easier to understand when they go wrong, compared to hairy Haskell errors - but I'm not sure to what extent that's actually true, as opposed to functional programmers just tending to use more complex types. " -- [43]