# proj-oot-ootTypeNotes3

--

http://www.quora.com/What-are-the-most-common-type-inference-algorithms-in-use-today

" Jesse Tov, likes programming languages. 20 votes by Eugene Marinelli, Jackson Davis, Mike Dillon, (more) We should probably distinguish “type inference algorithms” from “(usually) inferred type systems.” I’m going to start with the latter, since I suspect it’s how most people would understand the question.

Of all things that we could characterize as type inference, the fairly lame stuff going on in Java and C# is probably more common. That aside, for global type inference—that is, where you write no types in your program (unless you want to) and are guaranteed that the type system admits only typeable terms—as far as I know descendants of Hindley/Damas-Milner are the only game in town.

The actual type systems in use are, for the most part, not pure HM but extensions. Of the major HM-based languages, Standard ML is still pretty close, but other languages have moved away in different directions. Ocaml has extended HM with subtyping and row types, whereas Haskell has extended HM with type classes, both of which involve type constraints. In that sense, the most popular HM variant might be Sulzmann et al.’s HM(X), which is HM parameterized by a constraint system, though Ocaml and Haskell are not usually presented in that framework. Ocaml and Haskell have also both been extended with higher-rank polymorphism. So we might say that another commonly inferred type system is one for which type inference is undecidable: System F_\omega. Haskell’s FPH and whatever it is Ocaml is doing obviously don’t infer F_\omega types, but they do impressive type inference while dealing with a large portion of that. Along with FPH are a variety of other three-letter acronyms such as MLF, HMF, and HML, all of which are attempts to do HM-style type inference for larger subsets of System F.

The other interesting thing happening in type inference, and I don’t know how common it is, is local type inference. Scala is probably the most popular language right now with non-trivial local type inference. Scala’s type system is not globally inferable, but local type inference helps by eliminating the need for most type instantiation and some type annotations on bound variables. (Typed Racket and the language I work on both use local type inference as well, though I don’t think they qualify as “common.” But what I take away from this is that for people who want type systems to do increasingly difficult things, local type inference may be increasingly attractive.)

As for type inference algorithms, that’s harder to answer because I don’t actually know exactly what algorithms everyone is using. There are several distinct algorithms for pure HM, and some are more suitable than others for various extensions.

Damas and Milner (1982) first gave their type system in a non-algorithmic form. In particular, it doesn’t specify where to apply the generalization and instantiation rules. Their key contribution is Algorithm W, an algorithmic version of their type system that is sound and complete. That is, if W returns a result then there is a valid derivation of that result in their type system, and conversely, if a term is typable in their type system then W finds the principal (i.e., most general) type scheme. There are other algorithms for this same type system. Notably, Algorithm J is also sound and complete for HM, and many people consider it easier to understand.

However, I suspect that neither W nor J is the most common (HM-style) type inference algorithm at this point. Both Algorithms W and J call Robinson’s algorithm for first-order unification as a “subroutine.” Rather than interleave constraint generation and unification, it’s possible to separate type inference into two phases, the first of which traverses the term and generates a constraint, and the second solves the constraint by unification. I believe the idea originated with Wand (though I can’t figure out where). The best explanation of this that I know of is Pottier and Rémy’s chapter “The Essence of ML Type Inference,” in Advanced Topics in Types and Programming Languages (Pierce).

I mentioned earlier that inferred type systems today tend to be extended in ways that fall into the HM(X) framework, such as subtyping and type classes. Several of the first-class polymorphism extensions are in terms of constraints as well, though I don’t know whether that makes them candidates for the X in HM(X). While the original HM(X) paper uses an extension of Algorithm J (if I recall correctly), Pottier and Rémy recast it using the constraint solving approach, and it makes a lot of sense that way. I suspect that this new approach is more common (or becoming more common) than algorithms like W at this point. So it may be that the most common (non-trivially) inferred type system at this point is (instantiations of) HM(X), and the most common type inference algorithm is Pottier and Rémy’s algorithm for HM(X).

Via Christopher Thorpe. 10 Oct, 2010 "
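the Robinson-style first-order unification that Algorithms W and J call as a subroutine is small enough to sketch. this is my own minimal illustration (the type representation and names are invented here, not code from any of the cited papers):

```haskell
-- Minimal Robinson-style first-order unification sketch.
import qualified Data.Map as M

data Ty = TVar String | TCon String | TArr Ty Ty
  deriving (Eq, Show)

type Subst = M.Map String Ty

-- apply a substitution to a type
apply :: Subst -> Ty -> Ty
apply s t@(TVar v) = M.findWithDefault t v s
apply _ t@(TCon _) = t
apply s (TArr a b) = TArr (apply s a) (apply s b)

-- does variable v occur in type t?
occurs :: String -> Ty -> Bool
occurs v (TVar v')  = v == v'
occurs _ (TCon _)   = False
occurs v (TArr a b) = occurs v a || occurs v b

-- unify two types, returning the most general unifier if one exists
unify :: Ty -> Ty -> Maybe Subst
unify (TVar v) t = bindVar v t
unify t (TVar v) = bindVar v t
unify (TCon a) (TCon b) | a == b = Just M.empty
unify (TArr a1 b1) (TArr a2 b2) = do
  s1 <- unify a1 a2
  s2 <- unify (apply s1 b1) (apply s1 b2)
  Just (M.map (apply s2) s1 `M.union` s2)
unify _ _ = Nothing

bindVar :: String -> Ty -> Maybe Subst
bindVar v t
  | t == TVar v = Just M.empty
  | occurs v t  = Nothing            -- occurs check: no infinite types
  | otherwise   = Just (M.singleton v t)
```

e.g. unifying `a -> a` with `Int -> b` yields the substitution {a ↦ Int, b ↦ Int}, while unifying `a` with `a -> Int` fails the occurs check.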

---

http://lambda-the-ultimate.org/node/1115

what i call 'assumptions' he calls 'contracts'. maybe 'contract' is the better name.

http://www.cis.upenn.edu/~bcpierce/papers/harmful-mfps.pdf

--

oo, should read this:

also "Dan Licata pointed me to A Tutorial Implementation of a Dependently Typed Lambda Calculus by Andreas Löh, Conor McBride, and Wouter Swierstra which is similar to this one. It was a great inspiration to me, and you should have a look at it too, because they do things slightly differently: they use de Bruijn indices, they simplify things by assuming (paradoxically!) that , and they implement the calculus in Haskell, while we are going to do it in OCaml."

hmm, i think i should read that first:

http://www.andres-loeh.de/LambdaPi/LambdaPi.pdf

--

---

wikipedia's list of major type systems

--

ok this looks crazy: http://www.cs.indiana.edu/~sabry/papers/rational.pdf

--

success typings (i don't understand this yet but i think it may be what we want; hmm actually maybe we want to report BOTH conventional typings, and success typings?):

http://www.softlab.ntua.gr/Seminar/2006/presentations/sagonas.pdf

--

" F# has _ for placeholders in types. That works really well. Reply

    Liyang HU / Oct 27 2010 5:14 am
    @Ganesh: As does Agda.
augustss / Oct 27 2010 7:09 am
    The same has been suggested for Haskell a long time ago, but nobody implemented it.
Once you have _ you start wishing for more. For instance, _ -> _ means any function, but what if you wanted both _ to stand for the same type? Maybe _a -> _a?

...

UHC supports partial type signatures, although it uses different surface syntax: … -> …, resp %a -> %a for the examples above. It’s already there for a long time, so it might be a good idea to bring it in line with the notation Agda uses and to polish it up a bit. See also the doc http://www.cs.uu.nl/wiki/bin/view/Ehc/UhcUserDocumentation, look for partial type signatures.

"
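update: GHC has since implemented this as the PartialTypeSignatures extension, with NamedWildCards enabling the `_a` form where repeated wildcards denote the same type. a minimal sketch (assuming a reasonably recent GHC; by default the wildcards produce warnings, not errors):

```haskell
{-# LANGUAGE PartialTypeSignatures, NamedWildCards #-}
{-# OPTIONS_GHC -Wno-partial-type-signatures #-}

-- _ stands for a type to be inferred (Int here, forced by the literal):
double :: _ -> Int
double x = x + x

-- _a and _b are named wildcards: repeated occurrences denote the same type
swap' :: (_a, _b) -> (_b, _a)
swap' (x, y) = (y, x)
```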

---

ah, a better soln to the haskell variadic problem:

instead of saying (as i proposed but then discarded above) that you can have polymorphic return arguments, but they must be of the same kind, just say that you can have polymorphic return arguments, but for the purpose of type checking everything else, the type checker will pretend it can't infer which one you are using, even if it can.

e.g. if you have a function 'empty', which can return an empty list or an empty dict, you can't call any list-specific methods on its result that aren't present in dict
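in Haskell terms, the 'empty' scenario looks something like this sketch (class and method names are mine); the proposal would have the checker refuse to use list-specific context to pick the instance:

```haskell
import qualified Data.Map as M

class Empty c where
  empty' :: c

instance Empty [a] where
  empty' = []

instance Empty (M.Map k v) where
  empty' = M.empty

ok :: [Int]
ok = empty'            -- the annotation picks the list instance

-- bad = length empty' -- rejected by GHC: ambiguous which instance is meant
```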

hmm but wouldn't that cause it not to typecheck at all if called in a context in which a dict is required?

hmm the idea there was to check kinds when the kind was syntactically important, but still to allow, say, some sort of weird database store that could store functions of arbitrary arity

hmm could make the entire typechecker only go forward, not backward, when doing polymorphism (e.g. an initial fwd-only pass to resolve polymorphism, then both-directions passes to propagate constraints and check type safety, but polymorphism isn't allowed to be resolved anymore).. seems extreme..

could allow backwards polymorphism but only when there is an explicit type annotation sufficiently close to the source.. i guess the variadic haskell issue (the issue is just that it is confusing, not incorrect) happens even if we recursively follow one function, and then also the call site.. so maybe just do local type inference (not even module-local)?

---

so, a conundrum remains:

---

one idea i had was not to allow backwards ad-hoc polymorphism (that is, ad-hoc polymorphism where a type variable appears in the output of the function) where the cases for the output type variable had different arities.

--- i guess one way to state the key problem with http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function is:

(background:)

(1) the type checker saw 'sumOf 1 4 7 10 :: Integer'. It inferred that (sumOf 1 4 7) must be of type Integer -> Integer. It inferred that (sumOf 1 4) must be of type Integer -> Integer -> Integer... it inferred that sumOf must be of type Integer -> Integer -> Integer -> Integer -> Integer.

(2) it said, well sumOf is actually of type Integer -> r. Can r be Integer -> Integer -> Integer -> Integer?

(3) it said, maybe; although one instance for SumRes has r = Integer, another one has r = a -> r' (where a is in the Integral typeclass, and r' is in the SumRes typeclass). This matches if a = Integer and r' = Integer -> Integer -> Integer. Integer is in Integral, so i just need to verify that we can have SumRes r'. (recurse into step 3)

The problem, then, is that: (1) the typechecker looked at how many times the function sumOf and its descendents were actually being applied in the code, (2) the typechecker used backwards inference to propagate this info backwards, (3) the typechecker used this information to decide which ad-hoc polymorphic implementation of sumOf to use, (4) the typechecker recursed on this

As a result, the number of times the typechecker recursed affected what function was actually produced in the code, and this number was dependent not just on the name of the type used, but on the structure of the source code in the 'future'.

Perhaps all four of these need to be present for this to get confusing. It's not just that (2) backwards inference was used for (3) ad-hoc polymorphism; that's what a generic 'default' function would do, but that's fine. It's also that (1) the structure of code, not just the names of types, was used, and (4) this structure affected not just the internal computation of the typechecker, but also which computation would actually be performed.
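for reference, the full polyvariadic example from that Stack Overflow question, runnable (the FlexibleInstances pragma may or may not be required depending on GHC version):

```haskell
{-# LANGUAGE FlexibleInstances #-}

class SumRes r where
  sumOf :: Integer -> r

-- base case: stop consuming arguments and return the accumulated sum
instance SumRes Integer where
  sumOf = id

-- recursive case: absorb one more Integral argument into the accumulator
instance (Integral a, SumRes r) => SumRes (a -> r) where
  sumOf x = sumOf . (x +) . toInteger
```

e.g. (sumOf 1 4 7 10 :: Integer) evaluates to 22.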

I guess what i really find objectionable here is that i want the user to be able to think of the typechecker as a fancy, alien black box that proves some theorems about their code to catch bugs. I don't want the user to have to understand what the typechecker is thinking in order to figure out what a piece of code that they are reading will do. If the typechecker's thought pattern affects which implementation is chosen in ad-hoc polymorphism, then the user has to understand what the typechecker is thinking in order to determine what a piece of code will do.

And yet, i do want ad-hoc polymorphism. I want to be able to say "plus(3,5)" and "plus(3.0,5.0)", and let the compiler figure out that in the former, it should interpret 'plus' as referring to the implementation 'plus-integers', and in the latter, as referring to the implementation 'plus-floats'. I don't want to have to write "plus-integers(3,5)" to resolve this ambiguity.

Is the problem with backwards inference? Perhaps, as long as the inference pattern of the typechecker matches the control flow of the code, there is nothing 'extra' for the reader to understand, or rather, they can understand the compiler's ad-hoc polymorphism choices by pretending that at runtime it is using dynamic typing and looking at what type a value actually has.

But, as noted above, i do want backwards inference sometimes. I want to be able to define e.g. a function 'arrayinit(x)', of polymorphic type [A], which takes a length x and returns an array of length x whose values are of type A, and whose actual value is the default value of type A, whatever that default value is. To do this, i need to be able to call an ad-hoc polymorphic function 'default', which returns the default value, and which is polymorphic in the type parameter A of arrayinit, even though the typechecker might infer which type A is based on how arrayinit is used. Why is that OK?
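sketching the arrayinit/default scenario in Haskell (all names here are mine): the instance of 'def' is chosen backwards from how the result is used, and yet this feels benign.

```haskell
class Default a where
  def :: a

instance Default Int where
  def = 0

instance Default Bool where
  def = False

-- the element type, and hence which 'def' runs, is chosen backwards
-- from how the caller uses the result:
arrayinit :: Default a => Int -> [a]
arrayinit n = replicate n def

xs :: [Int]
xs = arrayinit 3   -- backwards inference picks the Int instance
```

here xs is [0,0,0]; the same call annotated as [Bool] would give [False,False,False].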

Maybe because arrayinit was generically polymorphic, not ad-hoc? But what if there is an optimized array implementation for integers, so arrayinit should be ad-hoc based on whether or not A is Integer? Then, what makes this OK if haskell sumOf is not?

Perhaps the recursion on the structure of the code is the problem? Perhaps, as long as the information from the future being used is a simple, opaque type, i don't have a problem with saying, 'yeah, the compiler figured out that this array is going to be an array of integers, so it initialized it to 0 and used the optimized integer-array implementation'; but i do have a problem with saying, 'the compiler saw that in the future i would apply this function to an integer and then apply the result of that again and then apply the result of that again, so it constructed the code (a + (b + c)) to match this structure.' Because now the compiler's backwards thought process seems Turing complete or close to it, and i have to follow it. But if we just prohibited recursion and made it polynomial (e.g. it can't construct varying-depth structures, only fixed-depth structures), then would that really be any better? Even if it is not Turing-complete or even NP-complete, it could still be looking at the structure of my future code and building a matching structure here.

Could we prohibit the compiler from looking at structure at all? But then, in a world with currying, the compiler can't figure out that when it sees "f 3 4 :: Int", f must be of type Int -> Int -> Int, because (f 3) is of type Int -> Int. So that seems too weak.

My intuition is that the problem is the idea that the compiler is looking at structure in the future and using that to build, not just some type that it uses within its black box, but rather the actual control-flow structure of my code via ad-hoc polymorphism. What we want to do is constrain either the information that affects backwards ad-hoc polymorphism to be non-structural (opaque black-box type names), or the way that ad-hoc polymorphism can act, so that it can't do structure-building.

The latter option seems fruitless because you don't want to prohibit recursion in the IMPLEMENTATIONs; and you don't want to prohibit the implementations from doing pattern matching on their inputs and destructuring them. But i think i am missing something there.

Also, if you do this right, it may make the typechecker simpler and maybe even less computationally complex, because i have this idea that we would reduce the process of typechecking from this complicated recursive process passing around complex recursive structures, into a simpler process that is locally recursive but that globally passes around opaque type names.

---

if there is any pattern-matching done in the output of ad-hoc polymorphism: then the structure of the pattern must be the SAME for all cases; only the constants can change

e.g.:

• you CAN have a function with an implementation whose output is Int, and another implementation whose output is String
• you CAN have a function with an implementation whose output is String -> Int, and another implementation whose output is Int -> String
• you CANNOT have a function with an implementation whose output is Int -> Int, and another implementation whose output is Int -> Int -> Int
• you CANNOT have a function with an implementation whose output is Int, and another implementation whose output is [Int]
• you CAN have a function with an implementation whose output is Int, and another implementation whose output is ListOfInts, which is a synonym for [Int]. But if the typechecker sees you using its output as an [Int], it won't allow this; you'll have to explicitly typecast from ListOfInts.
• you CAN have a function with an implementation whose output is FnFromIntToInt, and another implementation whose output is FnFromIntToFnFromIntToInt; but again, you'll have to explicitly typecast its result from either FnFromIntToInt or FnFromIntToFnFromIntToInt to Int -> Int or Int -> Int -> Int

hmmm... that's a bit strict, yes? it prevents someone from using the common syntax [Y] to designate the type of a list of Y, and then having a 'default' function work, e.g. '(x = default) :: [Y]' wouldn't work, because the language would want you to say '(x = default) :: ListOfY'. That seems a little silly and hard-to-remember/surprising.

--

how about if you add an exception: structure inside constants can change as long as no non-dummy variables are bound in there. But then wouldn't the haskell variadic thing work? recall that the thing we are trying to prohibit is:

class SumRes r where sumOf :: Integer -> r

instance SumRes Integer where sumOf = id

instance (Integral a, SumRes r) => SumRes (a -> r) where sumOf x = sumOf . (x +) . toInteger

the problem is in the last instance. But in the last instance, a and r are never used in the body of the function, although they are bound and used to give contexts. What does this mean?

when given a bunch of arguments to sum, e.g. (sumOf 1 2 3) :: Integer, the type checker is going to look for an instance of sumOf :: Integer -> (Integer -> (Integer -> Integer)). (i think) It'll match against the second instance, binding Integer -> (Integer -> (Integer -> Integer)) to Integer -> (a -> r) with a = Integer, r = Integer -> Integer.

Then it'll apply the body of the instance, replacing "sumOf 1" with "sumOf . (1 +) . toInteger".

But now is when those contexts come into play. toInteger is a typeclass (ad-hoc) polymorphic function. Which instance should we use for it? Rather than subbing in the body and then doing another pass, i bet Haskell just looks at the type signature given (or already computed) for this function body, namely Integer -> (a -> r) with a = Integer, r = Integer -> Integer; and it looks at the context, which constrains a by Integral; and it notes that a is bound to Integer; and so it looks to use an Integer version of toInteger, ie "Integral Integer". Similarly, for the new sumOf, it looks to use "Integer -> Integer" for the new instance, ie "SumRes (Integer -> Integer)". So, although neither a nor r explicitly appears in the body of the function, they actually are used to determine which code is ultimately executed in that body.

So, maybe this exception would work.

--

Could you get around this rule as a whole by just adding dummy code that has no effect to the Integer instance, and then binding a and r there too? This would look something like

class SumRes? r where sumOf :: Integer -> r

instance (FakeInteger a r) => SumRes FakeInteger where sumOf = id

but that wouldn't work, because a and r are only in the context, not in the actual signature.

---

so, the current proposal:

if there is any pattern-matching done in the output of ad-hoc polymorphism: then the structure of the pattern must be the SAME for all cases; only the constants can change; EXCEPT that you can have different substructure as long as you only bind dummy variables (that can't even be attached to contexts). Example:

• you CAN have a function with an implementation whose output is Int, and another implementation whose output is String
• you CAN have a function with an implementation whose output is String -> Int, and another implementation whose output is Int -> String
• you CAN have a function with an implementation whose output is Int -> Int, and another implementation whose output is Int -> Int -> Int
• you CAN have a function with an implementation whose output is Int, and another implementation whose output is [Int]
• you CAN have a function with an implementation whose output is Int, and another implementation whose output is [_] (where _ is a dummy type variable). E.g. you can have a function 'default' such that 'default :: Int == 0' and 'default :: [Int] == []'.
• you CANNOT have a function with an implementation whose output is Int, and another implementation whose output is [A] (where A is a normal (bindable) type variable)

in more detail (this proposal is meant to prevent black magic like the Haskell variadic function example, http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function ; see ootTypeNotes1):

similarly to Haskell, if you have any ad-hoc polymorphic functions inside the definition of some other function f, and those functions choose their case depending (or partly depending) on their output type, then the signature of f must explicitly provide 'contexts' showing which type variable in the signature of f binds to the type variable in the typeclass signature of the ad-hoc output-polymorphic functions within. ('Context' is the Haskell term, but i think we should choose another word for this, since i suspect we'll be using the word Context for something else, probably variable- or variable-scope-related.) This is important here because it is why [_] is more restrictive than [A]: you can't dispatch the ad-hoc polymorphic function on the _, but you can on the A.

(unlike Haskell, we allow dynamic ad-hoc polymorphism so we don't have this constraint on the inputs, only on the outputs)

---

hmm... why not just ban output ad-hoc polymorphism altogether and use type parameters for things like 'default'?
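one existing design in this direction: GHC's TypeApplications extension lets the caller pass the type parameter explicitly instead of relying on output-directed inference (a sketch; 'Default'/'def' are my names):

```haskell
{-# LANGUAGE TypeApplications #-}

class Default a where
  def :: a

instance Default Int where
  def = 0

instance Default Bool where
  def = False

x :: Int
x = def @Int   -- instance chosen by an explicit type argument,
               -- not inferred backwards from the surrounding context
```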

---

note: in haskell, the variadic sum example hides additional complexity of the type system. You might ask, 'why is (Integral a, SumRes r) => needed in the second instance?'. If you omit this, you get

class SumRes r where sumOf :: Integer -> r

instance SumRes Integer where sumOf = id

instance SumRes (a -> r) where sumOf x = sumOf . (x +) . toInteger

test.hs:8:15: No instance for (SumRes r) arising from a use of `sumOf' In the first argument of `(.)', namely `sumOf' In the expression: sumOf . (x +) . toInteger In an equation for `sumOf': sumOf x = sumOf . (x +) . toInteger

test.hs:8:31: No instance for (Integral a) arising from a use of `toInteger' In the second argument of `(.)', namely `toInteger' In the second argument of `(.)', namely `(x +) . toInteger' In the expression: sumOf . (x +) . toInteger

you might say, 'OK well at least we don't need that toInteger for sums of actual Integers; just restrict the type signature to Integer'. But you get (omitting the unchanged parts)

instance (Integer a, SumRes r) => SumRes (a -> r) where sumOf x = sumOf . (x +)

test.hs:7:11: `Integer' is applied to too many type arguments In the instance declaration for `SumRes (a -> r)'

this is because Integer is a type, not a type class, and so 'Integer a' doesn't make sense. So try:

instance (SumRes r) => SumRes (Integer -> r) where sumOf x = sumOf . (x +)

test.hs:7:24: Illegal instance declaration for `SumRes (Integer -> r)' (All instance types must be of the form (T a1 ... an) where a1 ... an are *distinct type variables*, and each type variable appears at most once in the instance head. Use -XFlexibleInstances if you want to disable this.) In the instance declaration for `SumRes (Integer -> r)'

So we can't have type constants in our instances. So even if we know we only have Integers, we'd have to invent a type class like Integral and a conversion function like toInteger, with a single instance for Integer whose implementation is just 'id'. Sheesh. I guess in Oot, the 'everything is a type class' idea will help with this bit, at least.
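the workaround described would look something like this (the class and method names are my own invention):

```haskell
-- A one-instance type class standing in for the type constant Integer,
-- so that instance heads can stay bare type variables.
class IntegerLike a where
  asInteger :: a -> Integer

instance IntegerLike Integer where
  asInteger = id
```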

Now what if we remove the (recursive) SumRes context in the type signature?

instance (Integral a) => SumRes (a -> r) where sumOf x = sumOf . (x +) . toInteger

test.hs:8:15: Could not deduce (SumRes r) arising from a use of `sumOf' from the context (Integral a) bound by the instance declaration at test.hs:7:10-40 Possible fix: add (SumRes r) to the context of the instance declaration In the first argument of `(.)', namely `sumOf' In the expression: sumOf . (x +) . toInteger In an equation for `sumOf': sumOf x = sumOf . (x +) . toInteger

this is the same kind of error that you get if you leave the body the same, and remove the 'Integral' context:

instance (SumRes r) => SumRes (a -> r) where sumOf x = sumOf . (x +) . toInteger

test.hs:8:31: Could not deduce (Integral a) arising from a use of `toInteger' from the context (SumRes r) bound by the instance declaration at test.hs:7:10-38 Possible fix: add (Integral a) to the context of the instance declaration In the second argument of `(.)', namely `toInteger' In the second argument of `(.)', namely `(x +) . toInteger' In the expression: sumOf . (x +) . toInteger

i think (though am very uncertain) that what this is saying is that "You used a typeclass function, 'toInteger', in the definition of the body of this function; when you do that, you need to make this function's signature bind to the typeclass's signature, so that we will know how to choose between the polymorphic typeclass instances.".

These are all simple enough reasons for errors, but, especially given that there are so many of them, they make it hard for a newcomer (or even someone who, like me, hasn't looked at Haskell for awhile and forgot some stuff) to get going with Haskell.

---

"

Subtyping and type inference

The type inference problem in the presence of parametric polymorphism and subtyping like in Java and C# is significantly harder than in Haskell (98) or SML. You're very unlikely to ever get anything like those FP languages' type inference. Scala is a good language to look at in this regard. It too uses (only) local type inference which is more or less as much as one can reasonably expect. That said, you can do way, way more with local type inference than C# is doing (as Scala demonstrates.) By Derek Elkins at Fri, 2008-06-20 22:12

"

"

Both necessary and not

In Typed Scheme, because Scheme inherently uses subtyping, we were unable to choose an H-M style global inference algorithm. Further, based both on our experience with ML/Haskell and our experience with the HtDP style of giving a contract and a purpose statement for every function, we decided that that amount of annotation was a good middle ground. This also has the significant benefit that you can tell by looking at a function what the type is, which is often not the case in ML.

However, in some places, type inference should be considered mandatory. For example, with the exception of FX-89, no language I know of has ever had polymorphic functions which required explicit instantiation at call sites. For example, everyone wants (map add1 (list 1 2 3)) to work without an additional annotation. This, of course, requires type inference. We, like Scala, chose a solution based on Pierce and Turner's Local Type Inference system, but lots of other solutions are possible here (Java does inference at call sites as well, for example). By Sam Tobin-Hochstadt at Sat, 2008-06-21 12:12

"

--

" Probably because they don't know it's useless. I use and like C#. The type inference seems useful to me. Avoiding generic parameters on almost every linq extension method is a huge savings in comprehensibility. var x = new SuperDuperLongClassName(); is a nice savings in redundancy. "

--

" Types can be parameterized over other types so that we end up with “type constructor” entities whose instances are concrete types much as data values are instances of types"

interesting; it's true, a List(Ints) is an instance of the List() type constructor; but it is also an element in the range of List() if List() is thought of as a function from types to types.

--

http://www.hypergraphdb.org/learn?page=RefTypes&project=hypergraphdb

--

ProtoLisp's types include numbers, truth values, characters, procedures, sequences, symbols and pairs (a pair denotes a procedure call).

--

see also typeSystem (later: moved to ootTypeNotes1), typeSystemDesignNotes (in other folder)

--

So i guess a type system that lets a type of all types be constructed is impredicative. I'm not sure if that's really a problem for programming languages though (except for its influence on the decidability of type checking and type inference). i guess if the programmer said, "T is the type of all types. R is the type of all types Y such that Y is not a member of Y. Variable x is of type R. x = R.", the type checker might hang. Surely we can just rely on the programmer not to do that (and maybe have a max recursion limit in the type checker)?

--

" There are a few things about Go that I'm not super happy about, and that tend to bite me from time to time.

First, you need to "know" whether the variable you are dealing with is an interface or a struct. Structs can implement interfaces, of course, so in general you tend to treat these as the same thing. But when you're dealing with a struct, you might be passing by reference, in which the type is *myStruct, or you might be passing by value, in which the type is just myStruct. If, on the other hand, the thing you're dealing with is "just" an interface, you never have a pointer to it -- an interface is a pointer in some sense. It can get confusing when you're looking at code that is passing things around without the * to remember that it might actually "be a pointer" if it's an interface rather than a struct.

Go's type inference makes for lean code, but requires you to dig a little to figure out what the type of a given variable is if it's not explicit. So given code like:

    foo, bar := someFunc(baz)

You'd really like to know what foo and bar actually are, in case you want to add some new code to operate on them. If I could get out of the 1970s and use an editor other than vi, maybe I would get some help from an IDE in this regard, but I staunchly refuse to edit code with any tool that requires using a mouse.

Finally, Go's liberal use of interfaces allows a struct to implement an interface "by accident". You never have to explicitly declare that a given struct implements a particular interface, although it's good coding style to mention this in the comments. The problem with this is that it can be difficult to tell when you are reading a given segment of code whether the developer intended for their struct to implement the interface that they appear to be projecting onto it. Also, if you want to refactor an interface, you have to go find all of its (undeclared) implementations more or less by hand. "

--

"

Java generics are an example on how not to do it, but it does not mean all generic implementations are like Java's.

szalansky 1 day ago

Hope you write some further explanation for your statement, because so far it looks like part of this c++ vs. java flame.

pjmlp 1 day ago

• Type erasure. Going the .NET way would be a better approach on my opinion;
• Lack of type definitions. If it was possible to declare types in Java, the generic declarations wouldn't need to be so monstrous;
• The common co-variance/contra-variance issues with the generic types;
• Lack of support for builtins;
• No way to specialize generic algorithms for specific types
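For the type-erasure complaint, Python happens to make a convenient illustration, since its generics are likewise erased at runtime (the element type of a parameterized container is gone by the time the program runs):

```python
# Python's generics are also erased at runtime, which gives a feel for the
# Java complaint above: the element type of a parameterized container is
# simply not represented in the running program.
xs: list[int] = [1, 2, 3]
ys: list[str] = ["a", "b"]

# Both are just 'list' at runtime; nothing records "of int" vs "of str":
assert type(xs) is list
assert type(xs) is type(ys)
```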

--


some notes from http://research.microsoft.com/en-us/um/people/simonpj/papers/haskell-retrospective/ECOOP-July09.pdf regarding haskell type classes:

" Generics, not subtyping

Haskell has no subtyping. The ability to act on arguments of various types is achieved via type classes:

    data Tree = Leaf | Branch Tree Tree

    f :: Tree -> Int
    f t = ...
    -- f's argument must be (exactly) a Tree

    square :: Num a => a -> a
    square x = x*x
    -- works for any type supporting the Num interface

This means that in Haskell you must anticipate the need to act on arguments of various types (in OO you can retroactively subclass Tree). "

however, compared to Java, which has subtyping, this restriction to generics allows haskell to do more type inference:

" Reason: generics alone => the type engine generates equality constraints, which it can solve; subtyping => the type engine generates subtyping constraints, which it cannot solve (uniquely) "

bayle: i don't understand why constraints that can't be solved uniquely are a problem; can't you just say 'this is some type that's a subtype of that'? is this one of those problems caused by the doctrine of principal types?

haskell's parametric polymorphism can be used for some of the use cases of subtyping

because haskell does not have subtyping, the language designer doesn't have to think about http://en.wikipedia.org/wiki/Covariance_and_contravariance_%28computer_science%29
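A rough Python sketch of the type-class idea: a function written once against a structural "supports multiplication" bound, rather than against a subtype of some Number base class (the names Multipliable and square are illustrative):

```python
# Loose analogue of Haskell's  square :: Num a => a -> a
from typing import Protocol, TypeVar

class Multipliable(Protocol):
    def __mul__(self, other): ...

T = TypeVar("T", bound=Multipliable)

def square(x: T) -> T:
    # works for any type supporting multiplication, not for subtypes of
    # some particular base class
    return x * x

assert square(3) == 9
assert square(2.5) == 6.25
```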

--

times when co- and contra-variance come up (see en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)/ ):

• a subtype method must be at least as permissive in what it accepts as the superclass, and at least as constrained in what it emits; so it must be contravariant in its arguments and covariant in its return types
• wikipedia adds: "Languages with first-class functions have function types..Those languages also need to specify when one function type is a subtype of another—that is, when it is safe to use a function of one type in a context that expects a function of a different type...In other words, the → type constructor is contravariant in the input type and covariant in the output type. This rule was first stated formally by Luca Cardelli.[1]..When dealing with functions that take functions as arguments, this rule can be applied several times...it is easy to calculate which positions are co- and contravariant: a position is covariant if it is on the left side of an even number of arrows."
• wikipedia adds: "This creates problems in some situations, where argument types should be covariant to model real-life requirements. Suppose you have a class representing a person. A person can see the doctor, so this class might have a method virtual void Person::see(Doctor d). Now suppose you want to make a subclass of the Person class, Child. That is, a Child is a Person. One might then like to make a subclass of Doctor, Pediatrician. If children only visit pediatricians, we would like to enforce that in the type system. However, a naive implementation fails: because a Child is a Person, Child::see(d) must take any Doctor, not just a Pediatrician. We could try moving the see() method to the Doctor class hierarchy, but we would have the same problem: If a Doctor could see a Person and a Child is a Person, then there is still no way to enforce that a Child must see a Pediatrician and that a Person who is not a Child cannot see a Pediatrician and must see another Doctor. In this case, the visitor pattern could be used to enforce this relationship. Another way to solve the problems, in C++, is using generic programming (see below)."
• wikipedia adds that Giuseppe Castagna[2] noticed that the sort of reasoning that leads us to pair Child with Pediatrician is captured by runtime multiple dispatch. That article states, "In type-theoretic research on object-oriented programming, the issue of “covariance versus contravariance” is a topic of continuing debate. In this short note we argue that covariance and contravariance appropriately characterize two distinct and independent mechanisms. The so-called contravariance rule correctly captures the subtyping relation (that relation which establishes which sets of functions can replace another given set in every context). A covariant relation, instead, characterizes the specialization of code (i.e., the definition of new code which replaces old definitions in some particular cases). Therefore, covariance and contravariance are not opposing views, but distinct concepts that each have their place in object-oriented systems. Both can (and should) be integrated in a type-safe manner in object-oriented languages. We also show that the independence of the two mechanisms is not characteristic of a particular model but is valid in general, since covariant specialization is present in record-based models, although it is hidden by a deficiency of all existing calculi that realize this model. As an aside, we show that the λ&-calculus can be taken as the basic calculus for both an overloading-based and a record-based model. Using this approach, one not only obtains a more uniform vision of object-oriented type theories, but in the case of the record-based approach, one also gains multiple dispatching, a feature that existing record-based models do not capture"
• writing to a container, you may only write things that are subtypes of the container's value type; but reading from a container, you may only assume that the value you get will be a supertype of the container's value type. since we think of this in terms of casting the container's value type (rather than casting the value to be read/written), writing to a container is contravariant (you can only pretend that the container's value type is a superclass of what it actually is), and reading from a container is covariant (you can only pretend that the container's value type is a subclass of what it actually is).
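The array case can be made concrete; here is a small Python demonstration (illustrative classes) of why a mutable Cat[] cannot safely be treated as an Animal[]:

```python
# Covariant aliasing of a mutable list lets a write poison the original.
class Animal: ...

class Cat(Animal):
    def meow(self) -> str:
        return "meow"

class Dog(Animal): ...

cats: list[Cat] = [Cat()]
animals: list[Animal] = cats   # unsound covariant step; Python won't stop us
animals.append(Dog())          # fine for a list of Animals, bad for 'cats'

assert cats[1].__class__ is Dog        # the "cat" list now holds a Dog
assert not hasattr(cats[1], "meow")    # reading it as a Cat would blow up
```

The write went through the Animal-typed alias, so the error only surfaces later, at a read site far from the mistake.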

regarding en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)/ and arrays, i assume the best thing to do is to actually treat array reads and array writes differently (e.g. covariance for reading, contravariance for writing)

regarding the child who can only see a pediatrician, not any ol' doctor, i guess the example demonstrates that in this case, the concept of 'Child' is not a subtype of 'Person', nor is 'Pediatrician' a subtype of 'Doctor', at least if the 'see' function is included in either one as normally defined. But how does this comport with the notion that children are subsets of people, and pediatricians are subsets of doctors? The problem is that, when we say that a 'Person' has a function 'see' whose domain is 'Doctor', we are stating that any Person may see any Doctor. What we wanted to say was, rather, that if x is a Person, then a necessary but not sufficient condition for y to be in the domain of x.see is for y to be a Doctor.

this may relate to my proposed separation between 'demands' and 'promises'; if you see x.see(y), where x is a Person, this demands that y is a Doctor, but it does not promise that the expression is valid if y is a Doctor. I guess a better notation would be necessary vs. sufficient conditions for validity; if x is a Person, and you have x.see(y), a necessary condition for validity is that y is a Doctor, but this is not a sufficient condition. Alternately, we may speak of only necessary and sufficient conditions, but of sub- or super-types; if x is a Person, and you have x.see(y), then the condition for validity must be of the form that y is some subtype of Doctor (note: as stated this would seem to rule out disjunctive and conjunctive expressions in the condition of validity, which isn't quite what i wanted).

Trying to understand Castagna's notion of relating multiple dispatch, i came up with the following:

• if the compiler sees that Child is a subset of Person, then the fact that a Person can only See a Doctor gives us a necessary condition for x.see(y) to be valid when x is a Child: y must be a Doctor
• if the compiler sees a multiple dispatch implementation of Child.see(Pediatrician), e.g. an implementation in the Child class of .see where the parameter is of type Pediatrician, then this provides a sufficient condition for x.see(y) to be valid when x is a Child: when y is a Pediatrician we know what to do.
• the previous two cannot both be QUITE right, because what happens when Pediatrician isn't a subtype of Doctor? in that case the multiple dispatch definition and the declaration that Person.see() must take a Doctor parameter would seem to be in conflict. Three things that we could do (there may be more) are: (a) say that the sufficient condition is actually the conjunction of any necessary conditions, plus what we called above the sufficient condition (e.g. Child.see() is only permitted to be called with a Doctor, so the Child.see(Pediatrician) definition is only actually called when the Pediatrician is also a Doctor); (b) say that what we called above the necessary condition is actually the disjunction of that with the sufficient conditions (e.g. Child.see(y) is actually valid whenever y is a Pediatrician, OR y is a Doctor); (c) notice this discrepancy and throw a compile-time error. I favor (a).
• there is a connection to success typings. Traditional typing seeks to guarantee through compile time analysis that the program will have no runtime type errors; success typings seeks to prove through compile time analysis that the program has a type error. So, traditional typing is concerned with the sufficient conditions for validity, seeking to prove that in every term in the program, the sufficient conditions for type validity are true, and success typing is concerned with the necessary conditions for validity, seeking to find a term in the program for which the necessary conditions for type validity are false.
• the previous bullet point highlights that success typings may be badly named. In success typings, the type you associate with each term is the maximal type which meets the necessary conditions. If you ever unify two of these and get the null set (the inconsistent type which has no values as members), then you know that two terms have been combined in a way such that the necessary conditions for both of them can't possibly be met, therefore you have proved that there is an error in the program, e.g. a statement that if reached would cause a runtime error. So really, you might call this maximal type which meets the necessary conditions (and therefore, which represents the necessary conditions) a 'failure type', since knowing that a value falls within this type doesn't guarantee success, but knowing that a value falls outside this type does guarantee failure. And similarly, you might call the minimal type which meets the sufficient conditions a success type.
• with actual multiple dispatch, there is an additional wrinkle; if you define a dispatch handler for Child.see(Pediatrician) but nothing else, then there is no way to handle Child.see(y) where y is a Doctor but not a Pediatrician. So in this case, just by looking at the multiple dispatch, you've also deduced something about the necessary conditions, not just the sufficient ones. If you say well then, we'll have to make our necessary conditions more constraining, then this may be the dual to solution (a) in the point above starting with "the previous two cannot both be QUITE right".
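Option (a) above can be sketched in code; here is a hand-rolled Python rendering (all class names are the note's hypothetical example) in which the Child.see(Pediatrician) dispatch case only fires when the argument also meets the necessary condition inherited from Person, namely being a Doctor:

```python
# A sketch of option (a): the sufficient condition (the dispatch case)
# is conjoined with the inherited necessary condition.
class Doctor: ...
class Pediatrician: ...                      # deliberately NOT a subclass of Doctor
class PediatricDoctor(Doctor, Pediatrician): ...

class Person:
    def see(self, d):
        if not isinstance(d, Doctor):        # necessary condition from the supertype
            raise TypeError("a Person may only see a Doctor")
        return "saw a doctor"

class Child(Person):
    def see(self, d):
        if isinstance(d, Pediatrician):      # sufficient condition from the dispatch case...
            if not isinstance(d, Doctor):    # ...conjoined with the inherited necessary one
                raise TypeError("a Child's pediatrician must also be a Doctor")
            return "saw a pediatrician"
        raise TypeError("no dispatch case: a Child only knows how to see a Pediatrician")

assert Child().see(PediatricDoctor()) == "saw a pediatrician"
```

A Pediatrician that is not also a Doctor is rejected, which is exactly the conjunction of the dispatch case with the inherited annotation.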

also, in relation to the co/contra-variance of arrays, i think a simpler way to look at it is:

One way to look at it is that the container has an interface with two methods, read(): v and write(): v -> (); as usual, arguments must be contravariant and return types must be covariant; so when you write to an array you have contravariance in what you pass to the write() method, and when you read from an array you have covariance in what you get back from the read() method. For example, if you have an array of type CatArray, and if this is a subtype of arrays called AnimalArray, then in order to satisfy Liskov substitution w/r/t AnimalArray, a CatArray would have to accept any Animal written to it (it may demand only Animal), and promise to emit only Animals when reading from it (it must promise at least Animal).

Clearly that doesn't fit the bill; the above shows that what we called a CatArray isn't the same notion as Cat[], or alternately, that Cat[] is not a subtype of Animal[], even though Cat is a subtype of Animal (that is, A \subset B does NOT imply that A[] \subset B[]).

When Wikipedia notes that immutable arrays can be covariant, that's because it's just like saying that anywhere you have an AnimalArray.read(), you can substitute a CatArray.read() without harm; the 'immutable' condition is here the same as saying 'an object with a subset of the full interface' (namely, only the 'read' method).

--

the above suggests an inference algorithm that works in terms of the following: for each term, keep track of two types. in the following, i use the terms 'success typing' and 'failure typing' in the opposite of the traditional manner:

(a) a type representing the disjunction of all of the explicitly given sufficient conditions, e.g. the cases of a multiple dispatch
(b) a type representing the conjunction of all of the explicitly given necessary conditions, e.g. the inferences on subtypes given by Liskov substitution

now the type associated with a term is the intersection of (a) and (b). In terms of conditions, that is:

(a disjunction of each explicitly given sufficient conditions) AND (a conjunction of each explicitly given necessary condition)
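A toy rendering of this formula, with "types" modeled as frozensets of values (the set contents are hypothetical stand-ins, not real types):

```python
# The type of a term is
#   (union of explicit sufficient cases) ∩ (intersection of explicit necessary conditions)
def term_type(sufficient_cases, necessary_conditions):
    disjunction = frozenset().union(*sufficient_cases)
    conjunction = None
    for c in necessary_conditions:
        conjunction = c if conjunction is None else conjunction & c
    return disjunction if conjunction is None else disjunction & conjunction

Doctor = frozenset({"gp", "pediatric_doctor"})
Pediatrician = frozenset({"pediatric_doctor", "non_doctor_ped"})  # not a subset of Doctor

# Child.see: the dispatch case supplies the sufficient condition
# (Pediatrician); the inherited annotation supplies the necessary one (Doctor):
assert term_type([Pediatrician], [Doctor]) == frozenset({"pediatric_doctor"})
```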

The Liskov substitution principle is not satisfied, because a Child cannot always be substituted for a Person (because, due to the conjoining of the disjunction of multiple dispatch cases, it is a type error for a Child to see a non-Pediatrician Doctor)

However, if the type annotations for Person are not regarded as a success typing but rather only a failure typing (not as, "if i use this Person in such a way so that these are satisfied then i want to be guaranteed that i won't have a type error at runtime", but instead, "i want the compiler to notify me if i DON'T use this Person in such a way so that these are satisfied"), then the corresponding principle to Liskov substitution makes sense.

hmmm... needs more work..

if whenever we have multiple dispatch, we require (a disjunction of each explicitly given sufficient condition) to be a superset of (a conjunction of each explicitly given necessary condition), then the above type just becomes a conjunction of the necessary conditions, so Liskov substitution works; e.g. if we require that any multiple dispatch exhausts the type annotation given to the containing function. However, in a case like Child-Pediatrician, this just leads the programmer to create a multiple dispatch clause covering the case where a Child sees a non-Pediatrician Doctor, which will just throw a runtime exception. We want a compile-time error in this case.

ok how about this. we need a way for the programmer to be able to separately express "if this condition is satisfied, that suffices to show that this will be well-typed" vs "if this condition is NOT satisfied, that suffices to show that this will be ill-typed", that is, to distinguish success type annotations from failure type annotations. This goes back to my "demand vs. promise" idea: a success type annotation is a promise, a failure type annotation is a demand.

however, we also want a way to make explicit statements that are combined with statements elsewhere to form the full success or failure typing, e.g. the way that we want to constrain Child.see() to take only Doctors even if the multiple dispatch just has a clause for Pediatrician, and Pediatrician is not a subtype of Doctor.

So, the explicitly given demand or promise statements are really just partial demands or promises.

Now, when defining Person, we provide facilities for the programmer to say EITHER (or both) of: "i promise that if you do Person.see(x) and x is a Doctor, then this is valid" vs "i demand that if you do Person.see(x) then x must be a Doctor". If the former (the promise), and if Child is a subtype of Person, then Child.see(x) must accept any Doctor in x. If the latter (the demand), and if Child is a subtype of Person, then Child.see(x) might put additional conditions upon x beyond that it be a Doctor. Liskov substitution only applies to the promises.

As for combining explicit typings with implicit, what i mean is that if in one part of the code you make a promise that x is valid given some condition A, but then in a superclass you make a demand that condition B must hold of x, then the real promise is only that x is valid if (A and B). But what about duality? In what circumstances do explicit promises affect demands, instead of just explicit demands weakening subtype promises?

One thing we could do is have two notions of subtype rather than one; a promising subtype and a demanding subtype. I'm not going to consider that right now.

Another thing we could do is have supertype promises weakening subtype demands. E.g. if in the supertype we say that a Person can see any Doctor, and then in the subtype we say that a Child must see a Pediatrician, then we implicitly weaken the demand in the subtype to be "must see a Pediatrician OR a Doctor". In this world we would have a compile-time error if Person promises that it can see any Doctor, but Child has a multiple dispatch with only Pediatrician; the Child multiple dispatch would infer both a promise and a demand, and that demand would have to be weakened, but it cannot be because it's coming from the multiple dispatch, so there would be an error. In this system, the supertype's promises and demands always implicitly override the subtype's, and when the subtype tells the compiler W, this is interpreted as something like "W, unless my supertype disagrees".

i think i like the latter suggestion.

---

## co/contravariance, subtyping, promise/demand, failure and success typings

provide the programmer facilities to make type annotations that are either success typings (in my sense, which is the opposite of the traditional sense), which are promises that if they are obeyed, then things are well-typed, and also failure typings, which are demands that say that if they are not obeyed, then things are ill-typed.

explicit subtype promise and demand type annotations are implicitly weakened by supertype demands and promises.

multiple dispatch generates implicit promises and demands. If an implicit demand would be weakened by supertype promises, a compile-time error is thrown.

Perhaps promises are disjunctions and demands are conjunctions, although i'm not sure about that yet.

So, e.g. if Child is a subtype of Person and a Pediatrician is NOT a subtype of Doctor, if the supertype Person has a method .see(x):

   and it makes a demand that x is a Doctor, and the subtype Child implements via multiple dispatch the case where the x in child.see(x) is a Pediatrician, then in fact the implied promise is weakened, and a Child can only .see Pediatricians which are also Doctors
   and it makes a promise that if x is a Doctor, then person.see(x) is valid, and the subtype Child implements via multiple dispatch the case where the x in child.see(x) is a Pediatrician, then a compile-time error is thrown, because the implied demand that x be a Pediatrician must be weakened by the superclass's promise, so the actual demand of Child is only that .see(x) be passed a Doctor, and this demand is not exhaustively covered by the multiple dispatch cases.

Liskov substitution holds in both cases; if Person makes a promise that x in .see(x) need only be a Doctor, then others using Person can rely upon this promise, and we have a success typing where a call to person.see(doctor) is promised to succeed. But if Person makes a demand that x in .see(x) needs to be a Doctor, then we have a failure typing where a call to person.see(non-doctor) fails at compile-time, but there is no guarantee that person.see(doctor) will always succeed.

We can state a modified Liskov substitution principle in terms of promises and demands, rather than in terms of all properties of the type, as: the subtype must promise everything that the supertype promises, and may not demand anything that the supertype doesn't demand. This allows us to capture failure-typing properties (demands) as well as success-typing properties (promises).
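This modified rule is easy to state mechanically; a sketch, with promises and demands as sets of named conditions (the strings are illustrative labels, not real types):

```python
# Modified Liskov check: the subtype must promise at least what the
# supertype promises, and may demand at most what the supertype demands.
def substitutable(sub, sup):
    return (sup["promises"] <= sub["promises"]
            and sub["demands"] <= sup["demands"])

person = {"promises": {"see(Doctor) is valid"}, "demands": set()}
child_ok = {"promises": {"see(Doctor) is valid", "see(Pediatrician) is valid"},
            "demands": set()}
child_bad = {"promises": set(), "demands": {"see() arg must be Pediatrician"}}

assert substitutable(child_ok, person)       # more promises, no new demands
assert not substitutable(child_bad, person)  # drops a promise, adds a demand
```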

Note that in the latter case, since Person doesn't give ANY promise regarding the type of x in .see(x), any uses of Person elsewhere in the program can never be proven to be valid. A programmer demanding success type safety cannot use Person.see (they must speak in terms of subtypes like Pediatrician). However, if Person was guaranteed to be one of an exhaustive list of potential subtypes, then we could take the conjunction of all of the promises in the subtypes to get a promise for Person, e.g. if a Person is either a Child or an Adult, x is guaranteed to be valid in person.see(x) iff x is both a Pediatrician and an AdultDoctor. So this suggests that we should provide a way to state an exhaustive enumeration of the subtypes of a supertype.

note that subtypes can also expand functionality by making promises that their supertypes don't. E.g. if the supertype only promises success when .see is given someone who is both a Pediatrician and an AdultDoctor, then a Child could promise that they can .see() a Pediatrician who isn't an AdultDoctor. Another example is that an AlternativeMedicineBeliever could promise that they can .see() a chiropractor, whereas an arbitrary Person might not promise that.

note that you still require the language to support parametric polymorphism (generics) in order to write functions that both read and write arrays of arbitrary type; an array which promises to give out Cats must also demand that you only put Cats in it; so if you are moving elements from one location in an array to another, you'll require the guarantee that it only gives out Cats in order to be able to promise that you will only put Cats in.
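In code, this is the usual generic-swap argument; a minimal Python sketch using a type variable to tie the array's read-side promise to its write-side demand:

```python
# A generic in-place swap: the value read out (a T, by the list's promise)
# is the only thing safe to write back in (a T, by the list's demand);
# the type variable T expresses exactly that link.
from typing import TypeVar

T = TypeVar("T")

def swap(xs: list[T], i: int, j: int) -> None:
    xs[i], xs[j] = xs[j], xs[i]

cats = ["tabby", "calico"]
swap(cats, 0, 1)
assert cats == ["calico", "tabby"]
```

Without the type variable you could only say "reads give Animal, writes take Cat", which is too weak to typecheck the round trip.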

this language seems kind of confusing with regards to the type of results. It makes sense that a function demands that its inputs be of a certain type, or promises that if you give it inputs of a certain type, it won't crash. It makes sense for a function to promise that its output will be of a certain type. But what does it mean for a function to demand that its output be of a certain type?

i guess the term 'demand' isn't the best. a function says that it will succeed if its input is of a certain type, and that it will fail if its input is not of a certain type. a function says that its output will definitely be of a certain type, and it says that you must be prepared for an output that is any member of a certain type.

when talking about function outputs, the terms 'promise' and 'demand' sound like they make sense only if you talk about the promises and demands of the next INPUT connected to that output, and swap them. E.g. if i promise that i will output a cat, you can demand a cat from me. If i demand to be allowed to output any animal, you must promise that you can accept any animal. So whereas an input promise or demand refers to "i must get a subset of this", an output promise or demand refers to "you must accept a superset of this".

if A \subseteq B:

• if i make no promises nor demands regarding my output, and you make no promises nor demands on what you can accept, we're cool (we are not guaranteed not to fail at runtime tho).
• if i make no promises nor demands regarding my output, and you demand to be given A, we're cool (not guaranteed).
• if i make no promises nor demands regarding my output, and you promise that you can accept A, we're cool (not guaranteed).
• if i demand to output A and you demand and promise nothing, we're cool (not guaranteed).
• if i demand to output A and you demand to be given A, we're cool (not guaranteed).
• if i demand to output A and you demand to be given B, we're cool (not guaranteed).
• if i demand to output A and you promise to accept A, we're cool (not guaranteed).
• if i demand to output A and you promise to accept B, we're cool (not guaranteed).
• if i demand to output B and you demand to be given A, compile-time error.
• if i demand to output B and you demand to be given B, we're cool (not guaranteed).
• if i demand to output B and you promise to accept A, we're cool (not guaranteed).
• if i demand to output B and you promise to accept B, we're cool (not guaranteed).

• if i promise to output A and you demand and promise nothing, we're cool (not guaranteed).
• if i promise to output A and you promise to accept A, we're guaranteed to succeed.
• if i promise to output A and you promise to accept B, we're guaranteed to succeed.
• if i promise to output A and you demand to be given A, we're cool (not guaranteed).
• if i promise to output A and you demand to be given B, we're cool (not guaranteed).
• if i promise to output B and you promise to accept A, we're cool (not guaranteed).
• if i promise to output B and you promise to accept B, we're guaranteed to succeed.
• if i promise to output B and you demand to be given A, we're cool (not guaranteed).
• if i promise to output B and you demand to be given B, we're cool (not guaranteed).

by 'we're cool', i mean 'no compile time error from the failure typing' and by 'we're guaranteed to succeed', i mean the program typechecks in the success typing (e.g. traditional) way, and by ' we're cool (we are not guaranteed not to fail at runtime tho)' i mean that although failure typing wouldn't give us a compile time error, the program does not typecheck in the success typing (e.g. traditional) way.
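One way to mechanize the demand/promise rules, with types as frozensets and None meaning "no promise/demand stated"; this is a sketch of one consistent reading (irreconcilable demands give a compile-time error, matching promises give a guarantee, anything else is merely "cool"), not a definitive formalization:

```python
# Producer output spec vs consumer input spec.
def check(out_demand=None, out_promise=None, in_demand=None, in_promise=None):
    # failure typing: the producer demands to output things the consumer
    # demands not to receive -> irreconcilable demands
    if out_demand is not None and in_demand is not None and not out_demand <= in_demand:
        return "compile-time error"
    # success typing: everything the producer promises to emit, the
    # consumer promises to accept
    if out_promise is not None and in_promise is not None and out_promise <= in_promise:
        return "guaranteed to succeed"
    return "cool (not guaranteed)"

A = frozenset({1})
B = frozenset({1, 2})  # A is a subset of B

assert check(out_demand=B, in_demand=A) == "compile-time error"
assert check(out_demand=A, in_demand=B) == "cool (not guaranteed)"
assert check(out_promise=A, in_promise=B) == "guaranteed to succeed"
assert check(out_promise=B, in_promise=A) == "cool (not guaranteed)"
```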

note that demands and promises are related: you can't promise that you'll be happy with a type of input while also demanding something more restrictive, and you can't promise that you'll output a type while demanding that you be allowed to output a supertype. This is because these are modal logic operators with the usual relationship: you are NOT allowed to promise something that is NOT consistent with your demands, and you are NOT allowed to make demands that you have promised you wouldn't make. The direction of this relationship flips from input to output; you can't demand that you be given input that is a proper subset of what you promise that you will accept, or equivalently, you can't promise that you will accept input that is a proper superset of what you have demanded; and you can't demand that you be permitted to output a proper superset of what you have promised you will give, or equivalently, you can't promise to give a subset of what you've demanded. (but do these relations bind subclasses? i guess not, b/c a subclass can promise to handle a type of input that the superclass can't handle; so a Child CAN say that it can .see() a Pediatrician who isn't a Doctor. hmm so i guess that kills the above idea of 'implicit weakening' controlled by the superclass, right? maybe implicit weakening only happens within the same class? or do we say that a Child can only promise to see a non-Doctor Pediatrician if the Doctor in the Person superclass was a promise, not a demand? e.g. so then you couldn't have a Child class which both (a) refuses to see all Doctors (non-Pediatricians), and also (b) can see some non-Doctors. i guess that makes sense. ok, so yes, these relations DO bind subclasses? Again, moving from superclass to subclass increases promises and weakens demands. So maybe superclass promises do bind subclasses, and furthermore, subclasses may not introduce new demands? but then what about the Child who demands a Pediatrician?
So maybe moving from superclass to subclass we can increase promises, weaken demands, and also decrease promises when doing so does not conflict with superclass demands, and increase demands when such does not conflict with superclass promises? no that isn't right.. hmm... maybe it varies with input vs output too; moving from superclass to subclass we can increase promises, and weaken demands, certainly; we can also increase demands on the input provided it doesn't conflict with a superclass promise; but i guess we can do the same with the output.. maybe promises and demands are the asymmetrical thing: you can only increase promises, but you can both weaken and strengthen demands? so promises are binding on subclasses but demands (or lack thereof) are not?)

so the failure typing triggers a compile-time error when there are irreconciliable demands, and the success (traditional) typing guarantees success when there are consistent promises.

i feel like the demand/promise terminology and the success/failure terminology are perhaps more complicated than just necessary and sufficient conditions for being well-typed?

IF subclasses can weaken demands, to keep syntax simple, maybe not provide a mechanism to do this explicitly, but a promise by the subclass can weaken the demand?

hmm i think that perhaps my modified Liskov above is bad. Perhaps subclasses should only be able to increase promises, AND only to increase demands. E.g. if the Person superclass demands a Doctor, you can't make a Person subclass who will see a Chiropractor. And if the Person superclass promises to see a Doctor, you can't make a Person subclass who insists on a Pediatrician.

note that if you promise something on the input, you promise to accept any subclass of that, and if you promise it on your output, you promise to emit only subclasses of that

so it's clear what to do with demands and promises and function composition and how they relate to each other and success and failure, and it's clear that promises made by a superclass must bind a subclass. What's still unclear is if or how demands made by a superclass affect a subclass.

a little later: ok, i think the right way is for subclasses to be bound by the demands of the parent class. So both promises and demands are additive. The reason is that, just as you should only make a promise if you are sure that the value won't cause an error if that promise is fulfilled, you should not make a demand, its dual, unless you are sure that the value WILL cause an error if the demand is NOT fulfilled. So if you find yourself wanting to remove demands in subclasses, then the original superclass demands must be in error. For example, if a Person demands to .see() only Doctors, and you have an AlternativeMedicineBeliever who can .see() a Chiropractor, and Chiropractor is not a Doctor, then you should have made Person demand to see ThingsThatActLikeDoctors (some superclass of both Doctors and Chiropractors) instead.

I briefly toyed with letting you relax demands in subclasses as a form of default reasoning (relaxing the demand for the non-default special cases), but now i think that's a different kind of thing that would require a separate design (or at least a separate deeply thinking through, even if the end result is to change this design).

--

When a function input is unified with the provider of that input, the unified type is the largest supertype of the provider's type that is a subtype of the function input's type. (If the function gets its input from the union of several places, each provider is unified with the input in this way.)

When multiple type constraints apply to the same value, they are combined as follows. The necessary type constraints are conjuncted/intersected to make the unified necessary constraint. And the sufficiency type constraints are conjuncted/intersected, and also conjuncted/intersected with the unified necessary constraint, to make the unified sufficiency constraint.

For example, if you have a function 'f x = x + 4', and +: Num a => a -> a, and a subtype of Int is the singleton type "3" (whose only member is the value 3), and you do y = f 3, then the x in f x is unified with the a in +'s type, and so is the output of f. f starts with no type constraints, so it starts as Any -> Any. So for its input, we take the largest supertype of Any that is a subtype of +'s a, and we get Num a => a. For its output, we take the largest supertype of +'s a that is a subtype of Any, and we get Any. So we have f: Num a => a -> Any.

"3" is unified with "Int" to produce "Int". If elsewhere you also do z = f 4? The intersection of types "3" and "4" would be empty.

If the compiler calculates that two necessary type constraints being made on a value are irreconcilable (for instance, if 'y = f x' and if f demands its output to be a Str and y demands its type to be an Int), then at least one of these necessary constraints cannot be satisfied, so the compiler halts with a type error.

If the compiler calculates that a chain of operations on values is such that each operation's output has a sufficient constraint that matches the sufficient constraint of the next operation's input, then no runtime type checking needs to occur between the two values (for instance, if y = 3 + 5, then since 3 and 5 each have a sufficiency constraint of Int, and + has a sufficiency constraint of Int -> Int, then in this case no type checking need be performed on the inputs of +). Otherwise, runtime type checking is provided for each term's sufficiency constraint.
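The combination rule above can be sketched concretely by modeling types as finite sets of values. This is a toy encoding of my own, not Oot syntax; the names (Int, Num, the singleton "3") are illustrative stand-ins:

```python
# Toy sketch of the constraint-combination rule: types are modeled as
# frozensets of allowed values (an illustrative assumption, not Oot).
from functools import reduce

def combine(necessary, sufficient):
    # conjunct/intersect all necessary constraints
    unified_necessary = reduce(frozenset.intersection, necessary)
    # conjunct/intersect all sufficiency constraints, and also
    # intersect with the unified necessary constraint
    unified_sufficient = reduce(frozenset.intersection, sufficient) & unified_necessary
    return unified_necessary, unified_sufficient

Int = frozenset(range(-100, 101))      # toy stand-in for Int
Num = Int | frozenset([0.5, 1.5])      # toy stand-in for Num, a supertype of Int
Three = frozenset([3])                 # the singleton type "3"

nec, suf = combine([Num, Int], [Three, Int])
# nec is Int (the demands intersect); suf is the singleton {3}
# (the promises intersect, then are narrowed by the unified demand)
```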

--


A subtype inherits all sufficient and all necessary type constraints of its supertype.

If a subtype has a sufficiency constraint which is a proper subset of the sufficiency constraint of its supertype, that is a type error. This can only occur when improper demands are made on the subtype which conflict with the sufficiency constraints of its supertype. (note/todo: how can the programmer control/avoid unwanted implicitly inferred sufficiency constraints on the supertype?)

The type of input to a multiple dispatch function is determined to be the disjunction of all input types defined in the dispatch cases (i.e. this disjunction forms both a necessary and a sufficiency constraint). (note: this means that if the dispatch cases can be found in multiple places in the source, they must all be gathered before type checking. If the user wants the program to compile despite leaving out some of the cases, they can add a default dispatch case).
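A minimal sketch of gathering dispatch cases, where the accepted input type is effectively the union of the cases' input types, plus a default case that lets the program run even when the cases aren't exhaustive. This is a hypothetical predicate-based mini-dispatcher, not Oot syntax:

```python
# Hypothetical mini-dispatcher: the set of registered cases determines
# the disjunction of input types the function accepts.
dispatch_cases = []  # list of (predicate, implementation) pairs

def case(pred):
    def register(fn):
        dispatch_cases.append((pred, fn))
        return fn
    return register

@case(lambda x: isinstance(x, int))
def _describe_int(x):
    return "int"

@case(lambda x: isinstance(x, str))
def _describe_str(x):
    return "str"

def describe(x):
    for pred, fn in dispatch_cases:
        if pred(x):
            return fn(x)
    # default dispatch case: covers arguments outside the disjunction
    # of the explicit cases, so leaving cases out is not an error
    return "unknown"
```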


todo

So, e.g. if Child is a subtype of Person and Pediatrician is NOT a subtype of Doctor, and the supertype Person has a method .see(x):

   and Person makes a demand that x is a Doctor, and the subtype Child implements via multiple dispatch the case where the x in child.see(x) is a Pediatrician, then in fact the implied promise is weakened, and a Child can only .see Pediatricians which are also Doctors
   and Person makes a promise that if x is a Doctor, then person.see(x) is valid, and the subtype Child implements via multiple dispatch the case where the x in child.see(x) is a Pediatrician, then a compile-time error is thrown, because the implied demand that x be a Pediatrician must be weakened by the superclass's promise, so the actual demand of Child is only that .see(x) be passed a Doctor, and this demand is not exhaustively covered by the multiple dispatch cases.

Liskov substitution holds in both cases; if Person makes a promise that x in .see(x) need only be a Doctor, then others using Person can rely upon this demand, and we have a success typing where a call to person.see(doctor) is promised to succeed. But if Person makes a demand that x in .see(x) needs to be a Doctor, then we have a failure typing where a call to person.see(non-doctor) fails at compile-time, but there is no guarantee that person.see(doctor) will always succeed.

We can state a modified Liskov substitution principle in terms of promises and demands, rather than in terms of all properties of the type, as: the subtype must promise everything that the supertype promises, and may not demand anything that the supertype doesn't demand. This allows us to capture failure-typing properties (demands) as well as success-typing properties (promises).

Note that in the latter case, since Person doesn't give ANY promise regarding the type of x in .see(x), any uses of Person elsewhere in the program can never be proven to be valid. A programmer demanding success type safety cannot use Person.see (they must speak in terms of subtypes like Pediatrician). However, if Person was guaranteed to be one of an exhaustive list of potential subtypes, then we could take the conjunction of all of the promises in the subtypes to get a promise for Person, e.g. if a Person is either a Child or an Adult, x is guaranteed to be valid in person.see(x) iff x is both a Pediatrician and an AdultDoctor. So this suggests that we should provide a way to state an exhaustive enumeration of the subtypes of a supertype.

note that subtypes can also expand functionality by making promises that their supertypes don't. E.g. if the supertype only promises success when .see is given someone who is both a Pediatrician and an AdultDoctor, then a Child could promise that they can .see() a Pediatrician who isn't an AdultDoctor. Another example is that an AlternativeMedicineBeliever could promise that they can .see() a chiropractor, whereas an arbitrary Person might not promise that.

note that you still require the language to support parametric polymorphism (generics) in order to write functions that both read and write arrays of arbitrary type; an array which promises to give out Cats must also demand that you only put Cats in it; so if you are moving elements from one location in an array to another, you'll require the guarantee that it only gives out Cats in order to be able to promise that you will only put Cats in.
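The invariance point above can be sketched with standard Python typing: a container whose read method promises T out and whose write method demands T in is invariant in T, and moving elements uses the read guarantee to satisfy the write demand. The names here are hypothetical illustrations, not Oot:

```python
# Sketch: a container that both emits and accepts T is invariant in T.
# The get() promise ("only Ts come out") is exactly what lets swap()
# satisfy the set() demand ("only Ts go in").
from typing import Generic, TypeVar

T = TypeVar("T")  # invariant by default: T is in both in- and out-positions

class Array(Generic[T]):
    def __init__(self, items: list[T]) -> None:
        self._items = items
    def get(self, i: int) -> T:              # promise: outputs are T
        return self._items[i]
    def set(self, i: int, item: T) -> None:  # demand: inputs are T
        self._items[i] = item

def swap(xs: Array[T], i: int, j: int) -> None:
    # xs.get's promise (a T comes out) satisfies xs.set's demand
    a, b = xs.get(i), xs.get(j)
    xs.set(i, b)
    xs.set(j, a)

cats = Array(["tabby", "calico"])
swap(cats, 0, 1)
```

A read-only view could safely be covariant and a write-only view contravariant; it is the combination of promise-out and demand-in that forces invariance, which is why generics are needed here.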

this language seems kind of confusing with regards to the type of results. It makes sense that a function demands that its inputs be of a certain type, or promises that if you give it inputs of a certain type, it won't crash. It makes sense for a function to promise that its output will be of a certain type. But what does it mean for a function to demand that its output be of a certain type?

i guess the term 'demand' isn't the best. a function says that it will succeed if its input is of a certain type, and that it will fail if its input is not of a certain type. a function says that its output will definitely be of a certain type, and it says that you must be prepared for an output that is any member of a certain type.

when talking about function outputs, the terms 'promise' and 'demand' sound like they make sense only if you talk about the promises and demands of the next INPUT connected to that output, and swap them. E.g. if i promise that i will output a cat, you can demand a cat from me. If i demand to be allowed to output any animal, you must promise that you can accept any animal. So whereas an input promise or demand refers to "i must get a subset of this", an output promise or demand refers to "you must accept a superset of this".

if A \subseteq B:

    If i make no promises nor demands regarding my output, and you make no promises nor demands on what you can accept, we're cool (we are not guaranteed not to fail at runtime tho).
    If i make no promises nor demands regarding my output, and you demand to be given A, we're cool (we are not guaranteed not to fail at runtime tho).
    If i make no promises nor demands regarding my output, and you promise that you can accept A, we're cool (we are not guaranteed not to fail at runtime tho).
    If i demand to output A and you demand and promise nothing, we're cool (we are not guaranteed not to fail at runtime tho).
    If i demand to output A and you demand to be given A, we're cool (we are not guaranteed not to fail at runtime tho).
    If i demand to output A and you demand to be given B, we're cool (we are not guaranteed not to fail at runtime tho).
    If i demand to output A and you promise to accept A, we're cool (we are not guaranteed not to fail at runtime tho).
    If i demand to output A and you promise to accept B, we're cool (we are not guaranteed not to fail at runtime tho).
    If i demand to output B and you demand to be given A, compile time error.
    If i demand to output B and you demand to be given B, we're cool (we are not guaranteed not to fail at runtime tho).
    If i demand to output B and you promise to accept A, we're cool (we are not guaranteed not to fail at runtime tho).
    If i demand to output B and you promise to accept B, we're cool (we are not guaranteed not to fail at runtime tho).

    If i promise to output A and you demand and promise nothing, we're cool (we are not guaranteed not to fail at runtime tho).
    If i promise to output A and you promise to accept A, we're guaranteed to succeed.
    If i promise to output A and you promise to accept B, we're guaranteed to succeed.
    If i promise to output A and you demand to be given A, we're cool (we are not guaranteed not to fail at runtime tho).
    If i promise to output A and you demand to be given B, we're cool (we are not guaranteed not to fail at runtime tho).
    If i promise to output B and you promise to accept A, we're cool (we are not guaranteed not to fail at runtime tho).
    If i promise to output B and you demand to be given A, we're cool (we are not guaranteed not to fail at runtime tho).
    If i promise to output B and you demand to be given B, we're guaranteed to succeed.

by 'we're cool', i mean 'no compile time error from the failure typing' and by 'we're guaranteed to succeed', i mean the program typechecks in the success typing (e.g. traditional) way, and by ' we're cool (we are not guaranteed not to fail at runtime tho)' i mean that although failure typing wouldn't give us a compile time error, the program does not typecheck in the success typing (e.g. traditional) way.
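As a toy sketch, the three outcomes above can be computed from the producer's and consumer's constraint kinds. I'm assuming the simplified reading that irreconcilable demands give the compile-time error and that only matching promises yield the success guarantee; the function and encoding are hypothetical, not Oot:

```python
# Toy checker for the producer/consumer cases, under a simplified
# reading (my assumption, not a spec): irreconcilable demands ->
# compile-time error; matching promises -> guaranteed success;
# everything else -> "cool" (may still fail at runtime).

def check(out_kind, out_type, in_kind, in_type, subtype):
    """out_kind/in_kind: 'promise', 'demand', or None."""
    if out_kind == "demand" and in_kind == "demand" and not subtype(out_type, in_type):
        return "compile-time error"
    if out_kind == "promise" and in_kind == "promise" and subtype(out_type, in_type):
        return "guaranteed to succeed"
    return "cool (may fail at runtime)"

# A \subseteq B, encoded as an explicit subtype relation
subtype = lambda a, b: (a, b) in {("A", "A"), ("B", "B"), ("A", "B")}

result = check("demand", "B", "demand", "A", subtype)
# the one erroneous case in the table: i demand to output B,
# you demand to be given A
```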

note that demands and promises are related: you can't promise that you'll be happy with a type of input while also demanding something more restrictive, and you can't promise that you'll output a type while demanding that you be allowed to output a supertype. This is because these are modal logic operators with the usual relationship: you are NOT allowed to promise something that is NOT consistent with your demands, and you are NOT allowed to make demands that are inconsistent with promises you have made. The direction of this relationship flips from input to output; you can't demand that you be given input that is a proper subset of what you promise that you will accept, or equivalently, you can't promise that you will accept input that is a proper superset of what you have demanded; and you can't demand that you be permitted to output a proper superset of what you have promised you will give, or equivalently, you can't promise to give a subset of what you've demanded. (but do these relations bind subclasses? i guess not, b/c a subclass can promise to handle a type of input that the superclass can't handle; so a Child CAN say that it can .see() a Pediatrician who isn't a Doctor. hmm so i guess that kills the above idea of 'implicit weakening' controlled by the superclass, right? maybe implicit weakening only happens within the same class? or do we say that a Child can only promise to see a non-Pediatrician Doctor if the Doctor in the Person superclass was a promise, not a demand? e.g. so then you couldn't have a Child class which both (a) refuses to see all Doctors (non-Pediatricians), and also (b) can see some non-Doctors. i guess that makes sense. ok, so yes, these relations DO bind subclasses? Again, moving from superclass to subclass increases promises and weakens demands. So maybe superclass promises do bind subclasses, and furthermore, subclasses may not introduce new demands? but then what about the Child who demands a Pediatrician?
So maybe moving from superclass to subclass we can increase promises, weaken demands, and also decrease promises when doing so does not conflict with superclass demands, and increase demands when doing so does not conflict with superclass promises? no, that isn't right.. hmm... maybe it varies with input vs output too; moving from superclass to subclass we can increase promises, and weaken demands, certainly; we can also increase demands on the input provided it doesn't conflict with a superclass promise; but i guess we can do the same with the output.. maybe promises and demands are the asymmetrical thing: you can only increase promises, but you can both weaken and strengthen demands? so promises are binding on subclasses but demands (or lack thereof) are not?)

so the failure typing triggers a compile-time error when there are irreconciliable demands, and the success (traditional) typing guarantees success when there are consistent promises.

i feel like the demand/promise terminology and the success/failure terminology are perhaps more complicated than just talking about necessary and sufficient conditions for being well-typed?

IF subclasses can weaken demands, then to keep the syntax simple, maybe don't provide a mechanism to do this explicitly; instead, a promise by the subclass can weaken the demand?

hmm i think that perhaps my modified Liskov above is bad. Perhaps subclasses should only be able to increase promises, AND only to increase demands. E.g. if the Person superclass demands a Doctor, you can't make a person who will see a Chiropractor. And if the Person superclass promises to see a Doctor, you can't make a Person subclass who insists on a Pediatrician.

note that if you promise something on the input, you promise to accept any subclass of that, and if you promise it on your output, you promise to emit only subclasses of that

so it's clear what to do with demands and promises and function composition and how they relate to each other and success and failure, and it's clear that promises made by a superclass must bind a subclass. What's still unclear is if or how demands made by a superclass affect a subclass.

a little later: ok, i think the right way is for subclasses to be bound by the demands of the parent class. So both promises and demands are additive. The reason is that, just as you should only make a promise if you are sure that the value won't cause an error if that promise is fulfilled, you should not make a demand (its dual) unless you are sure that the value WILL cause an error if the demand is NOT fulfilled. So if you find yourself wanting to remove demands in subclasses, then the original superclass demands must be in error. For example, if a Person demands to .see() only Doctors, and you have an AlternativeMedicineBeliever who can .see() a Chiropractor, and Chiropractor is not a Doctor, then you should have made Person demand to see ThingsThatActLikeDoctors (some superclass of both Doctor and Chiropractor) instead.
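The "both additive" rule can be sketched as a set-inclusion check: a subclass must keep (and may add to) both the promises and the demands of its superclass. The property strings and set encoding here are hypothetical, not Oot:

```python
# Sketch: promises and demands are both additive from superclass to
# subclass, so a valid subclass's sets must each include the
# superclass's sets. Property names are illustrative strings.

def valid_subclass(sup, sub):
    promises_additive = sup["promises"] <= sub["promises"]
    demands_additive = sup["demands"] <= sub["demands"]
    return promises_additive and demands_additive

person = {"promises": {"see(Doctor) won't error"},
          "demands": {"see(x): x acts like a Doctor"}}

# ok: keeps Person's promise and demand, adds a new promise
believer = {"promises": {"see(Doctor) won't error",
                         "see(Chiropractor) won't error"},
            "demands": {"see(x): x acts like a Doctor"}}

# not ok: tries to drop the superclass demand
rogue = {"promises": {"see(Doctor) won't error"},
         "demands": set()}
```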

I briefly toyed with letting you relax demands in subclasses as a form of default reasoning (relaxing the demand for the non-default special cases), but now i think that's a different kind of thing that would require a separate design (or at least a separate deeply thinking through, even if the end result is to change this design).

--

would like to define integers by induction with 0 and successor, and do pattern-matching recursion on them. but would also like to use the speed of integer math on the underlying platform.

so, Oot needs the concept of an isomorphism between two representations of one type, and a mechanism to state these and for it to use them

note that in general this would also be helpful when we want to specify a mapping between some Oot data, and how it should be physically represented, e.g. serialized or presented through an FFI. Note that the same Oot data type could have multiple physical representations (e.g. C or Pascal-style strings). Note that this stuff gets a little tricky so we want support for homomorphisms, not just isomorphisms.

interfaces can help but can we do better? how do multiple interfaces work with ADTs and pattern matching? is this a perspective? a view?

i guess it should be a perspective. s(s(0)) and 2 are two perspectives on the same value.

so with the metadata, if you attach some metadata to a metadata perspective of a value in x whose int perspective is "2", then if y = 2 + x, clearly y's int perspective will be 4, but will the metadata carry over? how would it be merged with the metadata from "2", if any? i guess you'd have to explicitly specify stuff like that?

so then does every operation have to be specified in every perspective? seems like this misses the point. certainly if you define "+" in the normal int perspective, you shouldn't have to also define it in the successor perspective. do you just define both directions of conversion between these two and then perform each operation in the perspective it is defined in, and then convert it on demand to the others?
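The "define both directions of conversion, then perform each operation in the perspective where it is defined" idea can be sketched like this. The Peano encoding and function names are my own illustrative assumptions, not Oot:

```python
# Two perspectives on one natural number: a Peano (successor)
# perspective and a machine-int perspective, with explicit conversions
# each way. + is defined only on the int perspective; the Peano
# perspective reaches it by converting on demand.

Z = ("Z",)                     # zero
def S(n): return ("S", n)      # successor

def peano_to_int(p):
    k = 0
    while p[0] == "S":
        k, p = k + 1, p[1]
    return k

def int_to_peano(k):
    p = Z
    for _ in range(k):
        p = S(p)
    return p

def peano_add(a, b):
    # addition is performed in the int perspective, where the
    # underlying platform's fast integer math applies
    return int_to_peano(peano_to_int(a) + peano_to_int(b))

two = S(S(Z))
# peano_add(two, two) is the Peano perspective of the value whose
# int perspective is 4
```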

--

or isomorphisms between two types. looks like Frank Atanassow is thinking along similar lines, although he seems ( http://www.informatik.uni-trier.de/~ley/pers/hd/a/Atanassow:Frank ) to want automatic inference of isomorphisms, i'll settle for manual:

" Re: Arrow

I've seen you mention your programming language before, but never any real details on it.

Yes, mostly because I cannot honestly answer all the questions I put forward above. I will do my best, though. (I can't give full answers here, anyway, though; it would take too much time.)

What problem does this language solve? How can I make it precise?

The most important problem I'm trying to address with Arrow is the fact that ML-like algebraic datatypes are not abstract or compositional enough to support interoperability on a large scale.

For example, I can define a datatype of naturals in a number of different ways. One as data Nat = Z | S Nat. Another is data Nat' = Z' | S' Nat. These differ only in naming but obviously have the same models. Another is data Nat = Z () | S Nat. This one just has an extra unit. Another is given by Succ Nat where data Succ a = Fresh | Old a. Here I have factored the datatype into two; but when I compose the factors, the result is only isomorphic to Nat, not equal. By applying isomorphisms in this fashion I can obviously come up with an infinite number of ways to represent the naturals, all unequal, and therefore not interoperable.

But in fact I've only just started. The representations above were all numerals in unary base, but that's very inefficient. Better is something like [Bool], which is a representation in base two. And of course I can do all the things to this that I did above. But why stop there? How about base ten, [Ten]? Or [Either Five Five]? Or all compositions using functors only isomorphic to []? Other factorizations? And so on...

Here is one reason this is problematic: a realistic program is one that uses libraries from many disparate sources. For them to interoperate, they need to use the same representations. But libraries developed in isolation are not likely to do so. At the very least, people will choose different names.

The way I am trying to address this is to make the type system know when two types are canonically isomorphic -- this means roughly that it knows when two types are mutually coercible in a unique way (no guessing -- there is exactly one coercion, or none). Then I add constructs which let you define "generic" functions that don't care, say, if they get as input a [Bool] or a Nat, only that they fall in the equivalence class (what I call a "genus").

More precisely, the way I do it is by identifying a basis of types from which all others are generable by canonical isomorphism. These are "type values"; all the rest are "type redexes". If two types have the same type value, then they are intercoercible. This is enough to generate a coercion between them, but using that coercion costs you run-time, and I think you can do more.

What I want to do is classify types according to kinds which denote different bases. Then there will be a functor from one kind to another which corresponds to change-of-base as in linear algebra. Natural transformations for such functors should be the generic functions I am looking for, and I think that the change-of-base can all be done statically, essentially by normalizing a program's types at compile-time. It is a bit like partial evaluation, but without the binding-time analysis.

Mind you, the problem I described above is only one application of this idea. I could give you four or five others.

How can I show it solves this problem? How can I make it precise?

Formal methods.

Is there another solution? Do other languages solve this problem? How? What are the advantages of my solution? of their solution? What are the disadvantages of my solution? of their solution?

Yes, there are other solutions, if you compromise on one point or another. The most obvious thing is to promulgate standards, but standards require cooperation, and the problem I am trying to solve is one where parties can't cooperate because they don't know of each other. Also, there is the adage that there are so many standards to choose from; given two choices of n standards, you have n-squared combinations, so this doesn't scale.

You can go in the direction of dependent types and strengthen type equality. But this is a variation of the standards argument. Dependent types exploit type equality, and, for example, cartesian products are generally not associative up to equality, yet they are absolutely fundamental. As you force more and more types to be equal by adding conditions, you get things which look less and less like sets, and type checking gets harder and harder. One advantage of my approach is (I conjecture) that it supports principal type inference.

There are module systems like ML. These are variations on dependent types. Module systems have exactly the problem I am solving, namely that when you apply an opaque functor, you lose the representation type. ML solves this with type sharing, which is a type equality assertion. Sharing assertions are tedious to write, and break abstraction: they force representations to be equal, which is in practice often not what you want. The problem is that signatures only tell you types, not laws. Also, most module systems are limited by the fact that you can only instantiate a finite number of functors in a program. And there is polymorphic recursion...

My approach has disadvantages. First, I only know at present how to handle very weak sorts of isos, although I think I know how I can handle more. Second, all the information is structural, so you have to program "typefully". You will never get the compiler to infer, for example, that file handles and files are one-to-one. Third, my approach seems to require bringing in a lot of other apparently unrelated stuff like linearity, and it is very abstract (though I hope that it will seem obvious once you see some examples).

How can I show that my solution cannot be expressed in some other language?

Again, formal methods. Showing there exists no translation, etc.

That is, what is the unique property of my language which is lacking in others which enables a solution?

There are several interrelated things. First, the type system needs to know what is a type value and what is not and how are they related. Second, it needs to understand the difference between a type function and a functor/monad. (Haskell datatype declarations don't give enough information for this; and Haskell type class notions of Functor and Monad are not guaranteed to be actual functors or monads.) Third, it needs a notion of datakinds, functors which are not endofunctors and their associated polymorphic functions. I have heard Omega has these things, but I think they are built on equality, and I have not had time to check it out.

What parts of my language are essential to that unique property?

I cannot say yet, as I have not formulated a solution.

There... not too dishonest...

(Please note, by the way, that untyped languages suffer from exactly the same problem but have no hope of ever using such a solution, since the whole thing is enabled by static typing. This is why I always say that typing increases freedom -- typing means you don't have to depend on other people's cooperation. It does not matter how they represent things, as long as you get a static description of it.)

Update: I want to give you another application because, although it is trivial, it suggests the possibilities that open up with iso-equivalence. People often complain that Haskell does not support extensible sums or extensible products. But, it does, of course, in the sense that you can always add constructors to a datatype T by defining them in S and using Either T S as your extended type. Or you can leave a "hole" in your type by adding a type parameter; so you have a datatype T a and extend it with S a getting T (S a) which can be extended again. (This is how objects are encoded sometimes.) But no one sees this approach as a solution, because it is a real pain to dig past the extra constructor to get to your extended ones. If data T a = C1 Int | C2 a and data S a = C3 Char | C4 a, they don't want to pattern-match on C2 every time they need to get at C3; they want data T' = C1 Int | C3 Char. But this T' is iso to T (S ()), so generic functions in my sense treat them the same, and you get mostly what you want, without compromising type inference.

By Frank Atanassow at Mon, 2006-06-05 21:33"

--


y'know, one property of ADTs that is important that we don't want with interfaces is that they're closed; if Tree = Node val left right | Empty, then we want to write switch statements (multiple dispatch etc) for each of those cases and then statically know that we covered every case.

but really this property of being a Node or an Empty can be just a property on the object. So you have a "Type of tree" field that holds an enum which can either be Node or Empty.

later: this is relevant to the last paragraph if not the others http://en.wikipedia.org/wiki/Tagged_union

apparently Scala and https://github.com/lihaoyi/macropy#case-classes call them 'case classes'. Rust has 'em. http://en.wikipedia.org/wiki/Tagged_union#2000s

" Some languages such as Scala allow base classes to be "sealed", and unify tagged unions with sealed base classes. "

--

the first use of this is what i've been calling Any, Object, or a Dynamic type: http://en.wikipedia.org/wiki/Variant_type (the second use of this is as cases in an ADT)

--

termain 9 hours ago

Ambivalent or apathetic regarding the typing debate?

Typing can become useful in numerical code when you move past operating on scalars. Column and row vectors needn't be confused, a two-vector and a complex number have different types, etc.

Also, physical quantities can have different types and a type system can be useful there.

I totally agree that for numerical code, operator overloading is of great utility.

--

http://www.megacz.com/thoughts/types.and.static.analyses.html

--

i don't understand this:

"

	If there is one thing I've suffered greatly in order to learn about dependently-typed programming, it is that one should never attempt to concatenate lists. If you find yourself doing this, you probably ought to be using option-trees instead.
"

--

" Scheme's missing ingredient: first-class type tags I've always felt that Scheme does data and types wrong, but I couldn't really say what bothered me until yesterday. After all, R7RS now has define-record-type for creating disjoint types. But that ain't enuff!

In my new language, Wat, I've found a satisfying solution, inspired by Kernel's encapsulation types, but going beyond them slightly. There are two procedures:

    (make-type) returns a list of three elements: 1) a first-class type, 2) a tagger function for tagging a value with that type, creating a tagged object, and 3) an untagger function for extracting the value of tagged objects of that type.
    (type-of obj) returns the first-class type of an object. The crucial point is that type-of not only works for tagged objects of user-defined types created by make-type, but also for all built-in types. E.g. (type-of 12) will return the number first-class type.

This system has the benefits of Kernel's encapsulated types: only someone with access to the tagger function (capability) may create new instances of a type. Only someone with access to the untagger function may access the contents of tagged objects. So object contents are potentially fully encapsulated.

But at the same time, the fact that every object, including built-in ones, has a first-class type makes it possible to efficiently program generically. E.g. one may create Smalltalk-like dynamic dispatch by using the first-class types of objects as indexes into a virtual lookup table of a generic function. This is not possible in either Scheme or Kernel. In both languages, programming generically requires one to use a cascade of type predicates (e.g. number?).

Example in Wat:

; destructuringly bind the three elements
; returned by make-type
(def (person-type person-tagger person-untagger) (make-type))

(define (make-person name email) (person-tagger (list name email)))

; untagger also performs a type check that
; person is in fact of type person
(define (get-name person) (car (person-untagger person)))

(define (get-email person) (cadr (person-untagger person)))

(define p1 (make-person "Quux" "quux@example.com"))
(get-name p1) --> "Quux"
(get-email p1) --> "quux@example.com"

Posted by Manuel Simoni at 15:07 Labels: kernel, scheme, types "

http://axisofeval.blogspot.com/2012/08/schemes-missing-ingredient.html
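the make-type protocol in the quoted post can be sketched in TypeScript too (all names here are hypothetical, and symbols stand in for first-class type tokens):

```typescript
// makeType returns a first-class type token plus tagger/untagger
// capabilities; only a holder of the untagger can read a tagged value.
type Tagged<T> = { tag: symbol; value: T };

function makeType<T>() {
  const tag = Symbol("type"); // the first-class type
  const tagger = (value: T): Tagged<T> => ({ tag, value });
  const untagger = (obj: Tagged<T>): T => {
    if (obj.tag !== tag) throw new TypeError("wrong type");
    return obj.value;
  };
  return { tag, tagger, untagger };
}

// typeOf works uniformly: user-defined tags for tagged objects,
// built-in typeof names for everything else.
function typeOf(obj: unknown): symbol | string {
  if (typeof obj === "object" && obj !== null && "tag" in obj) {
    return (obj as Tagged<unknown>).tag;
  }
  return typeof obj; // e.g. "number" for 12
}

const { tagger: personTagger, untagger: personUntagger } =
  makeType<[string, string]>();
const p1 = personTagger(["Quux", "quux@example.com"]);
// personUntagger(p1)[0] → "Quux"; typeOf(12) → "number"
```

because typeOf returns a comparable token for every value, it can index a dispatch table, which is the generic-programming point the post makes.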

--

"Every fast lisp has a type checker, there is Typed Racket and there is Dylan, which has a richer type system (union, intersection and singleton types which are first-class values) than most static languages. "

--

http://lambda-the-ultimate.org/node/2828

http://lambda-the-ultimate.org/node/412

--

random type theory slides

http://www-logic.stanford.edu/seminar/1213/Scott_LambdaEnum.pdf

http://www.cs.cmu.edu/~rwh/talks/Scottfest.ppt

--

Doug's "mutable data structure which becomse immutable after freeze" example (for what intersection types might be good for)

--

" dgreensp 3 days ago

I'm not sure it's realistic to say someone could make a career in a programming language without being able to look at a function and deduce the static type of a value without asking the computer.

As I wrote elsewhere in the comments:

When you use any language long enough, you end up needing to simulate pretty much every observable aspect of it yourself -- not including, for example, the garbage collector, the JVM bytecode verifier, or the GCC code optimizer, which are supposed to be transparent, but including the type inferencer, the JVM threading model, and the JavaScript semicolon insertion rules, which are not. Some of these things you can steer clear of for a long time or even avoid forever by staying on the beaten path, but they lurk, waiting for the thousands of people who have run into bugs or compiler errors and needed to understand them.

I don't know HM too well, but it seems to have more in common with the dataflow analysis algorithms in optimizers and verifiers -- which programmers don't usually have to understand -- than the basic unidirectional type inference that already serves pretty well to reduce redundancy in the source code.

I could imagine citing C++ as a counterargument -- no one understands the type system, but people use the language anyway -- but it's still not an abstraction you don't have to understand to use, like a CPU. "

" colanderman 3 days ago

Of course you could say it's a subset of unification, as it is.

It definitely is, and a limiting one at that. How does Hoon infer the type error in whatever is the Hoon equivalent of the, say, JavaScript expression (a == b && b == 5 && a == "foo"), where a and b are function parameters?

urbit 3 days ago

But my previous reply is a little content-free, as what you're really asking for is how we solve the same problem that typeclasses solve in Haskell.

Very briefly, there's two ways of asking whether a caller can change the type of a parameter. The normal way is to say, are all the possible values of the caller's type in the set of nouns that the parameter defines? This is what I call "geometric polymorphism."

But there's also "generic polymorphism," in which we ask: will the Nock code that was generated for the parameter type, actually work with the type we're calling it with? A really stupid way to ask this is to recompile the function, with the parameters exchanged for the arguments. The Hoon approach is a little bit smarter than this but not much.

So, with these two approaches, you basically get variance and genericity, which is all of polymorphism. "
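for reference, the way HM-style unification catches the (a == b && b == 5 && a == "foo") example above can be sketched with a toy first-order unifier (this is my own illustrative code, not Hoon's or anyone's actual algorithm; no occurs check, no compound types):

```typescript
// Type terms: either a type variable or a type constant.
type Ty = { tag: "var"; name: string } | { tag: "con"; name: string };

const tvar = (name: string): Ty => ({ tag: "var", name });
const tcon = (name: string): Ty => ({ tag: "con", name });

// Follow variable bindings in the substitution to the representative type.
function resolve(t: Ty, subst: Map<string, Ty>): Ty {
  while (t.tag === "var" && subst.has(t.name)) t = subst.get(t.name)!;
  return t;
}

// Returns true if the two types unify, extending subst with new bindings.
function unify(a: Ty, b: Ty, subst: Map<string, Ty>): boolean {
  a = resolve(a, subst);
  b = resolve(b, subst);
  if (a.tag === "var") { subst.set(a.name, b); return true; }
  if (b.tag === "var") { subst.set(b.name, a); return true; }
  return a.name === b.name;
}

const subst = new Map<string, Ty>();
const ok1 = unify(tvar("a"), tvar("b"), subst);      // a == b: a ~ b
const ok2 = unify(tvar("b"), tcon("number"), subst); // b == 5: b ~ number
const ok3 = unify(tvar("a"), tcon("string"), subst); // a == "foo": fails,
// because a resolves through b to number, which won't unify with string
```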

--

woah, scala has "compound types" which are intersection types! yay!
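TypeScript spells them A & B; a small sketch (the Channel/Logged interfaces are made up, echoing the compound-type example from the Scala slides below):

```typescript
interface Channel { close(): void }
interface Logged { log(msg: string): void }

const events: string[] = [];

// shutdown requires a value that is BOTH a Channel and a Logged.
function shutdown(c: Channel & Logged): void {
  c.log("closing");
  c.close();
}

const loggedChannel: Channel & Logged = {
  close: () => { events.push("closed"); },
  log: (msg) => { events.push(msg); },
};
shutdown(loggedChannel);
// events → ["closing", "closed"]
```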

--

some of netention's types (paraphrased):

bool, text, int, real, url, object, spacepoint, timepoint, timerange

--

could we make something like 'spacepoint' a core type? what about relative vs. absolute spacepoints? but wouldn't even supposedly 'absolute' ones be relative if you moved the origin (e.g. if the origin were the earth.. or the sun.. or the center of the milky way..)

--

"a netention object is easily created and described and can be visualized in various ways" (e.g. map view, graph view (shows related objects linked by tags), list view)

object editing: tags such as human, action, geometry, cause, effect, goal

" a netention object describes a real object, animal or human; has a unique URI; is described by custom tags; describes abstract objects; time, position (length, events); can be geolocalized (GPS + radius)"

--

is the freedom afforded by scala's feature to allow you to choose between covariant, contravariant, and invariant really useful? i feel like the compiler should derive whether something is input-like, output-like, or both, and derive the variance from that
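TypeScript does roughly what this note asks for: under structural typing it checks variance from how the parameter is used, so an output-only parameter behaves covariantly and an input-only one contravariantly. A sketch (types here are hypothetical):

```typescript
type Producer<T> = () => T;        // T only in output position: covariant
type Consumer<T> = (x: T) => void; // T only in input position: contravariant

interface Named { name: string }

const makeNamed: Producer<Named> = () => ({ name: "cat" });
const makeAnything: Producer<object> = makeNamed; // ok: covariant direction

const sink: string[] = [];
const useAnything: Consumer<object> = (x) => { sink.push(String(x)); };
const useNamed: Consumer<Named> = useAnything;    // ok: contravariant direction

useNamed(makeAnything() as Named);
// sink now has one entry
```

(Scala still makes you write +T / -T because with declaration-site variance the compiler checks your annotation rather than inferring it; whether inference would be better is exactly the question this note raises.)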

---

closer to type-level programming:

http://brandon.si/code/a-typefamilies-primer/

---

Finally, types are supported in hstore (slides 11-13). They are numeric, boolean, strings and NULL

---


http://www.infoq.com/presentations/data-types-issues

slide 12

classify type systems into dimensions static/dynamic, weak/strong:

dynamic, weak: assembly
dynamic, moderate: js, ruby
dynamic, strong: python, clojure
dynamic-ish but somewhat static, moderate: typescript, dart
static-ish but somewhat dynamic, moderate: java, C#
static-ish but somewhat dynamic, strong: scala
static, weak: C
static, strong: OCaml, Haskell

slide 13

classify type systems into dimensions coarse/detailed, weak/strong:

coarse, weak: C
coarse, strong ("my way or the highway"): Go, Java 4, Pascal
detailed, weak ("cutting corners"): eiffel, typescript, dart
detailed, strong ("type it to the max"): scala, F#, haskell, ocaml

---

My Way or The Highway (coarse, strong):

" • Simple type systems
• No generics
• Not that extensible by users
• Simpler tooling
• Highly normative "

Type it to the Max (detailed, strong):

" • Rich* language to write types
• Type combination forms, including generics
• Type systems often inspired by logic
• Often, Turing complete

Where dynamic languages had the upper hand:
– No type-imposed limits to expressiveness => rich type system + escape hatches such as casts
– No boilerplate => type inference
– Easier for exploration => bottom type Nothing, ???

Other strengths of dynamic:
• Simpler languages (rich types add complexity)
• Fewer puzzling compiler errors

Cutting corners (detailed, weak):
• Appeal to users' intuitions (covariant generics):
  – Employees are Persons
  – So functions from Employees to Employers are also functions from Persons to Employers, right?
• Embrace unsoundness.
• Easy, and superficially simple.
• But, fundamentally, a hack.
• Can we build great designs on false theories?

Precision, soundness, simplicity: take any two?
"
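TypeScript itself cuts exactly this corner: mutable arrays are treated covariantly, which is unsound in the way the slide describes. A sketch (Person/Employee are hypothetical):

```typescript
interface Person { name: string }
interface Employee extends Person { salary: number }

const employees: Employee[] = [{ name: "Ada", salary: 100 }];
const people: Person[] = employees; // allowed: arrays are covariant

people.push({ name: "Bob" });       // type-checks fine...
const bob = employees[1];           // ...but bob is a plain Person
// (bob as any).salary is undefined at runtime,
// even though bob's static type promises a number
```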

---

http://www.cs.uwm.edu/~boyland/fool2012/papers/fool2012_submission_3.pdf

looks promising

--

http://www.infoq.com/presentations/data-types-issues

i haven't watched this talk yet; perhaps i should, at least starting with 'Types in Scala' and continuing on from there

slide 27

Abstractions

Two fundamental forms – Parameters (positional, functional) – Abstract Members (name-based, modular)

Types in Scala

Modular (name-based):

scala.collection.BitSet         Named Type
Channel with Logged             Compound Type
Channel { def close(): Unit }   Refined Type

Functional (parameterized):

List[String]                    Parameterized
List[T] forSome { type T }      Existential Type
List                            Higher-Kinded

Orthogonal Design

crossproduct of modular and functional, where:

Modular:

Named    T { ... }    T with U

Functional:

T[U]    T[_]    Exists T

Dot and Dotty DOT: Calculus for Dependent Object Types Dotty: A Scala-Like Language with DOT as its core

paper: "dependent object types" http://www.cs.uwm.edu/~boyland/fool2012/papers/fool2012_submission_3.pdf " Abstract We propose a new type-theoretic foundation of Scala and languages like it: the Dependent Object Types (DOT) calculus. DOT models Scala’s path-dependent types, abstract type members and its mix- ture of nominal and structural typing through the use of refinement types. The core formalism makes no attempt to model inheritance and mixin composition. DOT normalizes Scala’s type system by unifying the constructs for type members and by providing clas- sical intersection and union types which simplify greatest lower bound and least upper bound computations. In this paper, we present the DOT calculus, both formally and informally. We also discuss our work-in-progress to prove type- safety of the calculus "

Types in Dotty:

scala.collection.BitSet         Named Type
Channel & Logged                Intersection Type
Channel { def close(): Unit }   Refined Type
List[String]                    Parameterized

removed: List[T] forSome { type T } (Existential Type) and List (Higher-Kinded)

Modelling Generics

class Set[T] { ... }    =>  class Set { type $T }
Set[String]             =>  Set { type $T = String }
class List[+T] { ... }  =>  class List { type $T }
List[String]            =>  List { type $T <: String }

Parameters  =>  Abstract members
Arguments   =>  Refinements

Making Parameters Public

class Set[type Elem] {...}        class Set { type Elem ... }
Set[String]                       Set { type Elem = String }
class List[type +Elem] {...}      class List { type Elem ... }
List[String]                      List { type Elem <: String }

Analogous to "val" parameters:
class C(val fld: Int)             class C { val fld: Int }

Expressing Existentials

What is the type of Lists with arbitrary element type?
Previously: List[_], i.e. List[T] forSome { type T }
Now: List (types can have abstract members)

Expressing Higher-Kinded

• What is the type of List constructors?
• Previously: List
• Now: List
• Can always instantiate later:
  type X = List
  X { type T = String }
  X[String]

In a Nutshell

In this system, Existential = Higher-Kinded. In fact, both are just types with abstract members. We do not distinguish between types and type constructors.

Native Meets and Joins

• The horrible type error message came from a computed join of two types.
• Problem: In Scala, the least upper bound of two types can be infinitely large.
• Adding native & and | types fixes that.

DOT:

Expr = ... | expr.member
Type = path.TypeName
     | Type { Defs }
Def  = val x: Type = Expr
     | def f(y: Type): Type = Expr
     | type T >: Type <: Type

Subtyping

Fundamental relation: T1 <: T2, i.e. T1 is a subtype of T2. Comes in many guises:
• Implementation matches Interface (extends)
• Type class extension
• Signature ascription

---

from nimrod:

http://nimrod-lang.org/talk01/slides.html#%2824%29

# in system module (simplified):
proc `$`[T: object](x: T): string =
  result = "("
  for name, value in fieldPairs(x):
    result.add("$1: $2\n" % [name, $value])
  result.add(")")

Notes:

    [T: object] -- generic type + constraint
    relies on builtin fieldPairs iterator
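the same idea sketched in TypeScript, using runtime field iteration in place of Nimrod's compile-time fieldPairs (this loses the compile-time specialization; it's just to show the shape):

```typescript
// A generic stringifier that walks an object's fields.
function show(x: object): string {
  let result = "(";
  for (const [name, value] of Object.entries(x)) {
    result += `${name}: ${value}\n`;
  }
  return result + ")";
}
// show({ a: 1, b: "two" }) → "(a: 1\nb: two\n)"
```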

---

ek 1 day ago

Unfortunately this contribution is inhibited from being significant in value by the fact that TypeScript doesn't support full gradual typing [0] and has an intentionally unsound type system [1].

brandonbloom 1 day ago

Accidentally unsound is a problem, but intentionally unsound just means that they balanced soundness against other design criteria such as completeness, complexity, and tooling capability, then decided that they were willing to trade soundness for other things.

For onlookers, here's a lucid explanation of soundness and completeness in type systems: http://eschew.wordpress.com/2009/08/31/sound-and-complete/

The author of that article argues that soundness is more important than completeness, but concedes that there are other ways to view the world too.

The bottom line is that most ad-hoc lint tools are both unsound and incomplete, but they are still useful for finding bugs!

--

complaint about golang polymorphism on return type, as applied to channel reads:

" Another syntax / semantics oddity is the behavior of reading from channels (like a pipe). Whether a read from a channel blocks depends on how the return value is used:

 res := <-queue     /* waits if the queue is empty */
 res, ok := <-queue /* returns immediately if the queue is empty */


This bears repeating: the behavior of a channel read depends on how the return value is (will be) used. This seems like a violation of the laws of time and space! " -- http://ridiculousfish.com/blog/posts/go_bloviations.html#go_syntax

--

" With modern type systems employing structural subtyping, row polymorphism, proper variance and covariance handling and other advanced techniques static typing becomes a very powerful tool.

On the other hand, there are correct programs which we still don't know how to type. This means that there are programs we just can't write in statically typed languages, at least not yet. " klibertp, https://news.ycombinator.com/item?id=7861823

---

ocaml has the 'assert' escape hatch from the type system:

" If something can't be type-checked then you just add an "assert" and do the check dynamically.

For example, in 0install I need to ask the SAT solver which version of a program I should use. I pass it a load of SAT variables, each containing a user data field with the attached program details, run the solver, and get back the selected variable. The SAT problem also contains other variables, with other kinds of data.

I know the user data field for the selected result will be of type program details, but I couldn't figure out how to let the compiler check this statically. It's not a problem; I just check the type and "assert false" if it's wrong. I still get the benefit of static checking almost everywhere else in the code. "
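the same escape-hatch pattern in TypeScript, where a runtime check narrows a union the static system can't discharge on its own (the UserData type and names are made up, loosely mirroring the 0install anecdote):

```typescript
// The solver hands back user data of a union type; we check the case
// dynamically, then the compiler narrows it statically from there on.
type UserData =
  | { kind: "program"; details: string }
  | { kind: "other" };

function expectProgram(d: UserData): string {
  if (d.kind !== "program") throw new Error("assert false"); // dynamic check
  return d.details; // statically narrowed to the "program" case here
}
// expectProgram({ kind: "program", details: "v1.2" }) → "v1.2"
```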

---