things that make type systems confusing

the Haskell type class defn syntax

"class Eq a where " looks like a is a type parameter of Eq. (btw does haskell allow a destructuring bind on a in that statement?)

http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function

you are using the type system's type inference combined with polymorphism to decide something which feels syntactic
in unpacking this, you switch from reading things from left-to-right and right-to-left three times
- when you first see 'sumOf 5 8 22 :: Integer', you read it left to right. And indeed, that's how it parses: (((sumOf 5) 8) 22) :: Integer
- but now, you see that you need to determine the type of 'sumOf 5' in order to compute the first term. You know it takes an Int as integer. But you don't know what type it yields as output.
- So you look further to the right. 8 is an Int, but you still don't know what you need. So you look further to the right. 22 is an Int and you gotta have an Int at the end. So the type of ((sumOf 5) 8) 22 must be Int -> Int.
- And now you have to switch and go right to left. Since ((sumOf 5) 8) 22 must be Int -> Int, ((sumOf 5) 8) must be Int -> Int -> Int, and sumOf 5 must be Int -> Int -> Int -> Int.
- Now you start substituting the fn defn. Int -> Int -> Int -> Int matches (Int -> (*)) so (sumOf 5) is replaced by (sumOf . (5 +) . toInteger).
- But now you have to go right again. (((sumOf . (5 +) . toInteger) 8) 22). And now you have to go leftwards from the 8 (not sure if i got this step right, i forgot how . work w/ grouping): ((sumOf . (5 +) . (toInteger 8)) 22) -> ((sumOf . (5 +) 8) 22) -> ((sumOf (5 + 8)) 22).
- Now you have to go right to left again (or just remember your earlier going-left work); ((sumOf (5 + 8)) 22) :: Int so (sumOf (5 + 8)) :: Int -> Int so sumOf :: Int -> Int -> Int. Then go left to right again to expand: (sumOf (5 + 8)) 22 -> (sumOf . ((5 + 8) +) . toInteger) 22. Then go right to left again to apply: (sumOf . ((5 + 8) +) . toInteger) 22 -> (sumOf . ((5 + 8) +)) 22 -> sumOf . ((5 + 8 + 22))
the types of parts of the expression are unexpected and context-dependent and seemingly inconsistent with each other. You wouldn't expect the type of 'sumOf 5' to be 'Int -> Int -> Int', you'd expect it to be 'Int', because if you said (sumOf 5) + 3 you'd get 8. It's only 'Int -> Int -> Int -> Int' in this particular context. Indeed, right in this derivation, we see that (sumOf 5) is of type Int -> Int -> Int -> Int, but (sumOf (5+8)) is of type Int -> Int, and (sumOf (5+8+22)) is of type Int->Int->Int.
in fact, the compiler is using its knowledge of how many times the function will be applied in the future to alter the function's behavior. In theory, you could make a function which does one thing if it will be applied less than 143 times before a scalar results, and another thing if it will be applied 143 times before a scaler results, and another thing if it will be applied 144-251 times... etc. The type logic counts via type-level inductive hooey. This seems like an abomination because it seems like something syntactic far away from the location of the function in the source code, and in the future from the perspective of data flow, is affecting the semantics of the function.

Hmm, the more i think about it, the more i think that we should ban type inference determining ad-hoc polymorphism which is polymorphic only in the return type (see [1] for more reasoning on that). This won't make the language any less powerful than, say, Python (in Python there is no type inferencing so in order to make an ad-hoc polymorphic choice, the class of the value that the object that you are dispatching on must differ (which would correspond to an input value in Haskell and probably in Jasper). So i'm proposing that in Jasper, you can have all the type inferencing you want, it's just that type inferencing won't be allowed to drive ad-hoc polymorphism from the "future" (from the types of the outputs).

Or, try THIS one on for size: the statically known (at the time of compiling the typeclass) kinds of the return types of different ad-hoc poly fns of the same inputs may not vary. Hmm.. but that sounds like you cant have two instances of the same kind of two different typeclasses. For example, it would prevent the empty homogeneous list from being both an empty list of ints and an empty list of fns, at different points in the program. No... it just prevents the existence of a fn, 'empty', that returns that. You can still have Int.empty and Fn.empty; or even, empty(Int) and empty(Fn). So, you can't have a single fn, 'empty', that sometimes returns statically known lists of ints and sometimes statically known lists of Fns, on the same inputs. This should prevent any type-level inductive hooey like Haskell's variadic fns, which uses the static knowledge of how many times the result of a fn will be applied in the future, which seems like a 'syntactic' thing, to alter the behavior of the function over the various applications. It still allows 'messages from the future', that is, it allows the compiler to automatically select among multiple instantiations of return-type-only-ad-hoc-polymorphic functions based on the expected return value, but it guarantees that all of the possibilities will be of the same kind (and kind incorporates the notion of 'syntactic' difference here)

but you could get around this by giving a dummy argument to the fun (empty(dummy)) so that the compiler can infer 'dummy' to be the desired return type

example from http://kawagner.blogspot.com/2006/12/my-haskell-experience.html

" Lets have two simple, small functions:

giveElements :: (a -> Bool) -> [[a?]] -> [[a?]] giveElements condition = filter (detectElement condition)

detectElement :: (a -> Bool) -> [a] -> Bool detectElement = any

The above seems a bit useless but I've extracted from it's surrounding code to show the problem. The 'giveElements' function is able to extract certain elements from a list and returns the filtered list. The 'detectElement' function does the job of deciding which elements are in and which are not. It does this by checking if any element of a list fulfills a certain condition.

So short so nice, but after some work on your program you discover that your 'condition' function required access to outside state to do it's job. Maybe the function has to do some a small modification of some data or should simply write a short message into a log. To allow this, you have to rewrite the above functions to use a monad. For this example lets simply use the IO monad here. Now the functions become:

giveElements :: (a -> IO Bool) -> [[a?]] -> IO [[a?]] giveElements condition = filterM (detectElement condition)

detectElement :: (a -> IO Bool) -> [a] -> IO Bool detectElement c els = do r <- mapM c els return $ or r

The first function still looks quite similar, only the 'filter' function is replaced by 'filterM' now. 'filterM' is the monadic version of filter and gives the monad back from the condition. So without filterM the state won't get up from the detectElement function (and of course there would be a type error).

detectElement looks a bit more different. The reason is that there is no monadic version for 'any'. So I use a monadic map (the mapM function) to create a list of Bools by applying the condition to each element and then check if at least one of the results it true. This doesn't requires more evaluations of c than the first version because of lazy evaluation, but it still required a complete rewrite of my code. It's still readable, but not as much as before.

...

And if you have more complex functions the above translation process from a non-monadic into a monadic-one is much more difficult and IMO it also destroy the clarity of the code. Also state monads have bad runtime characteristics because of their 'inside-out' evaluation which can fill up the stack quickly. You can use IORefs instead - but then you can't simply transform your program to using continuations anymore for example. And if you put a IO monad into a state-monad, you have to lift all IO-accessing functions via liftIO.

And whats the point of having a pure functional language, going thru lots of trouble to maintain the purity, only to get things like IORef which you tried to get rid of in the first place? If you design a program to use IO monads from the beginning everywhere, then you have fewer problem maintaining and expanding it - but then you're using Haskell like a normal imperative language. Whats the point of Haskell then? ...

Its not a problem of functional programming in itself, its a Haskell thing: While monads are quite powerful in principle IMO they are a pain to use in practice. Because monads show up in the return type of functions, the easy composition of functions which is the prime reason for the power of functional programming only works as long as all those functions use (compatible) monads, and sadly many functions simply don't exists in this form. "

hopefully Jasper's attribute types will allow the monad-ness to be treated as orthogonal to everything else.

---

ok, whether or not we have principal types, i think languages to date have focused too much attention on them. Martin Odersky says http://www.codecommit.com/blog/scala/is-scala-not-functional-enough#comment-4162 "As to combining Hindley/Milner and subtyping: In principle it’s possible, but in practice it does not work so well. The problem is not so much that you end up too often with Any, but that inferred types become impractically large.". i'll have to look at scala to see what he came up with instead.

anyhow, my point is that some extant powerful type systems, like Haskell's, force the programmer to raise the level of abstraction but still to be very precise. You end up with long type expressions because of this. Also, you have to take all the facts about the type of an object and put them together into one long type expression, making sure that they are composed together in the proper fashion. For example, monads and monad transformers; the programmer things "this is a list of ints, and it is in the IO monad". To the programmer, these are two orthogonal pieces of information about the type of the object, but in the type system, one wraps around the other, and in a certain order (and it gets worse with monad transformers).

btw i'm sure someone has used the name "attribute types" before but it's probably something different from what i mean.

with attribute types, we never try to make the programmer interact with a single expression containing all the type information about a value (although they can query the compiler for that during debugging if they want). For example, this value is a list of something; this value is in the IO monad; this value is a list of ints. The programmer can look at either of these three as the type of the value in different contexts (note that the third implies the first).

---

in jasper, what haskell calls 'existential types', which means expressions which dont completely determine a type, e.g. 'some type with a Show method' will be the rule, not the exception. this allows things like heterogenous lists. i think the usual problem with this is (esp. if you feed such a list through a function which reshapes it), then you lose the typing information on what individual items were, and all the type system knows is that they are elements of the heterogenous list, with only whatever properties are common to all items in that list. then how do you then polymorph on individual elements later?

e.g. if you construct a structure where you know that the first item in each list is a function reference, and all the rest of the items are integers, how do you get the type system to let you apply the function reference later? all it knows is that these are each members of a heterogenous list. i think the Right Thing for this case according to languages like Haskell is to create a data type (ADT) that specifically has a fn in one field and then a list of ints in another. But this imposes the type system on the programmer; in addition to the task of writing the program, they have the new task of explaining it to the compiler. And what if your knowledge about the structure of the list is hard to express; what if you know that the list is a valid AST for some language; do you want to translate the entire semantic grammar of that language to Haskell's type system?

in jasper we want to make the task of explaining the program to the compiler optional. the 'promise' assertion will help here -- 'promise' tells the compiler to take something as a given (and to check it at runtime, unless runtime promise checks are disabled). This helps because you can tie promises to positions in code, rather than have them be universal truths about data types -- 'if we are here in the code, then you may assume this'. This saves you from having to construct a new data type for every state of knowledge that you have about your data (e.g. 'if this node of the AST is under a FOR-LOOP node, then the second child is an iteration variable symbol'), and from having to encode all your knowledge about the valid state transitions into the type system (e.g. an AST node of type AST-FOR-LOOP's second child is a node of type AST-UNDER-FOR-LOOP-ITERATION-VAR). Instead at the relevant point in your code you say 'we must be in an AST node corresponding to a for loop iteration symbol; i promise that this node is of type Symbol' This makes the program harder to reason about but easier to write, and even easier to read if the reader isn't an expert at the type system. It is equivalent to having weak typing/allowing unsafe type coercion like in C.

i think the other usual problem is, if you pass elements through a function that only demands an existential type, and creates new elements, you must distinguish between functions which guarantee to create new elements of the same type it was given, or functions which guarantee only that the new elements will satisfy the existential type.

---

i guess our type system would be described as

a type and effect system, optionally static, weak, inferred, structural (with optional nominal types)

---

in Haskell, existential types let you define a type which doesnt say 'this is a list of Ints', but rather, something like, 'this is a list of objects whose types are in the Show typeclass'. In Jasper all types are existential because 'everything is an interface'; you describe the properties that a type in a certain slot has, e.e. the interfaces that types belong to, not the types themselves.

although it should not be mandatory to state which exceptions your code might throw (that's the compiler's job), it should be possible for a caller to insist that you throw only a subset of a given set of exceptions, and that anything else is a type error

could use Coq for type assertions and proving stuff about Jasper programs, in the same way than Shen uses Prolog

would eventually have to port Coq into Jasper. note that Coq's not-Turing-completeness (via requiring 'obviously decreasing' structural recursion) is key.

hmm.. in addition to the restrictions contemplated above to prevent http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function , might want to have more restrictions on global type inference than on local (see http://lambda-the-ultimate.org/node/2862 ).

taints and dataflow analysis (?)

the type system can be extended to detect 'taints' or 'untaints'. an untaint is the negation of a taint. a taint can be defined either as a whitelist (anything using only operations in the whitelist is untainted; anything built only from untainted operations is untainted; everything else is tainted) or a blacklist (anything using any operations in the blacklist is untainted; anything built only from tainted operations is tainted; everything else is untainted).

Note that the taint system has negation at the top (taint = not untaint) and at the bottom (the base case can be a whitelist or a blacklist), but that the bottom must be all one polarity or all the other, not both. This should simplify things by making it unable to express all boolean functions. I'm hoping that most of the type inference system can use positive logic, perhaps with a few extensions to deal with stuff like this.

note that the necessity of making a blacklist complete would require Jasper to fix a set of primitive operations that everything is compiled to. An alternative is to first design the system to be fully expressive, but then to only support a published list of primitive properties that are guaranteed to be defined correctly (e.g. blacklisted if necessary) for every language primitive (e.g. stuff like side effect-less/side effect-ful, deterministic (the other part of referential transparency), reads, writes, moves, null). This achieves the goal of giving the language the right sort of machinery that can be repurposed to other tasks.

a taint can be calculated based on the operations used to produce a value (e.g. 'side-effect free' is an untaint of this sort; a function built out of side-effect free parts is side-effect free), or based on all operations applied to a VARIABLE (e.g. 'unique' is an untaint of this sort; the unique value could be 3 or it could be a function, but what matters is that if it ever read into another variable, then the original value is never used anymore; another one is 'const'). these two are related; e.g. if we want a variable containing a reference to be immutable, then we can only apply immutability-preserving ('const' in that variable) functions to it; but a const function if a taint of the first kind (defined inductively because a function that only applies const (in that variable) functions to that reference, and doesnt use any primitive mutation ops on it, is itself const).

we need ways to express: (a) a value or variable has the property (e.g. immutability) and (b) that an operation preserves or breaks the property (is whitelisted or blacklisted) (e.g. a function that does not write to a reference is 'const' in the D language, meaning it doesnt break the immutability)

perhaps this is not yet enough. We should be able to express D's transitive const (http://dlang.org/const-faq.html). We should be able to express Rust's pointer type system ( http://static.rust-lang.org/doc/tutorial.html#boxes-and-pointers ). We should be able to express [2].

---

swkane had a good idea that i summarized as:

" that's an interesting perspective that i hadn't considered; let the types stay complicated but try to find a way to make readable labels for them so you only have to dig into the complexity when you need to. "

--- http://www.haskell.org/haskellwiki/GADTs_for_dummies ( http://web.archive.org/web/20130702221947/http://www.haskell.org/haskellwiki/GADTs_for_dummies )brings up an excellent point: type classes are like arbitrary functions on types, with normal Haskell stuff like pattern matching, algebraic data types (multiple constructors), guards, etc, except with a confusing relational (rather than functional) syntax.

this brings up the obvious point: Jasper could be like this but use normal syntax

seems like there is really a lot of mileage to be had just by delivering a uniform notation for data, code, patterns, types, typeclasses

remember that since type signatures are optional and principal types are not the core concept, you need a way to add type signatures to someone else's library code (without modifying their source code) ---

i guess maybe the reason why e.g. Haskell has an idiosyncractic syntax for typing is that they want to syntactically prohibit things they dont know how to automatically typecheck in a decidable, relatively complete way (relative to some nicely characterizable subset of dynamically valid programs)

GHC's typechecking algorithm has expanded over the years as progress was made

i guess i think it would be better to have a very simple, uniform syntax, assumptions, and compiler type errors that are simple enough to use in debugging, then allow for the case that sometimes the typechecker simply wont be up for the task, and the programmer must use assumptions to bridge the gap

see also PrincipalTypingConsideredHarmful?

the idea of unifying the type language and the normal language is discussed here:

http://lambda-the-ultimate.org/node/1085

some interesting pointers from that page to check up on later:

dependently-typed languages
- epigram
ontic
ohmu
- http://prog.vub.ac.be/~thomas/ERLS/DeLesley.pdf
http://lambda-the-ultimate.org/node/1085#comment-11583
- comment that this works by formally mapping types to identity functions: http://lambda-the-ultimate.org/node/1085#comment-11645
http://lambda-the-ultimate.org/node/1085#comment-11555 , which is endorsed by Tim Sweeney in http://lambda-the-ultimate.org/node/1085#comment-11611
some random papers: http://lambda-the-ultimate.org/node/1085#comment-11602

http://lambda-the-ultimate.org/classic/message6587.html#6615

book on the theory of coq:

http://www.amazon.com/kindle-store/dp/B000QCUCZU#reader_3642058809

interestingly this sort of asserts that Coq is the first real useful language that makes practical use of Curry-Howard:

http://lambda-the-ultimate.org/node/1532#comment-17865

    Would it be possible to take an arbitrary mathematical theorem (let's say: "string concatenation is associative", or Fermat's little theorem) and express it as a type?

Yes, and in fact this is exactly what happens in automated theorem provers based around constructive type theory, or the calculus of (co-)inductive constructions. A good example of this is Coq. I would suggest reading through some of the tutorials/introductory documentation if you wish to learn more. Also, there is a good book on the topic (on Coq, Curry-Howard, and related issues): here. "

the roots of type theory lie in the effort to avoid things like Russell's paradox in the foundations of mathematics [see Stanford encyclopedia of philosophy]. The idea is to restrict the set of valid terms of a logic by putting type requirements on operators. You structure the requirements to prevent operators from applying to themselves or to things like themselves that can indirectly cause similar problems.

In some systems the way you know a term is invalid is that such a term has no type. The type system presents axioms and inference rules to make "type judegements" which assign types to terms, and if no type can be assigned to a term, it must be invalid. In this way the type system specifies a subset of otherwise acceptable terms.

For example, in simply typed lambda calculus, the predicate "lambda x . not[x x]", namely, the predicate of predicates that don't apply to themselves, has no type.

This semantics is problematic for Jasper's notion of type attributes because the idea here is that we can have terms that we don't know much about; terms for which we have only partial information, or no information, about what type they are. For example, the Jasperian approach to the above "bad" function would be either to say, "you can't apply a predicate to itself" [which still leaves open the door for more cleverly designed paradoxes], or to say, "i can't figure out the type of this and that's okay", again leaving the programmer to fend for themself against this sort of paradox.

e.g. if someone tried to define the function "lambda x: x x", jasper would let them. Calling this function would cause an infinite loop, but so what?

Surely there is some way to finesse that sort of usage paradigm onto conventional type systems but this indicates that they may not be an appropriate basis for Jasper. Especially because the whole original reason for their existence [e.g. to prevent Russell-ish paradoxes] is functionality that we are willing to do without. While we do want a logic to make type judegments on terms, we don't want to equate 'cannot be typed' with 'badly typed'. [Pierce's definitions still seems on-target, see Wikipedian defn quote]

otoh i guess we do want our type system to be consistent, even if we dont care about ruling out paradoxes in the object language, so we might need a little type theory just to ensure that. However, another approach would be to allow paradoxes and make our type logic paraconsistent, or something similar [e.g. is it paraconsistent to say that if we can derive both that x is of type t, and also that x is not of type t, that x is not of type t? i feel like there's some better way to phrase that :)]

remember also that perhaps the real challenge is that we want to allow the user to extend the type system. so even if what we provide is consistent, the user could add something inconsistent.

Also, recall that we want to allow the user to easily add things like "unique" [cannot copy] and "immutable" [cannot write to], as well as ownership types a la milewski [and wasn't there something else with a different form too, in terms of where the quantifier and where the 'not' was? todo]. Note that these are certainly over terms, not over base-level values [unless in our language all terms are values, like i guess in Haskell], as "unique" isn't really a property about what kind of value something is, rather it's about what operations you are allowed to apply to it.

it does seem as if there's one world in which you have functions applied to value objects like numbers, giving you values, and another in which you have terms being inductively generated and reduced by rewrite rules [implications], giving you something like grammars/proofs [methods which produce ever-larger sets of 'valid' terms by iteration]. Perhaps this second world is the pattern-matching/logic/type system part of Jasper. But where does probabilistic logic fit in here? Here terms are not either derived or not derived, they are valued.

-- consider using sequents as the fundamental logical operation

---

took a very brief look at http://existentialtype.wordpress.com/2011/04/16/modules-matter-most/ and i agree:

" More disappointingly, for me at least, is that even relatively enlightened languages, such as Haskell or F#, fail to support this basic machinery. In Haskell you have type classes, which are unaccountably popular (perhaps because it’s the first thing many people learn). There are two fundamental problems with type classes. The first is that they insist that a type can implement a type class in exactly one way. For example, according to the philosophy of type classes, the integers can be ordered in precisely one way (the usual ordering), but obviously there are many orderings (say, by divisibility) of interest. The second is that they confound two separate issues: specifying how a type implements a type class and specifying when such a specification should be used during type inference. As a consequence, using type classes is, in Greg Morrisett’s term, like steering the Queen Mary: you have to get this hulking mass pointed in the right direction so that the inference mechanism resolves things the way you want it to. "

copied from elsewhere:

Lessons learned from previous work

There are many commonalities in various approaches to race freedom.

    Ownership should be built into the type system. In OO languages each object must have an owner. In C-like languages each piece of data must have an associated lock.
    There should be efficient mechanisms for passing data between threads
        By value
        By reference. (The object being passed must be a monitor.)
        As immutable data, by const reference
        By unique reference using move semantics

Most of those ideas are expressible through type qualifiers. "

btw the 'promise' vs. 'demand' thing w/r/t const vs. immutable and lent vs. unique (lent should be called noalias or somesuch) seems pretty common (wasn't this also the deal with Haskell's existentially quantified types or maybe with Haskell typeclasses vs. Java interfaces? or were those other things?). should have syntax for it, so you can just say 'i promise it will be immutable' and 'i demand it be immutable' without learning two keywords for each of these cases and remembering which one is which (or at least, have a keyword naming convention like promiseImmutable demandImmutable)

--- ...

hmm if patterns define types. thinking about logics for patterns. the usual tradeoff b/t expressiveness and feasible inference.

since type are in a hierarchy, this suggests descripition logics.

i feel like we can cut out negation somehow. but not quite sure..

some types, like 'immutable' or 'functionally pure', are 'purity' types, meaning that they are true whenever some condition of impurity never obtains in some transitive/data-flow-y manner.

others, such as Integer, are 'classy' types, defined as subsets of other classy types. in this case they aren't true unless you have a reason to believe that they are.

so it seems like you may want to write some notation and then put a 'not' in front of it. but you'll still never need a 'not' inside of it.

mb logics without negation are called "positive"?

i think the 'purity' types correspond to positive 'impurity' types, e.g. 'mutable', which we can use negation-as-failure semantics on, e.g. if you never any code that would mutate this value along the dataflow, you may consider it immutable.

description logics have negation, and are PSPACE-complete (at least as hard as NP-complete), so that's pretty hard (but hey, at least it's decidable)

positive first-order logic is NP-complete (but hey, at least it's decidable) ( http://ecommons.library.cornell.edu/handle/1813/7017 First Order Predicate Logic Without Negation is NP-Complete by Kozen, Dexter)

do we really need both universal and existential quantifiers in our patterns, though?

hmmm... if there exists any mutation along the dataflow, then the variable must be mutable, so there's a there-exists. and a list of ints must contain all ints, so there's a for-all. but perhaps these are two different kinds of reasoning? e.g. it doesn't seem like you would ever need a forall if you are looking at a list of known types and trying to deduce new types, e.g. if the thing has type Int and type single-world-in-memory you don't need to look at both of those to deduce type Number, that's just a there-exists. similarly, you might want the type of a list of ints to say each element is an int, but you don't care about a list of things-with-at-least-one-int, so there you only need forall, not there-exists.

but just thinking about graph patterns, sure, sometimes you want to match a node which has at least one arc going to (Int or Real or Complex), and sometimes you want to match a node whose arcs all go to Int.

mb just that you can have either forall or thereexists but not both on one expression?

but what if you want to look at a list and see if it's a list of ints; you look at every item in the list (follow each arc), look at each of their types (follow the type arcs in the meta view), and then look for at least one of these type arcs to terminate at Int. But this is saying, forall arcs in the list, there exists a type arc from the target of the initial arc that targets Int. So we are mixing them.

we want OR; if something is an Int or a Float then it is a Number.

and i guess the last statement shows that we want Implies too?

could we possibly get rid of AND?

hmmm.. mb the two different activies above are, reasoning from the types to allowed values, vs. reasoning from program code to types.

e.g. reasoning from program code to types, we see that if there is a possible mutation, we must have type Mutable.

reasoning from the types to allowed values, we see that a list of ints must contain all ints

e.g. if you see a single non-int go into that list, then you deduce the impure type "non-list-of-ints", which is like Mutable.

similarly, from Immutable, you deduce that for every step in the dataflow, there is no mutation.

i think i may be conceptualizing this wrong. first off, for straight-up graph MATCHING, the problem is simple (this is like checking if an assignment to variables satisfies some set of FOL statements). the inference complexity comes when you are not presented with a single graph, but rather with a statement in this language (a statement about the type of the graph, i.e. a graph pattern), and you must deduce merely from that whether it will match another graph pattern, without reference to a single particular graph.

also, when i say that we want to deduce that because something is one type, it is another type, the types form a lattice. we can require the users to explicitly construct the rungs of this lattice, we don't have to infer them from the definitions of the types.

also, when i say that the compiler should infer if matching this graph pattern sufficies to show that that graph pattern will be matched, we can require a user to use the assertion facility to write a proof of this for us; the compiler is only checking the proof, not writing it. so if the problem of finding the proof is NP, then of course the problem of checking the proof is in P.

so mb it doesnt matter if theorem-proving in this logic is NP.

mb we just tell the programmer:

if you say "type A is a subset of type B", then if there is a chain of these statements that shows that type A is a subset of type C, then the compiler will allow things of type A to match against patterns of type C. However, it will not otherwise infer that A is a subset of C just because their definitions imply it.
you can't use negation within a pattern. But whenever you match a pattern, you can request that the handler be invoked when the pattern is NOT matched, instead of when it is matched. E.g. it's almost like you can have negation, but only at the top of the pattern (but we don't phrase it this way, because patterns are composable, and this would not be)
we want to be able to support recursive concepts like dataflow so that we can define functional purity and immutability. so we want to be able to say "if there is ANY PATH to node A from a node B such that each step of the path satisfies these constraints, and node B satisfies those constraints, then node A is of type T". if our inference system was complete then we would define it as usual by giving the transitive property to the thing being measured. but in this case we don't want to guarantee complete inference in general, so the user must ask for it explicitly. this is essentially a 'reachability' operator. it also lets us say with respect to graphs representing expression trees, 'any expression that contains a subexpression of type Y is type T' -- immutability is a special case of this i think.
you can use AND, OR, FORALL, and THEREEXISTS.
- i feel like further comment is warrented about this. eg. if you say 'forall B' and you say 'A is a subset of B' then we'll match something that is guaranteed to be 'forall A'.
- will we do any transformation of the ANDs and ORs? probably not, again aside from reordering ones on the same level..
- and i guess we'll reorder the quantifiers for you within chains of identical quantifiers

wait a minute... i think i am misunderstanding quantifier here.. we won't have statements like 'for all arcs a from node n, and for all arcs b from node n, a = b'. we'll have statements like 'for all arcs a from node n, let the target of the arc be m. for all arcs b from node m, ...'. we'll only quantify once at each level. so no, the quantifiers won't be exchanged. but used this way, this isn't really FOL quantifiers. we can't express things like 'for all arcs a from node n, there exists an arc b from node n such that b's label = a's label + 1'.

also, the reachability is like * is regexs, leading to the analogy that the reachability component of these patterns is like a regex over linear paths instead of over linear strings. a character in a string has a single successor but a node in a graph can have many (i.e. the outgoing arcs; or even the ingoing ones, i see no need to require the programmer to always respect directedness here, although that may be a good default, and some types of graphs may not permit the 'what links to here' query). so we need forall and thereexists to choose how to handle the multiple (Kant would say manifold) successors.

remember that we need to ask things about the reified edges, too, like what their label is

so i guess the type system will accept two kinds of definitive statements; "type A is a subset of type B" and "if something matches this pattern, then it is of type B". If you say "variable V is type B", then the system will enumerate all known types which are subsets of type B, and go through all known "if something matches this pattern, then it is of type T" for each such type T, looking for a match.

note: semantically, giving a pattern with "C or D" for type B is the same as saying "type C is a subset of type B" and "type D is a subset of type B". However, the system may not bother to infer this. (or will it? perhaps it should try replacing each type in the pattern with all of its subtypes..? or can we leave it up to the user to assert those things? no, if you have a type of 'list of numbers' and you assert that a list of Ints is of this type, that assertion should typecheck without you having to explicitly say it is a 'list of numbers' first. so it does try to match each type in the pattern against any subset of that type; but what it does not do is try to match the patterns of each of those subset types (because then we'd be recursing..). so if complex numbers are defined as things of the form (float, float), and you have a list like [(float, float), (float, float)], but you haven't asserted that the (float, float)s are complex numbers, then this list won't typecheck as a list of numbers ([Num]), because although it known that complex numbers are a type of number, it's not going to bother to match these patterns.

if you have a 'list of list of numbers', and you give it a list of list of ints, this should typecheck, but why? because it substitutes the 'list of numbers' match structure directly into the list match structure at the top. so it'll substitute in match structures of things directly mentioned, but not their subset types.

note: this way we never have to actually go into the 'type' metanode in our patterns for typechecking (except mb if we are doing something weird, not sure if we NEVER have to..)

how can we make the syntax for these patterns like [Num] for list of numbers? i guess that's just universal quantification. but mutability is existential..

with mutability: i guess it's important to be able to say that fns by default mutate unless we say they don't.. so we allow negations at both ends of the pattern (at the atomic elements, and at the top), just not in the middle.

hmm.. also should have an 'exactly one' operator (in addition to forall and exists), and an 'is-injective' operator (no two arcs have the same target), and a 'is-function' (no two arcs have the same label) operator.

hmm, now that we aren't doing inference, why not allow FOL anyways?

also remember we that we need a cons operator.. and i guess we need the analog of cons for graph nodes (i guess it's 'add this edge to this node') probably other more graph-y ones will arise.. the cons operator and its like must be able to appear on the left hand side of function definitions. Haskell used : and Shen used

also need parametric types; e.g. "swap(a,b)" of type A,B -> B,A where A, B are type variables

hmm.. principal types are too restrictive (since we want to uniformally treat type qualifiers such as 'unique'), but mb each node should only have one type when seen from each perspective?

@@ could refer to the last mentioned perspective (lexical scope), sort of an 'ibid' for Jasper

mb -f means 'negation of f' (negation of True is False, negation of 3 is -3), and f- is 'inverse of f'

---

messages to handlers are a graph with a set of nodes distinguished as 'headers'. handlers match on any of the headers, and then get the graph, with that header as an entry point

---

another fundamental attribute of nodes is reverse arcs, that is, 'what links here'

---

http://www.shenlanguage.org/learn-shen/prolog.html

---

http://clojure.org/Protocols

http://clojure.org/datatypes

easy for loops like python enumerate

---

types:

there are three kinds of properties associated with types. one is propositions that hold true for every value of the type; these are things like "list of Ints". the second is propositions that hold true for every access of the value in the type; these are things like "unique" (seen as 'copy never applied to this'), "immutable". the third are things about how multiple accesses to the value relate to each other, or how access to this value relates to access to other values (this third definition is unsatisfactory because it is so general). included in this are things like 'unique' (seen as 'this variable's value cannot be changed without accessing it directly'), 'deterministic' (vs stochastic), 'no side effects', 'purely functional'.

a suitable notation for the first one need only use positive connectives, and possibly quantifiers, over values, with subtyping.

a suitable notation for the second one need only be expressed as marking each primitive operator with respect to the type, and then confirming that no negative primitive operator of the type is every applied to a variable containing the value (e.g. if you want it to be immutable, then no mutation is ever applied)

i don't know to capture the third one simply, since defining 'no side effects' seems to be impossible without a language for the semantics of the programming language. of course, 'no side effects' can be treated as the second case if we allow it as a primitive.

--- should name those. mb 'value types', 'expression purity types'.

---

types can have properties, e.g. conditions, and these can either be necessary or sufficient or both. necessary: every value of this type satisfies this condition, or in the second case above, every access to every value of this type satisfies this condition.

sufficient: if there is a value satisfying this condition, or if there is a variable whose accesses satisfy this condition, then it is of this type.

--- in fact, we should have a general notation that lets us write 'necessary', 'sufficient', and 'iff'

---

should we allow types whose semantics aren't fully defined, that is, ones for which a list of sufficient conditions is not provided? yes, because this is how we use the type system to attach arbitrary tags and qualifiers for use in metaprogramming.

---

note that we need to attach tags both to locations in the expression tree, and also to types.

mb ~ can be expression tree tags, not needing ^ also? or are footnotes different from arbitrary tags?

i guess if we use -> for logical implication then move should be :=

compiler/interpreter option to say if the necessary conditions of supertypes of the type of something are checked

'promise' has a different meaning for 'expression purity types' than its usual meaning; we are only promising that the expression purity holds within the function we are defining, not in general; but when we demand it, we demand it for all past and future, over all functions.

for value types, 'promise' has the usual meaning.

want a generic way to add modals and quantifiers to statements. 'promise' and 'demand' are modals.

mb should look at

http://en.wikipedia.org/wiki/Category:Logic_symbols

http://en.wikipedia.org/wiki/Modal_operator

interesting application of modal operators in http://en.wikipedia.org/w/index.php?title=Modal_operator&oldid=503040598

if Jasper is general enough it should enable us to give commands that people would describe as rather philosophical, rather than practical computer programs that we know how to execute. E.g. the 'immortality program':

   Search for a program whose semantics is extremely similar to the input-output relationships of the agent known as Eugene Baylis Shanks III, who has the attributes in the language English "human being born in Tennessee in 1979 C.E. on the planet Earth in this universe". If such a program is found, execute it.

This contains all sorts of complex ideas, all of which however are generic enough that Jasper (or its std libs) could include them: search (without specifying how to carry out a search; e.g. looking for a program on the Internet or deterministically searching the space of all programs, or stochastically searching the space of all programs), a program, the semantics of a program, similarity, an agent, a name, the idea of attributes specified in a foreign language, a conditional on the result of search, the execution of a program.

note that placing the concepts of 'search' and 'similarity' in a std lib means giving Jasper the ability to express ontologies (not necessarily baking a large ontology into Jasper, however, although i think a bit of that would be a good idea also)

btw i used 'is' instead of 'are' on purpose because 'the semantics of a program' SHOULD be a similar concept.. perhaps as noted earlier, in this context 'semantics' is interchangable with the word 'meaning'..

---

end copied from elsewhere

---

Defining New Types

The real innovation in Qi is the use of sequent notation to define types. This is an enormously powerful means of defining types which gives Qi capacities beyond ML or even more recent languages such as Haskell. The use of sequent notation derives from Gerhard Gentzen who developed the sequent calculus treatment of first-order logic. In Gerhard's system a sequent is a pair composed of two lists of logical formulae called the antecedent and the succeedent respectively. The sequent is valid if the conjunction of the elements of the antecedent logically implies the disjunction of the elements of the consequent. The concept of logical implication is defined by laying down a series of sequent rules that say how sequents may be proved.

If the succeedent is restricted by only allowing one element in the list, then the system is a single-conclusion sequent calculus system; which is the kind that Qi is designed to represent. Qi requires that user-defined types be defined in terms of sequent rules. In other words, to define a type in Qi is to say how objects can be proved to belong to that type.

For instance, suppose we wish to define a type binary composed of all non-empty lists of 0s and 1s. Using sequent notation we would write:

if X e {0,1} X : zero-or-one;

This says that for any X, any sequent whose conclusion is X : zero-or-one (meaning X is of the type zero-or-one) is provable if the side-condition is satisfied (namely X is an element of the set {0, 1}). In Qi we writeif

(element? X [0 1]) : zero-or-one;

We state the simplest condition for being a binary number. Next we have two rules that say how bigger binary numbers are built out of smaller ones.

X : zero-or-one; [X] : binary;

\ binary_right \ X : zero-or-one; Y : binary; [X

Y] : binary;

\ binary_left \ X : zero-or-one, [Y

[X Y

Z] : binary >> P;

This says that if we have a list of at least two elements X and Y (followed by 0 or more elements Z), which is assumed to be binary, then we can replace this assumption by the assumption that [Y

Z] is binary and X : zero-or-one. We need both rules because the first rule says how to prove conclusions of the form .... : binary and the second tells us what we can infer from assumptions of the form .... : binary.

Here is a script showing the definition of the datatype binary as a non-empty list of 1s and 0s.The function complement calculates the complement of a binary number. Notice there are no special constructor functions for binary numbers - we just use lists of zeros and ones in a natural way. But this means that there is no obvious way of telling if [1 1 1 0] is intended to be a list of binary numbers or a list of numbers. To avoid headaches and combinatorial problems in typechecking, Qi insists on the explicit typing of functions.

" -- http://www.lambdassociates.org/qilisp.htm

--- maybe instead of outside-the-system sequents (that let you define new logics e.g. substructural logic), we should use classical logic and restrict the programmer to defining implication-like things in sequent form.

e.g. "A and B and C -> D or E or F"

but not e.g. modus ponens, which is built in to classical logic

otoh would be nice to have a uniform notation for nonclassical logics..

--- 18

" @9:

    If a function mutates local variables, but does not mutate global state, for all practical purposes it is just as good as a stateless function. From the callers perspective it is stateless.

Indeed, and Haskell's type system is rich enough to express this constraint. See http://www.haskell.org/haskellwiki/Monad/ST#A_few_simple_examples

Posted by: Dan

November 10, 2009 7:41 PM

" -- http://scienceblogs.com/goodmath/2009/11/philosophizing_about_programmi.php#comment-2064185

me: yes, but... the other guy's point is that having to use the ST monad for that isn't quite as convenient.

toread: http://cs.anu.edu.au/~Ben.Lippmeier/project/thesis/thesis-lippmeier-sub.pdf

---

decidable dependent types stuff:

Dependent types are based on the idea of using scalars or values to more precisely describe the type of some other value. For example, "matrix(3,3)" might be the type of a 3×3 matrix. We can then define typing rules such as the following rule for matrix multiplication:

    matrix_multiply : matrix(k,m) × matrix(m,n)  matrix(k,n)

where k, m, n are arbitrary positive integer values. A variant of ML called Dependent ML has been created based on this type system, but because type-checking conventional dependent types is undecidable, not all programs using them can be type-checked without some kind of limitations. Dependent ML limits the sort of equality it can decide to Presburger arithmetic; other languages such as Epigram make the value of all expressions in the language decidable so that type checking can be decidable, ...

http://www.cs.nott.ac.uk/~txa/publ/ott-conf.pdf Observational Type Theory (epigram 2)

http://en.wikipedia.org/wiki/Intuitionistic_type_theory something to do with Epigram

http://en.wikipedia.org/wiki/ATS_%28programming_language%29 supercedes Dependent ML

  ATS also has the good idea of token prefixes which indicate boxed or unboxed types

---

SAGE has an interesting idea: "hybrid type checking":

"Sage performs hybrid type checking of these specifications, proving or refuting as much as possible statically, and inserting runtime checks otherwise."

---

perl's notion of grammatical modes, i.e. context-dependence, is interesting, but i don't think that, e.g. plurality (scalar or list context), is a good choice of mode. in natural languages, we have plurality and gender and case and ?what else for nouns, and (conjugation), tense, aspect, mood, evidentiality, ?what else. more generally, in natural languages we have parts of speech: nouns, verbs, prepositions, adjectives, pronouns, adverbs, conjunctions, and interjections (see http://en.wikipedia.org/wiki/Part_of_speech for more). kant provides a system of "categories", or (hyper)logical modes ("hyper" if logic is taken to be merely boolean or first-order logic, since Kant's system is more expressive than these). anyhow, the "best" grammatical modes for programming languages are probably different than these (we don't seem to need gender, for example, although who knows, mb a clever use of gender could make it easier to read complex nested expressions... hmm, i think that's actually a good idea, todo...).

why have conjugation/declension? not for matching, as in natural language (but see the idea above about using "gender" to write expressions with less parens..). but for power/reuse

perl also makes the mistake of having similar-seeming operators that have different coerced types (i.e. == is numeric context, eq is string context). unless the morphology of the operator makes it very clear what type it is, this is confusing. also, it does the reverse of what you want in programming languages (to use modality for power/reuse); here we have one "conceptual" operation with many different lexical realizations; what we want is one lexical realization per conceptual operation, with modifiers.

here are some ideas for "grammatical modes" that might be good for programming languages:

default type (perl has numeric, boolean, string; but i dont think this works too well, at least compared to alternatives like explicit coercion, implicit coercion, python protocols, or haskelltypeclasses) vectorize quoted referenced (meta-level?) ignored concurrent assertion boxed/wrapped declarative/logical statement/constraint variable (comment; in parens b/c i consider this to be at a lower level than the actual programming language) (literal; in parens for the same reason as comment) purity access permission scope header/declaration/vardecl vs body transaction (todo: generalize) maximization polarity (positive vs. negative, max vs min, true vs false; or tripolarity (t/f/maybe, +/-/neutral, etc) (mb generalize to n-ary arity? mb generalize to algebra, include things like the wu xing?)

a clean mapping from noun->data and verb->function may not be the best. because in the semantics of natural language, nouns are state, yes, but verbs have lots to do with time. now, a pure computation has no relation with time (it is beyond, or before, time). but an impure computation with I/O side effects does have to do with time. this suggests that pure computations should all be nouns, and impure should be verbs. but whether a computation is pure or impure depends on how you define which bits of state are "inside" the system; a computer can mutate purely internal variables without becoming impure, but if it mutates (or reads) external variables, it becomes impure (monads have to do with defining which state is in the system). so the question of whether a variable read (or write) is pure or impure is context-dependent. but whether a word is a noun or verb is not (very) context-dependent. so this questions requires further thought (todo).

keyword arguments are like adverbs/adjectives

note: i like perl's "it" pronoun ($_), but if you let "print" be a shortcut for "print $_", you need to have another way to pass "print", the function. which is fine. jasper's version of $_ should have lexical scope by default, unlike Perl.

todo: read http://www.google.com/search?q=haskell+open+data+paper&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:unofficial&client=iceweasel-a , including SYB3, and Open Data Types and Open Functions

search for attribute typing, attributive typing, attribute type system, attributive type system, partial typing, partial type system

property typing, property type system

predicate typing, predicate type system

propositional typing, propositional type system

fractional typing, fractional type system

success typings

toread

http://www.it.uu.se/research/group/hipe/papers/succ_types.pdf

also contains a related constraint propagation alg

mb also read later pubs by the same ppl:

http://user.it.uu.se/~tobiasl/publications/contracts.pdf

they implemented this stuff in a tool for Erlang, http://www.erlang.org/doc/apps/dialyzer/dialyzer_chapter.html

discussion:

http://lambda-the-ultimate.org/node/1910 http://www.quora.com/Has-anyone-used-success-typing-to-infer-and-validate-types-in-Python

also, a paper extending this work with a better algorithm that gives readable errors:

http://users.dsic.upv.es/~jsilva/papers/PEPM2013.pdf

an extension to catch more errors in certain cases:

http://hopa.cs.rhul.ac.uk/files/submissions/hopa2013_submission_7.pdf

currently cited by 48 articles in google scholar (http://scholar.google.com/scholar?cites=5054461946260377911&as_sdt=2005&sciodt=0,5&hl=en ), including this one which mentions success typings:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.103.4627

Static Detection of Race Conditions in Erlang : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.188.5681

this much-cited paper on TypeScript? claims that Success Typings are similar to the framework they use, called "abstract interpretation":

http://cs.au.dk/~amoeller/papers/tajs/paper.pdf

so mb what we need is just for each term to keep track of (a) a set such that if the term has an attribute within this set of type attrs, it definitely is ok, (b) a set such that if the term does not have an attr within this set of types, it definitely fails. do we also need the negations, e.g. attributes such that if it has them it fails and if it doesnt, it succeeds? do we also need to explicitly keep track of promise vs. demand?

my hunch, not fully developed, is that we'll end up needed just

 (a) a set such that if the term has an attribute within this set of type attrs, it definitely is ok, (PERMITTED, or P)

 (b) a set such that if the term does not have an attr within this set of type attrs, it definitely fails. (OBLIGATIORY, or B)

by adding their own custom type attributes and logic to go with them (see http://www.it.uu.se/research/group/hipe/papers/succ_types.pdf for an example of the form of such a logic; i havent read it yet but it looks like sequent calculus will do the trick), the user can extend these to be more expressive.

for the default, we'll choose readable and tractable over expressive.

my hunch is that it will suffice for one of (a) or (b) to be CNF and the other to be DNF.

for example, let's say that you see

f(x) = 3

we start by looking at 3 and saying, P = B = {3}, and looking at x and saying:

P = {}, B = {*}

now we look at f(x) and we say,

f: t(x) -> t(3)

therefore P(f) = B(x) -> P(3), so P(f) = * -> {3} and B(f) = P(x) -> B(3), so B(f) = {} -> {3}

umm... not quite.. the reasoning has to go both ways at once.. also, why is the literal 3 appearing as the type?

hmm for the both ways. if a value is being consumed as the input to a function, then that value's B is like offers, P is like hmm.

so,

P(3), B(3) = {3}, {3} P(x), B(x) = ?,?

actually i think both P and B (the strict typing and the 'success typing') must be INTERSECTED ("unified") at each step. that is, they go in the same direction. the difference is that when there are multiple branches of control flow, the P is pessimistic and the B is optimistic. e.g. in the http://www.it.uu.se/research/group/hipe/papers/succ_types.pdf paper, they make a point of noting that in Fig. 2, Derivation rules, APP, that usually APP would say that the type assigned to an application of a function would be a subset of the function's range, but in their system its a superset. i see also that there CASE derivation rule takes the OR of the cases, i assume this is also unusual and due to the liberality of success types.

toread++ http://people.fas.harvard.edu/~lib153/lectures/typechecking.pdf

-- good slides on Disciple (DDC) haskell variant:

http://cs.anu.edu.au/people/Ben.Lippmeier/project/disciple/disciple-anu-2008-06.pdf

these slides talk about Disciple' effect and region typing. Disciple seems like it might be a good language to implement Jasper in, but it's not as mature as Haskell.

i agree with disciple about how you should use something like effect types instead of monads. however, i think jasper attribute typing is a more general, uniform solution to this sort of thing.

Disciple (which is strict by default) has interesting syntax for laziness: f x is strict, f @ x is lazy

Disciple has effect, region, and closure typing. i dont understand the latter two.

http://cs.anu.edu.au/people/Ben.Lippmeier/talks/poisoning-20090618.pdf brings up an important problem he calls "poisoning":

if "mutable" is part of your types, and if "pi" is immutable, then using traditional type inference, if you have:

  x = x + 1
  if blah then x else pi

then you'll want to say that pi must be mutable (because x is clearly mutable; and you'll say that both branches of the if have the same type).

what i think jasper should do about poisoning is to allow the if to have the disjunctive type (union sum type, i think they call it) of the branches of the if.

and i guess if you have type information about multiple attributes (multiple type 'dimensions' you could call it) then that's a conjunctive type (product type, i think they call it)

---

chapter 3 explains the challenges of typing Python (stemming mainly from dynamicity):

http://web.archive.org/web/20081209222213/http://www.ocf.berkeley.edu/~bac/thesis.pdf

http://www.quora.com/Has-anyone-used-success-typing-to-infer-and-validate-types-in-Python

http://www.quora.com/What-are-the-most-common-type-inference-algorithms-in-use-today

" Jesse Tov, likes programming languages. 20 votes by Eugene Marinelli, Jackson Davis, Mike Dillon, (more) We should probably distinguish “type inference algorithms” from “(usually) inferred type systems.” I’m going to start with the latter, since I suspect it’s how most people would understand the question.

Of all things that we could characterize as type inference, the fairly lame stuff going on in Java and C# is probably more common. That aside, for global type inference—that is, where you write no types in your program (unless you want to) and are guaranteed that the type system admits only typeable terms—as far as I know descendants of Hindley/Damas-Milner are the only game in town.

The actual type systems in use are, for the most part, not pure HM but extensions. Of the major HM-based languages, Standard ML is still pretty close, but other languages have moved away in different directions. Ocaml has extended HM with subtyping and row types, whereas Haskell has extended HM with type classes, both of which involve type constraints. In that sense, the most popular HM variant might be Sulzman et al.’s HM(X), which is HM parameterized by a constraint system, though Ocaml and Haskell are not usually presented in that framework. Ocaml and Haskell have also both been extended with higher-rank polymorphism. So we might say that another commonly inferred type system is one for which type inference is undecidable: System F_\omega. Haskell’s FPH and whatever it is Ocaml is doing obviously don’t infer F_\omega types, but they do impressive type inference while dealing with a large portion of that. Along with FPH are a variety of other three-letter acronyms such as MLF, HMF, and HML, all of which are attempts to do HM-style type inference for larger subsets of System F.

The other interesting thing happening in type inference, and I don’t know how common it is, is local type inference. Scala is probably the most popular language right now with non-trivial local type inference. Scala’s type system is not globally inferable, but local type inference helps by eliminating the need for most type instantiation and some type annotations on bound variables. (Typed Racket and the language I work on both use local type inference as well, though I don’t think they qualify as “common.” But what I take away from this is that for people who want type systems to do increasingly difficult things, local type inference may be increasingly attractive.)

As for type inference algorithms, that’s harder to answer because I don’t actually know exactly what algorithms everyone is using. There are several distinct algorithms for pure HM, and some are more suitable than others for various extensions.

Damas and Milner (1982) first gave their type system in a non-algorithmic form. In particular, it doesn’t specify where to apply the generalization and instantiation rules. Their key contribution is Algorithm W, an algorithmic version of their type system that is sound and complete. That is, if W returns a result then there is a valid derivation of that result in their type system, and conversely, if a term is typable in their type system then W finds the principal (i.e., most general) type scheme. There are other algorithms for this same type system. Notably, Algorithm J is also sound and complete for HM, and many people consider it easier to understand.

However, I suspect that neither W nor J is the most common (HM-style) type inference algorithm at this point. Both Algorithms W and J call Robinson’s algorithm for first-order unification as a “subroutine.” Rather than interleave constraint generation and unification, it’s possible to separate type inference into two phases, the first of which traverses the term and generates a constraint, and the second solves the constraint by unification. I believe the idea originated with Wand (though I can’t figure out where). The best explanation of this that I know of is Pottier and Rémy’s chapter “The Essence of ML Type Inference,” in Advanced Topics in Types and Programming Languages (Pierce).

I mentioned earlier that inferred type systems today tend to be extended in ways that fall into the HM(X) framework, such as subtyping and type classes. Several of the first-class polymorphism extensions are in terms of constraints as well, though I don’t know whether that makes them candidates for the X in HM(X). While the original HM(X) paper uses an extension of Algorithm J (if I recall correctly), Pottier and Rémy recast it using the constraint solving approach, and it makes a lot of sense that way. I suspect that this new approach is more common (or becoming more common) than algorithms like W at this point. So it may be that the most common (non-trivially) inferred type system at this point is (instantiations of) HM(X), and the most common type inference algorithm is Pottier and Rémy’s algorithm for HM(X).

Via Christopher Thorpe. Comment • Share • Thank • Report • 10 Oct, 2010 "

---

http://lambda-the-ultimate.org/node/1115

http://lambda-the-ultimate.org/node/1532 --

what i call 'assumptions' he calls 'contracts'. maybe 'contract' is the better name.

http://www.cis.upenn.edu/~bcpierce/papers/harmful-mfps.pdf

oo, should read this:

toread

http://math.andrej.com/2012/11/08/how-to-implement-dependent-type-theory-i/ http://math.andrej.com/2012/11/11/how-to-implement-dependent-type-theory-ii/ http://math.andrej.com/2012/11/29/how-to-implement-dependent-type-theory-iii/

also "Dan Licata pointed me to A Tutorial Implementation of a Dependently Typed Lambda Calculus by Andreas Löh, Connor McBride?, and Wouter Swierstra which is similar to this one. It was a great inspiration to me, and you should have a look at it too, because they do things slightly differently: they use de Bruijn indices, they simplify things by assuming (paradoxically!) that , and they implement the calculus in Haskell, while we are going to do it in OCaml."

hmm, i think i should read that first:

http://www.andres-loeh.de/LambdaPi/LambdaPi.pdf

---

wikipedia's list of major type systems

ok this looks crazy: http://www.cs.indiana.edu/~sabry/papers/rational.pdf

success typings (i dont understand this yet but i think it may be what we want; hmm actually mb we want to report BOTH conventional typings, and success typings?):

http://www.softlab.ntua.gr/Seminar/2006/presentations/sagonas.pdf

https://www.google.com/search?client=ubuntu&channel=fs&q=success+typings&ie=utf-8&oe=utf-8

" F# has _ for placeholders in types. That works really well. Reply

    Liyang HU / Oct 27 2010 5:14 am

    @Ganesh: As does Agda.
    Reply	
    augustss / Oct 27 2010 7:09 am

    The same has been suggested for Haskell a long time ago, but nobody implemented it.
    Once you have _ you start wishing for more. For instance, _ -> _ means any function, but what if you wanted both _ to stand for the same type? Maybe _a -> _a?

...

UHC supports partial type signatures, although it uses different surface syntax: … -> …, resp %a -> %a for the examples above. It’s already there for a long time, so it might be a good idea to bring it in line with the notation Agda uses and to polish it up a bit. See also the doc http://www.cs.uu.nl/wiki/bin/view/Ehc/UhcUserDocumentation, look for partial type signatures.

---

ah, a better soln to the haskell variadic problem:

instead of saying (as i proposed but then discarded above) that you can have polymorphic return arguments, but they must be of the same kind, just say that you can have polymorphic return arguments, but for the purpose of type checking everything else, the type checker will pretend it can't infer which one you are using, even if it can.

e.g. if you have a function 'empty', which can return an empty list or an empty dict, you can't call any list-specific methods on this that arent present in dict

hmm but wouldn't that cause it not to typecheck at all if called in a context in which a dict is required?

hmm the idea there was to check kinds when the kind was syntactically important, but still to allow, say, some sort of weird database store that could store functions of arbitrary arity

hmm could make the entire typechecker only go forward, not backward, when doing polymorphism (e.g. an initial fwd-only pass to resolve polymorphism, then both directions passes to propagate constraints and check type safety, but polymorphism isnt allow to be resolved anymore).. seems extreme..

could allow backwards polymorphism but only when there is an explicit type annotation sufficiently close to the source.. i guess the variadic haskell issue (the issue is just that it is confusing, not incorrect) happens even if we recursively follow one function, and then also the call site.. so maybe just do local type inference (not even module-local)?

Subtyping and type inference

The type inference problem in the presence of parametric polymorphism and subtyping like in Java and C# is significantly harder than in Haskell (98) or SML. You're very unlikely to ever get anything like those FP languages' type inference. Scala is a good language to look at in this regard. It too uses (only) local type inference which is more or less as much as one can reasonably expect. That said, you can do way, way more with local type inference than C# is doing (as Scala demonstrates.) By Derek Elkins at Fri, 2008-06-20 22:12

Both necessary and not

In Typed Scheme, because Scheme inherently uses subtyping, we were unable to choose an H-M style global inference algorithm. Further, based both on our experience with ML/Haskell and our experience with the HtDP? style of giving a contract and a purpose statement for every function, we decided that that amount of annotation was a good middle ground. This also has the significant benefit that you can tell by looking at a function what the type is, which is often not the case in ML.

However, in some places, type inference should be considered mandatory. For example, with the exception of FX-89, no language I know of has ever had polymorphic functions which required explicit instantiation at call sites. For example, everyone wants (map add1 (list 1 2 3)) to work without an additional annotation. This, of course, requires type inference. We, like Scala, chose a solution based on Pierce and Turner's Local Type Inference system, but lots of other solutions are possible here (Java does inference at call sites as well, for example). By Sam Tobin-Hochstadt at Sat, 2008-06-21 12:12

" Probably because they don't know it's useless. I use and like C#. The type inference seems useful to me. Avoiding generic parameters on almost every linq extension method is a huge savings in comprehensibility. var x = new SuperDuperLongClassName?(); is a nice savings in redundancy. "

" Types can be parameterized over other types so that we end up with ???type constructor??? entities whose instances are concrete types much as data values are instances of types"

interesting; it's true, a List(Ints) is an instance of the List() type constructor; but it is also an element in the range of List() if List() is thought of as a function from types to types.

http://www.hypergraphdb.org/learn?page=RefTypes&project=hypergraphdb

Pro- toLisp's types include numbers, truth values, characters, procedures, sequences, symbols and pairs (a pair denotes a procedure call). 2

see also typeSystem, typeSystemDesignNotes (in other folder)

So i guess a type system that lets a type of all types be constructed is impredicative. I'm not sure if that's really a problem for programming languages though (except for its influence on the decidability of type checking and type inference). i guess if the programmer said, "T is the type of all types. R is the type of all types Y such that that Y is not a member of Y. Variable x is of type R. x = R.", the type checker might hang. Surely we can just rely on the programmer not to do that (and maybe have a max recursion limit in the type checker)?

" There are a few things about Go that I'm not super happy about, and that tend to bite me from time to time.

First, you need to "know" whether the variable you are dealing with is an interface or a struct. Structs can implement interfaces, of course, so in general you tend to treat these as the same thing. But when you're dealing with a struct, you might be passing by reference, in which the type is *myStruct, or you might be passing by value, in which the type is just myStruct. If, on the other hand, the thing you're dealing with is "just" an interface, you never have a pointer to it -- an interface is a pointer in some sense. It can get confusing when you're looking at code that is passing things around without the * to remember that it might actually "be a pointer" if it's an interface rather than a struct.

Go's type inference makes for lean code, but requires you to dig a little to figure out what the type of a given variable is if it's not explicit. So given code like:

    foo, bar := someFunc(baz)

You'd really like to know what foo and bar actually are, in case you want to add some new code to operate on them. If I could get out of the 1970s and use an editor other than vi, maybe I would get some help from an IDE in this regard, but I staunchly refuse to edit code with any tool that requires using a mouse.

Finally, Go's liberal use of interfaces allows a struct to implement an interface "by accident". You never have to explicitly declare that a given struct implements a particular interface, although it's good coding style to mention this in the comments. The problem with this is that it can be difficult to tell when you are reading a given segment of code whether the developer intended for their struct to implement the interface that they appear to be projecting onto it. Also, if you want to refactor an interface, you have to go find all of its (undeclared) implementations more or less by hand. "

Java generics are an example on how not to do it, but it does not mean all generic implementations are like Java's.

szalansky 1 day ago

link

Hope you write some further explanation for your statement, because so far it looks like part of this c++ vs. java flame.

pjmlp 1 day ago

link

Type erasure. Going the .NET way would be a better approach on my opinion;
Lack of type definitions. If it was possible to declare types in Java, the generic declarations wouldn't need to be so monstrous;
The common co-variance/contra-variance issues with the generic types;
Lack of support for builtins;
No way to specialize generic algorithms for specific types

reply "

some notes from http://research.microsoft.com/en-us/um/people/simonpj/papers/haskell-retrospective/ECOOP-July09.pdf regarding haskell type classes:

" generics not subtyping

 Haskell has no sub - typing  Ability to act on argument of various types achieved via type classes: data Tree = Leaf

Tree f :: Tree - > Int f t = ... f‟s argument must be ( exactly ) a Tree square :: (Num a) => a - > a square x = x*x Works for any type supporting the Num interface  Means that in Haskell you must anticipate the need to act on arguments of various types (in OO you can retroactively sub - class Tree)

Branch Tree

however, compared to Java, which has subtyping, this restriction to generics allows haskell to do more type inference:

" Reason:  Generics alone => type engine generates equality constraints , which it can solve  Subtyping

> type engine generates

subtyping constraints , which it cannot solve (uniquely) "

bayle: i don't understand why constraints that cant be solved uniquely are a problem; cant you just say 'this is some type that's a subtype of that'? is this one of those problems caused by the doctrine of principal types?

haskell's parametric polymorphism can be used for some of the use cases of subtyping

because haskell does not have subtyping, the language designer doesn't have to think about http://en.wikipedia.org/wiki/Covariance_and_contravariance_%28computer_science%29

times when co- and contra-variance comes up (see en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)/ ):

a subtype method must be at least as permissive in what it accepts as the superclass, and at least as constrained in what it emits; so it must be contravariant in its arguments and covariant in its return types
- wikipedia adds: "Languages with first-class functions have function types..Those languages also need to specify when one function type is a subtype of another—that is, when it is safe to use a function of one type in a context that expects a function of a different type...In other words, the → type constructor is contravariant in the input type and covariant in the output type. This rule was first stated formally by Luca Cardelli.[1]..When dealing with functions that take functions as arguments, this rule can be applied several times...it is easy to calculate which positions are co- and contravariant: a position is covariant if it is on the left side of an even number of arrows."
- wikipedia adds: "This creates problems in some situations, where argument types should be covariant to model real-life requirements. Suppose you have a class representing a person. A person can see the doctor, so this class might have a method virtual void Person::see(Doctor d). Now suppose you want to make a subclass of the Person class, Child. That is, a Child is a Person. One might then like to make a subclass of Doctor, Pediatrician. If children only visit pediatricians, we would like to enforce that in the type system. However, a naive implementation fails: because a Child is a Person, Child::see(d) must take any Doctor, not just a Pediatrician. We could try moving the see() method to the Doctor class hierarchy, but we would have the same problem: If a Doctor could see a Person and a Child is a Person, then there is still no way to enforce that a Child must see a Pediatrician and that a Person who is not a Child cannot see a Pediatrician and must see another Doctor. In this case, the visitor pattern could be used to enforce this relationship. Another way to solve the problems, in C++, is using generic programming (see below)."
  - wikipedia adds that Giuseppe Castagna[2] noticed that the sort of reasoning that leads us to pair Child with Pediatrician is captured by runtime multiple dispatch. That article states, "In type-theoretic research on object-oriented programming, the issue of “covariance versus contravariance” is a topic of continuing debate. In this short note we argue that covariance and contravariance appropriately characterize two distinct and independent mechanisms. The so-called contravariance rule correctly captures the subtyping relation (that relation which establishes which sets of functions can replace another given set in every context). A covariant relation, instead, characterizes the specialization of code (i.e., the definition of new code which replaces old definitions in some particular cases). Therefore, covariance and contravariance are not opposing views, but distinct concepts that each have their place in object-oriented systems. Both can (and should) be integrated in a type-safe manner in object-oriented languages. We also show that the independence of the two mechanisms is not characteristic of a particular model but is valid in general, since covariant specialization is present in record-based models, although it is hidden by a deficiency of all existing calculi that realize this model. As an aside, we show that the &lgr;&-calculus can be taken as the basic calculus for both an overloading-based and a record-based model. Using this approach, one not only obtains a more uniform vision of object-oriented type theories, but in the case of the record-based approach, one also gains multiple dispatching, a feature that existing record-based models do not capture"
writing to a container, you may only write things that are subtypes of the container's value type; but reading from a container, you may only assume that the value you get will be a supertype of the container's value type. since we think of this in terms of casting the container's value type (rather than casting the value to be read/written), writing to a container is contravariant (you can only pretend that the container's value type is a superclass of what it actually is), and reading from a container is covariant (u can only pretend that the container's value type is a subclass of what it actually i).

regarding en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)/ and arrays , i assume the best thing to do is to actually treat array reads and array writes differently (e.g. covariance for reading, contravariance for writing)

regarding the child who can only see a pediatrician, not any ol' doctor, i guess the example demonstrates that in this case, the concept of 'Child' is not a subtype of 'Person', nor is 'Pediatrician' a subtype of 'Doctor', at least if the 'see' function is included in either one as normally defined. But how does this comport with the notion that children are subsets of people, and pediatricians are subsets of doctors? The problem is that, when we say that a 'Person' has a function 'see' whose domain is 'Doctor', we are stating that any Person may see any Doctor. What we wanted to say, was, rather, than if x is a Person, then a necessary but not sufficient condition for y to be in the domain of x.see is for y to be a Doctor.

this may relate to my proposed separation between 'demands' and 'promises'; if you see x.see(y), where x is a Person, this demands that y is a Doctor, but it does not promise that the expression is valid if y is a Doctor. I guess a better notation would be necessary vs. sufficient conditions for validity; if x is a Person, and you have x.see(y), a necessary condition for validity is that y is a Doctor, but this is not a sufficient condition. Alternately, we may speak of only necessary and sufficent conditions, but of sub- or super- types; if x is a Person, and you have x.see(y), then the condition for validity must be of the form that y is a some subtype of Doctor (note: as stated this would seem to rule out disjunctive and conjunctive expressions in the condition of validity, which isn't quite what i wanted).

Trying to understand Castagna's notion of relating multiple dispatch, i came up with the following:

if the compiler sees that Child is a subset of Person, then the fact that a Person can only See a Doctor gives us a necessary condition for x.see(y) to be valid when x is a Child: y must be a Person
if the compiler sees a multiple dispatch implementation of Child.see(Pediatrician), e.g. an implementation in the Child class of .see where the parameter is of type Pediatrician, then this provides a sufficient condition for x.see(y) to be valid when x is a Child: when y is a Pediatrician we know what to do.
the previous two cannot both be QUITE right, because what happens when Pediatrician isn't a subtype of Doctor? in that case the multiple dispatch definition and the declaration that Person.see() must take a Doctor parameter would seem to be in conflict. Three things that we could do (there may be more) are: (a) say that the sufficient condition is actually the conjunction of any necessary conditions, plus what we called above the sufficient condition (e.g. the Child.see() is only permitted to be called with a Doctor, so the Child.see(Pediatrician) definition is only actually called when the Pediatrician is also a Doctor); (b) say that what we called above the necessary condition is actually the disjunction of that with the sufficient conditions (e.g. Child.see(y) is actually valid whenever y is a Pediatrician, OR y is a Doctor), (c) notice this discrepency and throw a compile-time error. I favor (a).
there is a connection to success typings. Traditional typing seeks to guarantee through compile time analysis that the program will have no runtime type errors; success typings seeks to prove through compile time analysis that the program has a type error. So, traditional typing is concerned with the sufficient conditions for validity, seeking to prove that in every term in the program, the sufficient conditions for type validity are true, and success typing is concerned with the necessary conditions for validity, seeking to find a term in the program for which the necessary conditions for type validity are false.
the previous bullet point highlights that success typings may be badly named. In success typings, the type you associate with each term is the maximal type which meets the necessary conditions. If you ever unify two of these and get the null set (the inconsistent type which has no values as members), then you know that two terms have been combined in a way such that the necessary conditions for both of them can't possibly be met, therefore you have proved that there is an error in the program, e.g. a statement that if reached would cause a runtime error. So really, you might call this maximal type which meets the necessary conditions (and therefore, which represents the necessary conditions) a 'failure type', since knowing that a value falls within this type doesn't guarantee success, but knowing that a value falls outside this type does guarantee failure. And similarly, you might call the minimal type which meets the sufficient conditions a success type.
with actual multiple dispatch, there is an addition wrinkle; if you define a dispatch handler for Child.see(Pediatrician) but nothing else, then there is no way to handle Child.see(y) where y is a Doctor but not a Pediatrician. So in this case, just by looking at the multiple dispatch, you've also deduced something about the necessary conditions, not just the sufficient ones. If you say well then, we'll have to make our neccessary conditions more constraining, then this may be the dual to solution (a) in the point above starting with "the previous two cannot both be QUITE right".

also, in relation to the co/contra-variance of arrays, i think a simpler way to look at it is:

One way to look at it is that the container has an interface with two methods, read(): v and write(): v -> (); as usual, arguments must be contravariant and return types must be covariant; so when you when you write to an array you have contravariance in what you pass to the write() method, and when you read from an array you have covariance in what you get back from the read() method. For example, if you have an array of type CatArray?, and if this is a subtype of arrays called AnimalArray?, then in order to satisfy Liskov substitution w/r/t AnimalArray?, a CatArray? would have to accept any Animal written to it (it may demand only Animal), and promise to emit only Animals when reading from it (it must promise at least Animal).

Clearly that doesn't fit the bill; the above shows that what we called a CatArray? isn't the same notion as Cat[], or alternately, that Cat[] is not a subtype of Animal[], even though Cat is a subtype of Animal (that is, A \subset B does NOT imply that A[] \subset B[]).

When Wikipedia notes that immutable arrays can be covariant, that's because it's just like saying that anywhere you have an AnimalArray?.read(), you can substitute a CatArray?.read() without harm; the 'immutable' condition is here the same as saying 'an object with a subset of the full interface (namely, only the 'read' method).

the above suggests an inference algorithm that works in terms of the following: for each term, keep track of two types. in the following, i use the terms 'success typing' and 'failure typing' in the opposite of the traditional manner:

(a) a type representing the disjunction of all of the explicitly given sufficient conditions, e.g. the cases of a multiple dispatch (b) a type representing the conjunction of all of the explicitly given necessary conditions, e.g. the inferences on subtypes given by Liskov substitution

now the type associated with a term is the intersection of (a) and (b). In terms of conditions, that is:

(a disjunction of each explicitly given sufficient conditions) AND (a conjunction of each explicitly given necessary condition)

The Liskov substitution principal is not satisfied, because a Child cannot always be substituted for a Person (because, due to the conjoining of the disjunction of multiple dispatch cases, it is a type error for a Child to see a non-Pediatrician Doctor)

However, if the type annotations for Person are not regarded as a success typing but rather only a failure typing (not as, "if i use this Person in such a way so that these are satisfied then i want to be guaranteed that i wont have a type error at runtime", but instead, "i want the compiler to notify me if i DON'T use this Person in such a way so that these are satisfied"), then the corresponding principal to the Liskov substitution makes sense.

hmmm... needs more work..

if whenever we have multiple dispatch, we require (a disjunction of each explicitly given sufficient condition) to be a superset of (a conjunction of each explicitly given necessary condition), then the above type just becomes a conjunction of the necessary conditions, so Liskov substitution works; e.g. if we require that any multiple dispatch exhausts the type annotation given to the containing function. However, if a case like Child-Pediatrician, this just leads the programmer to create a multiple dispatch clause covering the case where a Child Sees a non-Pediatrician Doctor, which will just throw an runtime exception. We want a compile-time exception in this case.

ok how about this. we need a way for the programmer to be able to separately express "if this condition is satisfied, that sufficies to show that this will be well-typed" vs "if this condition is NOT satisfied, that sufficies to show that that this will be ill-typed", that is, success type annotations from failure type annotations. This goes back to my "demand vs. promise" idea: a success type annotation is a promise, a failure type annotation is a demand.

however, we also want a way to make explicit statements that are combined with statements elsewhere to form the full success or failure typing, e.g. the way that we want to constrain Child.see() to take only Doctors even if the multiple dispatch just has a clause for Pediatrician, and Pediatrician is not a subtype of Doctor.

So, the explicitly given demand or promise statements are really just partial demands or promises.

Now, when defining Person, we provide facilities for the programmer to say EITHER (or both) of: "i promise that if you do Person.see(x) and x is a Doctor, then this is valid" vs "i demand that if you do Person.see(x) then x must be a Doctor". If the former (the promise), and if Child is a subtype of Doctor, then Child.see(x) must accept any Doctor in x. If the latter (the demand), and if Child is a subtype of Doctor, then Child.see(x) might put additional conditions upon x beyond that it be a Doctor. Liskov substitution only applies to the promises.

As for combining explicit typings with implicit, what i mean is that if in one part of the code you make a promise that x is valid given some condition A, but then in a superclass you make a demand that condition B must hold of x, then the real promise is only that x is valid if (A and B). But what about duality? In what circumstances do explicit promises affect demands, instead of just explicit demands weakening subtype promises?

One thing we could do is have two notions of subtype rather than one; a promising subtype and a demanding subtype. I'm not going to consider that right now.

Another thing we could do is have supertype promises weakening subtype demands. E.g. if in the supertype we say that Child can See any Doctor, and then in the subtype we say that Child must see a Pediatrician, then we implicitly weaken the demand in the subtype to be "must see a Pediatrician OR a doctor". In this world we would have a compile-time error if Person promises that it can See any Doctor, but Child has a multiple dispatch with only Pediatrician; the Child multiple dispatch would infer both a promise and a demand, but that demand would be weakened, but it cannot be because it's coming from the multiple dispatch, so there would be an error. In this system, the supertype's promises and demands always implicitly overrides the subtypes, and when the subtype tells the compiler W, this is interpreted as something like "W, unless my supertype disagrees".

i think i like the latter suggestion.

---

co/contravariance, subtyping, promise/demand, failure and success typings

provide the programmer facilities to make type annotations that are either success typings (in my sense, which is the opposite of the traditional sense), which are promises that if they are obeyed, then things are well-typed, and also failure typings, which are demands that say that if they are not obeyed, then things are ill-typed.

explicit subtype promise and demand type annotations are implicitly weakened by supertype demands and promises.

multiple dispatch generates implicit promises and demands. If an implicit demand would be weakened by supertype promises, a compile-time error is thrown.

Perhaps promises are disjunctions and demands are conjunctions, although i'm not sure about that yet.

So, e.g. if Person is a subtype of Child and a Pediatrician is NOT a subtype of Doctor, if the supertype Person has a method .see(x):

   and it makes a demand that x is a Doctor, and the subtype Child implements via multiple dispatch the case where child.see(x) is a Pediatrician, then in fact the implied promise is weakened, and a Child can only .see Pediatricians which are also Doctors

   and it makes a promise that if x is a Doctor, then person.see(x) is valid, and the subtype Child implements via multiple dispatch the case where child.see(x) is a Pediatrician, then a compile-time error is thrown, because the implied demand that x be a Pediatrician must be weakend by the superclass's promise, so the actual demand of Child is only that .see(x) be passed a Doctor, and this demand is not exhaustively covered by the multiple dispatch cases.

Liskov substitution holds in both cases; if Person makes a promise that x in .see(x) need only be a Doctor, then others using Person can rely upon this demand, and we have a success typing where a call to person.see(doctor) is promised to succeed. But if Person makes a demand that x in .see(x) needs to be a Doctor, then we have a failure typing where a call to person.see(non-doctor) fails at compile-time, but there is no guarantee that person.see(doctor) will always succeed.

We can state a modified Liskov substitution principal in terms of promises and demands, rather than in terms of all properties of the type, as: the subtype must promise everything that the supertype promises, and may not demand anything that the supertype doesn't demand. This allows us to capture failure-typing properties (demands) as well as success-typing properties (promises).

Note that in the latter case, since Person doesn't give ANY promise regarding the type of x in .see(x), any uses of Person elsewhere in the program can never be proven to be valid. A programmer demanding success type safety cannot use Person.see (they must speak in terms of subtypes like Pediatrician). However, if Person was guaranteed to be one of an exhaustive list of potential subtypes, then we could take the conjunction of all of the promises in the subtypes to get a promise for Person, e.g. if a Person is either a Child or an Adult, x is guaranteed to be valid in person.see(x) iff x is both a Pediatrician and an AdultDoctor?. So this suggests that we should provide a way to state an exhaustively enumeration of the subtypes of a supertype.

note that subtypes can also expand functionality by making promises that their supertypes don't. E.g. if the supertype only promises success when .see is given someone who is both a Pediatrician and an AdultDoctor?, then a Child could promise that they can .see() a Pediatrician who isn't an AdultDoctor?. Another example is that an AlternativeMedicineBeliever? could promise that they can .see() a chiropractor, whereas an arbitrary Person might not promise that.

note that you still require the language to support parametric polymorphism (generics) in order to write functions that both read and write arrays of arbitrary type; an array which promises to give out Cats must also demand that you only put Cats in it; so if you are moving elements from one location in an array to another, you'll require the guarantee that it only gives out Cats in order to be able to promise that you will only put Cats in.

this language seems kind of confusing with regards to the type of results. It makes sense that a function demands that its inputs be of a certain type, or promises that if you give it inputs of a certain type, it won't crash. It makes sense for a function to promise that its output will be of a certain type. But what does it mean for a function to demand that its output be of a certain type?

i guess those term 'demand' isn't the best. a function says that succeed if its output is of a certain type, and will fail if its input is not of a certain type. a function says that its output will definitely be of a certain type, and it says that you must be prepared for an output that is any member of a certain type.

when talking about function outputs, the terms 'promise' and 'demand' sound like they make sense only if you talk about the promises and demands of the next INPUT connected to that output, and swap them. E.g. if i promise that i will output a cat, you can demand a cat from me. If i demand to be allowed to output any animal, you must promise that you can accept any animal. So whereas an input promise or demand refers to "i must get a subset of this", an output promise or demand refers to "you must accept a superset of this".

if A \subseteq B: If i make no promises nor demands regarding my output, and you make no promises nor demands on what you can accept, we're cool (we are not guaranteed not to fail at runtime tho). If i make no promises nor demands regarding my output, and you demand to be give A, we're cool (we are not guaranteed not to fail at runtime tho). If i make no promises nor demands regarding my output, and you promise that you can accept A, we're cool (we are not guaranteed not to fail at runtime tho). If i demand to output A and you demand and promise nothing, we're cool (we are not guaranteed not to fail at runtime tho). if i demand to output A and you demand to be given A, we're cool (we are not guaranteed not to fail at runtime tho). if i demand to output A and you demand to be given B, we're cool (we are not guaranteed not to fail at runtime tho). If i demand to output A and you promise to accept A, we're cool (we are not guaranteed not to fail at runtime tho). If i demand to output A and you promise to accept B, we're cool (we are not guaranteed not to fail at runtime tho). If i demand to output B and you demand to be given A, compile time error If i demand to output B and you demand to be given B, we're cool (we are not guaranteed not to fail at runtime tho). If i demand to output B and you promise to accept A, we're cool (we are not guaranteed not to fail at runtime tho). If i demand to output B and you promise to accept B, we're cool (we are not guaranteed not to fail at runtime tho).

If i promise to output A and you demand and promise nothing, we're cool (we are not guaranteed not to fail at runtime tho). If i promise to output A and you promise to accept A, we're guaranteed to succeed. If i promise to output A and you promise to accept B, we're guaranteed to succeed. If i promise to output A and you demand to be given A, we're cool (we are not guaranteed not to fail at runtime tho). If i promise to output A and you demand to be given B, we're cool (we are not guaranteed not to fail at runtime tho). If i promise to output B and you promise to accept A, we're cool (we are not guaranteed not to fail at runtime tho). If i promise to output B and you promise to accept A, we're cool (we are not guaranteed not to fail at runtime tho). If i promise to output B and you demand to be given A, we're cool (we are not guaranteed not to fail at runtime tho). If i promise to output B and you demand to be given B, we're guaranteed to succeed.

by 'we're cool', i mean 'no compile time error from the failure typing' and by 'we're guaranteed to succeed', i mean the program typechecks in the success typing (e.g. traditional) way, and by ' we're cool (we are not guaranteed not to fail at runtime tho)' i mean that although failure typing wouldn't give us a compile time error, the program does not typecheck in the success typing (e.g. traditional) way.

note that you demands and promises are related: you can't promise that you'll be happy with a type of input while also demanding something more restrictive, and you can't promise that you'll output a type while demanding that you be allowed to output a supertype. This is because these are modal logic operators with the usual relationship: you are NOT allowed to promise something that is NOT consistent with your demands, and you are NOT allowed to make demands that you have promises you wouldn't make. The direction of this relationship flips from input to output; you can't demand that you be given input that is a proper subset of what you promise that you will accept, or equivalently, you can't promise that you will accept input that is a proper superset of what you have demanded; and you can't demand that you be permitted to output a proper superset of what you have promised you will give, or equivalently, you can't promise to give a subset of what you've demanded. (but do these relations bind subclasses? i guess not, b/c a subset can promise to handle a type of input that the superclass can't handle; so a Child CAN say that it can .see() a Pediatrician who isn't a Doctor. hmm so i guess that kills the above idea of 'implicit weakening' controlled by the superclass, right? maybe implicit weakening only happens within the same class? or do we say that a Child can only promise to see a non-Pediatrician Doctor if the Doctor in the Person superclass was a promise, not a demand? e.g. so then you couldn't have a Child class which both (a) refuses to see all Doctors (non-Pediatricians), and also (b) can see some non-Doctors. i guess that makes sense. ok, so yes, these relations DO bind subclasses? Again, moving from superclass to subclass increases promises and weakens demands. So maybe superclass promises do bind subclasses, and furthermore, subclasses may not introduce new demands? but then what about the Child who demands a Pediatrician? So maybe moving from superclass to subclass we can increase promises, weaken demands, and also decrease promises when doing so does not conflict with superclass demands, and increase demands when such does not conflict with superclass promises? no that isn't right.. hmm... maybe it varies with input vs output too; moving from superclass to subclass we can increase promises, and weaken demands, certainly; we can also increase demands on the input provided it doesn't conflict with a superclass promise; but i guess we can do the same with the output.. maybe promises and demands are the asymmetrical thing: you can only increase promises, but you can both weaken and strengthen demands? so promises are binding on subclasses but demands (or lack thereof) are not?)

so the failure typing triggers a compile-time error when there are irreconciliable demands, and the success (traditional) typing guarantees success when there are consistent promises.

i feel like the demand/promise terminology and the success/failure terminology is perhaps more complicated than just necessary and sufficient conditions for well-typed?

IF subclasses can weaken demands, to keep syntax simple, maybe not provide a mechanism to do this explicitly, but a promise by the subclass can weaken the demand?

hmm i think that perhaps my modified Liskov above is bad. Perhaps subclasses should only be able to increase promises, AND only to increase demands. E.g. if the Person superclass demands a Doctor, you can't make a person who will see a Chiropractor. And if the Person subclass promises to see a Doctor, you cant make a Person subclass who insists on a Pediatrician.

note that if you promise something on the input, you promise to accept any subclass of that, and if you promise it on your output, you promise to emit only subclasses of that

so it's clear what to do with demands and promises and function composition and how they relate to each other and success and failure, and it's clear that promises made by a superclass must bind a subclass. What's still unclear is if or how demands made by a superclass affect a subclass.

a little later: ok, i think the right way is for subclasses to be bound by the demands of the parent class. So both promises and demands are additive. The reason is that, just as you should only make a promise if you are sure that the value won't cause an error if that promise is fulfilled, you should not make a demand, it's dual, unless you are sure that the value WILL cause an error if the demand is NOT fulfilled. So if you find yourself wanting to remove demands in subclasses, then the original superclass demands must be in error. For example, if a Person demands to .see() only Doctors, and you have an AlternativeMedicineBeliever? who can .see() a Chiropractor, and Chiropractor is not a Doctor, then you should have made Person demand to see ThingsThatActLikeDoctors? (some superclass of both Doctors and Chiropractors) instead.

I briefly toyed with letting you relax demands in subclasses as a form of default reasoning (relaxing the demand for the non-default special cases), but now i think that's a different kind of thing that would require a separate design (or at least a separate deeply thinking through, even if the end result is to change this design).

When a function input is unified with the provider of that input, the unification is the union of all of the places where the function gets input,

largest supertype of the provider's type that is a subtype of the function input's type.

When multiple type constraints apply to the same value, they are combined as follows. The necessary type constraints are conjuncted/intersected to make the unified necessary constraint. And the sufficiency type constraints are conjuncted/intersected, and also conjuncted/intersected with the unified necessary constraint, to make the unified sufficiency constraint.

For example, if you have a function 'f x = x + 4', and +: Num a => a -> a, and a subtype of Int is "3", and 3 \member 3, and you do y = f 3, then the x in f x is unified with the a in +'s type, and so is the output of f. f starts with no type constraints, so it starts as Any -> Any. So for its input, we take the largest supertype of Any that is a subtype of +'s a, and we get Num a => a. For its output, we take the largest supertype of +'s a that is a subtype of Any, and we get Any. So we have f: Num a => a -> Any.

"3" is unified with "Int" to produce "Int". If elsewhere you also do z = f 4? The intersection of types "3" and "4" would be empty.

If the compiler calculates that two necessary type constraints being made on a value are irreconcilable (for instance, if 'y = f x' and if f demands its output to be a Str and y demands its type to be an Int), then at least one of these necessary constraints cannot be satisfied, so the compiler halts with a type error.

If the compiler calculates that a chain of operations on values is such that each operation's output has a sufficient constraint that matches the sufficient constraint of the next operation's input, then no runtime type checking needs to occur between the two values (for instance, if y = 3 +5, then since 3 and 5 each have a sufficiency constraint of Int, and + has a sufficiency constraint of Int -> Int, then in this case no type checking need be performed on the inputs of +). Otherwise, runtime type checking is provided for each term's sufficiency constraint.

two necessary type constraints being made on a value are irreconcilable (for instance, if 'y = f x' and if f demands its output to be a Str and y demands its type to be an Int), then at least one of these necessary constraints cannot be satisfied, so the compiler halts with a type error.

A subtype inherits all sufficient and all necessary type constraints of its supertype.

If a subtype has a sufficiency constraint which is a proper subset of the sufficiency constraint of its supertype, that is a type error. This can only occur when improper demands are made on the subtype which conflict with the sufficiency constraints of its supertype. (note/todo: how can the programmer control/avoid unwanted implicitly inferred sufficiency constraints on the supertype?)

The type of input to a multiple dispatch function is determined to be the disjunction of all input types defined in the dispatch cases (e.g. this disjunction forms both a necessary and a sufficiency constraint). (note: this means that if the dispatch cases can be found in multiple places in the source, they must all be gathered before type checking. If the user wants the program to compile despite leaving out some of the cases, they can add a default dispatch case).

largest supertype of the provider's type that is a subtype of the function input's type.

"3" is unified with "Int" to produce "Int". If elsewhere you also do z = f 4? The intersection of types "3" and "4" would be empty.

A subtype inherits all sufficient and all necessary type constraints of its supertype.

todo

So, e.g. if Person is a subtype of Child and a Pediatrician is NOT a subtype of Doctor, if the supertype Person has a method .see(x):

   and it makes a demand that x is a Doctor, and the subtype Child implements via multiple dispatch the case where child.see(x) is a Pediatrician, then in fact the implied promise is weakened, and a Child can only .see Pediatricians which are also Doctors

   and it makes a promise that if x is a Doctor, then person.see(x) is valid, and the subtype Child implements via multiple dispatch the case where child.see(x) is a Pediatrician, then a compile-time error is thrown, because the implied demand that x be a Pediatrician must be weakend by the superclass's promise, so the actual demand of Child is only that .see(x) be passed a Doctor, and this demand is not exhaustively covered by the multiple dispatch cases.

so the failure typing triggers a compile-time error when there are irreconciliable demands, and the success (traditional) typing guarantees success when there are consistent promises.

i feel like the demand/promise terminology and the success/failure terminology is perhaps more complicated than just necessary and sufficient conditions for well-typed?

IF subclasses can weaken demands, to keep syntax simple, maybe not provide a mechanism to do this explicitly, but a promise by the subclass can weaken the demand?

note that if you promise something on the input, you promise to accept any subclass of that, and if you promise it on your output, you promise to emit only subclasses of that

would like to define integers by induction with 0 and successor, and do pattern-matching recursion on them. but would also like to use the speed of integer math on the underlying platform.

so, Jasper needs the concept of an isomorphism between two representations of one type, and a mechanism to state these and for it to use them

note that in general this would also be helpful when we want to specify a mapping between some Jasper data, and how it should be physically represented, e.g. serialized or presented through an FFI. Note that the same Jasper data type could have multiple physical representations (e.g. C or Pascal-style strings). Note that this stuff gets a little tricky so we want support for homomorphisms, not just isomorphisms.

interfaces can help but can we do better? how do multiple interfaces work with ADTs and pattern matching? is this a perspective? a view?

i guess it should be a perspective. s(s(0)) and 2 are two perspectives on the same value.

so with the metadata, if you attach some metadata to a metadata perspective of a value in x whose int perspetive is "2", then if y = 2 + x, clearly y's int perspective will be 4, but will the metadata carry over? how would it be merged with the metadata from "2", if any? i guess you'd have to explicitly specify stuff like that?

so then does every operation have to be specified in every perspective? seems like this misses the point. certainly if you define "+" in the normal int perspective, you shouldn't have to also define it in the successor perspective. do you just define both directions of conversion between these two and then perform each operation in the perspective it is defined in, and then convert it on demand to the others?

or isomorphisms between two types. looks like Frank Atanassow is thinking along similar lines, although he seems ( http://www.informatik.uni-trier.de/~ley/pers/hd/a/Atanassow:Frank ) to want automatic inference of isomorphisms, i'll settle for manual:

" Re: Arrow

I've seen you mention your programming language before, but never any real details on it.

Yes, mostly because I cannot honestly answer all the questions I put forward above. I will do my best, though. (I can't give full answers here, anyway, though; it would take too much time.)

What problem does this language solve? How can I make it precise?

The most important problem I'm trying to address with Arrow is the fact that ML-like algebraic datatypes are not abstract or compositional enough to support interoperability on a large scale.

For example, I can define a datatype of naturals in a number of different ways. One as data Nat = Z

S Nat. Another is data Nat' = Z'

S' Nat. These differ only naming but obviously have the same models. Another is data Nat = Z ()

S Nat. This one just has an extra unit. Another is given by Succ Nat where data Succ a = Fresh

Old a. Here I have factored the datatype into two; but when I compose the factors, the result is only isomorphic to Nat, not equal. By applying isomorphisms in this fashion I can obviously come up with an infinite number of ways to represent the naturals, all unequal, and therefore not interoperable.

But in fact I've only just started. The representations above were all numerals in unary base, but that's very inefficient. Better is something like [Bool], which is a representation in base two. And of course I can do all the things to this that I did above. But why stop there? How about base ten, [Ten]? Or [Either Five Five]? Or all compositions using functors only isomorphic to []? Other factorizations? And so on...

Here is one reason this is problematic: a realistic program is one that uses libraries from many disparate sources. For them to interoperate, they need to use the same representations. But libraries developed in isolation are not likely to do so. At the very least, people will choose different names.

The way I am trying to address this is to make the type system know when two types are canonically isomorphic -- this means roughly that it knows when two types are mutually coercible in a unique way (no guessing -- there is exactly one coercion, or none). Then I add constructs which let you define "generic" functions that don't care, say, if they get as input a [Bool] or a Nat, only that they fall in the equivalence class (what I call a "genus").

More precisely, the way I do it is by identifying a basis of types from which all others are generable by canonical isomorphism. These are "type values"; all the result are "type redexes". If two types have the same type value, then they are intercoercible. This is enough to generate a coercion between them, but using that coercion costs you run-time, and I think you can do more.

What I want to do is classify types according to kinds which denote different bases. Then there will be a functor from one kind to another which corresponds to change-of-base as in linear algebra. Natural transformations for such functors should be the generic functions I am looking for, and I think that the change-of-base can all be done statically, essentially by normalizing a program's types at compile-time. It is a bit like partial evaluation, but without the binding-time analysis.

Mind you, the problem I described above is only one application of this idea. I could give you four or five others.

How can I show it solves this problem? How can I make it precise?

Formal methods.

Is there another solution? Do other languages solve this problem? How? What are the advantages of my solution? of their solution? What are the disadvantages of my solution? of their solution?

Yes, there are other solutions, if you compromise on one point or another. The most obvious thing is to promulgate standards, but standards require cooperation, and the problem I am trying to solve is one where parties can't cooperate because they don't know of each other. Also, there is the adage that there are so many standards to choose from; given two choices of n standards, you have n-squared combinations, so this doesn't scale.

You can go in the direction of dependent types and strengthen type equality. But this is a variation of the standards argument. Dependent types exploit type equality, and, for example, cartesian products are generally not associative up to equality, yet they are absolutely fundamental. As you force more and more types to be equal by adding conditions, you get things which look less and less like sets, and type checking gets harder and harder. One advantage of my approach is (I conjecture) that it supports principal type inference.

There are module systems like ML. These are variations on dependent types. Module systems have exactly the problem I am solving, namely that when you apply an opaque functor, you lose the representation type. ML solves this with type sharing, which is a type equality assertion. Sharing assertions are tedious to write, and break abstraction: they force representations to be equal, which is in practice often not what you want. The problem is that signatures only tell you types, not laws. Also, most module systems are limited by the fact that you can only instantiate a finite number of functors in a program. And there is polymorphic recursion...

My approach has disadvantages. First, I only know at present how to handle very weak sorts of isos, although I think I know how I can handle more. Second, all the information is structural, so you have to program "typefully". You will never get the compiler to infer, for example, that file handles and files are one-to-one. Third, my approach seems to require bringing in a lot of other apparently unrelated stuff like linearity, and it is very abstract (though I hope that it will seem obvious once you see some examples).

How can I show that my solution cannot be expressed in some other language?

Again, formal methods. Showing there exists no translation, etc.

That is, what is the unique property of my language which is lacking in others which enables a solution?

There are several interrelated things. First, the type system needs to know what is a type value and what is not and how are they related. Second, it needs to understand the difference between a type function and a functor/monad. (Haskell datatype declarations don't give enough information for this; and Haskell type class notions of Functor and Monad are not guaranteed to be actual functors or monads.) Third, it needs a notion of datakinds, functors which are not endofunctors and their associated polymorphic functions. I have heard Omega has these things, but I think they are built on equality, and I have not had time to check it out.

What parts of my language are essential to that unique property?

I cannot say yet, as I have not formulated a solution.

There... not too dishonest...

(Please note, by the way, that untyped languages suffer from exactly the same problem but have no hope of ever using such a solution, since the whole thing is enabled by static typing. This is why I always say that typing increases freedom -- typing means you don't have to depend on other people's cooperation. It does not matter how they represent things, as long as you get a static description of it.)

Update: I want to give you another application because, although it is trivial, it suggests the possibilities that open up with iso-equivalence. People often complain that Haskell does not support extensible sums or extensible products. But, it does, of course, in the sense that you can always add constructors to a datatype T by defining them in S and using Either T S as your extended type. Or you can leave a "hole" in your type by adding a type parameter; so you have a datatype T a and extend it with S a getting T (S a) which can be extended again. (This is how objects are encoded sometimes.) But no one sees this approach as a solution, because it is a real pain to dig past the extra constructor to get to your extended ones. If data T a = C1 Int

By Frank Atanassow at Mon, 2006-06-05 21:33"

C2 a and data S = C3 Char	C4 a, they don't want to pattern-match on C2 every time they need to get at C3; they want data T' = C1 Int	C3 Char. But this T' is iso to T (S ()), so generic functions in my sense treat them the same, and you get mostly what you want, without compromising type inference.
login or register to post comments

y'know, one property of ADTs that is important that we dont want with interfaces is that they're closed; if Tree = Node val left right

Empty, then we want to write switch statements (multiple dispatch etc) for each of those cases and then statically know that we covered every case.

but really this property of being a Node or an Empty can be just a property on the object. So you have a "Type of tree" field that holds an enum which can either be Node or Empty.

later: this is relevant to the last paragraph if not the others http://en.wikipedia.org/wiki/Tagged_union

apparently Scala and https://github.com/lihaoyi/macropy#case-classes call them 'case classes'. Rust has 'em. http://en.wikipedia.org/wiki/Tagged_union#2000s

" Some languages such as Scala allow base classes to be "sealed", and unify tagged unions with sealed base classes. "

the first use of this is what i've been calling Any, Object, or a Dynamic type: http://en.wikipedia.org/wiki/Variant_type (the second use of this is as cases in an ADT)

upvote

termain 9 hours ago

link

Ambivalent or apathetic regarding the typing debate?

Typing can become useful in numerical code when you move past operating on scalars. Column and row vectors needn't be confused, a two-vector and a complex number have different types, etc.

Also, physical quantities can have different types and a type system can be useful there.

I totally agree that for numerical code, operator overloading is of great utility.

http://www.megacz.com/thoughts/types.and.static.analyses.html

i don't understand this:

	If there is one thing I've suffered greatly in order to learn about dependently-typed programming, it is that one should never attempt to concatenate lists. If you find yourself doing this, you probably ought to be using option-trees instead.

" -- http://www.megacz.com/thoughts/option-trees.html

" Scheme's missing ingredient: first-class type tags I've always felt that Scheme does data and types wrong, but I couldn't really say what bothered me until yesterday. After all, R7RS now has define-record-type for creating disjoint types. But that ain't enuff!

In my new language, Wat, I've found a satisfying solution, inspired by Kernel's encapsulation types, but going beyond them slightly. There are two procedures:

    (make-type) returns a list of three elements: 1) a first-class type 2) a tagger function for tagging a value with that type, creating a tagged object and 3) an untagger function for extracting the value of tagged objects of that type.
    (type-of obj) returns the first-class type of an object. The crucial point is that type-of not only works for tagged objects of user-defined types created by make-type, but also for all built-in types. E.g. (type-of 12) will return the number first-class type.

This system has the benefits of Kernel's encapsulated types: only someone with access to the tagger function (capability) may create new instances of a type. Only someone with access to the untagger function may access the contents of tagged objects. So object contents are potentially fully encapsulated.

But at the same time, the fact that every object, including built-in ones, has a first-class type makes it possible to efficiently program generically. E.g. one may create Smalltalk-like dynamic dispatch by using the first-class types of objects as indexes into a virtual lookup table of a generic function. This is not possible in either Scheme or Kernel. In both languages, programming generically requires one to use a cascade of type predicates (e.g. number?).

Example in Wat:

; destructuringly bind the three elements ; returned by make-type (def (person-type person-tagger person-untagger) (make-type))

(define (make-person name email) (person-tagger (list name email)))

; untagger also performs a type check that ; person is in fact of type person (define (get-name person) (car (person-untagger person)))

(define (get-email person) (cadr (person-untagger person)))

(define p1 (make-person "Quux" "quux@example.com")) (get-name p1) --> "Quux" (get-email p1) --> "quux@example.com"

Posted by Manuel Simoni at 15:07 Labels: kernel, scheme, types "

http://axisofeval.blogspot.com/2012/08/schemes-missing-ingredient.html

"Every fast lisp has a type checker, there is Typed Racket and there is Dylan, which has a richer type system (union, intersection and singleton types which are first-class values) than most static languages. "

toread:

http://lambda-the-ultimate.org/node/2828

http://lambda-the-ultimate.org/node/412

random type theory slides

http://www-logic.stanford.edu/seminar/1213/Scott_LambdaEnum.pdf

http://www.cs.cmu.edu/~rwh/talks/Scottfest.ppt

Doug's "mutable data structure which becomse immutable after freeze" example (for what intersection types might be good for)

" dgreensp 3 days ago

link

I'm not sure it's realistic to say someone could make a career in a programming language without being able to look at a function and deduce the static type of a value without asking the computer.

As I wrote elsewhere in the comments:

When you use any language long enough, you end up needing to simulate pretty much every observable aspect of it yourself -- not including, for example, the garbage collector, the JVM bytecode verifier, or the GCC code optimizer, which are supposed to be transparent, but including the type inferencer, the JVM threading model, and the JavaScript? semicolon insertion rules, which are not. Some of these things you can steer clear of for a long time or even avoid forever by staying on the beaten path, but they lurk, waiting for the thousands of people who have run into bugs or compiler errors and needed to understand them.

I don't know HM too well, but it seems to have more in common with the dataflow analysis algorithms in optimizers and verifiers -- which programmers don't usually have to understand -- than the basic unidirectional type inference that already serves pretty well to reduce redundancy in the source code.

I could imagine citing C++ as a counterargument -- no one understands the type system, but people use the language anyway -- but it's still not an abstraction you don't have to understand to use, like a CPU.

reply "

" colanderman 3 days ago

link

Of course you could say it's a subset of unification, as it is.

It definitely is, and a limiting one at that. How does Hoon infer the type error in whatever is the Hoon equivalent of the, say, JavaScript? expression (a == b && b == 5 && a == "foo"), where a and b are function parameters?

upvote

urbit 3 days ago

link

But my previous reply is a little content-free, as what you're really asking for is how we solve the same problem that typeclasses solve in Haskell.

Very briefly, there's two ways of asking whether a caller can change the type of a parameter. The normal way is to say, are all the possible values of the caller's type in the set of nouns that the parameter defines? This is what I call "geometric polymorphism."

But there's also "generic polymorphism," in which we ask: will the Nock code that was generated for the parameter type, actually work with the type we're calling it with? A really stupid way to ask this is to recompile the function, with the parameters exchanged for the arguments. The Hoon approach is a little bit smarter than this but not much.

So, with these two approaches, you basically get variance and genericity, which is all of polymorphism. "

woah, scala has "compound types" which are intersection types! yay!

https://docs.google.com/presentation/d/1aT7PdK54zL70P-MlZuxgQ6ASYg6pdxn6_Fcc9AIwIKM/pub?start=true&loop=true&delayms=5000#slide=id.gbfb484d3_027

some of netention's types (paraphrased):

bool, text, int, real, url, object, spacepoint, timepoint, timerange

could we make something like 'spacepoint' a core type? what about relative vs. absolute spacepoints? but wouldn't even supposedly 'absolute' ones be relative if you moved the origin (e.g. if the origin were the earth.. or the sun.. or the center of the milky way..)

"a netention object is easily created and described and can be visualized in various ways" (e.g. map view, graph view (shows related objects linked by tags), list view)

object editing: tags such as human, action, geometry, cause, effect, goal

" a netention object describes a real object, animal or human; has a unique URI; is described by custom tags; describes abstract objects; time, position (length, events); can be geolocalized (GPS + radius)"

is the freedom afforded by scala's feature to allow you to choose between covariant, contravariant, and invariant really useful? i feel like the compiler should derive whether something is input-like, output-like, or both, and derive the variance from that

---

closer to type-level programming:

http://brandon.si/code/a-typefamilies-primer/

---

Finally, types are supported in hstore (slides 11-13). They are numeric, boolean, strings and NULL

---

http://www.infoq.com/presentations/data-types-issues

---

http://www.infoq.com/presentations/data-types-issues

slide 12

classify type systems into dimensions static/dynamic, weak/strong:

dynamic, weak: assembly dynamic, moderate: js, ruby dynamic, strong: python, clojure dynamic-ish but somewhat static, moderate: typescript, dart static-ish but somewhat dynamic, moderate: java, C# static-ish but somewhat dynamic, strong: scala static, weak: C static, strong: OCaml, Haskell

slide 13

classify type systems into dimensions coarse/detailed, weak/strong:

coarse, weak: C coarse, strong ("my way or the highway"): Go, Java 4, Pascal detailed, weak ("cutting corners"): eiffel, typescript, dart detailed, strong ("type it to the max"): scala, F#, haskell, ocaml

---

My Way or The Highway (coarse, strong):

" Simple type systems No generics Not that extensible by users

Simpler tooling
Highly normative "

Type it to the Max: " Rich* language to write types Type combination forms, including generics. Type systems often inspired by logic.

Often, turing complete

Where dynamic languages had the upper hand: – No type-imposed limits to expressiveness  Rich type system + escape hatches such as casts – No boilerplate  Type Inference – Easier for exploration  Bottom type Nothing, ???

Other Strengths of Dynamic • Simpler languages − Rich types add complexity • Fewer puzzling compiler errors

Cutting corners: Appeal to user’s intuitions (covariant generics). – Employee are Persons – So functions from Employees to Employers are also functions from Persons to Employers, right? • Embrace unsoundness. • Easy, and superficially simple. • But, fundamentally, a hack. • Can we build great designs on false theories?

Precision Soundness Simplicity Take Any Two?

---

http://www.cs.uwm.edu/~boyland/fool2012/papers/fool2012_submission_3.pdf

looks promising

http://www.infoq.com/presentations/data-types-issues

i haven't watched this talk yet, perhaps they should, at least, starting with 'Types in Scala' and continuing on from there

slide 27

Abstractions

Two fundamental forms – Parameters (positional, functional) – Abstract Members (name-based, modular)

Types in Scala

Functional:

scala.collection.BitSet? Named Type Channel with Logged Compound Type Channel { def close(): Unit } Refined Type

Modular: List[String] Parameterized List[T] forSome { type T } Existential Type List Higher-Kinded

Orthogonal Design

crossproduct of modular and functional, where:

Modular:

Named T { ... } T with U

Functional:

T[U] T[_] Exists T

Dot and Dotty DOT: Calculus for Dependent Object Types Dotty: A Scala-Like Language with DOT as its core

paper: "dependent object types" http://www.cs.uwm.edu/~boyland/fool2012/papers/fool2012_submission_3.pdf " Abstract We propose a new type-theoretic foundation of Scala and languages like it: the Dependent Object Types (DOT) calculus. DOT models Scala’s path-dependent types, abstract type members and its mix- ture of nominal and structural typing through the use of refinement types. The core formalism makes no attempt to model inheritance and mixin composition. DOT normalizes Scala’s type system by unifying the constructs for type members and by providing clas- sical intersection and union types which simplify greatest lower bound and least upper bound computations. In this paper, we present the DOT calculus, both formally and informally. We also discuss our work-in-progress to prove type- safety of the calculus "

Types in Dotty scala.collection.BitSet? Channel & Logged Named Type Intersection Type Channel { def close(): Unit } Refined Type ( List[String] Parameterized )

removed: List[T] forSome { tpe T } Existential Type List Higher-Kinded

Modelling Generics class Set[T] { ... }  class Set { type $T }  Set { type $T = String } Set[String] class List[+T] { ... }  class List { type $T }List[String]  List { type $T <: String } Parameters  Abstract members Arguments  Refinements

Making Parameters Public class Set[type Elem] {...} class Set { type Elem ...} Set[String] Set { type Elem = String } class List[type +Elem] {...} class List { type Elem ...}List[String] List { type Elem <: String } Analogous to “val” parameters: class C(val fld: Int) class C { val fld: Int }

Expressing Existentials What is the type of Lists with arbitrary element type? Previously: List[_] List[T] forSome { type T } Now: List (Types can have abstract members)

Expressing Higher-Kinded • What is the type of List constructors? • Previously: List • Now: List • Can always instantiate later: type X = List X { type T = String } X[String]

In a Nutshell In this system, Existential = Higher-kinded In fact, both are just types with abstract members. We do not distinguish between types and type constructors.

Native Meets and Joins • The horrible type error message came from a computed join of two types. • Problem: In Scala, the least upper bound of two types can be infinitely large. • Adding native & and

types fixes that.

DOT expr.member Type = path.TypeName?

Def = val x: Type = Expr >:

Type { Defs }

...

def f(y: Type): Type = Expr

type T <: Type

extends

Subtyping Fundamental relation: T1 <: T2 T1 is a subtype of T2. Comes in many guises: Implementation matches Interface Type class extension Signature ascription

---

from nimrod:

http://nimrod-lang.org/talk01/slides.html#%2824%29

01 # in system module (simplified): 02 proc `$` [T: object](x: T): string = 03 result = "(" 04 for name, value in fieldPairs(x): 05 result.add("$1: $2\n" % [name, $value]) 06 result.add(")")

Notes:

    [T: object] -- generic type + constraint
    relies on builtin fieldPairs iterator

---

ek 1 day ago

link

Unfortunately this contribution is inhibited from being significant in value by the fact that TypeScript? doesn't support full gradual typing [0] and has an intentionally unsound type system [1].

[0] http://siek.blogspot.com/2012/10/is-typescript-gradually-typed-part-1.html

[1] https://typescript.codeplex.com/discussions/428572

brandonbloom 1 day ago

link

Accidentally unsound is a problem, but intentionally unsound just means that they balanced soundness against other design criteria such as completeness, complexity, and tooling capability, then decided that they were willing to trade soundness for other things.

For onlookers, here's a lucid explanation of soundness and completeness in type systems: http://eschew.wordpress.com/2009/08/31/sound-and-complete/

The author of that article argues that soundness is more important than completeness, but concedes that there are other ways to view the world too.

The bottom line is that most ad-hoc lint tools are both unsound and incomplete, but they are still useful for finding bugs!

ek 1 day ago

link

Unfortunately this contribution is inhibited from being significant in value by the fact that TypeScript? doesn't support full gradual typing [0] and has an intentionally unsound type system [1].

[0] http://siek.blogspot.com/2012/10/is-typescript-gradually-typed-part-1.html

[1] https://typescript.codeplex.com/discussions/428572

brandonbloom 1 day ago

link

For onlookers, here's a lucid explanation of soundness and completeness in type systems: http://eschew.wordpress.com/2009/08/31/sound-and-complete/

The author of that article argues that soundness is more important than completeness, but concedes that there are other ways to view the world too.

The bottom line is that most ad-hoc lint tools are both unsound and incomplete, but they are still useful for finding bugs!

http://siek.blogspot.com/2012/10/is-typescript-gradually-typed-part-1.html

complaint about golang polymorphism on return type, as applied to channel reads:

" Another syntax / semantics oddity is the behavior of reading from channels (like a pipe). Whether a read from a channel blocks depends on how the return value is used:

 res := <- queue /* waits if the queue is empty */
 res, ok := <- queue /* returns immediately if the queue is empty */

This bears repeating: the behavior of a channel read depends on how the return value is (will be) used. This seems like a violation of the laws of time and space! " -- http://ridiculousfish.com/blog/posts/go_bloviations.html#go_syntax

" With modern type systems employing structural subtyping, row polymorphism, proper variance and covariance handling and other advanced techniques static typing becomes a very powerful tool.

On the other hand, there are correct programs which we still don't know how to type. This means that there are programs we just can't write in statically typed languages, at least not yet. " klibertp, https://news.ycombinator.com/item?id=7861823

---

ocaml has the 'assert' escape hatch from the type system:

" If something can't be type-checked then you just add an "assert" and do the check dynamically.

For example, in 0install I need to ask the SAT solver which version of a program I should use. I pass it a load of SAT variables, each containing a user data field with the attached program details, run the solver, and get back the selected variable. The SAT problem also contains other variables, with other kinds of data.

I know the user data field for the selected result will be of type program details, but I couldn't figure out how to let the compiler check this statically. It's not a problem; I just check the type and "assert false" if it's wrong. I still get the benefit of static checking almost everywhere else in the code. "

notes-computer-jasper-jasperTypeNotes1

things that make type systems confusing

the Haskell type class defn syntax

http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function

example from http://kawagner.blogspot.com/2006/12/my-haskell-experience.html

taints and dataflow analysis (?)

> type engine generates

co/contravariance, subtyping, promise/demand, failure and success typings