Bayle Shanks's website: proj-oot-ootTypeNotes9

mbell 6 days ago [-]

I wonder what the reason for not supporting structural typing is. It seems like a very natural fit for Ruby.

ptarjan 6 days ago [-]

We believe that the main goal of a typechecker is to give you nice error messages. We've found giving names to things makes it easier for folks to reason about their errors, and introducing names for interfaces isn't that onerous at Stripe or with our beta testers.

We aren't opposed to eventually support it, but we'd like to see how it goes with the current form first.

---

Is this all ML modules are?

" Java's interface system...works well, but unfortunately it requires that every class you want to use declares itself to implement a particular bundle of functionality ahead of time. ML's functor system (no relation to the things C++ calls functors, which in ML terms would simply be higher-order functions) deals with this problem nicely using the concept of a parameterized module system....

signature PRINTABLE = sig type t val printT : t -> unit end

functor List(T : PRINTABLE) = struct datatype List = ListNode? of T.t * List

Empty

  exception EmptyList
  (* ... other functions as before ... *)

  fun print (ListNode (i,_)) = T.printT iend

structure L = List(struct type t = int val printT = print o (Int.toString) end);

L.print (L.ListNode? (5, L.ListNode?(6, L.Empty))); " -- [1]

(i don't understand why the last line is L.print instead of L.printT? is that a mistake or is it something i don't understand?)

i see two main features there:

potentially multiple related types are grouped together into a 'signature' (i guess a sig is sort of like a Haskell typeclass? except it seems to me that a signature is more general because it could relate many types, not just one)
the instantiation of a signature with specific types and functions (a 'structure') is bound to a name. This solves the issue Haskell has where you can only have one way to implement a typeclass's demands for each given type.

sounds great, if so. And here i thought they were going to be complicated. Let's do it.

do they do everything typeclasses do? No.

you have to figure out which 'structure' you are using, the compiler doesn't figure that out for you. And i think i read that there are times when i think you have to pass the structure around to make sure that you use the right structure with the right value (like if you make a hashmap or something); if you need to make sure you use the right structure everytime you access some value, Haskell provides this (it calls this property 'coherence') and it's safer to have the compiler check this.
i've read that they don't do ad-hoc polymorphism (at least not without some weird hack) [2]. Which seems odd to me, because isn't 'val printT = print o (Int.toString)' above ad-hoc polymorphism?

oh, i think i see; the previous two list items are the same. You can't just say 'printT listobj', you have to say L.printT listobj, which means you have to already have an L representing the type that you care about, and you have the possibility of using the wrong L. Which i guess OCaml takes care of by having different infix operators for different types, so instead of IntArith?.+ and StringArith?.+ i guess they have different punctuation characters for each. Whereas in Haskell, there is only one instance of + for ints and one instance of + for strs, so there's only one way in your whole program that and int can +, so you can just say a + b and Haskell figures out which one to use by type.

i think this post supports my guess in the previous paragraph: [3].

---

so of course you'd like to have both the ad-hoc polymorphism and also some of the benefits of the coherency of Haskell typeclasses with the expressiveness of ML modules.

So it seems to me that typeclasses 'guess' the right typeclass-instance most of the time, and then some of the time you need to override it ML-module-style. Typeclasses have 'coherency' (only one typeclass-instance for each typeclass-type pair) throughout the whole program but that seems a bit much. But objects probably do need 'coherency' even outside of their home module (e.g. one module produces an integer, then elsewhere a sort routine compares numbers), so maybe it has to go with the value, not with the module. But on the other hand i imagine some module might like to do something unusual and override the default typeclass-instance, and they probably want to do it throughout their module.

So it seems to me that one thing to do would be:

Haskell-style coherency by default
when there are multiple typeclass-instances present in the program for one typeclass, a default choice is made which runs with the value, and this default choice is chosen by which typeclass-instance is lexically closest to the point of value creation
- actually 'lexically closer' may not be great, see [4] which points out that "removing an implicit from the scope...could just cause a call to silently switch to using a different implicit". It may still be worth it though, just something to think about.
- this default can be overriden in four ways: on a per-value basis, or on a per-module basis, or only for one value and only when within one module; or at the point of usage (ML-style).

---

in fact OCaml special-cases comparison ops! yuck!

https://stackoverflow.com/questions/33413163/ocaml-why-comparison-operator-are-type-agnostic-whereas-arithmetic-ones-are-n?noredirect=1&lq=1

---

on OCaml:

" Ad-hoc polymorphism usually refers to being able to declare the same name (usually a function) with different types, e.g. + : int -> int -> int and + : float -> float -> float in SML. These are different functions, and they can act in totally different ways, but the compiler or interpreter chooses the appropriate one depending on the context. I can't think of any instances of ad-hoc polymorphism in OCaml. It is common in C++ and Java, however. ... Parametric polymorphism is when a single function can work with an argument of any type due to not trying to look into that argument's structure. For example, cons : 'a -> 'a list -> 'a list is able to prepend a value v of any type to a list of values of the same type...

Subtype polymorphism is when you can use values of type t where values of type u are expected. This could be because type t supports all the operations of type u, or because t's structure can be used where u is expected. Examples of this would be subclassing (perhaps a Bus can be used wherever a Vehicle can), or polymorphic variants (you can use 'A

'B where 'A

'C is expected).

EDIT per comment

Note, however, that subtyping has to be requested explicitly in OCaml. If you have, for example, a function f : u -> int, and you want to apply it to v : t where t is a subtype of u, you have to write f (v :> u). The (v :> u) syntax is a type coercion.

OCaml also supports row polymorphism, which is a form of parametric polymorphism with constraints. If f is instead f : #u -> int (for object types) or f : [< u] -> int (for polymorphic variants), the #u/[< u] syntax represents a type variable, similar to 'a, but which can only be replaced with the respective "subtypes" of u (in the restricted sense that they can support more fields/less constructors, respectively). Then, you can do f v without the coercion. OCaml automatically infers types that use row polymorphism for many expressions involving polymorphic variants and objects, but you have to write the types explicitly if you are creating a signature.

There are more usages and considerations to row polymorphism. I've neglected the actual row variables and additional syntax, and only described something that looks like bounded quantification (as in Java generics). More detailed and accurate discussion of row polymorphism, its name, and/or its formalism, is perhaps best saved for separate questions. "

so mb we should have all of:

ad-hoc polymorphism
parametric polymorphism
subtype polymorphism
row polymorphism

---

" The trouble with typeclasses

Haskell “enforces” typeclass coherence by asking that you define all the instances for a type in the same module (really, file) where you define that type. Whenever you import anything from that module, you bring all these instances into scope. It “works”, but it’s a kludge - when defining the type Foo a, you can’t anticipate (nor are you necessarily aware of) all the instances you might wish to give for Foo. We want an “open universe” where people can after the fact discover common structure and layer it onto existing data types. I feel this is an important property for a language ecosystem.

You can relax things a bit and give “orphan instances” for a type in other modules, but in so doing, you lose typeclass coherence guarantees, and it’s now possible for different parts of your code to use different Foo instances, leading to subtle bugs. Aside: I’d say that most working Haskell programmers make their peace with orphan instances in some cases while feeling a bit dirty.

A workaround that people suggest is to create a newtype around Foo in a separate module, and then give additional instances for this newtype, call it WrappedFoo?. This starts to fall apart anytime the type Foo is being embedded in another type. A good example is the class ToJSON? a, which is used by Aeson to encode values of type a to JSON. If we have a type Bar [Qux (Int, Foo Blah)] String, and we want to make Bar an instance of ToJSON?, we can’t easily use the ToJSON? a => ToJSON? (WrappedFoo? a) instance. The creator of Qux should not need to be aware of JSON at all.

I also dislike this sort of workaround because it’s one of those non-obvious language-specific hacks that I’d be embarrassed to explain to a student. It’s magic (in a bad way) and it doesn’t even work in general.

First-class representation of dictionaries have their own problems:

Typeclass coherence issues. I don’t really agree that typeclasses solve this problem in general.
It’s tedious to plumb around value-level dictionaries. Scala implicits “solve” this problem in an ad hoc way.
Related: subtyping of typeclasses (or lack thereof) is another source of plumbing. The lens library would not be usable without subtyping of typeclasses.

It seems like we can do better, but I’m not sure how. " -- https://pchiusano.github.io/2018-02-13/typeclasses.html (whole article quoted; there is additional discussion at this link though)

---

" Modular type classes and modular implicits are both proposals to integrate ad-hoc polymorphism into traditional ML. Though OCaml has already decided to head for the implicits route, I was wondering if any one with a little more knowledge of PL could discuss the pros and cons of the two approaches?

Modular type classes (mb also https://people.mpi-sws.org/~dreyer/papers/mtc/main-long.pdf )

Modular implicits

...

The core idea of modular type classes is that if we think of type-class elaboration as "guessing" a module instead of a value, a lot of the nice advanced features of type-classes come for free. In other words, there is a large overlap in the features of type class systems and module systems, and building type classes on top of modules allows to factor our this redundancy. This is a very nice exposition.

The main problem with "just Modular type classes" as originally presented is that it does not tackle the problem of coherence/consistency of elaboration: can we guarantee at any given point of the program that there is exactly one possible resolution choice for a type-class use?

In Haskell this is ensured by imposing non-modular restrictions. In modular type classes, the issue is worked around by imposing a very strict restriction on the places where new instances may be declared. You can think of Modular Implicits as an attempt to go sensibly further on that question of coherence/consistency.

This is the exciting, interesting question that must be asked about type-classes in a modular language.

(Interestingly, Coq solves the problem in a completely different way, which is to give up on consistency but use dependent types to statically represent elaboration choices, and thus let users reason about elaboration precisely even in an incoherent world.) "

" level 3 lpw25 7 points · 3 years ago · edited 3 years ago

It is worth pointing out the modular implicits are much easier to reason about than Scala Implicits:

    Scala implicits have a complex set of scoping rules for determining what implicits are in scope, whereas modular implicits uses traditional lexical scoping.

    Scala implicits do not always error when there are two choices for an implicit argument in scope. Instead there is a complex set of rules for deciding which of the two choices is "better" and there is only an error if neither one is better than the other. This means that knowing your desired implicit is in scope is not sufficient: you must know that it is better than all other implicits in scope. Whereas modular implicits treats all ambiguity as an error.

That second point is particularly important for refactoring: removing an implicit from the scope of a call will always be an error with modular implicits, whereas with Scala it could just cause a call to silently switch to using a different implicit. Similarly, adding an additional implicit to the scope of a call will always produce an error with modular implcits, whereas with Scala the new implicit may be silently ignored or silently override the existing implicit. "

" ezyang 2 points · 3 years ago

Let me stick my neck out, and say there is not really that much of a difference between modular type classes and modular implicits. The crux of the issue deals with what meaning you should ascribe to a use of a type class / implicit inside a function, when a valid instance / implicit value is in scope. In modular type classes, several different ways of getting canonicity are considered, one of which suggests that annotating the type of all using declarations is sufficient to say if instances from the caller should be used (though this is not the approach they decided on). In modular implicits, the explicit (ha!) implicit argument indicates that filling in of the implicit should be deferred to the caller. It's just two ways to skin the cat: one is type-based, the other is argument-based.

Haskell's use of canonicity of type class instances really isn't modular at all, and if you, say, transcribed containers into OCaml with modular implicits, it would be very un-ML. You want the type identity of a map to depend on the module you selected to define the ordering, so that maps ordered ascending are distinguished form maps ordered descending. This is discussed, e.g., in Section 4.4 of the modular implicits paper. This is a HUGE distinction between Haskell type classes, and these ML proposals.

lpw25 3 points · 3 years ago

I agree that the difference between type classes in Haskell and the two proposals for ML is larger than the difference between them. However I think you are under-selling the difference between the two proposals for ML.

The difference is not just about whether you indicate where resolution is resolved at through a type specification or through an argument specification: the difference is also about whether the system assumes resolution takes place inside the function unless told otherwise. In modular type classes you must always give a type annotation to any function which uses a function with implicit parameters, whilst with modular implicits it can safely assume that you want implicit parameters resolved locally unless told otherwise (by explicitly adding an implicit parameter). This is the benefit of making the distinction syntactic rather than type-based. When you make the distinction type-based you need to give up on some of the properties of ML type-inference.

Another way of looking at the difference is that Modular Type Classes attempts to preserve canonicity (a la Haskell type classes) locally, whilst Modular Implicits just abandons canonicity for non-ambiguity. Since canonicity does not play well with the rest of ML the result is that Modular Implicits fits into the rest of the language more smoothly -- avoiding the ad-hoc restrictions required to make Modular Type Classes work.

ezyang 1 point · 3 years ago

From a practical perspective, the idea that implicit parameters should show up in the expression language is an important one. Haskell's implicit parameters does not follow this design principle (it is type directed), and we would always joke at John Launchbury that implicit parameters were a big mistake. Speaking of which, someone should propose a design for explicit implicit parameters in Haskell...

But in both modular type classes and modular implicits you still have to explicitly annotate your implicit parameters with types; and type inference fails in a number of situations (e.g. as in section 2.4.3) all stemming from the fact that you cannot infer module signatures.

lpw25 2 points · 3 years ago · edited 3 years ago

Actually my earlier comment doesn't really capture the difference I'm trying to get at. In Modular Type Classes you don't annotate functions at all: you annotate top-level definitions. It doesn't require you to annotate the functions because it doesn't care about whether the implicit parameters are resolved inside the function or outside -- it only cares about whether the implicit parameters are resolved inside the top-level definition or outside it.

This works by preserving canonicity inside top-level definitions (so no introducing new implicit modules inside them), and by forcing users to give type-annotations to every toplevel definition.

Modular implicits, on the other hand, force you to be explicit about all implicit parameters, but do not require equivalent annotations on functions without implicit parameters. (They also allow implicit modules to be introduced anywhere, not just at top-level).

The point I'm trying to make is that the difference is not just about how the information is specified (type annotations vs expression syntax) but actually the information being provided is subtly different.

To expand slightly, consider a classic example of the issue of coherence:

instance Show a = > Show [a] where show l = show_list l

show_as_list x = show [x ]

which could be given the type

show_as_list :: Show [a] = > a -> String

or the type

show_as_list :: Show a = > a -> String

Haskell basically says those two types are the same since canonicity means it doesn't care about the difference. Modular Type Classes says that those two types are the same locally and only need to be distinguished in the types of top-level definitions. Whereas Modular Implicits says that those types are completely separate and are created using syntactically different definitions. This distinction also leads to modular implicits having types such as ({S : Show} -> S.t -> int) -> int which I do not believe occur in Haskell or Modular Type Classes. level 6 glaebhoerl 1 point · 3 years ago

(Show a => a -> Int) -> Int is a legal type in Glasgow Haskell, if that's actually the same type. You see types like it when you have evidence for a constraint stored in a datatype, and want to unpack it and pass it to a function which requires that constraint.

-- [5]

---

http://lambda-the-ultimate.org/node/5121 " 1ML — Core and modules united

This is an amazing paper that I've been anticipating ever since I read the abstract about half a year ago.

    ML is two languages in one: there is the core, with types and expressions, and there are modules, with signatures, structures and functors. Modules form a separate, higher-order functional language on top of the core. There are both practical and technical reasons for this stratification; yet, it creates substantial duplication in syntax and semantics, and it reduces expressiveness. For example, selecting a module cannot be made a dynamic decision. Language extensions allowing modules to be packaged up as first-class values have been proposed and implemented in different variations. However, they remedy expressiveness only to some extent, are syntactically cumbersome, and do not alleviate redundancy.

    We propose a redesign of ML in which modules are truly first-class values, and core and module layer are unified into one language. In this "1ML", functions, functors, and even type constructors are one and the same construct; likewise, no distinction is made between structures, records, or tuples. Or viewed the other way round, everything is just ("a mode of use of") modules. Yet, 1ML does not required dependent types, and its type structure is expressible in terms of plain System Fω, in a minor variation of our F-ing modules approach. We introduce both an explicitly typed version of 1ML, and an extension with Damas/Milner-style implicit quantification. Type inference for this language is not complete, but, we argue, not substantially worse than for Standard ML.

    An alternative view is that 1ML is a user-friendly surface syntax for System Fω that allows combining term and type abstraction in a more compositional manner than the bare calculus.

Basically, it unifies module and core languages, without dependent types, while supporting type inference without requiring more type annotations than OCaml (i.e. only on module or higher-rank types). It leaves applicative functors for future work.

An extended technical report and a prototype interpreter are available here. "

"What Andreas did is that he made modules into first-class objects while keeping the type system decidable; in essence, if I understand his paper correctly, he makes sure that types are not "expanded" when type-checking, i.e. abstract types cannot be instantiated by module signatures (which might contain yet other abstract types, which would need to be instantiated, and so on, exploding the type-checking system)."

"Predicativity

Indeed, but working with ML involves dealing with various issues that are the result of design decisions taken in ML before considering that records and modules (and type-classes) might be merged into one construct. The paper goes to a lot of trouble to handle opaque associated types, when I would just go with type-families and avoid the problem. If types cannot depend on values you lose nothing by doing this, and get a cleaner, more easily understandable type system. I agree though that predicativity is the key property, but my approach is to consider only type system features that are predicative."

(there is a discussion below about whether type families, whatever that is, really are modules, and about the definition of a module; i didn't copy it here but you can read it at the source, http://lambda-the-ultimate.org/node/5121 ; and there is another similar discussion below in the same comment section; i will quote one more from that one because it suggests that type families can't be used for ad-hoc polymorphism and so would be of less interest to me)

" type classes of module signatures.

I want to gather together my thoughts scattered above to get some opinions. I will try and use ML like terms to avoid confusion. If we say modules must only be values (no types that depend on the module instance), and all modules must have a signature (which is a type and can therefore have associated type definitions), we avoid the complexities involved in 1ML. We can have functors as type level functions (type-families) on signatures. If we wanted modules with associated types, they would have different signatures, but could be related by a type-class. Any problems with this approach? By Keean Schupke at Sat, 2015-03-21 23:05

Well

First, you will have a tough time arguing that this resembles modules. Second, throwing in type families does not avoid complexities, it adds them – the type theory of type families is significantly more complicated than System F. By Andreas Rossberg at Sun, 2015-03-22 09:35

and the following comment explains more about how predicativity is involved and how 1ML provides it:

"Only small types

The paper, which I've just skimmed so far, had some language about maintaining predicativity by only [ ... words ... ] small types. Would someone, Andreas perhaps, explain what kind of module use is ruled out to make this nice? Bonus points for example code. By Matt M at Fri, 2015-03-13 01:48

Small = not a module type

No module you can write in ordinary ML is ruled out (*). The way to think about this is that instead of distinguishing "types" and "signatures" syntactically, 1ML distinguishes them semantically, such that "small" corresponds to what is an ordinary type in traditional ML. There effectively is only one place where the distinction is relevant, and that is signature matching: you can have the equivalent of abstract types, but not the equivalent of abstract signatures:

type S = {type T} ;; large, because it contains an abstract type A = {type T = int} :> S ;; fine, abstracting int B = {type T = S} :> S ;; error, S is large, cannot be abstracted

(*) Other than cases of OCaml's abstract signatures, which are undecidable for the same reason. "

---

related to ML modules vs haskell type classes:

http://lambda-the-ultimate.org/node/1558

although apparently going by the comments to that post, this is not directly relevant to OCaml modules, which are apparently different(!)

---

and here is a great somewhat/surprisingly readable note on the 'dictionary passing' view of Haskell type classes vs ML modules, as well as on compilation techniques for typeclasses:

http://okmij.org/ftp/Computation/typeclass.html

---

more on Haskell typeclass deficiency vs ML modules, this time addressing the 'just use associated (data)types" comment that i saw somewhere else:

http://augustss.blogspot.com/2008/12/somewhat-failed-adventure-in-haskell.html

i haven't read it yet; i skimmed it and it looks good but would take a long time for me to read

---

[6] reminds that it would be nice to have a way to explicitly add laws to typeclasses (he has other proposals too but i don't agree with all of them)

---

Sounds like we want compile checked generics like in rust Swift haskell. "At the language level this is very similar to the kind of type system you need to implement generics with interface support using the boxing approach to generics, which is why Rust can support both using the same system! Rust 2018 even added a uniform syntax where a v: &impl SomeTrait? parameter gets monomorphized but a v: &dyn SomeTrait? parameter uses boxing. This property also allows compilers like Swift’s and Haskell’s GHC to monomorphize as an optimization even though they default to boxing."

---

 timothycrosley 5 hours ago [-]

> TypeScript? got typings right.

I have not used TypeScript?, but looking at it's documentation the syntax for type annotations look identical. Would you be willing to expand on why you think its approach is better / how it's different?

scrollaway 5 hours ago [-]

No importing basic types, using binary operators instead of awful things like Union and bracket accessors and what not, inline interfaces...

Try it a bit. it truly is enjoyable. Fifteen years of python and I'm still enjoying TypeScript? more.

Mypy is limited by annotations having to be compatible python syntax.

timothycrosley 5 hours ago [-]

> No importing basic type

This at least is being solved in Python: https://www.python.org/dev/peps/pep-0563/

> inline interfaces

Typed dicts in 3.8 look pretty similar at cursory glance https://www.python.org/dev/peps/pep-0589/

I'll definitely play around with it more!

scrollaway 5 hours ago [-]

Watch the difference: A variable that can be an object with two elements (one of them being a list of strings), or a tuple of exactly two strings.

    // typescript
    let t: {a: string[], b: number} | [string, string]

python 3.8 from typing import TypedDict?, Tuple, Union class SomeTypedDict?(TypedDict?): a: List[str] b: Union[float, int]

    t: Union[SomeTypedDict, Tuple[str, str]]

---

" (More) Recursive Type Aliases #

[Playground](/play/#example/recursive-type-references

Type aliases have always had a limitation in how they could be “recursively” referenced. The reason is that any use of a type alias needs to be able to substitute itself with whatever it aliases. In some cases, that’s not possible, so the compiler rejects certain recursive aliases like the following:

type Foo = Foo;

This is a reasonable restriction because any use of Foo would need to be replaced with Foo which would need to be replaced with Foo which would need to be replaced with Foo which… well, hopefully you get the idea! In the end, there isn’t a type that makes sense in place of Foo.

This is fairly consistent with how other languages treat type aliases, but it does give rise to some slightly surprising scenarios for how users leverage the feature. For example, in TypeScript? 3.6 and prior, the following causes an error.

type ValueOrArray?<T> = T

~~~~~~~~~~~~ error: Type alias 'ValueOrArray?' circularly references itself.

Array<ValueOrArray?<T>>;

This is strange because there is technically nothing wrong with any use users could always write what was effectively the same code by introducing an interface.

type ValueOrArray?<T> = T

ArrayOfValueOrArray?<T>;

interface ArrayOfValueOrArray?<T> extends Array<ValueOrArray?<T>> {}

Because interfaces (and other object types) introduce a level of indirection and their full structure doesn’t need to be eagerly built out, TypeScript? has no problem working with this structure.

But workaround of introducing the interface wasn’t intuitive for users. And in principle there really wasn’t anything wrong with the original version of ValueOrArray? that used Array directly. If the compiler was a little bit “lazier” and only calculated the type arguments to Array when necessary, then TypeScript? could express these correctly.

That’s exactly what TypeScript? 3.7 introduces. At the “top level” of a type alias, TypeScript? will defer resolving type arguments to permit these patterns.

This means that code like the following that was trying to represent JSON…

type Json =

string

number

boolean

null

JsonObject?

JsonArray?;

interface JsonObject? { [property: string]: Json; }

interface JsonArray? extends Array<Json> {}

can finally be rewritten without helper interfaces.

type Json =

string

number

boolean

null

{ [property: string]: Json }

Json[];

This new relaxation also lets us recursively reference type aliases in tuples as well. The following code which used to error is now valid TypeScript? code.

type VirtualNode? =

string

[string, { [key: string]: any }, ...VirtualNode?[]];

const myNode: VirtualNode? = ["div", { id: "parent" }, ["div", { id: "first-child" }, "I'm the first child"], ["div", { id: "second-child" }, "I'm the second child"] ];

For more information, you can read up on the original pull request.

https://github.com/microsoft/TypeScript/pull/33050

--declaration and --allowJs # "

---

(regarding "make illegal states unrepresentable" with eg a 'nonempty list' datatype and https://news.ycombinator.com/item?id=21476261 ):

zanny 6 hours ago [-]

Rust is extremely good at this with newtype syntax and the ability to apply impls and derives to masking types.

Your JSON data might be a string, but having a Json type that guarantees its valid Json is way better. Same with stringified Base64, a constrained numeric value for a tunable, etc. Because using from impls on these types lets the compiler figure out almost all invalid type usages at compile time and give you eloquent feedback on it.

---

atombender 21 days ago [-]

I agree that Go forces you to focus on mechanics of your code more so than some "fancier" languages.

However, Go's poor type system also forces you to write worse and less safe code than you want.

For example, the lack of enums and sum types means it's hard to represent, safely and efficiently, anything that should be the same type but have different values within a strict set of allowed values. The nearest you get in Go to an enum is:

  type Color int
  const (
    Red Color = iota
    Blue
    Green
  )

Alternatively, can do the same with a string:

  type Color string
  const Red Color = "red"
  // etc.

This gives you some type safety, since a Color requires casting to force an int or string or whatever to become it. But that doesn't give you real safety, since you can do:

  var c Color = Color("fnord")

This comes up whenever you're handling foreign input. JSON unmarshaling, for example. You can override the unmarshalling to add some validation, but it won't solve your other cases; it will always be possible to accidentally accept something bad. Not to mention that the "zero value" of such a value is usually wrong:

  type Color string
  var c color // Empty string, not valid!

A more safe way is to hide the value:

  type Color interface {
    isColor()
  }
  type colorValue string
  func (colorValue) isColor() {}
  var Red Color = colorValue("red") // etc.

Now nobody can create an invalid value, and the zero value is nil, not an invalid enum value. But this is unnecessarily complicated and shouldn't be necessary in a modern language in 2019.

The case of genuine sum types is worse. Your best bet is to use sealed interfaces:

  type Action interface {
    isAction()
  }

  type TurnLeft struct{}
  func (TurnLeft) isAction() {}

  type MoveForward struct{
    Steps int
  }
  func (MoveForward) isAction() {}

There are some downsides. You have to use static analysis tools (go-sumtype is good) to make sure your type switches are exhaustive. You get a performance penalty from having to wrap all values in an interface. And if you're going to serialize or deserialize this (e.g. JSON), you will be writing a whole bunch of logic to read and write such "polymorphic" values.

strken 21 days ago [-]

Agree with this. There are couple of observations I'd make, though.

Firstly, "enums" using iota should always be defined as either

    const (
        Red Colour = iota + 1
        Blue
        Green
    )

    const (
        Unknown Colour = iota
        Red
        Green
        Blue
    )

to avoid the zero value problem.

Secondly, and this is a personal preference, I've really enjoyed not having to work with sum types. In practice other programmers seem to use them when a public interface would have been sufficient, and it's convenient to be able to do:

    // original package 
    type Position struct {
        X, Y int
        Rot Rotation
    }

    type Action interface {
       Apply(Position) 
    }

    type MoveForward struct {
       Steps int
    }
    func (m MoveForward) Apply(p Position) {
        switch p.Rot {
        case Up:
            p.Y += m.Steps
            ... 
        }
    }

    // second package wrapping the first
    type WarpToPoint struct {
        X, Y int
    }
    func (w WarpToPoint) Apply(p movement.Position) {
        p.X, p.Y = w.X, w.Y
    }

morelisp 21 days ago [-]

> I've really enjoyed not having to work with sum types

Your example of what you prefer uses (a shitty approximation of) a sum type in p.Rot.

(This is also the most basic possible use of a sum type; they are not only useful for enums, it's just to point out that even a large amount of "simple" Go code would benefit from them.)

strken 17 days ago [-]

I understand that p.Rot is a shitty approximation of a sum type, but it still works and almost certainly won't break anything since Go forces you to explicitly typecast. The important thing is that the list of possible actions wasn't sealed by the type system, which in the original example it was.

I want to reiterate that I am aware sum types can be useful. I just don't think they're useful _enough_ to outweigh being a footgun for calling code.

atombender 21 days ago [-]

I would argue that this misses the use case of sum types, which typically don't have behaviour (or they'd just be interfaces!).

For example, consider an AST for a programming language. AST nodes don't have any behaviour (though they might have some methods for convenience, though nothing that implements behaviour). You want to do optimizations and printing and compilation and so on, but on top of the pure AST.

---

[7]

---

eqvinox 3 days ago [-]

What's the best way to deal with "transitive const-ness", i.e. utility functions that operate on pointers and where the return type should technically get const from the argument?

(strchr is the most obvious, but in general most search/lookup type functions are like this...)

Add to clarify: the current prototype for strchr is

  char *strchr(const char *s, int c);

Which just drops the "const", so you might end up writing to read-only memory without any warning. Ideally there'd be something like:

  maybe_const_out char *strchr(maybe_const_in char *s, int c);

So the return value gets const from the input argument. Maybe this can be done with _Generic? That kinda seems like the "cannonball at sparrows" approach though :/ (Also you'd need to change the official strchr() definition...)

pascal_cuoq 3 days ago [-]

Speaking as someone who is not in the committee but has observed trends since 2003 or so, I would say that solving this problem is way beyond the scope of evolutions that will make it in C2a or even the next one.

There are plenty of programing languages that distinguish strongly between mutable and immutable references, and that have the parametric polymorphism to let functions that can use both kinds return the same thing you passed to them, though. C will simply just never be one of them.

DougGwyn? 3 days ago [-]

Many uses of strchr do write via a pointer derived from a non-const declaration. When we introduced const qualifier it was noted that they were actually declaring read-only access, not unchangeability. The alternative was tried experimentally and the consequent "const poisoning" got in the way.

coliveira 3 days ago [-]

I believe C is doing the right thing. Const as immutability is a kludge to force the language to operate at the level of data structure/API design, something that it cannot do properly.

pwdisswordfish2 3 days ago [-]

One proposal solved this by doing exactly that:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2068.pdf

https://blog.golang.org/generics-next-step

---

on generics and Golang

nemothekid 9 hours ago [–]

I've been writing a lot of Rust lately after writing Go for 7 years now (unrelated, but that is incredibly odd for me to write out. It still feels like that funny new language that came out of Google). I've always said that generics is overhyped (in Go), but I find that I write a lot of generic stuff in Rust that is somewhat surprising. For example we have a JSON API that returns data in the format '{"ok": bool, "data":...}'. In Go I might do something with json.RawMessage? and pay for decoding twice, or annotate every struct with Ok/Data. But in Rust I can just have a type ApiResponse?<T>. Then I can have functions that operate just on ApiResponse?, or ApiResponse?<T=Users>. This problem is solvable in Go, but in a different way and with less guarantees. However that power comes at a mental cost, that creeps in C++ as well. I spent more time playing type system golf trying to come up with the optimal type for whatever usecase. In Go I might just "do the work", but in Rust I've turned 5 minute functions into 30 minute api design thought exercises. The jury is out if thats is a good thing or a bad thing.

That said, the only feature I'd steal from Rust is sum types and getting rid of null. `nil` interfaces are the only language feature to actually cost me money and Option<T> is just better in every aspect, and Result<T> is much better than err == nil. I'd be perfectly happy with compiler blessed Option and Result types even if the larger language didn't support generics.

nine_k 9 hours ago [–]

The cost you pay for appeasing the type system is the cost you're not paying writing more tests around corner cases ("this function never returns null"), debugging segfaults, null pointer exceptions, "attribute not found" errors, memory leaks (lifetimes help here), etc.

The type system may not always help you get your code out the door faster, but when you finally ship it, you have much easier time running it.

Thaxll 9 hours ago [–]

That's up to debate, you have to prove that in x.y.z language the time spend for maintenance would have been avoided if x.y.z had that feature from the start. Especially language like Rust that have a big learning curve / blockers, is it justified on the end? I'm not sure.

In Go I never felt that omg it misses some big features we're screwed, we're going to pay the cost of running that in Prod. All the services I worked on while not being perfect ran like a clock without issues on the long term, and it has to do with Go "simplicity", eventhough you never wrote the code you can actually jump into any code base without issues, there is no weird abstraction layer / complexity that takes you day to understand.

For me the fact that Go is a "simple" language is very powerful because on the long run it's much easier to maintain.

whatshisface 9 hours ago [–]

>The cost you pay for appeasing the type system is the cost you're not paying writing more tests around corner cases

I have spent a lot of time in Rust getting carried away with type system stuff that is not required for the actual program I am writing. I realize this when I go over it a second time and reduce line counts drastically by replacing my gee-whiz generics with primitive types. This is balanced out by an equal number of times where I've reduced line counts drastically by using the advanced type stuff to reduce the repetition that comes along with using primitives to represent a type that has way more structure than the primitive.

---

cmrdporcupine 7 days ago [–]

Enums, immutables, and generics aren't good because they're exciting. They're good because they give us expressive tools to write descriptive, type safe systems that manage state better.

...

akavi 7 days ago [–]

Exactly. I don't want sum types becuase they're exciting, I want them because they're the simplest way to communicate to the compiler that "my data can either be this or this and nothing else", so I don't blow off my foot by accidentally making it that.

frankie_t 7 days ago [–]

That actually surprises me a lot. I've seen plenty of critique about lacking generics and exceptions, while haven't seen much complaint about ADTs. From my rather dilettante point of view they are neither too complex for "philosophy of simplicity", nor they add much of compile time.

akavi 7 days ago [–]

Yeah, it's unclear to me why there's not more demand for them either. I think it might that most go users with prior static language experience are coming from Java, where there aren't sum types either.

From my naive perspective, they seem like an easy win, and are 100% more important to me than generics.

estebank 7 days ago [–]

Adding ADTs on their own is not enough, you need to add pattern matching and generics to unlock their biggest benefits.

ithkuil 6 days ago [–]

Go's type switches could be good enough if there was a way to restrict an interface-like type to be either one of A, B, C types.

    type foo either {
     int
     string
     *bar
      baz
    }

...

     var f foo = ....
     switch v := f.(type) {
       case int:
       // v has type int
       case string:

.....

masklinn 5 days ago [–]

The problem with this is it's extremely limiting and conflicts with interfaces. So it's not very useful: you can't define enumerations with no associated data, or enumerations with the same associated data types, without the syntactic overhead of explicitly wrapping and unwrapping those types. So you'd end up with

    type thing1 int
    type thing2 int
    type foo either {
        thing1
        thing2
    }

    var f foo = …
    switch v := f.(type) {
        case thing1:
            data = int(v)
        case thing2:
            data = int(v)
        …

and frankly that's a bit gross.

I think `select` would be a better basis for dispatching than switch as it already supports getting data out of stuff, and it better represents the linear top-to-bottom dispatching.

ithkuil 4 days ago [–]

I'm a bit confused about the example. It seems you care about tagging an integer with a qualifier.

    type foo struct (
      kind Kind
      data int
    }

Then you switch on kind and you use the data

The type switch is useful when you have to consume the actual type, without runtime errors for type conversions

masklinn 4 days ago [–]

> I'm a bit confused about the example. It seems you care about tagging an integer with a qualifier.

No, I'm pointing out that a proper sum type can (and often does) need that. It doesn't mean those are the only members.

> The type switch is useful when you have to consume the actual type, without runtime errors for type conversions

My point is that the type switch is garbage when multiple variants can have the same associated type.

ithkuil 4 days ago [–]

I see; that's because I'm this case we'd be fusing the discriminator with the payload.

With destructuring pattern match constructs you'd be binding variables to "inner" members of the sum type.

I do understand that. I'm not so sure it's so important, compared to Just being able to say that a box can contain either A or B or C.

Interfaces are great when you don't care what does a box contain as long as it quacks like a duck.

Sometimes though you need to carry around one out of N types of something and currently all you can do is to use an interface{} and deal with the possibility of a runtime error if somebody breaks the invariant.

masklinn 4 days ago [–]

> I do understand that. I'm not so sure it's so important

It absolutely is, even more so because of Go's interface. The difficulty of properly discriminating between interfaces extending one another is a reason the FAQ gives for rejecting union types.

> compared to Just being able to say that a box can contain either A or B or C.

I'd argue that this is by far the lesser use case, and furthermore trivially and completely subsumed by the alternative.

> Sometimes though you need to carry around one out of N types of something and currently all you can do is to use an interface{} and deal with the possibility of a runtime error if somebody breaks the invariant.

And sometimes you need to carry around one of N values some of which have overlapping contents and currently all you can do is get bent.

---

tleb_ 7 days ago [–]

I believe sum types are not added because they come close in functionality to interfaces, the idea being that if something should be X or Y, you make an interface that X and Y implement.

Edit: true explanation here https://golang.org/doc/faq#variant_types

mumblemumble 7 days ago [–]

There is some similarity, but it is so agonizingly superficial. At their core, they're for two very different, arguably orthogonal, purposes, and they behave in two very different ways. Sum types are for composing datatypes, and interfaces are for abstracting behaviors.

In practice, that means that there's just not much overlap between their semantics. Sum types let you say, "A Widget is either a Foo or a Bar, but nothing else." Interfaces give you no way to set boundaries like that. They say, "A Widget is anything that declares itself to be a Widget." And then you can declare Widgets Foo and Bar, sure, but anyone else can come along later and create Baz and Bof.

Interfaces, on the other hand, place restrictions on behavior. You say, "A Widget must support these operations," and, if Foo and Bar are Widgets, they must support those operations. Sum types don't let you do that. A sum type Widget with elements Foo and Bar places zero restrictions on what Foo and Bar look like. They could be literally anything.

The question, "What would happen if the elements of a variant type were themselves interfaces?" leaves me wondering if the authors' consideration of variant types consisted of taking a cursory look at Scala, which does (confusingly and hackily) co-opt Java's inheritance mechanisms in its implementation of sum types. Which does lead to some serious language warts. There are plenty of other languages which have both interfaces (or typeclasses) and sum types implemented more independently, though, and it does not typically lead to confusion.

That last paragraph is also somewhat bothersome, and makes me think once again that this response is more of an offhanded dismissal than a considered answer. The full question is essentially, "Why don't you implement this feature that would greatly increase the language's type safety?" and the response is, "Because you don't need it. See, if you just abandon type safety, what you can do is..."

I suspect that the real answer, even if the FAQ authors or the people who designed the language don't realize it, is that generics are practically (if not technically) a precondition for an algebraic type system. You could implement one without generics, but it wouldn't be very useful.

---

dorian-graph 7 days ago [–]

If anything, nil checking in Go makes it quite exciting (will I get a nil pointer error somewhere? Who knows!). Sum types would make this quite boring though.

masklinn 6 days ago [–]

> If anything, nil checking in Go makes it quite exciting (will I get a nil pointer error somewhere? Who knows!).

The most exciting part is finding that something is not nil but is actually nil. Great surprise, thanks typed nils!

erik_seaberg 6 days ago [–]

It's the only language I've ever seen that tries to dispatch on what type of object you don't have. Very zen.

---

pjmlp 7 days ago [–]

Pascal sometimes in the mid-70's

    type
       state = (on, off);

Go in 2020, about 50 years later,

    type state int

    const (
        on state = 0
        off state = 1
    )

Yeah we don't need those modern features from PhD? level languages.

bbatha 7 days ago [–]

That falls over very quickly in go unfortunately because of zero values.

   type state int

   const state (
      On state = 1
      Off state = 2
    )
    var s state
    fmt.Println(s) // prints 0

---

" No Compositionality

Protobuffers offer several “features”, but none of them see to work with one another. For example, look at the list of orthogonal-yet-constrained typing features that I found by skimming the documentation.

    oneof fields can’t be repeated.
    map<k,v> fields have dedicated syntax for their keys and values, but this isn’t used for any other types.
    Despite map fields being able to be parameterized, no user-defined types can be. This means you’ll be stuck hand-rolling your own specializations of common data structures.
    map fields cannot be repeated.
    map keys can be strings, but can not be bytes. They also can’t be enums, even though enums are considered to be equivalent to integers everywhere else in the protobuffer spec.
    map values cannot be other maps.

This insane list of restrictions is the result of unprincipled design choices and bolting on features after the fact. For example, oneof fields can’t be repeated because rather than resulting in a coproduct type, instead the code generator will give you a product of mutually-exclusive optional fields. Such a transformation is only valid for a singular field (and, as we’ll see later, not even then.)

The restriction behind map fields being unable to be repeated is related, but shows off a different limitation of the type-system. Behind the scenes, a map<k,v> is desugared into something spiritually similar to repeated Pair<k,v>. And because repeated is a magical language keyword rather than a type in its own right, it doesn’t compose with itself.

Your guess is as good as mine for why an enum can’t be used as a map key.

What’s so frustrating about all of this is a little understanding of how modern type-systems work would be enough to drastically simplify the protobuffer spec and simultaneously remove all of the arbitrary restrictions.

The solution is as follows:

    Make all fields in a message required. This makes messages product types.
    Promote oneof fields to instead be standalone data types. These are coproduct types.
    Give the ability to parameterize product and coproduct types by other types.

That’s it! These three features are all you need in order to define any possible piece of data. With these simpler pieces, we can re-implement the rest of the protobuffer spec in terms of them.

For example, we can rebuild optional fields:

product Unit { no fields }

coproduct Optional<t> { t value = 0; Unit unset = 1; }

Building repeated fields is simple too:

coproduct List<t> { Unit empty = 0; Pair<t, List<t>> cons = 1; }

Of course, the actual serialization logic is allowed to do something smarter than pushing linked-lists across the network—after all, implementations and semantics don’t need to align one-to-one. "

---

Ruby type annotations (RBS): https://developer.squareup.com/blog/the-state-of-ruby-3-typing/

---

is the solution to my worry of the Haskell type system being 'too smart' (and hence confusing) as simple as:

unification type inference within a function, one-way (forward) type inference between (multiple levels of) functions

This would mean that when you break apart a function into multiple functions, sometimes you have to add type annotations.

actually i think something like this is what i already have in ootType (search for 'the variadic haskell problem')

here's a review of what i've said elsewhere about this (created by grepping for http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function ):

http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function
whyOot: " However, the type system's power is limited in order to make code easier to understand.

One important principal is that, when reading code, complex reasoning about static type inference should n ot be necessary to figure out what the code does.

For example, in many ways Oot's type system is less powerful than Haskell's; we limit the power of the type system to (combine type inference with polymorphism in order to do compile-type computations that alter or determine the semantics of code), e.g. stuff like http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function .

Although the type system allows you to express complex statically-checked constraints on semantics, for the most part this power can only be used to doublecheck the semantics that are explicitly expressed in code, rather than to resolve semantics that the rest of the code leaves ambiguous. "

ootVariadicFnsVsImplicitPartialApplication.txt (search for http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function )
ootType: "* should you allow parametric polymorphic type to be inferred both ways, but only allow adhoc types to be inferred forwards (because we want to avoid the sort of unreadable black magic in http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function ; see [8] for more discussion)? or must they be the same for composability?"
ootDesignChoices.txt:

Carrying out computations in the type system should not be necessary in order for a reader to interpr

et 'syntactic' seeming things ==== For example, Haskell's use of induction in the type system to achieve variadic functions would be considered an abomination: http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function (see [9] )

Note that we are concerned here with readability; we don't want to even offer the option for programmers to use the type system in this way.

It's easy for programmers to reason in the direction of the control flow, but harder otherwise

For example, ad-hoc polymorphism depending on the types of inputs to a function is easy to understand. In order to simulate the type inference in your head for this, you just have to simulate what is happening in the program anyways, which programmers are already good at.

Furthermore, ad-hoc polymorphism in this direction, if resolved at runtime, is the same as if there was a 'switch' statement in the code at this point that switches depending upon the type of the inputs. So programmers can reason about this type system behavior in terms of something else familiar.

On the other hand, ad-hoc polymorphism depending only on the types of return arguments is more difficult to understand. It proceeds 'backwards in time', and it cannot be emulated by non-type-inferencing runtime code.

This further explains why http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function is difficult to understand (see [10] ).

It's less important for programmers to understand the parts of the type system that merely verify correctness (safety), but more important for them to understand the parts that alter semantics (conciseness)

ootTypeThoughts.txt: "current proposal to prevent black magic like the Haskell variadic function example (http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function ; see ootTypeNotes2):

just ban output ad-hoc polymorphism altogether and use type parameters for things like 'default'? "

ootTypeNotes2.txt (here's the main detailed explanation that i was thinking of): "=== http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function ===
you are using the type system's type inference combined with polymorphism to decide something which feels syntactic
in unpacking this, you switch from reading things from left-to-right and right-to-left three times
- when you first see 'sumOf 5 8 22 :: Integer', you read it left to right. And indeed, that's how it parses: (((sumOf 5) 8) 22) :: Integer
- but now, you see that you need to determine the type of 'sumOf 5' in order to compute the first term. You know it takes an Int as integer. But you don't know what type it yields as output.
- So you look further to the right. 8 is an Int, but you still don't know what you need. So you look further to the right. 22 is an Int and you gotta have an Int at the end. So the type of ((sumOf 5) 8) 22 must be Int -> Int.
- And now you have to switch and go right to left. Since ((sumOf 5) 8) 22 must be Int -> Int, ((sumOf 5) 8) must be Int -> Int -> Int, and sumOf 5 must be Int -> Int -> Int -> Int.
- Now you start substituting the fn defn. Int -> Int -> Int -> Int matches (Int -> (*)) so (sumOf 5) is replaced by (sumOf . (5 +) . toInteger).
- But now you have to go right again. (((sumOf . (5 +) . toInteger) 8) 22). And now you have to go leftwards from the 8 (not sure if i got this step right, i forgot how . work w/ grouping): ((sumOf . (5 +) . (toInteger 8)) 22) -> ((sumOf . (5 +) 8) 22) -> ((sumOf (5 + 8)) 22).
- Now you have to go right to left again (or just remember your earlier going-left work); ((sumOf (5 + 8)) 22) :: Int so (sumOf (5 + 8)) :: Int -> Int so sumOf :: Int -> Int -> Int. Then go left to right again to expand: (sumOf (5 + 8)) 22 -> (sumOf . ((5 + 8) +) . toInteger) 22. Then go right to left again to apply: (sumOf . ((5 + 8) +) . toInteger) 22 -> (sumOf . ((5 + 8) +)) 22 -> sumOf . ((5 + 8 + 22))
the types of parts of the expression are unexpected and context-dependent and seemingly inconsistent with each other. You wouldn't expect the type of 'sumOf 5' to be 'Int -> Int -> Int', you'd expect it to be 'Int', because if you said (sumOf 5) + 3 you'd get 8. It's only 'Int -> Int -> Int -> Int' in this particular context. Indeed, right in this derivation, we see that (sumOf 5) is of type Int -> Int -> Int -> Int, but (sumOf (5+8)) is of type Int -> Int, and (sumOf (5+8+22)) is of type Int->Int->Int.
in fact, the compiler is using its knowledge of how many times the function will be applied in the future to alter the function's behavior. In theory, you could make a function which does one thing if it will be applied less than 143 times before a scalar results, and another thing if it will be applied 143 times before a scaler results, and another thing if it will be applied 144-251 times... etc. The type logic counts via type-level inductive hooey. This seems like an abomination because it seems like something syntactic far away from the location of the function in the source code, and in the future from the perspective of data flow, is affecting the semantics of the function.

Hmm, the more i think about it, the more i think that we should ban type inference determining ad-hoc polymorphism which is polymorphic only in the return type (see [11] for more reasoning on that). This won't make the language any less powerful than, say, Python (in Python there is no type inferencing so in order to make an ad-hoc polymorphic choice, the class of the value that the object that you are dispatching on must differ (which would correspond to an input value in Haskell and probably in Oot). So i'm proposing that in Oot, you can have all the type inferencing you want, it's just that type inferencing won't be allowed to drive ad-hoc polymorphism from the "future" (from the types of the outputs).

Or, try THIS one on for size: the statically known (at the time of compiling the typeclass) kinds of the return types of different ad-hoc poly fns of the same inputs may not vary. Hmm.. but that sounds like you cant have two instances of the same kind of two different typeclasses. For example, it would prevent the empty homogeneous list from being both an empty list of ints and an empty list of fns, at different points in the program. No... it just prevents the existence of a fn, 'empty', that returns that. You can still have Int.empty and Fn.empty; or even, empty(Int) and empty(Fn). So, you can't have a single fn, 'empty', that sometimes returns statically known lists of ints and sometimes statically known lists of Fns, on the same inputs. This should prevent any type-level inductive hooey like Haskell's variadic fns, which uses the static knowledge of how many times the result of a fn will be applied in the future, which seems like a 'syntactic' thing, to alter the behavior of the function over the various applications. It still allows 'messages from the future', that is, it allows the compiler to automatically select among multiple instantiations of return-type-only-ad-hoc-polymorphic functions based on the expected return value, but it guarantees that all of the possibilities will be of the same kind (and kind incorporates the notion of 'syntactic' difference here)

but you could get around this by giving a dummy argument to the fun (empty(dummy)) so that the compiler can infer 'dummy' to be the desired return type

" hmm.. in addition to the restrictions contemplated above to prevent http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function , might want to have more restrictions on global type inference than on local (see http://lambda-the-ultimate.org/node/2862 ). "

" so, a conundrum remains:

we don't want inference that permits black magic patterns like http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function (see proj-oot-ootTypeNotes1).
but we do want to be able to define very abstract, 'algebraic' functions which work for anything in a typeclass, and which return new instances, e.g. a function which returns the 'default instance' under some (or all) conditions

---

one idea i had was not to allow backwards ad-hoc polymorphism (that is, ad-hoc polymorphism where a type variable appears in the output of the function) where the cases for the output type variable had different arities.

--- i guess one way to state the key problem with http://stackoverflow.com/questions/3467279/how-to-create-a-polyvariadic-haskell-function is:

(background:)

(1) the type checker saw 'sumOf 1 4 7 10 :: Integer'. In inferred that (sumOf 1 4 7) must be of type Integer -> Integer. It inferred that (sumOf 1 4) must be of type Integer -> Integer -> Integer... it inferred that sumOf must be of type Integer -> Integer -> Integer -> Integer -> Integer.

(2) it said, well sumOf is actually of type Integer -> r. Can r be Integer -> Integer -> Integer -> Integer?

(3) it said, maybe; although one instance for SumRes? has r = Integer, another one has r = a -> r' (where a is in the Integral typeclass, and r' is in the SumRes? typeclass). This matches if a = Integer and r' = Integer -> Integer -> Integer. Integer is in Integral, so i just need to verify that we can have Sumres r'. (recurse into step 3)

The problem, then is that: (1) the typechecker looked at how many times the function sumOf and its descendents were actually being applied in the code, (2) the typechecker used backwards inference to propagate this info backwards, (3) the typechecker used this information to decide which ad-hoc polymorphic implmentation of sumOf to use, (4) the typechecker recursed on this

As a result, the number of times the typechecker recursed affected what function was actually produced in the code, and this number was dependent not just on the name of the type used, but on the structure of the source code in the 'future'.

Perhaps all four of these need to be present for this to get confusing. It's not just that (2) backwards inference was used for (3) ad-hoc polymorphism; that's what a generic 'default' function would do, but that's fine. It's also that (1) the structure of code, not just the names of types, was used, and (4) this structure affected not just the internal computation of the typechecker, but also which computation would actually be performed.

I guess what i really find objectionable here is that i want the user to be able to think of the typechecker as a fancy, alien black box that proves some theorems about their code to catch bugs. I don't want the user to have to understand what the typechecker is thinking in order to figure out what a piece of code that they are reading will do. If the typechecker's thought pattern affects which implementation is chosen in ad-hoc polymorphism, then the user has to understand what the typechecker is thinking in order to determine what a piece of code will do.

And yet, i do want ad-hoc polymorphism. I want to be able to say "plus(3,5)" and "plus(3.0,5.0)", and let the compiler figure out that in the former, it should interpret 'plus' as referring to the implmentation 'plus-integers', and in the latter, as referring to the implementation 'plus-floats'. I don't want to have to write "plus-integers(3,5)" to resolve this ambiguity.

Is the problem with backwards inference? Perhaps, as long as the inference pattern of the typechecker matches the control flow of the code, there is nothing 'extra' for the reader to understand, or rather, they can understand the compiler's ad-hoc polymorphism choices by pretending that at runtime it is using dynamic typing and looking at what type a value actually has.

But, as noted above, i do want backwards inference sometimes. I want to be able to define e.g. a function 'arrayinit(x)', of polymorphic type [A], which takes a length x and returns an array of length x whose values are of type A, and whose actual value is the default value of type A, whatever that default value is. To do this, i need to be able to call an ad-hoc polymorphic function 'default', which returns the default value, and which is polymorphic in the type parameter A of arrayinit, even though the typechecker might infer which type A is based on how arrayinit is used. Why is that OK?

Maybe because arrayinit was generically polymorphic, not ad-hoc? But what if there is an optimized array implementation for integers, so arrayinit should be ad-hoc based on whether or not A is Integer? Then, what makes this OK if haskell sumOf is not?

Perhaps the recursion on the structure of the code is the problem? Perhaps, as long as the information from the future being used is a simple, opaque type, i don't have a problem with saying, 'yeah, the compiler figured out that this array is going to be an array of integers, so it initialized it to 0 and used the integer array optimizated implementation'; but i do have a problem with saying, 'the compiler saw that in the future i would apply this function to an integer and then apply the result of that again and then apply the result of that again, so it constructed the code (a + (b + c)) to match this structure. Because now the compiler's backwards thought process seems Turing complete or close to it, and i have to follow it. But if we just prohibited recursion and made it polynomial (e.g. it can't construct varying depth structures, only fixed depth structures) then would that really be any better? Even if it is not Turing-complete or even NP-complete, it could still be looking at the structure of my future code and building a matching structure here.

Could we prohibit the compiler from looking at structure at all? But then, in a world with currying, the compiler can't figure out that when it sees "f 3 4 :: Int", f must be of type Int -> Int -> Int, because (f 3) is of type Int -> Int. So that seems too weak.

My intuition is that it's that, the idea that the compiler is looking at structure in the future and using that to build, not just some type that it uses within its black box, but rather the actual control flow structure of my code via ad-hoc polymorphism. What we want to do is constrain either the information that affects backwards ad-hoc polymorphism to be non-structural (opaque black-box type names), or constrain the way that ad-hoc polymorphism can act so that it can't do structure-building.

The latter option seems fruitless because you don't want to prohibit recursion in the IMPLEMENTATIONs; and you don't want to prohibit the implementations from doing pattern matching on their inputs and destructuring them. But i think i am missing something there.

Also, if you do this right, it may make the typechecker simpler and maybe even less computationally complex, because i have this idea that we would reduce the process of typechecking from this complicated recursive process passing around complex recursive structures, into a simpler process that is locally recursive but that globally passes around opaque type names.

...

so, the current proposal:

if there is any pattern-matching done in the output of ad-hoc polymorphism: then the structure of the pattern must be the SAME for all cases; only the constants can change; EXCEPT that you can have different substructure as long as you only bind dummy variables (that can't even be attached to contexts). Example:

you CAN have a function with an implementation whose output is Int, and another implementation whose output is String
you CAN have a function with an implementation whose output is String -> Int, and another implementation whose output is Int -> String
you CAN have a function with an implementation whose output is Int -> Int, and another implementation whose output is Int -> Int -> Int
you CAN have a function with an implementation whose output is Int, and another implementation whose output is [Int]
you CAN have a function with an implementation whose output is Int, and another implementation whose output is [_] (where _ is a dummy type variable)
you CANNOT have a function with an implementation whose output is Int, and another implementation whose output is [A] (where A is a normal (bindable) type variable)