Bayle Shanks's website: notes-computer-programming-programmingLanguageDesign-languages

static typing

the compiler proves a little theorem that the program doesn't contain certain kinds of mistakes

unfortunately, every language i know that is static either:

makes the "type system" too inflexible to permit very generic code (or, allows generic code e.g. by typecasting, but when using generic code, you give up the benefits of static type checking) (e.g. Java before generics, Haskell without typeclasses)
or, requires the programmer to write lots of type annotation throughout the code, or alternately, other kinds of boilerplate, which are a pain to write and which increase the size of the codebase (e.g. Java, Haskell with typeclasses)
or, is hard to understand

it seems that one "holy grail" of language design is to create a statically typed language that doesn't have any of the above flaws.

type constructors

  "list of ints"

nullable types (null is a value of these types) vs. maybe type

layered routines which ea. return a nullable types causes confusion when you get a null -- which layer did it come from?
maybe types: layering ("escaping") is typechecked

non-local modifications

(van roos) eg, starting from a functional language: exceptions, state, concurrency

exceptions

eg without exceptions:

def a(x):
 (b_success, b_result) = b(x)
 if not b_success:
   print "Sorry, there has been some sort of error"
 else:
   print b_result/2

def b(x):
 (b_success, c_result) = c(x)
 if not c_success:
   return (False, 0)
 else:
   return (True, c_result+3)

def c(x):
 if x == 0:
   return (False, 0)
 else:
   return (True, 10/x)

problems:

gotta pass around those result codes
b has to explicitly deal with the result codes even though the exceptional condition was detected in c() and dealt with by a(), and b() is just a middleman

with nullable types as a poor man's exception:

def a(x):
 b_result = b(x)
 if b_result == None:
   print "Sorry, there has been some sort of error"
 else:
   print b_result/2

def b(x):
 c_result = c(x)
 if c_result == None:
   return None
 else:
   return c_result+3

def c(x):
 if x == 0:
   return None
 else:
   return 10/x

now it's easier to pass around the result code, b/c it is attached to the variable value (a non-None is an implicit success)
but b() still has to deal with it
also, it's easy to forget to check for None. a() might have just stored b_result into some data structure and passed it along to someone else. This may have led to an error or incorrect result much later in the program, making it hard to debug.
None doesn't contain any informatation about what caused the error or what sort of error it was

with exceptions:

def a(x):
 try:
   b_result = b(x)
   print b_result/2
 except ZeroDivisionError:
   print "Sorry, the input to a() cannot be 0"
 except:
   print "Sorry, there has been some sort of error: " + repr(e) 

def b(x):
 c_result = c(x)
 return c_result+3

def c(x):
 if x == 0:
   raise ZeroDivisionError
 else:
   return 10/x

now b doesn't have to care about the possible exception
the value of the exception holds information about the problem
the problem will be visible soon after it is caused

threaded state vs. references vs. monads

def xWithCounterInit(x): return (x,0)

def xSquaredAndCount(xWithCounter): (x, counter) = xWithCounter xWithCounterNew = (x, counter+1) return (x2, xWithCounterNew)

def getCount(xWithCounter): (x, counter) = xWithCounter return counter

xWithCounter = xWithCounterInit(11) (result, xWithCounter) = xSquaredAndCount(xWithCounter) (result, xWithCounter) = xSquaredAndCount(xWithCounter) (result, xWithCounter) = xSquaredAndCount(xWithCounter) print getCount(xWithCounter)

refs:

class XWithCounter?: def __init__(self, x): self.x = x self.count = 0

  def xSquaredAndCount(self):
    self.count = self.count + 1
    return self.x**2

  def getCount(self):
    return self.count

xWithCounter = XWithCounter?(11) result = xWithCounter.xSquaredAndCount() result = xWithCounter.xSquaredAndCount() result = xWithCounter.xSquaredAndCount() print xWithCounter.getCount()

monads: todo

lexical closures

vs. dynamically scoped closures: eg from elisp (later)

p.24 van roy

conditionals

if, cond, switch

loops

goto recursion while, for, repeat until foreach map, fold

syntax: list comprehensions

call-by-value, call-by-reference, call-by-name, call-by-need

todo

and: cbneed, laziness separates control from definition

referential transparency

todo

continuations

"like goto for functional languages"

memory management

garbage collection

syntax

stack based
- reverse polish, i.e. postfix
  - no parens needed -- but hard to read
  - x y g z f
functional
- f(g(x,y),z)
functional Lisp-style (fully parenthesized, first item in list is fn)
- (f (g x y) z)
ML style (left-associative partial application)
- f (g x y) z
binary infix operators
- x * y
- precedence, associativity
  - x * y * z = (x * y) * z
  - z + x * y = z + (x * y)
  - z + x * y == q + r
    - same as (z + (x * y)) == (q + r)
    - compare: (== (+ z (* x y)) (+ q r))
- pros: saves keystrokes, ez to read b/c reminder in the middle, better when chains of same operator

significant whitespace

http://en.wikipedia.org/wiki/Off-side_rule

Innovations in function calling

Positional
Keyword
Default
Partial

Pronouns

Perl's $_

Concurrency

Fine grained vs. course grained

Shared state locks * Locks dont compose (eg)

Message passing
transactions
- stm
race conditions
futures
CSP
pi-calc
ops on channels:
- create
- read
- write

the wikipedia entry on falcon reminds of some things you can do with sentinel values in basic functional list iteration functions:

The functional paradigm includes an out of band item marker. Items can receive an oob flag marker which can be tested through language operators and functions and indicate a special meaning for values traveling in functional sequences. For example, many functional loops, as floop and times can perform loop restarts or can be interrupted by returning either a out of band 1 or 0 from any of the involved functions. The map function, transforming all the values in an array through a mapping function, will ignore the returned value (discarding it) if it's an out of band nil; in this way, it is possible to perform map-and-filter operations in place.

Namespaces

Modules

as dicts, as records

Metaprogramming

preprocessing

source filters

eval, multistage programming, futures, typing

compilation pipelines

template metaprogramming

parsed? tokenized?

introspection, reflection

macros hygenic macros

higher-order functions

"higher-order programming combined with syntactic support (e.g., meta-object protocols and generics), to full-fledged tinkering with the kernel language (introspection and reflection). " -- van roy

Declarative

Higher order functions

prob: how to serialize

Turing completeness

as turing machine as lambda calculus

vs.: various finite automata regexps

 primitive recursive (unconstrained maximization; or goto)

Database joins

Favorite Data types

Records Lists Dicts (association lists, lookup tables, hash tables)

eg show how built out of records

Polymorphism

Parametric
Ad-hoc

Serialization

Parsing

Tokenizing, regexps

symbol table

cfgs

rec descent

ll(x) lr llar

left-recursion and its elimination

todo

programming language styles

imperative turing machine python functional (ref trans, but also tends to use hofs) lambda calculus haskell constraint satisfaction logic curry stack based function composition, not fn application http://en.wikipedia.org/wiki/Cat_%28programming_language%29

tour of some prog langs

c (or d or c# or go?) python prolog curry cat scheme

http://en.wikipedia.org/wiki/Categorical_list_of_programming_languages

roles

roles: to avoid false cognates in duck typing (just b/c it has a "bark()" method, does that mean it's a tree? mb it's a dog)

seems to me that typeclasses are like roles here

multiple inheritance, mixins, traits

ways to let a namespace (such as an object) import or "inherit" functions or methods from more than one "parent" namespace

the question is: what if more than one parent defines the same method name?

from most powerful to most safe:

multiple inheritance: look at how each parent got that name; was it defined directly in the parent, or was it inherited from a grandparent, etc? define a rule based on this stuff. inheritance is more than just method import; it also defines the "type" of the child (an "isa" relationship)
mixins: no tree of inheritance from parent's parents. method import only; not an "isa" relationship
traits: like mixins, but with a prohibition on overwriting method names; that is, if one namespace imports multiple traits, the names of methods defined by those traits must be disjoint

when one language implements both inheritance and traits, methods defined directly in a class should have the highest precedence, followed by methods coming from a trait, followed by methods coming from a base (parent) class.

Liskov Substitution Principle and contravariance vs. covariance

(todo; defn liskov)

Let's say you have some OOP class that you subclass. Consider a single method in your subclass that is overriding a method in the base class.

Clearly, this new method should only be allowed to return subtypes of whatever type the base method was declared to return. This is because you want client code that called the base class and expected the return value to have certain properties to work okay when the base class is replaced by the subclass. I believe that this restriction is called "covariance of return types"

What should the new method accept as input? It should accept everything that the base method did. That assures, again, that the subclass can be used anywhere that the base class could. It's okay, though, if the new method accepts MORE types than the base method did. This restriction, namely that methods in subclasses overriding methods in base classes must accept any type that the base method accepts, and may accept more, is called __contravariance__ of parameters.

Some languages choose to be more restrictive, and require overriding methods in subclasses to accept exactly the same types as the overriden method in the base class. This is called "invariance".

Other languages get it backwards, and forbid overriding methods from accepting any type that the base class did not accept, but permit overriding methods to refuse to accept some types that the base class accepted. In such languages it's possible for a subclass to not be able to be substituted for its base class in some code due to a type error.

So, parameter contravariance is appropriate for subtyping. However, parameter covariance is useful for specialization (and for multiple dispatch that uses the most specialized appropriate implementation to carry out a function).

generics

todo

todo bounded generics. See also http://download.oracle.com/javase/tutorial/extra/generics/morefun.html

pluggable annotations

todo

java 6 has 'em

exception chaining

This refers to attaching information to an exception as it escalates showing the path in came from, by way of re-wrapping the original exception inside new exceptions as it escalates.

"For example, a method to play a movie file might handle exceptions in reading the file by re-throwing them inside an exception of movie playing" -- http://en.wikipedia.org/wiki/Exception_chaining

assert

Many languages have a syntactically convenient way of saying, "this expression should return True; if it doesn't, raise an error".

Remote Method Invocation (RMI)

Can a program be separated into parts running on different computers? Some languages provide a mechanism for a function on one computer to call a library on another computer. Some OOP languages allow code on one computer to have a handle to an object instance on another computer, and to call methods on that object; those method calls are (semi)-transparently sent over the network, causing the method to be invoked on the other computer, and for a value to be (semi)-transparently returned over the network.

The reason we say "(semi)-transparently" is that sometimes information will be lost or corrupted during transmission over the network; and usually the call takes a long time (prompting the desire for asynchronous method invocation). So, for example, it's possible that you might make a method call, but the invocation gets lost in transmission and. Or, worse, the method executes, possibly mutating data on the other computer, but the return value is lost in transmission, and the caller does not know if the method was successfully executed or not.

varargs

todo

ranges

todo

enumerations

todo

boxed and unboxed types

todo

looping constructs

todo

currying

todo

anonymous functions

todo

decorators

todo

properties

todo

concurrency

and parallelism

threading

coroutines

lazy collections

laziness

monkeypatching

Allowing library A to modify the behavior of functions in library B, when library B is called in client code that has previously imported library A.

automatic memory management

some languages take care of allocation and deallocation of memory transparently

automatic resource management

a generalization of automatic memory management. a resource is here understood as something that should or must be "closed" in some way when you are done with it. a memory allocation is one example, but a file handle is another. languages which support automatic resource management have a general mechanism for the language to transparently close resources. An important detail is what happens if the code which attemps to "close" the resource throws an exception. see http://mail.openjdk.java.net/pipermail/coin-dev/2009-February/000011.html for a discussion on this point.

in java: finalizers, also http://mail.openjdk.java.net/pipermail/coin-dev/2009-February/000011.html

in C++: destructors

in C#: using blocks

component architecture

some languages have support for component architectures, e.g. registries of components, hooks, etc.

Some component architectures are cross-language. See XPCOM and COM.

events and listeners

todo

purity and referential transparency

todo

controversy: however, opponents of the closest-to-pure languages point out that computing is actually about much more than "computation" in the theoretical, Turing machine/lambda calculus sense; it is also about behavior, that is, I/O that takes place at certain times.

rebuttal: fine: in order to be good, a language must be good for input and output too. but in those cases where a program contains a pure subroutine, it's nice if the compiler can know (and prove) the purity of that part

collections libraries

A key part of the standard library of any programming language is its support for "collections", that is, groups of values of similar type. For example, lists, sets, dicts (e.g. hashes, lookup tables), etc.

It is desirable that the types and APIs of the collections be as interchangable as possible to promote generic code that can perform a similar operation on many collections, without manually writing a different implementation of the operation for each type of collection.

other crucial libraries

logging, warning
serialization/marshalling (i.e. creating a serial representation of the state of a data structure) (in an RMI context, I believe that marshalling a remote object instance only stores a reference to the instance, whereas serialization actually stores a complete copy of the object's state)

other common libraries

CGI
HTTP
XML, JSON parsing
TCP/IP
regular expressions
JPEG, PNG

inheritance vs delegation vs composition vs allomorphism

todo

---

toread: http://rebelscience.blogspot.com/2008/07/how-to-solve-parallel-programming.html http://www.rebelscience.org/Cosas/COSA.htm

todo: add stuff from http://www.codegod.de/webappcodegod/Programming-Concepts-That-Were-aeutomated-By-Modern-Languages-QID639590.aspx