Bayle Shanks's website: notes-computer-programming-programmingLanguagesBook-programmingLanguagesPartConstructs

Table of Contents for Programming Languages: a survey

Part III: Language constructs and features

Chapter : data

bool

fixed length list (array)

list

S-expression

dict/associative array/hashtable

what can be a key? can you add hash fns to custom data?
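One possible answer to the question above, sketched in Python: an object can be a dict key if it is hashable, and a custom class opts in by defining `__eq__` and `__hash__` consistently. The `Point` class and `labels` dict are hypothetical names used only for illustration.

```python
class Point:
    """Hypothetical 2-D point that can be used as a dictionary key."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Keys that compare equal must also hash equal.
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Reuse the built-in tuple hash over the same fields used by __eq__.
        return hash((self.x, self.y))


labels = {Point(0, 0): "origin"}
# A distinct but equal Point finds the same entry.
found = labels[Point(0, 0)]
```

Other languages answer the same question differently (e.g. by requiring keys to implement a hashing interface or typeclass), so this is only one design point.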

struct/record

enum

http://docs.python.org/dev/library/enum.html

note: this link describes the benefits of enums over just using preprocessor macros, e.g. C #defines: http://codingrelic.geekhold.com/2008/10/ode-to-enum.html
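A minimal sketch of what a dedicated enum type buys over bare integer constants, using Python's `enum` module (the `Color` type is a made-up example):

```python
from enum import Enum


class Color(Enum):
    RED = 1
    GREEN = 2


# Members are distinct objects with a name and a value, not bare integers.
is_plain_int = (Color.RED == 1)      # a plain Enum member does not equal its value
by_value = Color(2)                  # look up a member by its value
names = [c.name for c in Color]      # iteration follows definition order
```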

multidim array/table

dataframe

string

unicode? bytestring?

string interpolation

some syntaxes:

"$x" e.g Perl

"{x}" e.g. Hoon

"%d" % (x) e.g. Python

"\(x)" e.g. Yeti, Apple Swift

regex

int, real

arbitrary precision arith

sized vs. mathy numeric types

pattern matching

ML, Haskell

OMeta: parsing on data http://www.vpri.org/pdf/tr2008003_experimenting.pdf

in an object-oriented language without reflection or pattern-matching, the only way to make behavior conditional upon object class is through methods (which are polymorphic on the type of the object they are a part of). In such a case, if it is desired for the behavior of a method to depend on the class of its arguments, the Visitor pattern can be used, e.g.:

" package syntax;

abstract class ExpressionVisitor { abstract void visitIntExp(IntExp e); abstract void visitAddExp(AddExp e); }

abstract class Expression { abstract void accept(ExpressionVisitor v); }

class IntExp extends Expression { int value;

  void accept(ExpressionVisitor v)
  {
    v.visitIntExp(this);
  }}

class AddExp extends Expression { Expression e1, e2;

  void accept(ExpressionVisitor v)
  {
    v.visitAddExp(this);
  }}

The interest of this construction is that it is now possible to define operations on expressions by subclassing ExpressionVisitor. This can even be done in a different package, without modifying the expression hierarchy classes.

Behaviour can now be defined on Expressions

package tools;

class PrettyPrint extends ExpressionVisitor { void visitIntExp(IntExp e) { System.out.print(e.value); }

  void visitAddExp(AddExp e)
  {
    e.e1.accept(this);
    System.out.print(" + ");
    e.e2.accept(this);
  } } " -- example from nice.sourceforge.net/visitor.html
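The same structure can be sketched in Python. This is not a translation taken from the quoted source, just an illustration under the assumption that visitors return values instead of printing; the `Evaluate` visitor is an added, hypothetical second operation that shows new behavior being defined without touching the expression classes.

```python
class IntExp:
    def __init__(self, value):
        self.value = value

    def accept(self, visitor):
        return visitor.visit_int_exp(self)


class AddExp:
    def __init__(self, e1, e2):
        self.e1 = e1
        self.e2 = e2

    def accept(self, visitor):
        return visitor.visit_add_exp(self)


class PrettyPrint:
    def visit_int_exp(self, e):
        return str(e.value)

    def visit_add_exp(self, e):
        return e.e1.accept(self) + " + " + e.e2.accept(self)


class Evaluate:
    # A second operation, added without modifying IntExp or AddExp.
    def visit_int_exp(self, e):
        return e.value

    def visit_add_exp(self, e):
        return e.e1.accept(self) + e.e2.accept(self)


expr = AddExp(IntExp(1), AddExp(IntExp(2), IntExp(3)))
printed = expr.accept(PrettyPrint())
total = expr.accept(Evaluate())
```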

variable binding within pattern matching

todo: haskell example

e.g. Apple Swift (note this Apple Swift example and some others are from https://developer.apple.com/library/prerelease/ios/referencelibrary/GettingStarted/LandingPage/index.html ):

" let size = (20, 40)

    switch size {
    case let (width, height) where width == height:
      println("square with sides \(width)")
    case (1..10, 1..10):
      println("small rectangle")
    case let (width, height):
      println("rectangle with width \(width) and height \(height)")
    }"

pointer

ADTs

GADTs

case classes

ADTs for OOP

e.g. scala http://docs.scala-lang.org/tutorials/tour/case-classes.html

the point is to be able to perform pattern matching, possibly with guards

possible features:

constructor parameters are public fields (e.g. scala)

autogenerated methods

Common examples:

(recursive) structural equality (e.g. in Scala case classes, Haskell derives)
toString (e.g. in Scala case classes, Haskell derives)
comparison (e.g. Haskell derives)
read (e.g. Haskell derives)
scala case classes: http://docs.scala-lang.org/tutorials/tour/case-classes.html
haskell derives: https://en.wikibooks.org/wiki/Haskell/Classes_and_types#Deriving , http://www.haskell.org/ghc/docs/7.4.1/html/users_guide/deriving.html
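Python's `dataclasses` module is a rough analogue of these autogenerated methods (it is not a case class or a Haskell deriving clause; the comparison is only illustrative). A minimal sketch with a hypothetical `Point` type:

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Point:
    x: int
    y: int


p = Point(3, 4)
same = (p == Point(3, 4))            # generated __eq__: field-by-field equality
text = repr(p)                       # generated __repr__
smaller = Point(1, 9) < Point(2, 0)  # generated ordering compares fields as a tuple
```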

unboxed types

garbage collection

first-class queries

LINQ

first-class functions

relations

first-class types

first-class metaprogrammy stuff

macros, calls continuations, call stacks, ASTs/expressions

Views

Wadler http://homepages.inf.ed.ac.uk/wadler/papers/view/view.ps 1987

Chapter : OOP

most OOP stuff is duplicated elsewhere, but..

http://journal.stuffwithstuff.com/2013/08/26/what-is-open-recursion/

 -- Bob Nystrom

Abstract Base classes

Classes which can't be instantiated, but exist to be subclassed.

Some languages allow code in base classes to refer to members which aren't defined in the base class, causing hidden requirements on the subclass to define those members (or override the base class code). e.g. http://docs.scala-lang.org/tutorials/tour/abstract-types.html

Some languages without interfaces use abstract base classes as a stand-in for interfaces, i.e. consumers of the class refer to the abstract base class, and then any descendant of the abstract base class will typecheck against these consumers. This is most helpful with multiple inheritance, as one object may need to implement multiple interfaces.
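A minimal sketch of both points, using Python's `abc` module: the base class cannot be instantiated, and its own code calls a member that only subclasses define. `Shape` and `Square` are hypothetical names.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    @abstractmethod
    def area(self):
        """Subclasses must define this."""

    def describe(self):
        # Base-class code that relies on a member only subclasses define.
        return f"{type(self).__name__} with area {self.area()}"


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


try:
    Shape()
    instantiated = True
except TypeError:
    # Instantiating a class with unimplemented abstract methods is rejected.
    instantiated = False

description = Square(2).describe()
```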

Constructors

avoid one sort of future-proofing

some languages, e.g. C++ and Scala, have special syntax for declaring constructors which take parameters and simply store them with the object. In e.g. C++, the constructor values are stored in ordinary fields. In e.g. Scala, the constructor's formal parameters are simply made 'visible in the whole body of the class' -- http://docs.scala-lang.org/tutorials/tour/classes.html (todo: is this really different from fields? are these parameters immutable values, rather than variables?)

some languages create an implicit constructor function with the same name as the class, e.g. to call the constructor for class Point with two arguments 3 and 4 and assign the resulting object to x, you say "x = Point(3,4)". In other languages, you use a special operator 'new': "x = new Point(3,4)". In some languages (e.g. C++), both of these forms are available, but they do slightly different things (see 'resource management')

Chapter : extensible data

todo should we merge this with extensible code? with oop?

future-proofing

extensibility defined (in contrast to metaprogramming) as removing the need to future-proof

future-proofing defined by Bob Nystrom: adding boilerplate in case, in the future, you need to replace built-in functionality with a user-defined abstraction without having to change each callsite.

http://journal.stuffwithstuff.com/2010/09/18/futureproofing-uniform-access-and-masquerades/ http://journal.stuffwithstuff.com/2010/10/21/the-language-i-wish-go-was/

removing the need to future proof:

"Can I replace built-in functionality with a user-defined abstraction without having to change each callsite?"

constructors

properties

remove the need for getters and setters

uniform access principle
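A minimal sketch of properties in Python, where a stored field can later be replaced by computed accessors without changing any call site. The `Circle` class is a hypothetical example.

```python
class Circle:
    def __init__(self, radius):
        self.radius = radius

    @property
    def diameter(self):
        # Computed on access, yet read like a plain attribute.
        return 2 * self.radius

    @diameter.setter
    def diameter(self, value):
        self.radius = value / 2


c = Circle(1.5)
before = c.diameter      # read uses field syntax
c.diameter = 10          # write uses field syntax too
after = c.radius
```

Callers never need to know whether `diameter` is stored or computed, which is why no getters and setters have to be written up front.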

adding fields to existing types

(e.g. via subtyping)

subtyping

prototype inheritance

monkey patching

everything is an interface

Chapter : evaluation strategy (call-bys)

call-by-reference

call-by-value

call-by-name

" Blogger Alastair Reid said...

    I think Scala has an interesting approach to defining your own control constructs. It lets you declare a function parameter as call by name and it automatically inserts a lambda round the actual parameter and an application round uses of the formal parameter.
    Just one of Scala's tricks to make it easier to implement EDSLs.
    Tuesday, May 3, 2011 at 9:18:00 AM GMT+1"
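Python has no call-by-name parameters, so the lambda that Scala inserts automatically has to be written by hand. A minimal sketch of a user-defined control construct under that assumption (`my_if` and `safe_div` are hypothetical names):

```python
def my_if(condition, then_thunk, else_thunk):
    # Only the chosen branch is ever evaluated.
    return then_thunk() if condition else else_thunk()


def safe_div(a, b):
    # With eager arguments, a / b would raise before my_if could choose.
    return my_if(b != 0, lambda: a / b, lambda: float("inf"))
```

Calling `safe_div(1, 0)` never evaluates the division, because the first thunk is never invoked.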

call-by-text / fexprs

call-by-need / laziness

like call-by-name, can implement things like short-circuit 'if' this way:

http://augustss.blogspot.nl/2011/05/more-points-for-lazy-evaluation-in.html

other reasons for laziness:

http://augustss.blogspot.nl/2011/05/more-points-for-lazy-evaluation-in.html
- "any subexpression can be named and "pulled out", modulo name capture" e.g.
  - in a strict language, not true; " if c then error "BOO!" else 0

Which is not the same as

    let x = error "BOO!"
    in  if c then x else 0"

short-circuit if

- alternatives: " For some language constructs the solution adopted by Smalltalk (and later Ruby), i.e., a very lightweight way on constructing closures is acceptable. So, for instance, I could accept writing

    ... myAnd x {y} ...

(In SML you could make something using functors, but it's just too ugly to contemplate.) "

Lazy constructors

todo

Cyclic data structures

"Sometimes you really want cyclic data structures. An example are the Haskell data types in Data.Data that describe data types and constructors. A data type descriptor needs to contain a list of its constructors and a constructor descriptor needs to contain the data type descriptor. In Haskell this can be described very naturally by having the two descriptors reference each other. In SML this is not possible. You will have to break the cycle by somthing like a reference (or a function). In OCaml you can define cyclic data structures in a similar fashion to Haskell, so this isn't really a problem with strict languages, but rather a feature that you can have if you like. "

" Reuse

I've saved my biggest gripe of strict evaluation for last. Strict evaluation is fundamentally flawed for function reuse. What do I mean? I will illustrate with and example. Consider the any function is Haskell:

any :: (a -> Bool) -> [a] -> Bool
any p = or . map p

It's quite natural to express the any function by reusing the map and or functions. Unfortunately, it doesn't behave like we would wish in a strict language. The any function should scan the list from the head forwards and as soon as an element that fulfills the predicate is found it should return true and stop scanning the list. In a strict language this would not happen, since the predicate will be applied to every element before the or examines the elements."
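The difference described in the quote can be observed in Python, where `map` is lazy but a list comprehension is strict. This is only a sketch of the behavior, not the Haskell semantics; `is_even` and the `calls` log are hypothetical helpers.

```python
calls = []


def is_even(n):
    calls.append(n)          # record each predicate application
    return n % 2 == 0


# Lazy: map yields results on demand, so any stops at the first True.
lazy_result = any(map(is_even, [1, 2, 3, 4]))
lazy_calls = list(calls)

calls.clear()
# Strict: the list comprehension applies the predicate to every element first.
strict_result = any([is_even(n) for n in [1, 2, 3, 4]])
strict_calls = list(calls)
```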

http://existentialtype.wordpress.com/2011/04/24/the-real-point-of-laziness/
John Hughes. "Why functional programming matters". The Computer Journal - Special issue on lazy functional programming. Volume 32 Issue 2, April 1989.

so, some languages offer optional lazy sequences; some languages offer lazy sequences by default; and some languages offer lazy control flow by default. One argument for lazy control flow by default is that otherwise 3rd party library functions will tend to be written in a strict fashion, which means that they can't be applied to infinite sequences (the catchphrase for this argument among lazy proponents is "laziness is the right default").

downsides:

space complexity
http://www.haskell.org/haskellwiki/Stack_overflow

todo https://en.wikipedia.org/wiki/Evaluation_strategy

Chapter : operations

maps and folds

can be provided as library functions if the language has first-class functions

also called vectorized

list comprehensions

perl contexts

types of =s

value equality for values, memory equality for references
equivalence via __eq__ protocol or overridden infix operator
equivalence plus implicit type coercion
- warning: depending on how implicit type coercion works, this can be non-transitive! in js this is the '==' operator, and it is non-transitive: 0 == '' and 0 == '0', but '' != '0'. transitivity is a requirement for something to be mathematically considered an 'equivalence relation'
isa
problem with structural equality taking Any,Any: no type error if values are of different types
problem with implicit conversion interaction; example from http://scalada.blogspot.com/2008/03/implicit-conversions-magical-and.html e.g. string1.reverse().reverse() != string1 if .reverse implicitly converts from String to RichString
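A short Python sketch of the first two kinds of equality in the list above, plus the "no type error across types" issue. The `Money` class is hypothetical.

```python
a = [1, 2]
b = [1, 2]

value_equal = (a == b)     # same contents
same_object = (a is b)     # memory (identity) equality: two distinct lists


class Money:
    def __init__(self, cents):
        self.cents = cents

    def __eq__(self, other):
        # Returning NotImplemented for foreign types makes == fall back to
        # identity, so comparing Money with an int is False, not an error.
        if not isinstance(other, Money):
            return NotImplemented
        return self.cents == other.cents


custom_equal = (Money(100) == Money(100))
cross_type = (Money(100) == 100)
```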

Chapter : extensible operations

operator overloading

esp. useful in numeric code dealing with vectors or matrices
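A minimal sketch of operator overloading for a small vector type in Python, where operators are overloaded through specially named methods. `Vec2` is a hypothetical class, not part of any library.

```python
class Vec2:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Invoked for: vec + vec
        return Vec2(self.x + other.x, self.y + other.y)

    def __mul__(self, scalar):
        # Invoked for: vec * number
        return Vec2(self.x * scalar, self.y * scalar)

    def __eq__(self, other):
        return isinstance(other, Vec2) and (self.x, self.y) == (other.x, other.y)

    def __repr__(self):
        return f"Vec2({self.x}, {self.y})"


# Usual precedence applies: the multiplication happens before the addition.
v = Vec2(1, 2) + Vec2(3, 4) * 2
```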

object protocols and magic methods

e.g. Python

http://www.rafekettler.com/magicmethods.html

"pervasive protocols for built-in types"

adding methods to existing types

implicit coercions

example of print statements on numbers; autoconvert to strings

example from http://scalada.blogspot.com/2008/03/implicit-conversions-magical-and.html e.g. add .reverse() operation to String class by creating RichString class and adding it there, then giving an implicit conversion

potential issue with implicit conversions and equality (example from http://scalada.blogspot.com/2008/03/implicit-conversions-magical-and.html e.g. string1.reverse().reverse() != string1 if .reverse implicitly converts from String to RichString )

can slow down compilation

e.g. http://stackoverflow.com/questions/3606591/why-does-intellij-idea-compile-scala-so-slowly/3612212#3612212 e.g. "Most likely the influence of the number of implicits in scope is in some way multiplied by the actual number of implicit applications, because at each implicit application the compiler must ensure one and only one implicit “heals” the candidate compiler error. In other words, the compiler can't stop looking at implicits once it finds one that solves the type error at hand; it must keep on looking to make sure one or more other implicits don't also solve that same type error. Thus the compile time cost of implicit use is likely related to the number of implicits in scope multiplied by the number of times the compiler must check all those implicits (i.e., the number of implicit applications). I say “likely,” because as yet we haven't written a script to verify this theory. " -- http://www.artima.com/articles/compile_time.html

because of the issues with implicit conversions, many languages do not support them

transitive implicit conversions

scala: limit direct implicit conversions, but can chain via implicit parameters

example from http://stackoverflow.com/questions/5332801/how-can-i-chain-implicits-in-scala/15334997 possibly see http://www.scala-lang.org/old/node/114 for background? dont think its needed tho

todo: read http://stackoverflow.com/questions/5598085/where-does-scala-look-for-implicits and see if its useful

typeclasses

C++ concepts

interfaces

overloadable array subscripts

prototype inheritance

monkey patching

Chapter : modules and encapsulation

forms of import

#include: direct text insertion
import a,b,c from f
from f import *
import f hiding a,b,c (i.e. import everything exported from f, except for a, b, and c)

ML?

public/private/protected vs consenting adults protocols

inheritance

delegation

interface embedding (Go: http://golang.org/doc/effective_go.html#embedding )

Subclassing vs. composition

Subclassing is an OOP idea where instances of the subclass inherit, by default, all of the fields and methods of the superclass, and then can selectively override some of these methods. The overriding methods have access to the internal fields and methods of the superclass, just as if they were written in the superclass (in some languages e.g. Java this can be restricted by using 'private', as opposed to 'protected', access modifiers). In other words, the overriding methods in the subclass are 'behind' the encapsulation boundary, just like the internal members of the superclass, and as opposed to the clients of the class. In typed languages, any instance of the subclass is considered to be of the type of the subclass, but also to be of the type of the superclass. A key property of subclasses in most OOP languages is the Liskov substitution principle, which states that an instance of a subclass should be usable anywhere an instance of the superclass is expected. Subclassing is used to add functionality beyond what is offered by the superclass, or to implement the superclass in a different way.

Composition is a related but different way to achieve the same goals. In composition, instead of having a subclass, you create a class which has a member which contains an object of what would have been the superclass. Then for the pseudo-subclass you write methods wrapping all of the public methods of the superclass. The idea is that you should be able to substitute any instance of the pseudo-subclass for any instance of the pseudo-superclass, i.e. the Liskov substitution principle should apply. Now instead of overriding methods in the subclass, you modify the wrapper methods in the pseudo-subclass.

The main advantage of composition is that it is more decoupled. This is because the pseudo-subclass is outside the encapsulation boundary of the pseudo-superclass, and does not have access to the pseudo-superclass's non-public fields or methods. When the programmer of a superclass changes its non-public fields or methods, this might break subclasses; but it will not break pseudo-subclasses which only interact with the pseudo-superclass via composition.

Some advantages of subclassing are (a) you don't have to bother to wrap the methods of the (pseudo-)superclass, (b) in most OOP languages, the type system recognizes instances of a subclass as also being of the type of the superclass, but does not recognize objects which extend the superclass by composition as being of its type, and (c) you can access the non-public fields and methods of the superclass. To elaborate on (c), say that you have a class A, and you need to make a class B which is almost exactly the same as A except for one small change. You can make this change by subclassing A and overriding one method which is called from various other methods of class A. Unfortunately, the programmer of A didn't foresee that you might want to change just the part that you do, and so there is no way provided for a composing class to change just that part without also reimplementing a large part of A's non-public fields and methods. Using a subclass, you can override the method at issue, and when other methods in the superclass call that method, they'll get your implementation. With composition, you would have to reimplement a lot of code.

(c) may be a little bit confusing because this very same thing is the opposite of what is claimed to be an advantage for composition. The reason is that accessing internal fields and methods of a superclass has both advantages and disadvantages; an advantage is that you can change the behavior of the superclass in a fine-grained way, a disadvantage is that you become tightly coupled to the implementation of the superclass.
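Point (c) can be sketched in Python. All class names here are hypothetical. The subclass override is picked up by the superclass's own `greet`, whereas the composing wrapper cannot change what the wrapped object calls internally.

```python
class Greeter:
    def name(self):
        return "world"

    def greet(self):
        # Calls self.name(), so a subclass override is picked up here.
        return f"hello, {self.name()}"


class SubGreeter(Greeter):
    def name(self):
        return "subclass"


class WrappedGreeter:
    """Pseudo-subclass: holds a Greeter and wraps its public methods."""

    def __init__(self):
        self._inner = Greeter()

    def name(self):
        return "wrapper"

    def greet(self):
        # Delegation: the inner object's greet() still calls its own name().
        return self._inner.greet()


sub_greeting = SubGreeter().greet()
wrapped_greeting = WrappedGreeter().greet()
```

To get the subclass's result by composition, `WrappedGreeter.greet` would have to reimplement the body of `Greeter.greet`, which is the reimplementation cost described above.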

Chapter : security

Ambient authority

"A subject, such as a computer program, is said to be using ambient authority, if it only needs to specify the names of the involved object(s) and the operation to be performed on them in order for a permitted action to succeed.

...

For example, suppose a C program opens a file for read access by executing the call:

 open("filename", O_RDONLY, 0)

The desired file is designated by its name on the filesystem, which does not by itself include authorising information, so the program is exercising ambient authority.

...

...if the program should be able to access an object when acting on its own behalf but not when acting on behalf of one of its clients (or, on behalf of one client but not another), it has no way to express that intention. This inevitably leads to such programs being subject to the Confused deputy problem.

The term "ambient authority" is used primarily to contrast with capability-based security, in which executing programs receive permissions as they might receive data, as communicated first-class object references. This allows them to determine where the permissions came from, and thus avoid the Confused deputy problem. However, since there are additional requirements for a system to be considered a capability system besides avoiding ambient authority, "non-ambient authority system" is not just a synonym for "capability system". " -- http://en.wikipedia.org/wiki/Ambient_authority
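The contrast in the quote can be sketched at the level of function signatures. This is only an illustration of the two styles, not a secure capability system; both function names are hypothetical.

```python
import io


def count_lines_ambient(path):
    # Ambient authority: the function names a file and relies on whatever
    # permissions the whole process happens to have.
    with open(path) as f:
        return sum(1 for _ in f)


def count_lines_capability(readable):
    # Capability style: the caller hands over an already-opened object,
    # so this function can touch only what it was explicitly given.
    return sum(1 for _ in readable)


n = count_lines_capability(io.StringIO("a\nb\nc\n"))
```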

Capability-based security

todo: what are the "additional requirements for a system to be considered a capability system besides avoiding ambient authority" mentioned above?

Links:

todo http://en.wikipedia.org/wiki/Capability-based_security

Chapter : linked data

Pointers

References

Aliases

variables vs. (immutable) values

Chapter : managing state

call-by-reference

call-by-value

class-based oop

prototype-based oop

multiple inheritance

variables that don't vary: haskell-style let

and how let can make you break your functions into parts

variables that don't vary: scala-style val

e.g. scala val

C.O.W.

e.g. Apple Swift 'let'

note: in Apple Swift, "The value of a constant doesn’t need to be known at compile time, but you must assign it a value exactly once." -- https://developer.apple.com/library/prerelease/ios/documentation/Swift/Conceptual/Swift_Programming_Language/GuidedTour.html#//apple_ref/doc/uid/TP40014097-CH2

mutable variables (that do vary)

e.g. scala var

ref and deref

compile-time evaluated constants

sometimes error if cannot be evaluated at compile time

python's with

RAII

finalizers

closure

" Closures Because Common Lisp is lexically scoped, when we define a function containing free variables, the system must save copies of the bindings of those variables at the time the function was defined. Such a combination of a function and a set of variable bindings is called a closure. " -- http://ep.yimg.com/ty/cdn/paulgraham/onlisp.pdf

closures vs. references

" upvote

kyllo:

Here's an SML example:

val a = 1;

val a = 1 : int

fun f() = a;

val f = fn : unit -> int

f();

val it = 1 : int

val a = 2;

val a = 2 : int

f();

val it = 1 : int

a is still bound to 1 as far as f() is concerned

Now in Javascript:

> var a = 1;

undefined

> function f(){return a};

undefined

> f();

1

> a = 2;

2

> f();

2

If you bind a val, then you define a function that references that val, then you later shadow the val binding, then call the function again, the function still sees the earlier val binding, because it's a closure of the environment at the point where the function was defined, not at the point where it was called. This is unlike variable assignment in imperative languages. " -- https://news.ycombinator.com/item?id=6278047
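Python behaves like the JavaScript session here: a function looks up a free variable when it is called, so a later assignment is visible. A sketch follows; `make_const` is a hypothetical helper showing one common way to freeze the value instead.

```python
a = 1


def f():
    return a        # looks up the variable a when called, not a saved copy


first = f()
a = 2               # assignment to the same variable, as in the JavaScript session
second = f()


def make_const(value):
    # Capturing a parameter fixes the value seen by the inner function.
    def g():
        return value
    return g


g = make_const(a)
a = 3
third = g()
```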

undo and versioning

http://www.vpri.org/pdf/tr2008003_experimenting.pdf

in expr block: execute block in/upon the world that results from evaluating expr

sprout (copy-on-write from sprout parent world like in languages with prototype inheritance)

commit (write changes from a sprout back to the parent)

actor functions with state

"In Erlang, actor state is set by recursively calling the actor function with the new state as an argument."
clojure's agent

auto-initialization

watch out for non-uniform auto-initialization policies, which are hard to remember: e.g. in C, variables with static storage duration (globals and 'static' locals) are initialized to 0 by default, but automatic local variables are not.

Chapter : managing state: scoping

lexical

if local vars are kept on a stack, could use separate stacks for the call stack and for variables

if a data stack, could use display registers (many registers, each pointing to one ancestor stack frame) to aid quick lookup of local variables in ancestor scopes

easier to debug stuff than with dynamic scope. Older Lisps used to have dynamic scope, but Scheme and Common Lisp have lexical scope.

note: lexical scope enables closures

dynamic scoping

shallow binding

deep binding

js's with

function scope

hoisting of declarations

Chapter : real-time systems

actually i don't know much about constructs for real-time systems

Chapter: data languages

todo: i don't know all of the languages i list here, must look thru them and make sure they're correctly categorized

todo: lookup which of these are Turing-universal and which aren't and list that here

explain difference between a data model and a data representation language

data languages for representing composite data structures (JSON, RDF, RDFt, XML, S-expressions)
data languages for building composite data structures (todo, but i'm thinking things like templating languages)
predicate/selection/projection/matching languages (XPath)
query languages (SQL, LINQ) (probably QBQL, XQuery (Turing-universal), SPARQL, htsql, MDX)
matching/parsing languages (e.g. regex, grammar, perl6 rules, ometa)
transformation languages (e.g. grammar with restricted actions, tree/graph transformations, XSLT, probably Gremlin)
- todo: find other tree and graph transform languages in my notes

Chapter : team programming

encapsulation / scope modifiers: prevent your getting blamed for breaking the build when other people depend on code you told them not to depend on (vs. flexibility for others to reuse code without copying; flexibility for binary / type compatibility for existing code)

typing can help other team members when reading each other's code (although it can also make code less concise, if type inference is not used)

typing also helps by having the compiler enforce some architectural constraints, rather than trying to enforce them by social convention

readability becomes more important

languages with too many ways to do things may be problematic; otoh some people say the thing to do is to have the best programmers metaprogram a framework that everyone on the team then uses, and then it's fine

Chapter : I/O

printf

combining string interpolation, formatting, and I/O

console objects

e.g. sys.stdout in Python, cout in C++

Chapter : std lib

maps, folds, hof stuff, haskell prelude

IPC

erlang's supervision model

Chapter : resource management

garbage collection

and also resource collection, e.g. finalizers, Python with, C# using

the issue of ordering destructors in reference cycles (Python before version 3.4 just doesn't call them: http://eli.thegreenplace.net/2009/06/12/safely-using-destructors-in-python/ )

the issue of exceptions thrown within a destructor

C++ allows you to choose whether objects are on the stack or on the heap (heap iff created via 'new'). stack-allocated objects are released upon leaving scope (i.e. their destructors are guaranteed to be called then; contrast with Python finalizers, which are not guaranteed to be called until the object is garbage collected, which is nondeterministic, i.e. it's like they are always on the heap). C++ also provides 'delete' to delete things created by 'new'

note: python with allows you to implement a context manager using generators and a @contextmanager decorator: http://www.python.org/dev/peps/pep-0343/
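A minimal sketch of a generator-based context manager using `contextlib.contextmanager`. The `resource` function and the `events` log are hypothetical; the point is that the release step runs even when the body raises.

```python
from contextlib import contextmanager

events = []


@contextmanager
def resource(name):
    events.append(f"acquire {name}")
    try:
        yield name
    finally:
        # Runs on normal exit and when the body raises.
        events.append(f"release {name}")


try:
    with resource("db") as r:
        events.append(f"use {r}")
        raise ValueError("boom")
except ValueError:
    pass
```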

RAII

C++'s copy constructors, copy assignment, move constructors, move assignment. Assignments must sometimes consider deallocating the items that were previously in the place that you are moving the new things into.

C++'s new[] and delete[] (allocate and free a whole array, running the constructor or destructor of every element)

memory management

in C++ construction without 'new' allocates the object on the stack (so that it is automatically destroyed and deallocated when execution leaves the scope in which the object was created), and construction with 'new' allocates the object on the heap, from which the programmer must manually use 'delete' to deallocate it.

Chapter : interop

ffi

calling conventions

data interop

low-level control over memory layout

embeddability

https://en.wikipedia.org/wiki/Domain-specific_language

Chapter : links

https://en.wikipedia.org/wiki/Category:Programming_constructs

dunno where to put this, but:

there's a similarity between macro application, logic-based production rule firing, and PEG grammar application, in that in these cases you have a control structure that looks over the thing to be matched (a state), applies a set of matching rules to it in order to find a potential match, and then mutates the state according to what the rule tells you to do, which may be a function of the part of state which matched

todo: the first few pages in http://users.rcn.com/david-moon/PLOT/ had some good concepts

todo: sort this into more chapters

Chapter 24: Syntax
Chapter ?: Control
Chapter ?: Error handling
Chapter ?: Type
Chapter ?: Numbers
Chapter ?: Strings
Chapter ?: Metaprogramming
- contains: hacking the call stack, hacking the ENV, operations, hacking classes, syntax, macros etc, eval (todo generalize), misc
Chapter ?: Low level
Chapter ?: Alternative paradigms motivated by physics
Chapter ?: Misc constructs
Chapter ?: Other features