http://bartoszmilewski.com/2009/05/21/unique_ptr-how-unique-is-it/ :
some type properties:
i believe the previous is a complete description of Milewski's system (except for owner polymorphism, which i think can be done in a simple flexible way in Jasper with annotations anyway), but should ask Milewski to be sure. according to http://fpcomplete.com/the-downfall-of-imperative-programming/ , he's since embraced functional programming to avoid data races, but i'm guessing he still thinks the above methodology would do the trick.
in jasper:
additional operation: ':=', called "move": x := y is equivalent to x = y; y = null;. The idea is that x gets the pointer and y no longer has it; this can be applied even to unique pointers, whereas the normal '=', which means copy, cannot.
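(a sketch of the intended ':=' semantics, simulated in Python -- the Unique class and move method are made-up names, since Python has no real move:)

    class Unique:
        # sketch of a unique pointer: copy is forbidden, move transfers ownership
        def __init__(self, value):
            self._value = value
        def move(self):
            # the moral equivalent of x := y -- the source gives up its referent
            value, self._value = self._value, None
            return value

    y = Unique([1, 2, 3])
    x = Unique(y.move())     # x := y
    assert y._value is None  # y no longer owns anything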
you can declare a pointer with the unique type property (other examples of type properties are immutable, meaning "compiler, i demand that this value cannot be mutated", and const, meaning "compiler, i promise not to mutate this value through this reference")
his type system apparently provably avoids race conditions (both 'low level' and 'high level' race conditions, whatever that means). it does not prevent deadlocks.
he also says "Such system would be incomplete without some support for lock-free programming. There are very good reasons for the language to enforce sequential consistency. For comparison, C++ committee, after a long struggle, decided to allow non-SC primitives (weak atomics). Java only enforces SC for volatile variables. " then goes on to introduce the lockfree qualifier, which means "A lockfree variable is guaranteed to be always atomically accessed in a sequentially consistent way." He has noted elsewhere that 'low level' race conditions defeat even lockfree algorithms.
not quite sure what this means. i guess sequential consistency is a guarantee that sequential code will always execute sequentially? is he saying this is good and that C++ has gone too far? or is he saying that you only need it when you are writing lockfree algorithms, which is why it seems his language has SC off by default but on when the lockfree qualifier is used? i think the latter. he notes this example which i don't entirely understand:
"
Speaking of lock-free programming, here’s the implementation of the infamous Double-Checked Locking Pattern:
class Singleton<T> {
private:
    lockfree T _value;
public:
    T get() lockfree {
        if (_value is null) {
            synchronized(this) {
                if (_value is null)
                    _value = new T;
            }
        }
        return _value;
    }
}
A lockfree variable is guaranteed to be always atomically accessed in a sequentially consistent way. A lockfree method is not synchronized, but it can only operate on lockfree variables (or use a synchronized section). Even though the use of lockfree doesn’t introduce low-level races, it may create high-level races when accessing more that one lockfree variable at a time. But we all know that lock free programming is only for desperadoes. "
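(for comparison, here's the same double-checked locking written in plain Python -- a sketch only: the lock plays the role of synchronized(this), and CPython's GIL stands in, weakly, for the sequentially consistent 'lockfree' read of _value:)

    import threading

    class Singleton:
        _value = None
        _lock = threading.Lock()

        @classmethod
        def get(cls):
            if cls._value is None:            # unsynchronized fast-path read (the 'lockfree' part)
                with cls._lock:               # synchronized(this)
                    if cls._value is None:    # re-check under the lock
                        cls._value = cls()
            return cls._value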
"
Lessons learned from previous work
There are many commonalities in various approaches to race freedom.
Ownership should be built into the type system. In OO languages each object must have an owner. In C-like languages each piece of data must have an associated lock.
There should be efficient mechanisms for passing data between threads
By value
By reference. (The object being passed must be a monitor.)
As immutable data, by const reference
By unique reference using move semantics
Most of those ideas are expressible through type qualifiers. "
other sources: http://bartoszmilewski.com/2009/11/30/unique-objects/
http://bartoszmilewski.com/2009/09/22/ownership-systems-against-data-races/ http://bartoszmilewski.com/2009/06/15/race-free-multithreading-owner-polymorphism/ http://bartoszmilewski.com/2009/06/02/race-free-multithreading-ownership/
---
"Andrei insists that "shared" should guarantee the absence of low level data races. Whenever I give examples of how this makes performance unpredictable, I hear the answer: "cast shared away". Currently nobody complains about it because this feature hasn't even been implemented. "
" D1 to be discontinued on December 31, 2012 (after the end of the world) by andralexin programming
[–]DrBartosz? 7 points 8 months ago
I'd look at existing solutions (Chapel, C++AMP, and even Haskell) as a starting point for task-based parallelism and GPU. I like the ownership system and it would be nice to have it in the language, but that's not the highest priority. What I'd like to see is better support for move semantics/uniqueness, which is essential for thread/task safety. I'm also not sure what to think about the "shared" experiment. The idea seemed good at the time, but we could never agree (or I could never agree with the rest of the team) about the details. So now it seems like a half baked idea (essentially, in order to do anything interesting, you have to cast "shared" away). "
much earlier (i am writing this in aug 2012) he talks about http://bartoszmilewski.com/2008/07/30/sharing-in-d/ and http://bartoszmilewski.com/2008/08/11/sharing-and-unsharing-of-data-between-threads/ . if his objections are unchanged since then, the performance objections are:
" The first is performance–not for shared objects, mind you, but for the non-shared ones. Walter tells us that accessing a thread-local static variable adds between 1 to 3 instructions of overhead. That seems quite reasonable. Especially considering that in multi-threaded environment the use of global non-shared variables is rarely correct.
There is also a performance penalty when starting a new thread–all static variables it has access to have to be default-initialized, plus all module constructors have to be called. This might amount to quite a bit. We will recommend not to overuse global variables and module constructors. The way to amortize this cost is to create thread pools. "
another objection seems to be the difficulty of 'unsharing' objects, and of resharing them after unsharing them. this sort of thing is necessary if you want to get a shared object from another thread and then pass it into an unknown library.
also " Note to D programmers: The current semantics of D “shared” is slightly different from my proposal. For instance, it forces all embedded objects to be monitors (their methods must be synchronized by their own lock), requires explicit use of the synchronized keyword, and forces all access in non-synchronized methods to be sequentially consistent. (And it doesn’t guarantee freedom from races.)"
dunno if this is still his latest.
btw the 'promise' vs. 'demand' thing w/r/t const vs. immutable and lent vs. unique (lent should be called noalias or somesuch) seems pretty common (wasn't this also the deal with Haskell's existentially quantified types, or maybe with Haskell typeclasses vs. Java interfaces? or were those other things?). should have syntax for it, so you can just say 'i promise it will be immutable' or 'i demand it be immutable' without learning two keywords for each case and remembering which one is which (or at least, have a keyword naming convention like promiseImmutable / demandImmutable)
---
toread: http://bartoszmilewski.com/2011/01/05/using-f-sequences-and-pipelines/ http://bartoszmilewski.com/2010/09/11/beyond-locks-software-transactional-memory/
--- http://bartoszmilewski.com/2010/05/11/parallel-programming-with-hints/
---
" The standard concern is that new type qualifiers introduce code duplication. The classic example is the duplication of getters required by the introduction of the const modifier:
class Foo {
private:
    Bar _bar;
public:
    Bar get() { return _bar; }
    const Bar get() const { return _bar; }
}
"
in Jasper, the compiler should be able to infer that const-ness of the result holds iff certain of the args passed in are const
---
STM
" Back to shared memory, this time without locks, but with transactional support. STM, or Software Transactional Memory, is a relatively new paradigm that’s been successfully implemented in Haskell, where the type system makes it virtually fool-proof. So far, implementations of STM in other languages haven’t been as successful, mostly because of problems with performance, isolation, and I/O. But that might change in the future–at least that’s the hope. "
" In Chapel, you use the forall statement for loops or begin to start a task. "
" All three HPCS languages chose fine-grained tasks (activities in X10, implicit threads in Fortress), not threads, as their unit of parallel execution. A task may be mapped to a thread by the runtime system but, in general, this is not a strict requirement. Bunches of small tasks may be bundled for execution within a single system thread. "
" To make things more concrete, let’s say you want to distribute a (not so big) 4×8 matrix among currently available locales by splitting it into blocks and mapping each block to a locale. First you want to define a distribution–the prescription of how to distribute a block of data between locales. Here’s the relevant code in Chapel:
const Dist = new dmap(new Block(boundingBox=[1..4, 1..8]));
A Block distribution is created with a bounding rectangle of dimension 4×8. This block is passed as an argument to the constructor of a domain map. If, for instance, the program is run on 8 locales, the block will be mapped into 8 2×2 regions, each assigned to a different locale. Libraries are supposed to provide many different distributions–block distribution being the simplest and the most useful of them.
When you apply the above map to a domain, you get a mapped domain:
var Dom: domain(2) dmapped Dist = [1..4, 1..8];
Here the variable Dom is a 2-D domain (a set of indices) mapped using the distribution Dist that was defined above. Compare this with a regular local domain–a set of indices (i, j), where i is between 1 and 4 (inclusive) and j is between 1 and 8.
var Dom: domain(2) = [1..4, 1..8];
Domains are used, for instance, for iteration. When you iterate over an unmapped domain, all calculations are done within the current locale (possibly in parallel, depending on the type of iteration). But if you do the same over a mapped domain, the calculations will automatically migrate to different locales.
This model of programming is characterized by a very important property: separation of algorithm from implementation. You separately describe implementation details, such as the distribution of your data structure between threads and machines; but the actual calculation is coded as if it were local and performed over monolithic data structures. That’s a tremendous simplification and a clear productivity gain. "
" The combination of STM and PGAS in Chapel necessitates the use of distributed STM, an area of active research (see, for instance, Software Transactional Memory for Large Scale Clusters).
In Chapel, you not only have access to atomic statements (which are still in the implementation phase) and barriers, but also to low level synchronization primitives such as sync and single variables–somewhat similar to locks and condition variables. The reasoning is that Chapel wants to provide multi-resolution concurrency support. The low level primitives let you implement concurrency in the traditional style, which might come in handy, for instance, in MPMD situations. The high level primitives enable global view programming that boosts programmer productivity.
However, no matter what synchronization mechanisms are used (including STM), if the language doesn’t enforce their use, programmers end up with data races–the bane of concurrent programming. The time spent debugging racy programs may significantly cut into, or even nullify, potential productivity gains. Fortress is the only language that attempted to keep track of which data is shared (and, therefore, requires synchronization), and which is local. None of the HPCS languages tried to tie sharing and synchronization to the type system in the way it is done, for instance, in the D programming language (see also my posts about race-free multithreading). "
" Here’s the tongue-in-cheek summary of the trends which, if you believe that the HPCS effort provides a glimpse of the future, will soon be entering the mainstream:
Threads are out (demoted to latency controlling status), tasks (and semi-implicit parallelism) are in.
Message passing is out (demoted to implementation detail), shared address space is in.
Locks are out (demoted to low-level status), transactional memory is in. "
" Message passing’s major flaw is the inversion of control–it is a moral equivalent of gotos in un-structured programming (it’s about time somebody said that message passing is considered harmful). MP still has its applications and, used in moderation, can be quite handy; but PGAS offers a much more straightforward programming model–its essence being the separation of implementation from algorithm. The Platonic ideal would be for the language to figure out the best parallel implementation for a particular algorithm. Since this is still a dream, the next best thing is getting rid of the interleaving of the two in the same piece of code. "
" But if you send a message to one thread and set up a handler for the result in another, or have one big receive or select statement at the top to process heterogeneous messages from different sources, you are heading towards the land of spaghetti. If you’re not careful, your program turns into a collection of handlers that keep firing at random times. You are no longer controlling the flow of execution; it’s the flow of messages that’s controlling you. Again, programmer productivity suffers. (Some research shows that the total effort to write an MPI application is significantly higher than that required to write a shared-memory version of it.)"
STM:
" (Transactions unfortunately do not address one other issue, which turns out to be the most fundamental of all: sharing. Indeed, TM is insufficient – indeed, even dangerous – on its own is because it makes it very easy to share data and access it from multiple threads; first class isolation is far more important to achieve than faux isolation. This is perhaps one major difference between client and server transactions. Most server sessions are naturally isolated, and so transactions only interfere around the edges. I’m showing my cards too early, and will return to this point much, much later in this essay.) '" -- http://www.bluebytesoftware.com/blog/2010/01/03/ABriefRetrospectiveOnTransactionalMemory.aspx
---
" Everything I described so far is common to CSP (Communicating Sequential Processes) and the Actor model. Here’s what makes actors more general:
Connections between actors are dynamic. Unlike processes in CSP, actors may establish communication channels dynamically. They may pass messages containing references to actors (or mailboxes). They can then send messages to those actors. Here’s a Scala example:
receive { case (name: String, actor: Actor) => actor ! lookup(name) }
The original message is a tuple combining a string and an actor object. The receiver sends the result of lookup(name) to the actor it has just learned about. Thus a new communication channel between the receiver and the unknown actor can be established at runtime. (In Kilim the same is possible by passing mailboxes via messages.) " -- http://bartoszmilewski.com/2009/07/16/on-actors-and-casting/
" The D programming language with my proposed race-free type system could dramatically improve the safety of message passing. Race-free type system distinguishes between various types of sharing and enforces synchronization when necessary. For instance, since an Actor would be shared between threads, it would have to be declared shared. All objects inside a shared actor, including the mailbox, would automatically inherit the shared property. A shared message queue inside the mailbox could only store value types, unique types with move semantics, or reference types that are either immutable or are monitors (provide their own synchronization). These are exactly the types of messages that may be safely passed between actors. Notice that this is more than is allowed in Erlang (value types only) or Kilim (unique types only), but doesn’t include “dangerous” types that even Scala accepts (not to mention Java or C++). "
" The two topics that stood out were: functional programming and DSLs. It seems like every language tries to assimilate some functional elements in order to deal with parallelism. Still, the problem is that in most languages the added safety features for concurrency are not enforceable and don’t compose well. A compiler can potentially check whether a lambda passed to a parallel loop doesn’t modify shared state, but it can’t extend this check to functions called from that lambda. The solution would be an effect system, which includes side effects of a function in its signature, but effect systems are extremely hard to program with. Compare this with Haskell monads, which serve as a simple but safe effect system. Unfortunately many imperative programmers have fits when they hear the M word. (I remember C programmers having the same reaction to the word “object.”) "
---
" The especially interesting Microsoft talk was about support for asynchronous programming in C# and VB. The speaker was Mads Torgersen, the organizer of the conference. Mads described a coroutine-style programming model where you can block in any part of your code using await and, when an event is triggered, resume execution right where you stopped. In the meanwhile, your program can continue executing other code — on the same thread. (You might recognize here a continuation monad in disguise.) This is a perfect model for GUI applications that call asynchronous operations. I talked to Herb Sutter about it later and he said that a very similar thing is being proposed for C++ (I’m going to be on the C++ sub-committee for parallelism with him, and I’ll report more details). " --- " Herb also had a talk about C++11 and how different it feels from the previous versions. For me the really big deal is move semantics that tips the scales from passing by reference to passing by value — where passing by value is really passing by reference behind the scenes."
---
" The newest addition to C# is support for asynchronous interfaces (somewhat similar to Boost::ASIO). This is a hot topic at Microsoft because the new Windows 8 runtime is based on asynchronous API — any call that might take more than 50ms is implemented as an asynchronous API. Of course you can program to asynchronous API by writing completion handlers, but it’s a very tedious and error-prone method. Microsoft’s Mads Torgersen described how C# offers several layers of support for asynchronous programming.
But what caught my interest was how C# deals with composition of futures (they call them task objects). They have the analog of an aggregate join called WhenAll and an equivalent of “select” called WhenAny. However these combinators do not block; instead they return new futures. There is another important combinator, ContinueWith. You give it a function (usually a lambda) that will be called when the task completes. And again, ContinueWith doesn’t block — it returns another future, which may be composed with other futures, and so on. This is exactly what makes C# futures composable and, hopefully, C++ will adopt a similar approach. "
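(Python's asyncio has the same composition style, which makes a decent model of what's being described: gather is the analog of WhenAll, wait(FIRST_COMPLETED) of WhenAny, and awaiting inside a wrapper coroutine of ContinueWith; none of these block, they all just build new awaitables:)

    import asyncio

    async def fetch(n):
        await asyncio.sleep(0.01 * n)
        return n

    async def main():
        # WhenAll ~ gather: completes when all inputs complete
        print(await asyncio.gather(fetch(1), fetch(2), fetch(3)))  # [1, 2, 3]

        # WhenAny ~ wait(FIRST_COMPLETED): completes when the first input does
        tasks = [asyncio.create_task(fetch(1)), asyncio.create_task(fetch(5))]
        done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
        print([t.result() for t in done])  # [1]
        for t in pending:
            t.cancel()

        # ContinueWith ~ wrapping: returns a new awaitable instead of blocking
        async def then(task, f):
            return f(await task)
        print(await then(asyncio.create_task(fetch(2)), lambda x: x * 10))  # 20

    asyncio.run(main())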
" I noticed that there seems to be a problem with C++’s aversion to generalizations (I might be slightly biased having studied Haskell with its love for generalizations). Problems are often treated in separation, and specific solutions are provided for each, sometimes without a serious attempt at generalization. Case in point: cancellation of tasks. A very specialized solution involving cancellation tokens was proposed. You get opaque tokens from a factory, you pass them to tasks (either explicitly or by lambda capture), and the tasks are responsible for polling the tokens and performing appropriate cancellation actions. But this is an example of an asynchronous Boolean channel. Instead of defining channels, C++ is considering a special-purpose one-shot solution (unless there is a volunteer willing who will write a channels proposal). By the way, futures can be also viewed as channels, so this generalization might go a long way. "
http://bartoszmilewski.com/2009/03/10/futures-done-right/
http://fpcomplete.com/asynchronous-api-in-c-and-the-continuation-monad/
--- need *expandlist, expanddict syntax from Python (mb in our case expanddict is good for both, so just call it *)
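(for reference, the Python constructs in question:)

    def f(a, b, c=0, **extra):
        return a + b + c + sum(extra.values())

    args = [1, 2]
    opts = {'c': 3, 'd': 4}
    print(f(*args, **opts))     # expandlist / expanddict at a call site: 10

    head, *tail = [1, 2, 3, 4]  # the splat also works on the left-hand side
    print(head, tail)           # 1 [2, 3, 4]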
---
need notation for requesting O() notation in time and space of algorithms -- but also want notation for requesting approximate algorithms (even though this notation, unlike the other, changes the choice of what function is being requested)
---
in general, we have a lot of matching and handler stacks (exception trigger matching, structural type matching, handling a __set which might be overridden, searching through active lexically scoped namespaces for a name)
a general construct is needed
---
can __get s be overridden like __set s ?
do we need to allow __set s to be overidden if we allow tuples to be passed to fns?
yes and yes (the second yes: b/c it's not strictly necessary, it never was, but it makes it much more convenient for objects that want to define a whole virtual graph of objects inside themselves; o/w they'd have to create anonymous functions for each thing inside of themselves to hold its location within the network, and wrap the real methods that deal with __get and __set for everything in the network. it's much more convenient if the parent object can just demand to be directly passed the path to the child virtual object for which a get or set is being requested.)
actually, after thinking about it for a few more minutes, i don't see what passing the path to the parent is good for that can't be done just in the manner described above (namely, produce a child which delegates to the parent but with a 'path' keyword term). if there is a 'delegate' fn in the std lib then this is ez (note: delegation should be per interface). note: to make this work you have to have an analog of tail recursion, where a chain of delegations is automatically collapsed by the compiler and interpreter; o/w a very deep node would have to go thru a zillion delegate calls whenever you call '__get' on the deepest node. similarly, the 'path' needs to be optimized somehow so that you don't end up with a very big thunk when you get very deep -- again, though, a general solution is probably better here -- i think just strictly (e.g. not lazily) appending the last path element will solve the problem (see the sketch below).
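(a rough Python sketch of the 'child that delegates to the parent but carries a path' idea -- all names made up; the point is that each step strictly extends a flat path, so a deep node holds one path instead of a chain of delegate thunks:)

    class Node:
        # child view that forwards gets to a root object, carrying a flat path
        def __init__(self, root, path=()):
            self._root, self._path = root, tuple(path)

        def __getattr__(self, name):
            # strictly append the path element (no lazy chain of delegates builds up)
            return Node(self._root, self._path + (name,))

        def get(self):
            # one lookup on the root with the whole path, not N delegate hops
            return self._root.lookup(self._path)

    class VirtualGraph:
        def __init__(self, data):
            self.data = data
        def lookup(self, path):
            obj = self.data
            for key in path:
                obj = obj[key]
            return obj

    g = VirtualGraph({'a': {'b': {'c': 42}}})
    print(Node(g).a.b.c.get())  # 42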
---
need token syntax for:
execute this function, but if it throws a 'null' exception, turn that into a 'null' value. mb ~f.
make sure it interacts well with the maybe tower notation
should this syntax recursively apply itself within the function, so that inside the function, null exceptions are also caught and turned into nulls? e.g. what if the function is just a cache, actually getting the thing from somewhere else -- if the exception is uncaught then the cache won't work on null values. but you don't want to just willy-nilly mess up the calls inside a function someone else wrote. mb this is more of a convention than a macro -- mb by default it's not recursive, but implementors can override the pattern (e.g. say 'if i am called like ~f then do this, but the definition of plain f is this other thing'). seems to me like the maybe tower idea may have some clever way to take care of this automatically though, maybe in concert with dataflow tracing of the source of the value which is returned.
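(the non-recursive version of ~f is easy to sketch as a Python wrapper; NullError is a stand-in for the 'null' exception:)

    class NullError(Exception):
        pass

    def soften(f):
        # ~f: call f, but turn a NullError escaping THIS call into a None value
        def wrapped(*args, **kwargs):
            try:
                return f(*args, **kwargs)
            except NullError:
                return None
        return wrapped

    def lookup(d, k):
        if k not in d:
            raise NullError(k)
        return d[k]

    print(soften(lookup)({'a': 1}, 'b'))  # None instead of an exception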
---
if the language has guards then by default this permits the definition of partial fns; what if no guard triggers? does the function return null (undefined) or throw an exception (or throw an exception by default, which may be replaced by a null somewhere up the call chain, see above)?
--
mb my idea for exception handling makes exception raising kind of like coroutines.
mb coroutines are a specialization of concurrency.
--
we want to be able to replace any variable holding a value with a probability distribution, define how primitives affect probability distributions, and then run a naive function that thinks it's dealing with ordinary numbers, and get the resulting probability distribution. and we want to do other similar things.
i think typeclasses do this.
remember to check out ML modules.
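(a small sketch of the 'replace values with distributions' idea in Python, using operator overloading where a typeclass would go; note the usual caveat that the two uses of x in naive below are treated as independent draws:)

    from itertools import product

    class Dist:
        # discrete probability distribution: {outcome: probability}
        def __init__(self, probs):
            self.probs = dict(probs)

        def _lift(self, other, op):
            other = other if isinstance(other, Dist) else Dist({other: 1.0})
            out = {}
            for (x, p), (y, q) in product(self.probs.items(), other.probs.items()):
                z = op(x, y)
                out[z] = out.get(z, 0.0) + p * q
            return Dist(out)

        def __add__(self, other): return self._lift(other, lambda a, b: a + b)
        def __mul__(self, other): return self._lift(other, lambda a, b: a * b)
        def __repr__(self): return repr(self.probs)

    def naive(x):       # written as if for ordinary numbers
        return x + x * 2

    die = Dist({1: 0.5, 2: 0.5})
    print(naive(die))   # {3: 0.25, 5: 0.25, 4: 0.25, 6: 0.25}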
---
! could be the assertion character -- which is also possibly the type annotation character. !! could mean a compile-time assertion.
f ^annotation1 ^annotation2 could annotate f
$ could be interpolation
[] could be normal G-constructor, but something else like [" could be G-constructor where keys are implicitly quoted (inside that, $ could be used if a variable is needed instead)
---
bind: identify two nodes
(um, mb we should reserve the '=' sign in graph constructors for this...)
also, we need an 'unbind' or 'fork' or 'dup' or 'free' or 'decouple' or something.. (so do bound nodes secretly maintain their separate identity? mb there are two kinds of binds.. also, does bind also bind meta attrs? probably not, probably it's per view..)
something like Python's % and Python's repr (used to be `blah`, which was more handy) would be helpful
Subject: Three defaults for parsing symbols
"symbol" $symbol ?symbol
$symbol is the 'typical' default, in which a symbol is replaced by its value during evaluation; e.g. if you say: "d = e", then $ is the default for the right hand side: the symbol "e" will be evaluated and the result assigned to symbol d.
e.g. inside a Javascript-style graph constructor, given [" ], e.g. [" dog = 3 ], the left hand sides are shortcuts for quoted strings, e.g. [" dog = 3] is short for [ "dog" = 3 ].
?var is formal parameters, so if "f ?a = $b", that's a fn defn of function 'f' with one formal parameter, 'a'. This is the default on the left hand side of fn defns, e.g. if you say 'f a = b', that's the same as "f ?a = $b"
A more unusual function could be given by:
a = 3 f $a = ?b
which means "function f(3) = symbol 'b'"
can apply ", ?, $ to envirojments (literal constructors) to switch defaults
Two repetitions = exact, one Rep means common shortcut,
e.g.
[" dog = 3] is short for [ "dog" = 3 ]
["" dog = 3] is short for [ "dog" = "3" ]
(" dog cat) is short for ("dog" "cat")
what about symbols, like :sym in ruby? 'sym like in Lisp?
Haskell-style pattern matching in function calls; Haskell-style cons in patterns; except patterns are first-class objects, and they can have variables (statically, at the time of creation) interpolated into them, using $varname as the variable interpolation operator.
also, pattern matching should have multiple patterns ANDed together and within that, be 'two-way', e.g. unification (see http://www.haskell.org/tutorial/patterns.html ). so i guess it's like Prolog. no need for NOT operator. So i guess it's like type matching.
also need something like Haskell's 'as' operator in patterns (see above link). i guess a good way to do this is to have the variables being bound to just be annotations on the pattern's expression tree, rather than require that they be nodes.
can interpolate variables into patterns with $var. formal parameters in patterns are ?var (this is the default, tho, so 'var' is synonymous with '?var' inside a pattern). but what if you want to interpolate a variable containing a pattern? $var if you just want the pattern as a value, but $$var if you want to integrate it into the current pattern. mb $$$var is how you would place the contents of the expression tree inside the variable into the current position before evaluating it at all (like a C preprocessor macro)?
---
hmm, in general, how to interpolate a dynamically constructed subexpression tree held in a variable into a given position before macro expansion? that would itself be a form of macro expansion, so how is macro expansion ordered?
the previous idea is a start; $var puts the value of the var in (as a value); $$var inside a literal puts the value of the var, if the var is of the same type as the literal, into the literal as a subexpression (e.g. before the literal is constructed, although in many cases i guess this will end up the same as $var); $$$var interpolates the variable's value into the expression tree early in the interpreter stages (like a C preprocessor macro).
how to generalize that? it seems like the more $$s, the earlier in the interpreter stages that the interpolation must happen.
Connection to nil tower, handler stack?
--
should # be 'count' ('len'), ## be 'lazy', @ be 'macro'?
or is ! 'strict', @ 'perspective'? we still need a 'lazy'.
perhaps '?' is 'lazy'? but '?' is 'variable'. maybe '?' is 'lazy', '??' is 'var'.
actually i don't think we want to mix 'lazy' and 'var'. 'lazy' is the opposite of '!', but 'var' is the opposite of '$'. 'lazy' and '!' concern when to evaluate, whereas $ and 'var' concern whether to evaluate.
mb just '!' for strict, '-!' for lazy. mb '$' for interpolate, '-$' for 'var'.
also, need a recursive strictify. '!!'?
--
both '-!' and '-$' are boundaries; lazy is a recursive strictify boundary, '-$' (equivalent to lisp backquote) is an evaluation boundary
wait, so is -$ equivalent to symbol? i guess so.. and is '$' equivalent to 'eval'? i guess so..
or should Eval be $$$?
should have some syntax for custom boundary markers.. could just use annotations; e.g. ~lazy is equivalent to '-!', ~var is equivalent to '-$'.
don't forget that the placement of edges to views is supposed to be dynamic by viewpoint, and that this mapping can be changed by some sort of meta command.
now, what about overriding __get? mb that is even more meta, e.g. bypassing the assignment of edges by type
---
i think ! should be the strictify operator.
in fact, perhaps strictify and macros can be merged. they both have to do with telling some stage of the interpreter, 'go ahead and evaluate this now, even though normally it wouldn't be evaluated until some later stage'. ! is the 'now' operator. ! is strictify, !! is macro. in general these are all shortcuts for things like '!strictify' and '!macro', which are 'now markers' targeted at various stages of the interpreter (i guess that means they are graph annotations)
---
btw a fundamental operation is converting between a graph and the graph of cosets under some homomorphism
a fundamental operation is converting between a graph and the graph where its nodes are annotation of certain node types (going the other way: taking certain node types and turning them into annotations)
and note that the 'bind' operation is usefully applied to the expression tree during substitution
---
'now' means, 'take something which was being discussed hypothetically and actually do it'. it could also be described as the 'actualize' operator or the 'for real' operator. i guess the opposite of this is '?', the 'hypotheticalize' operator. '?' is in general a barrier, a wrapper.
a fundamental operation is converting between a graph and the graph where barriers of a certain type are leaf nodes
so perhaps '?' not '~' should be somehow worked into the nil tower?
the reason that human 'is' (which i am proposing that == should mimic) treats both equality and isa is that it is really saying 'has-attribute'; and if the thing on the right is a single thing, then implicitly we are talking about the attribute of being that thing. E.g. 'i am human' means 'i have the attribute of isa-human'. Attributes naturally fall into a hierarchy; e.g. if 'i am an animal', this implies 'i am alive'.
must make sure that list comprehensions are convenient
so, mb ~s have to do with being okay with exceptions.
by default, if f raises an exception or returns nil, then the exception propagates up through the caller.
~f means 'i'm ok with f returning nil, don't treat this as an exception'
~~f means 'i'm ok with f returning nil, don't treat this as an exception, and also wrap the nil using the exception tower' (optional argument: the label associated with the wrapping)
~~~f means 'catch all exceptions and turn them into nils, adding two levels to the exception tower'. optional argument: exception filter.
in all cases there is an optional argument for an exception handler.
maybe generalize the optional argument for the exception filter, too.
how to syntactically separate the optional arguments for ~ from the arguments for the underlying?
mb some sort of construct like ~f ~filter=() ~handler=() ?
this would then be generalized to allow multiple fns to be grouped together and mixing their arguments, with some precedence relation determining which fn is wrapping which other..
interesting, i like that..
mb ~ should be reserved for that instead..
or `
ah, i see, just use the same symbol multiple times to denote different layers of wrapping
e.g. ~except f a b c ~filter=() ~handler=() would be equal to
except (f a b c) filter=() handler=()
doesn't really seem to be too helpful though, does it? just use the darn parens if you want to do that in general..
so mb using ~ just for exceptions is good enuf
btw ~ and ! are very good symbols because they are in the corner, mb should use them for something more common
like mb use ~ for annotations in general, or for map, or for count, or for assertions/type annotations, or views
hmmm ~ would also be good for footnotes. this could be combined with the exception handling syntax somehow. footnotes are for handling unusual cases, and for labeling hooks, right? hmm..
mb f~blah means "do f and then wrap it with footnote 'blah', if there is one. 'blah' is also the name of a hook interface point". there is a convenient syntax for catching exceptions in footnotes, as well as ~~ ~~~ things sort of like described above.
multiple OOB channels, e.g. STDERR, STDINFO, STDOUT, except not just character streams...
mb what is sent via each channel is a 'compound message', which is a set of nodes.
should views have a syntax? e.g. @?
---
besides source filters, need an easy way to chain two programs together in the same environment, e.g.
jas -e common_imports.jas -e myfile.jas
where common_imports is executed first, and then myfile is executed in the same environment, so common_imports can import things into the environment for myfile (see the sketch below)
there should also be an easy way to do this in the Jasper header, in addition to the source filter syntax
the Jasper header should also allow #s or somesuch to put in stuff for the version control system
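(a crude Python model of the 'run several files in one shared environment' behavior -- file names hypothetical:)

    # jas -e common_imports.jas -e myfile.jas, approximated with exec():
    import sys

    env = {}                     # one shared environment
    for path in sys.argv[1:]:    # e.g. common_imports.py myfile.py
        with open(path) as f:
            exec(f.read(), env)  # later files see names bound by earlier ones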
hmm if patterns define types. thinking about logics for patterns. the usual tradeoff b/t expressiveness and feasible inference.
since types are in a hierarchy, this suggests description logics.
i feel like we can cut out negation somehow. but not quite sure..
some types, like 'immutable' or 'functionally pure', are 'purity' types, meaning that they are true whenever some condition of impurity never obtains in some transitive/data-flow-y manner.
others, such as Integer, are 'classy' types, defined as subsets of other classy types. in this case they aren't true unless you have a reason to believe that they are.
so it seems like you may want to write some notation and then put a 'not' in front of it. but you'll still never need a 'not' inside of it.
mb logics without negation are called "positive"?
i think the 'purity' types correspond to positive 'impurity' types, e.g. 'mutable', which we can use negation-as-failure semantics on, e.g. if you never see any code that would mutate this value along the dataflow, you may consider it immutable.
description logics have negation, and are PSPACE-complete (at least as hard as NP-complete), so that's pretty hard (but hey, at least it's decidable)
positive first-order logic is NP-complete (but hey, at least it's decidable) ( http://ecommons.library.cornell.edu/handle/1813/7017 First Order Predicate Logic Without Negation is NP-Complete by Kozen, Dexter)
do we really need both universal and existential quantifiers in our patterns, though?
hmmm... if there exists any mutation along the dataflow, then the variable must be mutable, so there's a there-exists. and a list of ints must contain all ints, so there's a for-all. but perhaps these are two different kinds of reasoning? e.g. it doesn't seem like you would ever need a forall if you are looking at a list of known types and trying to deduce new types, e.g. if the thing has type Int and type single-word-in-memory you don't need to look at both of those to deduce type Number, that's just a there-exists. similarly, you might want the type of a list of ints to say each element is an int, but you don't care about a list of things-with-at-least-one-int, so there you only need forall, not there-exists.
but just thinking about graph patterns, sure, sometimes you want to match a node which has at least one arc going to (Int or Real or Complex), and sometimes you want to match a node whose arcs all go to Int.
mb just that you can have either forall or thereexists but not both on one expression?
but what if you want to look at a list and see if it's a list of ints; you look at every item in the list (follow each arc), look at each of their types (follow the type arcs in the meta view), and then look for at least one of these type arcs to terminate at Int. But this is saying, forall arcs in the list, there exists a type arc from the target of the initial arc that targets Int. So we are mixing them.
we want OR; if something is an Int or a Float then it is a Number.
and i guess the last statement shows that we want Implies too?
could we possibly get rid of AND?
hmmm.. mb the two different activities above are: reasoning from the types to allowed values, vs. reasoning from program code to types.
e.g. reasoning from program code to types, we see that if there is a possible mutation, we must have type Mutable.
reasoning from the types to allowed values, we see that a list of ints must contain all ints
e.g. if you see a single non-int go into that list, then you deduce the impure type "non-list-of-ints", which is like Mutable.
similarly, from Immutable, you deduce that for every step in the dataflow, there is no mutation.
i think i may be conceptualizing this wrong. first off, for straight-up graph MATCHING, the problem is simple (this is like checking if an assignment to variables satisfies some set of FOL statements). the inference complexity comes when you are not presented with a single graph, but rather with a statement in this language (a statement about the type of the graph, i.e. a graph pattern), and you must deduce merely from that whether it will match another graph pattern, without reference to a single particular graph.
also, when i say that we want to deduce that because something is one type, it is another type, the types form a lattice. we can require the users to explicitly construct the rungs of this lattice, we don't have to infer them from the definitions of the types.
also, when i say that the compiler should infer if matching this graph pattern sufficies to show that that graph pattern will be matched, we can require a user to use the assertion facility to write a proof of this for us; the compiler is only checking the proof, not writing it. so if the problem of finding the proof is NP, then of course the problem of checking the proof is in P.
so mb it doesnt matter if theorem-proving in this logic is NP.
mb we just tell the programmer:
wait a minute... i think i am misunderstanding quantifier here.. we won't have statements like 'for all arcs a from node n, and for all arcs b from node n, a = b'. we'll have statements like 'for all arcs a from node n, let the target of the arc be m. for all arcs b from node m, ...'. we'll only quantify once at each level. so no, the quantifiers won't be exchanged. but used this way, this isn't really FOL quantifiers. we can't express things like 'for all arcs a from node n, there exists an arc b from node n such that b's label = a's label + 1'.
also, the reachability is like * in regexes, leading to the analogy that the reachability component of these patterns is like a regex over linear paths instead of over linear strings. a character in a string has a single successor, but a node in a graph can have many (i.e. the outgoing arcs; or even the ingoing ones, i see no need to require the programmer to always respect directedness here, although that may be a good default, and some types of graphs may not permit the 'what links to here' query). so we need forall and thereexists to choose how to handle the multiple (Kant would say manifold) successors.
remember that we need to ask things about the reified edges, too, like what their label is
so i guess the type system will accept two kinds of definitive statements; "type A is a subset of type B" and "if something matches this pattern, then it is of type B". If you say "variable V is type B", then the system will enumerate all known types which are subsets of type B, and go through all known "if something matches this pattern, then it is of type T" for each such type T, looking for a match.
note: semantically, giving a pattern with "C or D" for type B is the same as saying "type C is a subset of type B" and "type D is a subset of type B". However, the system may not bother to infer this. (or will it? perhaps it should try replacing each type in the pattern with all of its subtypes..? or can we leave it up to the user to assert those things? no; if you have a type of 'list of numbers' and you assert that a list of Ints is of this type, that assertion should typecheck without you having to explicitly say it is a 'list of numbers' first. so it does try to match each type in the pattern against any subtype of that type; but what it does not do is try to match the patterns of each of those subset types (because then we'd be recursing..). so if complex numbers are defined as things of the form (float, float), and you have a list like [(float, float), (float, float)], but you haven't asserted that the (float, float)s are complex numbers, then this list won't typecheck as a list of numbers ([Num]), because although it knows that complex numbers are a type of number, it's not going to bother to match these patterns.)
if you have a 'list of list of numbers', and you give it a list of list of ints, this should typecheck, but why? because it substitutes the 'list of numbers' match structure directly into the list match structure at the top. so it'll substitute in match structures of things directly mentioned, but not their subset types.
note: this way we never have to actually go into the 'type' metanode in our patterns for typechecking (except mb if we are doing something weird, not sure if we NEVER have to..)
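(a toy Python version of that rule -- the type/pattern tables are made up; note how [1, 2, 3] typechecks as a list of numbers via the declared Int <: Num edge, but nothing recurses into subtype patterns:)

    SUBTYPES = {'Num': ['Int'], 'Int': [], 'ListOfNum': []}   # Int <: Num
    PATTERNS = {
        'Int':       lambda v: isinstance(v, int),
        'ListOfNum': lambda v: isinstance(v, list) and all(is_a(x, 'Num') for x in v),
    }

    def subtype_closure(t):
        out = [t]
        for s in SUBTYPES.get(t, []):
            out += subtype_closure(s)
        return out

    def is_a(value, t):
        # "value : t" holds if some known subtype of t has a pattern value matches
        return any(PATTERNS[s](value) for s in subtype_closure(t) if s in PATTERNS)

    print(is_a([1, 2, 3], 'ListOfNum'))  # True
    print(is_a([1.5], 'ListOfNum'))      # False: no pattern makes 1.5 a Num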
how can we make the syntax for these patterns like [Num] for list of numbers? i guess that's just universal quantification. but mutability is existential..
with mutability: i guess it's important to be able to say that fns by default mutate unless we say they don't.. so we allow negations at both ends of the pattern (at the atomic elements, and at the top), just not in the middle.
hmm.. also should have an 'exactly one' operator (in addition to forall and exists), and an 'is-injective' operator (no two arcs have the same target), and a 'is-function' (no two arcs have the same label) operator.
hmm, now that we aren't doing inference, why not allow FOL anyways?
also remember that we need a cons operator.. and i guess we need the analog of cons for graph nodes (i guess it's 'add this edge to this node'); probably other more graph-y ones will arise.. the cons operator and its like must be able to appear on the left hand side of function definitions. Haskell used : and Shen used |.
also need parametric types; e.g. "swap(a,b)" of type A,B -> B,A where A, B are type variables
hmm.. principal types are too restrictive (since we want to uniformly treat type qualifiers such as 'unique'), but mb each node should only have one type when seen from each perspective?
@@ could refer to the last mentioned perspective (lexical scope), sort of an 'ibid' for Jasper
mb -f means 'negation of f' (negation of True is False, negation of 3 is -3), and f- is 'inverse of f'
---
messages to handlers are a graph with a set of nodes distinguished as 'headers'. handlers match on any of the headers, and then get the graph, with that header as an entry point
---
another fundamental attribute of nodes is reverse arcs, that is, 'what links here'
---
http://www.shenlanguage.org/learn-shen/prolog.html
---
easy for loops like python enumerate
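(i.e. the Python idiom:)

    xs = ['a', 'b', 'c']
    for i, x in enumerate(xs):   # index and element in one go
        print(i, x)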
---
types:
there are three kinds of properties associated with types. one is propositions that hold true for every value of the type; these are things like "list of Ints". the second is propositions that hold true for every access of the value in the type; these are things like "unique" (seen as 'copy never applied to this'), "immutable". the third are things about how multiple accesses to the value relate to each other, or how access to this value relates to access to other values (this third definition is unsatisfactory because it is so general). included in this are things like 'unique' (seen as 'this variable's value cannot be changed without accessing it directly'), 'deterministic' (vs stochastic), 'no side effects', 'purely functional'.
a suitable notation for the first one need only use positive connectives, and possibly quantifiers, over values, with subtyping.
a suitable notation for the second one need only be expressed as marking each primitive operator with respect to the type, and then confirming that no negative primitive operator of the type is ever applied to a variable containing the value (e.g. if you want it to be immutable, then no mutation is ever applied)
i don't know how to capture the third one simply, since defining 'no side effects' seems to be impossible without a language for the semantics of the programming language. of course, 'no side effects' can be treated as the second case if we allow it as a primitive.
--- should name those. mb 'value types', 'expression purity types'.
---
types can have properties, e.g. conditions, and these can either be necessary or sufficient or both. necessary: every value of this type satisfies this condition, or in the second case above, every access to every value of this type satisfies this condition.
sufficient: if there is a value satisfying this condition, or if there is a variable whose accesses satisfy this condition, then it is of this type.
--- in fact, we should have a general notation that lets us write 'necessary', 'sufficient', and 'iff'
---
should we allow types whose semantics aren't fully defined, that is, ones for which a list of sufficient conditions is not provided? yes, because this is how we use the type system to attach arbitrary tags and qualifiers for use in metaprogramming.
---
note that we need to attach tags both to locations in the expression tree, and also to types.
mb ~ can be expression tree tags, not needing ^ also? or are footnotes different from arbitrary tags?
--
i guess if we use -> for logical implication then move should be :=
--
compiler/interpreter option to say if the necessary conditions of supertypes of the type of something are checked
--
'promise' has a different meaning for 'expression purity types' than its usual meaning; we are only promising that the expression purity holds within the function we are defining, not in general; but when we demand it, we demand it for all past and future, over all functions.
for value types, 'promise' has the usual meaning.
--
want a generic way to add modals and quantifiers to statements. 'promise' and 'demand' are modals.
mb should look at
http://en.wikipedia.org/wiki/Category:Logic_symbols
http://en.wikipedia.org/wiki/Modal_operator
interesting application of modal operators in http://en.wikipedia.org/w/index.php?title=Modal_operator&oldid=503040598
--
if Jasper is general enough it should enable us to give commands that people would describe as rather philosophical, rather than practical computer programs that we know how to execute. E.g. the 'immortality program':
Search for a program whose semantics is extremely similar to the input-output relationships of the agent known as Eugene Baylis Shanks III, who has the attributes in the language English "human being born in Tennessee in 1979 C.E. on the planet Earth in this universe". If such a program is found, execute it.
This contains all sorts of complex ideas, all of which however are generic enough that Jasper (or its std libs) could include them: search (without specifying how to carry out a search; e.g. looking for a program on the Internet or deterministically searching the space of all programs, or stochastically searching the space of all programs), a program, the semantics of a program, similarity, an agent, a name, the idea of attributes specified in a foreign language, a conditional on the result of search, the execution of a program.
note that placing the concepts of 'search' and 'similarity' in a std lib means giving Jasper the ability to express ontologies (not necessarily baking a large ontology into Jasper, however, although i think a bit of that would be a good idea also)
btw i used 'is' instead of 'are' on purpose because 'the semantics of a program' SHOULD be a singular concept.. perhaps, as noted earlier, in this context 'semantics' is interchangeable with the word 'meaning'..
---
slightly older notes:
How to tell which symbols in guards are unbound variables? later: ?x means 'variable x'
http://research.microsoft.com/en-us/um/people/simonpj/Haskell/guards.html
http://www.haskell.org/haskellwiki/Pattern_guard
http://www.haskell.org/tutorial/patterns.html
First class patterns
Two way patterns / want to support patterns that have: Two way matching (unification) with backtracking?
---
Src filters
Can use #! syntax (ignore the first one) (two!s for a Jasper src filter rather than external executable)
Whitespace and lines starting with # at the beginning of the file are file-level preprocessor directives
two things we need to do there: src filter, and run-this-first-in-the-same-environment (like Python 'from blah import *')
a boolean expression on the left hand side (lhs) indicates a barrier, e.g. f a and g b = a+b means wait until both f and g have been called then return a+b.
also need constructs to return different things to the different callers. should be subsumed into the general call/cc framework.
note that this sort of barrier is also an opportunity to exchange information between threads. combining this with its barrier-ness, what it is is a synchronized way to exchange information between threads (how to generalize this to an asynch or synch way? should asynch and synch be modes like any and all? can they be identified with any and all (mb: do this when any of the participating threads are ready; do this when all of them are ready)?)
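(a Python sketch of that barrier-as-exchange, for the synchronous case -- Join, f, g are made-up names mirroring the 'f a and g b = a+b' example:)

    import threading

    class Join:
        # 'f a and g b = a+b': block until both calls arrive, then both callers get a+b
        def __init__(self):
            self.barrier = threading.Barrier(2)
            self.slots = {}

        def f(self, a):
            self.slots['a'] = a
            self.barrier.wait()
            return self.slots['a'] + self.slots['b']

        def g(self, b):
            self.slots['b'] = b
            self.barrier.wait()
            return self.slots['a'] + self.slots['b']

    j = Join()
    out = []
    t = threading.Thread(target=lambda: out.append(j.f(1)))
    t.start()
    print(j.g(2))  # 3, once both calls have arrived
    t.join()
    print(out)     # [3]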
---
tree (or even semilattice?) over perspectives. subperspectives could have the same arcs but different arc labels, etc.
---
everything is late bound in the following sense: can redefine the meaning of any function called within a function that you call; effectively, you can execute a function that you call within a little environment 'bubble' that you pass in. the effect is that any symbol referenced from within a function is equivalent to referencing a formal parameter with a default value, the default value being the environment (the closure) in which the function was defined.
this seems like a category-theory diagram..
in other words, (prototype) inheritance = closure = lexical environment = default function arguments
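(a Python approximation of the bubble: a free symbol inside the function behaves like a default argument the caller may override:)

    def make_greet(env):
        # 'shout' is a free symbol resolved in the function's environment bubble
        def greet(name, _env=env):
            return _env['shout']('hello ' + name)
        return greet

    greet = make_greet({'shout': str.upper})
    print(greet('world'))                             # HELLO WORLD (closure default)
    print(greet('world', _env={'shout': str.title}))  # Hello World (caller's bubble)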
---
warnings fork and fire off an exception object; errors stop and fire off an exception object (divert the current thread of control)
exception objects carry a call/cc
---
mb should implement in clojure b/c it already runs on 3 platforms (jvm, clr, javascript)
---
the all/any mode is pretty common. e.g. instead of an if statement, we can have a switch statement (which is something like call(map())). But if multiple branches of the switch match, do we only run one matching branch of the switch, or do we run all branches that match? i like only giving those two options (rather than the C way of doing it that executes the switch sequentially) so that we might do the switch in parallel if they opt to match everything (which would be faster if the switch statement has millions of options, e.g. it is a parallel recognizer). I guess if you guaranteed that ONLY one would execute in case of the or, it would have to be serial, but if you said AT LEAST one then you could do that in parallel too (leaving room for an 'XOR' version that would mean 'EXACTLY one', which would be in serial)
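(sketch of the two dispatch modes; the 'all' list comprehension is the part that could fan out in parallel:)

    def switch(value, branches, mode='any'):
        # branches: list of (predicate, action) pairs.
        # 'any': run one matching branch (serial, first match wins).
        # 'all': run every matching branch (order-free, parallelizable).
        if mode == 'any':
            for pred, action in branches:
                if pred(value):
                    return [action(value)]
            return []
        return [action(value) for pred, action in branches if pred(value)]

    branches = [(lambda x: x % 2 == 0, lambda x: 'even'),
                (lambda x: x > 0,      lambda x: 'positive')]
    print(switch(4, branches, 'any'))  # ['even']
    print(switch(4, branches, 'all'))  # ['even', 'positive']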
---
btw, how to represent f()? that is, how to distinguish it from f? it's not quite 'eval' because that takes source code, not a function. "call"? should be syntax for that..
---
in general, can visualize control flow in this system as a bunch of nodes with arcs between them and bright points (the points of execution; the histories of those points are 'threads', although not necessarily hardware threads, as one hardware thread may emulate simultaneous execution) traveling round the network.
the barrier thingee described above is like two or more arcs converging into one node. a fork looks like a fork. a switch statement looks like a node with one arc coming in and many coming out, annotated by their conditions (like an FSM diagram).
---
i think i mentioned this before, but you can generalize the 'cons' in an lhs pattern to any function that provides a partial/sortof inverse. E.g. if you see a function defn with a cons pattern, e.g. "f cons(a,b) = a+b", and someone calls "f x", you just call "[a b] = cons^-1(x)" to try to match it.
i say "partial" inverse because the original function may not be injective (and we only need any one of the inverses, altough again, there's room for an any/all mode here), or it may not be surjective (in which case trying to invert it on some members of the range will lead the partial inverse function to raise a nil error/return nil).
so it's not just cons..
also, this means that 'inverse' (how to say it? '-'? e.g. '++.-' is the subtraction fn, ".-" is division? don't we need to switch perspectives there because .- is in some meta perspective? or is that the default, in which case, how to override that default?) should be one of the basic/widespread (meta?) properties of nodes.
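(the generalized-cons mechanics in Python -- uncons is the partial inverse; returning None models the nil error off the range:)

    def cons(a, b):
        return [a] + b

    def uncons(x):
        # partial inverse of cons: fails (returns None) outside cons's range
        if isinstance(x, list) and x:
            return x[0], x[1:]
        return None

    # "f cons(a, b) = a + len(b)", so calling "f x" matches by inverting cons:
    def f(x):
        m = uncons(x)
        if m is None:
            raise ValueError('no pattern matched')  # partial function
        a, b = m
        return a + len(b)

    print(f([10, 1, 2]))  # 12: a=10, b=[1, 2]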
---
read vs. write seem like another fundamental 'mode', but it can be expressed by a picture with two nodes representing two variables, and whether one is reading or writing from/to the other is just which way the directed arc is pointing.
--
mb we should make a list of the various ways that diagrams can be interpreted like that that seem related to programming language semantics -- language primitives can then be just a diagram, and which primitive way to interepret it.
time (control flow)
dependency
data flow/motion of data (like read/write)
--
Are patterns special cases of sets (types)?
--
Perspective change based on type pattern match?
But then how to determine the type of new arcs for __set?
So must define a newtype, possibly based on pattern match, first
Would like to infer type subset relations from pattern syntax
todo: i dont remember exactly what i was thinking when i wrote that..
perhaps i meant:
if you write a function f a = a.c, and you say in the function defn that 'a' is expected to be of class 'M', then 'a.c' is short for 'a@M.c'. For example, if object 'blueBox' is a 'Box' object which is also serialized to a 'File', then you can do blueBox@Box.open to "open the box" (e.g. call the Box class's .open method) or you can do blueBox@File.open to "open the file" (e.g. call the File class's .open method), and if you requested a value of type Box in the function header (pattern match), then the default used if you just type 'blueBox.open' is 'blueBox@Box.open'.
the question about the __set was that, what if the pattern that matched was 'type A or type B or type C' -- now when you add an arc, is it only added to the perspective that happened to be matched out of that OR? so i suggest defining a newtype, 'type D = type A or type B or type C' so that the arc would be added to type D (this is still not quite satisfactory, what if you wanted to add the new arc to ALL of type A and type B and type C in that case?)
and, i noted elsewhere that the pattern matching etc knows about subtype relations; and here i guess i said, we should be able to just look at the syntax defining necessary and sufficient conditions for types and infer subtype relations. if that's what i meant, then i no longer think that, because in general, that's a difficult inference problem.
-- aw heck, let's make ':' the (default) symmetric delimiter and ';' the (non-default) ordered delimiter after all..
so
name = input ; print 'hi'
is ordered and
x = 3 y = x + 2
is the same as x = 3 : y = x + 2 which is unordered
--
@@ shortcut for '(lexically) previous perspective'? e.g. 'x = f@blah 3 : y = g@@ 4' is the same as x = f@blah 3 : y = g@blah 4
--
Capital letters as punctuation?
--
need to support 'spreadsheet behavior', e.g. when you mutate one variable, others change with it; and also, 'watcher' event handlers can be attached to variables which fire when they mutate (allowing dynamically implemented spreadsheet-like propagation, or GUI updates)
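a minimal sketch of watcher-driven propagation in Python (Cell, watch, set are invented names):

class Cell:
    def __init__(self, value=None):
        self._value = value
        self._watchers = []   # callbacks fired on mutation

    def watch(self, fn):
        self._watchers.append(fn)

    def get(self):
        return self._value

    def set(self, v):
        self._value = v
        for fn in self._watchers:
            fn(v)   # spreadsheet-like propagation / GUI update hook

a = Cell(1)
b = Cell()
a.watch(lambda v: b.set(v + 1))   # maintain b = a + 1
a.set(10)
print(b.get())   # 11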
---
i feel like i need to get deeper into common graph operations for the std lib.. this will probably illuminate some better ways of doing things in general. things like merges, recursive descent broken by barriers, homomorphisms on nodes, diagrams (mappings of nodes and arcs).
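as a taste of what one of those std-lib operations might look like, a node homomorphism over an adjacency-dict graph (Python sketch; the representation is invented):

def map_nodes(f, graph):
    # graph: {node: set_of_neighbors}; f is assumed injective here so no nodes merge
    return {f(n): {f(m) for m in ms} for n, ms in graph.items()}

g = {1: {2}, 2: {3}, 3: {1}}
print(map_nodes(lambda n: n * 10, g))   # {10: {20}, 20: {30}, 30: {10}}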
--
Caps could be used for macros/keywords, symbols, constants (dont like for macros/keywords bc they are common)
-- Or how about for labels? No need for ~ then.
i like that.
--
need function composition operator to be syntactically ez so we can do Haskell-ish things like
count w = length . filter (==w) . words
to count the number of occurrences of word w in a string
we already have /; what is the diff in Haskell between $ and . again?
oh i see: '$' is just low-precedence function application (f $ x == f x), while '.' is composition ((f . g) x == f (g x)). so 'length . filter (==w) . words' is a function still waiting for a string, whereas 'length $ filter (==w) $ words' would apply filter to the function 'words' itself rather than to its result (a type error).
so introduce the 'o' operator. allow it to be a binary operator
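for reference, 'o' as a plain compose combinator, sketched in Python (compose is an invented stand-in, not the proposed syntax):

from functools import reduce

def compose(*fns):
    # compose(f, g, h)(x) == f(g(h(x))), i.e. f o g o h
    return reduce(lambda f, g: lambda x: f(g(x)), fns)

def count(w):
    # length o filter (==w) o words
    return compose(len, lambda ws: [t for t in ws if t == w], str.split)

print(count('the')('the cat and the hat'))   # 2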
---
the annoying thing about custom binary operators is really just that the reader has to remember how to parse them -- but they are convenient for some things, like a chain of function compositions; it's easier to read '# o filter (==w) o words' than 'o # / o filter (==w) words'
mb have some rule for which symbols are used for binary operators, and make them all left associative and equal precedence with one another.
e.g. things starting with '>' or '<' or '|' are binary operators.
special rule for things starting with =; they are the AND of all of the items in the chain. e.g. a == b == c means a == b and b == c.
arithmetic ops are NOT binary.
should ordinary greater than and less than be > and < or >> and <<? e.g.
x >> 1
x << 5
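(for what it's worth, the '= chains are ANDed' rule above is exactly Python's chained-comparison semantics:)

a, b, c = 2, 2, 3
print(a == b == c)   # False: (a == b) and (b == c)
print(1 < 2 < 5)     # True: (1 < 2) and (2 < 5)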
---
"
This is something I've always been curious about: exactly why Google prepends while(1); to their (private) JSON responses.
For example, here's a response while turning a calendar on and off in Google Calendar:
while(1);[['u',[['smsSentFlag','false'],['hideInvitations','false'],['remindOnRespondedEventsOnly','true'],['hideInvitations_remindOnRespondedEventsOnly','false_true'],['Calendar ID stripped for privacy','false'],['smsVerifiedFlag','true']]]]
I would assume this is to prevent people from doing an eval() on it, but all you'd really have to do is replace the while and then you'd be set. I would assume eval prevention is to make sure people write safe JSON parsing code.
I've seen this used in a couple other places, too, but a lot more so with Google (Mail, Calendar, Contacts, etc.) Strangely enough, Google Docs starts with &&&START&&& instead, and Google Contacts seems to start with while(1); &&&START&&&.
Does anyone know what's going on here?

It prevents cross-site request forgery.
Contrived example: say Google has a URL like gmail.com/json?action=inbox which returns the first 50 messages of your inbox in JSON format. Evil websites on other domains can't make AJAX requests to get this data due to the same-origin policy, but they can include the URL via a <script> tag. The URL is visited with your cookies, and by overriding the global array constructor or accessor methods they can have a method called whenever an object (array or hash) attribute is set, allowing them to read the JSON content.
The while(1); or &&&BLAH&&& prevents this: an AJAX request at gmail.com will have full access to the text content, and can strip it away. But a <script> tag insertion blindly executes the JavaScript without any processing, resulting in either an infinite loop or a syntax error. "
"semiglobal perspectives": immutable shallow binding dynamic scoping. e.g. you can attach a perspective to a variable such that changes to the values within that perspective are thrown out as you go back up the call stack (e.g. it's like a set of shallow binding dynamic scoping variables)
there is an implicit semiglobal perspective passed into each function. you can use this as e.g. semiglobal variables; but as the program expands and later on you want to split this into two separate states which assign different values to the same globals, you can choose to explicitly pass something in here. e.g. it's like
fn(x) is really fn(x, semiglobal_state=sg)
and you can do fn(x, semiglobal_state=my_custom_state) instead to override the default
other than this, there are no globals (can you put a reference in a semiglobal though if you want to pass info up the stack? todo)
why would you want to attach a semiglobal perspective to another object? consider I/O streams and defaults like how many decimal places to print when you send a floating point number to the stream. you dont want a subroutine that you call to change these defaults on you without you explicitly knowing about it, but you do want to change the default and have it affect the subroutines that you call.
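a rough Python sketch of the semiglobal mechanics (render, helper, DEFAULT_SG are invented names; the point is the shallow copy per call, so callee writes are thrown out on return):

DEFAULT_SG = {'decimals': 6}

def helper(x, sg):
    return f"{x:.{sg['decimals']}f}"

def render(x, sg=None):
    sg = dict(sg if sg is not None else DEFAULT_SG)   # shallow copy = shallow binding
    sg['decimals'] = 2   # affects helper and anything else render calls...
    return helper(x, sg)

caller_sg = {'decimals': 6}
print(render(3.14159, caller_sg))   # '3.14': render's override reaches its callees
print(caller_sg['decimals'])        # 6: ...but is discarded on the way back up
print(helper(3.14159, {'decimals': 4}))   # explicit override, like my_custom_state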
lexical scope for programming in the small: mutation allowed
outside of lexical scope, you are programming in the large: mutation controlled
i think ruby has a quick way to do this Python bit, mb we should too:
[results['x'] for results in resultss]
i.e. allow: [results.x for results in resultss]
lets say a superclass defines a method 'idle' which will be called when idle, and intends for it to be overridden with actual functionality. Now a subclass/mixin wants to put functionality in idle(), but wants to expose the same API to its own subclasses/other mixins; that is, in order to add functionality to idle() they override idle() (rather than inventing a new API, e.g. telling subclasses to leave idle() alone and to override only idle_main()). This subclass doesn't know anything about from where idle() was called; the only thing it can do is override idle(). How to accomplish this?
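one conventional answer, sketched with Python's cooperative super(): each layer overrides idle(), does its work, and delegates up, so the same API holds at every level (class names invented):

class Base:
    def idle(self):
        pass   # hook; the framework calls this when idle

class MixinA(Base):
    def idle(self):
        print("A's idle work")
        super().idle()   # preserve whatever the layers above do

class MixinB(MixinA):
    def idle(self):
        print("B's idle work")
        super().idle()

MixinB().idle()   # B's idle work, then A's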
Footnotes:
1. ? mb we should use > for directed arcs in graph constructors (and <> for a directed arc both ways, e.g. an undirected arc; of course there is also a fn that makes all directed arcs undirected, so to save a few keystrokes you can use > and then use that function on the graph).
-- old note
Indices
A language must be good at indexing
--
old note
map _2 _1   (to flip map)
_ x _ y f   (partial function application syntax; note: this was written when i was considering right-to-left syntax)
like haskell but more explicit, like lisp but with assumed parentheses
actually i still like both of these
if a _ is reqd for partial fn application, it is indeed more readable and explicit
actually, on second thought, that significantly reduces expressivity. what if you want to write a hof (higher-order fn) that flips the last and second-to-last arguments but is indifferent to the arity of the result? this would prohibit that. you could be only slightly more permissive by saying something like: in the body of a fn that requires the arity to be at least n, you must explicitly have that arity in fn applications, and use _ for partial application. but this may be too burdensome for the compiler, and seems a little silly, since what is required would change depending on whether a subroutine were broken out or left as part of a larger function. so nix that requirement. it's still good style tho, and still useful for when you want to partially apply arguments in the middle. so the net result is: trailing _s are NOT required (but are good style).
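(as an aside, that arity-indifferent flip is easy to sketch in Python precisely because nothing forces trailing _s; flip_last_two is an invented name:)

def flip_last_two(f):
    def g(*args):
        *rest, a, b = args   # works for any arity >= 2
        return f(*rest, b, a)
    return g

print(flip_last_two(lambda x, y: x - y)(5, 2))             # -3, i.e. 2 - 5
print(flip_last_two(lambda x, y, z: (x, y, z))(1, 2, 3))   # (1, 3, 2)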
wait, that was never the proposal in the first place. the proposal is: if you more than saturate a function's arity, you have to have parens where you saturate. e.g.:
;; yields_plusx x = fn* ?y , + x 1   ;; how does this parse? i dont think the , works... how is , different from / again?
yields_plusx x = fn* ?y +.x.1
print / yields_plusx 3 5     ;; arity error
print / (yields_plusx 3) 5   ;; correct
--
for continuations: could just let each label define a continuation (after it is passed through, but within the same fn). use an operator, mb $, in front of a label to denote that you are referring to its value as a continuation rather than labeling the current point. using that operator with the name of the current function refers to the continuation just before it is called (that is, equivalent to the 'return' command in other languages, but more flexible, because we can return to multiple points, which is needed for the boolean lhs syntax). umm wait.. there needs to be a distinction between PASSING the continuation to a function and CALLING it immediately ourselves... so we need 2 operators if we want to use labels as implicit continuations. note: as an optimization, the compiler doesn't actually have to construct a continuation unless it will be used. labels have lexical scope of the current block/environment.
which operators? mb $LABEL to call it right now (e.g. jump), and something like ?LABEL to merely refer to it...
is it essential to have continuations cause a jump whenever they are evaluated? or can we require them to be explicitly called, e.g. $continuation for a variable containing a continuation? i think the latter because we can always do:
callit x = $x
g(callit $BLAH)
if we want to pass g some normal seeming thing that actually calls a continuation when evaluated.
so let's do:
?LABEL to get a value that refers to a label, and $LABEL to call the continuation (in general, $var to call the continuation in that var's value; if that variable contains a string, eval the code in that string; if it contains an intermediate Jasper code representation, call that, etc; but if you are inside a string or pattern, interpolate in the value in var)
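Python has no first-class continuations, but the 'jump back to a label' use can be roughly faked with a one-shot, upward-only escape (with_escape and _Escape are invented names):

class _Escape(Exception):
    def __init__(self, value):
        self.value = value

def with_escape(body):
    # body receives k; calling k(v) jumps back here with v
    def k(value):
        raise _Escape(value)
    try:
        return body(k)
    except _Escape as e:
        return e.value

print(with_escape(lambda k: 1 + k(41)))   # 41: k(41) jumps out past the '1 +'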
--
should we allow stuff like C does in the header to make mutation convenient, e.g.:
mutatePlusOne &x = x += 1
which is short for
mutatePlusOne x = *x += 1
and, unlike C, rather than implicitly changing a pass of 'x' into a pass of '&x', i would require this to be called as:
p = 3 mutatePlusOne &p
note: this sort of thing IS still unique (we haven't created an alias to p) iff x is unique in mutatePlusOne.
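Python has no '&x' either, but a one-element box gives the same visible call-site marker (you pass the box, not the value):

def mutate_plus_one(ref):
    ref[0] += 1   # like *x += 1

p = [3]              # p = 3, boxed
mutate_plus_one(p)   # the caller visibly hands over a mutable reference
print(p[0])          # 4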
--
note: forking makes all local variables non-unique (because it makes a shallow copy of the local context)
--
still a little confused about the difference between a graph and a node, and what happens if you refer to SELF or ROOT within a graph.
i guess a graph == a node with arcs to each node in that graph, with these arcs labeled with their node labels (because node labels are specific to each graph, dig it?). So if x = {{x}} means that x is the set which has a single element which is a set with one element, which is the set x, then that corresponds to the graph