proj-oot-old-150618-ootNotes12

in oot, w/o baking in a single message bus implementation, can we have a std optional language for whether a piece of info is an address, a command, just a data update, metadata that will be kept with the data update (like a trade timestamp), or message-level metadata? this is also a good example of what i mean by putting the LANGUAGE into the programming language via conventions. related: diplomacy plan suggestion symbols

---

i've heard that: akka is baked into scala deeply but debugging it is too hard

---

some important stuff in Doug's 'classical' library ( https://github.com/drubino/Classical ):

hashing dictionaries (doubly generic; keys can be anything in the language, without the JavaScript default dict implementation's keys colliding with yours)

rationalized reflection system

lazy list-comprehension-ish query system

--- " I’ve always liked the idea of building complex logic systems out of a simple primitive that is just powerful enough to construct all logic - particularly in videogames. For example, in LittleBigPlanet?, you can build a Tic-Tac-Toe AI out of physical elements like pistons and magnetic switches. In Minecraft, you can build an ALU out of a primitive construction element that behaves, essentially, as a NOT gate. And, if games aren’t your thing, you can build CMOS logic out of UNIX pipes with a simple “mosfet” program. " -- https://fail0verflow.com/blog/2012/dcpu-16-review.html http://www.youtube.com/watch?v=yUbRFfXEtls http://www.youtube.com/watch?v=LGkkyKZVzug http://www.linusakesson.net/programming/pipelogic/index.php

---

" Let/Const

The ES6 ‘let’ feature is similar to ‘var’, but it aims to simplify the mental model for the variable’s scope. With ‘let’, you can scope variables to blocks of code rather than whole functions. For example:

function f() {
  let total = 0;
  let x = 5;
  for (let x = 1; x < 10; x++) {
    total += x;
  }
  console.log(x);
}

f(); // outputs 5

Notice how the two ‘let x’ statements do not conflict. This is because the one used in the loop is in a different scope than the one outside of the loop. If we re-wrote this using vars, all ‘x’ vars effectively combine into one, leading to rather confusing output.

function f() {
  var total = 0;
  var x = 5;
  for (var x = 1; x < 10; x++) {
    total += x;
  }
  console.log(x);
}

f(); // outputs 10 " -- http://blogs.msdn.com/b/typescript/archive/2015/01/16/announcing-typescript-1-4.aspx

---

a good writeup of the haskell record type problem, and pointers to various proposed solns:

the problem: http://nikita-volkov.github.io/record/

various proposed solns:

discussions:

https://news.ycombinator.com/item?id=8909126

---

Haskell module system proposal:

---

Haskell initial blog post on lenses:

---

http://prog21.dadgum.com/203.html

in summary, he says Python isn't a good first language because:

instead, for these reasons, he recommends Javascript.

within the discussion, https://news.ycombinator.com/item?id=8921781 and its replies touch on a related point: sometimes CS curricula start with Racket rather than Python because it lends itself more to teaching core CS concepts, eg complexity of algorithms, functional programming techniques, recursion, data structures. However, many students aren't going to be professional programmers or CS theorists; they are going to do something else with their lives, but they need to learn to do a little programming, purely as a tool. These people want to learn to program something as quickly as possible, at the expense of not completely understanding the advanced stuff or the CS theory basics.

---

mb uninteresting: https://www.google.com/search?q=clr+addressing+modes&ie=utf-8&oe=utf-8

---

a learn to program game:

http://play.elevatorsaga.com/documentation.html

https://news.ycombinator.com/item?id=8929314


---

when i say i want Oot to be 'languagey', one thing i mean is that the core concepts and syntax of oot should usefully transfer even to DSLs that build on top of it, even when they have to override the implementation/interpretation of these concepts and syntax (by 'usefully transfer' i mean that the DSL authors should want to reuse these concepts and syntax because they are useful; i don't mean that oot should mandate that, although maybe it should, i havent thought about that yet).

sort of in the way that ppl talk about 'Pythonic' interfaces to foreign libraries; these reuse Python concepts like dicts, OOP, properties (__get and __set overriding rather than explicit getter and setter methods), etc (what else?)

i suppose one aspect of that is that i want much of Oot to be 'declarative' and 'first-class'; for example, i want to have 'binding' or symlinks to be a first-class concept within Oot, and i want people to be able to say (eg when writing dynamic websites or web frameworks) that the value in this DOM element is bound to that value in the database, or to that local variable; but when they say this, i want it to be in the form of a first-class declaration ("this is bound to that") that various metaprogramming frameworks (eg web frameworks) can then look at (is this reflection, or is it not, because it is so first-class and so declarative from the get-go?) and implement differently. Eg people talk about ember.js vs angular v1's vs rubino's classical's way of syncing these things that are supposed to be bound together, and they are different, but the basic idea that this is supposed to be bound to that is shared.
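to make that concrete, here's a rough Python sketch (standing in for Oot; every name here is made up) of a binding as a first-class, inspectable declaration that frameworks can look at and implement however they like:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Binding:
        source: str      # path to the bound-from location, eg "db.users[3].name"
        target: str      # path to the bound-to location, eg "dom.#name-field.value"

    bindings = []        # a first-class, inspectable registry of declarations

    def bind(source, target):
        """Declare 'target is bound to source'; says nothing about HOW to sync."""
        bindings.append(Binding(source, target))

    bind("db.users[3].name", "dom.#name-field.value")
    # a web framework can now walk `bindings` and pick its own sync strategy
    # (dirty-checking a la angular v1, observers a la ember, etc.)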

another example is the way that llvm does garbage collection; they don't provide an actual garbage collection module, but they do provide a language for telling the garbage collector the things it might need to know (eg @llvm.gcroot ; see http://llvm.org/docs/GarbageCollection.html#in-your-compiler), and for leaving extension points in your code for the garbage collector (eg llvm.gcread, llvm.gcwrite).

and another example is the idea of following a path of getattrs and obeying the __get and __set metaprogramming along the way.

E.g. you might say 'a.b.c.d.e = 4', but maybe d doesn't 'actually' exist and is merely being virtually simulated by c via a.b.c.__get and a.b.c.__set. Oot implements the most general case here, checking each node for __get and __set as it is traversed; but the metaprogramming in a.b.c might virtually simulate both d and e at once via a 2-d table. Still, the language of paths, effective addresses, (and even __get and __set??) should be usefully reused.
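a minimal sketch of that traversal in Python, with __getattr__/__setattr__ standing in for Oot's __get/__set (the Lazy class and assign_path helper are hypothetical illustrations, not Oot's actual mechanism):

    class Lazy:
        """A node whose children don't 'actually' exist until asked for."""
        def __getattr__(self, name):
            child = Lazy()
            self.__dict__[name] = child    # materialize the virtual child
            return child

    def assign_path(root, dotted_path, value):
        # 'a.b.c.d.e = 4' is resolved one step at a time, so each node's
        # __getattr__/__setattr__ metaprogramming is honored along the way
        *head, last = dotted_path.split(".")
        node = root
        for name in head:
            node = getattr(node, name)     # may be virtual, via __getattr__
        setattr(node, last, value)         # may be virtual, via __setattr__

    a = Lazy()
    assign_path(a, "b.c.d.e", 4)
    assert a.b.c.d.e == 4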


---

Haskell looks at pure expressions as a special case of deterministic side-effectful functions, that is, pure expressions are functions with no side effects.

However, the reason that this doesn't always work well in programming is that sometimes you want to change some guy deep in the call chain to have side effects, eg caching or logging or whatever. Another way of looking at this is the inverse: functions are things that return a value, and functions that both return a value and have a side effect are a special kind of value-returning function.

---

React v0.13.0 Beta 1 discussion

https://news.ycombinator.com/item?id=8958731

---

we should provide multiple choices for which minimal 'kernel' of methods to override to implement a representation (via circular defns?). For example, in order to derive all of <, <=, >, >=, ==, you only need to define some subset of these; <= and == will do, as will <= and <, or < and >; so in Oot, you could define this typeclass by providing an abstract implementation for all five of these methods, and then a representation could override only two of these (but which two is up to the implementation).

(see also haskell's derived instances of Ord)
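Python's functools.total_ordering is an existing example of this pattern: it derives the remaining comparison methods from a minimal kernel of __eq__ plus any one of __lt__/__le__/__gt__/__ge__, and the representation picks which one to supply (the Version class below is just an illustration):

    from functools import total_ordering

    @total_ordering
    class Version:
        def __init__(self, parts):
            self.parts = tuple(parts)
        def __eq__(self, other):
            return self.parts == other.parts
        def __lt__(self, other):         # the only ordering method we define
            return self.parts < other.parts

    assert Version([1, 2]) <= Version([1, 3])   # derived from __eq__ and __lt__
    assert Version([2, 0]) > Version([1, 9])    # likewise derived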

---

MS's open source .NET runtime uses CMake:

https://news.ycombinator.com/item?id=8992656

ixtli 2 hours ago

Even more reason to port all of your C/C++ code to CMake. I'm excited to see upstream contribution from MS.

stinos 1 hour ago

Exactly, CMake getting more traction like this should be good. I know quite some devs who're like "CMake? Why don't you just use good old Makefiles?" and then struggle with its syntax and/or fail to provide working Windows or even Mac builds of their software simply because they stick with plain makefiles. (don't get me wrong, they work fine, just not so much for cross-platform projects)

stefantalpalaru 2 minutes ago

The alternative is not plain Makefiles, but Autotools: https://autotools.io/index.html

---

i still don't think Oot is quite broad enough yet for:

---

well at least for the above, the 4-role 'grammatical' oot assembly idea seems to fit. It seems like a lot of things can be divided into object or rhs or input params, subject or lhs or output params, a verb or instruction or construct identifier, and mode modifiers. And getting a second level of these allows you to express even more (eg modes on one of the inputs). It also imparts a rather strong flavor to the language, maybe a little strong for my taste, but some flavor is good, probably even for metaprogramming, as it presumably creates conventions (well, pre-conventions; cognitive structures that probably tend to lead different people to thinking of similar conventions). A sketch of this shape follows.
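a rough sketch of that 4-role instruction shape as a record (Python standing in for Oot assembly; all field names here are mine, not settled syntax):

    from dataclasses import dataclass, field

    @dataclass
    class Instr:
        verb: str                    # instruction / construct identifier
        subject: tuple = ()          # lhs / output params
        obj: tuple = ()              # rhs / input params (the 'object' role)
        modes: dict = field(default_factory=dict)   # mode modifiers

    # a second level of roles, eg a mode attached to one of the inputs:
    add = Instr(verb="add", subject=("r1",), obj=("r2", "r3"),
                modes={"obj[1]": "deref"})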

Also the idea of having the '/' sign be 'arrow' gives you a lot; grammars, logic, and simulations all seem to be phrased in terms of tables of rules of the form: a -> b

which might be written a / b in oot

What else do these things have aside from tables?

---

i guess one other problem with Haskell is that it's not really that portable in practice. This comes from two sources:

My impression is that the GHC extensions tend to be type system extensions. Which suggests that a language with a more extensible type system (so that most of these type system extensions would be libraries that run on any implementation of the language) would not have this problem quite as much.

You would think i wouldnt care about this so much, because i like languages with one canonical implementation, but i guess i do. Non-portability prevents things like Haskell and Python from being run on top of Javascript in the browser, or on mobile devices, etc, which makes them vulnerable to quickly becoming out of date when a new platform like the Web or Mobile comes along.

---

i am becoming more impressed with the JVM and with Java as i learn more about languages and language implementation.

note that the JVM has weathered the shift to mobile relatively well.

---

an example of regrettable lack of standardization in Wordpress post formats:

https://www.doitwithwp.com/saying-goodbye-post-formats/

---

an example of tradeoffs that most languages have between code that is easy for newbies and code that is good:

"While programmers like myself start convulsing when we have to wade through such code, less experienced programmers actually like it, because it is easier for them to work with. Because the "template functions" are in the global namespace, they work anywhere. Because the data returned by the database is in a global variable, you don’t have to use any tricks to get at it. It makes extending, enhancing, and modifying the code easier for newbies." -- https://github.com/WordPress/book/blob/master/Content/Part%201/2-b2-cafelog.md

(was that from http://web.archive.org/web/20050901152840/http://revjim.net/comments/3955/ ? i dont see it there)

---

a book on the history of wordpress! toread/toskim (i skimmed Part 1 already):

https://github.com/WordPress/book/tree/master/Content

---

a Clojure survey on what features users wanted http://tech.puredanger.com/2013/11/19/state-of-clojure-language-features/

---

some discussion on features in various Python notebook systems:

https://news.ycombinator.com/item?id=9004689

---

(not about monads; it's the manifesto for the MS shell framework that became PowerShell, whose codename was Monad)

http://www.jsnover.com/Docs/MonadManifesto.pdf

---

some ideas relating to generalizing/translating the instructions of Nock:

to review, the Nock instructions (see also my notes on Nock)

0: /(b)
1: arg(b)
2: nock(b,c)
3: ?(b)
4: +(b)
5: =(b)
6: if(b,c,d)
7: compose(b,c)
8: assign(b,c)
9: call(b,c)
10a: hint(b,c)
10b: hint([b c],d)
1  ::    nock(a)           *a
2  ::    [a b c]           [a [b c]]
3  ::  
4  ::    ?[a b]            0
5  ::    ?a                1
6  ::    +[a b]            +[a b]
7  ::    +a                1 + a
8  ::    =[a a]            0
9  ::    =[a b]            1
10 ::
11 ::    /[1 a]            a
12 ::    /[2 a b]          a
13 ::    /[3 a b]          b
14 ::    /[(a + a) b]      /[2 /[a b]]
15 ::    /[(a + a + 1) b]  /[3 /[a b]]
16 ::
17 ::    *[a [b c] d]      [*[a b c] *[a d]]
18 ::
19 ::    *[a 0 b]          /[b a]
20 ::    *[a 1 b]          b
21 ::    *[a 2 b c]        *[*[a b] *[a c]]
22 ::    *[a 3 b]          ?*[a b]
23 ::    *[a 4 b]          +*[a b]
24 ::    *[a 5 b]          =*[a b]
25 ::
26 ::    *[a 6 b c d]      *[a 2 [0 1] 2 [1 c d] [1 0] 2 [1 2 3] [1 0] 4 4 b]
27 ::    *[a 7 b c]        *[a 2 b 1 c]
28 ::    *[a 8 b c]        *[a 7 [[7 [0 1] b] 0 1] c]
29 ::    *[a 9 b c]        *[a 7 c [2 [0 1] [0 b]]]
30 ::    *[a 10 [b c] d]   *[a 8 c 7 [0 3] d]
31 ::    *[a 10 b c]       *[a c]
32 ::
33 ::    =a                =a
34 ::    /a                /a
35 ::    *a                *a
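a toy Python interpreter for the rules above (my own sketch, not anything official; atoms are modeled as ints and cells as nested 2-tuples), which may help when reading the generalizations below:

    # Toy Nock interpreter; atoms are ints, cells are 2-tuples.
    def cell(x): return isinstance(x, tuple)

    def slot(b, a):                      # instruction 0: / (tree addressing)
        if b == 1: return a
        if b == 2: return a[0]
        if b == 3: return a[1]
        return slot(2 + (b & 1), slot(b // 2, a))

    def nock(a, f):                      # *[a f]
        if cell(f[0]):                   # rule 17: autocons / data constructor
            return (nock(a, f[0]), nock(a, f[1]))
        op, b = f
        if op == 0: return slot(b, a)                          # select
        if op == 1: return b                                   # constant
        if op == 2: return nock(nock(a, b[0]), nock(a, b[1]))  # eval
        if op == 3: return 0 if cell(nock(a, b)) else 1        # isCell
        if op == 4: return nock(a, b) + 1                      # increment
        if op == 5:                                            # structural eq
            s = nock(a, b)
            return 0 if s[0] == s[1] else 1
        if op == 6:                                            # if
            test, (yes, no) = b
            return nock(a, yes) if nock(a, test) == 0 else nock(a, no)
        if op == 7: return nock(nock(a, b[0]), b[1])           # compose
        if op == 8: return nock((nock(a, b[0]), a), b[1])      # assign/push
        if op == 9:                                            # call
            core = nock(a, b[1])
            return nock(core, slot(b[0], core))
        if op == 10:                                           # hint
            if cell(b[0]):
                nock(a, b[0][1])         # compute the hint, discard the result
            return nock(a, b[1])
        raise ValueError("no rule applies")

    assert nock((42, 43), (0, 3)) == 43      # /[3 [42 43]] = 43
    assert nock(57, (4, (0, 1))) == 58       # +*[57 [0 1]] = 58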

the instructions with state ("a") can be modified by having the state be implicit, like implicit 'this' in C++. In Oot, this refers to a local island/domain of state.

instruction 0, /, select, can be generalized to '.' or __get, and further, to 'follow query'. But in Nock you also have to look up variables within the state (because there are no distinct variables, there is only the one state), so 'select' also does looking up an identifier and interpolating a value, like Perl5's '$variableName'.

instruction 1, arg, i'm not sure about, but i think it's just the projection functions (eg p1(x,y,..) = x, p2(x,y,...) = y, etc) or maybe even just constants.

instruction 2, nock, can be generalized by giving it a third argument which defines the language to apply. The language could be defined in a spec like the Nock spec (eg where 'eval' means 'recursive match and replace'), or in some other way. In fact, more generally, you could specify a pattern into which the arguments are to be placed. Then each argument (or some subset of them) is 'eval'd. Then the whole pattern is 'eval'd. But that is just substitution (the namesake of the S combinator, which is what the Nock instruction (and hence the Nock language) is based on); the 'form' or 'pattern' into which the arguments are to be placed is the function; the other arguments are the formal parameters (arguments) being passed into the function. Even so, this brings home the importance of having substitution into data structures, particularly graphs, and grammar-type substitutions into anything 'grammar'ed, and into code (macros), as easy operations (rather than only providing function calls). this is pretty key; the Nock operator is a special case of substitution, and function application and macros are both substitution in code (but with top-down vs. bottom-up evaluation order, i suppose?), and i suppose this is connected with a 'replace' operation in our 'graph regexes', and (hence?) with grammar. i guess in some way combinator calculus (and hence Nock) is based on the S combinator being a sort of 'canonical form' of function application

instruction 3, isCell, can be generalized into a 'pattern matching' or 'structural typing match' test

instruction 4, addition, can be generalized into various (primitive only, or all?) function calls. In Oot, i'm not sure if we need to distinguish primitive from non-primitive functions.

instruction 5, structural equality, is still structural equality (that's already a pretty big concept)

instruction 6, if, can be generalized into switch

instruction 7, compose; this really is just sequential mutations of the state

instruction 8, assign; this is just a convention for storing variables in the state by treating the stack as a stack and pushing them onto it. In graphs, we would just add a new edge to a new node (like adding an entry to a dict). Not to say we shouldn't have stack ops..

instruction 9, call; compose a state mutation with an 'eval' of some part of state; this is supposed to be like a calling convention for looking up a function and calling it, although the composition with the state mutation makes it a little less clear. the way i am translating/generalizing the others, i dunno if there's really a place for this one that's separate from 'eval'; it's pretty easy to just write eval(select()).

rule 17: data constructor

also; part of the point of Nock is that, in ordinary programs (and in macro instructions 6-9) you aren't allowed to use the asterisk notation (eval) within the program code; but of course we could allow that directly.

Hence instead of writing macro instruction 7 as *[a 7 b c] --> *[a 2 b 1 c], you could write it as *[a 7 b c] --> *[*[a b] c].

note also that one of the key things in Nock (and in lambda calculus) is the immutability and the copying of these large structures with impunity.

note also that one of the key things in Nock is the locality (or relativity) of reference, by which i mean not anything like what is meant with caching, but rather that we don't have a global memory; we can only address things by relative position into a structure that we have a copy of; this structure may be a subset of another structure elsewhere in a larger expression that we are part of, or some other structure elsewhere may be a subset of this structure, but we can't know that because of the way Nock is pure; and since addressing is relative, the addresses to the equivalent place in these structures will not be the same if one is embedded in the other. Hoon abstracts this to some extent but not completely.

oot is good at graphs annotating other graphs, so it will be good at specifying extensions to the type system in a library. Also, type system specifications seem to look like grammar matching rules.

this sort of thing also reminds me to read the Nock algorithms, and especially the comments by which he names each use of the instructions, in http://doc.urbit.org/doc/nock/tut/3/

---

so how exactly are graph regexes or even more general graph grammars supposed to represent types? How are generics (type variables) represented? How are typeclasses and Haskell typeclass 'contexts' represented? How are function types represented?

---

i think i've said this before, but views aren't much different from creating child objects that mutate the parent when they are mutated. The key here is that we discourage long-term hidden aliasing like this; so views are a way of grouping together a bunch of aliasing behavior of this sort in a way so that it's obvious to the reader which variables are grouped. In other words, it's a special case of the usually-discouraged hidden aliasing behavior that isn't discouraged.

also, i've said this before too, but what we really dislike is long-range aliasing, not local mutation. So, we don't mind writing an in-place insertion into a dictionary as a mutation, rather than Haskell-like syntactic contortions. However, unlike eg Python our library functions are immutable by default, and then we use special syntax to indicate when we want them to be performed in-place. Also, because of our emphasis on concurrency, we need to correctly handle the case where you apply 'map' in-place to a list but another thread might read it in the middle of the 'map', getting some elements that have been mapped already and some that haven't -- the best soln here is just don't share data (no aliasing), but for when there is aliasing, we need atomicity/locking or STM too.
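a crude sketch of the 'immutable by default, explicit in-place' convention, with a lock standing in for the atomicity/STM story (readers of the shared list would need to take the same lock; all names here are made up):

    import threading

    _lock = threading.Lock()

    def map_copy(f, xs):
        """default: pure; returns a new list and never mutates the input"""
        return [f(x) for x in xs]

    def map_inplace(f, xs, lock=_lock):
        """explicit opt-in: mutates xs in place, holding a lock so a concurrent
        reader never observes a half-mapped list"""
        with lock:
            for i in range(len(xs)):
                xs[i] = f(xs[i])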

---

here's the kind of stuff perl6 does with its ASTs:

---

havent read yet. In search of the perfect JavaScript framework https://news.ycombinator.com/item?id=9014890

---

excitom 2 days ago

link

I like the comparison to COBOL. Another IBM language of similar vintage is RPG (Report Generator) and it also uses packed decimals.

The control flow is:

I was always struck by the similarity to:

BEGIN { ... }

while (<>) {

  ...

}

END { ... }


agumonkey 2 days ago


Because these are two basic building blocks of 'computation', I'd say first order. It's map and reduce in disguise (or the other way around, doesn't matter).

sed is also a little twisted version of awk, per line mapping, with pattern spaces for accumulations.

It's first order, because it can't reuse itself on smaller pieces of data, namely recursion (grammars, trees, etc)


cbd1984 2 days ago


That's awk. That is literally the control flow style of awk, which Larry Wall copied into Perl very directly and entirely deliberately.

I find it very interesting that RPG uses it, and I wonder if it's convergent evolution (relative to awk and RPG) or if one is copying the other.


---

this segment copied from [1]

Furthermore, these semantics are confusing when it comes to unboxed values. If r3 = 9, and then we add 1 to r3, what does that mean? It doesn't mean that we alter the meaning of "9" systemwide. r3 = 9 means there is a single copy of the value 9 in r3, and mutations to it affect that copy, not some system-wide thing.

If we were to run "MOV r3 to r4", this would literally mean 'take what is in r3 out of r3 and place it into r4'. r3 then becomes undef (at least as far as the type system is concerned).

If we were to run "CPY r3 to r4", this would literally mean 'make a (non-aliased) (deep) copy of the value that is in r3 and place it into r4'. CPY would be implemented by the VM as COW (copy-on-write) under the covers. Further writes to r4 would not change the value in r3.

If we were to run "ALIAS r4 to r3", this would literally mean 'into r4, place a symlink that links to r3". Further writes to r4 would change the value in r3.

So what if we say "ALIAS r3 to x[33]; MOV 3 to r3"; does this mean x[33] = 3? I think it does. If we want to look at the alias itself, we use a meta-command, like ALIAS or GETALIAS. Or maybe ALIAS is just SET with a meta-modifier (or 'alias' modifier), and GETALIAS is just GET with that same modifier.

So what if we say "ALIAS r3 to x[33]; ALIAS r4 to r3; ALIAS r3 to y[2]"; do changes to r4 now affect x[33], or y[2]? Todo.

So it might be easier to think in terms of two assignment operators, "=" for values and "=&" for aliases (symlinks), and two equality operators, '==' for structural equality and '=&' for pointer equality, in a higher level language.
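a toy sketch of those MOV/CPY/ALIAS semantics (the Machine class and its behavior are my assumptions about the scheme above, not a spec):

    # Registers are slots holding either a value or an alias record
    # pointing at another slot.
    import copy

    class Machine:
        def __init__(self):
            self.slots = {}      # name -> ('val', value) or ('alias', target)

        def _resolve(self, name):
            kind, payload = self.slots[name]
            return self._resolve(payload) if kind == 'alias' else name

        def set(self, name, value):          # writes go through aliases
            self.slots[self._resolve(name)] = ('val', value)

        def get(self, name):
            return self.slots[self._resolve(name)][1]

        def mov(self, dst, src):             # move: src becomes undef
            self.set(dst, self.get(src))
            self.slots[self._resolve(src)] = ('val', None)

        def cpy(self, dst, src):             # deep, non-aliased copy
            self.set(dst, copy.deepcopy(self.get(src)))

        def alias(self, dst, src):           # dst becomes a symlink to src
            self.slots[dst] = ('alias', src)

    m = Machine()
    m.slots['x33'] = ('val', 0)
    m.alias('r3', 'x33')
    m.set('r3', 3)               # writing to r3 writes through the alias
    assert m.get('x33') == 3     # so yes, x[33] == 3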

---

this segment copied from [2]

The idea of 'alias' and 'value' addressing is that each memory cell has two 'levels': the 'alias' level, which might contain a pointer (or actually, maybe an entire 'alias record', which is a pointer with some attached metadata), or nothing if the cell is not a symlink but an actual value; and the 'value' level, which in the case of non-symlinked cells contains the contents of the cell, and which in the case of symlinked cells looks like it contains the contents of the symlink target. So when you use value addressing on a symlink cell (that is, one with a non-empty alias level), you are really manipulating the cell that is the symlink target; and when you use alias addressing on this cell, you are reading or mutating its symlink record.

---

" Working with dynamic json(mostly in case of REST apis) can be a pain in Golang, but you can work with dynamic JSON objects swiftly in Node.js.

You have to learn about structs and interfaces in order to work with Golang, but nothing as such in Node.js " -- http://www.quora.com/It-seems-as-if-GoLang-is-better-than-Node-js-in-terms-of-performance-not-that-big-a-margin-and-syntax-style-so-why-is-Node-js-way-more-popular-than-Go

i find this interesting for 2 reasons:

---

twitter's Finagle is built upon futures, services (a function that receives a request and returns a Future object as a response; both clients and servers are represented as Service objects), and filters (filters are service transformers)

" Finagle provides a robust implementation of:

    connection pools, with throttling to avoid TCP connection churn;
    failure detectors, to identify slow or crashed hosts;
    failover strategies, to direct traffic away from unhealthy hosts;
    load-balancers, including “least-connections” and other strategies; and
    back-pressure techniques, to defend servers against abusive clients and dogpiling.

Additionally, Finagle makes it easier to build and deploy a service that

    publishes standard statistics, logs, and exception reports;
    supports distributed tracing (a la Dapper) across protocols;
    optionally uses ZooKeeper for cluster management; and
    supports common sharding strategies."

from https://blog.twitter.com/2011/finagle-a-protocol-agnostic-rpc-system https://twitter.github.io/finagle/

---

jzelinskie 233 days ago

Thank you for your efforts. Some people may be giving you flack for "making just another Go web framework", but I wouldn't even be looking at a Go web framework if it was using reflection. Sometimes, you just don't want to pay the performance tax of reflection. I'd actually reckon that the use of reflection is a common reason for why you see alternate implementations of many libraries (i.e. JSON encoders)

-- https://news.ycombinator.com/item?id=7969018 (on comments about the Go Gin web framework)

---

todo incorporate

https://en.wikipedia.org/wiki/Semantic_primes#List_of_semantic_primes

https://en.wikipedia.org/wiki/Cultural_universal

https://en.wikipedia.org/wiki/Natural_semantic_metalanguage#Semantic_primitives

https://en.wikipedia.org/wiki/Thematic_relation#Major_thematic_relations

some notes:

cooking: resource; consuming a resource; transformation of a resource

trade: exchange this for that

promise/demand

see/hear: i dont think we should include specific sensory modalities in here, just the idea of sensor/effector. to sense (see, hear) is to impurely take input (eg if the 'input' is the output of a pure function of a constant, then that input is unchanging, but impure inputs might vary as the program executes; so i think 'see' is like impure input, but not like pure input). Probably should use the words sense (impure input) and act (impure output).

---

i guess the answer to the question of whether we want the convention 'toInt' or 'fromInt' (eg Haskell) for explicit type coercion is that we want 'toInt', because we dislike backwards inference.

---

mb no implicit type coercion; for eg truthiness, we can use a separate 'protocol'
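Python's __bool__ is an existing example of routing truthiness through an explicit protocol rather than implicit coercion (the Inventory class is just an illustration):

    class Inventory:
        def __init__(self, items):
            self.items = items
        def __bool__(self):              # the explicit truthiness hook
            return len(self.items) > 0

    assert not Inventory([])             # no implicit int coercion involved
    assert Inventory(["axe"])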

---

first-class lenses

first-class call stacks

first-class assertions

first-class channels

first-class messages (eg even 'return')

(first-class control flow? what does that mean?)

first-class statespaces

first-class labels

---

just to reiterate, exceptions are for when you break your contract with the caller

(but there are also 'messages' which propagate similarly to exceptions but which do not reflect broken contracts)

---

if exceptions are message emission, and messages are things which float around a handler tree displaying 'antigens' which various message receivers may bind to, then i guess the call stack is a special case of a handler stack. and i guess pure fns which only talk to their caller are a special case of fns which 'emit' and thereby affect the state of the entire statespace to which they emit...

so the call stack is a special case of pub/sub..

and the call stack is generalized into a dag, that is, a function can have multiple 'callers'

'return', 'raise', 'yield' are three types of message issuance. In 'return', you terminate, you return control to the caller, you don't break your contract, and you send a value. In 'raise', you terminate (except for languages with resumes), you return control to the caller, you break your contract, and you don't send a value (well okay, you do but it's supposed to be a 'stderr' type thing). In 'yield', you don't terminate, you return control to the caller, you don't break your contract, and you send a value

which suggests others:

'emit': don't terminate, don't return control, don't break your contract, do send a value

'alternate return': you terminate, return control to whoever handles this message type, you don't break your contract, and you send a value.
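the taxonomy above, spelled out as data (a sketch; the attribute names are mine, not Oot's):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class MessageKind:
        terminates: bool        # does the emitter stop running?
        returns_control: bool   # does control go back to a caller/handler?
        breaks_contract: bool   # is this a broken-contract ('stderr') signal?
        sends_value: bool

    RETURN     = MessageKind(True,  True,  False, True)
    RAISE      = MessageKind(True,  True,  True,  False)
    YIELD      = MessageKind(False, True,  False, True)
    EMIT       = MessageKind(False, False, False, True)
    ALT_RETURN = MessageKind(True,  True,  False, True)   # same bits as RETURN,
                            # but control goes to whoever binds the message type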

calling a fn which only 'emits' and doesn't 'return' (at least for a long time), for example starting a server, is like forking (or execing). Perhaps if !fn() is needed to execute a side-effectful function, then !!fn() is needed to call a function which may not terminate (whether or not it forks and allows the caller to execute in parallel). Maybe more generally !fn() has the auto-timeout (except when this is compiled away, in the case of 'short' terminating fns; is 'short' based on O notation based on the inputs? mb provably terminating, and no more than linear in the inputs?) and !!fn() is needed to run fn() without wrapping it in a timeout?

---

" So it goes with software. That software which is flexible, simple, sloppy, tolerant, and altogether forgiving of human foibles and weaknesses turns out to be actually the most steel cored, able to survive and grow while that software which is demanding, abstract, rich but systematized, turns out to collapse in on itself in a slow and grim implosion.

Consider the spreadsheet. It is a protean, sloppy, plastic, flexible medium that is, ironically, the despair of all accountants and auditors because it is virtually impossible to reliably understand a truly complex and rich spreadsheet. Lotus corporation (now IBM), filled with Harvard MBA’s and PhD’s in CS from MIT, built Improv. Improv set out "to fix all this". It was an auditors dream. It provided rarified heights of abstraction, formalisms for rows and columns, and in short was truly comprehensible. It failed utterly, not because it failed in its ambitions but because it succeeded.

Consider search. I remember the first clunky demos that Microsoft presented when Bill Gates first started to talk about Information at your fingertips with their complex screens for entering search criteria and their ability to handle Boolean logic. One of my own products, Access had the seemingly easier Query by Example. Yet, today half a billion people search every day and what do they use? Not Query by Example. Not Boolean logic. They use a solution of staggering simplicity and ambiguity, namely free text search. The engineering is hard, but the user model is simple and sloppy.

Consider user interface. When HTML first came out it was unbelievably sloppy and forgiving, permissive and ambiguous. I remember listening many years ago to the head, then and now, of Microsoft Office, saying contemptuously in 1995 that HTML would never succeed because it was so primitive and that Word would win because Word documents were so rich and controlled in their layout. Of course, HTML is today the basic building block for huge swathes of human information. What is more, in one of the unintended ironies of software history, HTML was intended to be used as a way to provide a truly malleable plastic layout language which never would be bound by 2 dimensional limitations, ironic because hordes of CSS fanatics have been trying to bind it with straight jackets ever since, bad mouthing tables and generations of tools have been layering pixel precise 2 dimensional layout on top of it. And yet, ask any gifted web author, like Jon Udell, and they will tell you that they often use it in the lazy sloppy intuitive human way that it was designed to work. They just pour in content. In 1996 I was at some of the initial XML meetings. The participants’ anger at HTML for “corrupting” content with layout was intense. Some of the initial backers of XML were frustrated SGML folks who wanted a better cleaner world in which data was pristinely separated from presentation. In short, they disliked one of the great success stories of software history, one that succeeded because of its limitations, not despite them. I very much doubt that an HTML that had initially shipped as a clean layered set of content (XML, Layout rules - XSLT, and Formatting- CSS) would have had anything like the explosive uptake.

Now as it turns out I backed XML back in 1996, but as it turns out, I backed it for exactly the opposite reason. I wanted a flexible relaxed sloppy human way to share data between programs and compared to the RPC's and DCOM's and IIOP's of that day, XML was an incredibly flexible plastic easy going medium. It still is. And because it is, not despite it, it has rapidly become the most widely used way to exchange data between programs in the world. And slowly, but surely, we have seen the other older systems, collapse, crumple, and descend towards irrelevance.

Consider programming itself. There is an unacknowledged war that goes on every day in the world of programming. It is a war between the humans and the computer scientists. It is a war between those who want simple, sloppy, flexible, human ways to write code and those who want clean, crisp, clear, correct ways to write code. It is the war between PHP and C++/Java. It used to be the war between C and dBase. Programmers at the level of those who attend Columbia University, programmers at the level of those who have made it through the gauntlet that is Google recruiting, programmers at the level of this audience are all people who love precise tools, abstraction, serried ranks of orderly propositions, and deduction. But most people writing code are more like my son. Code is just a hammer they use to do the job. PHP is an ideal language for them. It is easy. It is productive. It is flexible. Associative arrays are the backbone of this language and, like XML, is therefore flexible and self describing. They can easily write code which dynamically adapts to the information passed in and easily produces XML or HTML.

...

In the same way, I see two diametrically opposed tendencies in the model for exchanging information between programs today:

On the one hand we have RSS 2.0 or Atom. The documents that are based on these formats are growing like a bay weed. Nobody really cares which one is used because they are largely interoperable. Both are essentially lists of links to content with interesting associated metadata. Both enable a model for capturing reputation, filtering, stand-off annotation, and so on. There was an abortive attempt to impose a rich abstract analytic formality on this community under the aegis of RDF and RSS 1.0. It failed. It failed because it was really too abstract, too formal, and altogether too hard to be useful to the shock troops just trying to get the job done. Instead RSS 2.0 and Atom have prevailed and are used these days to put together talk shows and play lists (podcasting) photo albums (Flickr), schedules for events, lists of interesting content, news, shopping specials, and so on. There is a killer app for it, Blogreaders/RSS Viewers. Anyone can play. It is becoming the easy sloppy lingua franca by which information flows over the web. As it flows, it is filtered, aggregated, extended, and even converted, like water flowing from streams to rivers down to great estuaries. It is something one can get directly using a URL over HTTP. It takes one line of code in most languages to fetch it. It is a world that Google and Yahoo are happily adjusting to, as media centric, as malleable, as flexible and chaotic, and as simple and consumer-focused as they are.

On the other hand we have the world of SOAP and WSDL and XML SCHEMA and WS_ROUTING and WS_POLICY and WS_SECURITY and WS_EVENTING and WS_ADDRESSING and WS_RELIABLEMESSAGING and attempts to formalize rich conversation models. Each spec is thicker and far more complex than the initial XML one. It is a world with which the IT departments of the corporations are profoundly comfortable. It appears to represent ironclad control. It appears to be auditable. It appears to be controllable. If the world of RSS is streams and rivers and estuaries, laden with silt picked up along the way, this is a world of Locks, Concrete Channels, Dams and Pure Water Filters. It is a world for experts, arcane, complex, and esoteric. The code written to process these messages is so early bound that it is precompiled from the WSDL’s and, as many have found, when it doesn't work, no human can figure out why. The difference between HTTP, with its small number of simple verbs, and this world with its innumerable layers which must be composed together in Byzantine complexity cannot be overstated. It is, in short, a world only IBM and MSFT could love. And they do.

On the one hand we have Blogs and Photo Albums and Event Schedules and Favorites and Ratings and News Feeds. On the other we have CRM and ERP and BPO and all sorts of enterprise oriented 3 letter acronyms.

As I said earlier, I remember listening many years ago to someone saying contemptuously that HTML would never succeed because it was so primitive. It succeeded, of course, precisely because it was so primitive. Today, I listen to the same people at the same companies say that XML over HTTP can never succeed because it is so primitive. Only with SOAP and SCHEMA and so on can it succeed. But the real magic in XML is that it is self-describing. The RDF guys never got this because they were looking for something that has never been delivered, namely universal truth. Saying that XML couldn’t succeed because the semantics weren’t known is like saying that Relational Databases couldn’t succeed because the semantics weren’t known or Text Search cannot succeed for the same reason. But there is a germ of truth in this assertion. It was and is hard to tell anything about the XML in a universal way. It is why Infopath has had to jump through so many contorted hoops to enable easy editing. By contrast, the RSS model is easy with an almost arbitrary set of known properties for an item in a list such as the name, the description, the link, and mime type and size if it is an enclosure. As with HTML, there is just enough information to be useful. Like HTML, it can be extended when necessary, but most people do it judiciously. Thus Blogreaders and aggregators can effortlessly show the content and understanding that the value is in the information. Oh yes, there is one other difference between Blogreaders and Infopath. They are free. They understand that the value is in the content, not the device.

RSS embodies a very simple proposition that Tim Berners Lee has always held to be one of the most important and central tenets of his revolution, namely that every piece of content can be addressed by a URL.

...

By contrast, the rigid abstract layers of web service plumbing are all anonymous, endless messages flowing through under the rubric of the same URL. Unless they are logged, there is no accountability. Because they are all different and since the spec that defines their grammar, XML Schema, is the marriage of a camel to an elephant to a giraffe, only an African naturalist could love these messages. They are far better, mind you, than the MOM messages that preceded them. Since they are self describing, it is possible to put dynamic filters in to reroute or reshape them using XPATH and XSLT and XML Query and even other languages all of which can easily detect whether the messages are relevant and if so, where the interesting parts are. This is goodness. It is 21st century. But the origination and termination points, wrapped in the Byzantine complexity of JAX RPC or .NET are still frozen in the early bound rigidity of the 20th.

I would like to say that we are at a crossroads, but the truth is never so simple. The truth is that people use the tools that work for them. Just as for some programmers the right tool is PHP and for others the right tool is Java so it is true that for some programmers the right tool is RSS and for others it is WS-*. There is no clear “winner” here. What I am sure about is the volumes and the values. The value is in the information and its ability to be effortlessly aggregated, evaluated, filtered, and enhanced.

...

There is a lot of talk about Web 2.0. Many seem to assume that the “second” web will be about rich intelligent clients who share information across the web and deal with richer media (photos, sound, video). There is no doubt that this is happening. Whether it is Skype or our product Hello, or iTunes, people are increasingly plugging into the web as a way to collaborate and share media. But I posit that this isn’t the important change. It is glitzy, fun, entertaining, useful, but at the end of the day, not profoundly new.

What has been new is information overload. Email long ago became a curse. Blogreaders only exacerbate the problem. I can’t even imagine the video or audio equivalent because it will be so much harder to filter through. What will be new is people coming together to rate, to review, to discuss, to analyze, and to provide 100,000 Zagat’s, models of trust for information, for goods, and for services. Who gives the best buzz cut in Flushing? We see it already in eBay. We see it in the importance of the number of deals and the ratings for people selling used books on Amazon. As I said in my blog, “My mother never complains that she needs a better client for Amazon. Instead, her interest is in better community tools, better book lists, easier ways to see the book lists, more trust in the reviewers, librarian discussions since she is a librarian, and so on”. This is what will be new. In fact it already is. You want to see the future. Don’t look at Longhorn. Look at Slashdot. 500,000 nerds coming together everyday just to manage information overload. Look at BlogLines. What will be the big enabler? Will it be Attention.XML as Steve Gillmor and Dave Sifry hope? Or something else less formal and more organic? It doesn’t matter. The currency of reputation and judgment is the answer to the tragedy of the commons and it will find a way. This is where the action will be. Learning Avalon or Swing isn’t going to matter. Machine learning and inference and data mining will. For the first time since computers came along, AI is the mainstream.

I find this deeply satisfying. It says that in the end the value is in our humanity, our diversity, our complexity, and our ability to learn to collaborate. It says that it is the human side, the flexible side, the organic side of the Web that is going to be important and not the dry and analytic and taxonomical side, not the systematized and rigid and stratified side that will matter. " -- http://web.archive.org/web/20060219031421/http://www.adambosworth.net/archives/000031.html

(some comments: now RSS appears to be dying (although i hope i'm wrong), and the next web did indeed turn out to be about media (although i personally think more rating is still to come); what does that mean for his hypothesis? i think actually it validates his hypothesis more than he had hoped: (a) RSS is dying b/c most ppl dont care about standard great ways to read lots of things, they care about checking SOCIAL media of the few ppl they care about; so local feeds are good enough; and businesses find walled gardens more profitable, so they quietly cut off the feed interoperability; (b) rich media was the winner b/c most ppl care more about having ordinary human interaction than in getting information.)

---

one potential issue is that i'm hoping that Oot's graph datastructures and annotations will be good for things like escape analysis that optimizing compilers have to do (and which will be especially relevant for metaprogrammy languages like Oot, in order to recognize the common non-metaprogrammy case and optimize it). but i'm not really familiar with these things.

---

" The WG0 Scheme would contain the basic types of R5RS (characters, numbers, cons pairs, vectors, etc.) It would contain only the special forms LAMBDA, SET!, FEXPR, and THE-ENVIRONMENT. It would contain the procedure EVAL. It would contain a simple mechanism for creating new encapsulated types (much like the one observed in the Kernel programming language specification).

THE-ENVIRONMENT (invoked (the-environment)) would yield a procedural interface to dynamic instance of the enclosing lexical environment. The procedural interface can read and write existing variables, but not (by default) change binding contours. Environments in which binding contours can change can be created using LAMBDA.

EVAL would of course accept a form to evaluate and a reified environment, and "do the obvious thing", although it would have some provisions to afford "extensible" environments (e.g., permit at least the late introduction of formerly unbound variables).

A FEXPR is like a LAMBDA except that (a) when it occurs in the first position of an application, the operands are not evaluated and an additional parameter is passed which is the caller's value for (the-environment). APPLY of a FEXPR would pass an extension of (the-environment) in which a QUOTE form is certainly bound, and pass the arguments as QUOTED forms. (e.g., (apply and (list #t x)) with X bound to #f would be equivalent to (and '#t #f).

In WG0, the default "top-level" is immutable and contains only bindings for the WG0 primitives. The basic "unit" of code is a simple stream of S-exp forms. A dialect, such as R5RS, can be expressed as a WG0 program which, when run, ignores WG0 "units" and takes over reading, processing, and evaluating forms directly.

...

As to LAMBDA, can't it be defined in terms of FEXPR and EVAL?

...

yes. C.f. the Kernel programming language, roughly speaking. "

-- http://lambda-the-ultimate.org/node/3861#comment-57967

---

random notes from http://www.yacoset.com/Home/signs-that-you-re-a-bad-programmer

"Executing idempotent functions multiple times (eg: calling the save() function multiple times "just to be sure")"

" Object Oriented Programming is an example of a language model, as is Functional or Declarative programming. They're each significantly different from procedural or imperative programming, just as procedural programming is significantly different from assembly or GOTO-based programming. Then there are languages which follow a major programming model (such as OOP) but introduce their own improvements such as list comprehensions, generics, duck-typing, etc. Symptoms

... (OOP) Attempting to call non-static functions or variables in uninstantiated classes, and having difficulty understanding why it won't compile (OOP) Writing lots of "xxxxxManager" classes that contain all of the methods for manipulating the fields of objects that have little or no methods of their own (Relational) Treating a relational database as an object store and performing all joins and relation enforcement in client code (Functional) Creating multiple versions of the same algorithm to handle different types or operators, rather than passing high-level functions to a generic implementation (Functional) Manually caching the results of a deterministic function on platforms that do it automatically (such as SQL and Haskell) Using cut-n-paste code from someone else's program to deal with I/O and Monads (Declarative) Setting individual values in imperative code rather than using data-binding

... Phase 1: "OOP is just records with methods" Phase 2: "OOP methods are just functions running in a mini-program with its own global variables" Phase 3: "The global variables are called fields, some of which are private and invisible from outside the mini-program" Phase 4: "The idea of having private and public elements is to hide implementation details and expose a clean interface, and this is called Encapsulation" Phase 5: "Encapsulation means my business logic doesn't need to be polluted with implementation details"

...

Phase 1: "Functional programming is just doing everything by chaining deterministic functions together" Phase 2: "When the functions are deterministic the compiler can predict when it can cache results or skip evaluation, and even when it's safe to prematurely stop evaluation" Phase 3: "In order to support Lazy and Partial Evaluation, the compiler requires that functions are defined in terms of how to transform a single parameter, sometimes into another function. This is called Currying" Phase 4: "Sometimes the compiler can do the Currying for me" Phase 5: "By letting the compiler figure out the mundane details, I can write programs by describing what I want, rather than how to give it to me"

...

    Re-inventing or laboring without basic mechanisms that are built into the language, such as events-and-handlers or regular expressions
    Re-inventing classes and functions that are built into the framework (eg: timers, collections, sorting and searching algorithms)
    "Roundabout code" that accomplishes in many instructions what could be done with far fewer (eg: rounding a number by converting a decimal into a formatted string, then converting the string back into a decimal)
    Persistently using old-fashioned techniques even when new techniques are better in those situations (eg: still writes named delegate functions instead of using lambda expressions)

...

If you don't understand pointers then there is a very shallow ceiling on the types of programs you can write, as the concept of pointers enables the creation of complex data structures and efficient APIs. Managed languages use references instead of pointers, which are similar but add automatic dereferencing and prohibit pointer arithmetic to eliminate certain classes of bugs. They are still similar enough, however, that a failure to grasp the concept will be reflected in poor data-structure design and bugs that trace back to the difference between pass-by-value and pass-by-reference in method calls.

...

Symptoms

    Failure to implement a linked list, or write code that inserts/deletes nodes from linked list or tree without losing data
    Allocating arbitrarily big arrays for variable-length collections and maintaining a separate collection-size counter, rather than using a dynamic data structure
    Inability to find or fix bugs caused by mistakenly performing arithmetic on pointers
    Modifying the dereferenced values from pointers passed as the parameters to a function, and not expecting it to change the values in the scope outside the function
    Making a copy of a pointer, changing the dereferenced value via the copy, then assuming the original pointer still points to the old value
    Serializing a pointer to the disk or network when it should have been the dereferenced value
    Sorting an array of pointers by performing the comparison on the pointers themselves

...

You write busy-wait loops even when the platform offers event-driven programming

You don't use managed languages and can't be bothered to do bounds checking or input validation

...

The following count only when they're seen on a platform with Declarative or Functional programming features that the programmer should be aware of.

    Performing atomic operations on the elements of a collection within a for or foreach loop
    Writing Map or Reduce functions that contain their own loop for iterating through the dataset
    Fetching large datasets from the server and computing sums on the client, instead of using aggregate functions in the query
    Functions acting on elements in a collection that begin by performing a new database query to fetch a related record
    Writing business-logic functions with tragically compromising side-effects, such as updating a user interface or performing file I/O
    Entity classes that open their own database connections or file handles and keep them open for the lifespan of each object

...

If you are writing a program that works with collections, think about all the supplemental data and records that your functions need to work on each element and use Map functions to join them together in pairs before you have your Reduce function applied to each pair.

...

    Homebrew "Business Rule Engines"
    Fat static utility classes, or multi-disciplinary libraries with only one namespace
    Conglomerate applications, or attaching unrelated features to an existing application to avoid the overhead of starting a new project
    Architectures that have begun to require epicycles
    Adding columns to tables for tangential data (eg: putting a "# cars owned" column on your address-book table)
    Inconsistent naming conventions
    "Man with a hammer" mentality, or changing the definitions of problems so they can all be solved with one particular technology
    Programs that dwarf the complexity of the problem they solve
    Pathologically and redundantly defensive programming ("Enterprisey code")
    Re-inventing LISP in XML

...

3. Pinball Programming

When you tilt the board just right, pull back the pin to just the right distance, and hit the flipper buttons in the right sequence, then the program runs flawlessly with the flow of execution bouncing off conditionals and careening unchecked toward the next state transition. Symptoms

    One Try-Catch block wrapping the entire body of Main() and resetting the program in the Catch clause (the pinball gutter)
    Using strings/integers for values that have (or could be given) more appropriate wrapper types in a strongly-typed language
    Packing complex data into delimited strings and parsing it out in every function that uses it
    Failing to use assertions or method contracts on functions that take ambiguous input
    The use of Sleep() to wait for another thread to finish its task
    Switch statements on non-enumerated values that don't have an "Otherwise" clause
    Using Automethods or Reflection to invoke methods that are named in unqualified user input
    Setting global variables in functions as a way to return multiple values
    Classes with one method and a couple of fields, where you have to set the fields as the way of passing parameters to the method
    Multi-row database updates without a transaction
    Hail-Mary passes (eg: trying to restore the state of a database without a transaction and ROLLBACK)

...

You will need to make yourself familiar with the mechanisms on your platform that help make programs robust and ductile. There are three basic kinds:

    those which stop the program before any damage is done when something unexpected happens, then helps you identify what went wrong (type systems, assertions, exceptions, etc.),
    those which direct program flow to whatever code best handles the contingency (try-catch blocks, multiple dispatch, event driven programming, etc.),
    those which pause the thread until all your ducks are in a row (WaitUntil commands, mutexes and semaphores, SyncLocks, etc.)

There is also a fourth, Unit Testing, which you use at design time.

...

4. Unfamiliar with the principles of security

If the following symptoms weren't so dangerous they'd be little more than an issue of fit-n-finish for most programs, meaning they don't make you a bad programmer, just a programmer who shouldn't work on network programs or secure systems until he's done a bit of homework. Symptoms

    Storing exploitable information (names, card numbers, passwords, etc.) in plaintext
    Storing exploitable information with ineffective encryption (symmetric ciphers with the password compiled into the program; trivial passwords; any "decoder-ring", homebrew, proprietary or unproven ciphers)
    Programs or installations that don't limit their privileges before accepting network connections or interpreting input from untrusted sources
    Not performing bounds checking or input validation, especially when using unmanaged languages
    Constructing SQL queries by string concatenation with unvalidated or unescaped input
    Invoking programs named by user input
    Code that tries to prevent an exploit from working by searching for the exploit's signature
    Credit card numbers or passwords that are stored in an unsalted hash

Remedies

The following only covers basic principles, but they'll avoid most of the egregious errors that can compromise an entire system. For any system that handles or stores information of value to you or its users, or that controls a valuable resource, always have a security professional review the design and implementation.

Begin by auditing your programs for code that stores input in an array or other kind of allocated memory and make sure it checks that the size of the input doesn't exceed the memory allocated for storing it. No other class of bug has caused more exploitable security holes than the buffer overflow, and to such an extent that you should seriously consider a memory-managed language when writing network programs, or anywhere security is a priority.

Next, audit for database queries that concatenate unmodified input into the body of a SQL query and switch to using parameterized queries if the platform supports it, or filter/escape all input if not. This is to prevent SQL-injection attacks.

After you've de-fanged the two most infamous classes of security bug you should continue thinking about all program input as completely untrustworthy and potentially malicious. It's important to define your program's acceptable input in the form of working validation code, and your program should reject input unless it passes validation so that you can fix exploitable holes by fixing the validation and making it more specific, rather than scanning for the signatures of known exploits.

Going further, you should always think about what operations your program needs to perform and the privileges it'll need from the host to do them before you even begin designing it, because this is the best opportunity to figure out how to write the program to use the fewest privileges possible. The principle behind this is to limit the damage that could be caused to the rest of the system if an exploitable bug was found in your code. In other words: after you've learned not to trust your input you should also learn not to trust your own programs.

The last you should learn are the basics of encryption, beginning with Kerckhoffs's principle. It can be expressed as "the security should be in the key", and there are a couple of interesting points to derive from it.

The first is that you should never trust a cipher or other crypto primitive unless it is published openly and has been analyzed and tested extensively by the greater security community. There is no security in obscurity, proprietary, or newness, as far as cryptography goes. Even implementations of trusted crypto primitives can have flaws, so avoid implementations you aren't sure have been thoroughly reviewed (including your own). All new cryptosystems enter a pipeline of scrutiny that can be a decade long or more, and you want to limit yourself to the ones that come out of the end with all their known faults fixed.

The second is that if the key is weak, or stored improperly, then it's as bad as having no encryption at all. If your program needs to encrypt data, but not decrypt it, or decrypt only on rare occasions, then consider giving it only the public key of an asymmetric cipher key pair and making the decryption stage run separately with the private key secured with a good passphrase that the user must enter each time.

The more is at stake, then the more homework you need to do and the more thought you must put into the design phase of the program, all because security is the one feature that dozens, sometimes millions of uninvited people will try to break after your program has been deployed.

The vast majority of security failures traceable to code have been due to silly mistakes, most of which can be avoided by screening input, using resources conservatively, using common sense, and writing code no faster than you can think and reason about it.

"


so based on the previous:

'idempotent' should be an annotation

we should have:

---

here's an example that the JVM found hard to verify:

http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4381996

there is dynamicity in here; i suggest that we not require Oot to accept this code either (unless variables are auto-initialized)

---

" jph 2 days ago

Great pull request. Ruby makes it easy to duplicate data by calling `.dup` or `+`, and this does help with state isolation. But duplication is an expensive operation.

Ruby's standard libraries don't have much support for immutability, or deep cloning, or copy on write, or linking concatenation. There's no standard library way to ask for a snapshot of an object.

So in the early days for Ruby, an idiom was: if you're writing a method that takes a list, and you need to be sure your list doesn't change out from under you, then duplicate it, get it working, and if it becomes a bottleneck then optimize it.

reply " -- https://news.ycombinator.com/item?id=9195847

---

" How I write software

I like to think I write good code, or that I write more good code than bad code.

My favourite property of good code is boredom. Dull statements one after another, no surprises, no tricks, no special cases, and absolutely no meta-programming. Boring code is the easiest to debug, to verify, and to explain.

Boring code usually doesn’t use global state, has very precise effects, and isn’t strongly tied to the project it lives in. Boring features like single assignment, where each variable only holds one object, and not relying on changing the world around it. Boring code will only do what its asked of it and does not rely on implicit behaviours.

Implicit code can be where it is relying on unspecified but implemented behaviour of an API. For example, there is a lot of broken code which relies on the file system returning directory lists in a sorted order, which works most of the time and then fails mysteriously.

Implicit behaviours means there is more to keep in your head at any one time, and code that relies on them become harder to reason about, both locally and globally. When the behaviour changes, it will be a painful process to migrate code from one set of implicit behaviours to another (just ask anyone who has upgraded rails).

Unfortunately explicit code comes at a price, verbosity: boilerplate, repeated chunks of code to invoke a set of functions. If implicit code leads to fragility and spooky action at a distance, explicit code leads to tedious repeated effort.

It is easy to go too far with verbosity, but I will always err towards it — I agree with the Zen of Python that ‘explicit is better than implicit’. Java errs too far, although you can combine all of the bits to read lines from a file, it’s a few lines of repeated code each time. The alternative is convention over configuration, which makes the common use case easy and everything else impossible.

I try to get the best of both worlds by layering my API. The lower half of the api is java-esque: smaller components, simpler behaviours, but requiring effort to use and assemble. The upper half of the api is the humane side: built in workflows with assumptions baked in about how you will use it.

Having two layers of API is something I’ve seen in one of my favourite python libraries, requests. Requests presents a very concise and humane api which covers the majority of use cases, but underneath lies urllib3, which handles the grunt work of HTTP. It’s a lot of work for most internal libraries, but anything significant can benefit from this split between mechanism and policy. "
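
the directory-order trap mentioned above, in Python terms (a minimal sketch): os.listdir makes no ordering guarantee, so code that needs an order has to ask for it explicitly:

    import os

    names = os.listdir(".")          # arbitrary order: often happens to look
                                     # sorted, then fails mysteriously elsewhere
    names = sorted(os.listdir("."))  # explicit: the required order is in the code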

---

" ... Parnas’ “On the Criteria To Be Used in Decomposing Systems into Modules”–

    We propose instead that one begins with a list of difficult design decisions or design decisions which are likely to change. Each module is then designed to hide such a decision from the others.

Parnas argues that the point of modularity is not one of reuse, but one of concealment, or abstraction: hiding assumptions from the rest of the program. Another way to look at this is how easily an implementation could be grown, deleted, rewritten, or swapped with a different system altogether, without changing the rest of the system.

Unfortunately, decomposition is genuinely hard: breaking your code into pieces does not always mean that the assumptions end up in different parts: it’s very easy to build a system out of modules that tightly depend on each other. Learning how to decompose software is a hard thing to do, and you will have to make a lot of mistakes before you start to get it right.

It is a tradeoff: a module brings extra overhead, and can be harder to understand where it fits into the larger system, but can bring simplicity and easier maintenance too.

Distributed systems

Decomposition, like many things in life, gets harder when you have more computers involved. You must decide how to split up the code, and also decide how to split it across the computers involved. Like with bits of a game world spread across a litany of global variables, spreading bits of state across a network is a similar world of pain and suffering.

Splitting things across a network means that the system will have to tolerate latency, and partial failure, and it is impossible to tell a slow component from a broken one. Keeping data in sync across a network while tolerating failure is an incredibly hard engineering problem known as consensus.

    In my experience, all distributed consensus algorithms are either:
    1: Paxos,
    2: Paxos with extra unnecessary cruft, or
    3: broken. - Mike Burrows

Although consensus can be avoided, the underlying problems cannot. Decomposing a system (into parts that run on different machines) is neither straightforward nor easy, but far more treacherous. There are many techniques to make it easier, like statelessness, idempotence, and process supervision, and many others worth discovering too — but one technique stands out above all: uniformity.

It’s easier to handle talking to a bunch of machines if they can be expected to behave in a similar manner.

...

Exposing the asynchronous nature of a network call can seem counterintuitive to Parnas’ advice on decomposition: surely the network is hard and likely to change and therefore worth hiding? Almost. The nature of the network protocol involved, and the particular machine involved are worth hiding, but hiding that the network is unreliable does not let code deal with it effectively. " -- http://programmingisterrible.com/post/110292532528/modules-network-microservices
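
to make the idempotence point concrete: when you can't tell a slow component from a broken one, the only safe reaction is to retry, and retries are only safe if the operation is idempotent. A minimal sketch (names invented):

    # not idempotent: retrying a request that timed out but actually
    # applied will double-count
    def add_credit(store, account, amount):
        store[account] = store.get(account, 0) + amount

    # idempotent: applying the same request twice leaves the same state,
    # so the client may blindly retry on timeout
    def set_balance(store, account, new_balance):
        store[account] = new_balance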

---

ok, kind of crazy, but this: https://joearms.github.io/2015/03/12/The_web_of_names.html clearly specifies a layered system with three kinds of names (see the post for the layers).

(HN discussion: https://news.ycombinator.com/item?id=9207576 )


the previous suggests that Oot needs a concept of a variable/document that has various revisions, as well as the concept of values (copied to ootCoreNotes2)

---

https://news.ycombinator.com/item?id=9202189

---

(already copied to plbook-plChCriteria :

[3] criticizes Golang for being "too paratactic"

"Parataxis is a literary technique ... that favors short, simple sentences, with the use of coordinating rather than subordinating conjunctions" -- https://en.wikipedia.org/wiki/Parataxis . Eg "The sun was shining brightly. We went for a walk." instead of "The sun was shining brightly, so we went for a walk."

end part in plbook)

Why could "too paratactic" be bad?

Mb readability is also about expressing WHY the code does what it does; in the example, the subordinate clause makes it clear that "we went for a walk" BECAUSE "the sun was shining brightly", a fact that must be inferred in the paratactic formulation.

--

" Date: Fri, 20 Mar 2015 16:21:18 +0000 From: Sean Hammond <seanh@hypothes.is> To: dev@list.hypothes.is Subject: [h-dev] Import styles

Things I don't like about the `from package import module` that we're using everywhere:

1. Makes it hard to grep the codebase for all uses of a module. Can't just grep for "foo.bar.baz", now it might be from foo.bar import baz, from .bar import baz, from . import baz, from foo.bar import gar, dar, baz, jaz, etc.

2. Increases "travel" when trying to read code. I just see `baz`, I have to go up to the top of the file to see that that is in fact `foo.bar.baz`. Multiply that across everything we import in every file, makes code much harder to understand imo.

3. Pollutes namespaces, a module with lots of `from X import Y` has a high chance of some function, local variable, etc name clashing with something imported.

4. Understanding the meaning of an import statement requires taking into account the location of the file, hurts readability imo.

5. "from package.subpackage import module" is too close to the even worse "from package.module import object", encourages it. It's simpler to just say "no 'from' imports".

If I had my way we would always use absolute imports, just always:

import h.foo.bar

And always one import per line. Each module has only one canonical way to be imported. You can grep for "import foo.bar" to find all imports, or just "foo.bar" to find all uses not just imports. You can see right away in the code whenever h.foo.bar is being used because it says h.foo.bar right there.

To keep verbosity under control name packages and modules well and don't have too deep package hierarchies.

When all else fails you can be more concise by falling back on:

import h.foo.bar.baz.gaz as gaz

This still shows up in greps for "h.foo.bar.baz.gaz", but only the import shows up, not each individual use. It introduces a little bit of travel to understand what "gaz" is when you see it used inside the module, but for the benefit of not having to write h.foo.bar.baz.gaz every time. Worth it when the full module name is really long.

...

I'm going to frame my reply with these guiding principles in place:

from foo import bar
bar.jump()

is better than

from foo.bar import jump
jump()

For reasons of test patching and knowing where things come from. You've touched on both of these.

It's totally fine to have whatever subpackages you like, inside your module. However, if there's things that are expected to be public, they should be "lifted" at each level, and exposed in `__all__`. In this way, other code can import these symbols without reaching deeply into the package and its subpackages / submodules and within the context of a package refactors are free to move things around without breaking the package's API.

If we obey the above, we should never have something like `from foo.bar.baz import booz` because `baz` would export `booz` and `bar` would export `baz` if there's any reason to believe `booz` is public API for the whole package. Therefore, we would expect to only ever see `from foo import booz`.

Let's jump in, then!

...

TL;DR: For the reasons below, and taking account of Randall's email which has arrived since I started writing this, I suggest we agree to eliminate relative imports, but stick with using `from foo.bar import baz` as an aid to readability.

" -- dev at list.hypothes.is , https://groups.google.com/a/list.hypothes.is/d/msgid/dev/B73ADBF2-4752-4D70-8D24-7661D8081D14%40hypothes.is

---


todo learn how J does these from [4]:

  avg=: +/ % #        NB. arithmetic mean (defined just before this excerpt in the source)
  v=: ?. 20 $ 100     NB. a random vector
  v
46 55 79 52 54 39 60 57 60 94 46 78 13 18 51 92 78 60 90 62
  avg v
59.2
  4 avg\ v            NB. moving average on periods of size 4
58 60 56 51.25 52.5 54 67.75 64.25 69.5 57.75 38.75 40 43.5 59.75 70.25 80 72.5
  m=: ?. 4 5 $ 50     NB. a random matrix
  m
46  5 29  2  4
39 10  7 10 44
46 28 13 18  1
42 28 10 40 12
  avg"1 m             NB. apply avg to each rank 1 subarray (each row) of m
17.2 22 21.2 26.4

Rank is a crucial concept in J. Its significance in J is similar to the significance of "select" in SQL and of "while" in C.

Here is an implementation of quicksort, from the J Dictionary:

   sel=: adverb def 'u # ['
 
   quicksort=: verb define
    if. 1 >: #y do. y
    else.
     (quicksort y <sel e),(y =sel e),quicksort y >sel e=.y{~?#y
    end.
   )

The following is an implementation of quicksort demonstrating tacit programming. Tacit programming involves composing functions together and not referring explicitly to any variables. J's support for forks and hooks dictates rules on how arguments applied to this function will be applied to its component functions.

   quicksort=: (($:@(<#[), (=#[), $:@(>#[)) ({~ ?@#)) ^: (1<#)

Sorting in J is usually accomplished using the built-in (primitive) verbs /: (Sort Up) and \: (Sort Down). User-defined sorts such as quicksort, above, typically are for illustration only.

The following expression exhibits pi with n digits and demonstrates the extended precision capabilities of J:

  n=: 50                      NB. set n as the number of digits required
  <.@o. 10x^n                 NB. extended precision 10 to the nth * pi
314159265358979323846264338327950288419716939937510

---

todo also learn the two J examples in:

http://stackoverflow.com/a/2753695/171761

--

http://kukuruku.co/hub/funcprog/j-can-be-readable notes in "Hooks and Forks are Your Friends":

(f g) y ⇔ y f (g y)

x (f g) y ⇔ x f (g y)

(f g h) y ⇔ (f y) g (h y)

x (f g h) y ⇔ (x f y) g (x h y)

--

for unary hooks and forks in J, [5] gives the examples of

(= <.) "a whole number is equal to its floor"

and

mean =: sum % # "the mean of a list of numbers is the sum divided by the number-of-items"

look at the way the pronouns and determiners are used in English:

a NUMBER equal to ITS floor

ITS tells us that floor takes NUMBER as its argument

avg of A LIST of numbers is THE sum divided by THE length

THE tells us that SUM and LENGTH might have context-dependent arguments; in this case the context is the LIST, which is represented by an argument to them

(also 'avg OF a list' indicates that the list is the input argument)

also, note that in J, for unary hooks and for both unary and binary forks, we have the same argument or arguments being applied in the same way throughout.

Consider the unary case: here the pointfree ('tacit' in J-speak) form is a special way to define functions of one argument; the common case for which this is syntactic sugar is when the single argument of the overall function appears multiple times in the function definition; each time it appears it needs to bind tightly to something

eg consider J's unary fork: (f g h) y ⇔ (f y) g (h y). We are defining a function that takes one input (namely, y). But this input needs to appear in multiple places in the fn defn. Each time it appears, it binds tightly (eg (f y) must be placed in parens).

So, mb we can generalize this, also making it easy to read: what we mean is really just:

fY g hY

since there is only one function input, we could instead have something even easier to type, eg:

f. g h.

the binary fork could use the same notation; it is assumed that when the overall function is binary, we just give both args to the subfunctions selected by the '.':

binary-f = f. binary-g h.
x binary-f y = (x f y) binary-g (x h y)

J's unary hook is similar, if you allow sections:

(.f g.) y == (y f) (g y) == y f (g y)

J's binary hooks would not work in this notation, however; we'd need to use explicit grouping and to explicitly name at least one function input:

x binary-hook-fn = (x f) g.
x binary-hook-fn y == (x f) (g y) == x f (g y)

instead of using ., which is highly in demand for syntax, we could use ~ or _. These are both already shifted, however, so why not just use 'X' and 'Y'? Well, i think maybe we should use X and Y, but some reasons why not: (a) ~ is easier to type than other shifted things b/c it's in the corner, (b) _ makes a lot of sense as 'a hidden variable argument'

so eg:

f~ g h~

fY g hY

f_ g h_

note that "fY g hY" doesn't make quite as much sense for the case of the binary fork, where you may want XfY g XhY

in MLish notation, i guess you could use H f g for unary hooks and F f g h for unary forks, then define them as:

H f g y = y f (g y)
F f g h y = (f y) g (h y)

but this does not let us use the same H and F for binary hooks and forks, and it probably doesn't cover the longer-length constructions in http://jsoftware.com/help/learning/09.htm (to do that, H and F would have to be essentially variadic)
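
those H and F definitions translate directly into higher-order functions; a minimal Python sketch (using the H/F names from the note above; operator and math are just stand-ins for J's verbs):

    import math, operator

    def H(f, g):                      # unary hook:  H f g y = y f (g y)
        return lambda y: f(y, g(y))

    def F(f, g, h):                   # unary fork:  F f g h y = (f y) g (h y)
        return lambda y: g(f(y), h(y))

    # "the mean of a list is the sum divided by the number of items"
    mean = F(sum, operator.truediv, len)
    mean([1, 2, 3, 4])                # 2.5

    # "a whole number is equal to its floor"
    is_whole = H(operator.eq, math.floor)
    is_whole(3.0)                     # True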

--

i guess as a compromise between JVM safety-through-linktime-validation and trust-the-compiler-to-ensure-safety-in-complicated-situations-that-the-runtime-doesnt-understand, we could allow the compiler to assert 'i've checked this and it's good, even though you wouldn't understand the proof', and allow the runtime to just believe the compiler sometimes. This runtime behavior would require a commandline flag, though, b/c it wouldn't be the default. By default, the compiler will insert unnecessary runtime checks in that sort of situation.

i presume allowing assertions 'i can't prove it but do this anyway' is much like 'unsafe' in .NET

--

"

rwmj 3 days ago

Erlang pattern matching over binary data is one of the greatest things about the language. I basically copied the idea wholesale for ocaml-bitstring[1].

[1] https://code.google.com/p/bitstring/

reply "

--

distinction between 'call stack' and 'block stack' (scope stack): within one called function one might have sub-blocks (sub-scopes); for example, the scope of a for loop (which is left when the loop terminates normally, or abnormally via a 'break'), the scope of a scoped 'let' variable declaration, or the scope of a 'try' block (to which 'catch' and 'finally' handlers are attached statically (as in Java's try/catch/finally) or dynamically (as with Golang's defer)).
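
a minimal sketch of the distinction (all names invented): each call-stack frame carries its own block stack, and a 'break' or exception unwinds blocks, running their attached handlers, without popping the frame:

    class Frame:
        def __init__(self, fn_name):
            self.fn_name = fn_name
            self.blocks = []                  # the block (scope) stack

        def enter(self, kind, on_exit=None):  # e.g. 'loop', 'let', 'try'
            self.blocks.append((kind, on_exit))

        def unwind_to(self, kind):            # e.g. 'break' targets 'loop'
            while self.blocks:
                k, on_exit = self.blocks.pop()
                if on_exit:
                    on_exit()                 # 'finally'/defer handlers run here
                if k == kind:
                    return
            raise RuntimeError("no enclosing %r block" % kind)

    call_stack = [Frame("main")]
    f = call_stack[-1]
    f.enter("loop")
    f.enter("try", on_exit=lambda: print("finally"))
    f.unwind_to("loop")   # a 'break': runs the finally handler, exits the loop scope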

--

picat-style logic programming stuff (tabling (i.e. memoization) schemes, nondeterministic choice logic programming ("here are multiple alternative paths")) should be pluggable in Oot, as should a type system, GC, etc

--

Hoon has 'kelps'; we want kelps as a DSL/metaprogrammy construct. Is Hoon good for this sort of metaprogramming? If so, why? Does Hoon have something like 'a type system for macros', being as it is combinatorial (macro-oriented)?

--

MSDOS joke:

https://news.ycombinator.com/item?id=9302172

--

there needs to be a standard syntax-ish thing for 'translate this Oot-style query into an SQL query, or into a Redis query, etc'
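
a sketch of what that could look like (all names hypothetical): keep the query as a small backend-neutral value and let each backend render it:

    from dataclasses import dataclass

    @dataclass
    class Query:                # a deliberately tiny query AST
        source: str
        field: str
        value: object

    def to_sql(q: Query) -> str:
        return f"SELECT * FROM {q.source} WHERE {q.field} = {q.value!r}"

    def to_redis(q: Query) -> str:
        # assumes a set-per-value secondary index; purely illustrative
        return f"SMEMBERS {q.source}:{q.field}:{q.value}"

    q = Query("users", "age", 21)
    to_sql(q)     # "SELECT * FROM users WHERE age = 21"
    to_redis(q)   # "SMEMBERS users:age:21"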

--