proj-oot-old-150618-ootNotes9

a few insights i've had recently:

---

potential platform-specific primitives such as modular-addition-with-carry-and-overflow will probably be implemented in the std library in terms of arbitrary-precision addition; but then implementations will override these with platform-specific calls. To generalize this, a user should be able to override any accessible library function with their own FFI implementation.
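
a rough Python sketch of that override idea (the names add_with_carry and register_override are mine, purely for illustration): the std library ships a portable fallback written in terms of arbitrary-precision ints, and a platform port (or a user, via FFI) can install its own implementation under the same name:

  # hypothetical names; the point is portable fallback + override hook
  _overrides = {}

  def register_override(name, fn):
      """Install a platform-specific (e.g. ctypes/FFI-backed) implementation of `name`."""
      _overrides[name] = fn

  def add_with_carry(a, b, carry_in=0, width=64):
      """Portable fallback: modular add with carry, via arbitrary-precision ints."""
      impl = _overrides.get('add_with_carry')
      if impl is not None:
          return impl(a, b, carry_in, width)
      total = a + b + carry_in
      mask = (1 << width) - 1
      return total & mask, total >> width   # (result mod 2^width, carry out)

  add_with_carry(2**64 - 1, 1)   # => (0, 1), with the fallback or with any override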

---

" Decision 3: Kick RDF in the Nuts

RDF is a shitty data model. It doesn’t have native support for lists. LISTS for fuck’s sake! The key data structure that’s used by almost every programmer on this planet and RDF starts out by giving developers a big fat middle finger in that area. Blank nodes are an abomination that we need, but they are applied inconsistently in the RDF data model (you can use them in some places, but not others). When we started with JSON-LD, RDF didn’t have native graph support either. For all the “RDF data model is elegant” arguments we’ve seen over the past decade, there are just as many reasons to kick it to the curb. This is exactly what we did when we created JSON-LD, and that really pissed off a number of people that had been working on RDF for over a decade.

I personally wanted JSON-LD to be compatible with RDF, but that’s about it. You could convert JSON-LD to and from RDF and get something useful, but JSON-LD had a more sane data model where lists were a first-class construct, you had generalized graphs, and you could use JSON-LD using a simple library and standard JSON tooling. To put that in perspective, to work with RDF you typically needed a quad store, a SPARQL engine, and some hefty libraries. Your standard web developer has no interest in that toolchain because it adds more complexity to the solution than is necessary.

So screw it, we thought, let’s create a graph data model that looks and feels like JSON, RDF and the Semantic Web be damned. That’s exactly what we did and it was working out pretty well until…

...

That said, after 7+ years of being involved with Semantic Web / Linked Data, our company has never had a need for a quad store, RDF/XML, N3, NTriples, TURTLE, or SPARQL. When you chair standards groups that kick out “Semantic Web” standards, but even your company can’t stomach the technologies involved, something is wrong. That’s why my personal approach with JSON-LD just happened to be burning most of the Semantic Web technology stack (TURTLE/SPARQL/Quad Stores) to the ground and starting over. It’s not a strategy that works for everyone, but it’s the only one that worked for us, and the only way we could think of jarring the more traditional Semantic Web community out of its complacency. "

---

http://www.w3.org/TR/json-ld/#basic-concepts

http://json-ld.org/playground/

http://json-ld.org/learn.html

---

" Many of the designs for these stack computers have their roots in the Forth programming language. This is because Forth forms both a high level and assembly language for a stack machine that has two hardware stacks: one for expression evaluation/parameter passing, and one for return addresses. In a sense, the Forth language actually defines a stack based computer architecture which is emulated by the host processor while executing Forth programs. The similarities between this language and the hardware designs is not an accident. Members of the current generation of stack machines have without exception been designed and promoted by people with Forth programming backgrounds. " -- http://users.ece.cmu.edu/~koopman/stack_computers/sec1_5.html (1989)

---

"Lua is much much simpler than Javascript, which is again simpler than Python (Python is really vast)." -- https://news.ycombinator.com/item?id=5310186

"I've been working on a range of JIT compilers, but the PyPy? is the biggest one. The problem is ... that interaction between (semantics) is a headache... descriptors, metaclassses, new/old style classes, tons of builtin types, tons of builtin modules, all of it the user will expect to seamlessly integrate with the JIT compiler." -- https://news.ycombinator.com/item?id=5331454

---

evincarofautumn 551 days ago

I develop a compiler for ActionScript 3. Though it’s not a great language, it does have a few distinct advantages over its cousin JavaScript. First, type annotations. Obviously the more knowledge you have about the structure of the program, the better you can help it run well. Having a class model instead of a prototype model also helps—much as I like working with prototypes—because you can easily eliminate late binding (“hash lookups”) and allocations, two of the performance killers mentioned in the slides. The operative word there is easily. You can analyse and JIT all you want, but you cannot feasibly solve at runtime the fundamental constraints imposed by the language.

Say a compiler author needs twice as much effort and cleverness to make programs in language X run fast as for language Y. That means that—all other things being equal—implementations of X will be twice as slow as Y for the same basic quality of implementation.

---

    "This function handles sensitive information, and the compiler must ensure that upon return all system state which has been used implicitly by the function has been sanitized." 

While I am not a compiler developer, I don't think this is an entirely unreasonable feature request: Ensuring that registers are sanitized can be done via existing support for calling conventions by declaring that every register is callee-save, and sanitizing the stack should be easy given that that compiler knows precisely how much space it has allocated.

With such a feature added to the C language, it will finally be possible — in combination with memset_s from C11 — to write code which obtains cryptographic keys, uses them without leaking them into other parts of the system state, and then wipes them from memory so that a future system compromise can't reveal the keys. People talk a lot about forward secrecy; it's time to do something about it.

pslam 19 hours ago

Part 2 is correct in that trying to zero memory to "cover your tracks" is an indication that You're Doing It Wrong, but I disagree that this is a language issue.

Even if you hand-wrote some assembly, carefully managing where data is stored, wiping registers after use, you still end up with information leakage. Typically the CPU cache hierarchy is going to end up with some copies of keys and plaintext. You know that? OK, then did you know that typically a "cache invalidate" operation doesn't actually zero its data SRAMs, and just resets the tag SRAMs? There are instructions on most platforms to read these back (if you're at the right privilege level). Timing attacks are also possible unless you hand-wrote that assembly knowing exactly which platform it's going to run on. Intel et al have a habit of making things like multiply-add have a "fast path" depending on the input values, so you end up leaking the magnitude of inputs.

Leaving aside timing attacks (which are just an algorithm and instruction selection problem), the right solution is isolation. Often people go for physical isolation: hardware security modules (HSMs). A much less expensive solution is sandboxing: stick these functions in their own process, with a thin channel of communication. If you want to blow away all its state, then wipe every page that was allocated to it.

Trying to tackle this without platform support is futile. Even if you have language support. I've always frowned at attempts to make userland crypto libraries "cover their tracks" because it's an attempt to protect a process from itself. That engineering effort would have been better spent making some actual, hardware supported separation, such as process isolation.

acqq 1 hour ago

The "right privilege level" allows you to see anything that happens during the execution of the lower privilege levels. I can even single-step your application with the right privilege level. So the crypto services have to run at the high privilege level and ideally your applications should leave even the key management to the "higher privilege levels." That way attacking the application can leak the data, but not the key, that is, you can still have the "perfect forward secrecy" from the point of the view of the application. So you have to trust the OS and the hardware and implement all the tricky things on that level. Trying to solve anything like that on the language level doesn't seem to be the right direction of the attacking the problem.

tgflynn 15 hours ago

So is it correct to say that if a process does not want to leak information to other processes with different user ID's running under the same kernel that a necessary (but not necessarily sufficient, due to things like timing attacks) condition is for it to ensure that any allocated memory is zero'd before being free'd ?

I wonder if current VM implementations are doing this systematically.

It seems like a kernel API to request "secure" memory and then have the kernel ensure zeroing would be useful. Without this I'm wondering if it's even possible for a process to ensure that physical memory is zero'd, since it can only work with virtual memory.

pslam 12 hours ago

All kernels I know of zero all memory they hand over to user processes. It's been part of basic security for quite some time - exactly for this kind of thing. It's usually done on allocation, not free - it doesn't really matter which way around, but doing it "lazily" can often be better performance.

---

told phill about my Views idea. His feeling was that it's useless b/c you can do the same thing with multiple inheritance. I'm not sure if there is anything views can do that multiple inheritance can't, i'm still thinking about that. If not, then my arguments are:

---

next-gen systems that people mention/that i've heard of:

https://news.ycombinator.com/item?id=8324578

inferno, plan 9, minix, "Oberon variants, specially System 3 release with its gadgets.", Amoeba

---

sergiosgc 1 day ago

> I cherish Microsoft for its work in Windows, it is the only mainstream OS that can save us from UNIX monoculture.

I tend to simplistically reduce this choice between:

Both choices have the potential for great results. In practice, however, the documentation and composability in the everything is a file paradigm have been shown to produce better results. Unix ecosystems are more alive; it is easier to replace system parts and create operating systems out of your own choice of replaceable layers. The monolithic nature of Windows is a reflection on the failure of the object oriented approach to operating system interfaces.

Mind you, I don't think it is an intrinsic fault. It may be that it's a fault of this implementation of an OO operating system.

This is the main reason I dislike what the Gnome folks are pushing down Linux's throat. The Gnome way is, today, at odds with the Unix way.

fragmede 1 day ago

That's an over-simplification, as when you get down to it, there are plenty of things that don't use the file API on UNIX, or at least have convenient wrappers so you're not blindly calling 'ioctl' left and right. I'm thinking specifically of the socket API for bind/connect/accept, the inotify interface, with inotify_* functions, the (less commonly used these days) serial port API, with the tc* and cf* functions.

> It may be that it's a fault of this implementation of an OO operating system.

I'm not convinced it's just this implementation. When everything's an object, there's additional overhead as a programmer trying to use the API.

File based API - Open a file using the same 'open' call you've always used, and then call 'read' or 'write' (or even 'ioctl') on the file descriptor. Call 'close' when done. There may be multiple files you need to access to get the particular behavior you want.

Object based API - Browse around the docs until you find that particular subsystem's API. Read a bunch. Finally instantiate the object you need. Set all of the object's various properties. Call CreateObject(object, options). Wait, go back and create an options object. Set all of the properties on that object as well. Try that until you figure out you needed to call CreateObjectEx instead. Hit your head on the desk for a couple hours until you figure out that you also need to call RunObject, and that one of the parameters on the options object was set wrong.

As a programmer, the file-based API is just a layer of indirection to make things easier, and shouldn't be considered a limiting factor.

Then again, there's this article about Event Tracing on Windows, that makes me think it's just this iteration of OO-based operating system, and a better designed API would do us all a favor.

http://mollyrocket.com/casey/stream_0029.html

taeric 1 day ago

That is more easily reduced to

That is, the "object" way sounds more promising and unlimited. I think, in practice, it is that unlimited nature that actually hurts.

Though, I hate the "paradox of choice" and I think I am saying it reduces to that. :(

jude- 1 day ago

I don't think it's the paradox of choice at work. I think instead that the set of file operations happens to permit arbitrary interaction with an object. Specifically, file operations already represent:

I would argue that any object (OS-level or otherwise) can be represented as a filesystem. The advantage of doing so is that the client interface stays the same in all cases. I think it's this constraint on the interface that makes file operations and filesystems so appealing in the long run.

pjmlp 1 day ago

> Both choices have the potential for great results. In practice, however, the documentation and composability in the everything is a file paradigm has shown to produce better results.

Until you start doing IPC between processes, IOCTL, heavy graphics programming and there goes the abstraction.

ahomescu1 23 hours ago

> - Everything is an object (the windows way)

I wouldn't say there is a "Windows way", considering that current Windows is actually a system composed of several different layered components:

1) The NT kernel (which is pretty nicely designed) has its own ideas of what "objects" are and all the ObXXX functions.

2) Win32 which is a horrible mess from the Windows 3.1 days and maybe earlier.

3) COM and flavors, which actually came later with its own brand of objects.

4) The .NET framework.

It doesn't seem to me that "Windows objects" are really unified in spirit and design across all these components (not to mention that Win32 isn't object-oriented at all).

metafex 1 day ago

> Yes, UNIX has a bunch of nice ideas, but that is all.

Ouch, that actually hurts a little. Yes, the kernel team at Microsoft is the real deal, but I'll not go into that.

As to "a bunch of nice ideas", I'd suggest you take a look at the concepts that Plan 9 from Bell Labs brings to the table. The everything is a file concept there is something I wish would have taken over the more classic approach to devices, networking, etc. Want to open a remote shell on another server? Easy as opening a file handle. Accessing the CPU and running programs? Same story. Now if you call that just a nice idea, I rest my case ;-)

Microsoft even explored some of those concepts in their research OSes like Singularity and Midori. There has to be something about those ideas that they appeal even to MS, that's just my view though.

pjmlp 1 day ago

Quoting Rob Pike on his Slashdot interview:

We really are using a 1970s era operating system well past its sell-by date. We get a lot done, and we have fun, but let's face it, the fundamental design of Unix is older than many of the readers of Slashdot, while lots of different, great ideas about computing and networks have been developed in the last 30 years. Using Unix is the computing equivalent of listening only to music by David Cassidy.

jacquesm 1 day ago

I think the main reason why 'everything is a file' is so powerful is because it comes with a built in security model.

pjmlp 1 day ago

Plan 9 != UNIX

metafex 1 day ago

Yeah, I know, it's the unix concept thought further :)

gillianseed 1 day ago

>I cherish Microsoft for its work in Windows, it is the only mainstream OS that can save us from UNIX monoculture.

What is Windows doing differently than UNIX monolithic kernels? Drivers and filesystems run in NT kernel space.

pjmlp 1 day ago

gillianseed 6 hours ago

>Moving their kernel to C++

Not sure this matters given the extreme low-level nature of kernel and driver coding. Certainly not enough to warrant a rewrite, which I guess is a result of Microsoft more or less dropping support for modern C in their compiler by not even supporting C99 fully.

>Experimenting with new OS architectures

Certainly Singularity was an interesting experiment, sadly and typical of Microsoft mentality, once they ruled it out for production use and passed it off for others to play with, they chose to do so under such a restrictive license (shared source) that no one will do something interesting with it.

As for drawbridge, the same ideas regarding sandboxing are being worked on in the Linux camp, and from what I understand also in solutions like Capsicum on FreeBSD; from the looks of it, *NIX is way ahead of Microsoft in this area.

>Continue the multimedia OS culture of Atari ST, Amiga systems

Not sure what you mean here unless you are talking about coming with a GUI out-of-the-box ?

Your typical current Linux distro is more of a 'multimedia OS' than Atari ST or Amiga ever was (owned them both), which includes BeOS, the one which actually carried that moniker.

vezzy-fnord 1 day ago

You're saying it as if having a bunch of nice ideas that have been practically implemented is a negligible thing.

To add on to what a previous commenter said, Plan 9 is proof that Unix's nice ideas are more than a few. In fact, the ultimate irony is that some of the most fascinating OS design has emerged from being more Unix than Unix itself.

Plan 9 has also become pretty much the only breath of fresh air from an architectural standpoint in Linux, what with procfs and all.

pjmlp 1 day ago

Except Plan 9 is not UNIX. It is an evolution of UNIX.

I seriously doubt I could use such abstractions for high performance 3D computing.

---

http://akkartik.name/post/wart-layers has the idea of a program built out of (an ordered sequence of) layers which each include their own tests; (the sum of a prefix of) layers are like versions, and the result of stacking any prefix of layers is a program which passes its own tests.

(i guess these layers are not quite like chronological versions because you can make changes to an underlying layer after you have added a new layer on top of it)
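
a tiny Python sketch of the layers idea as i understand it (the names and the dict-as-program representation are mine, not wart's): a program is an ordered list of layers, each layer both extends the program and carries its own tests, and building any prefix yields a program that passes all of that prefix's tests:

  def layer_core(prog):
      prog['fact'] = lambda n: 1 if n <= 1 else n * prog['fact'](n - 1)

  def layer_core_tests(prog):
      assert prog['fact'](5) == 120

  def layer_memo(prog):                      # a later layer wraps an earlier one
      inner, cache = prog['fact'], {}
      def fact(n):
          if n not in cache:
              cache[n] = inner(n)
          return cache[n]
      prog['fact'] = fact

  def layer_memo_tests(prog):
      assert prog['fact'](10) == 3628800

  LAYERS = [(layer_core, layer_core_tests), (layer_memo, layer_memo_tests)]

  def build(prefix_len):
      prog = {}
      for build_layer, _ in LAYERS[:prefix_len]:
          build_layer(prog)
      for _, test_layer in LAYERS[:prefix_len]:   # the stacked prefix passes its own tests
          test_layer(prog)
      return prog

  build(1)   # like "version 1": plain factorial
  build(2)   # like "version 2": memoized factorial, still passing layer 1's tests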

---

Sae Hirak • a year ago

I really like this idea (akkartik's wart-layers idea, above). I'm going to experiment with adding it to Nulan and see what happens. I suspect hyper-static scope will get in the way, though. We'll see.

Kartik Agaram, replying to Sae Hirak • a year ago

Thanks! Yeah, tell me how it goes. I too want to try it with a high-level language, and js would be a good candidate. I wouldn't be surprised if the metaprogramming features of a HLL obviate the need for my patch-based directive format.

---

it seems to me that akkartik's wart-layers idea is a special case of plug-in architecture, and that language support for plug-in architecture is better

---

related to akkartik's wart-layers idea:

aspect-oriented programing (AOP)

http://www.ruby-doc.org/core-2.0.0/Module.html

---

http://akkartik.name/post/tracing-tests

the idea is that instead of unit tests asserting on return values, the program should log the 'facts' it computes to a 'trace', and the tests should then search the 'trace' to make sure the correct 'facts' were computed. The benefit is that the program can be refactored without changing the tests too much (b/c it doesn't matter where in the control flow the fact was added to the trace, just that it was added before the test checked it)

sounds like a pretty good idea
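
a minimal Python sketch of how i imagine this working (TRACE, trace() and the gcd example are my own illustration, not akkartik's code): code under test appends facts to a global trace, and the test just searches the trace:

  TRACE = []

  def trace(label, *args):
      TRACE.append((label, args))

  def gcd(a, b):
      while b:
          trace('gcd-step', a, b)
          a, b = b, a % b
      trace('gcd-result', a)
      return a

  def test_gcd():
      del TRACE[:]
      gcd(12, 18)
      # the test only cares that the right fact appears somewhere in the trace,
      # not which function or call site produced it
      assert ('gcd-result', (6,)) in TRACE

so a refactor that moves where the result gets computed doesn't have to touch the test.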


" Can you imagine indenting code as follows in python?

    i=0, j=1
        k, l = 34, 65
            [i, k] = [k, i]

Each of these assignments corresponds to a different special form in Common Lisp:

    (let* ((i 0)
           (j 1))
      (multiple-value-bind (k l) (values 34 65)
        (destructuring-bind (i k) '(k i)
          ...)))

Verbose and visually cluttered. When I realized returning multiple values was turning into such a pain it was turning me off values, I created the bind macro in response:

    (bind (i 0)
          (j 1)
          (:mv (k l) (values 34 65))
          (:db (i k) '(k i))
      :do
          ...)

Much better0,1. " -- http://akkartik.name/lisp.html

---

Wart's solution for the apply-with-macros problem:

http://arclanguage.org/item?id=16378

---

paul graham also had the idea of unifying array lookup and functions:

" A more serious problem is the diffuseness of prefix notation. For expert hackers, that really is a problem. No one wants to write (aref a x y) when they could write a[x,y].

In this particular case there is a way to finesse our way out of the problem. If we treat data structures as if they were functions on indexes, we could write (a x y) instead, which is even shorter than the Perl form. Similar tricks may shorten other types of expressions. "
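
in Python terms the finesse is just __call__; a throwaway sketch (Table is my name, not pg's):

  class Table:
      """A data structure that is also a function on its indexes."""
      def __init__(self, rows):
          self.rows = rows
      def __call__(self, *indexes):
          value = self.rows
          for i in indexes:
              value = value[i]
          return value

  a = Table([[1, 2], [3, 4]])
  a(1, 0)   # => 3, i.e. (a x y) rather than (aref a x y) or a[x][y]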

---

" One of the most egregiously unlispy pieces of syntax in Common Lisp occurs in format strings; format is a language in its own right, and that language is not Lisp. If there were a plan for introducing more syntax into Lisp, format specifiers might be able to be included in it. It would be a good thing if macros could generate format specifiers the way they generate any other kind of code.

An eminent Lisp hacker told me that his copy of CLTL falls open to the section format. Mine too. This probably indicates room for improvement. It may also mean that programs do a lot of I/O. " -- http://www.paulgraham.com/popular.html

---

" Users are interested in response time. But another kind of efficiency will be increasingly important: the number of simultaneous users you can support per processor. Many of the interesting applications written in the near future will be server-based, and the number of users per server is the critical question for anyone hosting such applications. In the capital cost of a business offering a server-based application, this is the divisor. ... In some applications, the processor will be the limiting factor, and execution speed will be the most important thing to optimize. But often memory will be the limit; the number of simultaneous users will be determined by the amount of memory you need for each user's data. The language can help here too. Good support for threads will enable all the users to share a single heap. It may also help to have persistent objects and/or language level support for lazy loading. " -- http://www.paulgraham.com/popular.html

---

" It could be an even bigger win to have core language support for server-based applications. For example, explicit support for programs with multiple users, or data ownership at the level of type tags. ... Lisp is a natural fit for server-based applications. Lexical closures provide a way to get the effect of subroutines when the ui is just a series of web pages. S-expressions map nicely onto html, and macros are good at generating it. There need to be better tools for writing server-based applications, and there needs to be a new Lisp, and the two would work very well together.

" -- http://www.paulgraham.com/popular.html

---

---

http://arclanguage.org/item?id=15587

	Issues with extensible def
	3 points by akkartik 972 days ago | 1 comment
	I've been playing for some time now with writing all my code in a style of stepwise refinement. For example, here's how I extend (http://arclanguage.org/item?id=13790) len to understand arc's queues:
  def len(x) :case (isa x queue)
    rep.x.2

This can be a really elegant style for separating concerns. All my code about queues is in one place, even if it updates primitives. But there are costs:

a) Performance. What used to be a simple macro call can now need to be expanded multiple times. Especially if you build the language from the ground up in this style, the increase in calls has a compounding effect (http://arclanguage.org/item?id=15392). It's all very well to say 'design the language without regard to performance', but it's five orders of magnitude slower than anything else out there.

I've been working on a partial-evaluator to 'inline' a lot of this redundant work, but it's been slow going. Check out http://github.com/akkartik/wart/blob/00f18c9235/070inline.test. I'm not yet at the point where (if a b) uses the compiled if rather than the macro to support more than 2 branches.

b) Infinite loops. Especially when dealing with core language features like assignment and conditions and equality, it's easy to overload too much. See, for example, how I implement isa: http://github.com/akkartik/wart/blob/00f18c9235/030.wart#L49.

So far I've been rationalizing these problems as transients that I have to write a test case for and debug once, and never deal with again. However it's getting to the point where it's really hard to spot the infinite loop, to insert a print in just the right place to see the repetition. Most recently, I found myself trying to overload if to support pattern matching, and realized that if the :case clause ever calls if, even indirectly, I end up with an infinite loop (http://github.com/akkartik/wart/commit/67129351d3). And the language doesn't even do all that much yet!

One idea to deal with this: I've been mulling changing the semantics so that new :case clauses on existing functions aren't used in code already loaded. It would improve performance and also hugely mitigate the headache of debugging infinite loops. But it feels like a compromise of precisely the sort that I've been railing against.[1]

Anyways, just wanted to toss this out there to get your reactions.

---

[1] Rather surprisingly, I can't find a decent link for my rant. Was it all in my head? The best I can find is http://www.arclanguage.org/item?id=13263. Or maybe http://arclanguage.org/item?id=12057. Hmm, let me try to state it.

A lot of the benefit of lisp has traditionally come from the repl. And a lot of the benefit of the repl comes from the fact that you can type incorrect code in, and you won't be bothered by the toolchain until you try to run it. Functions can be defined in any order, called in other function bodies before they're defined, and lisp won't complain. Dynamic types and late binding are specialcases of this ethos.

The value of this approach isn't just that you can be disorganized when sketching out a new program. Eliminating machine-driven constraints affords freedom to organize code. Organizing code for readability is hard. So hard that most complex programs in any language are poorly organized. They start out with structure following behavior, and then that mapping gradually rots. Axes of organization that made sense before cease to do so, or are swamped by the needs of a new aspect. Definitions in a file are more intimately connected to distant definitions than to nearby ones, and the reorganization never quite catches up. Existing programmers can handle this gradual semantic drift, but it increases the barrier to new programmers contributing to the project. Given how reluctant most of us are to even try, every constraint added by our tools makes it less likely that we'll reorganize everytime the emphasis of the codebase switches directions. Many's the time I've tried to reorganize the header files in a C codebase and just given up after a few hours because it was just too hard to manually solve the graph constraint satisfaction problem that would make the compiler happy.

So the fact that lisp started out with very few constraints is super super key to my mind. But most of the world doesn't seem to think so. Both lisp and scheme have over time added features that constrain code organization. Languages ahead of their time, they have stolen package and module systems from the static languages around them (partly to simplify the lives of compiler writers). This has happened so gradually that I think there isn't a realization of just what has been lost.

Wart employs an init-style convention for deciding what files to load, and in what order (http://arclanguage.org/item?id=11869). That idea stems from the same considerations. Renaming a file should require: step 1, renaming the file. There ought to be no step 2. That's how easy I want reorganization to be. And I want it to be as true as possible at all levels of granularity.

Richard Gabriel: "If I look at any small part of it, I can see what is going on -- I don’t need to refer to other parts to understand what something is doing. This tells me that the abstractions make sense by themselves -- they are whole. If I look at any large part in overview, I can see what is going on -- I don’t need to know all the details to get it. It is like a fractal, in which every level of detail is as locally coherent and as well thought-out as any other level." ("The quality without a name," http://dreamsongs.com/Files/PatternsOfSoftware.pdf)

---

Oh, another variant of this rant, if you're really bored: http://news.ycombinator.com/item?id=2329680. Does it even seem like the same rant? Maybe it really is all in my mind.

1 point by akkartik 971 days ago

Ah, I solved the infinite loop!! I'm glad I took the time to write this out here.

Let's start with these definitions:

  (def foo() 34)
  (def bar() (foo))

I'd like to make sure that bar always calls the same foo, even if foo is later overridden.

  wart> (bar)
  34
  wart> (def foo() 35)
  wart> (bar)
  35 ; Somehow should still return 34

To do so, I save the current binding of foo when bar is defined, and restore it inside the body.

  (let $foo foo             ; lexical load-time binding
    (def bar()
      (making foo $foo      ; restore foo for the entire dynamic scope of this body
        (foo))))
  wart> (def foo() 36)
  wart> (bar)
  34

(making is wart's way of creating a new dynamic binding.)

<rant>

I'm permitting foo to be extended but freezing its definition in a specific call-site. Compare this with racket's approach, which is to freeze the definition of foo and prevent any extensions at all call-sites. Racket's approach is similar to Java's final keyword.

</rant>

I wish I could save this pattern as a macro:

  (freezing foo
    (def bar()
      (foo)))

But freezing would need to 'reach into' its subexpressions. Hmm, I think I can bake this into the implementation of extensible mac. Too bad I won't be able to use stepwise refinement in implementing _that_..

Update: I've fixed mac so :case expressions can't trigger infinite loops: http://github.com/akkartik/wart/commit/a7fb204296. HOWEVER: this commit makes my tests 8 times slower. It's a shattering illustration of how these calls can compound.
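
the same 'freeze the callee at the definition site' trick can be shown in Python (my analogy, not wart's mechanism): module globals are late-bound, so a default argument is one way to capture the binding that existed when the caller was defined:

  def foo():
      return 34

  def bar(_foo=foo):    # _foo is bound once, to the foo that exists right now
      return _foo()

  def foo():            # later redefinition, like wart's (def foo() 35)
      return 35

  bar()   # => 34; an ordinary late-bound call to foo() would return 35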


" Another highlight for me: idoh shared why he prefers arc to racket:

1. Less constrained defmacro macros that do what you mean. This is made possible by ignoring worries about hygiene.

2. A webserver that transparently reflects changes at the repl. This is made possible because mutable state isn't thread-private by default.

It was a welcome reminder of the fundamentals. "

in Oot, i guess #2 won't be a problem because 'thread local' is just a constraint that won't apply in DEBUG mode, and hence won't apply to the repl?

---

" Lisp historically suffers from the problem of having too many variants of basic stuff (http://arclanguage.org/item?id=12803). But the problem isn't that it's impossible to find the one true semantic for equality or coercion, it's that the language designers didn't put their foot down and pick one. Don't give me two different names with subtle variations, give me one strong name.

This is related to (in tension with, but not contradictory to) extensibility. Common lisp gives us three variations of equality. Different methods for string manipulation react differently to symbols: coercing symbols to strings works but (string 'a) dies. And neither equality nor coercion nor any primitives are extensible. This is exactly wrong-headed: give me one equality, make the string primitives all handle symbols uniformly -- and give me the empowering mechanisms to change the behavior when I need to.

I got rid of is from wart a few weeks ago. "

2 points by rocketnia 1315 days ago

The key to me is to make coercions extensible. Then if I ever need a coercion to behave differently I can just wrap one of the types and write a new coercion.

You could manually type-wrap your arguments when calling basic utilities, but that seems verbose almost all the time. ^_- If you're talking about putting the wrapper on the target type, like (rep:as special-coersion-wrapper x), that's still more verbose than special-coercion.x.

Something I've said for a while is that individual coercion functions like 'int and 'testify should be the norm. For every type 'coerce handles, there could be a global function that handled that type instead. The only thing this strategy seems to overlook in practice is the ability to convert a value back to its original type, and even that coercion-and-back can be less leaky as a single function (like the one I mention at http://arclanguage.org/item?id=13584).

---

Don't give me two different names with subtle variations, give me one strong name.

The difference between 'testify and 'conversionify (http://arclanguage.org/item?id=13678) isn't subtle, and neither is the difference between 'pr and 'write (or the more proper coercions [tostring:pr _] and [tostring:write _]). The purpose of a coercion isn't entirely explained by "I want something of type X." In these cases, it's "I want a Y of type X," and the Y is what's different.

Perhaps you can make a 'test type to turn 'testify into a coercion and an 'external-representation type so that 'write can be implemented in terms of 'pr. Maybe you can even replace the 'cons type with a 'list type to avoid the awkward (coerce "" 'cons) semantics, with (type nil) returning 'list. At that point I'll have fewer specific complaints. :)

On another note, I suspect in a land of polymorphism through extensibility, it makes more sense to test the type of something using an extensible predicate. If that's common, it'll probably make more sense to coerce to one of those predicates than to any specific type. Maybe 'coerce really should work like that? It could satisfy the contract that (as a b) always either errors out or returns something that satisfies the predicate a. This isn't to say I believe in this approach, but I hope it's food for thought for you. ^_^

---

I got rid of is from wart a few weeks ago.

Well, did you undefine 'eq too? Without a test for reference equality, there'll probably be gotchas when it comes to things like cyclic data structures, uninterned symbols, and weak references.


2 points by akkartik 1314 days ago

definitely food for thought, thanks :)

  Me: "Don't give me two different names with subtle variations,
  give me one strong name."
  You: "The difference between 'testify and 'conversionify isn't subtle.."

Yeah I wasn't talking about your previous suggestions. I was talking about eq vs eql vs equal, or about string vs coerce 'string. I was talking about subtle variations in names.

Yes I still use eq. It's useful in some cases, no question, but not so often that it merits thinking about what to call it. If I were creating a runtime from scratch, I'd give it a long name, like pointer-equal or something. If I find a better use for the name I'll override it without hesitation.

Names that take up prime real estate are like defaults. Having a bunch of similar-sounding, equally memorable words that do almost the same thing is akin to having large, noisy menus of preferences. They make the language less ergonomic, they make it harder to fit into people's heads. "What does upcase do to a list again? Does it use this coerce or that one?"

If the default doesn't work for someone of course they'll make their own up. This is lisp. They're empowered. I don't need to consider that case. My job as a language designer is to be opinionated and provide strong defaults.

I'm only arguing the general case for coercions. It's possible that we need multiple kinds of coercions in specific cases, and that these need to have prime real estate as well.


1 point by rocketnia 1314 days ago

Okay, agreed on pretty much all accounts. ^^

---

If I were creating a runtime from scratch, I'd give [is] a long name, like pointer-equal or something.

Yeah, I think it's a bit unfortunate that 'iso has a longer name than 'is. I've thought about calling them '== and '=== instead, or even '== and 'is for maximum brevity, but those are more confusing in the sense you're talking about, in that their names would be interchangeable. I can't think of a good, short name for 'iso that couldn't be mistaken for the strictest kind of equality available. "Isomorphic," "equivalent," "similar," and maybe even "congruent" could work, but "iso" is about as far as they can be abbreviated.

...Well, maybe "qv" would work. XD It has no vowels though. Thinking about Inform 7 and rkts-lang makes me strive for names that are at least pronounceable; I remember Emily Short mentioning on her blog about how a near-English language is easier for English typers to type even when it's verbose, and rkts posting here about how it's nice to be able to speak about code verbally. I think it's come up a few times in discussions about the name 'cdr too. ...And I'm digressing so much. XD;;;;

---

I'm only arguing the general case for coercions.

This is the one spot I don't know I agree with, but only because I don't know what you mean. What's this "general case" again?
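
a sketch of the 'one strong name, but extensible' coercion idea from this thread, using functools.singledispatch as the extension mechanism (my Python analogy; wart/arc obviously do this differently):

  from functools import singledispatch

  @singledispatch
  def as_string(x):
      return str(x)                # the one default coercion

  @as_string.register(bytes)
  def _(x):
      return x.decode('utf-8')     # extend the same name for a new type

  class Symbol:
      def __init__(self, name):
          self.name = name

  @as_string.register(Symbol)
  def _(x):
      return x.name                # symbols coerce uniformly, no second API

  as_string(Symbol('a'))   # => 'a', instead of a (string 'a) that dies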


---

" The unique features of oyster all come from four properties: homoiconicity, first-class fexprs, composable closures, and a lack of fixed syntax.

    Oyster is homoiconic. That means that code is expressed using the same kind of data structures that it manipulates. If javascript was written in JSON, it would be homoiconic. If python was written as a nested set of python object-literals, it would be homoiconic. LISP is written as nested lisp lists; it is homoiconic. Homoiconicity allows programs to easily manipulate themselves and each other, without having to implement parsers and ASTs and other bureaucracy. (I was going to link to the Wikipedia page on this topic, but first I have to rewrite it because it’s truly terrible.)
    Oyster’s basic functional unit is a FEXPR. By ‘basic functional unit’ I mean to say that, in the same position that an object-oriented language has objects and methods, and a functional language has functions, oyster has fexprs. A fexpr is a function whose arguments are not evaluated first. A fexpr is handed code for each of its arguments, which it can manipulate, evaluate, or ignore. Fexprs are a single tool which can implement both function-like behavior and macro-like behavior. In practice, oyster fexprs express in their signature whether they’d like individual arguments to be evaluated or not — this makes writing function-y behavior just as straightforward as writing traditional functions.
    Oyster is magically scoped. OK, I’m using pretty biased language here, but this is one of the cool parts, that I’ve never seen anywhere else. Fexprs take unevaluated code as arguments — but oyster gives those code-fragments closures, so that the symbols contained within will continue to have the same binding that they had where they were written. Macros written using oyster fexprs look like common lisp’s macros, but they are hygienic by default. Code objects that have been closed over can still be divided, composed, and otherwise manipulated, while retaining this closure property. ‘Leaks’ can also be explicitly created in closures, allowing individual symbols access to the evaluation environment: dynamic scoping on request — so you can easily write anaphoric if and other leaky macros."

http://arclanguage.org/item?id=16826
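
the closest cheap analogy i can make in Python (my sketch, nothing to do with oyster's actual implementation) is to pass operands as zero-argument lambdas: the 'fexpr' gets them unevaluated, and because lambdas are closures they keep the bindings of the place where they were written, which is roughly the 'magically scoped' part:

  def my_if(cond, then_thunk, else_thunk):
      # a fexpr-like operator: it receives its branches unevaluated and
      # decides which one to evaluate
      return then_thunk() if cond else else_thunk()

  def safe_div(a, b):
      # only the chosen branch runs, so no ZeroDivisionError here;
      # the thunks close over a and b from this scope
      return my_if(b != 0, lambda: a / b, lambda: None)

  safe_div(6, 3)   # => 2.0
  safe_div(6, 0)   # => None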

" Wart update 5 points by akkartik 704 days ago

34 comments
	1. '=' is now equality.

2. Assignment is '<-'.

3. 'or=' is now 'default..:to'.

4. Inequality is '~='.

5. The comment character is now '#'. This is consistent with more languages, and feels more unix-y with the shebang and all.

Here's how pos currently looks in wart:

  # return the index of x in seq, or nil if not found
  def pos(x seq n)
    default n :to 0
    if seq
      if (car.seq = x)
        n
        (pos x cdr.seq n+1)

To try it out:

  $ git clone http://github.com/akkartik/wart.git
  $ cd wart   # edit; thanks fallintothis!
  $ git checkout f0e3d726eb
  $ ./wart
  wart>"

" 3) Your vimrc.vim really isn't the right way to distribute language-specific features. See

  :h syn-files
  :h 44.11
  :h ftplugin

"

---

http://arclanguage.org/item?id=10693

" Two recent discoveries: diskvar, defcache 6 points by akkartik 1771 days ago

5 comments
	I've found I reinvented the wheel (and poorly) on a couple of features without knowing they already existed. Neither is in the tutorial or the docs at arcfn, so here they are:

1. diskvar - easy persistence. Declare the file to save a var to with diskvar, and you can then todisk it to save.

2. defcache - defmemo with periodic recomputation.

"

---

http://arclanguage.org/item?id=11103

Loading files repeatedly in the repl 6 points by akkartik 1714 days ago

2 comments
	When debugging I find myself repeatedly loading files into the repl. Here are a few helpers to smooth the rough edges off that use-case.

1. Reduce keystrokes to reload.

  (def l(f)
    (load:+ stringify.f ".arc"))
  ; stringify = coerce _ 'string
  arc> l!server ; => (load "server.arc")

I only ever use this on the repl.

2. Don't lose global state when reloading a file.

  (mac init args
    `(unless (bound ',(car args))
       (= ,@args)))

I replace top-level = with init within the file

3. Don't lose closure state when reloading a file.

  (mac ifcall(var)
    `(if (bound ',var)
       (,var)))
  ; Example usage within a file: server-thread* doesn't lose
  ; its value across file reloads
  (let server-thread* (ifcall server-thread)
    (def start-server((o port 8080))
      (stop-server)
      (= server-thread* (new-thread (fn() (asv port)))))
    (def stop-server()
      (if server-thread* (kill-thread server-thread*)))
    (def server-thread()
      server-thread*))

4. Don't accidentally print a huge data structure to screen for several minutes when you only care about side-effects

  arc> (no:each (k v) very-large-table (do-something))

---

	A confession of stupidity
	6 points by akkartik 1634 days ago | 10 comments
	Y'all may have watched me complain about performance for the last couple of days. I want to issue a mea culpa and say it was all my fault. I still love you arc, I'm not going to switch to SBCL just yet.

Buried in my 3000 lines of arc was this innocuous-looking 2-liner:

  (def most-recent-unread(user feed)
    (most doc-timestamp (rem [read? user _] feed-docs.feed)))

I was basically sifting through 10-100 element lists over and over again. Worse, doc-timestamp lazily loads and caches metadata files. As I randomly selected feeds for the user I was doing 10-100 disk reads per request.

A little reorganization keeps the lists always sorted, so I can just:

  (def most-recent-unread(user feed)
    (car (rem [read? user _] feed-docs.feed)))

This one-line change takes my server from 50% timeouts at 6 users to 15% timeouts at 100 concurrent users.

You can assess the level of optimization in a codebase by the size of the 'longest pole in the tent'. By that heuristic this past week is basically a testament to my incompetence. The good news with incompetence: it lets me find 10x speedups like this. That (hopefully) won't last.

Thanks to aw for hammering away at me to look at the mote in my own code before blaming my tools (http://arclanguage.org/item?id=11515). I could have saved myself a world of pain[1] if I'd just remembered these manual-profiling tools I wrote a year ago:

  (= times* (table))
  (mac deftimed(name args . body)
    `(do
       (def ,(symize (stringify name) "_core") ,args
          ,@body)
       (def ,name ,args
        (let t0 (msec)
          (ret ans ,(cons (symize (stringify name) "_core") args)
            (update-time ,(stringify name) t0))))))
  (proc update-time(name t0) ; call directly in recursive functions
    (or= times*.name (list 0 0))
    (with ((a b)  times*.name
           timing (- (msec) t0))
      (= times*.name
         (list
           (+ a timing)
           (+ b 1)))))
  (def print-times()
    (prn "gc " (current-gc-milliseconds))
    (each (name time) (tablist times*)
      (prn name " " time)))

(To profile a function replace def with deftimed. Be careful using it with recursive functions.)

This isn't the first time I've 'forgotten' to profile before optimizing. Hopefully it'll be the last.

[1] 2 days of experiments with ab, 2 days of looking at erlang/lfe/termite/tornado/unicorn, 1 fun day of rewriting in SBCL, and 1 day of moving to a new data structure I turned out to not need.

---

" You can't make software clean simply by making all the ‘code’ in it more clean. What we really ought to be thinking about is readable programs. Functions aren't readable in isolation, at least not in the most important way. The biggest aid to a function's readability is to convey where it fits in the larger program.

Nowhere is this more apparent than with names. All the above articles and more emphasize the value of names. But they all focus on naming conventions and rules of thumb to evaluate the quality of a single name in isolation. In practice, a series of locally well-chosen names gradually end up in overall cacophony. A program with a small harmonious vocabulary of names consistently used is hugely effective... "


"

sqs 4 days ago

Yes times 1,000!

This is what we're trying to address at Sourcegraph (https://sourcegraph.com/). 80% of programming is about reading and understanding code, not writing code.

Why don't existing tools focus more on helping people read and understand code--and, more broadly, collaborate on a development team? Things like:

So far, most of the innovation in programming tools has been in editors or frameworks, not in collaboration and search tools for programmers.

While there are great editors and frameworks, the lopsidedness is unfortunate because making it easier for programmers to learn and reuse existing code and techniques, and to collaborate on projects, can have a much bigger impact than those other kinds of tools. That's because, in my experience, the limiting factors on a solo developer's productivity are the editors and frameworks she uses, but the limiting factor on a development team's productivity is communication (misunderstanding how to use things, reinventing the wheel, creating conflicting systems, not syncing release timelines, etc.).

reply "

---

" "The 'hyperstatic scope' that Pauan keeps mentioning is another way of saying, 'there is no scope', only variables."

Actually, there is still scope. After all, functions still create a new scope. It's more correct to say that, with hyper-static scope, every time you create a variable, it creates a new scope:

  box foo = 1     # set foo to 1
  def bar -> foo  # a function that returns foo
  box foo = 2     # set foo to 2
  bar;            # call the function

In Arc, the call to "bar" would return 2. In Nulan, it returns 1. The reason is because the function "bar" is still referring to the old variable "foo". The new variable "foo" shadowed the old variable, rather than replacing it like it would in Arc. That's what hyper-static scope means.

From the compiler's perspective, every time you call "box", it creates a new box. Thus, even though both variables are called "foo", they're separate boxes. In Arc, the two variables "foo" would be the same box. "
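
a rough model of hyper-static scope in Python (my sketch of the semantics, not how Nulan implements it): each definition creates a fresh environment frame, and a closure remembers the frame that was current when it was made, so a later definition of the same name shadows rather than replaces:

  class Env:
      def __init__(self, name=None, value=None, parent=None):
          self.name, self.value, self.parent = name, value, parent
      def define(self, name, value):
          return Env(name, value, self)        # a new scope; the old one is untouched
      def lookup(self, name):
          env = self
          while env is not None:
              if env.name == name:
                  return env.value
              env = env.parent
          raise NameError(name)

  env = Env().define('foo', 1)
  env = env.define('bar', lambda e=env: e.lookup('foo'))   # bar closes over the old env
  env = env.define('foo', 2)                               # shadows, doesn't replace
  env.lookup('bar')()   # => 1, like Nulan; an Arc-style mutable box would give 2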

---

http://www.orionsarm.com/eg-article/4609cba1c1178

---

" Unfortunately, there is no equivalent to IPython and there will never be, since the language does not have support for docstrings, nor the introspection facilities of Python: you would need to switch to Common Lisp with SLIME to find something comparable or even better.

All the Scheme implementations I tried are inferior to Python for what concerns introspection and debugging capabilities. Tracebacks and error messages are not very informative. Sometimes, you cannot even get the number of the line where the error occurred; the reason is that Scheme code can be macro-generated and the notion of line number may become foggy. On the other hand, I must say that in the five years I have being using Scheme (admittedly for toying and not for large projects) I have seen steady improvement in this area.

To show you the difference between a Scheme traceback and a Python traceback, here is an example with PLT Scheme, the most complete Scheme implementation and perhaps the one with the best error management:

  $ rlwrap mzscheme
  Welcome to MzScheme v4.1 [3m], Copyright (c) 2004-2008 PLT Scheme Inc.
  > (define (inv x) (/ 1 x))
  > (inv 0)
  /: division by zero

Type "help", "copyright", "credits" or "license" for more information. >>> def inv(x): return 1/x ... >>> inv(0) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 1, in inv ZeroDivisionError?: integer division or modulo by zero

I should mention however that PLT is meant to be run inside its own IDE, DrScheme. DrScheme highlights the line with the error and includes a debugger. However such functionalities are not that common in the Scheme world and in my own experience it is much more difficult to debug a Scheme program than a Python program.

The documentation system is also very limited as compared to Python: there is no equivalent to pydoc, no help functionality from the REPL, the concept of docstring is missing from the language. The road to Scheme is long and uphill; from the point of view of the tools and reliability of the implementations you will be probably better off with Common Lisp. However, in my personal opinion, even Common Lisp is by far less productive than Python for the typical usage of an enterprise programmer.

"

---

this guy speaks of cultural probs: http://blog.jacius.info/2012/05/29/a-personal-lisp-crisis/


the guys who wrote SICP also wrote a book on classical mechanics and made a scheme symbolic math library for it:

http://www.cs.rochester.edu/~gildea/guile-scmutils/

http://mitpress.mit.edu/sites/default/files/titles/content/sicm/book-Z-H-1.html#titlepage

---

" David Barbour retweeted Tony Arcieri @bascule · Sep 19

Popular langs at @strangeloop_stl this year: Clojure, Scala, Haskell, OCaml, Coq, Idris, and the new kid on the block: Rust "

---

i don't want significant whitespace b/c it makes copy/pasting to internet forums annoying

so ppl say, something like gofmt makes it easy to have a uniform format for readability.

we also need something for writability so that you don't have to write begin/end all the time.

can use : for begin, like Python.

for end, IDE support could allow <DEL> to create/move the 'end', just as when editing Python in Emacs

---

arenaninja 15 minutes ago

Sweet, sweet console.table()! I've never been happy with the way that console.log works for objects/arrays, I'm eager to use this one

azinman2 11 hours ago

Ok debug logging into a table is probably one of the best improvements to logging I've seen in a long time. I kinda want this in every programming language. There are so many useful things about it, especially the ability to then randomly sort it at will!

jamii 10 hours ago

You might enjoy http://db.cs.berkeley.edu/papers/eurosys10-boom.pdf . They put everything into tables (debug logs, profiling info, server state, monitoring statistics, messages sent/received etc) and then debug the system by writing queries on the tables.

---

" One problem is that Python, being a highly dynamic language, it (Python) supports introspection at many levels, including some implementation-specific levels, like access to bytecode in CPython, which has no equivalent in Jython, IronPython? or other implementations. Because of the language's dynamic and introspective features, there is often no real distinction between a module's API and its implementation. While this is an occasional source of frustration for Python users (see e.g. the recent discussion about asyncore on python-dev), in most cases it works quite well, and often APIs can be simpler because of certain dynamic features of the language. For example, there are several ways that dynamic attribute lookup can enhance an API: automatic delegation is just one of the common patterns that it enables; command dispatch is another. "

i guess Oot could also support compile-time, static 'introspection'?
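
the two patterns the quote mentions, in plain Python, just as a reminder of what dynamic attribute lookup buys (the class names are mine):

  class LoggingWrapper:
      """Automatic delegation: forward any attribute we don't define ourselves."""
      def __init__(self, inner):
          self._inner = inner
      def __getattr__(self, name):
          print('accessing', name)
          return getattr(self._inner, name)

  class Dispatcher:
      """Command dispatch: map a command string onto a do_<command> method."""
      def do_greet(self, who):
          return 'hello ' + who
      def dispatch(self, command, *args):
          return getattr(self, 'do_' + command)(*args)

  LoggingWrapper([1, 2, 3]).append(4)      # delegates to list.append
  Dispatcher().dispatch('greet', 'oot')    # => 'hello oot'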

---

" I should mention that I have some experience in this area: Google's App Engine (to which I currently contribute most of my time) provides a "secure" variant of Python that supports a subset of the standard library. I'm putting "secure" in scare quotes here, because App Engine's security needs are a bit different than those typically proposed by the capability community: an entire Python application is a single security domain, and security is provided by successively harder barriers at the C/Python boundary, the user/kernel boundary, and the virtual machine boundary. There is no support for secure communication between mutually distrusting processes, and the supervisor is implemented in C++ (crucial parts of it live in a different process).

In the App Engine case, the dialect of the Python language supported is completely identical to that implemented by CPython. The only differences are at the library level: you cannot write to the filesystem, you cannot create sockets or pipes, you cannot create threads or processes, and certain built-in modules that would support backdoors have been disabled (in a few cases, only the insecure APIs of a module have been disabled, retaining some useful APIs that are deemed safe). All these are eminently reasonable constraints given the goal of App Engine. And yet almost every one of these restrictions has caused severe pain for some of our users. "

---

Date: Tue, 23 Sep 2014 08:09:24 -0700 (PDT) From: Dennis Allison Subject: [EE CS Colloq] Faults, scaling and Erlang concurrency * 4:15PM, Wed September 24, 2014 in Gates B01

             Stanford EE Computer Systems Colloquium
              4:15PM, Wednesday, September 24, 2014
     NEC Auditorium, Gates Computer Science Building Room B3
                   http://ee380.stanford.edu[1]

Topic: Faults, scaling and Erlang concurrency

Speaker: Joe Armstrong Ericsson

About the talk:

This talk shows the intimate relationship between faults and scaling.

We argue that systems that are designed for fault-tolerance will be easy to scale. Achieving fault-tolerance requires things like non-shared memory, which as a side effect makes them easy to scale.

We discuss the history of fault-tolerant systems and define six underlying principle that any system must have in order to achieve a reasonable measure of fault tolerance.

We show how these six principles are implemented in Erlang.

Slides:

There is no downloadable version of the slides for this talk available at this time.

About the speaker:

Joe Armstrong designed and implemented the first version of Erlang at the Ericsson Computer Science Lab in 1986.

He has written several Erlang books.

Joe has a PhD in computer science from the Royal Institute of Technology in Stockholm, Sweden and is an expert in the construction of fault tolerant systems. Joe was the chief software architect of the project which produced the Erlang/OTP system. He has worked as an entrepreneur in one of the first Erlang startups (Bluetail) and has worked for 30 years in industry and research.

He is currently Adjunct Professor of Computer Science at the Royal Institute of Technology in Stockholm, and is a senior engineer at Ericsson.

Contact information: Joe Armstrong Ericsson

Embedded Links: [ 1 ] http://ee380.stanford.edu

---

Elided excerpt from [1]:

in [2] i wrote:

" Note that Haskell is implemented a bit differently from many languages; this stems from Haskell's properties:

It is therefore assumed that Haskell programs will, at runtime, very frequently be dealing with unevaluated thunks, with closures which represent unevaluated or partially evaluated functions, and with the application of functions whose arity was unknown at compile time. "

http://www.haskell.org/haskellwiki/Ministg

so, this suggests that expressions should have a place in oot assembly; not only because we could have an AST instead of a linear sequence of VM instructions, but also because (for the same reason that lazy languages amount to reduction of expression graphs) thunks must contain expressions. also, we should think about how to generalize/unify this with dynamically passing around references to imperative code.
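as a concreteness check, here is a minimal C sketch (all names invented for illustration, not an actual Oot or GHC design) of a thunk as a heap record pairing an unevaluated expression with its environment; forcing it replaces the expression with the computed value so repeated forces are cheap:

/* sketch only: "expression" shrunk to a literal-int AST node to keep it tiny */
#include <stdio.h>

typedef struct Expr { int literal; } Expr;      /* stand-in AST node */
typedef struct Env  { int dummy;   } Env;       /* stand-in environment */

typedef struct Thunk {
    int   evaluated;     /* 0 = still holds an expression, 1 = holds a value */
    Expr *expr;
    Env  *env;
    int   value;
} Thunk;

static int eval(Expr *e, Env *env) {            /* trivial stand-in evaluator */
    (void)env;
    return e->literal;
}

static int force(Thunk *t) {
    if (!t->evaluated) {
        t->value = eval(t->expr, t->env);       /* evaluate once... */
        t->evaluated = 1;                       /* ...then cache the result */
    }
    return t->value;
}

int main(void) {
    Expr e = { 42 };
    Env  env = { 0 };
    Thunk t = { 0, &e, &env, 0 };
    printf("%d %d\n", force(&t), force(&t));    /* prints "42 42"; eval runs once */
    return 0;
}

(in a real lazy runtime like STG the "expression" is a code pointer plus captured free variables rather than a literal AST node, but the shape of the record is the same.)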

one piece of the puzzle that is missing is how to combine delimited continuations and friends (eg metaprogramming based on direct manipulation of the call stack) with haskell-style graph reduction. this may also be related to the puzzle of how to have tail calls but also support good stack traces; as ministg points out, "Lazy evaluation means that the evaluation stack used by the STG machine is not so useful for diagnosing uncaught exceptions. The main issue is that the evaluation context in which thunks are created can be different to the evaluation context in which thunks are evaluated. The same problem does not occur in eager languages because the construction of (saturated) function applications and their evaluation happens atomically, that is, in the same evaluation context." In fact, you might say that in such a language there are not 2 but 4 stack-like contexts for each function: the lexical (source code) ancestors of the function definition in the AST, the lexical (source code) ancestors of the function application in the AST, the ancestors in the call stack at the time of the function's creation, and the ancestors in the call stack at the time of the function's call. should also look at di franco's effort to combine call-stack metaprogramming with constraint propagation.

the idea of haskell data constructors as a type-within-a-type is interesting and possibly a fundamental way to generalize/simplify. So now we seem to have at least 3 kinds of types: atomic types (Hoon's odors), composite types (C structs, Python objects, Python and Haskell and Lisp lists, Python dicts, Hoon's structural typing), and the choice of data constructor within a Haskell type (eg is this List a Cons or a NilList? is this Maybe a Nothing or a Just?). in addition, there are interface types vs representation types, which may be an orthogonal axis.
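a tiny C sketch of that third kind (illustrative names only): the struct type stands for the Haskell type, and an enum tag records which data constructor built the value, so "which constructor is this?" is just a tag test:

#include <stdio.h>
#include <stdlib.h>

typedef struct List List;

typedef enum { TAG_NIL, TAG_CONS } ListTag;     /* the "inner" type: which constructor */

struct List {
    ListTag tag;
    struct { int head; List *tail; } cons;      /* payload, valid only when tag == TAG_CONS */
};

static List *nil(void) {
    List *l = malloc(sizeof *l);
    l->tag = TAG_NIL;
    return l;
}

static List *cons(int head, List *tail) {
    List *l = malloc(sizeof *l);
    l->tag = TAG_CONS;
    l->cons.head = head;
    l->cons.tail = tail;
    return l;
}

static int length(const List *l) {
    int n = 0;
    while (l->tag == TAG_CONS) { n++; l = l->cons.tail; }  /* dispatch on the constructor tag */
    return n;
}

int main(void) {
    List *xs = cons(1, cons(2, nil()));
    printf("%d\n", length(xs));                 /* prints "2" */
    return 0;
}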

why is a G-machine graph a graph and not a tree (recursion, but why)? http://stackoverflow.com/questions/11921683/understanding-stg says "Haskell's pervasive recursion mean that Haskell expressions form general graphs, hence graph-reduction and not tree-reduction", but i'm not sure i understand that; wouldn't we just be working with a lazily generated (possibly infinite) tree?
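i think the standard answer is sharing: a binding used in several places is one heap node that every use points at, and graph reduction updates that node in place with its value, so all references see the result; recursive (letrec-style) bindings can even make a node point back at itself, giving true cycles. a lazily generated tree couldn't express either of those. a toy C illustration of just the sharing/update part (names invented):

#include <stdio.h>

/* one heap node standing in for a let-bound thunk */
typedef struct Node {
    int evaluated;
    int value;
} Node;

static int force(Node *n) {
    if (!n->evaluated) {
        printf("evaluating once\n");
        n->value = 6 * 7;              /* pretend this is an expensive expression */
        n->evaluated = 1;
    }
    return n->value;
}

int main(void) {
    Node shared = { 0, 0 };
    Node *x = &shared, *y = &shared;       /* like "let x = e in x + x": both uses share one node */
    printf("%d\n", force(x) + force(y));   /* "evaluating once" is printed a single time */
    return 0;
}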

another primitive that we may need to deal with is fexprs, that is, (first-class) functions that don't evaluate their arguments first: https://en.wikipedia.org/wiki/Fexpr (how is this different from any function in the context of a lazy language? should we just generalize this to strictness annotations on each argument?)
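(note: a true fexpr can also inspect the argument's unevaluated source expression and the caller's environment, which a merely lazy argument can't; the sketch below only covers the strictness-annotation generalization.) a minimal C sketch with invented names: every argument arrives as a thunk, and a per-argument strictness flag decides whether the callee forces it on entry or leaves it untouched unless used:

#include <stdio.h>

typedef struct Thunk {
    int evaluated;
    int value;
    int (*compute)(void);          /* the unevaluated "expression" */
} Thunk;

static int force(Thunk *t) {
    if (!t->evaluated) { t->value = t->compute(); t->evaluated = 1; }
    return t->value;
}

static int boom(void) { printf("side effect!\n"); return 1; }

/* a two-argument function with a strictness annotation per argument:
 * strict arguments are forced on entry, lazy ones only if actually used */
static int pick_first(Thunk *a, Thunk *b, int strict_a, int strict_b) {
    if (strict_a) force(a);
    if (strict_b) force(b);
    return force(a);               /* b is never needed */
}

int main(void) {
    Thunk a = { 1, 10, NULL };     /* already a value */
    Thunk b = { 0, 0, boom };      /* unevaluated */
    printf("%d\n", pick_first(&a, &b, 1, 0));   /* lazy in b: "side effect!" never prints */
    return 0;
}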

APPLY COW WAIT GETTABLE? SETTABLE?

---

Date: Mon, 22 Sep 2014 20:43:41 -0700 From: Michael Bernstein To: pcd-seminar Subject: HCI Seminar 9/26, Noah Goodman — Stories from CoCoLab: Probabilistic programs, cognitive modeling, and smart web pages

Noah Goodman, Stanford University Stories from CoCoLab: Probabilistic programs, cognitive modeling, and smart web pages

September 26, 2014, 12:50-2:05pm, Gates B01 · Open to the public CS547 Human-Computer Interaction Seminar (Seminar on People, Computers, and Design) http://hci.st/seminar http://cs547.stanford.edu/speaker.php?date=2014-09-26

Probabilistic programming languages (PPLs) enable formal, high-level specification of probabilistic models and exploration of universal inference strategies. This talk will describe probabilistic programming languages via a new Javascript-based language called WebPPL. Additionally, it will cover a few applications to modeling human language understanding and to constructing smart web pages.

Noah D. Goodman is Assistant Professor of Psychology, Linguistics (by courtesy), and Computer Science (by courtesy) at Stanford University. He studies the computational basis of human thought, merging behavioral experiments with formal methods from statistics and programming languages. He received his Ph.D. in mathematics from the University of Texas at Austin in 2003. In 2005 he entered cognitive science, working as Postdoc and Research Scientist at MIT. In 2010 he moved to Stanford where he runs the Computation and Cognition Lab. CoCoLab studies higher-level human cognition including language understanding, social reasoning, and concept learning; the lab also works on applications of these ideas and enabling technologies such as probabilistic programming languages.

---

http://dippl.org/

The Design and Implementation of Probabilistic Programming Languages Noah D. Goodman and Andreas Stuhlmüller

includes The WebPPL language

http://dippl.org/chapters/02-webppl.html

" With random sampling

WebPPL is not just a subset of Javascript: it is a subset augmented with the ability to represent and manipulate probability distributions. Elementary Random Primitives (ERPs) are the basic object type that represents distributions. Under the hood an ERP e has a method e.sample that returns a sample from the distribution, a method e.score that returns the log-probability of a possible sampled value, and (optionally) a method e.support that returns the support of the distribution. However, these methods should not be called directly – in order for inference operators (described later) to work, ERPs should be used through the WebPPL keywords sample, factor, and so on.

You can sample from an ERP with the sample operator. For example, using the built-in bernoulliERP:

sample(bernoulliERP, [0.5])

true

There are a set of pre-defined ERPs including bernoulliERP, randomIntegerERP, etc. (Since sample(bernoulliERP, [p]) is very common it is aliased to flip(p). Similarly randomInteger, and so on.) It is also possible to define new ERPs directly, but most ERPs you will use will be either built in or built as the marginal distribution of some computation, via inference functions (see below).

With only the ability to sample from primitive distributions and do deterministic computation, the language is already universal! This is due to the ability to construct stochastically recursive functions. For instance we can define a geometric distribution in terms of a bernoulli:

var geometric = function(p) {
  return flip(p) ? 1 + geometric(p) : 1
}

geometric(0.5)

1

And inference

WebPPL is equipped with a variety of implementations of marginalization: the operation of normalizing a (sub-)computation to construct the marginal distribution on return values. These marginalization functions (which we will generally call inference functions) take a random computation represented as a function with no arguments and return an ERP that captures the marginal distribution on return values. How they get this marginal ERP differs between inference functions, and is the topic of most of the tutorial.

As an example consider a simple binomial distribution: the number of times that three fair coin tosses come up heads:

var binomial = function(){
  var a = sample(bernoulliERP, [0.5])
  var b = sample(bernoulliERP, [0.5])
  var c = sample(bernoulliERP, [0.5])
  return a + b + c
}

var binomialERP = Enumerate(binomial)

print(binomialERP)

The distributions on return values from binomial() and sample(binomialERP) are the same – but binomialERP has already collapsed out the intermediate random choices to represent this distribution as a primitive.

What if we wanted to adjust the above binomial computation to favor executions in which a or b was true? The factor keyword re-weights an execution by adding the given number to the log-probability of that execution. For instance:

var funnybinomial = function(){
  var a = sample(bernoulliERP, [0.5])
  var b = sample(bernoulliERP, [0.5])
  var c = sample(bernoulliERP, [0.5])
  factor( (a|b) ? 0 : -2)
  return a + b + c
}

var funnybinomialERP = Enumerate(funnybinomial)

print(funnybinomialERP)

It is easier to build useful models (that, for instance, condition on data) with factor. But factor by itself doesn’t do anything – it interacts with marginalization functions that normalize the computation they are applied to. For this reason running a computation with factor in it at the top level – that is, not inside a marginalization operator – results in an error. Try running funnybinomial directly….

WebPPL has several inference operators, including Enumerate and ParticleFilter. These are all implemented by providing a co-routine that receives the current continuation at sample and factor statements. "


to make emulation easier, mb provide a c-like switch and also a computed-goto-like switch (or is computed goto by itself enough?)
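for reference, a toy C sketch of the two dispatch styles (opcode names invented for the example): the switch loop is portable C, while the computed-goto loop uses the GCC/Clang "labels as values" extension (&&label and goto *p), which is what "computed goto" usually means in interpreter loops:

#include <stdio.h>

enum { OP_INC, OP_DEC, OP_HALT };

/* portable dispatch: one shared switch inside a loop */
static int run_switch(const unsigned char *code) {
    int acc = 0;
    for (;;) {
        switch (*code++) {
        case OP_INC:  acc++; break;
        case OP_DEC:  acc--; break;
        case OP_HALT: return acc;
        }
    }
}

#if defined(__GNUC__)
/* computed-goto dispatch: each handler ends with its own indirect jump */
static int run_goto(const unsigned char *code) {
    static void *dispatch[] = { &&do_inc, &&do_dec, &&do_halt };
    int acc = 0;
    goto *dispatch[*code++];          /* jump straight to the first handler */
do_inc:  acc++; goto *dispatch[*code++];
do_dec:  acc--; goto *dispatch[*code++];
do_halt: return acc;
}
#endif

int main(void) {
    const unsigned char prog[] = { OP_INC, OP_INC, OP_DEC, OP_HALT };
    printf("%d\n", run_switch(prog));     /* prints 1 */
#if defined(__GNUC__)
    printf("%d\n", run_goto(prog));       /* prints 1 */
#endif
    return 0;
}

the usual argument for computed goto in interpreters is that each handler ends with its own indirect jump, which tends to predict better than funneling every opcode through one shared switch dispatch.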