notes-computer-programming-hoon-hoonMotivations

My take on Hoon's motivation

Here's my take on the motivations for Hoon (not necessarily the rest of Urbit). I don't know Hoon, although I'm learning it, so this may be wrong.

The reason for Hoon is to have a language with the following properties:

(a) it is based on a core with a tiny spec (Nock). Urbit hot-patches itself with code sent over the network, so specifying Urbit's network protocol completely requires including the complete operational semantics of Nock in the protocol spec; hence an extremely small spec was desired.

(b) the core can 'eval' without overhead (again because of hot-patching).

(c) it is purely functional.

(d) it does not (natively) support cyclic data, because cyclic data is harder to serialize.

(e) it is good at validating untyped data; defining a custom type in Hoon is the same as defining a validation/normalization function that attempts to cast untyped data into the type.
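Point (e) can be illustrated outside Hoon. Below is a hypothetical Python analogue of the idea that defining a type means defining a normalization function (what Hoon calls a 'mold'); all names here are invented for illustration and are not Hoon's.

```python
# Hypothetical Python analogue of a Hoon 'mold': a type is just a function
# that normalizes arbitrary untyped data into a canonical value of the type,
# crashing if it cannot. Names are illustrative, not Hoon's.

def atom(noun):
    # the type of unsigned integers: accept any int >= 0
    if isinstance(noun, int) and noun >= 0:
        return noun
    raise ValueError("not an atom")

def pair(left_mold, right_mold):
    # a product type is built from the molds of its parts
    def mold(noun):
        a, b = noun            # crashes unless noun is a 2-element pair
        return (left_mold(a), right_mold(b))
    return mold

# validating untyped input (e.g. freshly parsed off the network):
point = pair(atom, atom)
print(point((3, 4)))           # → (3, 4)
```

Under this view, "casting untyped data into the type" and "running the type as a function" are the same operation, which is why validation comes for free.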

About Nock: Nock is a purely functional 'assembly language'. One of its design goals is to have an extremely short specification; its spec fits on one page (see section 1.1 of [1]). Nock may be seen as an elaboration of SK combinator calculus in the same way that traditional assembly language may be seen as an elaboration of Turing machines. Nock has two data types (integers and binary trees) and 7 primitive instructions: S and K from SK combinator calculus; integer successor; cons; test for structural equality; test whether a value is an integer or a binary tree; and selection of one node out of a binary tree (given the tree and an address).
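To make the machine model concrete, here is a minimal Python sketch of a Nock-style evaluator covering only the core opcodes (slot, constant, eval, cell test, increment, equality) plus the 'autocons' distribution rule; atoms are modeled as Python ints and cells as 2-tuples. This is my simplified reading of the one-page spec, not an official implementation.

```python
# Minimal sketch of a Nock-style evaluator. Atoms are ints, cells are
# 2-tuples (so [a b c] is written (a, (b, c))). Only opcodes 0-5 of the
# spec are modeled; a real interpreter also has if/compose/push/call/etc.

def cell(noun):            # ?noun - 0 means "yes, a cell" (Nock's 0 = true)
    return 0 if isinstance(noun, tuple) else 1

def slot(axis, noun):      # /[axis noun] - tree addressing
    if axis == 1:
        return noun
    if axis == 2:
        return noun[0]
    if axis == 3:
        return noun[1]
    # even axis: head of /[axis/2], odd axis: tail of /[axis/2]
    return slot(2 + (axis & 1), slot(axis >> 1, noun))

def nock(subject, formula):
    op, arg = formula
    if isinstance(op, tuple):              # autocons: *[a [b c] d]
        return (nock(subject, op), nock(subject, arg))
    if op == 0:                            # slot: fetch from the subject tree
        return slot(arg, subject)
    if op == 1:                            # constant
        return arg
    if op == 2:                            # eval: compute a new subject and formula
        b, c = arg
        return nock(nock(subject, b), nock(subject, c))
    if op == 3:                            # cell test
        return cell(nock(subject, arg))
    if op == 4:                            # increment
        return nock(subject, arg) + 1
    if op == 5:                            # structural equality (0 = equal)
        b, c = arg
        return 0 if nock(subject, b) == nock(subject, c) else 1
    raise ValueError("opcode not modeled in this sketch")

print(nock(42, (4, (0, 1))))   # increment the whole subject → 43
```

Note how everything is selection, construction, and a handful of tests on immutable trees; this is the sense in which Nock is "assembly for a combinator machine" rather than for a register machine.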

Hoon is intended to be the 'C' to Nock's 'assembly language'. It is supposed to be a thin layer above Nock, 'a glorified macro assembler' rather than a fundamentally higher level of abstraction. So Hoon is an answer to the question "In a world where our machine model was SK combinator calculus instead of Turing machines, what would be the analog of C?".

As for the crazy names: orthogonal to the above, the creator of Hoon made a deliberate design choice w/r/t naming things. He believes that (a) it's confusing for two things which are close but not quite the same to be given the same name; and (b) the analogs in SK-combinator-machine-land to Turing-machine-land programming constructs tend to be close but not quite the same; therefore he invented an entirely new vocabulary for Hoon. Furthermore, he thinks (c) names in programming languages should be short, one-syllable, natural-language words; and (d) it's unimportant if a newbie has to memorize many such words; therefore the new vocabulary has words like 'arm', 'leg', 'kelp', rather than longer descriptive names. Furthermore, he chose to (e) reject alphanumeric keywords; Hoon's built-in keywords are all punctuation. The creator claims these design choices are validated by experience: he's talked to a number of people who have learned Hoon, and after learning it they agree that it is neither crazy nor difficult.

My personal opinion is that Nock is beautiful, and I'm interested in learning Hoon just to see what 'the C of combinator calculus' might look like.

Quotes about Hoon's motivations from its creator(s)

" Hoon, a typed functional language

(note: in the following quote, i have replaced "Watt" with "Hoon")

" Hoon is a functional specification language. Its niche is the precise definition of arbitrary functions, especially in network protocol and file-format standards. Today, these standards (such as RFCs) are normally written in English and pseudocode.

...

If the layout function in HTML 5 is defined in Java, its semantics depend on the entire Java environment (not just the JVM) against which the code runs. ...it retains many installation dependencies...the JVM ... is anything but trivial...

...

Since the Nock spec is extremely small... it should be extremely precise.

...

If two Nock interpreters disagree, one is buggy, and it is trivial to figure out which - no "standards lawyering" involved. And since all Hoon programs (Hoon itself included) execute exclusively in Nock, a metacircular operating environment built on Nock should retain Nock's precision at all layers.

"Standards" is a broad category. The specific use case for Hoon is the relatively new problem of dissemination or "content-centric" networking protocols \citep{namedcontent}. A DN protocol is one whose semantics discard routing information, such as source IP. Hence, data must identify and authenticate itself.

Dissemination protocols are especially suited to functional specification. With irrelevant routing information discarded, the semantics of a dissemination client can be defined, uniformly across all nodes, as a single standard function on the list of packets received. Obviously, if this function is not precise, the protocol is ill-specified.

Many languages, even Javascript, may be well-specified enough to define useful and interesting dissemination protocols. For some simple functions, the old English-and-pseudocode RFC approach is serviceable as well. However, the challenge increases considerably if the protocol function is metacircular - that is, it extends its semantics with code delivered over the network.

Javascript has "eval," but (as in most languages), it is best regarded as a toy. You could also try this trick in Haskell. But how well would it work? Metacircularity in most languages is a curiosity and corner case. Programmers are routinely warned not to use it, and for good reason. Practical metacircularity is a particular strength of Nock, or any functional assembly language. Dynamic loading and/or virtualization works, at least logically, as in an OS.

We hypothesize that a metacircular functional dissemination protocol can serve as a complete, general-purpose system software environment - there is no problem it cannot solve, in theory or in practice. Such a protocol can be described as an "operating function" or OF - the functional analog of an imperative operating system. "

-- https://github.com/cgyarvin/urbit/blob/master/Spec/watt/sss10.tex

"Hoon is a strict, typed, functional language that compiles itself to Nock. The Hoon compiler is 4000 lines of Hoon. Adding standard libraries, the self-compiling kernel is 8000 lines.

Hoon has no particular familial relationship to other languages you may know. It uses its own type inference algorithm and is as different from Haskell as from Lisp. Hoon syntax is also completely unfamiliar. Hoon has the same relationship to Nock that C has to assembly — as thin a layer as possible. It is possible to learn Hoon without Nock, but it's probably not a good idea.

As a functional systems language, Hoon is especially good at metaprogramming, self-virtualization, hotpatching; marshalling and validating untyped data; decoding and encoding binary message formats. Hoon is designed for event programming, so there is no concurrency model. " -- http://urbit.org/

"Hoon is a high-level language which defines itself in Nock. Its self-compiling kernel, 7000 lines of code, specifies Hoon unambiguously; there is no Hoon spec. Hoon can be classified as a pure, strict higher-order static type-inferred functional language, with co/contra/bivariance and genericity. However, Hoon does not use lambda calculus, unification, or other constructs from “PL theory.” Hoon also excels at handling and validating untyped data, a common task on teh Internets. Its syntax is entirely novel and initially quite frightening." -- http://web.archive.org/web/20131005165939/http://www.urbit.org/2013/08/22/Chapter-0-intro.html

https://github.com/cgyarvin/urbit/tree/master/doc/book

" If I can summarize Hoon’s goal, it’s to be the C of functional programming. If you’re not so arthritic that you learned to code in Turbo Pascal, you may never fully appreciate the metaphor.

All languages in the Algol procedural family, including both C and Pascal, map straightforwardly onto a conventional CPU. But Pascal and C handle this mapping very differently. Pascal and C both have pointers and arrays, but Pascal works hard to treat both pointers and arrays as mathematical abstractions. C drops the abstraction; it makes no bones about the fact that a pointer is a memory address.

To a Pascal purist, to anyone who thinks mathematically, this seemed hideous. C isn’t really a high-level language at all - it’s a glorified macro assembler. Mankind retreats to the cave. But to programmers who are not natural mathematicians, whose minds are mechanical rather than abstract, C is a lifesaver. Since most mathematicians are also good mechanical thinkers, whereas very few people are naturally adept at abstraction, C slew and pillaged the once promising empire of Pascal.

There are two broad families of functional language available today: Haskell/ML, and Lisp. The Haskell family is relatively abstract; the Lisp family, relatively concrete. Perhaps Lisp is about as abstract as Pascal; Haskell is far more abstract. Both rest on the fundamental abstraction of functional programming, the lambda calculus, and more generally the foundational metamathematics of the late 19th and early 20th centuries.

Hoon has nothing to do with any of this stuff. It has functions and types, or what appear to be functions and types. On closer inspection, they are not abstractions at all, just glorified Nock macros.

If we compare these concrete patterns to the genuine abstractions of Haskell, we see that - as with Pascal and C - Hoon is roughly as expressive as Haskell. Haskell has higher-order type inference; Hoon has “higher-order” “type” “inference.” Some Haskell extensions have dependent types - Hoon has “refined” “types.” Hoon, like Lisp, unlike Haskell, is also very comfortable with typeless data; it should be, because it has no types, only “types.” The Hoon features and the Haskell abstractions have nothing in common - except that they solve the same problems for you, the programmer. In short, Hoon next to Haskell is a white shark next to a killer whale. The apparent resemblance is strictly superficial.

So we could describe Hoon as a pure, strict, higher-order typed functional language. But don’t do this in front of a Haskell purist, unless you put quotes around “typed,” “functional,” and possibly even “language.” We could also say “object-oriented,” with the same scare quotes for the cult of Eiffel.

Knowing Pascal made it harder, not easier, to learn C. Knowing Haskell or Lisp makes it harder to learn Hoon. Indeed, knowing either would have made it impossible for me to write Hoon. I do know C, of course, and the spirit of K&R is all over Hoon. Or so I’d like to think. Just as C is little more than a macro assembler for machine code, Hoon is little more than a macro assembler for Nock.

The most basic difference between Hoon and other languages is that Hoon is defined in Hoon. There is no formal Hoon spec - just a self-compiling compiler written in Hoon. The target of this compiler is, of course, Nock. Thus Hoon is as precisely defined as Nock, which is quite precise indeed.

This would be true regardless of the size of Hoon in Hoon, but Hoon in Hoon is in fact quite small. The Hoon kernel is 7000 lines; it gzips to 25K. But this includes not only the self-compiling compiler, but also all the standard libraries it needs. The compiler alone is 2500 lines, including a very intricate “monadic” parser, a non-Hindley-Milner “type inference” engine, and a Nock code generator. This reflects both the internal simplicity of Hoon and its expressiveness. If you know these 2500 lines, and an expert should, you know Hoon.

On the other hand, the apparent complexity of Hoon is very high. When you open a Hoon file, you are confronted with an enormous avalanche of barely structured line noise. Again this reminds us of C, which makes no attempt at the kind of abstract prettiness we expect from a Pascal or a Haskell. Learning Hoon involves learning nearly 100 ASCII digraph “runes.”

Is this a harsh learning curve? Of course it is. On the other hand, it is not a mathematical task, but a mechanical one. It is trivial compared to the task of learning the Chinese alphabet, memorizing the Qu’ran, etc, all rote mental tasks routinely performed by normal human 11-year-olds. If you have an 11-year-old who understands the Hindley-Milner algorithm, you have a remarkable young mathematician.

A practical programming language is first and foremost a UI for programmers - meaning human programmers. Concrete languages beat abstract ones because they play to the strengths of the human brain, and avoid its weaknesses. Functional programming is traditionally reserved for the topmost echelon of natural talent. I’d like to think that anyone who can learn to fix a Chevy can learn to write a Hoon function. We’ll see if that’s true.

A programming language is called a language for a reason - it should activate the human linguistic lobes. Learning Hoon is like learning a language very alien to your first, such as Chinese from English. Before you know Hoon, it looks like squiggles. Once you know Hoon, and the rote task of syntax processing is hardwired, you look at your screen and see the function. Or, at least, I do - I hope you can too. " -- http://web.archive.org/web/20131005165939/http://www.urbit.org/2013/08/22/Chapter-0-intro.html

---

more on the 'personal cloud computer' and why it needs a new network and new system software (start reading at "But wait! Can this actually happen?"; above that, he describes the idea of every end-user actually hosting their own Facebook, Google, etc node on a VPS via bash, in order to show that it would be a good idea except that sysadmining is too hard):

http://unqualified-reservations.blogspot.com/2011/10/personal-cloud-computing-in-2020-or-not.html


" Architecturally, Urbit is an opaque computing and communication layer above Unix and the Internet. To the user, it's a new decentralized network where you own and control your own general-purpose personal server, or "planet." ... How can we put users back in control of their own computing?

Most people still have a general-purpose home computer, but it's atrophying into a client. Their critical data is all in the cloud. Technically, of course, that's ideal. Data centers are pretty good at being data centers.

But in the cloud, all users have is a herd of special-purpose appliances, not one of which is a general-purpose computer. Do users want their own general-purpose personal cloud computer? If so, why don't they have one now? How might we change this? ... "Personal server" is a phrase only a marketing department could love. We prefer to say: your planet. Your planet is your digital identity, your network address, your filesystem and your application server. Every byte on it is yours; every instruction it runs is under your control.

Most people should park their planets in the cloud, because the cloud works better. But a planet is not a planet unless it's independent. A host without contractual guarantees of absolute privacy and unconditional migration is not a host, but a trap. ... Take the web apps you use today. Imagine you trust them completely. Imagine any app can use any data from any other app, just because both accounts are you. Imagine you can "sidegrade" any app by moving its data safely to a compatible competitor. Imagine all your data is organized into a personal namespace, and you can compute your own functions on that namespace.

In this world, no app developer has any way to hold your data hostage. Forget how this works technically (it doesn't). How does it change the apps?

Actually, the rules of this thought-experiment world are so different that few of the same apps exist. Other people's apps are fundamentally different from your own apps. They're not "yours" because you developed them -- they're your apps because you can fire the developer without any pain point. You are not a hostage, so the power dynamic changes. Which changes the app.

For example: with other people's apps, when you want to shop on the Internets, you point your browser at amazon.com or use the Google bar as a full-text store. With your own apps, you're more likely to point your browser at your own shopping assistant. This program, which works entirely for you and is not slipping anyone else a cut, uses APIs to sync inventory data and send purchase orders.

Could you write this web app today? Sure. It would be a store. The difference between apps you control and apps you don't is the difference between a shopping assistant and a store. It would be absurd if a shopping assistant paid its developer a percentage of all transactions. It would be absurd if a store didn't. The general task is the same, but every detail is different.

Ultimately, the planet is a different user experience because you trust the computer more. A program running on someone else's computer can promise it's working only for you. This promise is generally false and you can't enforce it. When a program on your computer makes the same promise, it's generally true and you can enforce it. Control changes the solution because control produces trust and trust changes the problem. ... The market hasn't invalidated the abstract idea of the planet. It's invalidated the concrete product of the planet we can actually build on the system software we actually have.

In 1978, a computer was a VAX. A VAX cost $50K and was the size of a fridge. By 1988, it would cost $5K and fit on your desk. But if a computer is a VAX, however small or cheap, there is no such thing as a PC. And if a planet is an AWS box, there is no such thing as a planet.

The system software stack that 2015 inherited -- two '70s designs, Unix and the Internet -- remains a viable platform for "1:n" industrial servers. Maybe it's not a viable platform for "n:1" personal servers? Just as VAX/VMS was not a viable operating system for the PC? ... A clean-slate redesign seems like the obvious path to the levels of simplicity we'll need in a viable planet. Moreover, it's actually easier to redesign Unix and the Internet than Unix or the Internet. Computing and communication are not separate concerns; if we design the network and OS as one system, we avoid all kinds of duplications and impedance mismatches. ... A simpler OS

One common reaction to the personal-server proposition: "my mother is not a Linux system administrator." Neither is mine. She does own an iPhone, however. Which is also a general-purpose computer. A usability target: a planet should be as easy to manage as an iPhone. ... A sane network

When we look at the reasons we can't have a nice planet, Unix is a small part of the problem. The main problem is the Internet.

There's a reason what we call "social networks" on the Internet are actually centralized systems -- social servers. For a "1:n" application, social integration - communication between two users of the same application - is trivial. Two users are two rows in the same database.

When we shift to a "n:1" model, this same integration becomes a distributed systems problem. If we're building a tictactoe app in a "1:n" design, our game is a single data structure in which moves are side effects. If we're building the same app on an "n:1" model network, our game is a distributed system in which moves are network messages.

Building and managing distributed Internet systems is not easy. It's nontrivial to build and manage a centralized API. Deploying a new global peer-to-peer protocol is a serious endeavor.

But this is what we have to do for our tictactoe app. ... Here are some major features we think any adequate planet needs. They're obviously all features of Urbit.

Repeatable computing

Any non-portable planet is locked in to its host. That's bad. You can have all the legal guarantees you like of unconditional migration. Freedom means nothing if there's nowhere to run to. Some of today's silos are happy to give you a tarball of your own data, but what would you do with it?

The strongest way to ensure portability is a deterministic, frozen, non-extensible execution model. Every host runs exactly the same computation on the same image and input, for all time. When you move that image, the only thing it notices is that its IP address has changed.

We could imagine a planet with an unfrozen spec, which had some kind of backward-compatible upgrade process. But with a frozen spec, there is no state outside the planet itself, no input that is not input to the planet itself, and no way of building a planet on one host that another host can't compute correctly.

Of course every computer is deterministic at the CPU level, but CPU-level determinism can't in practice record and replay its computation history. A computer which is deterministic at the semantic level can. Call it "repeatable computing."

Orthogonal persistence

It's unclear why we'd expose the transience semantics of the hardware memory hierarchy to either the programmer or the user. When we do so, we develop two different models for managing data: "programming languages" and "databases." Mapping between these models, eg "ORM," is the quintessence of boilerplate.

A simple pattern for orthogonal persistence without a separate database is "prevalence": a checkpoint plus a log of events since the checkpoint. Every event is an ACID transaction. In fact, most databases use this pattern internally, but their state transition function is not a general-purpose interpreter. ... A simple typed functional language

Given the level of integration we're expecting in this design, it's silly to think we could get away without a new language. There's no room in the case for glue. Every part has to fit.

The main obstacle to functional language adoption is that functional programming is math, and most human beings are really bad at math. Even most programmers are bad at math. Their intuition of computation is mechanical, not mathematical.

A pure, higher-order, typed, strict language with mechanical intuition and no mathematical roots seems best positioned to defeat this obstacle. Its inference algorithm should be almost but not quite as strong as Hindley-Milner unification, perhaps inferring "forward but not backward."

We'd also like two other features from our types. One, a type should define a subset of values against a generic data model, the way a DTD defines a set of XML values. Two, defining a type should mean defining an executable function, whose range is the type, that verifies or normalizes a generic value. Why these features? See the next section... High-level communication

A planet could certainly use a network type descriptor that was like a MIME type, if a MIME type was an executable specification and could validate incoming content automatically. After ORM, manual data validation must be the second leading cause of boilerplate. If we have a language in which a type is also a validator, the automatic validation problem seems solvable. We can get to something very like a typed RPC. ... Semantic drivers

One unattractive feature of a pure interpreter is that it exacts an inescapable performance tax -- since an interpreter is always slower than native code. This violates the prime directive of OS architecture: the programmer must never pay for any service that doesn't get used. Impure interpreters partly solve this problem with a foreign-function interface, which lets programmers move inner loops into native code and also make system calls. An FFI is obviously unacceptable in a deterministic computer.

A pure alternative to the FFI is a semantic registry in which functions, system or application, can declare their semantics in a global namespace. A smart interpreter can recognize these hints, match them to a checksum of known good code, and run a native driver that executes the function efficiently. This separates policy (pure algorithm as executable specification) from mechanism (native code or even hardware). ... Why not a planet built on JS or JVM?

Many programmers might accept our reasoning at the OS level, but get stuck on Urbit's decision not to reuse existing languages or interpreters. Why not JS, JVM, Scheme, Haskell...? The planet is isolated from the old layer, but can't it reuse mature designs?

One easy answer is that, if we're going to be replacing Unix and the Internet, or at least tiling over them, rewriting a bit of code is a small price to pay for doing it right. Even learning a new programming language is a small price to pay. And an essential aspect of "doing it right" is a system of components that fit together perfectly; we need all the simplicity wins we can get.

But these are big, hand-waving arguments. It takes more than this kind of rhetoric to justify reinventing the wheel. Let's look at a few details, trying not to get ahead of ourselves.

In the list above, only JS and the JVM were ever designed to be isolated. The others are tools for making POSIX system calls. Isolation in JS and the JVM is a client thing. It is quite far from clear what "node.js with browser-style isolation" would even mean. And who still uses Java applets?

Let's take a closer look at the JS/JVM options - not as the only interpreters in the world, just as good examples. Here are some problems we'd need them to solve, but they don't solve - not, at least, out of the box.

First: repeatability. JS and the JVM are not frozen, but warm; they release new versions with backward compatibility. This means they have "current version" state outside the planet proper. Not lethal but not good, either.

When pure, JS and the JVM are at least nominally deterministic, but they are also used mainly on transient data. It's not clear that the actual implementations and specifications are built for the lifecycle of a planet - which must never miscompute a single bit. (ECC servers are definitely recommended.)

Second: orthogonal persistence. Historically, successful OP systems are very rare. Designing the language and OS as one unit seems intuitively required.

One design decision that helps enormously with OP is an acyclic data model. Acyclic data structures are enormously easier to serialize, to specify and validate, and of course to garbage-collect. Acyclic databases are far more common than cyclic ("network" or "object") databases. Cyclic languages are more common than acyclic languages -- but pure functional languages are acyclic, so we know acyclic programming can work.

(It's worth mentioning existing image-oriented execution environments - like Smalltalk and its descendants, or even the Lisp machine family. These architectures (surely Urbit's closest relatives) could in theory be adapted to use as orthogonally persistent databases, but in practice are not designed for it. For one thing, they're all cyclic. More broadly, the assumption that the image is a GUI client in RAM is deeply ingrained.)

Third: since a planet is a server and a server is a real OS, its interpreter should be able to efficiently virtualize itself. There are two kinds of interpreter: the kind that can run an instance of itself as a VM, and the kind that can't.

JS can almost virtualize itself with eval, but eval is a toy. (One of those fun but dangerous toys -- like lawn darts.) And while it's not at all the same thing, the JVM can load applets -- or at least, in 1997 it could...

(To use some Urbit concepts we haven't met yet: with semantic drivers (which don't exist in JS or the JVM, although asm.js is a sort of substitute), we don't even abandon all the world's JS or JVM code by using Urbit. Rather, we can implement the JS or JVM specifications in Hoon, then jet-propel them with practical Unix JS or JVM engines, letting us run JS or Java libraries.)
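The semantic-driver ("jet") idea described above - declare a pure function's semantics, match it against a checksum of known-good code, and run a fast native implementation in its place - can be sketched as a toy registry. Everything here (names, the checksum scheme, the example function) is a hypothetical illustration, not Urbit's actual mechanism.

```python
# Toy "semantic driver" registry: the interpreter keeps a map from a
# checksum of known-good pure source to a fast native implementation.
# If a declared function matches, the jet runs; otherwise the slow,
# pure specification runs. All names are invented for illustration.
import hashlib

JETS = {}   # sha256 of pure source -> fast implementation

def register_jet(source, fast_fn):
    JETS[hashlib.sha256(source.encode()).hexdigest()] = fast_fn

def run(source, slow_fn, *args):
    jet = JETS.get(hashlib.sha256(source.encode()).hexdigest())
    return jet(*args) if jet else slow_fn(*args)

# Pure "specification": decrement by counting up from zero, as in Nock,
# which has increment but no native decrement.
DEC_SRC = "dec n = count m up from 0 until m+1 equals n"

def slow_dec(n):
    m = 0
    while m + 1 != n:
        m += 1
    return m

register_jet(DEC_SRC, lambda n: n - 1)   # native driver, same answer, O(1)

print(run(DEC_SRC, slow_dec, 10))        # jet fires → 9
```

The key property is that the jet can only accelerate, never change, the answer: the pure source remains the definition, so a host with no jets is slower but still correct.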

Could we address any or all of these problems in the context of JS, the JVM or any other existing interpreter? We could. This does not seem likely to produce results either better or sooner than building the right thing from scratch....

" -- http://urbit.org/docs/theory/whitepaper

todo for me: read the rest of that, starting with http://urbit.org/docs/theory/whitepaper#-definition

---

why is 0 True and 1 False in Hoon?:

vidarh 447 days ago

And this is close to the reason why you often see zero used as "success" in languages like C: There are many possible error values.

So it's not very original at all.

urbit 445 days ago

That's correct. :-)

-- https://news.ycombinator.com/item?id=8578414

however cgyarvin regrets it:

542458 446 days ago

parent

But on POSIX systems, there's a reason for that. There's only one success (Since "success" should do the same thing every time), but many types of errors which can be indicated by the return value. You could argue that 1 should be success, and >1 should be failure, but that's a minor quibble.

Conversely, here it's just because "it's different". I feel that this is a bit of a shame - some of the other parts of the project appear quite interesting, but making fundamental decisions in downright wrong ways just to mess with expectations comes across as silly to say the least. Why deliberately increase the learning barrier and drive people away?

urbit 445 days ago

Yeah, it's probably not one of our better decisions.

and motivation not to do it that way:

loqi 444 days ago

It's been sorta mentioned elsewhere on the thread, but there is another (IMO simpler) mathematical intuition behind 0:1 :: false:true that doesn't involve any lambda fundamentalism. It's the algebraic analogy disjunction:conjunction :: addition:multiplication :: union:intersection :: ... which also turns up pretty often in computing.

For instance, if you've got anything like regular expressions, then you've got something with a structure where the "unit" (trivial match) is an identity for sequencing, and the "zero" (failed match) is an identity for disjunction and a zero for sequencing. It's not exactly a formal argument for preferring booleans to loobeans, but a failed regex match sure feels like a "false" to me.

I don't doubt that it's not worth changing at this point, but don't throw the semiring baby out with the lambda bathwater.
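loqi's algebraic analogy can be made concrete: encode false as 0 and true as 1, and conjunction behaves like multiplication while disjunction behaves like saturating addition - the semiring structure being referred to. A trivial sketch:

```python
# The disjunction:conjunction :: addition:multiplication analogy, with the
# conventional encoding false=0, true=1 (the opposite of Hoon's loobeans).

def conj(a, b):              # AND is multiplication: nonzero only if both are
    return a * b

def disj(a, b):              # OR is addition, capped at 1
    return min(a + b, 1)

assert conj(1, 1) == 1 and conj(1, 0) == 0
assert disj(0, 0) == 0 and disj(1, 0) == 1
```

Under Hoon's 0-is-true convention these identities no longer line up with ordinary arithmetic, which is the substance of the objection.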

---

urbit 447 days ago

Excellent question!

One, as we know from Church-Turing equivalence, lambda and infinitely many other models of computing have the same expressive power. That doesn't mean they have the same practical utility, though.

Lambda in its Lisp incarnation is actually a rather poor substrate for something like a Lisp machine, I think, because it doesn't layer very well. You can define quite a simple Lisp model, but when you want to turn it into a practical Lisp, you don't add another layer - you grow hair on top of the existing layer. You grow a little hair, you get Scheme; you grow a lot of hair, you get Common Lisp.

I've never seen a lambda model (Qi/Shen perhaps a partial exception, but even there the underlying model is not very simple) that layers a complex language on a simple kernel. I think this is because lambda defines abstractions like symbol tables and function calls, which are user-level language features, in the computational model. The bells and whistles get mixed up with the nice clean math.

Another example is the fact that a modern OS should present itself to the programmer as a single-level store, meaning effectively an ACID programming language in which every event is a transaction. So, you're not constantly moving data across an impedance mismatch from transient to persistent storage, each having its own very different type system and data model.

But, if you're building persistently stored data structures designed to snapshot efficiently and remain maintainable, you really want your data model to be acyclic and not require GC. This goes in a very different direction from almost all the dynamic language work of the last 50 years.

Or, for instance, if your system is designed to work and play well in a network world, it really ought to be able to be good at sending typed data over the network. And validating it when it gets to the other side. Your type system ought to be able to do the same job as an XML DTD or JSON schema or whatever. Well... this wasn't exactly a design requirement when people designed, say, Haskell.

I could go on - there's a lot of stuff like this that is built the way it is because it made very good sense in the 60s, 70s, 80s or 90s. But the requirements really have changed, I think.

---

" urbit 446 days ago

parent

Perhaps it makes a bit more sense if it's explained that the original goal was to keep the whole codebase (outside of the C support layer, which adds no semantics) at 10Kloc. Unfortunately this is now rapidly slipping toward 20.

Of course, Kloc is a deceptive measure of algorithmic simplicity in a functional language, if compared to procedural. Also there is about another 15Kloc of C in the interpreter and another 10K that wraps an event layer on libuv.

Bear in mind that Urbit can also be seen as an ACID database on the log-and-snapshot principle - events are transactions. You can of course export application state to some other storage system, but you don't have to. (If you change the source code, your app reloads - you do have to write a function that maps your old state to your new state, though.) So there is a lot of strange futzing around with mmap at the bottom.

"

---