proj-oot-old-150618-ootSyntaxNotes2

we want to be able to declare some functions as associative, etc. Perhaps it would be best if, just like we want operator precedence to be easily readable based on the symbols used, we make some conventions for things like associativity too? some ideas:

symmetric multichar ops with an even # of chars are commutative and/or symmetric
ops with +, * are assoc
ops with -, / are the inverses of the comparable +, * ops
ops with = are reflexive, transitive
comparison ops: if not symmetric and containing only one of <, >, then antisymmetric (note: antisymmetry seems like it may mean that either such operators operate directly, not on equivalence classes, or that equivalence must be redefined whenever these are; do we want that? seems too complicated, this link between different operators. otoh, if you just make sure that neither <= nor >= holds within an equivalence class, you're good, so mb this is ok (later: did i mean < and > within the class?))
ops with < or > are transitive

does 'with' mean starting with? starting with or ending with? or (starting AND ending) OR (odd-numbered and middle character)?
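a rough executable rendering of my reading of those rules, as a sketch (taking 'with' to mean 'contains', which the question above leaves open; names invented):

    # hypothetical sketch: classify an operator symbol by the proposed conventions
    def properties(op):
        props = set()
        symmetric = op == op[::-1]
        if symmetric and len(op) > 1 and len(op) % 2 == 0:
            props.add('commutative')
        if '+' in op or '*' in op:
            props.add('associative')
        if '=' in op:
            props |= {'reflexive', 'transitive'}
        if '<' in op or '>' in op:
            props.add('transitive')
        if not symmetric and (('<' in op) != ('>' in op)):
            props.add('antisymmetric')
        return props

    print(sorted(properties('++')))  # ['associative', 'commutative']
    print(sorted(properties('<=')))  # ['antisymmetric', 'reflexive', 'transitive']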

with these rules, ++ would be commutative. so 'append' can't be ++; we'd really have to make addition ++, and make '+' append.

hmm.. but otoh that's the kind of thing that ppl would forget if they are away from the language for awhile.

also, initially i was thinking this would save us some syntax, since then we would not need any other way to declare things associative, commutative, etc, but now i think that's too implicit.

---

maybe the language would be simpler without variadic args? or does it not matter if we already have defaults?

---

if we use ' for exception/maybe/option/nullableTypes tower wrapping/unwrapping, then we can't use it for quoting string literals ('hello'), or for a shortcut for keywords or quoting ('hello)

it's unclear whether it would be better to use ' for strings and use " for something else, or to use ' for the maybe tower

---

need a convenient per-argument syntax for strict, recursive strict, lazy annotations

---

review the old pseudo-oot in [1]

---

Haskell: difference between . (compose functions) and $ (low-precedence function application; sorta like a pipe, but with the order reversed)

http://stackoverflow.com/questions/940382/haskell-difference-between-dot-and-dollar-sign

---

i go back and forth on this, but currently i feel that there should be a maximum limit on non-keyword (positional) arguments in called functions, perhaps 3. This is to make code easier to read. However, this requires us to have a separate syntax for keyword arguments and defaultable arguments, unlike Python. The obvious syntax is 'x/ ' for a defaultless keyword argument.

(note: why 3? http://en.wikipedia.org/wiki/Arity and a quick Google search for the terms 'ternary function' and 'quaternary function' convinced me that 3-ary is the last common one)

---

general form for ternary operators: arg1 ?arg2? arg3 where '?' is some punctuation character, possibly custom defined

(or, should we permit only matching delimiters for this, eg arg1 <arg2> arg3)

---

want auto initializers like in C++, so you don't have to do the tedious thing you have to do in Python:

    class C:
        def __init__(self, param1, param2):
            self.param1 = param1
            self.param2 = param2
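in today's Python you can get most of the way there with a decorator; a minimal sketch (the name 'autoinit' is invented for illustration):

    import inspect
    from functools import wraps

    def autoinit(init):
        # copy constructor arguments onto self before running __init__
        sig = inspect.signature(init)
        @wraps(init)
        def wrapper(self, *args, **kwargs):
            bound = sig.bind(self, *args, **kwargs)
            bound.apply_defaults()
            for name, value in list(bound.arguments.items())[1:]:  # skip 'self'
                setattr(self, name, value)
            init(self, *args, **kwargs)
        return wrapper

    class C:
        @autoinit
        def __init__(self, param1, param2):
            pass

    c = C(1, 2)
    print(c.param1, c.param2)  # 1 2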

---

similarly, when you are composing a class from multiple superclasses, sometimes upon some method being called, you need to call ALL of the superclasses implementations of this method (or at least, the methods from those superclasses which 'care') e.g.

    class A:
        def idle(self):
            do_important_thing_1()

    class B:
        def idle(self):
            do_important_thing_2()

    class C(A, B):
        def idle(self):
            A.idle(self)
            B.idle(self)

would be nice to have language support for this to reduce the tedium of writing C's idle(), while also making it less error-prone (eg forgetting to change the implementation of C.idle when you swap-in/swap-out various superclasses; imagine that each class has 20 things like idle(), now you see how one or two could get messed up)
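a sketch of what that language support could do, in plain Python (the helper name is invented); it walks the MRO and calls every superclass's own implementation, so C.idle never has to be hand-maintained:

    class A:
        def idle(self):
            print("important thing 1")

    class B:
        def idle(self):
            print("important thing 2")

    class C(A, B):
        pass

    def call_all(obj, method_name):
        # call each class's own definition of the method, in MRO order
        for klass in type(obj).__mro__:
            impl = klass.__dict__.get(method_name)  # only classes that 'care'
            if impl is not None:
                impl(obj)

    call_all(C(), 'idle')  # important thing 1, then important thing 2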

---

in fact, class construction from methods, class construction via inheritance, class construction via composition, class construction via auto-calling all relevant superclass methods from each method as noted in the previous section, could all be just operators, e.g. functions taking classes (and other parameters, such as lists of methods) to classes. This allows new custom class construction operators. As opposed to only having language support for one or two ways of creating classes, e.g. Python (although in Python you could easily make 'class operators' too, and you also have metaclass programming in Python)
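since Python classes are first-class values, a 'class operator' is just a function from classes (plus other parameters) to classes; a minimal sketch with an invented 'compose' operator:

    # hypothetical 'compose' operator: base classes and methods in, new class out
    def compose(*bases, **methods):
        return type('Composed', bases, methods)

    class A: pass
    class B: pass

    C = compose(A, B, greet=lambda self: print("hi"))
    C().greet()                                 # hi
    print(issubclass(C, A), issubclass(C, B))   # True True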

---

could make it mandatory that if there are two vars in the same scope, and one's name is of the form "x", and the other "xs", that the latter's type is a list or graph of the former

and/or, this could have something to do with automapping lists; eg even if 'x' is an automapping list, then 'xs' disables the automapping

should all lists be automapping by default?

---

i'm thinking of using $x instead of ?x for metavariables. $ would still be interpolation. This would also be antiquote. So in general, '$' would mean that something is 'more of a variable' than the default in its context; in the ordinary code context, where identifiers are already variables, it's a metavariable; in the context of a non-raw string, $ means interpolation; in the context of a quote, $ is antiquote; in the context of a type signature, $x is a type variable.

(i feel that giving type variables a variable-indicating sigil prefix will be easier for newbies to grok than Haskell's way of just having lowercase identifiers which are assumed to be variables rather than constant types)

---

i like how AtScript specifies a syntax but not a semantics for types

---

i dislike how AtScript uses list<int> for generics. Generics are just type functions. Use 'list int'.

---

Haskell-ish @atbinding seems useful. The idea is that you put that within a pattern expression, and it lets you bind the subexpression at that point in the AST to a value.

We should support this both for pattern expressions (type expressions) and also for ordinary value computations.
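for comparison, Python 3.10's structural pattern matching spells this binding with 'as' attached to the subpattern:

    point = (3, (1, 2))
    match point:
        case (x, (a, b) as inner):
            print(x, a, b, inner)  # 3 1 2 (1, 2)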

---

AtScript and Python use @annotation for annotations. i think this is what we'd use UPPERCASE for.

---

so my new idea for Capitalized and UPPERCASE is as follows. The previous idea was that UPPERCASE was meaningless identifiers, whereas Capitalized was something where metaprogramming was allowed to reflect on the capitalized string.

My new idea is that you can reflect on either of them, and the difference is that UPPERCASE is purely an annotation attached to the AST, whereas Capitalized is an ordinary node (identifier) within the AST, but one which metaprogramming can reflect upon and alter.

UPPERCASE is really just shorthand for ^UPPERCASE, which is itself just short for ^{UPPERCASE}. In the general case, eg when the annotation is a multiword expression, you have to use the long form, with ^{}

---

for languages in which '::' is EOL type annotation (and we might use it this way), :: is just a special case of the general case of annotations. That is, it's a node on the metalevel pointing to a node on the value level, such that the edge type (the type of the arrow which is pointing) is 'type annotation'.

Perhaps a :: x y is just short for a^{type-annotation: x y}.

note that the thing on the right of the :: is just an expression whose result is of type 'type'


since we have reified graphs here, we need general syntax for annotating edges of the AST, not just nodes.

one idea is ^{} for node annotation, and ^^{} for edge annotations. Eg:

    ^{annotation on the whole functionA definition}
    def ^{annotation on the functionA name}functionA ^{annotation on the functionA definition argspec}(x, y) ^{annotation on the functionA definition block}{
        ^^{annotation on the edge/connection between the functionA definition block, and the statement 'return x+y'}
        return x+y
    }

---

if type variables without the '$' prefix are just ordinary identifiers, then you can use them to abbreviate long types. You could say that type variable assignments in enclosing lexical scopes are available for use in type expressions.

eg:

    x = [0]
    xtype = list int
    def functionA {
        y = [root:x] :: Tree xtype
    }

note that in the above example, rather than syntactically putting the 'xtype = list int' into some separate 'type annotation' segment, it's just right there along with the value computations.

this may or may not be a bad idea. Currently, i feel that a sufficiently smart compiler should be able to deduce which assignments are statically known types, and separate them out itself if it wants to.

note that type variables WITH the $ prefix are pattern metavariables, eg 'Tree $x' is a type (pattern) that matches any type of the form Tree(x), for any x.

---

still thinking we might want syntax for at least the enumeration special case of dependent types (namely, the type of an integer whose value is guaranteed to be below some ceiling). How about int<ceiling; eg int<5 is an integer which is always one of 0,1,2,3,4.
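a runtime-checked Python analogue of the proposed int<5, as a sketch (the factory name is invented; the static checking is the whole point of the proposal, which Python can't do):

    def int_below(ceiling):
        class BoundedInt(int):
            def __new__(cls, value):
                if not 0 <= value < ceiling:
                    raise ValueError(f"need 0 <= value < {ceiling}, got {value}")
                return super().__new__(cls, value)
        BoundedInt.__name__ = f"int<{ceiling}"
        return BoundedInt

    Int5 = int_below(5)
    print(Int5(4))  # 4
    # Int5(5) raises ValueError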

---

furthermore, in order to be able to have a type for non-empty lists (which is useful for maybe-like constructs), we need a way to declare the type of something which is only generated from one of the many constructors of some other type.

Eg if the constructors of a list are EmptyList and Cons, then we need to be able to say 'x = the type of something of the form Cons(y,z)'. So how about just 'Cons($y, $z)'

Note that here we see pattern variables $y and $z that don't refer to values of type 'type'. Does that mean that we need a different syntax for this and for Tree($x), where x must be a type? Could just use '$$' instead of '$'; Tree($$x) as a shortcut for 'Tree($x), such that $x isa type'.

which reminds us that we'll need a syntax to put 'such thats' into our patterns. Really, i guess here we're finally getting into the logic programming.

---

note a common thread in the above that type expressions are just ordinary expressions with values of type 'type'. The syntax '::' is just syntactic sugar for a more general annotation construct with edge color 'type annotation'

---

i guess the &var syntax is pretty good:

    &y = &x      pythonic assignment, such that these are both aliases for the same value, and changing one changes the other
    z = &y       'freeze' the mutable value &y via copy-on-write into 'immutable value typed' variable z
    z2 = z       copy-on-write one immutable value into another
    z2 = q       note that it's the values which are immutable, not the variables
    &y = z2      assign the value in 'immutable value typed' variable z2 into mutable alias variable y. This will change the value observed in &x, also.
    z = &f(x,y)  call the potentially impure (from the vantage point of this statemask) function 'f'
    z = g(x,y)   call the pure (from the vantage point of this statemask) function 'g'

---

hmm in some sense the usage of '$' in string interpolation and its usage in a pattern such as Tree($$x) are opposite meanings. In the former case, we are saying 'substitute this in now', and in the latter, we are saying 'DONT substitute this in now'. Which makes an argument for ?x metavariables in patterns, instead of $x.

---

i like the perl6 0..^4 as a shortcut for 0,1,2,3. We'd have to choose a different character, tho. Mb postfix -? '0..4-'. But we wanted to use postfix - as a generic marker.. How about --?

---

{} vs (): in addition to the difference that ()s are auto-closed at paragraph ends and {}s are not, perhaps only {}s open new scopes. Eg all variables declarations within a {} are in the same scope. Eg metaprogramming/DSLs cannot be applied to an area smaller than a {}.

---

perhaps Capitalized can support BOTH keywords (i mean Ruby keywords, eg cases of an Enum), and metaprogrammed identifiers, as follows:

by default, Capitalized is keywords, but a metaprogramming DSL is allowed to call them identifiers (actually, that sounds backwards; another way to say what i meant: Keywords are by default keyword identifiers (:keyword in ruby) but can be matched by metaprogramming)

single letter caps (eg 'X', 'Y'), and single letter caps followed by numbers (eg X2) are always identifiers.

note: although metaprogrammed keywords must be Capitalized, builtin language keywords are lowercase

are UPPERCASE labels by default but metaprogrammable? do they have namespaces in them (LIBRARYA:TRANSACTION) (same for keywords?)?

maybe we want to have colon-delimited namespaces for all identifiers (does this help with the Haskell no-namespacing-of-typeclass-ops problem or the haskell no-namespaces-of-field-accessors problem?)


@ vs #: if we use @ for Haskell-style at-binding then we need another character for Views. How about infix attached #? eg prefix # is len (#lst), unattached # is len (eg f = #;); attached infix # is view selection (eg x = y#list).

Then postfix # can be unboxed and unlifted: x#

---

if postfix # means 'strict' (unlifted), then postfix -# could mean lazy:

x# (strict) x-# (lazy)

could also use !s of course:

!x (strict) ~!x (lazy)

! could also have something to do with impurity, blocking calls, parallelization, eager evaluation as opposed to lazy, Golang-style channels

also, we might want to reserve one or two infix connected sigils for metaprogramming, and one or two prefix sigils too (mb the same ones). Eg mb we want to reserve 'a#b' for definitions by DSLs

---

i guess the impact of LL(1) instead of LL might be that you can have 2-character syntactical thingees, eg '-#'?

---

instead of ??x for type var x, could use 't?x' (riffing off of Python raw string syntax, r"string")

---

the reason i am interested in Haskell-style at-binding is not because i think it's so necessary to capture part of a pattern in identifiers; you can do that with multiline patterns, eg

    a = ($x)*
    b = (a,(3, a))

etc

rather, i'm interested because i see it as a more general mechanism, a way to attach an annotation to a node of the AST whose edge color/semantics is 'name' or 'node label'. In other words, i see this as the syntax for labeling nodes in data constructors.

---

need node labels, edge labels, port labels; how about @, @@, @@@?

---

mb since $ and ? are opposites, just use $ and $- ?

---

i like this; we are beginning to flesh out how graphs with annotation are central to Oot, and how other things such as type annotations and at-bindings are merely special cases of this

---

a line whose first character is # must be a comment, at least at the beginning, so that we can have a bunch of #!/usr/bin ish lines

---

don't forget that we want a syntax for our 'typeclassy' types that is as concise as haskell's data declaration syntax:

  www.andres-loeh.de/OpenDatatypes.pdf compare fig 10 to fig 3
  [http://www.informatik.uni-marburg.de/~kos/papers/gpce06.pdf Software Extension and Integration with Type Classes]: compare  figs 7 and 8 to fig 6; and fig. 10 to fig. 9 (fig 7 vs fig 6 is done just below)
  (in each comparison, we want the functionality of the former with the conciseness of the latter)

looks like this can be done pretty easily by just providing shorter syntactic sugar for Haskell constructs involving the words Instance, Class, etc, and by using something like the data declaration syntax to declare a portion of the open data declarations involving 'class __ instance of __' etc

eg something like

    data Exp = Lit Int | Add Exp Exp

needs to mean (fig 7 vs fig 6 of 2nd paper):

    class Exp x
    data Lit = Lit Int
    data (Exp x, Exp y) => Add x y = Add x y
    instance Exp Lit
    instance (Exp x, Exp y) => Exp (Add x y)

and something like (fig 7 vs fig 6 of 2nd paper):

    eval :: Exp -> Int
    eval (Lit i) = i
    eval (Add l r) = eval l + eval r

needs to mean:

    class Exp x => Eval x where
        eval :: x -> Int

    instance Eval Lit where
        eval (Lit i) = i

    instance (Eval x, Eval y) => Eval (Add x y) where
        eval (Add x y) = eval x + eval y

---

but we may still need something like haskell's separate data declaration syntax to declare view-specific closed 'constructor' lists

---

so i guess the current thinking on arithmetic syntax is:

+, -, * are arithmetic but only when unconnected to their arguments (separated by whitespace)
++, --, ** are other things (mb ** is ^?)
division is 'div'
/ is ->
// is comment-to-EOL

and for #:

  1. # comments are only on the initial lines of the program, and the # must be at the beginning of the line; x# elsewhere is a footnote; x#: elsewhere is a footnote definition
  2. #x elsewhere is len(x); x#a is view a of x

i think that's overloading # too much.. #x for len is fine, but then we should only have one other of footnote, view

could use infix & for view, right now we only have prefix & for by-reference/mutable/(address-of?)

---

this blog points out that you should usually not use boolean positional arguments in APIs, prefer something more readable like a keyword argument: http://ariya.ofilabs.com/2011/08/hall-of-api-shame-boolean-trap.html

this also makes me think that perhaps we should not let ppl 'positionalize' keyword arguments, because there could be eg a boolean keyword argument whose meaning is hard to guess without seeing the keyword. However, we don't want to discourage ppl from giving keywords to their arguments. Mb we should just bar positionalization of keyword arguments with defaults? And we are already banning positionalization of more than 3 args, right?
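a tiny Python rendering of the boolean-trap point from that post (names invented for illustration):

    # at the call site, a bare boolean is unreadable; a keyword argument is not
    def repaint(immediate=False):
        print("repainting", "immediately" if immediate else "lazily")

    repaint(True)            # legal, but what does True mean here?
    repaint(immediate=True)  # the keyword makes the intent obvious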

---

javascript ES6 has something called 'destructured parameters' which sound like our generic-pattern-matching-general-case-of-keyword-argument-and-default-argument-argspec; see http://rauchg.com/2015/ecmascript-6/#destructuring

---

js es6 has string interpolation syntax for expressions, not just identifiers, via ${} . sounds good to me.
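Python's f-strings (3.6+) do the same: the braces take arbitrary expressions, not just identifiers. eg:

    x, xs = 2, [1, 2, 3]
    print(f"{x + 1} of {len(xs)}")  # 3 of 3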

---

more on js es6 destructuring, and how it matches what we want pretty well:

"

andrewstuart2 2 days ago

I really like destructuring, but I worry about arrays becoming popular for multiple returns. If I'm using some other module, I really don't want to have to remember the order in which the parameters come, or be forced to check documentation or implementation to know for sure.

With an object, I can use a console log or a debug break point and immediately see the key value pairs (hopefully well-named) without having to check the function implementation or docs to see in which order the authors decided to return (or yield, etc.) values.

Can we just agree now not to use array destructuring as an excuse for returning multiple disparate values in an array? Or at least in only some idiomatic form? [error, value] would actually make sense to me, as it follows popular Node conventions.

reply

pornel 2 days ago

It's not a problem if you use objects, and the syntax is essentially the same:

    const {error, value} = (function(){ 
        return {value, error};
    })();

reply

andrewstuart2 2 days ago

Yeah, that's definitely the alternative I'm advocating for.

I've seen many an ES6 blogger decide to use arrays in their examples. It's not quite so grievous in the context of a throwaway example, but for real world usage, it might be more costly to figure out a month or two down the road.

reply

bzbarsky 2 days ago

I'm not sure why you think local variable names are mattering here...

What's happening is that this expresion:

    const { error, value } = obj;

where obj is an object will assign obj.error to error and obj.value to value.

You could name your locals something else if you wanted, of course:

    const { error: myError, value: myValue } = obj;

will assign obj.error to myError and obj.value to myValue. There just happens to be a shorter syntax for the common

   const { error: error, value: value } = obj;

case.

reply

jrochkind1 2 days ago

Ah, it's a slice, I get it now, thanks.

reply

"

---

so i'd like some syntactic sugar for that 'type annotation' (sugar for annotation of type 'type annotation'). '::' comes to mind. ':' is taken as a generic syntactic label, the AST equivalent of named parameter passing, eg 'then:', 'else:', etc.

It would be nice if it were easier to type than ::. What are the alternatives? Let's review again the unshifted punctuation:

`-=;,'./\

none of these seem too promising as single chars, except mb ` or .:

` mb
= assignment
; EOL
, tuple/optional binding, grouping
' maybe/option types
. mb GET with tight grouping (f.x = (f x))
/ ->
\ escape next character

for double chars, mb ``, --, ==, ..:

`` mb
-- mb, but i was hoping this is decrement
== equivalence
;; EOL on next dimension
,, grouping mb
.. mb, but mb ellipses/range operator
// comment-to-EOL
\\ escaped '\'

so far the only good candidates seem to be:

so far looks like :: is winning

---

you should be able to declare the types of fn arguments within the argspec, rather than only annotating the whole fn a la haskell; eg def f(a :: int, b :: int) :: c should be equiv to f(a,b) with f :: int -> int -> c
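Python 3 already spells per-argument annotations this way (semantics aside):

    def f(a: int, b: int) -> str:
        return str(a + b)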

---

f(a,b :: int) is a golang shortcut for f(a :: int, b :: int). Do we want to do this? Probably not; it appears to add complexity.

---

golang has a '%T' println specifier which prints the type of something, useful for debugging (eg https://tour.golang.org/basics/11 )

---

are golang println multiple args (if given) auto space separated? https://tour.golang.org/basics/15

even if not, we might want to do that, that's cool

---

since labels are now first-class, might need syntax to 'quote' labels, eg turn them into literals that can be assigned to variables

---

implicit 'return' at the end of a fn, like Ruby

(if you want to return multiple values, can u use commas in the implicit return line? i think so)

---

only allow up to 3 unnamed return vals, just like only up to 3 unnamed positional input args

---

in the presence of currying, you cant rly enforce max 3 positional args :( but u can still try... eg disallow the form "f w x y z", force the user to do something like "g = f w x y; g z"

---

if, in shell scripts, pipes and files are like channels in other HLLs, then whats the advantage of using text files and pipes? you can compose channels in a HLL (easily? probably..)

mb part of it is syntactical ease of serialization; want to save any (textual) 'data structure' to a file? just do "... > /tmp/savefile.txt".

Similarly, want to read in any (textual) data structure to a file? Just do "cat /tmp/savefile.txt > ...".

These are easier than "import cPickle; cPickle.dump(..., open(filename, 'w'))" and "import cPickle; cPickle.load(open(filename, 'rb'))".

---

one reason to just make both '' and "" quotes is to make it easy to quote one-liners in shell scripts and other languages; depending on whether the quoting language is quoting with " or ', it would be nice to be able to use the other one within the one-liner, to avoid having to escape ' or " within the one-liner.

nah... ' is really valuable b/c it's unshifted; shell users can just live with having to escape stuff ('"'"' for ' and \" for ")

---

https://tour.golang.org/flowcontrol/11 : when 'switch' is not given an argument, it is the same as 'switch (True)', which makes it useful as an if-then-elsif-elsif-elsif-... construct

---

SPIN Promela uses bare expressions as syntax not for assertions (as we were thinking), but to block until a condition is true. Eg the statement "finish == 2;" would cause this thread to block until finish == 2.

---

SPIN Promela has some other interesting constructs (copied from plChProofLangs):

'do' (do-od): a looping construct. Presumably loops until the command 'break' is executed. Like some forms of 'switch', it has multiple branches (delimited by '::'), and each branch has a guard (separated from the branch code by '->'). I think if multiple branch guards are True, then a non-deterministic choice is made. I think if all other branch guards are false, the 'else' guard is True.

bare expressions for blocking: when a bare expression, such as 'finish == 2', is encountered, the thread blocks until the expression returns True

'assert': assertions work as usual, but in addition, since Promela is a language for SPIN the model-checker, assertions are recognized by SPIN

'[]' is 'always': eg '[]!deadlock' means 'it is always true that deadlock is false'

---

i like SPIN Promela's 'do'. It manages to combine (non-deterministic) 'switch' and 'while' into a single thing (although it should be generalized to Go's switch, eg where a value is given in the head and the guards try to match that value; if the head value is omitted then the default is 'True', in which case the guards are boolean expressions that open when the result of their expression is True)

note again that in Oot we want a modal operator to specify whether a non-deterministic switch where multiple guards are open is Any (one open guard is non-deterministically chosen and the corresponding branch is executed) or All (all branches corresponding to all open guards are concurrently executed)
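a Python sketch of the two modes (names invented; the All mode is run sequentially here, though conceptually it would be concurrent):

    import random

    def guarded_switch(guards, mode='any'):
        # guards is a list of (condition, branch) thunk pairs
        open_branches = [branch for cond, branch in guards if cond()]
        if mode == 'any':        # pick one open branch non-deterministically
            if open_branches:
                random.choice(open_branches)()
        elif mode == 'all':      # execute every open branch
            for branch in open_branches:
                branch()

    x = 7
    guarded_switch([
        (lambda: x > 5, lambda: print('big')),
        (lambda: x % 2 == 1, lambda: print('odd')),
    ], mode='all')  # big, then odd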

(but in Oot we dislike keywords, right, so can we think of some punctuation for 'break'?)

actually, i guess Promela 'do' is just a composition of 'while' and 'switch', so stick with separate 'while' and 'switch'

---

http://spinroot.com/gerard/pdf/P10.pdf suggests loops with upper bounds, eg making everything a 'for' loop (sorta; actually they allow 'while' loops but with an attached upper bound). This suggests that Oot should indeed have a 'for' loop, to make it obvious when there is a simple fixed bound (rather than just a 'while' loop). i don't see the need for C-style for loops where the loop termination condition and increment operation are custom; just use 'while' for that.

(do we want to use Python's way of actually only having a foreach? but in that case i certainly want more concise syntactic sugar for 'arange(10)')

---

some uses for 'regions' (region annotations):

and related things:

---

we want some punctuation syntactic sugar for assertions

---

we can do better than Perl by making @ARGV even easier to use in oneliners, and by making it easy to do 'perl -e $a = $ARGV[1]; $a =~ /bob(alice.*)/; print $a;' type stuff (i havent debugged that, it may be wrong, but u get the idea) (altho mb there is already a Perl shortcut for that)

---

y'know, now i kinda want to use '//' to separate guards from branches in a 'switch', or other DSL-defined constructs; because this is sort of like '/' (both are shorter versions of '->' in some other languages)

but then we'd need to choose another EOL commenting syntax

could do ';;'; but i kind of want to save ';;' as a separator that automatically puts {} around the things on either side of it, eg

a ;; b == {a} {b}

but do we really need that? we already have:

if: condition then: stuff elif: other stuff else: other stuff

maybe we should fall back to using '--' for comments; it's not as good looking (because it's easier to miss than '//') but maybe it's better, given the above concerns

but then we can't use -- as a synonym for 'dec' (decrement)

---

void main() or not? i guess we want 'not', to reduce boilerplate (and friction for newbies); but otoh we want static typing, so conceptually we probably want all subfunctions to all be defined at the same time, rather than interleaved with statements of toplevel (implicit main()). So let's just say that: the proper form of the program is one in which all statements in the toplevel, aside from fn defns and imports, are below the last fn definition and import; and this form is equivalent to interleaving main().

--

should take a look at https://en.wikibooks.org/wiki/Perl_6_Programming/Meta_Operators

looks like in Perl6, [f] lst is reduce f over list lst, and the hyper metaoperators give elementwise application, eg l1 >>f<< l2 is map f over zip(l1,l2), and something like lst >>+>> 3 zips the scalar 3 against every element (more of these are in footnotes 1-3 below). i find i would have used this a lot in my numpy code.

also, since we ARE currying, a separate 'zip' metaoperator will be useful. Also, i think sequential reduce is probably useful.

so we want: map, map/elementwise vs scalar, reduce, sequential reduce, zipWith
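in Python terms, a sketch of the five wanted combinators (the Oot spelling of them is still open):

    from functools import reduce
    from itertools import accumulate

    l1, l2 = [1, 2, 3], [10, 20, 30]

    list(map(abs, [-1, -2]))               # map
    [x + 3 for x in l1]                    # elementwise op with a scalar
    reduce(lambda a, b: a + b, l1)         # reduce (sequential in Python)
    list(accumulate(l1))                   # triangle/sequential reduce: [1, 3, 6]
    list(map(lambda a, b: a + b, l1, l2))  # zipWith: [11, 22, 33]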

--

perl6 probably is onto something when they make the map and reduce metaoperators matched/paired delimiters, rather than single characters. But we dont want to blow too much matched delimiter syntax on this. How about we use our parameterized parens, eg something like:

e( )e for elementwise
m( )m for map
r( )r for reduce

or something like that.. and then also provide syntax for the individual case, eg e+f (or e.f, whatever punctuation char is chosen) as short for e(f)e, etc

eh, e( )e is too long to type, mb just e( )

--

i guess if J-style forks ((V0 V1 V2) Ny is the same as (V0(Ny)) V1 (V2(Ny))) are so common that they are a syntactic default in J, then we should have syntax for them too.
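a tiny Python rendering of a (monadic) fork, with invented names:

    from operator import truediv

    def fork(v0, v1, v2):
        # (V0 V1 V2) y  ==  V1(V0(y), V2(y))
        return lambda y: v1(v0(y), v2(y))

    avg = fork(sum, truediv, len)  # J's (+/ % #): sum divided by count
    print(avg([1, 2, 3, 4]))       # 2.5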

--

J has something called 'whilst' which is like repeat..until except with the condition at the beginning, like 'while':

http://jsoftware.com/help/dictionary/cwhile.htm

i think we should have that
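ie a do-while: the body runs before the condition is first tested. a sketch of how Python spells it with while True:

    n = 0
    while True:
        n += 1              # body always runs at least once
        if not (n < 3):     # condition tested only after the body, like whilst
            break
    print(n)  # 3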

--

we could reserve upper-case single letters for all sorts of language-defined things, rather than just as magic variables

this would let us use them for metaoperators, etc

--

for the most part, we want punctuation to only be for things with syntactic significance, not for ordinary identifiers

but there are some things that are really common, like plus, minus, length, that we might want to assign punctuation to. in addition, it is conventional to use punctuation for +,-,*, get, set

if we wanted to go hardcore ideological, though, we could insist that uppercase single letters must be used for non-syntactic things in the base language

some languages use punctuation to indicate identifiers which are binary operators. do we want to do that?

i think we do want to let users define custom punctuation binary ops (but no custom precedence)

so, at least, we should clearly note which punctuation characters have syntactic significance, and which are merely ordinary identifiers.

--

.a (prefix a) might be short for "lambda x . x.a", to make the meaning related to "x.a"
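Python has this as a library function rather than syntax:

    from operator import attrgetter
    from types import SimpleNamespace

    get_a = attrgetter('a')             # roughly: lambda x: x.a
    print(get_a(SimpleNamespace(a=1)))  # 1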

--

ok if we want 'if' to also be 'switch', and we want 'switch' to be a special case of a general mechanism for things with a head expression and args, each arm with a guard expression, and keywords break and continue and else, then how do we do this with our ':' syntax for named arguments?

if expr stuff

if expr stuff else stuffe

if switch_expr case1 stuff1 case2 stuff2 else stuffe

if expr1 stuff1 expr2 stuff2 else stuffe

so here, in addition to positional and named arguments, 'if' is being given a varagin of labeled arguments, where each block is labeled by its guard

couldn't we just use : and let 'if' take a varargin dict?

if expr: stuff

if expr: stuff else: stuffe

if switch_expr case1: stuff1 case2: stuff2 else: stuffe

if expr1: stuff1 expr2: stuff2 else: stuffe

how do we know when the 'if' is done? we need to enclose something in a block or parens. Block or parens? Does the 'if' go inside the block or outside of it? How about switch_expr?

the most obvious thing to do seems to be to leave the if and the switch_expr outside of the block:

if {expr: stuff}

if {expr: stuff else: stuffe }

if switch_expr { case1: stuff1 case2: stuff2 else: stuffe }

if {expr1: stuff1 expr2: stuff2 else: stuffe }

but this is annoyingly verbose for the common case of a one-liner if {expr: stuff}. So mb allow parens or blocks?

if expr: stuff (== (if expr: stuff) because lines are auto-enclosed in parens)

if {expr: stuff else: stuffe }

if switch_expr { case1: stuff1 case2: stuff2 else: stuffe }

if {expr1: stuff1 expr2: stuff2 else: stuffe }

but ppl will make mistakes where they say

if expr: stuff else: stuffe

other stuff that isnt meant to be part of the if

so terminate this when you come to a paragraph break (a block) (i guess we're already doing this because paragraph breaks are implicit blocks)

and what about:

if switch_expr

  case1: stuff1
  case2: stuff2
  else: stuffe
  other stuff that isnt meant to be part of the switch

so make it an error to have an if that ranges over multiple blocks without an explicit {} (i guess we're already doing this because it would compile to:

{if switch_expr}

  {case1: stuff1}
  {case2: stuff2}
  {else: stuffe}
  {other stuff that isnt meant to be part of the switch}

and then the case1: in the second block would be an argument without a function

but wait if parens terminate then

if expr: stuff else: stuffe

if switch_expr case1: stuff1 case2: stuff2 else: stuffe

if expr1: stuff1 expr2: stuff2 else: stuffe

wont be legal either

yknow, i think that's fine. It has to be:

if (expr: stuff else: stuffe

if (switch_expr case1: stuff1 case2: stuff2 else: stuffe

if (expr1: stuff1 expr2: stuff2 else: stuffe

another alternative is to make 'if' short for '(if', have the block encompass the if (like in Haskell/MLish and Lisp syntax), and then just have ppl put ifs in their own paragraph if they dont wanna bother to close. Or, is that the same thing as what we meant earlier by 'if::'?

or we could use :if

if expr: stuff stuff contd

if expr: stuff stuff contd else: stuffe stuffe contd

if switch_expr case1: stuff1 stuff1 contd case2: stuff2 stuff2 contd else: stuffe stuffe contd

if expr1: stuff1 stuff1 contd expr2: stuff2 stuff2 contd else: stuffe stuffe contd

todo look at that other long analysis i did of that sort of stuff and reconcile it with this one

--

probably should reserve a prefix for metaprogrammable punctuation

--

want 'kelps' as a DSL/metaprogramming syntactic construct

--

by default, within a data constructor ('[]'), spaces separate elements (commas not needed); to treat spaces as function application (as they are treated outside of data constructors), enclose in parens

--

numpy 'flipud' is implemented as (from /usr/lib/python2.7/dist-packages/numpy/lib/twodim_base.py ):

m[::-1, ...]

'fliplr' is:

m[:, ::-1]

nice... we should have negative slices, :: in slices, and ... in slices
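checking those against the slice spellings (assumes numpy):

    import numpy as np

    m = np.arange(6).reshape(2, 3)
    assert (m[::-1, ...] == np.flipud(m)).all()  # negative step reverses rows
    assert (m[:, ::-1] == np.fliplr(m)).all()    # ...or columns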

--

saying the following in Octave:

[3 size(raster)];

is more concise than the Python version (unless i'm missing a shorter Python version, which i probably am):

hstack([3, shape(raster)])
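for the record, the Python can be tightened a little (a sketch, assuming numpy; raster is a stand-in array):

    import numpy as np

    raster = np.zeros((4, 5))
    print(np.hstack([3, raster.shape]))  # [3 4 5]
    print((3,) + raster.shape)           # (3, 4, 5): plain tuple concat, shorter still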

--

allow assertions of complexity (eg, 'compiler, trust me, i promise that this function is O(x^2) in that input'), as well as statically proven complexity annotations (eg, 'compiler, this function is O(x^2) in that input, please prove that'). However, since we have pluggable type systems, sometimes the type system being used won't be powerful enough to prove a complexity annotation, in which case it could be treated as an assertion. Similarly, some systems won't be able to check all types of assertions at runtime either (how do you check that something was O(x^2) from a single run?), so many of these assertions turn into black-box annotations that have no actual effect.

--

there should be syntax (or at least a facility) to say 'call this function, passing in the same parameter arguments that were given to the current function, except for the following overrides: ...'. This would be useful in recursive calls, and also in wrappers. Hoon has a rune for this

--

do we want special syntax for

do we want punctuation syntactic prefix/suffix for:

--

in golang, there is the variable declaration construct: var name typeannotation

you can also group many names with the same annotation:

  var name1, name2 typeannotation

oot should generalize this construct for DSLs:

--

in Golang, if a map's type is declared to have values of a certain type, you can omit the type name when constructing those values within the map literal:

https://tour.golang.org/moretypes/17

--

in Rust:

Mr_T_ 8 hours ago

The 'String', '~str' dichotomy is really unfortunate.

reply

pcwalton 7 hours ago

Presumably you mean String/&str? (~str hasn't existed for a long time.)

On the contrary, the lack of the string/string-view distinction is widely considered a flaw in the C++ standard libraries. C++14 introduces std::string_view to correct it. Rust's String/str distinction is just the same as C++14's std::string/std::string_view.

reply

frowaway001 7 hours ago

Apart from the Rust's usual rule to invent some abbreviations, but only apply them half the time.

Then "types start with an uppercase letter" except when the language creators break their own rule.

Then the fun "sometimes we use () and sometimes we use [] to do practically the same thing".

Then the various insanities with ::.

Then Generics. How many examples of languages which tried to use <> for Generics do we actually need before people learn the lesson?

I really wished some people took Rust, and wrote a new frontend to get rid of all the pointless inconsistencies and other "C did, so it has to be good"-isms (like_all_this_underscore_stuff or the ridiculous abbreviations), because many ideas of Rust are not only good but brilliant and deserve to be put into languages which don't belong to the purely-academic category.

I really wish Rust to succeed, but can we stop this approach of "3 steps forward and 2 steps backward" to language design?

reply

pcwalton 7 hours ago

> Apart from the Rust's usual rule to invent some abbreviations, but only apply them half the time.

Do you have an example?

> Then "types start with an uppercase letter" except when the language creators break their own rule.

They always start with a capital letter, except for built-in types or FFI bindings to libraries such as libc where the original types used a lowercase letter. This convention is exactly the same as Java.

> Then the fun "sometimes we use () and sometimes we use [] to do practically the same thing".

I've never heard this before. Do you have an example? () is used for function calls, grouping, and tuples, while [] is used for array literals and array indexing.

> Then the various insanities with ::.

The double-colon is very useful so that you can tell the difference between module lookup and field projection/method calling. Early Rust used . for module lookup, and it was confusing to tell at a glance whether a function was being called or a method was being invoked.

> Then Generics. How many examples of languages which tried to use <> for Generics do we actually need before people learn the lesson?

Using square brackets like Scala wouldn't actually help with the ambiguity, because it would be ambiguous with array indexing. The only syntax I've seen that actually solves the ambiguity is D's `!()`, which was deemed too unfamiliar. Angle brackets were chosen because they are familiar and aren't really any worse than the other mainstream options.

> I really wished some people took Rust, and wrote a new frontend to get rid of all the pointless inconsistencies and other "C did, so it has to be good"-isms (like_all_this_underscore_stuff or the ridiculous abbreviations)

The underscore naming convention was actually taken from Python's PEP 8. Rust doesn't really have any more abbreviations than most other languages at this point.

reply

callahad 6 hours ago

> > Then the fun "sometimes we use () and sometimes we use [] to do practically the same thing".

> I've never heard this before. Do you have an example? () is used for function calls, grouping, and tuples, while [] is used for array literals and array indexing.

Maybe referring to the ability to use either type of brace with macros?

reply

pcwalton 6 hours ago

Yeah, you can use any type of delimiter with macros, but I think that's an important feature, especially since macros can expand to declarations. Early versions of Rust required parentheses for all macro invocations, and that looked really ugly for anything that looked like a block. Delimiter interchangeability is a feature from Racket that works really well when it comes to macros and I'd hate to give it up.

reply

blaenk 5 hours ago

Agreed. They allow you to naturally create vectors with the bracket delimiter `vec![1, 2, 3]`, create spawn-like macros with the brace delimiter `spawn { code }`, and in the normal case just use parentheses `println!("testing")`

reply

alricb 2 hours ago

What about '{}'? It's a nice pair of opening/closing characters.

reply

frowaway001 3 hours ago

Considering the glorious downvoting, I won't bother comment any further except saying that these "issues" Rust devs keep pointing out have been addressed.

And now, I won't waste any further time with this cult.

reply

Daishiman 26 minutes ago

It's so sad, because I yet to see a single place where they're addressed.

reply

mbrubeck 7 hours ago

In this example the dichotomy is between String (which is guaranteed by the type system to be valid UTF-8) and OsStr (which might be in an unknown encoding or otherwise not decodable to valid Unicode).

This is exactly when you want a systems language to require explicit conversions, rather than converting things silently and possibly losing or corrupting data.

reply

davvid 6 hours ago

rather than converting things silently and possibly losing or corrupting data

Exactly. Python3 went down the "silently converting" route, and it's not pretty[1]. I would go so far as to call it harmful.

http://lucumr.pocoo.org/2014/5/12/everything-about-unicode/

I understand the difficulty in this space; much of it is caused by forcing the Windows unicode filesystem API onto python as its world-view, rather than sticking to the traditional Unix bytes world-view. I'm unixy, so I'm completely biased, but I think adopting the Windows approach is fundamentally broken.

reply

Veedrac 5 hours ago

The problem there is overblown - it's basically all due to the idea that sys.stdin or sys.stdout might get replaced with streams that don't have a binary buffer. The simple solution is just not to do that (and it's pretty easy; instead of replacing with a StringIO, replace it with a wrapped binary buffer). Then the code is quite simple

    import sys
    import shutil
    for filename in sys.argv[1:]:
        if filename != '-':
            try:
                f = open(filename, 'rb')
            except IOError as err:
                msg = 'cat.py: {}: {}\n'.format(filename, err)
                sys.stderr.buffer.write(msg.encode('utf8', 'surrogateescape'))
                continue
        else:
            f = sys.stdin.buffer
        with f:
            shutil.copyfileobj(f, sys.stdout.buffer)

Python's surrogateescape'd strings aren't the best solution, but I personally believe that treating unicode output streams as binary ones is even worse.

reply

Jweb_Guru 8 hours ago

There is no ~str, but when there was it was largely identical to what String is now.

reply

--

" They conclude that representing code as data is a Bad Idea and go back to writing parsers.

But they're wrong.

The reason that code represented as XML or JSON looks horrible is not because representing code as data is a bad idea, but because XML and JSON are badly designed serialization formats. And the reason they are badly designed is very simple: too much punctuation. And, in the case of XML, too much redundancy. The reason Lisp succeeds in representing code as data where other syntaxes fail is that S-expression syntax is a well-designed serialization format, and the reason it's well designed is that it is minimal. Compare:

    XML: <list><item>abc</item><item>pqr</item><item>xyz</item></list>
    JSON: ['abc', 'pqr', 'xyz'] 
    S-expression: (abc pqr xyz)

The horrible bloatedness of XML is obvious even in this simple example. The difference between JSON and S-expressions is a little more subtle, but consider: this is a valid S-expression:

    (for x in foo collect (f x))

The JSON equivalent is:

    ['for', 'x', 'in', 'foo', 'collect', ['f', 'x']]... The reason that Lisp is so cool and powerful is that the intuition that leads people to try to represent code as data is actually correct. It is an incredibly powerful lever. Among other things, it makes writing interpreters and compilers really easy, and so inventing new languages and writing interpreters and compilers for them becomes as much a part of day-to-day Lisp programming as writing parsers is business as usual in the C world. But to make it work you must start with the right syntax for representing code and data, which means you must start with a minimal syntax for representing code and data, because anything else will drown you in a sea of commas, quotes and angle brackets.

Which means you have to start with S-expressions, because they are the minimal syntax for representing hierarchical data. Think about it: to represent hierarchical data you need two syntactic elements: a token separator and a block delimiter. In S expressions, whitespace is the token separator and parens are the block delimiters. ... Other languages have different block delimiters depending on the kind of block being delimited. The C family, for example, has () for argument lists and sub-expressions, [] for arrays, {} for code blocks and dictionaries. It also uses commas and semicolons as block delimiters. If you compare apples and apples, Lisp usually has fewer block delimiters than C-like languages. Javascript in particular, where callbacks are ubiquitous, often gets mired in deep delimiter doo doo, and then it becomes a cognitive burden on the programmer to figure out the right delimiter to put in depending on the context. Lisp programmers never have to worry about such things: if you want to close a block, you type a ")". "

-- http://blog.rongarret.info/2015/05/why-lisp.html

--

"

WalterGR 8 minutes ago

    Macros are what made lisp, not parentheses.

The parentheses are huge, though. It enables "Code is data" and "Data is code" in a powerful way.

About Lisp, people always say "Macros are great" and "Code is data" and "Data is code", but it's hard to see what they mean without good examples. I mean, you can write code that writes code in any language that has a `print` statement. And obviously code is data, so what is that aside from some pseudo-philosophic BS?

There's a lot of discussion in this thread about macros - so here's an example of Code and Data Being One, for those that are unconvinced. I hope it'll shed some light.

I have a slang dictionary website. It's backed by a database now, but it used to be statically-generated HTML. I represented the data as XML. It looked something like this:

    <Term term="slick">
      <PartOfSpeech pos="adj">
        <Definition>
          impressive.
        </Definition>
        <Definition>
          smart.
        </Definition>
      </PartOfSpeech>
    </Term>

So then I needed an XML parser to parse the data. Maybe it parsed the XML into a object model that the code would navigate and output the appropriate HTML. Or maybe the code got callbacks according node type and would output the appropriate HTML then.

XML is a pretty verbose format, so - this being a Lisp example - we could probably save some typing if we represented the data as an S-expression. The above XML would become something like this:

    (Term "slick"
      (PartOfSpeech "adj"
        (Definition
          "impressive."
        )
        (Definition
          "smart."
        )
      )
    )

But that's an idiosyncratic style.

So, let's lowercase the node types, use hyphens rather than camelCase, and remove the unnecessary line breaks. Now the data looks like this:

    (term "slick"
      (part-of-speech "adj"
        (definition "impressive.")
        (definition "smart.")))

Great. We've got our data.

Now we need to write the Lisp code to convert that S-expression data into the appropriate HTML output.

We'll need some Lisp functions to handle the nodes and their attributes (such as the definition text and part of speech) and write the HTML for them. We could use an S-expression parser library to load the data, and then walk through it and call those functions. But that's not necessarily the best way. We can simplify it by creating exactly 3 functions that take some arguments and output HTML: term, part-of-speech, and definition.

Since Lisp code is - like the data - also represented as S-expressions, once we've written those 3 functions, the data is literally executable Lisp code.

reply " -- https://news.ycombinator.com/item?id=9509037

--

Go puts compiler directives inside comments, rather than having a separate syntax for them [2]. As of this writing, Go will even run directives from 'comments' within a multiline string literal [3]

Oot needs to have separate syntaxes for compiler directives, and comments, and metadata [4]

mb should also have a sub-syntax for non-standard compiler directives and non-std metadata too?


are varargs really needed/good for anything besides saving two keystrokes by not having to enclose lists with list constructors? i notice that i get along okay in Python without using them much (i mean the *args, not the kw; i use 'vararg' keywords a lot). It seems to me that if you wanted a vararg, you could always take an argument with a list in it instead.

(maybe this is a question for LambdaTheUltimate?)


in Python 2, the syntax 'print x,' is very convenient for not printing a newline (Python 3 spells it print(x, end=' '))

it would be nice to have this sort of 'one character flag syntactic sugar' idea generalized; although on the other hand in this case the comma just seems to fit with its function (like a comma in a list, it means, wait, there's more coming) and so is easier to remember.

---

in python i often find myself doing eg:

{'section_number': section_number, 'section_dataset_id': section_dataset_id, }

there should be a shortcut for this so i dont have to type the variable names twice
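a Python workaround sketch in the meantime (the helper 'pick' is invented; JS ES6 gets this with shorthand properties, {section_number, section_dataset_id}):

    def pick(mapping, *names):
        # keys are only typed once
        return {k: mapping[k] for k in names}

    section_number, section_dataset_id = 7, 'abc'
    d = pick(locals(), 'section_number', 'section_dataset_id')
    print(d)  # {'section_number': 7, 'section_dataset_id': 'abc'}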

---


Footnotes:

1.

 f<< lst is (parallelizable) map f over lst

2.

 l1 >>f<< l2 is map f over zip(l1,l2)

3.

 l1 >>f<<= l2 is l1 = map f over zip(l1,l2); l1 X l2 = cross product of l1 and l2; postfix = is the metaoperator that makes += do the usual thing; R is the 'reverse' metaoperator; S is the 'sequential' metaoperator (like reduction, but executed sequentially, rather than in parallel); Z is the zipWith metaoperator (which in non-currying Perl is like map, but evaluated lazily); [\f] is the triangle-reduce metaoperator

some examples: https://perl6advent.wordpress.com/2010/12/06/the-x-and-z-metaoperators/

todo: read also:

they also have some ways to do a zip of two differently-sized lists and extend one or the other before a map, and a shortcut for taking the cross product and then mapping over it

i think what we need in Oot is just map and reduce. The cross product is just an ordinary fn (we dont need syntactic sugar just for mapping over the result of a cross) or for extending the zips of differently-sized lists inside a map.

note: the above is all subtly wrong b/c in a currying language, you don't apply + to (1,2)

on second thought, there is one place in which the zip extension is useful; elementwise operation with a scalar; eg what in Octave would be 'lst .+ 3' here would be something like 'map + zip(tile(3, (len(lst))), lst)'. In perl6 you have @lst