proj-jasper-jasperSyntaxNotes1

"Even Herb Sutter, of C++ fame agrees:

> One of the things Go does that I would love C++ to do is a complete left-to-right declaration syntax. That is a good thing because the left-to-right makes you end up in a place where you have no ambiguities, you can read your code in a more strait-forward way, which also makes it more toolable.

(6:00) http://channel9.msdn.com/Shows/Going+Deep/Herb-Sutter-C-Ques... "

--

http://en.wikipedia.org/wiki/Most_vexing_parse

C var decl syntax is (unless you know all the detailed rules) ambiguous because, since you can do initialization and declaration in the same line, sometimes it gets confusing, e.g.

  TimeKeeper time_keeper(Timer());

is that declaring an instance named time_keeper of class TimeKeeper and calling its constructor, or is it "a function declaration for a function time_keeper which returns an object of type TimeKeeper and takes a single (unnamed) argument which is a function returning type Timer (and taking no input)"?

--

could maybe do const/let and #ifdef/if distinction using a 'when to evaluate' sigil?

--

ambiguous precedence can be ignored if all relevant ops are marked associative with each other

--

" This points out that designing a syntax after a keyboard is a tricky business. Some national layouts make it clear that they are made with no consideration of programming needs what so ever. I definitely think there should be several layouts tuned to particular needs (but similar enough to still be somewhat usable by different users). Programmers use the national characters (such as åäö) less often than operators, so why are we forced to work on keyboards that have dedicated keys for (åäö) but squeeze up to 3 important operator-characters on other keys? For your information, here are some examples (plain, shift, altgr): 2 " @ 7 / { 8 ( [ 9 ) ] 0 = } + ? \ < >

¨ ^ ~ -all of them are normally "dead" meaning they get attached to other characters so you need to type them twice and then delete one to get a single one. "
-this sucks particularly if you work with a shell

"

    Here are the only special characters I can get with single key press without going to numpad ,.-'+§<
    It sucks to code with Finnish keyboard layout. Swedes probably use same. 3 extra vowels make it hard. Hmm.
    Hmmm. This discussion makes me think that should I configure full custom keyboard layout for coding.
    If getting rid of that shift is such a great advantage in programming."

"

    As a curiosity:
    The only characters that are available unshifted on (practically) all keyboards are letters a-z, digits 0-9 punctuation ,. and addition/subtraction +-
    For all other characters, there is a very large percentage of keyboards where some form of shifting is necessary.
    All the paragraphs that mention () being vastly superior to [] are therefore factually incorrect.
    So yeah, that's silly :-)"

--

since punctuation is hard to search for using std tools, perhaps require punctuation infix operator assignments to be imported individually rather than with an import *

--

mb prohibit defining symbolic operators without first defining an alphanumeric function to bind to them

--

consider scala's syntax for anonymous functions:

l.map( x => x*2 )

note that i think the same syntax is reused for function types, e.g.

(fun: List[T] => T)

--

i think golang has an LALR(1) grammar?

--

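LL(k) \subset LR(k), and LALR(1) \subset LR(1) \subset GLR (GLR handles any context-free grammar). note that LL(1) is not a subset of LALR(1): some LL(1) grammars are not LALR(1)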

--

apparently if you have an LL(k) grammar you can parse it in linear time without backtracking by a recursive descent parser (a recursive descent parser that never backtracks is also called a 'predictive parser'), which is intuitive and easy to write. this sounds like a good idea to make it easy for others to write macros by hand (otoh by that time they get the AST, right, so it doesn't really matter?). ANTLR also does LL(*).
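to make the 'predictive parser' idea concrete, here is a minimal sketch in Python for a toy LL(1) grammar (expr := term (('+' | '-') term)*, term := NUMBER | '(' expr ')'). The grammar and all names are made up for illustration and are not Jasper syntax; the point is only that each function decides what to do by peeking at one token and never backs up:

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+()])")

def tokenize(text):
    """Split the input into number and punctuation tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    """One method per nonterminal; one token of lookahead; no backtracking."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = (op, node, self.term())
        return node

    def term(self):
        # term := NUMBER | '(' expr ')'
        token = self.take()
        if token == "(":
            node = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return tree

print(parse("1 + (2 - 3)"))
```

every token is consumed exactly once, which is where the linear running time comes from.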

so i guess we should make Jasper LL(k) if possible, or LALR(1) if not.

what is scheme? i bet it's LL(*). not sure tho.

golang also makes a big deal about being parsable without a symbol table. i guess that's important for efficiency.

--

http://programmers.stackexchange.com/questions/19541/what-are-the-main-advantages-and-disadvantages-of-ll-and-lr-parsing

--

an argument for block syntax that i don't quite understand:

" munificent 10/21/10 Hi, I'm the author of that post.

On Oct 21, 6:13 am, Corey Thomasson <cthom.li...@gmail.com> wrote: > block arguments for example, are just another form of closures, which go has

I'm aware of that. Wasn't that clear from the article? It was under the "syntax" section and specifically says that the parser will desugar it to a regular anonymous function. The goal here isn't to change semantics, it's to add some syntactic support so that scoped behavior doesn't look so awkward. I think a little syntax can go a long way towards encouraging something to be idiomatic.

--

"

C# pulled it off. There's nothing magical at all about operator overloading, with the exception of the assignment operation, which

--

bwk complains that Pascal http://www.lysator.liu.se/c/bwk-on-pascal.html has too few levels: " while (i <= XMAX) and (x[i] > 0) do ...

...

By the way, the parentheses in this code are mandatory - the language has only four levels of operator precedence, with relationals at the bottom. ... There is a paucity of operators (probably related to the paucity of precedence levels).

"

so maybe we need more than 4 levels but less than 10

list of a bunch of languages and their precedence:

http://rosettacode.org/wiki/Operator_precedence

Go has 6 levels, mb that's good

But i would say it has at least 7 levels:

Go's levels are:

    Precedence    Operator
    6             all unary operators
    5             *  /  %  <<  >>  &  &^
    4             +  -  |  ^
    3             ==  !=  <  <=  >  >=
    2             &&
    1             ||
    0             things that in Go form statements, not expressions, e.g. ++

in Go, % is remainder; << and >> and & and &^ are bitwise stuff; | and ^ are bitwise OR stuff; && is AND and || is OR. So the levels can be described as:

    6  unary
    5  multiplicative and most bitwise
    4  additive and OR-ish bitwise
    3  comparison
    2  AND
    1  OR
    0  things that in Go form statements, not expressions, e.g. ++
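a small table like this is enough to drive a whole expression parser. The following is a sketch of 'precedence climbing' in Python using Go's five binary levels; the tokenizer and function names are my own, unary operators and statements are left out, and everything is treated as left-associative:

```python
import re

# Go's binary precedence levels (5 binds tightest).
LEVELS = {
    "*": 5, "/": 5, "%": 5, "<<": 5, ">>": 5, "&": 5, "&^": 5,
    "+": 4, "-": 4, "|": 4, "^": 4,
    "==": 3, "!=": 3, "<": 3, "<=": 3, ">": 3, ">=": 3,
    "&&": 2,
    "||": 1,
}

# Longer operators are listed before their one-character prefixes.
TOKEN = re.compile(
    r"\s*(\d+|[A-Za-z_]\w*|&\^|&&|\|\||<<|>>|==|!=|<=|>=|[-+*/%&|^<>()])"
)

def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse_atom(tokens):
    """Parse a name, a number, or a parenthesized expression."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        node = parse(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return node
    if token in LEVELS or token == ")":
        raise SyntaxError(f"unexpected token {token!r}")
    return token

def parse(tokens, min_level=1):
    """Precedence climbing: consumes tokens from the front of the list."""
    left = parse_atom(tokens)
    while tokens and LEVELS.get(tokens[0], 0) >= min_level:
        op = tokens.pop(0)
        # Asking for a strictly higher level on the right-hand side
        # makes every operator left-associative.
        right = parse(tokens, LEVELS[op] + 1)
        left = (op, left, right)
    return left

print(parse(tokenize("a + b * c == d && e || f")))
```

adding or removing a precedence level is then just an edit to the LEVELS table, which is one argument for keeping the number of levels small enough to memorize.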

note: things with more levels are not always that bad, e.g. java is very orderly, and also needs more levels because it treats more things as 'operators':

http://introcs.cs.princeton.edu/java/11precedence/

some of the extra things that java deals with as operators are:

unary precedence:

    []   access array element
    .    access object member
    ()   invoke a method

unary precedence, but above the former, also, right to left (everything else mentioned here is left to right unless it says otherwise):

    ()   cast
    new  object creation

comparison precedence:

    instanceof   type comparison

lowest precedence (right-to-left):

    assignment

" Precedence order gone awry. Sometimes the precedence order defined in a language do not conform with mathematical norms. For example, in Microsoft Excel, -a^b is interpreted as (-a)^b instead of -(a^b). So -1^2 is equal to 1 instead of -1, which is the values most mathematicians would expect. Microsoft acknowledges this quirk as a "design choice". One wonders whether the programmer was relying on the C precedence order in which unary operators have higher precedence than binary operators. This rule agrees with mathematical conventions for all C operators, but fails with the addition of the exponentiation operator. Once the order was established in Microsoft Excel 2.0, it could not easily be changed without breaking backward compatibility. "
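for comparison, Python sides with the mathematical convention here: its ** operator binds tighter than a unary minus on its left. A quick check (variable names are mine):

```python
# In Python, -1 ** 2 is parsed as -(1 ** 2), unlike the Excel behaviour quoted above.
negated_power = -1 ** 2
power_of_negative = (-1) ** 2

print(negated_power)      # -1
print(power_of_negative)  # 1
```

so a language that adds exponentiation has to decide explicitly where it sits relative to the unary operators; it cannot just inherit the C rule.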

D has 15 levels, and C++ 17, and C 15 woah:

http://stackoverflow.com/questions/2669153/d-operator-precedence-levels-version-1-0

http://en.cppreference.com/w/cpp/language/operator_precedence

http://web.ics.purdue.edu/~cs240/misc/operators.html

--

function vs operator precedence in haskell: basically functions bind tighter, but note that they cannot bind to an operator at all (unless it is surrounded by parens):

http://stackoverflow.com/questions/3125395/haskell-operator-vs-function-precedence

    Prec-   Left associative      Non-associative            Right associative
    edence  operators             operators                  operators
    9       !!                                               .
    8                                                        ^, ^^, **
    7       *, /, `div`,
            `mod`, `rem`, `quot`
    6       +, -
    5                                                        :, ++
    4                             ==, /=, <, <=, >, >=,
                                  `elem`, `notElem`
    3                                                        &&
    2                                                        ||
    1       >>, >>=
    0                                                        $, $!, `seq`

-- http://www.haskell.org/onlinereport/decls.html#prelude-fixities

--

toread:

http://kevincantu.org/code/operators.html

http://echo.rsmw.net/n00bfaq.html

http://blog.psibi.in/2013/02/operator-precedence-and-associativity.html

https://www.fpcomplete.com/blog/2012/09/ten-things-you-should-know-about-haskell-syntax

--

arguments to annotations should work like arguments to functions, with defaults overridden by keyword args coming after positional args:

e.g. java can't do this but scala and .NET can:

" we can not mix-and-match the two styles in Java:

    @SourceURL(value = "http://coders.com/",
    mail = "support@coders.com")
    public class MyClass extends HisClass ...

Scala provides more flexibility in this respect

    @SourceURL("http://coders.com/",
    mail = "support@coders.com")
    class MyScalaClass ...

This extended syntax is consistent with .NET’s annotations and can accommodate their full capabilities "

-- http://docs.scala-lang.org/tutorials/tour/annotations.html
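a rough Python analogue of the behaviour wanted above, treating a decorator as a stand-in for an annotation. The names source_url, value and mail are borrowed from the quoted example; the decorator itself is hypothetical, just a sketch of 'positional first, then keyword overrides of defaults':

```python
def source_url(value, mail=None):
    """Hypothetical annotation: one positional argument plus an optional keyword."""
    def attach(cls):
        # Record the annotation arguments on the class so tools can read them later.
        cls.source_url = {"value": value, "mail": mail}
        return cls
    return attach

# Positional and keyword styles mixed in one annotation, as in the Scala example.
@source_url("http://coders.com/", mail="support@coders.com")
class MyClass:
    pass

# The default for 'mail' applies when the keyword is left out.
@source_url("http://coders.com/")
class OtherClass:
    pass

print(MyClass.source_url)
print(OtherClass.source_url)
```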

---

hoon's ?: for 'if' and ?- for 'case' look good

---

in clojure, syntax-quoting:

    (defmacro with-open-2 [[r resource] & forms]
      `(let [~r ~resource]
         (try ~@forms
           (finally (.close ~r)))))

(instead of:

    (defmacro with-open-1 [[r resource] & forms]
      (list 'let ;; clojure symbol -- an atom of code!
            [r resource]
            (concat (list 'try)
                    forms
                    (list (list 'finally (list '.close r))))))

)

---

nimrod's use of : for blocks is interesting; i guess that's significant indentation though?

---

the syntax of this is interesting to me, particularly the () surrounding a block whose output is desired, and the use of 'as' instead of = to bind a name to the value computed by that block

"
    select dashboards.name, log_counts.ct
    from dashboards
    join (
      select dashboard_id, count(distinct user_id) as ct
      from time_on_site_logs
      group by dashboard_id
    ) as log_counts
    on log_counts.dashboard_id = dashboards.id
    order by log_counts.ct desc
"

---

idea from Hoon:

In Hoon, the +-<> axis limb syntax for nested chains of head/tail is interesting, if perhaps too complicated as syntax ( http://urbit.org/doc/hoon/tut/2/ ). But perhaps we can generalize this anyways?

specify composition of head() and tail() by alternation of single characters, e.g.

++-- might mean "head head tail tail", reading right to left (Hoon works left to right and has a more complicated scheme but i prefer this)

generalize this: allow user to specify custom compositions of arbitrary things (functions? or more general?) with single letters

--

Want to say stuff like

A_(x+1, y) = 2A_(x,y)

--

wambotron 3 hours ago

Why use

public string $x = ;

instead of

public $x:string = ;

It seems inconsistent to me, probably because I've used AS3/Haxe.

---

having an indentation error when you cut and paste a single line into an ipython console really gets me:

    In [1458]: xx, yy = meshgrid(range(shape(areas)[0]), range(shape(areas)[1]))
    IndentationError: unexpected indent (<ipython-input-1458-f399ae675b48>, line 1)

If you want to paste code into IPython, try the %paste and %cpaste magic functions.

--

" Python indexing is done using brackets, so you can see the difference between an indexing operation and a function call. " -- http://lorenabarba.com/blog/why-i-push-for-python/

--

there is something to Hoon's idea of uniformly using prefixes to indicate different types of literals (as opposed to e.g. Python's "3." to indicate "3.0", a floating point literal; but Python's may be easier to learn)

--

there is something to Hoon's idea of providing URL-safe literal syntaxes

--

As a rule of thumb, if grouping punctuation characters can ever appear without spaces separating them from neighboring constructs, then one should never have grouping punctuation (such as '(') double as part of an operator (e.g. '~()').

--

To take a stab at the second question, I think BNF that fits on one page https://docs.python.org/3.4/reference/grammar.html vs http://perldoc.perl.org/perlfaq7.html#Can-I-get-a-BNF%2fyacc...

This basically means Perl is very complex and its grammar can be self contradicting, such that behavior is undefined. C++ has a similar problem to a lesser extent.

riffraff 1 hour ago

To expand on the non-syntax, perl has an incredible amount of language-level features, which may appear very weird to people who have only seen it from afar.

For example, perl formats[0] are language-level support for generating formatted text reports and charts, which is basically a whole sublanguage (much like perl regexen).

[0] http://perldoc.perl.org/perlform.html

Demiurge 1 hour ago

That's pretty crazy. I used Perl a lot, but haven't seen that feature :)

dragonwriter 29 minutes ago

> To take a stab at the second question, I think BNF that fits on one page

Maybe for python, but not for Ruby. Ruby is not particularly simple to parse (though it may be simpler to parse than Perl, and clearly seems to be simpler to implement -- or perhaps its just that more motivation exists to implement it.)

Demiurge 11 minutes ago

First google result: http://www.cse.buffalo.edu/~regan/cse305/RubyBNF.pdf

I think 2 pages is not bad :) The point is, Perl is just impossible to formally define, it depends on the implementation to make arbitrary choices. This means multiple implementations are much harder, if possible.

dragonwriter 3 minutes ago

> First google result: http://www.cse.buffalo.edu/~regan/cse305/RubyBNF.pdf

Yeah, but its not:

1) One page, or

2) Current (it claims to be for Ruby v1.4), or

3) (apparently, I can't verify this for the version of Ruby it claims to represent) Accurate [1]

[1] http://stackoverflow.com/questions/663027/ruby-grammar

But, yes, Ruby can be parsed independently of being executed, which means you can separate the work of a separate implementation into (1) building (or reusing) a parser, and (2) building a system to execute the result of the parsing. Being able to divide the work (and, as a result, to share the first part between different implementations) makes it easier to implement.


stcredzero 1 hour ago

When I looked at such things last, Python had about 29 terminals & nonterminals in its grammar. Ruby had 110. (These are numbers I remember from playing with particular parser libraries, so YMMV.) By contrast, a commercial Smalltalk with some syntax extensions had 8. I have no idea about Perl, but I'd guess it's about the same as Ruby.

--

pornel 16 hours ago

I hope ideas will flow the other way too, and Rust adopts some sugar from Swift.

I find `if let concrete = optional` sooo much nicer than `match optional { (concrete) => , _ => {} }`.

Rust has solid semantics that covers more than Swift. OTOH Apple has put their UX magic into the language's syntax. Some syntax shortcuts, like `.ShortEnumWhenInContext` are delightful.

The two combined will be the perfect language ;)

andolanra 10 hours ago

You could always write this yourself with a macro:

    macro_rules! if_let {
      ($p:pat = $init:expr in $e:expr) => {
        { match $init { $p => $e,_  => {}, } }
      }
    }
    fn main() {
      let tup = (2, 3);
      if_let!{(2, x) = tup in println!("x={}", x)}; // prints x=3
      if_let!{(5, x) = tup in println!("x={}", x)}; // doesn't print
    }

It's slightly more heavyweight, but still not too bad.

--

xixixao 15 hours ago

Funny how this just falls out of JS (in CS):

  if concrete = optional
    call concrete

stormbrew 15 hours ago

Pretty much every language to date allows this construct. And it's often considered a bad idea in languages where = is assignment because it's so easy to get it confused with == and accidentally assign to the thing you're trying to compare to, which will usually evaluate to truthy.

The special things about the way it works in Swift are:

In JS, C, CS, Ruby, etc. you're not really doing anything useful if you assign a value to another name just for one branch of an if statement. In Swift you are.

---

Io has a weird syntax where you can put some parameters to a function to the left of it, separated by a space, and others in parens to its right. I find it to be very readable for some things. Presumably it's because these are not functions, but object methods.

Io> s findSeq("is")

> 2

Io> s findSeq("test")

> 10

Io> s slice(10)

> "test"

Io> s slice(2, 10)

> "is is a "

---

i still like the idea of . reversing the ordering.

but if we are doing things in the usual ordering, e.g. f | g x for (f(g(x))), then f.g.x as interpreted by typical languages isn't switching the ordering.

so let's have it be f | g x and x.g.f

or, alternately, use . instead of | (that is, use . as Haskell's $), since | is hard to type on android keyboards, except that it still binds very tightly:

f (g (x)) = f.g.x = f.(g x) = f(g(x)) = (in other languages, f[g[x]] )

  in contrast to

f g x, which is like (f(g))(x) or (in other languages, f(g,x))

  but wouldn't it be more convenient to have a looser binding like Haskell's $? e.g. for f . g x to be f(g x) instead of (f(g))(x)? not sure. 

isn't this exactly what haskell does with . anyhow?

--

using the Python style all defaulted params must be named, but the caller can choose to give any argument by name

i guess that's okay

--

mb force named arguments if more than a few parameters (e.g. only permit 3 positional params).
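Python can already approximate this rule per function with a bare * in the parameter list: everything after it must be passed by name. A small sketch (the connect function and its parameters are invented for illustration):

```python
def connect(host, port, timeout, *, retries=3, verbose=False):
    """Three positional parameters; everything after the bare * is keyword-only."""
    return {"host": host, "port": port, "timeout": timeout,
            "retries": retries, "verbose": verbose}

# The fourth and later arguments have to be named at the call site.
settings = connect("example.org", 80, 5.0, retries=5)
print(settings)

# A fourth positional argument is rejected with a TypeError.
try:
    connect("example.org", 80, 5.0, 5)
except TypeError as error:
    print("rejected:", error)
```

the difference from the idea above is that in Python the function author opts in, whereas the note suggests the language could enforce the cutoff itself.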

--

hoon has a good idea: urlsafe atom syntax

--

serial/parallel sigils?

--

it's desirable to be able to write chains of processing steps from left to right, like Ruby, instead of nested, like Lisp:

    [1,2,3].map {|n| n*n }.reject {|n| n%3==1 }
  is better than:

(remove-if (lambda (n) (= (mod n 3) 1)) (mapcar (lambda (n) (* n n)) '(1 2 3)))

but i'd prefer not to have everything OOP like in Ruby, because it seems silly to me that a symmetric two-argument function like addition should be defined in an asymmetric way.

so how could we do that if map and reject were just functions?

you'd just have to have a syntax operator that lets you say "take the result on my left, and give it as the first argument to the function on my right". to generalize, let the user choose which argument on the right gets the thing on the left.

perhaps this is how arrows work in Haskell, i'm not sure.

so e.g., using '|' as the operator and '-' to mark where to put the argument, you'd have something like:

    [1,2,3] | map - {|n| n*n } | reject - {|n| n%3==1 }

note the similarity to Unix shell syntax. Why is this longer than the above Ruby code? because we're explicitly specifying at which arguments to put the incoming results. We could say that if no place is specified (by the next pipe), then put it as the first argument:

    [1,2,3] | map {|n| n*n } | reject {|n| n%3==1 }

now, what about Ruby's 'yield'? we don't need 'yield' if we are just passing anonymous functions, we only need it if the block coming in can 'return' in the larger scope. And, to make things as concise as possible, we may as well omit the argument lists in the anonymous lambdas and use special default variables to match positional args:

    [1,2,3] | map {$1*$1 } | reject {$1%3==1 }

imo that's even easier to read than Ruby!

interesting that this scheme uses two kinds of default variables: the target of the pipe (set by '-', or, by default, the first argument of the function), and the variables for the anonymous lambdas ($1, $2 etc)

note: instead of $1,$2, etc, should we use x,y,z or a,b,c?
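the pipe-with-placeholder scheme can be prototyped with plain functions. Below is a sketch in Python; pipe, HOLE, map_ and reject are all hypothetical names, with HOLE playing the role of the '-' marker and 'first argument' as the default target when no marker is given:

```python
class _Hole:
    """Marker for 'put the piped value here'."""
    def __repr__(self):
        return "HOLE"

HOLE = _Hole()

def pipe(value, *steps):
    """Thread value through steps; each step is a tuple (function, *extra_args)."""
    for fn, *args in steps:
        if any(arg is HOLE for arg in args):
            # Explicit placement: substitute the piped value for each marker.
            args = [value if arg is HOLE else arg for arg in args]
        else:
            # Default placement: the piped value becomes the first argument.
            args = [value, *args]
        value = fn(*args)
    return value

# Collection-first helpers, so the default placement works.
def map_(items, fn):
    return [fn(item) for item in items]

def reject(items, predicate):
    return [item for item in items if not predicate(item)]

squares_kept = pipe([1, 2, 3],
                    (map_, lambda n: n * n),
                    (reject, lambda n: n % 3 == 1))
print(squares_kept)  # [9]

# The built-in filter takes the function first, so the marker is needed here.
evens = pipe([1, 2, 3, 4],
             (filter, lambda n: n % 2 == 0, HOLE),
             (list,))
print(evens)  # [2, 4]
```

map and reject stay ordinary functions here; nothing is a method on the list.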

--

mb use $, the aliasable variable sigil, for global vars too; ruby uses $ for globals, i think

---

i guess we don't need a lot of reserved words and syntax that other languages have because we are using annotations; e.g. 'public void' are both things that we would put in annotations, but that other languages have to have base-level keywords and therefore syntax for. This makes us slightly more verbose for those things, though; maybe use capitalized letters for these? This would seem to fit with the idea of capitalized letters as distinguishable labels.

---

perhaps use single capitalized letters (e.g. X, Y, A, B, C) as 'magic identifiers', e.g. variables that macros can distinguish by name?

---

we want to use / instead of : but that means using ++,--,, for arith

---

inside data constructors, we need a way to create dicts, and we want to do that by labeling arrows. So use key:value (or, more likely, key/value). But if we don't want to require commas, then either we have [key1 / value1 key2 / value2], and the / binds tightly, or we use the whitespace for grouping and have [key1/value1 key2/value2]. In the former case, if we are also using labels to delimit subblocks in blocks, e.g.:

{if condition then: stuff1 more_stuff_1 else: stuff2 more_stuff_2 }

is equivalent to

{if {condition}

then: { stuff1 more_stuff_1 }

else: { stuff2 more_stuff_2 } }

then in this case the :, or /, is binding loosely.

So, either:

note however; if else: is optional, then using then: and else: as separators by themselves makes nested if statements ambiguous ('if then if then else' could be either 'if then (if then) else' or 'if then (if then else)')

---

maybe have suffixes for unit types, e.g.

1s for one second, 1m for one meter.

date syntax? aren't rel and abs dates similar things with diff. interpretations? e.g. abs date is just a relative date plus a constant reference '0' date.

---

perhaps a jasper character meta for hoonstyle atom syntaxes: a prefix sigil that means character meta until whitespace?

---

hoon's $ with partial argument matching and ^$ for return to parent of caller

---

sigil for 'evaluate all arguments of this function when the fn is reached' (strictify fn at callsite)

sigil for 'evaluate this asap' (like strictify, but even earlier, to get a head start)

sigil for strictify

---

bad idea: could use '.' to specify when a function with default arguments is 'closed', e.g. when the default arguments should be applied. i don't like this because then either you have to use '.' all the time (too much typing), or the caller has to know when the function they are calling has default args (one useful way to use default args is to extend already in-use functions w/o breaking old code)

---

so, how about the reverse? use '.' or something like it to 'hold open' a function with default args after all non-default args have been assigned.

hmm.. maybe instead, provide operators to take a function with default arguments and transform it into one with mandatory args. one operator could transform a list of known, named defaults to mandatories, and another could transform all defaults to mandatories. the function would still 'remember' its old defaults, so another operator could be provided to do the reverse (e.g. you could still change your mind later and 'close' the function). in fact in this case we may as well just have a real partially applied function object?
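a sketch of the 'all defaults become mandatory' operator in Python, using the standard inspect module. require_all and the with_defaults attribute are hypothetical names; the wrapper keeps a reference to the original so the old defaults are still 'remembered':

```python
import functools
import inspect

def require_all(fn):
    """Return a version of fn in which every defaulted parameter must be supplied."""
    signature = inspect.signature(fn)
    defaulted = [name for name, parameter in signature.parameters.items()
                 if parameter.default is not inspect.Parameter.empty]

    @functools.wraps(fn)
    def strict(*args, **kwargs):
        # bind() records only the arguments the caller actually passed.
        bound = signature.bind(*args, **kwargs)
        missing = [name for name in defaulted if name not in bound.arguments]
        if missing:
            raise TypeError("missing arguments: " + ", ".join(missing))
        return fn(*args, **kwargs)

    # The reverse operator: the original function with its defaults intact.
    strict.with_defaults = fn
    return strict

def greet(name, greeting="hello", punctuation="!"):
    return f"{greeting} {name}{punctuation}"

strict_greet = require_all(greet)
print(strict_greet("ann", "hi", punctuation="?"))  # hi ann?
print(strict_greet.with_defaults("ann"))           # hello ann!
```

calling strict_greet("ann") raises a TypeError naming greeting and punctuation as missing.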

this combination of features seems to do a lot:

in sum, what we seem to have is:

as with fexprs, the idea of having a __call__ magic method highlights the need for a 'sufficiently smart compiler' (even as part of the interpreter) that can trace/recognize which objects have no chance of having the default __call__ method overridden, so that indirection is not needed upon every function call (e.g. an increment within an inner loop should just be an assembly-language CALL to the increment function, or better still, inlined to an ADD, rather than an indirect dispatch via a pointer __CALL__ to a built-in CALL subroutine which then eventually calls the increment function). For a compiler, this would seem to involve lots of inlining; i'm not even sure how to handle this in an interpreter, perhaps with dynamic code generation? yuck (incidentally this might go well with the 'custom opcodes' idea of jasper assembly)

---

Why did they have to mess with the selector syntax?

I really loved the Smalltalk selector syntax and they seem to have gone the MacRuby route. If they really needed to use a dot why not do something like:

object.[sel1: v1 sel2: v2]

By Matthew McCowan at Mon, 2014-06-02 14:39

---

if alpha suffixes to digits are units, e.g. 3mi for 3 miles, then we also need things like 3mi/hr for 3 miles per hour. But is it 3 mi/hr or 3 mihr? And how to parse as different from div(3mi,hr)? or is there no difference, e.g. the units are actually real identifiers that you can really divide by?

yeah, i guess e.g. 'hr' and 'mi' must also be ordinary identifiers for this to work. Maybe in order not to cause contention, we should just define these as 'hour' and 'mile', and let users define the shortcuts 'hr' and 'mi' if they want? Or maybe standardization is good..
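the 'units are real identifiers you can divide by' reading is easy to prototype. A minimal Python sketch, where Quantity, mile and hour are made-up names and a unit is just a quantity with value 1:

```python
class Quantity:
    """A number together with a map from unit name to exponent."""

    def __init__(self, value, units=None):
        self.value = value
        # Drop units whose exponent cancelled to zero.
        self.units = {u: p for u, p in (units or {}).items() if p != 0}

    def _combine(self, other, sign):
        units = dict(self.units)
        for unit, power in other.units.items():
            units[unit] = units.get(unit, 0) + sign * power
        return units

    def __mul__(self, other):
        if not isinstance(other, Quantity):
            other = Quantity(other)
        return Quantity(self.value * other.value, self._combine(other, +1))

    def __rmul__(self, other):
        # Handles plain numbers on the left, e.g. 3 * mile.
        return self * other

    def __truediv__(self, other):
        if not isinstance(other, Quantity):
            other = Quantity(other)
        return Quantity(self.value / other.value, self._combine(other, -1))

    def __eq__(self, other):
        return (isinstance(other, Quantity)
                and self.value == other.value and self.units == other.units)

    def __repr__(self):
        return f"Quantity({self.value!r}, {self.units!r})"

mile = Quantity(1, {"mile": 1})
hour = Quantity(1, {"hour": 1})

speed = 3 * mile / hour          # what '3mi/hr' would mean under this reading
distance = speed * (2 * hour)    # the hours cancel
print(speed, distance)
```

under this reading '3mi/hr' and div(3mi, hr) are the same thing, so the parser does not need a special case for compound units.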

---

so ?x for 'variable x', e.g. a random variable in statistics.

is there a way to work in ? for questions too? we could use prefix ? for variables, and suffix ? for questions:

((?x + 3) == 7)?

here, i guess 'question' means SAT (satisfaction); 'find an assignment to the variables that make the value of the content of the question TRUE'. E.g. '((?x + 3) == 7)?' is an EXPRESSION which returns the graph 'x/4' (why doesn't it just return '4'? because you might have multiple variables in some questions).
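a brute-force sketch of the question semantics in Python, just to pin down what the returned value looks like. ask is a hypothetical name, the finite domain is an assumption (a real implementation would need a solver), and the parameter names of the lambda stand in for the ?-prefixed variables:

```python
import inspect
from itertools import product

def ask(question, domain):
    """Return every assignment of values from domain that makes question true."""
    names = list(inspect.signature(question).parameters)
    return [dict(zip(names, values))
            for values in product(domain, repeat=len(names))
            if question(*values)]

# ((?x + 3) == 7)?
print(ask(lambda x: (x + 3) == 7, range(10)))  # [{'x': 4}]

# With several variables the answer has to say which value belongs to which name.
print(ask(lambda x, y: x + y == 3 and x < y, range(4)))
```

the second call shows why the result is a mapping like x/4 rather than a bare 4.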

note: to answer questions we need 'knowledge domains' just like there are 'state domains', into which we can put hypothetical assumptions

---

since we want to have both raw functions and also objects, we need

(a) a syntax for function application,

(b) a syntax for object field/method access,

(c) a syntax for graph traversal,

(d) preferably, (a) and (b) and (c) are unified (e.g. a node is a function that maps edge labels to edges, an object is a node whose edge labels are field names),

(e) in Python, we have the problem that when you have a pipeline where each operation is a method on the result of the previous operation, functions are composed by '.' and read from left to right, whereas when you have a pipeline where each operation is a bare function applied to the result of the previous function, functions are composed using parens and read from right to left; this is fine except if you have a pipeline where the functions you need are a mixture of the two, you have to mix left-to-right and right-to-left in the same expression. There should be lightweight syntax to allow you to make it all left-to-right or all right-to-left.

--

from [1], some syntax ideas:

(otoh, the exact lookup should probably be the assumed default rather than needing an extra character to be typed; maybe use the well-known syntax "T[3,2]" as a shortcut for "'(T 3 2)"? note that if there are *s then the ' is not applied; T[*, 2] (or just T[,2]; you could use blanks as shortcuts for stars) would be a shortcut for (T * 2), not for '(T * 2), since we don't expect a singleton set in the answer)

...

also, they aren't just sets, because they remember that the lookup is only partial, e.g. if T2 = [a b c d] and T4 = [e f g h], then T[*,0] = [T2 T4], but T[*,0][1] is not map(..[1], {T2 T4}), which would yield {b f}, nor map(..[1], [T2 T4]), which would yield [b f], but rather an answer to the remaining question, 'which row?', with the answer 'row 1', which is T4. (note: in this paragraph we've used the notation a[x] to mean an index into table a, but [x] to mean the data constructor of an array (table), and we've used ..[] to specify the indexing operation as an independent function, e.g. the '..' prefix means: ..y means 'the function f such that f(z) = zy, where y is some weird syntax such as [x]'. E.g. if f = ..[1], then f(a) = a[1]. This syntax could be useful.)
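Python's standard library has a close relative of the '..[1]' idea in operator.itemgetter, which turns an indexing operation into a standalone function. A short illustration with the T2/T4 rows from above (letters written as strings):

```python
from operator import itemgetter

T2 = ["a", "b", "c", "d"]
T4 = ["e", "f", "g", "h"]

second = itemgetter(1)   # roughly '..[1]': the function f such that f(a) == a[1]

print(second(T2))                   # b
print(list(map(second, [T2, T4])))  # ['b', 'f'], the map(..[1], [T2 T4]) reading
print([T2, T4][1])                  # the 'which row?' reading, i.e. T4
```

the last two lines are the two readings that the paragraph distinguishes.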

--

y'know, i guess we really have to drop the mathematical convention of row before column in indices, at least if we want to keep the syntax A = [1 2 ; 3 4] for

  1 2
  3 4

because we need our constructor syntax convention to generalize to higher-dimensional arrays, e.g.

B = [1 2 ; 3 4 ;; 5 6 ; 7 8 ;; 9 10 ; 11 12]

because using the row before column syntax, A[1,0] = 3; which means that the thing which is delineated with spaces is motion in the second dimension, and the thing which is delineated with semicolons is motion in the first dimension. But we want to generalize this by having the third dimensional motion be delimited with multiple semicolons. So it's irregular: first dimension = semicolon, second dimension = spaces, third dimension = two semicolons, fourth dimension = three semicolons, etc.

For similar reasons, this means that if we have a 1-d vector, C = [1 2 3], C[1] == 2, but if this is turned into a degenerate 2-D vector, we would have C[0,1]==2, not C[1, 0]==2. Which is i guess why in Octave, it's more natural to assume that 1-D vectors are column vectors rather than row vectors (even though the language gives you a row vector if you just type [1 2 3]).

We could stick with the mathematical convention for indexing if we flipped our convention for data constructors, ie

A = [1 2 ; 3 4] for

  1 3
  2 4

the annoyance here is that, since [1 2] is indeed a column, not a row, vector, this means that for consistency, the debugger should be printing 1-D arrays as columns, e.g. the data element constructed as [1 2 3] should be shown as:

 1
 2
 3

but the debugger could just choose to print 1-D arrays as [1 2 3].

another place that the row before column convention causes mischief is that in plotting functions, you typically plot(x,y), that is, you specify the variable that will go on the horizontal axis before the variable that will go on the vertical axis, but this is the transpose of how raster (pixel) matrices are displayed in imshow, assuming that the raster matrices are being indexed using the mathematical convention of [row, column], e.g. [vertical, horizontal] or [y,x]. This could be fixed by having our plotting functions take plot(y,x), e.g. take the dependent variable first.

However that's still a little confusing because from elementary school, people are used to writing coordinates of graphs in the format (x,y) or rather (horizontal, vertical).

so this could all work either way. We have two choices:

(a) keep the linear algebra indexing convention, have [1 2 3] be a column vector, print it as [1 2 3], but print [1 2 ; 3 4] as the matrix

  1 3
  2 4

and have [1 2 ; 3 4][1,0] == 2, have plot(y,x), and use the raster coordinate system y,x

or

(b) let the linear algebra indexing convention be darned, have [1 2 3] be a row vector, print it as [1 2 3] or 1 2 3, print [1 2 ; 3 4] as the matrix

  1 2
  3 4

and have [1 2 ; 3 4][1,0] == 2, have plot(x,y), and use the raster coordinate system x,y

(a) is more consistent with linear algebraic mathematical tradition. But (b) might be easier for most people; i can't help thinking of 'horizontal axis' as 'x-axis' and vertical as y-axis, and then think that the x-coordinate should come first, and lots more people have taken (and understood) geometry in elementary school and learned to think of the coordinate (3,4) to mean horizontal 3, vertical 4 than have taken (and understood) linear algebra in high school or college.

So i guess i'm saying we should go with (b); let T[1,0] mean 'column 1, row 0'. This will give a higher cognitive burden to people copying in algorithms from linear algebra books and papers or porting from MATLAB or Octave or Python's NumPy (they'll have to transpose stuff sometimes when the linear algebra does not say to, and they'll have to think about when that is), but the benefit is a lower cognitive burden to the larger population of people who think in terms of (horizontal, vertical) coordinate systems and who are just indexing into multi-dimensional arrays on their own, without reference to existing linear algebra code.
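the two conventions differ only in which index is read first. A tiny Python illustration with nested lists standing in for the matrix written as [1 2 ; 3 4] and displayed with rows '1 2' and '3 4'; the accessor names are made up:

```python
# Rows are the inner lists, so A prints as
#   1 2
#   3 4
A = [[1, 2],
     [3, 4]]

def at_row_col(matrix, row, col):
    """Linear-algebra convention: [row, column], i.e. [vertical, horizontal]."""
    return matrix[row][col]

def at_col_row(matrix, col, row):
    """Option (b): [column, row], i.e. [horizontal, vertical] like (x, y)."""
    return matrix[row][col]

print(at_row_col(A, 1, 0))  # 3
print(at_col_row(A, 1, 0))  # 2
```

porting linear algebra code to convention (b) amounts to swapping the index order at every access, which is the transposition burden mentioned above.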

---

what does Haskell's currying syntax buy you?

First, it gives you a convenient syntax for prefix partial function application; if you have a 3-argument function and want to apply the first two arguments to get a new function, just say g = f x y (instead of something like 'g = partial(f, [x y])')

Second, say you are supposed to create a function that takes three arguments. Let's say that you have on hand a higher-order function that takes the first two arguments and produces a function that takes the third argument and produces the result. With Haskell, this higher-order function IS the function you need. Without currying, you'd have to create a wrapper function that takes the three arguments, gives the first two to the first function, gets the resulting function, gives the third argument to that function, and returns the result. Eg

with currying: f = h

without currying: f x y z = (h x y) z

this latter property may have other benefits too, i'm not sure; see 'point-free' style.
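for comparison, here is roughly what both points look like in a language without built-in currying. This is a Python sketch; h, f, f3 and the arithmetic inside them are made up for illustration:

```python
from functools import partial

def f3(x, y, z):
    """An ordinary 3-argument function."""
    return x + y * z

# Point one: prefix partial application needs an explicit helper,
# where Haskell would just write g = f3 x y.
g = partial(f3, 1, 2)

def h(x, y):
    """Higher-order function: takes the first two arguments and
    returns a function that takes the third."""
    def takes_z(z):
        return x + y * z
    return takes_z

# Point two: without currying, h is not itself a 3-argument function,
# so a wrapper is needed (the 'f x y z = (h x y) z' case).
def f(x, y, z):
    return h(x, y)(z)

print(g(3), h(1, 2)(3), f(1, 2, 3))
```

all three calls compute 1 + 2 * 3; the wrapper f exists only to change the calling shape.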

---

discussion of partial indexing (slice indexing) into multidim tables being like partial application of functions is in [2], with some notes on 0-ary functions too

---

f 3 _ 5 can't be syntax for leaving an argument unbound (ie for 'lambda x (f 3 x 5)') because we want _ to imply a junk argument

---

trellis and prolog both could use an assertion/truth maintenance syntax

---

mb use ~ in various syntactic shortcuts/convenience constructors, like Hoon

--

to indicate the path from a to b to c, as opposed to the value at a.b.c, mb use a.b.c. or .a.b.c or .a.b.c. (ie additional period prefix, suffix, or both)

--

could use something like

f..a=3..b=4.5

to denote the path that starts with the node representing function 'f', then follows the edge that assigns dimension/keyword/role 'a' to 3, then follows the edge that assigns dimension/keyword/role 'b' to 4, then follows the edge labeled '5' (so 'f' must have been a function which takes two keyword arguments, 'a' and 'b', whose type includes integers, and returns a function which takes an argument whose type includes integers). the use of '..' rather than '.' denotes that any contiguous sequence of edge-follow commands that use '..' instead of '.' are commutative, e.g. this path commutes with f..b=4..a=3.5

NOTE: this shows that: (a) keyword arguments == dimensions in a multidimensional table/dataframe == roles in a relation == edges in a hyperedge (if the hyperedge has preferred roles 'src' and 'target' and in addition other roles) (a.i) in this case each role in the hyperedge, besides 'src' and 'target', has its own edge label, representing the value chosen for that attribute/dimension (b) an alternate representation for hyperedges with preferred roles 'src' and 'target' and in addition other roles is a path where each step on the path represents the 'partial assignment' to an edge role. This is just currying.
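a small Python sketch of the commutativity claim, assuming a path is represented as a list of steps where keyword steps (the ones written with '..') are tagged "kw" and ordinary steps (written with '.') are tagged "pos". The representation and the name normalize are hypothetical:

```python
def normalize(path):
    """Collapse each contiguous run of '..' (keyword) steps into a frozenset,
    so that reordering steps inside a run does not change the result."""
    result = []
    run = []
    for step in path:
        if step[0] == "kw":
            run.append(step[1:])          # (name, value)
        else:
            if run:
                result.append(frozenset(run))
                run = []
            result.append(step)
    if run:
        result.append(frozenset(run))
    return tuple(result)

# f..a=3..b=4.5
p1 = [("node", "f"), ("kw", "a", 3), ("kw", "b", 4), ("pos", 5)]
# f..b=4..a=3.5
p2 = [("node", "f"), ("kw", "b", 4), ("kw", "a", 3), ("pos", 5)]
# f..a=4..b=3.5 (different assignment, should NOT be the same path)
p3 = [("node", "f"), ("kw", "a", 4), ("kw", "b", 3), ("pos", 5)]

print(normalize(p1) == normalize(p2))
print(normalize(p1) == normalize(p3))
```

the first comparison is True and the second is False: only the order within a '..' run is forgotten, not which value goes with which role.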

---

~ could also be used to indicate semi-automatic type coercion, ie no type coercion except where there is a ~, but where there is a ~, the compiler will figure out the coercion path (if and only if there is a unique path? or, we can probably use at least a few rules, eg a shorter path overrides a longer path, or possibly better, a more 'specific' path overrides a less 'specific' path (does this mean e.g. a->b->c does not override a->d->e->f, but a->b does override a->b->c?) (possibly see also semantic inheritance networks?)). ~ can be in type signatures, too, to indicate that a function is automatically coercing, eg 'print' accepts an input of type '~string', not 'string', indicating that it accepts any input that is coercible to type 'string'.
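a sketch of one of the candidate rules ('a shorter path overrides a longer path, and a tie is an error'), in Python. The coercion graph, the type names, and the function name coercion_path are all assumptions for illustration; the 'more specific path' rule is not modeled here:

```python
from collections import deque

def coercion_path(graph, src, dst):
    """Return the unique shortest coercion path from src to dst.
    Returns None if dst is unreachable; raises ValueError if several
    shortest paths tie (ie the coercion would be ambiguous)."""
    dist = {src: 0}      # shortest distance found by BFS
    count = {src: 1}     # number of distinct shortest paths
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                count[nxt] = count[node]
                parent[nxt] = node
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                count[nxt] += count[node]
    if dst not in dist:
        return None
    if count[dst] > 1:
        raise ValueError(f"ambiguous coercion from {src} to {dst}")
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

# Hypothetical coercion edges: int->float, int->string, float->string.
coercions = {"int": ["float", "string"], "float": ["string"]}
print(coercion_path(coercions, "int", "string"))

# Two equally short routes a->b->d and a->c->d: rejected as ambiguous.
diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
```

here int->string wins over int->float->string because it is shorter, while the diamond graph has no winner and would need an explicit annotation.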

--

wart ( http://akkartik.name/post/wart ) uses the 'attached-operators-implicitly-surrounded-by-parens' rule, with left-associativity for everything else, and remarks that:

" In order to provide macros, lisp uniformly uses prefix everywhere, including for arithmetic. Wart provides infix operators in an elegant way without compromising macros.

4 * 3 # same as (* 4 3) ⇒ 12

This has historically been hard to do because of the burden of providing precedence rules for nested expressions that are both intuitive and extensible. Wart's infix operators have only one hard-coded precedence rule: operators without whitespace are evaluated before operators with whitespace.

n * n-1 # does what you expect

(The catch: you can't use infix characters like dashes in variable names, something lispers are used to.) "

more details at:

http://arclanguage.org/item?id=16775
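to make the whitespace rule concrete, here is a toy Python sketch that turns such an infix string into prefix form. It is NOT wart's actual implementation; it assumes only binary + - * /, alphanumeric operands, no parentheses, no unary minus, and plain left-associativity among operators at the same whitespace level:

```python
import re

OPERATORS = set("+-*/")

def fold_left(items):
    """Fold [operand, op, operand, op, ...] left-associatively
    into nested (op, left, right) tuples."""
    tree = items[0]
    for i in range(1, len(items) - 1, 2):
        tree = (items[i], tree, items[i + 1])
    return tree

def parse(expr):
    """Operators without surrounding whitespace bind first: every
    whitespace-free chunk is parsed as if it were parenthesized."""
    groups = []
    for chunk in expr.split():
        if chunk in OPERATORS:
            groups.append(chunk)               # a free-standing operator
        else:
            tokens = re.findall(r"\w+|[+\-*/]", chunk)
            groups.append(fold_left(tokens))   # an attached sub-expression
    return fold_left(groups)

def to_prefix(tree):
    if isinstance(tree, tuple):
        op, left, right = tree
        return f"({op} {to_prefix(left)} {to_prefix(right)})"
    return tree

print(to_prefix(parse("n * n-1")))   # (* n (- n 1))
print(to_prefix(parse("a+b * c")))   # (* (+ a b) c)
```

note that 'a + b * c' (all spaced) comes out as (* (+ a b) c) under this rule, since there is no built-in precedence between * and +.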

---

" ...infix syntax. With the right kind of a.b syntax, we could write `(a b ,c) as qq.(a b uq.c), trading in some readability for the flexibility to define our own variants.[1] Would you still want to have dedicated quasiquotation syntax then? " -- http://arclanguage.org/item?id=16406

---

"...I want to argue that having syntax for quote/backquote/unquote is valuable. In wart all symbols are overrideable[1]. It uses punctuation for quote/backquote/unquote/splice to highlight that they are different from symbols, more fundamental to the programming model.

[1] The one exception is fn. Perhaps I should make it λ? Nah, that doesn't solve the problem. Hmm, the only punctuation character wart hasn't already put to use is #.. :D " -- http://arclanguage.org/item?id=16378


character meta is similar to HTML CDATA, but i guess in addition CDATA is like a HERE document


useful syntax for 'construction' of complex values using 'where':

"For example, suppose we have some parsers p1 and p2 which recognize, respectively, the non-terminals t1 and t2... the parser that recognizes t1 t2 (sequencing) could be defined as follows:

> (p1 ‘then’ p2) inp
> = [((v1, v2), out2)
>    (v1, out1) ← p1 inp; (v2, out2) ← p2 out1]

" -- Lenient evaluation is neither strict nor lazy, by G Tremblay, 2000, citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.137.9885

---

so, use : to open a block, like 'begin'; it's like '(' except it is autoclosed at the end of a block

; is another name for EOL or '\n'

;; is end-of-block, which is also '\n\n'

';;;' binds more loosely than ';;', etc.

commas group things together and ',,' binds closer than ','

e.g.

f x (g a b) z can also be written f x g,a,b z or f,x,g,,a,,b,z. Commas can have spaces around them without affecting meaning, eg f, x, g,,a,,b, z

perhaps it would be better for the first comma to be implicit?

eg. f x, g a,,b, z but that could also be written f x, g a,, b, z. so maybe commas can NOT have spaces around them? eg. f x,g a,,b,z

also since you can just say f x y z, maybe the first comma should only be allowed to be used for the next level in:

f x g,a,b z

that looks best so far

e.g.

f x g,a,b z == f (x, (g(a,b)), z) (in Python) or == f x (g a b) z

is f x g,a,b z really easier than f x (g a b) z? maybe

is f x g,h,,a,,b,c z really easier than f x (g (h a b) c) z? doesn't look like it; it's too hard to distinguish , and ,, when skimming.
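a quick Python sketch of the 'looks best so far' variant (spaces at the top level, ',' for the next level in, ',,' for the level after that), rewriting it into the parenthesized form. The function names are hypothetical, and deeper levels (',,,' etc.) and spaces around commas are not handled:

```python
import re

def group(chunk):
    """Rewrite one whitespace-free chunk: ',' separates the items of a
    group, and ',,' separates the items of a group nested inside it."""
    if "," not in chunk:
        return chunk
    # Split on single commas only, leaving ',,' pairs intact.
    parts = re.split(r"(?<!,),(?!,)", chunk)
    inner = []
    for part in parts:
        if ",," in part:
            inner.append("(" + " ".join(part.split(",,")) + ")")
        else:
            inner.append(part)
    if len(inner) == 1:
        return inner[0]
    return "(" + " ".join(inner) + ")"

def desugar(expr):
    return " ".join(group(chunk) for chunk in expr.split())

print(desugar("f x g,a,b z"))          # f x (g a b) z
print(desugar("f x g,h,,a,,b,c z"))    # f x (g (h a b) c) z
```

the regex has to look at both neighbours of every comma to tell ',' from ',,', which is roughly the same difficulty a human has when skimming.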

could use commas for comments ,, comma comment

or could use ; for EOL and ;; for comments

or could use ; for EOL and ;; for end-of-block and ;;; for comments (three characters for comments seems like too many, tho)

could use < for Haskell's $ and << for lt: 'f x < g y' instead of 'f x (g y)'

or << for $ and < for lt: 'f x << g y' instead of 'f x (g y)'

or could save << for matching pair for character meta (this is most likely)

so the idea is that a block can begin and end with {}, but can also begin with : or :: and end with ;;, except that ;; ends ALL blocks that were started with : or ::. : lasts until the next : or until the end-of-block, and :: goes until the end-of-block. E.g.

  if a::
    print 'a'
    if b:
      print 'b'
      print 'in fact, a AND b'
    if c:
      print 'c'
      print 'in fact, a AND c (but not necessarily b)'
  ;;

note that two empty lines are an implicit ';;'
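one possible reading of these rules, as a Python sketch that builds a nested tree from a list of lines. The interpretation is an assumption: a line ending in '::' opens a block that stays open until ';;', a line ending in ':' first closes the innermost block if that block was opened by ':' and then opens a new one, and ';;' closes everything. {} blocks and the 'two empty lines' rule are not handled, and parse_blocks is a made-up name:

```python
def parse_blocks(lines):
    """Return a nested list; a block is a (header, children) tuple."""
    root = []
    stack = [("root", root)]           # (opener kind, children list)
    for line in lines:
        line = line.strip()
        if line == ";;":
            del stack[1:]              # close ALL open ':' and '::' blocks
        elif line.endswith("::"):
            children = []
            stack[-1][1].append((line[:-2], children))
            stack.append(("::", children))
        elif line.endswith(":"):
            if stack[-1][0] == ":":    # a new ':' ends the previous ':' block
                stack.pop()
            children = []
            stack[-1][1].append((line[:-1], children))
            stack.append((":", children))
        elif line:
            stack[-1][1].append(line)
    return root

example = [
    "if a::",
    "print 'a'",
    "if b:",
    "print 'b'",
    "print 'in fact, a AND b'",
    "if c:",
    "print 'c'",
    "print 'in fact, a AND c (but not necessarily b)'",
    ";;",
]
tree = parse_blocks(example)
print(tree)
```

under this reading 'if b' and 'if c' end up as siblings inside 'if a', which matches the messages printed in the example.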

or, we could use ':' to simultaneously be a suffix that indicates a keyword argument, and also do the above. In that case it would be:

  if a then::
    print 'a'
    if b then:
      print 'b'
      print 'in fact, a AND b'
    if c then:
      print 'c'
      print 'in fact, a AND c (but not necessarily b)'
  ;;

this might irritate people, because 'if a::' would be illegal ('if' doesn't take any keyword argument named 'a').

so we could go back to using '/' for keyword arg:

  if a::
    print 'a'
    if b:
      print 'b'
      print 'in fact, a AND b'
    if c then/:
      print 'c'
      print 'in fact, a AND c (but not necessarily b)'
  ;;

eg map f x == map fn/f list/x

still unclear what ,, would be used for. matrix dimensions? interpolation into macros? macros?

---

i guess we might want notation for multiple arc types, like ->, -C>, _>, -_>. mb quicker with / for arc: /, C/, _/, -_/. nah, mb -> is 'longhand' for /, so back to the first proposal: ->, -C>, _>, -_>

---

i guess if we allow a distinction between attached and unattached operators, then it's easier to deal with '-'s dual role as subtraction and unary negation; -1 is unary negation, - 1 is subtraction (which needs another argument).

to expand the use of '-', we could say that in general it's 'inverse'; and the 'inverse' of a number is its negative

in fact, like =s, we could go further and say that the inverse of a number depends on context, which is given by a view. So, the default view of numbers gives their inverse as negative (additive inverse), but you could switch to a view that gives it as multiplicative inverse, etc.

---


Footnotes:

1.

  

2.

 and