from Rust:
pub(crate) bar;
means that 'bar' is public, but only within the current crate.
" You can also specify a path, like this:
pub(in a::b::c) foo;
This means “usable within the hierarchy of a::b::c, but not elsewhere.” "
---
dnautics 129 days ago [-]
you should see Julia. There's a lot of syntactic sugar that borrows from the best of other languages, for example the pipe |> operator from elixir and do...end block syntax from ruby.
---
golang seems to use backticks for annotations?
" var d struct { Main struct { Kelvin float64 `json:"temp"` } `json:"main"` } " -- [1]
---
i had said we wanted LL(1) syntax, but then i decided LL(k) is good enough; but some ppl like LALR (eg [2]). LALR(1) (LALR without an index refers to LALR(1)) is a subset of LR(1), and LL(k) and LALR(j) are incomparable [3], although every LL(1) grammar is also LR(1) [4]. This book says "almost all LL(1) grammars excepting some contrived examples are also LALR(1);" [5]. And this one says "class of LALR(1) grammars is large enough to include grammars for most programming languages. (This does not mean that the reference grammars for most programming languages are LALR(1): they are often ambiguous.)" [6].
There's a slide in [7] labeled "Hierarchy of grammar classes" that shows some of this as a diagram.
So i guess we want our language to be both LL(k) and LALR(k); preferably LL(1) and LALR(1). [8] has some comments on how to detect this; notably, a sufficient but not necessary condition for an LL(1) language to be LALR(1) is "if all symbols with empty derivations have non-empty derivations".
this page says "there is no way to decide if a grammar can be converted to LL(1) or to LR(1) except by trying to do it - if you succeed, then it was". [www.cs.man.ac.uk/~pjj/complang/grammar.html]
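note that the quote above is about whether a grammar can be CONVERTED; checking whether a given grammar is already LL(1) is mechanical. A minimal sketch in Python, using the textbook FIRST/FOLLOW construction: a grammar is LL(1) when no two productions of the same nonterminal share a lookahead token. The grammar encoding (a dict from nonterminal to a list of productions, each a list of symbols, with [] for the empty production) and the names ll1_conflicts and first_of_seq are assumptions made for this example.

```python
from collections import defaultdict

EPS = ""  # marker meaning "can derive the empty string"


def first_of_seq(seq, first, nonterminals):
    """FIRST set of a sequence of symbols, given the FIRST sets found so far."""
    out = set()
    for sym in seq:
        sym_first = first[sym] if sym in nonterminals else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)  # every symbol (or the empty sequence) can derive empty
    return out


def ll1_conflicts(grammar, start):
    """Return (nonterminal, lookahead) pairs claimed by more than one production."""
    nts = set(grammar)
    first = defaultdict(set)
    follow = defaultdict(set)
    follow[start].add("$")  # end-of-input marker
    changed = True
    while changed:  # iterate FIRST and FOLLOW together to a fixed point
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                new = first_of_seq(prod, first, nts)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
                for i, sym in enumerate(prod):
                    if sym not in nts:
                        continue
                    rest = first_of_seq(prod[i + 1:], first, nts)
                    new = rest - {EPS}
                    if EPS in rest:
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    conflicts = []
    for nt, prods in grammar.items():
        seen = set()
        for prod in prods:
            f = first_of_seq(prod, first, nts)
            predict = (f - {EPS}) | (follow[nt] if EPS in f else set())
            for tok in sorted(predict & seen):
                conflicts.append((nt, tok))
            seen |= predict
    return conflicts


# Left-recursive form: both productions of E can start with "n".
left_recursive = {"E": [["E", "+", "n"], ["n"]]}
# Same language with the recursion moved to the right.
right_recursive = {"E": [["n", "R"]], "R": [["+", "n", "R"], []]}

print(ll1_conflicts(left_recursive, "E"))
print(ll1_conflicts(right_recursive, "E"))
```

the first grammar reports a conflict on E and the second reports none, which is the usual "remove left recursion by hand and try again" step the quote is describing.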
these posts talk about how to convert LALR(1) grammars to LL(k):
here's a possibly related very technical post: http://etymon.blogspot.com/2006/09/why-i-prefer-lalr-parsers.html
and another: https://compilers.iecc.com/comparch/article/01-10-069
we might also consider SLR instead of LALR(1): " LALR(1) is used in most parser generators like Yacc/Bison
We will nevertheless only see SLR in details: ((i think they mean, in this slide presentation))
---
this thread gives various examples of type-dependency in C++ parsing: https://news.ycombinator.com/item?id=11148436
---
some languages (eg C#) have something called 'using static CLASSNAME' which is kinda like Python's 'from MODULE import *', except that instead of importing top-level stuff from a module, any static methods within class CLASSNAME are all imported to the top-level namespace.
sounds to me like a great way of giving the benefits of top-level functions (eg for defining new arithmetic operators) while also doing the OOP way of having everything defined inside some class
---
"#region" for documentation, and IDE expansion/collapse
---
" upper-casing exported identifiers in Go packages "
---
anywhere you can have an expression or statement, you can have a block enclosed by {}s
---
" Now, I won’t claim that C has a great syntax. If we wanted something elegant, we’d probably mimic Pascal or Smalltalk. If we wanted to go full Scandinavian-furniture-minimalism, we’d do a Scheme. Those all have their virtues.
I’m surely biased, but I think Lox’s syntax is pretty clean. C’s most egregious grammar problems are around types. Dennis Ritchie had this idea called “declaration reflects use” where variable declarations mirror the operations you would have to perform on the variable to get to a value of the base type. Clever idea, but I don’t think it worked out great in practice.
Lox doesn’t have static types, so we avoid that.
What C-like syntax has instead is something you’ll find is often more valuable in a language: familiarity "
---
" Is a language built all out of parens simple... is it free of interleaving and braiding? And the answer is no; Common Lisp and Scheme are not simple is this sense, in their use of parens. Because the use of parentheses in those languages is overloaded; parens wrap calls, they wrap grouping, they wrap data structures; and that overloading is a form of complexity... We can fix that; we can just add another data structure, it doesn't make Lisp not Lisp to have more data structures. It's still a language defined in terms of its own data structures, but having more datastructures in play means that we can get rid of this overloading in this case... " -- Rich Hickey, https://www.infoq.com/presentations/Simple-Made-Easy 0:26:03. Note: the slides say more specifically how Clojure addresses this by adding another data structure; they say, "Adding a data structure for grouping, e.g. vectors, makes each simpler"
---
complaints about C:
" For starters, hiding identifiers after arbitrarily long type expressions, instead of starting a line/block/expression with a name, is an un-fixable PITA. (Sorry, AT&T, Algol had it right)
Requiring a “break” in a case statement is a botch, instead of some kind of “or” / “set” / “range” test. "
---
if Conditional
Extend if conditional with declaration, similar to for conditional.
if ( int x = f() ) ...
case Clause
Extend case clause with list and subrange.
switch ( i ) {
  case 1, 3, 5: ...                 // list
  case 6~9: ...                     // subrange: 6, 7, 8, 9
  case 12~17, 21~26, 32~35: ...     // list of subranges
}
switch Statement
Extend switch statement declarations and remove anomalies.
switch ( x ) {
    int i = 0;                      // allow declarations only at start, local to switch body
  case 0:
    ...
    int j = 0;                      // disallow, unsafe initialization
  case 1: {
    int k = 0;                      // allow at lower nesting levels
    ...
    case 2:                         // disallow, case in nested statements
  }
  ...
}
choose Statement
Alternative switch statement with default break from a case clause.
choose ( i ) {
  case 1, 2, 3:
    ...
    fallthrough;                    // explicit fall through
  case 5:
    ...                             // implicit end of choose (switch break)
  case 7:
    ...
    break                           // explicit end of choose (redundant)
  default:
    j = 3;
}
Non-terminating and Labelled fallthrough
Allow fallthrough to be non-terminating in case clause or have target label providing common code. non-terminator labelled
choose ( ... ) { case 3: if ( ... ) { ... fallthru; goto case 4 } else { ... } implicit break case 4:
choose ( ... ) { case 3: ... fallthrough common; case 4: ... fallthrough common; common: below fallthrough at case level ... common code for cases 3 and 4 implicit break case 4:
choose ( ... ) { case 3: choose ( ... ) { case 4: for ( ... ) { ... fallthru common; multi-level transfer } ... } ... common: below fallthrough at case-clause level
Labelled continue / break
Extend break/continue with a target label to support static multi-level exit (like Java).
LS: switch ( ... ) { ...
... break LS; ... // terminate switch
Exception Handling
Exception handling provides dynamic name look-up and non-local transfer of control.
exception_t E {};                           // exception type
void f(...) {
    ... throw E{}; ...                      // termination
    ... throwResume E{}; ...                // resumption
}
try {
    f(...);
} catch( E e ; boolean-predicate ) {        // termination handler
    // recover and continue
} catchResume( E e ; boolean-predicate ) {  // resumption handler
    // repair and return
} finally {
    // always executed
}
with Clause/Statement
Open an aggregate scope making its fields directly accessible (like Pascal).
struct S { int i, j; };
int mem( S & this ) with( this ) {          // with clause
    i = 1;                                  // this->i
    j = 2;                                  // this->j
}
int foo() {
    struct S1 { ... } s1;
    struct S2 { ... } s2;
    with( s1 ) {                            // with statement
        // access fields of s1 without qualification
        with( s2 ) {                        // nesting
            // access fields of s1 and s2 without qualification
        }
    }
    with( s1, s2 ) {                        // scopes open in parallel
        // access unambiguous fields of s1 and s2 without qualification
    }
}
waitfor Statement
Dynamic selection of calls to mutex type.
void main() { waitfor( r1, c ) ...; waitfor( r1, c ) ...; or waitfor( r2, c ) ...; waitfor( r2, c ) { ... } or timeout( 1 ) ...; waitfor( r3, c1, c2 ) ...; or else ...; when( a > b ) waitfor( r4, c ) ...; or when ( c > d ) timeout( 2 ) ...; when ( c > 5 ) or else ...; when( a > b ) waitfor( r5, c1, c2 ) ...; or waitfor( r6, c1 ) ...; or else ...; when( a > b ) waitfor( r7, c ) ...; or waitfor( r8, c ) ...; or timeout( 2 ) ...; when( a > b ) waitfor( r8, c ) ...; or waitfor( r9, c1, c2 ) ...; or when ( c > d ) timeout( 1 ) ...; or else ...; }
Tuple
Formalized lists of elements, denoted by [ ], with parallel semantics.
int i;
double x, y;
int f( int, int, int );
f( 2, x, 3 + i );                   // technically ambiguous: argument list or comma expression?
f( [ 2, x, 3 + i ] );               // formalized (tuple) element list
[ i, x, y ] = 3.5;                  // i = 3.5, x = 3.5, y = 3.5
[ x, y ] = [ y, x ];                // swap values
[ int, double, double ] t;          // tuple variable
Alternative Declaration Syntax
Left-to-right declaration syntax, except bit fields. C∀: * int p; C: int * p;
References
Multi-level rebindable references, as an alternative to pointers, which significantly reduces syntactic noise.
int x = 1, y = 2, z = 3;
int * p1 = &x, ** p2 = &p1, *** p3 = &p2,   // pointers to x
    & r1 = x, && r2 = r1, &&& r3 = r2;      // references to x
int * p4 = &z, & r4 = z;
A reference is a handle to an object, like a pointer, but is automatically dereferenced the specified number of levels. Referencing (address-of &) a reference variable cancels one of the implicit dereferences, until there are no more implicit references, after which normal expression behaviour applies.
0 / 1
Literals 0 and 1 are special in C: conditional ⇒ expr != 0 and ++/-- operators require 1.
struct S { int i, j; };
void ?{}( S * s, zero_t ) with( s ) { i = j = 0; }   // zero_t, no parameter name required
void ?{}( S * s, one_t ) with( s ) { i = j = 1; }    // one_t, no parameter name required
int ?!=?( S * op1, S * op2 ) { return op1->i != op2->i || op1->j != op2->j; }
S s0 = { 0, 1 }, s1 = { 3, 4 };     // implicit call: ?{}( s0, zero_t ), ?{}( s1, one_t )
if ( s0 )                           // rewrite: s != 0 ⇒ S temp = { 0 }; ?!=?( s, temp )
s0 = s0 + 1;                        // rewrite: S temp = { 1 }; ?+?( s0, temp );
(my note: in the above, ?{} is syntax for constructor definition)
Postfix Function/Call
Alternative call syntax (postfix: literal argument before routine name) to convert basic literals into user literals, where ?` denotes a postfix-function name and ` denotes a postfix-function call.
postfix function: int ?`h( int s ); int ?`h( double s ); int ?`m( char c ); int ?`m( const char * s ); int ?`t( int a, int b, int c );
constant argument call: 0 `h; 3.5`h; '1'`m; "123" "456"`m; [1,2,3]`t;
variable argument call: int i = 7; i`h; (i + 3)`h; (i + 3.5)`h;
postfix routine pointer: int (* ?`p)( int i ); ?`p = ?`h; 3`p; i`p; (i + 3)`p;
Routine
Routine names within a block may be overloaded depending on the number and type of parameters and returns.
// selection based on type and number of parameters
void f( void );                     // (1)
void f( char );                     // (2)
void f( int, double );              // (3)
f();                                // select (1)
f( 'a' );                           // select (2)
f( 3, 5.2 );                        // select (3)
// selection based on type and number of returns
char f( int );                      // (1)
double f( int );                    // (2)
[ int, double ] f( int );           // (3)
Extending Types
Existing types can be augmented; object-oriented languages require inheritance. (@ ⇒ use C-style initialization.)
timespec tv0, tv1 = 0, tv2 = { 3 }, tv3 = { 5, 100000 }; tv0 = tv1; tv1 = tv2 + tv3; if ( tv2 == tv3 ) ... tv3 = tv2 = 0;
Trait
Named collection of constraints.
trait sumable( otype T ) {
    void ?{}( T &, zero_t );        // constructor from 0 literal
    T ?+?( T, T );                  // assortment of additions
    T ?+=?( T &, T );
    T ++?( T & );
    T ?++( T & );
};
Polymorphic Routine
Routines may have multiple type parameters each with constraints.
forall( otype T | sumable( T ) )    // polymorphic, use trait
Polymorphic Type
Aggregate types may have multiple type parameters each with constraints.
forall( otype T | sumable( T ) )    // polymorphic, use trait
(my note: i like the <T> syntax better for polymorphism)
Remove Definition Keyword
Keywords struct and enum are not required in a definition (like C++).
struct S { ... };
enum E { ... };
S s;                                // "struct" before S unnecessary
E e;                                // "enum" before E unnecessary
---
"Of course, if you go the forth-lite route and have nearly completely consistent tokenization along a small set of special characters, this is much easier. Forth-lite languages can be split by whitespace and occasionally re-joined when grouping symbols like double quote are found; lisp-lite languages with only a few paired grouping symbols can easily be parsed into their own ASTs." [11]
---
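a minimal sketch of the two tokenization styles the "forth-lite" / "lisp-lite" quote above describes, in Python. The function names forth_tokens and read_sexpr are made up for this example, and both are deliberately naive: the forth-style one collapses runs of whitespace inside strings, and the lisp-style one has no string or comment handling at all.

```python
def forth_tokens(src):
    """Split on whitespace, then re-join the pieces between double quotes."""
    tokens, pending = [], None
    for piece in src.split():
        if pending is not None:
            pending.append(piece)
            if piece.endswith('"'):
                # Closing quote found: emit the re-joined string as one token.
                tokens.append(" ".join(pending))
                pending = None
        elif piece.startswith('"') and not (len(piece) > 1 and piece.endswith('"')):
            pending = [piece]  # opening quote without its close: start collecting
        else:
            tokens.append(piece)
    if pending is not None:
        raise SyntaxError("unterminated string")
    return tokens


def read_sexpr(src):
    """Parse one parenthesized expression into nested Python lists."""
    tokens = src.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # consume ")"
            return items
        if tok == ")":
            raise SyntaxError("unexpected )")
        return tok

    return read()


print(forth_tokens('2 3 + "hello big world" print'))
print(read_sexpr("(define (sq x) (* x x))"))
```

the whole "parser" for the lisp-lite case is the read function; an unbalanced "(" just runs off the end of the token list here, so a real reader would need proper error handling.
---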
"
Another important C feature we can see in this example is the presence of preprocessor macros. Macros are part of a pre-compilation step. With them it is possible to #define global variables and do some basic conditional operation (with #ifdef and #endif). All the macro commands begin with a hashtag (#). Pre-compilation happens right before compiling and copies all the calls to #defines and check #ifdef (is defined) and #ifndef (is not defined) conditionals. In our "hello world!" example above, we only insert the line 2 if GL_ES is defined, which mostly happens when the code is compiled on mobile devices and browsers.
"
---
the table "Key Features Lisp Features Python Features" [12] is great. Reread it after you have a bit of an implementation of a bit of candidate syntax.
---
CJefferson on Aug 7, 2013 [-]
Most methods in C read like an assignment,
so if you are trying to remember what order the arguments to strcpy go, it's
strcpy(x,y) is like x = y
Then remember specially that typedef is the wrong way around to the way you would like it to be :)
belovedeagle on Aug 7, 2013 [-]
The trick with typedef is it is exactly the same syntax, in all cases, as declaring a variable of that type. This is essentially the only way you're going to ever remember how to do function pointer typedefs: typedef int (*function_t)(int,int); or something to that effect
---
" 1. Coding JS quite a bit these days, I greatly miss every form returning a value. 2. Python's lambda is just sad, and GvR? keeps threatening to remove even that. "
"in Lisp, you can (and it's normal to) give docstrings to variables"
---
" Donald Knuth reports that
The lack of operator priority (often called precedence or hierarchy) in the IT language was the most frequent single cause of errors by the users of that compiler.[26]"
---
todo [13] seems to claim that recursive descent cannot parse operator expressions with precedence and associativity? But [14] claims that Python is LL(1); i thought recursive descent can efficiently parse LL(1) [15]? Note that [16] suggests that it's just that recursive descent cannot EFFICIENTLY parse associativity? But [17] claims .
so which is it? Is Python LL(1), and CAN recursive descent parse operator expressions with precedence and associativity?
ah, i think i got it, see https://jeffreykegler.github.io/personal/timeline_v3#h1-the_operator_issue_as_of_1968 . I bet Python used the BASIC-OP->LIST-OP type transformation, and parses expressions as lists, and adds associativity later.
Note also that http://garethrees.org/2011/07/17/grammar/ says that Python grammar cannot be expressed as an operator precedence grammar.
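to make the "parse as a list, add associativity later" idea concrete, here is a minimal recursive descent sketch in Python. The toy grammar is an assumption for illustration, not Python's real one: each precedence level gets its own function, a loop over the repeated operands builds the tree left-to-right (left associativity), and plain recursion on the right operand gives right associativity. As far as i know this is the same shape as rules like "arith_expr: term (('+'|'-') term)*" in Python's older LL(1) grammar file.

```python
import re


def parse(src):
    """Parse integers, + - * / ** and parentheses into nested tuples."""
    tokens = re.findall(r"\d+|\*\*|[-+*/()]", src)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():                      # expr := term (('+'|'-') term)*
        node = term()
        while peek() in ("+", "-"):  # iteration folds operands to the left
            op = advance()
            node = (op, node, term())
        return node

    def term():                      # term := power (('*'|'/') power)*
        node = power()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, power())
        return node

    def power():                     # power := atom ('**' power)?
        base = atom()
        if peek() == "**":           # recursion nests operands to the right
            advance()
            return ("**", base, power())
        return base

    def atom():                      # atom := NUMBER | '(' expr ')'
        tok = advance()
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected )")
            return node
        return int(tok)

    tree = expr()
    if peek() is not None:
        raise SyntaxError("unexpected token " + peek())
    return tree


print(parse("1 - 2 - 3"))
print(parse("2 ** 3 ** 2"))
print(parse("1 + 2 * 3"))
```

so the trouble is only with writing left-recursive rules directly (expr := expr '-' term), which would loop forever in recursive descent; the repetition form sidesteps that. Error handling here is minimal (bad input raises IndexError or ValueError).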
---
so we probably want our grammar to be expressible as an operator precedence grammar; see http://garethrees.org/2011/07/17/grammar/
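one practical payoff of keeping expressions describable by a precedence table is that the expression parser can be driven by that table alone. The sketch below uses precedence climbing, which is NOT Floyd's operator-precedence algorithm itself; it is swapped in here as a simpler illustration of the same table-driven idea. The OPS table and its binding powers are hypothetical.

```python
import re

# Hypothetical operator table: token -> (binding power, associativity)
OPS = {
    "+": (1, "left"), "-": (1, "left"),
    "*": (2, "left"), "/": (2, "left"),
    "**": (3, "right"),
}


def parse_expr(src):
    """Table-driven expression parser producing nested tuples."""
    tokens = re.findall(r"\d+|\*\*|[-+*/()]", src)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def atom():
        tok = advance()
        if tok == "(":
            node = climb(1)
            if advance() != ")":
                raise SyntaxError("expected )")
            return node
        return int(tok)

    def climb(min_prec):
        left = atom()
        # Keep absorbing operators that bind at least as tightly as min_prec.
        while peek() in OPS and OPS[peek()][0] >= min_prec:
            op = advance()
            prec, assoc = OPS[op]
            # Left-assoc: the right operand may only contain tighter operators.
            next_min = prec + 1 if assoc == "left" else prec
            left = (op, left, climb(next_min))
        return left

    tree = climb(1)
    if peek() is not None:
        raise SyntaxError("unexpected token " + peek())
    return tree


print(parse_expr("1 - 2 - 3"))
print(parse_expr("2 ** 3 ** 2"))
```

adding a new binary operator is then a one-line change to OPS rather than a new grammar rule, which is the property we would be buying.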
---
this regex syntax from JS looks good:
const parseExpr = () => /\d/.test(peek()) ? parseNum() : parseOp();
(from [18])
---
effect name ideas: reads, modifies, requires, ensures
---
https://www.python.org/dev/peps/pep-0572/
" In most contexts where arbitrary Python expressions can be used, a named expression can appear. This is of the form NAME := expr ...
sametmax 23 hours ago [-]
I will be happy to be able to do:
while (bytes := io.get(x)):
and:
[bar(x) for z in stuff if (x := foo(z))]
Every time Python adds an expression counterpart to an existing statement (lambdas, intensions, ternary...) there is a (legit) fear it will be abused.
But experience tells that the slow and gradual pace of the language evolution combined with the readability culture of the community don't lead that way.
While we will see code review breaking materials in the wild, I believe that the syntax will mostly be used sparingly, as other features, when the specific needs arise for it.
After all, it's been, as usual, designed with this in mind: "=" and ":=" are mutually exclusive. You don't use them in the same context.
The grammar makes sure of it most of the time, and for the rare ambiguities like:
a = b
vs
(a := b)
The parenthesis will discourage pointless usage.
gshulegaard 12 hours ago [-]
I agree that I would have preferred "as"...but that said I am struggling to think of a reason this is needed.
while (bytes := io.get(x)):
Would currently be written:
bytes = io.get(x)
while bytes:
And likewise:
[bar(x) for z in stuff if (x := foo(z))]
is equivalently:
[bar(foo(z)) for z in stuff if foo(z)]
Perhaps this is just my personal opinion but I don't really think the ":=" (or "as" for that matter) adds much in the way of clarity or functionality. I guess at the end of the day I am neutral about this addition...but if there isn't a clear upside I usually think it's better to have less rather than add more.
Dunnorandom 12 hours ago [-]
The first example would actually be equivalent to something like
while True:
    bytes = io.get(x)
    if not bytes:
        break
    ...
---
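a small runnable check of the two PEP 572 equivalences discussed in the thread above (needs Python 3.8+). The helper names read_chunks_walrus, read_chunks_loop and compare_comprehensions, and the toy foo/bar functions, are made up for this example; io.BytesIO stands in for the thread's io.get(x).

```python
import io


def read_chunks_walrus(stream, size):
    """Loop with the assignment expression in the condition."""
    chunks = []
    while (chunk := stream.read(size)):
        chunks.append(chunk)
    return chunks


def read_chunks_loop(stream, size):
    """The 'while True / break' form the walrus version replaces."""
    chunks = []
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        chunks.append(chunk)
    return chunks


def compare_comprehensions(stuff):
    """Return both comprehension results plus how often foo ran for each."""
    calls = []

    def foo(z):
        calls.append(z)
        return z * 2 if z % 2 else 0   # falsy for even z

    def bar(x):
        return x + 1

    with_walrus = [bar(x) for z in stuff if (x := foo(z))]
    n_walrus = len(calls)
    calls.clear()
    without = [bar(foo(z)) for z in stuff if foo(z)]
    return with_walrus, n_walrus, without, len(calls)


print(read_chunks_walrus(io.BytesIO(b"abcdefg"), 3))
print(read_chunks_loop(io.BytesIO(b"abcdefg"), 3))
print(compare_comprehensions([1, 2, 3]))
```

the two comprehensions give the same list, but the version without := calls foo twice for every element that passes the filter, so they are only equivalent when foo is cheap and has no side effects.
---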
" Parametric type constructors are now always called with the same syntax as they are declared. This eliminates an obscure but confusing corner of language syntax. " [19]
---
mb .symbol instead of :symbol like in clojure, or instead of SYMBOL like i currently have (on the one hand, capitalized words take two extra keypresses; on the other hand, they are not chorded and they are easy to read; on the other hand, my colemak seems to remap 'caps lock' to 'delete', making it hard for me to type capitalized words (except by holding down shift, which is too much chording); on the other hand, there's probably some way to change that; on the other hand, i could use caps lock for escape from viper mode in emacs; on the other hand, i don't do that now, so why worry about it)
---
if we automatically put a ';' at the end of each line, then we can't do stuff like:
x.a()
  .b();
alternatives:
x.a().
  b()
(Python does not do this; a line like x.a(). ending with '.' is a syntax error)
i think i like the third option. Keep lines open for unmatched grouping constructs like parens, but also if it ends in a '.'.
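a minimal sketch of that rule in Python: a physical line is kept open while a grouping construct is unmatched or the line ends in '.', and otherwise an implicit ';' is added. The name logical_lines is made up for this example, and the bracket counting ignores string literals and comments, which a real lexer would have to handle.

```python
OPENERS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(OPENERS.values())


def logical_lines(source):
    """Join physical lines into statements, appending an implicit ';' to each."""
    statements, current, depth = [], [], 0
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        current.append(stripped)
        for ch in stripped:
            if ch in OPENERS:
                depth += 1
            elif ch in CLOSERS:
                depth -= 1
        # Close the statement unless a group is open or the line ends in '.'.
        if depth <= 0 and not stripped.endswith("."):
            statements.append(" ".join(current) + ";")
            current, depth = [], 0
    if current:
        raise SyntaxError("input ended inside an open statement")
    return statements


print(logical_lines("x.a().\n  b()\ny = f(1,\n  2)\nz = 3"))
print(logical_lines("x.a()\n  .b()"))
```

the second call shows the cost of the rule: a leading-dot continuation still gets split into two statements, so method chains would have to be written with the '.' at the end of the line.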
---
in Rust, 'break' can return a value
(also, like many langs, in Rust 'if' is an expression)
---
in Rust, when a variable name is the same as a field name, you can construct a struct succinctly, eg:
" fn build_user(email: String, username: String) -> User { User { email: email, username: username, active: true, sign_in_count: 1, } } "
" fn build_user(email: String, username: String) -> User { User { email, username, active: true, sign_in_count: 1, } } " -- [20]
---
Rust's struct update syntax (that is, creating a second immutable struct instance by copying some of the fields of an older instance and then changing other of the fields) is succinct:
" let user2 = User { email: String::from("another@example.com"), username: String::from("anotherusername567"), ..user1 }; " -- [21]
---
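for comparison with the Rust struct update syntax above, the closest thing i know of in Python is dataclasses.replace on a frozen dataclass: a library call rather than syntax. A minimal sketch; the User fields mirror the Rust example, and the values given to user1 are made up.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class User:
    email: str
    username: str
    active: bool = True
    sign_in_count: int = 1


# user1's values are placeholders for this example.
user1 = User(email="someone@example.com", username="someusername123")

# Copy user1, overriding only the named fields; user1 itself is unchanged.
user2 = replace(user1, email="another@example.com", username="anotherusername567")

print(user2)
```

the Rust form reads better because the "everything else from user1" part is visible as ..user1 at the construction site instead of being the first argument to a helper.
---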
Rust has 'tuple structs' to allow creating distinct named types for tuples (nominative typing), e.g.:
" struct Color(i32, i32, i32); struct Point(i32, i32, i32);
let black = Color(0, 0, 0); let origin = Point(0, 0, 0); " -- [22]
---
" I had some extended notes here about "less-mainstream paradigms" and/or "things I wouldn't even recommend pursuing", but on reflection, I think it's kinda a bummer to draw too much attention to them. So I'll just leave it at a short list: actors, software transactional memory, lazy evaluation, backtracking, memoizing, "graphical" and/or two-dimensional languages, and user-extensible syntax. If someone's considering basing a language on those, I'd .. somewhat warn against it. Not because I didn't want them to work -- heck, I've tried to make a few work quite hard! -- but in practice, the cost:benefit ratio doesn't seem to turn out really well. Or hasn't when I've tried, or in (most) languages I've seen. " [23]
---
i don't think we want variadic functions in Oot, but if i change my mind, see the section 'Arity' in [24].
---
" Another neat feature is positional destructuring, commonly used with vectors....head and tail destructuring of sequences...
[x & xs] [1 2 3]
Here, x is 1 and xs is [2 3]. ... You can also destructure more than one binding before the tail like so:
[x y & zs] [1 2 3]
Here, x is 1, y is 2, and zs is [3]. " [25]
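for comparison, Python spells the same head/tail destructuring with a starred target instead of '&'. A minimal sketch; the variable names just mirror the Clojure example above.

```python
# Head and tail: the starred name collects whatever is left, as a list.
x, *xs = [1, 2, 3]
print(x, xs)

# More than one binding before the tail.
x2, y2, *zs = [1, 2, 3]
print(x2, y2, zs)
```

unlike Clojure's '&', the starred target does not have to come last (first, *middle, last = ... is allowed), which is one more design option to consider.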
---
" {:keys [foo] :or {foo 1} :as m}
foo will have the map’s value for key :foo, or 1 if the map doesn’t have a :foo key, and m is bound to the entire map. " [26]
---
hamstergene on Jan 12, 2015 [-]
Certainly not for that reason. Actually, as a non-native English speaker, I consider unicode identifiers one of the most useless features for a language.
There are reasons why unicode identifiers have never been widely used, and probably will never be, even though they've been around for more than a decade:
1. You're unlikely to be allowed to use them. Unicode identifiers stagger international collaboration. Do you want your Chinese colleague to commit some hieroglyphical identifiers into your code, and then an Arabic colleague to add right-to-left curvatures? I doubt that :) They are unwanted in outsourcing, freelancing, open-source, any company which does foreign hire, they are a problem when asking on StackOverflow, etc.
2. There is just no problem to use English in the first place. Programmers already use English every day. 80-100% of information (documentation, QA, discussion forums) on any programming topic is in English; native-language sources are lacking at best, often nonexistent. If person is a professional programmer, it's way too late for them to have a couple of words translated.
3. They are not as appealing as you might think. Many languages don't work the same way as English. For example, in expression `print(line)` the word `line` may have to have different morphological form than in `var line = ...`, and yet another form for `if '_' in line`; one form for all uses is very unnatural and requires getting used to (and if you're going to adjust anyway, why not adjust to English then).
4. Mixed-language text is simply harder to type when second alphabet is not latin. It's like 3x harder when foreign words need to be inserted. I have even seen colleagues discussing code in chat using English just for that reason. And since at least language keywords and standard libraries are already using English, mixed is what it gonna be. A person must have some real serious trouble with English to tolerate that.
Even in strictly local projects where all comments are in native language, unicode identifiers are rarely used.
P.S. When I was reading Swift book about unicode identifiers, I immediately thought: "first paragraph of every Swift coding conventions on the planet is going to be, don't use unicode identifiers".
---
ausjke 30 days ago [-]
I'm learning Go, just a naive question, why does Go put the variable type at the end of declaration, is this an absolute need? no other widely usage language does that, and it just feels odd to me.
klodolph 30 days ago [-]
C++ is the weird one out
mmastrac 30 days ago [-]
Pascal did it:
procedure SetColor(const NewColor: TColor; const NewFPColor: TFPColor); virtual;
Rust does it too:
fn do_twice(f: fn(i32) -> i32, arg: i32) -> i32
Go is similar, but less symbols:
f func(func(int,int) int, int) func(int, int) int
matt_kantor 30 days ago [-]
Also Scala, Haskell, TypeScript