proj-plbook-plChCtmLangs

Table of Contents for Programming Languages: a survey

C

Because it is so well-known and well-liked, C gets its own chapter.

See chapter on C.

C++

"In C++’s case, the Big Idea was high level object orientation without loss of performance compared to C." [1] (my note: and backwards compatibility with C; and close-to-the-metal low-level operation)

" C++ is a general-purpose programming language with a bias towards systems programming that

    is a better C
    supports data abstraction
    supports object-oriented programming
    supports generic programming " -- http://www.stroustrup.com/C++11FAQ.html#aims

" ...C++’s fundamental strengths:

Attributes:

Pros:

Cons:

Good for:

Best practices and style guides:

References:

Retrospectives:

People:

Tours and tutorials:

Books:

Comparisons:

Examples of good code:

Misc:

Types:

C++ Features

References

References are kind of like pointers, but with a few differences. Some of these are:

For example, say you want to write a '++' operator for a type that doesn't have one. You have to use references for two reasons:

Links:

C++ Opinions

Favorite parts:

Critical:

Other:

    def adder(amount):
        return lambda x: x + amount

C++

    auto adder(int amount) {
        return [=](int x){ return x + amount; };
    }"

about the [=]: "turns out there are various forms of Lambda (of course there wouldn't just be one in C++), for ex:

    [a, &b], [this], [=], [&], []

Depending on how the variables are captured. " "The stuff in square brackets is the capture list. Unlike python, which implicitly captures the full outer scope by reference, C++ allows/requires you to specify which outer variables you want to reference in the lambda, and whether they are captured by value ("=") or reference ("&"). This allows C++ to implement the lambda with a simple static function in most cases, where python needs a bunch of tracking data to preserve the outer stack from collection." [26]
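For comparison with the Rust section later on this page, here is a minimal sketch of the same adder in Rust. It is not part of the quoted discussion; the function names are made up for illustration. A `move` closure captures by value, roughly like `[=]`, while a closure without `move` borrows what it uses, roughly like `[&]`.

```rust
// Rust counterpart of the Python and C++ `adder` examples above.
// `move` captures `amount` by value, much like `[=]` in the C++ version.
fn adder(amount: i32) -> impl Fn(i32) -> i32 {
    move |x| x + amount
}

// Without `move`, the closure borrows `count` mutably (by reference),
// which is closer to a `[&]` capture in C++.
fn count_evens(values: &[i32]) -> usize {
    let mut count = 0;
    values.iter().for_each(|v| {
        if v % 2 == 0 {
            count += 1;
        }
    });
    count
}

fn main() {
    let add_five = adder(5);
    println!("{}", add_five(10));
    println!("{}", count_evens(&[1, 2, 3, 4]));
}
```

The capture mode is checked by the borrow checker, so a by-reference closure cannot outlive the variables it borrows.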

Why it compiles slow:

C++ Gotchas

"the major insecurities in C++ code:

C++ Popular libraries outside of stdlib

C++ tools

Linters:

Formatters:

D

Tutorials:

Pros:

Opinions:

BetterC variant

runtimeless restricted variant of D, "BetterC":

https://dlang.org/spec/betterc.html

https://dlang.org/blog/2017/08/23/d-as-a-better-c/

" Retained Features

Nearly the full language remains available. Highlights include:

    Unrestricted use of compile-time features
    Full metaprogramming facilities
    Nested functions, nested structs, delegates and lambdas
    Member functions, constructors, destructors, operator overloading, etc.
    The full module system
    Array slicing, and array bounds checking
    RAII (yes, it can work without exceptions)
    scope(exit)
    Memory safety protections
    Interfacing with C++
    COM classes and C++ classes
    assert failures are directed to the C runtime library
    switch with strings
    final switch
    unittest

...

Unavailable Features

D features not available with BetterC:

    Garbage Collection
    TypeInfo and ModuleInfo
    Classes
    Built-in threading (e.g. core.thread)
    Dynamic arrays (though slices of static arrays work) and associative arrays
    Exceptions
    synchronized and core.sync
    Static module constructors or destructors

"

"What may be initially most important to C programmers is memory safety in the form of array overflow checking, no more stray pointers into expired stack frames, and guaranteed initialization of locals. This is followed by what is expected in a modern language — modules, function overloading, constructors, member functions, Unicode, nested functions, dynamic closures, Compile Time Function Execution, automated documentation generation, highly advanced metaprogramming, and Design by Introspection." [49]

Rust

"Rust combines low-level control over performance with high-level convenience and safety guarantees. Better yet, it achieves these goals without requiring a garbage collector or runtime, making it possible to use Rust libraries as a “drop-in replacement” for C....What makes Rust different from other languages is its type system, which represents a refinement and codification of “best practices” that have been hammered out by generations of C and C++ programmers." -- [50]

"Rust is a language that allows you to build high level abstractions, but without giving up low-level control – that is, control of how data is represented in memory, control of which threading model you want to use etc. Rust is a language that can usually detect, during compilation, the worst parallelism and memory management errors (such as accessing data on different threads without synchronization, or using data after they have been deallocated), but gives you a hatch escape in the case you really know what you’re doing. Rust is a language that, because it has no runtime, can be used to integrate with any runtime; you can write a native extension in Rust that is called by a program node.js, or by a python program, or by a program in ruby, lua etc. and, however, you can script a program in Rust using these languages." -- Elias Gabriel Amaral da Silva

"...what Rust represents:

Pros:

Tutorials and books:

Best practices and tips:


Old/probably out of date but otherwise look good:

Features:

" some of my favorite (other) parts about the language, in no particular order.

Native types in Rust are intelligently named: i32, u32, f32, f64, and so on. These are, indisputably, the correct names for native types.

Destructuring assignment: awesome. All languages which don’t have it should adopt it. Simple example:

    let ((a, b), c) = ((1, 2), 3); // a, b, and c are now bound

    let (d, (e, f)) = ((4, 5), 6); // compile-time error: mismatched types

More complex example:

    // What is a rectangle?
    struct Rect { origin: (u32, u32), size: (u32, u32) }

    // Create an instance
    let r1 = Rect { origin: (1, 2), size: (6, 5) };

    // Now use destructuring assignment
    let Rect { origin: (x, y), size: (width, height) } = r1;
    // x, y, width, and height are now bound

The match keyword: also awesome, basically a switch with destructuring assignment.

Like if, match is an expression, not a statement, so it can be used as an rvalue. But unlike if, match doesn’t suffer from the phantom-else problem. The compiler uses type-checking to guarantee that the match will match something — or speaking more precisely, the compiler will complain and error out if a no-match situation is possible. (You can always use a bare underscore _ as a catchall expression.)

The match keyword is an excellent language feature, but there are a couple of shortcomings that prevent my full enjoyment of it. The first is that the only way to express equality in the destructured assignments is to use guards. That is, you can’t do this:

    match (1, 2, 3) {
        (a, a, a) => "equal!",
        _ => "not equal!",
    }

Instead, you have to do this:

    match (1, 2, 3) {
        (a, b, c) if a == b && b == c => "equal!",
        _ => "not equal!",
    }

Erlang allows the former pattern, which makes for much more succinct code than requiring separate assignments for things that end up being the same anyway. It would be handy if Rust offered a similar facility.

My second qualm with Mr. Match Keyword is the way the compiler determines completeness. I said before “the compiler will complain and error out if a no-match situation is possible,” but it would be better if that statement read if and only if. Rust uses something called Algebraic Data Types to analyze the match patterns, which sounds fancy and I only sort-of understand it. But in its analysis, the compiler only looks at types and discrete enumerations; it cannot, for example, tell whether every possible integer value has been considered. This construction, for instance, results in a compiler error:

    match 100 {
        y if y > 0 => "positive",
        y if y == 0 => "zero",
        y if y < 0 => "negative",
    };

The pattern list looks pretty exhaustive to me, but Rust wouldn’t know it. I’m sure that someone who is versed in type theory will send me an email explaining how what I want is impossible unless P=NP, or something like that, but all I’m saying is, it’d be a nice feature to have. Are “Algebraic Data Values” a thing? They should be.

It’s a small touch, but Rust lets you nest function definitions, like so:

    fn a_function() {
        fn b_function() {
            fn c_function() { }
            c_function(); // works
        }
        b_function(); // works
        c_function(); // error: unresolved name `c_function`
    }

Neat, huh? With other languages, I’m never quite sure where to put helper functions. ... Mutability rules: also great. Variables are immutable by default, and mutable with the mut keyword. It took me a little while to come to grips with the mutable reference operator &mut, but &mut and I now have a kind of respectful understanding, I think. Data, by the way, inherits the mutability of its enclosing structure. This is in contrast to C, where I feel like I have to write const in about 8 different places just to be double-extra sure, only to have someone else’s cast operator make a mockery of all my precautions.

Functions in Rust are dispatched statically, if the actual data type is known at compile-time, or dynamically, if only the interface is known. (Interfaces in Rust are called “traits”.) As an added bonus, there’s a feature called “type erasure” so you can force a dynamic dispatch to prevent compiling lots of pointlessly specialized functions. This is a good compromise between flexibility and performance, while remaining more or less transparent to the typical user.

Is resource acquisition the same thing as initialization? I’m not sure, but C++ programmers will appreciate Rust’s capacity for RAII-style programming, the happy place where all your memory is freed and all your file descriptors are closed in the course of normal object deallocation. You don’t need to explicitly close or free most things in Rust, even in a deferred manner as in Go, because the Rust compiler figures it out in advance. The RAII pattern works well here because (like C++) Rust doesn’t have a garbage collector, so you won’t have open file descriptors floating around all week waiting for Garbage Pickup Day. " [57]
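A note on the exhaustiveness complaint in the quote above: the guard-based `match 100` is still rejected, because guards are not taken into account when checking exhaustiveness. Newer compilers do, however, check integer range patterns. The sketch below assumes Rust 1.33 or later; the function names `sign` and `all_equal` are made up for illustration and do not come from the quoted article.

```rust
// Range patterns let the compiler verify that every i32 value is covered,
// so no `_` catch-all arm is needed here.
fn sign(n: i32) -> &'static str {
    match n {
        i32::MIN..=-1 => "negative",
        0 => "zero",
        1..=i32::MAX => "positive",
    }
}

// Equality between bound variables still has to be written as a guard,
// and a guarded arm always needs a fallback arm after it.
fn all_equal(t: (i32, i32, i32)) -> &'static str {
    match t {
        (a, b, c) if a == b && b == c => "equal!",
        _ => "not equal!",
    }
}

fn main() {
    println!("{}", sign(100));
    println!("{}", all_equal((1, 2, 3)));
}
```

Removing any one of the three range arms in `sign` turns the function into a compile-time error, which is the "if and only if" behaviour the quote asks for, at least for integer ranges.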

Opinions:

Example of: " For example, this code:

    fn bar() -> i32 { 5 }

    fn foo() -> &'static i32 { &bar() }

gives this error:

    error[E0716]: temporary value dropped while borrowed
     --> src/lib.rs:6:6
      |
    6 |     &bar()
      |      ^^^^^ creates a temporary which is freed while still in use
    7 | }
      | - temporary value is freed at the end of this statement
      |
      = note: borrowed value must be valid for the static lifetime...

bar() produces a value, and so &bar() would produce a reference to a value on foo()‘s stack. Returning it would be a dangling pointer. In a system without automatic memory management, this would cause a dangling pointer. " -- [77]
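Two common ways to make code of this shape compile are sketched below. This is an illustration and not part of the quoted source; `foo_owned`, `foo_static` and `FIVE` are hypothetical names.

```rust
fn bar() -> i32 {
    5
}

// Option 1: return the value itself, so ownership moves to the caller
// and no reference into `foo`'s stack frame is ever created.
fn foo_owned() -> i32 {
    bar()
}

// Option 2: return a reference to data that really does live for 'static.
static FIVE: i32 = 5;

fn foo_static() -> &'static i32 {
    &FIVE
}

fn main() {
    println!("{} {}", foo_owned(), foo_static());
}
```

Which option fits depends on whether the caller needs a fresh value or a shared, program-lifetime one.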

    fn x() -> Vec<i32> {
        let mut x = (0..3).collect();
        x.sort(); // Calling any method of Vec
        x // Cannot infer that `x` is `Vec<i32>` because a method was called
    }
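The snippet above is rejected because `collect` can build many collection types and the method call needs the type to be known at that point. A minimal sketch of the two usual fixes follows; the function names are made up, and the input is reversed only so that the sort visibly does something.

```rust
// Fix 1: annotate the binding, so `collect` knows which collection to build
// before the method call is resolved.
fn sorted_annotated() -> Vec<i32> {
    let mut x: Vec<i32> = (0..3).rev().collect();
    x.sort();
    x
}

// Fix 2: the turbofish form states the same type on the call itself.
fn sorted_turbofish() -> Vec<i32> {
    let mut x = (0..3).rev().collect::<Vec<i32>>();
    x.sort();
    x
}

fn main() {
    println!("{:?} {:?}", sorted_annotated(), sorted_turbofish());
}
```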

Comparisons:

Dev blogs:

Best practices:

Retrospectives:

Gotchas:

    enum MyValue { Digit(i32) }

    fn main() { let x = MyValue::Digit(10); let y = x; let z = x; }

The reason is that z might (later) mess with x, leaving y in an invalid state (a consequence of Rust’s strict memory checking — more on that later). Fair enough. But then changing the top of the file to:

    #[derive(Copy, Clone)] enum MyValue { Digit(i32) }

makes the program compile. Now, instead of binding y to the value represented by x, the assignment operator copies the value of x to y. So z can have no possible effect on y, and the memory-safety gods are happy.

To me it seems a little strange that a pragma on the enum affects its assignment semantics. It makes it difficult to reason about a program without first reading all the definitions. The binding-copy duality, by the way, is another artifact of Rust’s mixed-paradigm heritage. Whereas functional languages tend to use bindings to map variable names to their values in the invisible spirit world of no-copy immutable data, imperative languages take names at face value, and assignment is always a copy (copying a pointer, perhaps, but still a copy). By attempting to straddle both worlds, Rust rather inelegantly overloads the assignment operator to mean either binding or copying. " [89] this is partially addressed in [90]
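In current Rust terminology, `let y = x;` moves `x` when the type is not `Copy`, and copies it when it is. The sketch below puts the two behaviours side by side; the type and function names are hypothetical and not taken from the quoted article.

```rust
// With `Copy`, assignment duplicates the value and the source stays usable.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CopyValue {
    Digit(i32),
}

// Without `Copy`, assignment moves the value; duplication must be explicit.
#[derive(Debug, Clone, PartialEq)]
enum MoveValue {
    Digit(i32),
}

fn copy_twice(x: CopyValue) -> (CopyValue, CopyValue) {
    let y = x; // copies x
    let z = x; // x is still valid here, so this copies it again
    (y, z)
}

fn clone_then_move(a: MoveValue) -> (MoveValue, MoveValue) {
    let b = a.clone(); // explicit duplicate
    let c = a; // moves a; using `a` after this line would not compile
    (b, c)
}

fn main() {
    println!("{:?}", copy_twice(CopyValue::Digit(10)));
    println!("{:?}", clone_then_move(MoveValue::Digit(10)));
}
```

So the derive does change assignment semantics, as the quote says, but the change is visible in the type definition and the non-`Copy` case always requires an explicit `.clone()`.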

Rust Features

Ownership types

Links:

Older discussions (Rust has probably changed a lot since these):

Lifetimes (related to ownership types)

Procedural Macros

Rust custom derive and procedural macros: https://doc.rust-lang.org/book/procedural-macros.html

Rust Internals and implementations

Core data structures: todo

Number representations

Integers

Floating points todo

array representation

variable-length lists: todo

multidimensional arrays: todo

limits on sizes of the above

string representation

Representation of structures with fields

Rust tests

Rust variants

The core language in the RustBelt paper:

http://delivery.acm.org/10.1145/3160000/3158154/popl18-p202.pdf?ip=209.134.92.9&id=3158154&acc=OPEN&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&__acm__=1517989892_1c96ad8d14150a8f14d214e5ae345f1b

Rust Links

Objective-C

Attributes:

Pros:

Used in:

tutorials:

Best practices and style guides:

Opinions:


Less popular close-to-the-metal languages

Nim

http://nim-lang.org/

Formerly Nimrod.

" Nim (formerly known as "Nimrod") is a statically typed, imperative programming language that tries to give the programmer ultimate power without compromises on runtime efficiency. This means it focuses on compile-time mechanisms in all their various forms.

Beneath a nice infix/indentation based syntax with a powerful (AST based, hygienic) macro system lies a semantic model that supports a soft realtime GC on thread local heaps. Asynchronous message passing is used between threads, so no "stop the world" mechanism is necessary. An unsafe shared memory heap is also provided for the increased efficiency that results from that model. "

" you could use Nim[1] which compiles to C. If you, like myself, dislike C, Nim can be used as a "better C" (you can disable the Nim GC and forget about its standard library, then use Nim as a thin layer over C with modules, generics and other nice goodies) " [96]

Tutorials

Opinions and comparisons

PLC languages

There are a few standard languages for Programmable Logic Controllers (a type of simple processor used for industrial automation). The relevant standard is IEC 61131-3. The languages are:

Links:

Quaint

Quaint is a small, clean descendant of C with first-class resumable functions. "The main goal of Quaint is to demonstrate the viability of a new idiom for "sane" concurrent programming that bears resemblance to promises and generators". It is implemented via a VM ('QVM').

Some things that it attempts to do better than C:

Features

Concurrency model:

" In essence, every function can be called either synchronously (just as in C) or asynchronously via a quaint ((promise)) object. The called function does not have any explicit "yield" statements, but merely only optional "wait labels" that mark eventual suspension points in which the caller might be interested in. The caller decides whether to wait with a timeout or wait until reaching a certain "wait label". The caller can also query the quaint to see whether it has lastly passed a particular wait label. The callee can have optional noint blocks of statements during which it cannot be interrupted even if the caller is waiting with a timeout and that timeout has elapsed.

Quaint is entirely single-threaded – asynchronous functions run only when the caller is blocked on a wait statement or on a "run-till-end & reap-value" operator. During all other times, the resumable function is in a "frozen" state which holds its execution context and stack. "

-- https://github.com/bbu/quaint-lang

Three loop statements:

Variable declaration syntax: eg "a: byte = 0;". Note: ':' is the 'typecast' operator.

eg

a: int; // int a
p: ptr(int); // int *p
q: ptr(ptr(long)); // long **q
arr: int[20]; // int arr[20]
arrp: ptr[20](int); // int *arrp[20]
parr: ptr(int[20]); // int (*parr)[20]
fp: fptr(a: int, b: int): long; // long (*fp)(int a, int b)

Function syntax example:

f1(a: uint, b: uint): ulong
{
    const addend: ulong = 5:ulong;
    return a:ulong + b:ulong + addend;
}

" User-defined types can be defined at file scope via the type statement:

type mytype: struct(
    member_one: ptr(quaint(byte)),
    member_two: vptr,
    member_three: u32,
    member_four: fptr(x: int[12]): u64
);

"

Types:

Operators:

binary:

AND, OR
^ (bitwise AND, OR, XOR)

unary:

Built-in functions:

i)(8163264), pnl

Concurrency model:

Promises ("quaints"). A promise can be null. The type of a promise contains its return value and is written eg 'quaint(int)' (a promise with no return is of type 'quaint()'). A promise is initialized from a function invocation expression with the '~' operator; this creates it but does not start/run it. If '~' is applied to a value rather than a function, it creates a (conceptually) 0-ary function returning that value.

The '@' query operator takes a promise and a label and returns 1 if the promise has passed that label, or 0 otherwise. The label reference syntax is eg 'my_function::a_label'; the @ syntax is eg 'q@my_function::a_label'. @ applied to a null promise returns 0.

A label has the syntax eg '[label]'. Duplicate wait labels are allowed in a function; if either of them is passed, the label 'fires'. There are two special built-in labels: 'start' and 'end'.

The 'wait' operator runs a promise, suspending the caller (everything is single-threaded here). Optionally, 'wait' can suspend the promise and return control to the caller when a condition is met; the condition can either be a timeout, or whether the promise has passed a label (see also '@', above), or whether the promise is itself waiting on something marked 'blocking' (not yet implemented). Wait does not return anything; it only has side-effects.

The '*' operator 'reaps' a promise, deallocating it and returning its return value, after first 'waiting' until it completes. "This operator can be thought of as a counterpart of the ~ quantify operator. The former allocates the quaint, the latter releases it. The VM does not release any orphaned quaints, so a failure to release the quaint via * causes a memory leak in your program." "When applied over a null quaint, * returns a zero value of the appropriate subtype or nothing in case of a void subtype."

'noint' (nointerrupt) blocks (noint {...}) may be used to indicate that some portion of callee code may not be preempted (suspended due to the condition of the caller's 'wait' being met). This is similar to a 'critical section' (although Quaint is single-threaded). "This is usually useful when modifying some global state variables which may be shared with the caller:"
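The operators described above can be modelled roughly in Rust. This is a much-simplified sketch and not how QVM is implemented: the callee is represented as a flat list of steps, timeouts and noint blocks are left out, and every name (`Step`, `Quaint`, `wait_until`, `passed`, `reap`) is hypothetical.

```rust
// One unit of callee execution: either some work or a wait label.
#[derive(Clone, Copy)]
enum Step {
    Work(fn(&mut i64)),
    Label(&'static str),
}

// A frozen, resumable callee: its code, its position, and its state.
struct Quaint {
    steps: Vec<Step>,
    next: usize,               // index of the next step to run
    passed: Vec<&'static str>, // wait labels passed so far
    state: i64,                // stands in for the callee's stack and result
}

impl Quaint {
    // Like `~f(...)`: create the promise without running any of it.
    fn new(steps: Vec<Step>, initial: i64) -> Self {
        Quaint { steps, next: 0, passed: Vec::new(), state: initial }
    }

    // Like `q@f::label`: has the callee passed this label yet?
    fn passed(&self, label: &str) -> bool {
        self.passed.iter().any(|l| *l == label)
    }

    // Run exactly one step of the callee.
    fn step(&mut self) {
        let step = self.steps[self.next];
        match step {
            Step::Work(f) => f(&mut self.state),
            Step::Label(name) => self.passed.push(name),
        }
        self.next += 1;
    }

    // Like `wait` with a label condition: run the callee until it passes
    // the label or finishes, then return control to the caller.
    fn wait_until(&mut self, label: &str) {
        while self.next < self.steps.len() && !self.passed(label) {
            self.step();
        }
    }

    // Like `*q`: run to completion, release the promise, return its value.
    fn reap(mut self) -> i64 {
        while self.next < self.steps.len() {
            self.step();
        }
        self.state
    }
}

fn add_one(s: &mut i64) {
    *s += 1;
}

fn times_ten(s: &mut i64) {
    *s *= 10;
}

fn main() {
    let steps = vec![Step::Work(add_one), Step::Label("halfway"), Step::Work(times_ten)];
    let mut q = Quaint::new(steps, 1);
    q.wait_until("halfway");
    println!("passed halfway: {}", q.passed("halfway"));
    println!("result: {}", q.reap());
}
```

Because `reap` takes the promise by value, forgetting to release it is not possible in this model; in Quaint itself, per the quote above, an unreaped quaint leaks.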

Currently lacking features compared to C (almost a direct quote from the README):

Future directions:

"

Anti-features (things that will NOT be added):

Implementation details

QVM virtual machine main addressing modes (from [102]): "

QVM virtual machine instruction set (from https://github.com/bbu/quaint-lang/blob/master/src/codegen.h ):

AST

AST node types (from https://github.com/bbu/quaint-lang/blob/master/src/ast.h ):

Zig

http://ziglang.org/

"

    Manual memory management. Memory allocation failure is handled correctly. Edge cases matter!
    Zig competes with C instead of depending on it. The Zig Standard Library does not depend on libc.
    Small, simple language. Focus on debugging your application rather than debugging your knowledge of your programming language.
    A fresh take on error handling that resembles what well-written C error handling looks like, minus the boilerplate and verbosity.
    Debug mode optimizes for fast compilation time and crashing with a stack trace when undefined behavior would happen.
    ReleaseFast mode produces heavily optimized code. What other projects call "Link Time Optimization" Zig does automatically.
    ReleaseSafe mode produces optimized code but keeps safety checks enabled. Disable safety checks in the bottlenecks of your code.
    Generic data structures and functions.
    Compile-time reflection and compile-time code execution.
    Import .h files and directly use C types, variables, and functions.
    Export functions, variables, and types for C code to depend on. Automatically generate .h files.
    Nullable type instead of null pointers.
    Order independent top level declarations.
    Friendly toward package maintainers. Reproducible build, bootstrapping process carefully documented. Issues filed by package maintainers are considered especially important.
    Cross-compiling is a first-class use case.
    No preprocessor. Instead Zig has a few carefully designed features that provide a way to accomplish things you might do with a preprocessor."

" Why Zig When There is Already CPP, D, and Rust?

No hidden control flow

If Zig code doesn't look like it's jumping away to call a function, then it isn't. This means you can be sure that the following code calls only foo() and then bar(), and this is guaranteed without needing to know the types of anything:

    var a = b + c.d;
    foo();
    bar();

...

No hidden allocations

More generally, have a hands-off approach when it comes to heap allocation. There is no new keyword or any other language feature that uses a heap allocator (e.g. string concatenation operator[1]).

...

First-class support for no standard library

Zig has an entirely optional standard library that only gets compiled into your program if you use it. Zig has equal support for either linking against libc or not linking against it ...

A Portable Language for Libraries ...

A Package Manager and Build System for Existing Projects ...

Simplicity

Zig has no macros and no metaprogramming, yet still is powerful enough to express complex programs in a clear, non-repetitive way. Even Rust which has macros special cases fmt.Print!, rather than just a simple function. Meanwhile in Zig, the equivalent function is implemented in the standard library with no sort of meta programming/macros.

When you look at Zig code, everything is a simple expression or a function call. There is no operator overloading, property methods, runtime dispatch, macros, or hidden control flow. Zig is going for all the beautiful simplicity of C, minus the pitfalls. " -- https://github.com/ziglang/zig/wiki/Why-Zig-When-There-is-Already-CPP,-D,-and-Rust%3F

---

Virgil

http://compilers.cs.ucla.edu/virgil/overview.html

Types:

" Virgil provides three basic primitive types. These primitive types are value types, and quantities of these types are never passed by reference.

    int - a signed, 32-bit integer type with arithmetic operations
    char - an 8-bit quantity for representing ASCII characters
    boolean - a true / false type for representing conditions

Additionally, Virgil provides the array type constructor [] that can construct the array type T[] from any type T. ... unlike Java, Virgil arrays are not objects, and are not covariantly typed [1]. "

OOP: " Virgil is a class-based language that is most closely related to Java, C++, and C#. Like Java, Virgil provides single inheritance between classes "

" Compile-time Initialization

The most significant feature of Virgil that is not in other mainstream languages is the concept of initialization time. To avoid the need for a large runtime system that dynamically manages heap memory and performs garbage collection, Virgil does not allow applications to allocate memory from the heap at runtime. Instead, the Virgil compiler allows the application to run initialization routines at compilation time, while the program is being compiled. ... When the application's initialization routines terminate... The reachable heap is then compiled directly into the program binary and is immediately available to the program at runtime. "
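A loosely related idea can be sketched in Rust, where a `const fn` evaluated in a `static` initializer runs during compilation and its result is embedded in the binary. This is only an analogy to Virgil's initialization time, not a description of the Virgil compiler, and the names `build_squares`, `SQUARES` and `N` are made up for illustration.

```rust
const N: usize = 8;

// Runs at compile time when used in a `static` or `const` initializer.
const fn build_squares() -> [u32; N] {
    let mut table = [0u32; N];
    let mut i = 0;
    while i < N {
        table[i] = (i * i) as u32;
        i += 1;
    }
    table
}

// The finished table is part of the program image; nothing is allocated
// on the heap at run time to produce it.
static SQUARES: [u32; N] = build_squares();

fn main() {
    println!("{:?}", SQUARES);
}
```

Unlike Virgil as described above, Rust's compile-time evaluation is limited to what `const fn` allows, and it does not snapshot a whole object heap.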

first-class functions ('delegates'):

" Delegates

which is a first-class value that represents a reference to a method. A delegate in Virgil may be bound to a component method or to an instance method of a particular object; either kind can be used interchangeably, provided the argument and return types match. Delegate types in Virgil are declared using the function type constructor. For example, function(int): int represents the type of a function that takes a single integer as an argument and returns an integer. ... Unlike C#, the use of delegate values, either creation or application, does not require allocating memory from the heap. Delegates are implemented as a tuple of a pointer to the bound object and a pointer to the delegate's code.
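The "tuple of a pointer to the bound object and a pointer to the delegate's code" can be illustrated with a small Rust sketch. It is an analogy under stated assumptions, not Virgil's implementation: `Counter` and `Delegate` are hypothetical types, and the delegate is fixed to the signature int to int.

```rust
struct Counter {
    total: i32,
}

impl Counter {
    fn add(&self, n: i32) -> i32 {
        self.total + n
    }

    fn scale(&self, n: i32) -> i32 {
        self.total * n
    }
}

// A delegate as a pair: (pointer to the bound object, pointer to the code).
// Both fields are plain pointers, so creating one needs no heap allocation.
#[derive(Clone, Copy)]
struct Delegate<'a> {
    object: &'a Counter,
    code: fn(&Counter, i32) -> i32,
}

impl<'a> Delegate<'a> {
    // Applying the delegate calls the code with the bound object.
    fn call(&self, arg: i32) -> i32 {
        (self.code)(self.object, arg)
    }
}

fn main() {
    let c = Counter { total: 10 };
    let add = Delegate { object: &c, code: Counter::add };
    let scale = Delegate { object: &c, code: Counter::scale };
    println!("{} {}", add.call(5), scale.call(3));
}
```

Any method with a matching argument and return type can be bound this way, which mirrors the interchangeability the quote describes.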


Footnotes:

1.