proj-oot-ootNotes15


" things rust shipped without Jul. 3rd, 2015 08:26 am graydon2: Well-known things I'm very proud that rust shipped 1.0 without:

    null pointers
    array overruns
    data races
    wild pointers
    uninitialized, yet addressable memory
    unions that allow access to the wrong field

Less-well-known things I'm very proud that rust shipped 1.0 without:

    a shared root namespace
    variables with runtime "before main" static initialization (the .ctors section)
    a compilation model that relies on textual inclusion (#include) or textual elision (#ifdef)
    a compilation model that relies on the order of declarations (possible caveat: macros)
    accidental identifier capture in macros
    random-access strings
    UTF-16 or UCS-2 support anywhere outside windows API compatibility routines
    signed character types
    (hah! vertical tab escapes (as recently discussed) along with the escapes for bell and form-feed)
    "accidental octal" from leading zeroes
    goto (not even as a reserved word)
    dangling else (or misgrouped control structure bodies of any sort)
    case fallthrough
    a == operator you can easily typo as = and still compile
    a === operator, or any set of easily-confused equality operators
    silent coercions between boolean and anything else
    silent coercions between enums and integers
    silent arithmetic coercions, promotions
    implementation-dependent sign for the result of % with negative dividend
    bitwise operators with lower precedence than comparison operators
    auto-increment operators
    a poor-quality default hash function
    pointer-heavy default containers

Next time you're in a conversation about language design and someone sighs, shakes their head and tells you that sad legacy design choices are just the burden of the past and we're helpless to avoid repeating them, try to remember that this is not so. " -- http://graydon2.dreamwidth.org/218040.html

discussion: https://news.ycombinator.com/item?id=9827051


" Packages are Ciao files that contain syntax and compilation rules and that are loaded by the compiler as plugins and unloaded when compilation finishes. Packages only modify the syntax and semantics of the module from where they are loaded, and therefore, other modules can use packages introducing incompatible syntax/semantics without clashing. " -- http://clip.dia.fi.upm.es/papers/hermenegildo11:ciao-design-tplp.pdf


what does this mean?:

" Semantically, the extension is related to logic-functional languages like Curry (Hanus et al. ) but relies on flattening and resolution, using freeze/2 for lazy evaluation, instead of narrowing. " -- http://clip.dia.fi.upm.es/papers/hermenegildo11:ciao-design-tplp.pdf

---

---

andrewchambers 17 hours ago

The thing I like about clojurescript is the fact that it has a linker and library system which makes sense.

reply


fact(N) := N=0 ? 1
         | N>0 ? N * fact(--N).

fact is written using a disjunction (marked by “|”) of guards (delimited by “ ? ”), which together commit the system to the first matching choice.

?- use_package(functional).
?- X = ~nrev([1,2,3]).
X = [3,2,1]
?- [3,2,1] = ~nrev(X).
X = [1,2,3]

Since, in general, functional notation is just syntax, and thus, no directionality is implied, the second query to nrev/2 just instantiates its argument.

---

:- module(someprops, _, [functional, hiord]).
color := red | blue | green.
list := [] | [_|list].
list_of(T) := [] | [~T|list_of(T)].
sorted := [] | [_].
sorted([X,Y|Z]) :- X @< Y, sorted([Y|Z]).

:- module(someprops, _, []).
:- use_module(engine(hiord_rt)).
color(red). color(blue). color(green).
list([]). list([_|T]) :- list(T).
list_of(_, []). list_of(T, [X|Xs]) :- call(T, X), list_of(T, Xs).
sorted([]). sorted([_]). sorted([X,Y|Z]) :- X @< Y, sorted([Y|Z]).

Fig. 3. Examples in Ciao functional notation and state of translation after applying the functional and hiord packages.

-- http://clip.dia.fi.upm.es/papers/hermenegildo11:ciao-design-tplp.pdf

todo: i dont understand:

list_of(T) := [] | [~T|list_of(T)].

:- use_module(engine(hiord_rt)).
list_of(_, []). list_of(T, [X|Xs]) :- call(T, X), list_of(T, Xs).

---

" Other logic programming flavors: alternatively to the above, by not loading the classic Prolog package(s) the user can restrict a given module to use only pure logic programming, without any of Prolog’s impure features. " -- http://clip.dia.fi.upm.es/papers/hermenegildo11:ciao-design-tplp.pdf

---

Additional computation rules: in addition to the usual depth-first, left-to-right execution of Prolog, other computation rules such as breadth-first, iterative deepening, tabling (see later), and the Andorra model are available, again by loading suitable packages. This has proved particularly useful when teaching, since it allows postponing the introduction of the (often useful in practice) quirks of Prolog (see the slides of a course starting with pure logic programming and breadth-first search in http://www.cliplab.org/logalg).

---

" I lovingly reused features from many languages. (I suppose a Modernist would say I stole the features, since Modernists are hung up about originality.) Whatever the verb you choose, I've done it over the course of the years from C, sh, csh, grep, sed, awk, Fortran, COBOL, PL/I, BASIC-PLUS, SNOBOL, Lisp, Ada, C++, and Python. To name a few. To the extent that Perl rules rather than sucks, it's because the various features of these languages ruled rather than sucked.

But note something important here. I left behind more than I took. A lot more. In modern terms, there was a lot of stuff that sucked. Now, on the feature set issue, Perl is always getting a lot of bad press.

I think people who give bad press to Perl's feature set should have more angst about their reporting.

I picked the feature set of Perl because I thought they were cool features. I left the other ones behind because I thought they sucked.

More than that, I combined these cool features in a way that makes sense to me as a postmodern linguist, not in a way that makes sense to the typical Modernistic computer scientist. Recall that the essence of Modernism is to take one cool idea and drive it into the ground. It's not difficult to look at computer languages and see which ones are trying to be modern by driving something into the ground. Think about Lisp, and parentheses. Think about Forth, and stack code. Think about Prolog, and backtracking. Think about Smalltalk, and objects. (Or if you don't want to think about Smalltalk, think about Java, and objects.)

Think about Python, and whitespace. Hi, Guido.

Or think about shell programming, and reductionism. How many times have we heard the mantra that a program should do one thing and do it well? " -- http://www.wall.org/~larry/pm.html

---

---

" MMIX versus reality. A person who understands the rudiments of MMIX programming has a pretty good idea of what today's general-purpose computers can do easily; MMIX is very much like all of them. But MMIX has been idealized in several ways, partly because the author has tried to design a machine that is somewhat "ahead of its time" so that it won't become obsolete too quickly. Therefore a brief comparison between MMIX and the computers actually being built at the turn of the millennium is appropriate. The main differences between MMIX and those machines are:

• Commercial machines do not ignore the low-order bits of memory addresses, as MMIX does when accessing M8[A]; they usually insist that A be a multiple of 8. (We will find many uses for those precious low-order bits.)

• Commercial machines are usually deficient in their support of integer arithmetic. For example, they almost never produce the true quotient ⌊x/y⌋ and true remainder x mod y when x is negative or y is negative; they often throw away the upper half of a product. They don't treat left and right shifts as strict equivalents of multiplication and division by powers of 2. Sometimes they do not implement division in hardware at all; and when they do handle division, they usually assume that the upper half of the 128-bit dividend is zero. Such restrictions make high-precision calculations more difficult.

• Commercial machines do not perform FINT and FREM efficiently.

• Commercial machines do not (yet?) have the powerful MOR and MXOR operations. They usually have a half dozen or so ad hoc instructions that handle only the most common special cases of MOR.

• Commercial machines rarely have more than 64 general-purpose registers. The 256 registers of MMIX significantly decrease program length, because many variables and constants of a program can live entirely in those registers instead of in memory. Furthermore, MMIX's register stack is more flexible than the comparable mechanisms in existing computers.

All of these pluses for MMIX have associated minuses, because computer design always involves tradeoffs. The primary design goal for MMIX was to keep the machine as simple and clean and consistent and forward-looking as possible, without sacrificing speed and realism too greatly.

"

what MOR is (this is much easier to read at http://www-cs-faculty.stanford.edu/~uno/fasc1.ps.gz page 11 (PDF page 16)):

"

We can also regard an octabyte as an 8×8 Boolean matrix, that is, as an 8×8 array of 0s and 1s. Let m(x) be the matrix whose rows from top to bottom are the bytes of x from left to right; and let mT(x) be the transposed matrix, whose columns are the bytes of x. For example, if x = 9e 37 79 b9 7f 4a 7c 16 is the octabyte (2), we have

    m(x) = ( 1 0 0 1 1 1 1 0 )        mT(x) = ( 1 0 0 1 0 0 0 0 )
           ( 0 0 1 1 0 1 1 1 )                ( 0 0 1 0 1 1 1 0 )
           ( 0 1 1 1 1 0 0 1 )                ( 0 1 1 1 1 0 1 0 )
           ( 1 0 1 1 1 0 0 1 )                ( 1 1 1 1 1 0 1 1 )    (10)
           ( 0 1 1 1 1 1 1 1 )                ( 1 0 1 1 1 1 1 0 )
           ( 0 1 0 0 1 0 1 0 )                ( 1 1 0 0 1 0 1 1 )
           ( 0 1 1 1 1 1 0 0 )                ( 1 1 0 0 1 1 0 1 )
           ( 0 0 0 1 0 1 1 0 )                ( 0 1 1 1 1 0 0 0 )

This interpretation of octabytes suggests two operations that are quite familiar to mathematicians, but we will pause a moment to define them from scratch.

If A is an m×n matrix and B is an n×s matrix, and if ∘ and • are binary operations, the generalized matrix product A ∘• B is the m×s matrix C defined by

    Cij = (Ai1 • B1j) ∘ (Ai2 • B2j) ∘ ... ∘ (Ain • Bnj)    (11)

for 1 ≤ i ≤ m and 1 ≤ j ≤ s. [See K. E. Iverson, A Programming Language (Wiley, 1962), 23-24; we assume that ∘ is associative.] An ordinary matrix product is obtained when ∘ is + and • is ×, but we obtain important operations on Boolean matrices if we let ∘ be ∨ or ⊕:

    (A ∨× B)ij = Ai1B1j ∨ Ai2B2j ∨ ... ∨ AinBnj;    (12)
    (A ⊕× B)ij = Ai1B1j ⊕ Ai2B2j ⊕ ... ⊕ AinBnj.    (13)

Notice that if the rows of A each contain at most one 1, at most one term in (12) or (13) is nonzero. The same is true if the columns of B each contain at most one 1. Therefore A ∨× B and A ⊕× B both turn out to be the same as the ordinary matrix product A +× B = AB in such cases.

• MOR $X,$Y,$Z (multiple or): mT($X) ← mT($Y) ∨× mT($Z); equivalently, m($X) ← m($Z) ∨× m($Y). (See exercise 32.)

• MXOR $X,$Y,$Z (multiple exclusive-or): mT($X) ← mT($Y) ⊕× mT($Z); equivalently, m($X) ← m($Z) ⊕× m($Y).

These operations essentially set each byte of $X by looking at the corresponding byte of $Z and using its bits to select bytes of $Y; the selected bytes are then ored or xored together. If, for example, we have

    $Z = 01 02 04 08 10 20 40 80;    (14)

then both MOR and MXOR will set register $X to the byte reversal of register $Y: The kth byte from the left of $X will be set to the kth byte from the right of $Y, for 1 ≤ k ≤ 8. On the other hand if $Z = 00000000000000ff, MOR and MXOR will set all bytes of $X to zero except for the rightmost byte, which will become either the OR or the XOR of all eight bytes of $Y. Exercises 33-37 illustrate some of the many practical applications of these versatile commands. "
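to make MOR/MXOR concrete, here's a small Python sketch (my own, not Knuth's code) that models an octabyte as the 8x8 bit matrix m(x) above and checks the byte-reversal example from (14):

    # sketch (mine): treat a 64-bit octabyte as an 8x8 Boolean matrix m(x)
    # whose rows, top to bottom, are the bytes of x from left to right
    def m(x):
        return [[(x >> (8 * (7 - r) + (7 - c))) & 1 for c in range(8)]
                for r in range(8)]

    def unm(mat):
        # inverse of m: pack the 64 bits back into an integer
        x = 0
        for row in mat:
            for bit in row:
                x = (x << 1) | bit
        return x

    def gen_product(a, b, op):
        # generalized matrix product (11), with AND as the inner operation
        c = [[0] * 8 for _ in range(8)]
        for i in range(8):
            for j in range(8):
                acc = a[i][0] & b[0][j]
                for k in range(1, 8):
                    acc = op(acc, a[i][k] & b[k][j])
                c[i][j] = acc
        return c

    def MOR(y, z):   # m($X) = m($Z) (OR-product) m($Y)
        return unm(gen_product(m(z), m(y), lambda p, q: p | q))

    def MXOR(y, z):  # m($X) = m($Z) (XOR-product) m($Y)
        return unm(gen_product(m(z), m(y), lambda p, q: p ^ q))

    # with $Z = 01 02 04 08 10 20 40 80, MOR reverses the bytes of $Y:
    assert MOR(0x0123456789abcdef, 0x0102040810204080) == 0xefcdab8967452301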


m_mueller 1 day ago

One thing that almost always gets overlooked when criticizing / trying to innovate on Xerox PARC-like interfaces is discoverability. Look at departures from this interface (or predecessors of it) and you'll almost always find a system where it's hard for users to discover what they can do and how their actions will affect the state. Most prominently:

The only interface that has improved on discoverability so far is OSX, especially with its integrated spotlight search in each application's help menu.

What I'd like to see is a CLI that (a) understands objects by default (i.e. PowerShell) and (b) is discoverable, for example by using mouse interactions when you're trying to learn.

(a) would mean that the command line applications become much easier to compose. Imagine something like list / dict comprehensions in the command line:

    ls | [entry.created for entry in $@ if entry.filename[0] == 'a'] | sort

(b) would mean that you could hover each of the commands above, inspect the possible parameters, default values, examples without having to execute anything. The whole interface could get much richer as well, for example if the output of your commands is a list of objects that have the same attributes (e.g. `ls`), it would display it in a table where each column is sortable using *gasp* the mouse.
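roughly what the imagined pipeline in (a) means, sketched in Python (my rendering, not the commenter's; os.scandir yields directory entries as objects):

    import os

    # "ls" as a stream of entry objects, filtered, projected, and sorted
    created = sorted(
        entry.stat().st_ctime
        for entry in os.scandir('.')
        if entry.name[0] == 'a'
    )
    print(created)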


" Out of the frustration with AMQP I've started my own ZeroMQ project. No committee this time! Surely, I was able to make it functionally complete? Well, no. I've tried to make it do too much. It is a compatibility library, an async I/O framework, a message delimitation protocol and a library of messaging patterns. Today, almost eight years after its inception, the project is still under active development and heading towards the point where it will be able to send email.

Next one: nanomsg. An alternative to ZeroMQ. I've tried to limit the scope of the project by splitting the transport functionality as well as messaging patterns into separate plug-ins. I was partially successful, but it is still too big a chunk to swallow in a single piece. "

-- http://250bpm.com/blog:50


python 3.5:

zipapp

The new zipapp module (specified in PEP 441) provides an API and command line tool for creating executable Python Zip Applications, which were introduced in Python 2.6 in issue 1739468 but which were not well publicised, either at the time or since.

With the new module, bundling your application is as simple as putting all the files, including a __main__.py file, into a directory myapp and running:

$ python -m zipapp myapp
$ python myapp.pyz
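the same bundling is also available from code via the module's create_archive function (specified in PEP 441); a minimal sketch, assuming a myapp/ directory containing a __main__.py:

    import zipapp

    # pack the myapp directory into an executable myapp.pyz archive
    zipapp.create_archive('myapp', target='myapp.pyz')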

---

https://en.wikipedia.org/wiki/J_operator

---

some simple greenfield computer system redesigns:

Project Oberon: https://news.ycombinator.com/item?id=9847955

Plan 9

Urbit

Alan Kay's STEPS project to implement a full OS + apps in under 20k LoC: http://www.vpri.org/pdf/tr2008004_steps08.pdf

(sorta) http://www.nand2tetris.org/

templeos (was: losethos)

(sorta) http://www.fullpliant.org/

Android

Chromebook

Firefox OS

---

---

"

During the JRuby 9000 dev cycle, we decided it was time to improve the POSIX behavior of our Process and IO subsystems. In C Ruby, IO and Process are implemented directly atop the standard C library functions. As a result, they reflect behaviors often hidden when running Java applications, where those APIs are wrapped in many layers of abstraction. For example, a subprocess launched from Java can’t be read from in a nonblocking way, can’t be signaled, can’t inherit open files from the parent process, and has many other limitations. In short, we realized we’d need to go native to be truly compatible.

JRuby 9000 now uses native operations for much of IO and almost all of Process. This makes us the first POSIX-friendly JVM language, with full support for spawning processes, inheriting open streams, performing nonblocking operations on all types of IO, and generally fitting well into a POSIX environment. "

---

 Aleman360 3 hours ago

I work on the Start menu. It's just a UWP XAML app with Models and ViewModels written in C++/CX, as are most of the new Shell features and built-in apps--although some of the newer ones, like Maps and Xbox, are in XAML/C#/.Net Native. Even the UI frame of Edge is in XAML, and the new Office UWP apps are too. I encourage everyone here to give UWP apps a shot; we dogfooded the dev platform to ensure it was stable and fast, and XAML really is a pleasure to use. It's come a long way since WPF.


---

---

"If you were to design a new language today, he said, you would make it without mutable (changeable) objects, or with limited mutability." -- van Rossum, paraphrased in https://lwn.net/Articles/651967/

" The GIL

Someone from the audience asked about the global interpreter lock (GIL), looking for more insight into the problem and how it is being addressed. Van Rossum asked back with a grin: "How much time have you got?" He gave a brief history of how the GIL came about. Well after Python was born, computers started getting more cores. When threads are running on separate cores, there are race conditions when two or more try to update the same object, especially with respect to the reference counts that are used in Python for garbage collection.

One possible solution would be for each object to have its own lock that would protect its data from multiple access. It turns out, though, that even when there is no contention for the locks, doing all of the locking and unlocking is expensive. Some experiments showed a 2x performance decrease for single-threaded programs that didn't need the locking at all. That means there are only benefits when three or more threads and cores are being used.

So, the GIL was born (though that name came about long after it was added to the interpreter). It is a single lock that effectively locks all objects at once, so that all object accesses are serialized. The problem is that now, 10 or 15 years later, there are multicore processors everywhere and people would like to take advantage of them without having to do multiprocessing (i.e. separate communicating processes rather than threads).

If you were to design a new language today, he said, you would make it without mutable (changeable) objects, or with limited mutability. From the audience, though, came: "That would not be Python." Van Rossum agreed: "You took the words out of my mouth." There are various ongoing efforts to get around the GIL, including the PyPy software transactional memory (STM) work and PyParallel. Other developers are also "banging their head against that wall until it breaks". If anyone has ideas on how to remove the GIL but still keep the language as Python, he (and others) would love to hear about it. " -- van Rossum, paraphrased in https://lwn.net/Articles/651967/
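for reference, the multiprocessing escape hatch the quote mentions, sketched minimally (each worker is a separate OS process with its own GIL):

    from multiprocessing import Pool

    def work(n):
        # CPU-bound work; threads would serialize on the GIL, processes don't
        return sum(i * i for i in range(n))

    if __name__ == '__main__':
        with Pool(4) as pool:
            print(pool.map(work, [10**6] * 4))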

---

example of ClojureScript AST for "(+ 1 1)":

{:args [{:op :constant, :form 1, :tag number} {:op :constant, :form 1, :tag number}], :children [{:op :constant, :form 1, :tag number} {:op :constant, :form 1, :tag number}], :numeric nil, :segs ("(" " + " ")"), :op :js, :js-op cljs.core/+, :form (js* "(~{} + ~{})" 1 1), :tag any}

http://swannodette.github.io/2015/07/29/clojurescript-17/

---

"Most things in LLVM—including Function, BasicBlock, and Instruction—are C++ classes that inherit from an omnivorous base class called Value. A Value is any data that can be used in a computation—a number, for example, or the address of some code."

---

" Do you think that it would be possible to create a language with a Hindley Milner type system for the Erlang VM without affecting the power of Erlang semantics?

Not only do I think it's possible, I have been planning to do it for a while now, time being the limiting factor. The main problem you will run into is the mismatch between the untyped bits of the Erlang native system and the typed bits of the new language. Dialyzer attempts to solve this through Success Typing, but there may be a better way. Something like what Roy [a programming language that tries to meld JavaScript semantics with some features common in static functional languages] is doing in its type system or Clojure’s core.typed. I am not sure, but it’s a fun and solvable problem.

"

" BEAM is the only reasonably popular VM that took the language model, in this case Actors, and leveraged that model to make the platform itself more efficient. I find that brilliant. The two major examples of that approach in BEAM are how the Garbage Collector works with the runtime and how IO works.

In many systems, Java included, the Garbage Collector (GC) must examine the entire heap in order to collect all the garbage. There are optimizations to this, like using Generations in a Generational GC, but those optimizations are still just optimizations for walking the entire heap. BEAM takes a different approach, leveraging the actor model on which it is based. That approach basically has the following tenets:

    If a process hasn’t been run, it doesn’t need to be collected
    If a process has run, but ended before the next GC run, it doesn’t need to be collected
    If, in the end, the process does need to be collected, only that single process needs to be stopped while collection occurs

Those three tenets are one of the primary reasons that Erlang can be a soft real-time system [Erlang has a preemptive scheduler that also plays a big part in this]. The fact that the model subsets the work that the GC has to do allows that work to remain small and manageable. It's an impressive achievement. "

--- https://medium.com/this-is-not-a-monad-tutorial/eric-merritt-erlang-and-distributed-systems-expert-gives-his-views-on-beam-languages-hindley-a09b15f53a2f

---

something we should have an annotation for:

https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

" — Built-in Function: long __builtin_expect (long exp, long c)

    You may use __builtin_expect to provide the compiler with branch prediction information. In general, you should prefer to use actual profile feedback for this (-fprofile-arcs), as programmers are notoriously bad at predicting how their programs actually perform. However, there are applications in which this data is hard to collect.
    The return value is the value of exp, which should be an integral expression. The semantics of the built-in are that it is expected that exp == c. For example:
              if (__builtin_expect (x, 0))
                foo ();
    indicates that we do not expect to call foo, since we expect x to be zero."

---

how to prevent this? have a warning for something like __DATE__, or mb don't allow __DATE__ or other impure code in any compile-time or metaprogrammy thing? or mb just not inside conditions in compile-time/metaprogrammy stuff (so that you can still print the date into a log)

" monkeyshelli 5 hours ago

Someone will lose some hair over this

  /* create memory leaks if compiled on April, 1st */
  #define free(x) if(strncmp(__DATE__, "Apr  1", 6) != 0) free(x)

The random ones are just pure evil.

" -- https://news.ycombinator.com/item?id=10084898

---

how to prevent this?

cjslep 5 hours ago

From "How to write unmaintainable code" [0], here is a function declaration that changes signature based on how many times the header is #included:

  #ifndef DONE
  #ifdef TWICE
  void g(char* str);
  #define DONE
  #else // TWICE
  #ifdef ONCE
  void g(void* str);
  #define TWICE
  #else // ONCE
  void g(std::string str);
  #define ONCE
  #endif // ONCE
  #endif // TWICE
  #endif // DONE

Granted, it isn't one line long.

[0] https://www.thc.org/root/phun/unmaintain.html (Cert issue shows up on FF unfortunately)


-- https://news.ycombinator.com/item?id=10084840

---

need to be able to do stuff like this with Oot:

grep _expr_ thesis.txt | grep -v _color | perl -pe 's/.*{(.*?)}.*/\1/' | perl -pe 's/_arrows//' | xargs -n1 perl -e '$src = $ARGV[0]; $dest = $src; $dest =~ s/.jpg/_color.jpg/; print "mv $src $dest\n";'

mv sdk2_expr_edgeinv_cor_11-57_16x_77332746_389.jpg sdk2_expr_edgeinv_cor_11-57_16x_77332746_389_color.jpg
mv sdk2_expr_edgeinvgamma_sag_16_8x_69873475_122.jpg sdk2_expr_edgeinvgamma_sag_16_8x_69873475_122_color.jpg
mv robo1_expr_edgeinvgamma_cor_9-53_16x_73521820_404.jpg robo1_expr_edgeinvgamma_cor_9-53_16x_73521820_404_color.jpg

cd images/ish
grep _expr_ ../../thesis.txt | grep -v _color | perl -pe 's/.*{(.*?)}.*/\1/' | perl -pe 's/_arrows//' | xargs -n1 perl -e '$src = $ARGV[0]; $dest = $src; $dest =~ s/.jpg/_color.jpg/; system("mv $src $dest\n");'

grep _expr_ thesis.txt | grep -v _color | perl -pe 's/.*{(.*?)}.*/\1/' | perl -pe 's/_arrows//' | xargs -n1 perl -e '$ARGV[0] =~ /(\d+)x_(\d+)_(\d+)\.jpg/; $scale = $1; $id = $2; $slice = $3; print "download_ish_images_for_sectionNumberRange($id, $slice, $slice, downsampling=$scale, expression_image=True, expression_image_colormap_params=(0.5,1.0,0,256,0))\n"';

download_ish_images_for_sectionNumberRange(77332746, 389, 389, downsampling=16, expression_image=True, expression_image_colormap_params=(0.5,1.0,0,256,0))
download_ish_images_for_sectionNumberRange(69873475, 122, 122, downsampling=8, expression_image=True, expression_image_colormap_params=(0.5,1.0,0,256,0))
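what the first pipeline above does, restated in Python (my sketch, assuming the pipes reconstructed above are right):

    import re

    # find the {...}-wrapped names on _expr_ lines of thesis.txt, skip ones
    # already marked _color, drop any _arrows marker, and emit mv commands
    for line in open('thesis.txt'):
        if '_expr_' not in line or '_color' in line:
            continue
        found = re.search(r'\{(.*?)\}', line)
        if found:
            src = found.group(1).replace('_arrows', '')
            dest = src.replace('.jpg', '_color.jpg')
            print("mv %s %s" % (src, dest))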

---

luismarques 8 hours ago

Try programming with (std.)ranges and (std.)algorithm's. It's something completely refreshing, replacing a mess of loopy code with a clean pipeline of algorithms. The lazy nature of the standard algorithms and the clean syntax you get with the UFCS feature produce some really neat results. Even if you end up not using D any further, it can change your view of programming.


eco 5 hours ago

Yeah, it's a lot like lisp in that regard. I'm glad I learned D even though I don't use it professionally if only because it changed the way I look at some things. The algorithm chaining enabled by UFCS and the range based standard library can lead to some very beautiful code (at least as far as C-family languages go). It also made me painfully aware of how often I copy strings in C++ (string_view cannot come soon enough).

Here's a snippet of code I hacked together in D for a bot to scrape titles from pages of urls in irc messages.

    matchAll(message, re_url)
              .map!(      match => match.captures[0] )
              .map!(        url => getFirst4k(url).ifThrown([]) )
              .map!(    content => matchFirst(cast(char[])content, re_title) )
              .cache // cache to prevent multiple evaluations of preceding
              .filter!( capture => !capture.empty )
              .map!(    capture => capture[1].idup.entitiesToUnicode )
              .map!(  uni_title => uni_title.replaceAll(re_ws, " ") )
              .array
              .ifThrown([]);

It uses D's fast compile-time regex engine to look for URLs, then it downloads the first 4k (or substitutes an empty array if there was an exception), uses regex again to look for a title, filters out any that didn't find a title, converts all the html entities to their unicode equivalents (another function I wrote), replaces excessive whitespace using regex, then returns all the titles it found (or an empty array if there was an exception). There's stuff to improve upon but compared to how I would approach it in C++ it's much nicer.

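the same lazy-pipeline shape can be written with Python generators, for comparison (my sketch; get_first_4k and entities_to_unicode are hypothetical stand-ins for the helpers the comment mentions):

    import re

    def scrape_titles(message, get_first_4k, entities_to_unicode):
        # each stage is a lazy generator, like D's ranges; nothing runs
        # until the final list comprehension forces the pipeline
        urls    = (m.group(0) for m in re.finditer(r'https?://\S+', message))
        pages   = (get_first_4k(u) for u in urls)
        matches = (re.search(r'<title>(.*?)</title>', p, re.S) for p in pages)
        titles  = (entities_to_unicode(m.group(1)) for m in matches if m)
        return [re.sub(r'\s+', ' ', t) for t in titles]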

---

http://www.key-project.org/ recc. by user pron (Lispy guy?) on HN

---

 edgyswingset 2 days ago

Although this isn't OCAML, one thing I ended up missing when writing Haskell was F#'s Active Patterns.


porges 2 days ago

These exist in Haskell as "Pattern Synonyms". Here's a partial translation of some of the F# examples on MSDN to Haskell:

  {-# LANGUAGE PatternSynonyms, ViewPatterns #-}
  pattern Even <- ((`mod` 2) -> 0)
  pattern Odd <- ((`mod` 2) -> 1)
  testNumber x = show x ++
      case x of
          Even -> " is even"
          Odd -> " is odd"
  data Color = Color { r :: Int, g :: Int, b :: Int }
  pattern RGB r g b = Color r g b
  -- NB: this is bidirectional automatically
  printRGB :: Color -> IO ()
  printRGB (RGB r g b) = print $ "Red: " ++ show r ++ " Green: " ++ show g ++ " Blue: " ++ show b
  -- pretend we have functions to and from RGB and HSV representation
  toHSV :: Color -> (Int, Int, Int)
  toHSV = undefined -- implement this yourself!
  fromHSV :: Int -> Int -> Int -> Color
  fromHSV = undefined
  pattern HSV h' s' v' <- (toHSV -> (h', s', v'))
      -- here we explicitly provide an inversion
      where HSV = fromHSV
  printHSV :: Color -> IO ()
  printHSV (HSV h s v) = print $ "Hue: " ++ show h ++ " Saturation: " ++ show s ++ " Value: " ++ show v
  -- demonstrating being able to use pattern
  -- to construct a value
  addHue :: Int -> Color -> Color
  addHue n (HSV h s v) = HSV (h + n) s v


---

great list of things to consider in language design: https://github.com/btrask/stronglink/blob/master/SUBSTANCE.md definitely read the discussion if you read this too, there are some bad ideas in there: https://news.ycombinator.com/item?id=10157018

---

'Cody', a language for describing patterns in code, used by the QuantifiedCode automated code review/lint toolkit:

http://docs.quantifiedcode.com/patterns/index.html

---

"

kragen 8 hours ago

I haven't read this new book yet, so I can't address your contingent criticisms of it, but some of your criticisms are not contingent on the contents of the book, and those I think I can refute.

We've actually learned a lot about how to program since SICP was written; Clojure embodies some of that knowledge.

To take one example, Scheme was built around the state-of-the-art FP-persistent data structure of the 1950s, the singly-linked list, which was still state-of-the-art in the 1980s. Clojure's standard library includes state-of-the-art production-quality data structures of the 2000s, which can support many more operations persistently than singly-linked lists can.

To take another example, miniKanren, which is also in Clojure's standard library, is a dramatically more powerful logic programming system than anything that was available in the 1980s. Basic miniKanren is small enough that you could in fact present it, starting with mu-Kanren, in a book chapter. This may be a better way to introduce people to nondeterministic programming than the simple temporal backtracking approach in SICP.

Of course, we've learned a great deal about formal semantics, types, and proving properties of programs since then, some of which could be presented productively even in dynamically-typed languages like Clojure.

Perhaps most importantly, we've learned an enormous amount about building distributed systems since then.

SICP is, to my mind, largely about different relationships between time, memory, and programming. It begins with a timeless, memoryless reduction semantics, and expands from there, exploring different relationships with time: mutating, backtracking, lazy, and so on. Since SICP was published, some new approaches to time in programming have become important: transactional stores, although those were already in use in niche applications like transaction processing and business data processing in general; incremental computing, where parts of your program are re-executed by need, while leaving other parts alone, although that was already in use in niche applications like compilation of large software systems; and partial evaluation, which, again, existed but was not yet popular.

A SICP for 2015 would surely incorporate some of these things, though I'm not sure which. And surely there were lessons the authors themselves learned in writing SICM that could be deployed to good effect in a new SICP, as well.

"

---

---

i found this talk by Rich Hickey to be pretty on-target for how to do design thinking:

https://github.com/matthiasn/talk-transcripts/blob/master/Hickey_Rich/HammockDrivenDev.md

---

"

Decorating methods

One nifty thing about Python is that methods and functions are really the same. The only difference is that methods expect that their first argument is a reference to the current object (self).

That means you can build a decorator for methods the same way! Just remember to take self into consideration: "
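the quote's example is elided above; a minimal stand-in sketch (mine): the wrapper just receives self through *args like any other argument:

    def shout(func):
        def wrapper(*args, **kwargs):
            # args[0] is self when decorating a method; we don't need to care
            return func(*args, **kwargs).upper()
        return wrapper

    class Greeter(object):
        @shout
        def greet(self, name):
            return "hello, " + name

    print(Greeter().greet("bob"))  # HELLO, BOB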

---

do we want it to be possible to 'un-decorate' a decorated fn?

---

"

    Decorators wrap functions, which can make them hard to debug. (This gets better from Python >= 2.5; see below.)

The functools module was introduced in Python 2.5. It includes the function functools.wraps(), which copies the name, module, and docstring of the decorated function to its wrapper.

(Fun fact: functools.wraps() is a decorator! ☺) "
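this also gives one answer to the 'un-decorate' question above: since Python 3.2, functools.wraps records the original function on the wrapper as __wrapped__. a sketch:

    import functools

    def logged(func):
        @functools.wraps(func)  # copies name/module/docstring, sets __wrapped__
        def wrapper(*args, **kwargs):
            print("calling", func.__name__)
            return func(*args, **kwargs)
        return wrapper

    @logged
    def add(x, y):
        return x + y

    print(add(3, 4))              # prints "calling add", then 7
    print(add.__wrapped__(3, 4))  # the un-decorated function: just 7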

---

fun way to use closures in Python without objects:

"
def counter(func):
    """A decorator that counts and prints the number of times a function has been executed"""
    def wrapper(*args, **kwargs):
        wrapper.count = wrapper.count + 1
        res = func(*args, **kwargs)
        print "{0} has been used: {1}x".format(func.__name__, wrapper.count)
        return res
    wrapper.count = 0
    return wrapper
" -- http://stackoverflow.com/questions/739654/how-can-i-make-a-chain-of-function-decorators-in-python/1594484#1594484


one way to judge the 'size' of a language is the length of time it takes to learn 'all of it', which can be approximated by the size of documents such as official introductory tutorials, language specification, some subset of the standard library documentation. Some ppl like Go b/c of this (eg https://news.ycombinator.com/item?id=10208950 : "About a month ago, I started learning Go. It's so small. The language, the standard library, the tools. They're all so easily digestible; it took me 4 days to work through the Go playground, read the entire Go specification, work through and learn 40% of the standard library, and start writing production-ready programs. C was the last language that enjoyed a language/standard library this small, to me. The same thing can't be said for Ruby or PHP, or any other languages I've worked with. In almost every project I've been involved with, there would be usage of some arcane corner of the language. Whether it's determining the byte offset of a member function in C++, or the GCC extensions to structure initialization in C, or metaprogramming magic in Ruby. All these cracks and crevices are difficult to keep in your head. This isn't so with Go. It's easy to keep the entire language and standard library in your head; providing a certain ease and flow when building a program.")

So let's make that a goal for Oot. Here's the Golang spec:

https://golang.org/ref/spec

It's currently 3061 lines, 26157 words, and 157819 characters, according to 'wc' run on a copy-and-paste of that URL's contents. If there are 250 words per page, that would be about 100 pages. But i suspect there are a lot of shorter than average 'words' due to code. If we divide the characters by 3000 chars per page we get 52 pages. If we divide lines by 30 lines per page we get 100 pages. Counting pagedowns on that webpage, it's about 100 pagedowns.

The Go Tour is about 1417 lines, 3008 words, 18297 characters (by doing something like: git clone https://github.com/golang/tour/; cd tour/content; rm -r img; cat */* > /tmp/to.txt; wc /tmp/to.txt)

So that's about 48/12/6 pages depending on whether you go by 30 lines per page, 250 words per page, or 3000 characters per page.

---

" Consider the major insecurities in C++ code:

The obvious approach for avoiding these problems is to provide a library (or a set of libraries) that saves the programmer from having to use these error-prone features. For example, instead of using arrays, the programmer can use a range-checked vector and instead of a union a user can use a tagged union or an Any type. Casts (with exception of the dynamically type-safe dynamic_cast) and void*s are rarely useful outside low-level and easily encapsulated uses, so they can simply be avoided. If we use counted pointers, memory leaks won’t happen (depending on how cyclic data structures are handled). Since pointers are checked, we don’t access through invalid pointers and double deletions are easily detected. Basically, errors that cannot be detected until run-time are systematically turned into exceptions, making Safe C++ a dynamically type safe language. Exceptions may not be your favorite language feature, but they are useful in most contexts and are universally used for reporting run-time type violations in languages deemed type-safe.

...

6.3 Real-time C++ The problems of real-time code for embedded systems combine concerns for correctness, reliability, and performance in constrained circumstances. Some problems and solutions overlap with those of Safe C++ but others are unique in that they require that every operation is performed in a known constant time (or less). Naturally, not all real-time and embedded systems are written under this Draconian rule, but let’s see how we can address those that do. Some C++ operations become unusable: 1. free store (general new and delete) 2. exceptions (assuming inability to easily predict the cost of a throw) 3. class hierarchy navigation (dynamic_cast in the absence of a constant time implementation [7]) First, we add a suitable support library: 1. a fixed size Array class (no conversion to pointer, knows its own size) 2. some safe pointer classes 3. memory allocation classes that guarantee constant time allocation (and deallocation if allowed) — pools, stacks, etc 4...

Next, we use the Pivot to eliminate dangerous operations (as listed in § 6.1) from user code. In principle, this will do the job. However, we can do more. For most programs of this sort, we can do whole-program analysis. Such programs tend to be relatively small and not allow dynamic linking. Thus, the Pivot could be used to allow exceptions for error reporting: we can verify that every exception is caught and calculate the upper bound for each throw. This is a special — and especially hard — example of using a tool to verify that resource consumption is within acceptable bounds. In general, there are lots more that the Pivot can do in the context of embedded systems. Some depends on a specific application, so the boundary between SELL and application support blurs. For example, it is not uncommon for an embedded program to be more permissive about the facilities that can be used during a startup phase. The SELL can define what “startup” means (e.g. called from start_up) and only apply the stringent rules outside that. " -- [1]

---

"

Example:

class Date {
    // ...
public:
    Month month() const;  // do
    int month();          // don't
    // ...
};

The first declaration of month is explicit about returning a Month and about not modifying the state of the Date object. The second version leaves the reader guessing and opens more possibilities for uncaught bugs. "

---

" Example:

change_speed(double s);   // bad: what does s signify?
// ...
change_speed(2.3);

A better approach is to be explicit about the meaning of the double (new speed or delta on old speed?) and the unit used:

change_speed(Speed s);    // better: the meaning of s is specified
// ...
change_speed(2.3);        // error: no unit
change_speed(23m/10s);    // meters per second

---

" use const consistently (check if member functions modify their object; check if functions modify arguments passed by pointer or reference) "

---

" Example:

int i = 0;
while (i < v.size()) {
    // ... do something with v[i] ...
}

The intent of "just" looping over the elements of v is not expressed here. The implementation detail of an index is exposed (so that it might be misused), and i outlives the scope of the loop, which may or may not be intended. The reader cannot know from just this section of code.

Better:

for (auto x : v) { /* do something with x */ }

Now, there is no explicit mention of the iteration mechanism, and the loop operates on a copy of elements so that accidental modification cannot happen. If modification is desired, say so:

for (auto& x : v) { /* do something with x */ }

Sometimes better still, use a named algorithm:

for_each(v, [](int x) { /* do something with x */ });
for_each(parallel, v, [](int x) { /* do something with x */ });

The last variant makes it clear that we are not interested in the order in which the elements of v are handled. "
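the same progression reads naturally in Python, for comparison (my sketch, not from the guideline):

    v = [10, 20, 30]

    def do_something(x):
        print(x)

    # index-based: exposes the index, and i outlives the loop
    i = 0
    while i < len(v):
        do_something(v[i])
        i += 1

    # intent-revealing: just visit each element
    for x in v:
        do_something(x)

    # named algorithm: the iteration mechanism is hidden entirely
    list(map(do_something, v))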

---

C++ array_view: "Represents an N-dimensional view over the data held in another container."

---

http://stackoverflow.com/a/1461449/171761

---

in C++, they say there is a gotcha with unions, which are unchecked; some recommend 'variant' instead (is this boost::variant?), which is a typesafe discriminated (tagged) union.

---

some random notes from scala's (old) tour (http://www.scala-lang.org/old/print/book/export/html/104.html):

'sealed classes' this is the idea of ADT/tagged unions/closed cases, that is, you say, "if i say X is of type T, that implies that it is either of case T1, or case T2, or case T3; and no other module can add a 'case T4'; therefore when the compiler sees a switch/case statement on the cases of type T, it should error if it doesn't see, for each of T1,T2,T3, code to handle that case; but if it sees code for each of those cases, then the programmer is guaranteed not to face a runtime error for an nonexhaustive switch, since no new case T4 may be added"

'right-ignoring patterns': i dont completely understand this but it seems to be similar to Python's *args, but for patterns.

scala uses the @ prefix sigil for annotations, like this: "(x: @unchecked)" to annotate "x" with "unchecked" (imo wouldn't it be more concise to just allow "x @unchecked"? in oot we'd just do "x Unchecked")

traits are typeclasses/interfaces: "Similar to interfaces in Java, traits are used to define object types by specifying the signature of the supported methods. Unlike Java, Scala allows traits to be partially implemented; i.e. it is possible to define default implementations for some methods. In contrast to classes, traits may not have constructor parameters"

inheritance (including trait 'mixin' inheritance) is via the 'extends' keyword, eg "class Point(xc: Int, yc: Int) extends Similarity {...}"

---

peephole-optimizable non-control primitives:

---

mb a special construct to handle while (true) {switch (instr) case ...} with no bounds checking, so that computed goto is needed here; jaquesm (https://news.ycombinator.com/item?id=9830065) claims that switches over enums are usually compiled to jump tables anyway
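for reference, the shape in question sketched in Python (mine): a dispatch table plays the role a jump table / computed goto plays in C:

    def run(program):
        # dispatch table: the analogue of the C jump table
        ops = {
            'PUSH': lambda stack, arg: stack.append(arg),
            'ADD':  lambda stack, arg: stack.append(stack.pop() + stack.pop()),
        }
        stack, pc = [], 0
        while True:                      # while (true) { switch (instr) ... }
            instr, arg = program[pc]
            if instr == 'HALT':
                return stack
            ops[instr](stack, arg)
            pc += 1

    print(run([('PUSH', 2), ('PUSH', 3), ('ADD', None), ('HALT', None)]))  # [5]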

---

some universal algebra things: subgraph, direct product, homomorphism

---

" I'd also like to have compatibility of some language extension features, such as built-in functions and attributes. In particular, CPU-agnostic intrinsics for common instructions like atomics, popcount or count leading zeros as well SIMD arithmetic would be great.

My favorite C language extension is SIMD vector arithmetic with infix operators. You can get really pretty and portable (!) vector math code written in Clang and GCC using vector extensions, but again, it's not available in MSVC. "

-- https://news.ycombinator.com/item?id=10277485

" Maybe I need to get more adventurous, but what I see as the key improvements (designated initializers, compound literals, declarations (nearly) anywhere) are now in place.

...

Main omissions I've noticed: printf isn't quite the same (no 'z' modifier, the 'n' forms are noticeably inferior), no VLAs. Also, standard library isn't POSIX (which is not an omission but a lot of C code assumes POSIX so you'll probably end up having to deal with this). ... But I've noticed a tendency for people to sometimes assume that "multiple platforms" means "ten different types of gcc+POSIX+fork+pthreads". I'm sure that even shiny new C99-friendly VC++ won't make them happy.

...

Forgive me as I might be wrong but aren't system intrinsics the same in each compiler? I was under this impression as I wrote a lot of SSE2/3 C99 code for the GCC using MSVC++ documentation it compiles/runs without an issue. (This also before finding the MMX/SSE/AVX docs that intel has which are beautiful).

I mean the function/variable type names (for x86_64 at least) are the same for Clang/ICC/GCC/MSVC.


phkahler 14 hours ago

>> Forgive me as I might be wrong but aren't system intrinsics the same in each compiler?

>> I mean the function/variable type names (for x86_64 at least) are the same for Clang/ICC/GCC/MSVC.

Yes, all those compilers support the same names for x86_64. Then a different set for ARM NEON, and a different set for PPC AltiVec. What GCC has done is implement another set of names that can be used across all those hardware architectures. To clarify the difference, you can choose - do you want to move your code between different compilers, or different hardware but always with GCC.

yoklov 15 hours ago

Honestly the non-intrinsic SIMD code is a mixed bag. I like the portability compared to intrinsics, but it encourages SIMD antipatterns like using a vector register to store one object (as opposed to using it to step over loops 4/8/16 at a time). It's also less reliable about generating good code IME.


 pornel 17 hours ago

That's great. Please don't forget to finish the C99 implementation!

For tiny values I really like being able to use int arr[runtime_size]; rather than risking buffer overflows with int arr[MAX_SIZE] or arr = malloc(size * sizeof(oops)); — and MSVC is the last compiler that still doesn't support that.


"

---

tips on Elixir:

"

dv_says 12 hours ago

I'm really glad to have recently picked up Elixir. For anyone just starting, a few tips from someone similarly new:

a. After launching the "iex" shell, press Tab. You'll get the list of all built-in commands.

b. The help feature is also very handy for when you're wondering what an operator does. Type "h <command>" for a definition.

    h !/1

c. Try the pipe operator. Your code will closely follow the transformations of your data.

Instead of this:

    m1 = "moo"
    m2 = String.replace(m1, "m", "z")
    m3 = String.upcase(m2)

Or this:

    String.upcase(String.replace("moo", "m", "z"))

Try this:

    "moo" |> String.replace("m", "z") |> String.upcase()

The result of each command will be passed as the first argument to the subsequent command.

d. You get seamless interoperability with Erlang libraries.

    :crypto.hash(:md5, "bob") |> Base.encode64

e. Try the Observer tool in iex to visualize various aspects of your running application, including the supervision tree, details on each running OTP process, and much more. Seriously very handy stuff.

    :observer.start()

f. If you're using EC2 like I am, Amazon images have a too-old version of Erlang, but it's trivial to compile things yourself:

   sudo yum install gcc glibc-devel make ncurses-devel openssl-devel autoconf
   wget https://www2.erlang.org/download/otp_src_18.0.tar.gz
   tar xvzf otp_src_18.0.tar.gz
   cd otp_src_18.0
   ./configure && make && sudo make install
   sudo ln -s /usr/local/bin/escript /usr/bin/escript
   sudo ln -s /usr/local/bin/erl /usr/bin/erl

"

---

maybe look at how Ethereum created simplified versions of existing languages to see how they could be simplified:

https://forum.ethereum.org/discussion/1460/solidity-faq https://github.com/ethereum/wiki/wiki/Solidity-Tutorial (javascript-like/C++-like; this seems to be the central one for the Ethereum project) https://github.com/ethereum/wiki/wiki/Serpent (python-like) https://github.com/ethereum/cpp-ethereum/wiki/LLL-PoC-5 (lisp-like) https://github.com/ethereum/go-ethereum/wiki/Mutan (C-like)

https://www.reddit.com/r/ethereum/comments/34gt50/contract_programming_language_solidity_serpent_or/ compares and contrasts:

" Solidity is the most developed language and compiler...Solidity is the flagship language at the moment...Moving forward, Solidity will continue to be the most developed contract orientated language at Ethereum...Solidity has the best interoperability with the Javascript APIs, which is a major reason to use it over the others.

Serpent used to be the flagship language. LLL was, as far as I can tell, a "let's get something higher-level than EVM assembler language going real quick" effort....LLL is practically assembly code, with lisp syntax, although it works, it's hard to get a lot done with it....Serpent 2.0 is relatively easy to learn and implement (just like Python!), and will continue to be developed by Vitalik as a side project....Unlike Solidity which compiles down directly to EVM byte code, Serpent 2.0 compiles down to LLL first, then EVM. "

"Serpent is one of the high-level programming languages used to write Ethereum contracts. The language, as suggested by its name, is designed to be very similar to Python

...

Differences Between Serpent and Python

The important differences between Serpent and Python are:

    Python numbers have potentially unlimited size, Serpent numbers wrap around 2^256. For example, in Serpent the expression 3^(2^254) surprisingly evaluates to 1, even though in reality the actual integer is too large to be recorded in its entirety within the universe.
    Serpent has no decimals.
    Serpent has no list comprehensions (expressions like [x**2 for x in my_list]), dictionaries or most other advanced features
    Serpent has no concept of first-class functions. Contracts do have functions, and can call their own functions, but variables (except storage) do not persist across calls.
    Serpent has a concept of persistent storage variables (see below)
    Serpent has an extern statement used to call functions from other contracts (see below)

"

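the wraparound claim in the first bullet checks out with Python's modular pow (the multiplicative order of 3 mod 2^256 is 2^254, so the power wraps to exactly 1):

    # 256-bit wraparound arithmetic: 3**(2**254) mod 2**256 really is 1
    print(pow(3, 2**254, 2**256))  # -> 1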
other used-to-be-differences (items in that list in Serpent 1.0 but not there as of this writing) include:

summary: