# Languages for scientific, numerical, or mathematical computation

## MATLAB and Octave

### Internals and implementations

Core data structures: todo

#### Number representations

Integers: todo

Floating point: todo

#### Array representation

variable-length lists: todo

multidimensional arrays: todo

limits on sizes of the above: todo

## R

Features:

• "r functions are very similar to fexprs" [1]
• "R has some really crazy metaprogramming facilities. This might sound strange coming from Python, which is already very dynamic - but R adds arbitrary infix operators, code-as-data, and environments (as in, collections of bindings, as used by variables and closures) as first class objects. On top of that, in R, argument passing in function calls is call-by-name-and-lazy-value - meaning that for every argument, the function can either just treat it as a simple value (same semantics as normal pass-by-value, except evaluation is deferred until the first use), or it can obtain the entire expression used at the point of the call, and try to creatively interpret it. This all makes it possible to do really impressive things with syntax that are implemented as pure libraries, with no changes to the main language." -- [2]
• "In R, everything is an expression, and every expression is a function call. Even things like assignments, if/else, or function definitions themselves, are function calls, with C-like syntactic sugar on top. You don't have to use that sugar, though! And all those function calls are represented as "pairlists", which is to say, linked lists. Exactly like an S-expr would - first element is the name being invoked, and the rest are arguments. And you can do all the same things with them - construct them at runtime, or modify existing ones, macro-style. So in that sense, R is actually pretty much just Lisp with lazy argument evaluation (which makes special forms unnecessary, since they can all be done as functions), and syntax sugar on top. Where it really deviates is the data/object model, with arrays and auto-vectorization everywhere... Something like that would be called a FEXPR in Lisp. " [3] [4]
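The metaprogramming claims above can be sketched in a few lines of base R (a minimal illustration; the helper name `show_call` is invented for the example):

```r
# Every call is data: quote() returns the unevaluated call object,
# a pairlist-like structure: the function name first, then arguments.
e <- quote(x + y * 2)
print(e[[1]])   # `+`   (the name being invoked)
print(e[[3]])   # y * 2 (the second argument, itself a call)

# Calls can be constructed or modified at runtime, macro-style.
e[[1]] <- as.name("-")
print(eval(e, list(x = 10, y = 3)))   # 10 - 3 * 2 = 4

# Lazy, call-by-name arguments: a function can use an argument as a
# plain value, or capture the expression written at the call site.
show_call <- function(arg) {
  expr <- substitute(arg)  # unevaluated expression for the argument
  list(text = deparse(expr), value = eval(expr, parent.frame()))
}
print(show_call(1:3 * 2))  # text "1:3 * 2", value 2 4 6
```

This combination (calls as mutable data plus lazy arguments) is what lets packages like dplyr and ggplot2 implement their own syntax as ordinary libraries.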

Free tutorials:

Books:

Best practices and style guides:

Libraries:

Retrospectives:

Opinions:

• "I think R is a great language for certain applications - namely statistics and some data analysis. " -- [5]
• "I strongly believe that R has excellent features to support data science: non-standard evaluation, NAs at a very low level, data frames, first class functions, ..." -- [6]
• on R's type and object system:
• "I don't think R's OO is great; but it's mostly misunderstood because it's fundamentally different to most popular modern languages. Interestingly, Julia uses exactly the same model as R (i.e. generic functions, not message passing)" [7]
• "My main criticism is around two things. One, "clunkiness" of class creation (which I guess is subjective); I don't think it's very readable, but maybe that's bias from other languages. Second the S3/S4/R4 mess." -- https://news.ycombinator.com/item?id=9784471
• "I'd say R is a _terrible_ language. Its types are just really different from every major programming language, and it's horrible for an experienced programmer to use." [8]
• "I have to disagree. Its main model is generic function method dispatching. It can feel odd at first to someone coming from the C++ style of OO where objects own methods, not methods owning objects. But it's a legitimate OO style with its own advantages. [9]. I've found the more I use R, the more intuitive a lot of its operations are. It's relatively easy to "guess" what you ought to do to accomplish what you want. More so then other languages I've learned." https://news.ycombinator.com/item?id=10388416
• "Its type system seems very complicated and while the language tries to do what it thinks you want, it's not always clear what is going on (are you working on a matrix or a dataframe that has been cast into a matrix?). ...R is one of those languages that is good in a certain domain, but once you get out of that domain, it makes things more complicated than they need to be. Normally for a language design, you aim to make easy things easy, and difficult things possible. For R, it seems like it makes difficult things easy and easy things difficult." [10]
• "it has 4 (four) objects systems which differ in subtle ways between each other. It's programmers' nightmare. It's not the worst language in the world, but it isn't terrific language either." [11]
• "I would say that any language that does not have a facility to get the path of the current file, is not 'excellent' under the criteria an experienced programmer would use for assessing it." [12]
• "I'm not qualified to comment on how good or bad a language R is. But it is maddening how package developers don't follow some convention for naming functions. I load a package that I haven't used recently and I know the function I want but can't remember if it is called my_function, myFunction, my.function, or MyFunction?." [13]
• "R does have some pretty cool tools for after the fact debugging, like dump.frames, but few people know about them." -- [14]
• "I still feel like the many great features of R (as you said some time ago, NA handling, data frames, etc.) are sometimes outweighed by the cons." [15]
• "The language was designed for data analysis, and has some quirks (like the way data structures are indexed and have to be stored in physical memory)..." -- [16]
• "This is my biggest beef with R. It is constantly changing the dimensions and types of your data without telling you. Want to grab some subset of the rows of a matrix? Better add some extra post-processing in case there's only one row that satisfies your query, or else R will change its type!... (reply: You know that you can tell it not to do that, right? drop=FALSE ; reply to the reply: I think this only reinforces my point: this is a ridiculous default. another reply: use dplyr's tbl_df wrapper around data.frame)" -- https://news.ycombinator.com/item?id=9267362
• "...R language is such a mess. For example, R has lazy evaluation despite being an imperative stateful language." [17]
• "R's syntax is not the problem, the "standard" library is the biggest problem in it's inconsistencies. That said, the other big area of complaint in R is the type system. We are too often having to coerce types, but I'm not exactly sure of the solution for that." [18]
• "I find R's syntax to be a big hangup for new learners, especially on indexing and apply-to-each (sapply, mapply, just plain apply...), but dplyr really makes life much easier. The %>% operator alone (which to be fair was originally from magrittr) is a great help. Not sure if this is my personal biases, but I always find it easier to read calls chained postfix-style." -- [19]
• http://arrgh.tim-smith.us/index.html
• "importing a package takes a meaningful amount of time in R. Several seconds, that is just unacceptable." -- [20]
• "I have to fight with R on scientific notation, always copy - pasting into my code: options(scipen=999)" -- [21]
• "its a personal matter, but R has syntaxes that get on my nerves. python list: a = [1,2,3] a = c(1,2,3). perhaps its because i used other languages before, but my fingers are more adept at hitting [ which requires no shift compared to (. some people love curly braces and lots of parentheses in if/for statements, I appreciate them not being there." -- [22]
• "Dependency management, in my opinion, is one of the problems in the R ecosystem. The lack of name spaces when calling functions has made the community have many little packages that only do one thing on you are not really sure where it was actually used, unless you know the code and the package. An example is the janitor::clean_names function I like to use for standardizing the column names on a data.frame." [25]
• "R is a disaster when you want to write programs as you would in a real programming language. R is an excellent choice for what it is used most of the time by these people whose education/training isn't related to programming: interactive analysis of data and (maybe) writing prototypes." [26]
• "The thing to keep in mind is that, from the point of view of someone who works with data, R isn't a programming language. It's a statistical software package that has a programming language. Its competitors are things like Minitab, SPSS, Stata, and JMP, all of which used to be entirely menu-driven. R was a genuine innovation when it was first introduced." [27]
• Evaluating the Design of the R Language (Morandat, Hill, Osvald, Vitek; ECOOP 2012)
• " R is terrible, and especially so for non-professional programmers, and it is an absolute disaster for the applications where it routinely gets used, namely statistics for scientific applications. The reason is its strong tendency to fail silently (and, with RStudio, to frequently keep going even when it does fail.)" [28]
• "...if I have to analyze a .csv quickly, I'm going for R most of the time." [29]
• "...nothing beats R for getting to an answer as fast as possible (not even Python) at the cost of making it more difficult to productionise a solution in pure R." [30]
• "Most people I know who use R don't care a whit about production. They run an analysis to answer hypotheses." [31]
• " I agree that R shouldn't be used in production, but R is great for prototyping different analytical models before porting them over to Python or another language. " [32]
• "Another feature for this audience is the philosophy that functions shouldn't have side effects. You can still do (several types) of object oriented programming in R, but it does take away some of the ways in which non-programmers shoot themselves in the foot. I've come to really like the way environments work in R, as well." [33]
• "A sane language shouldn’t need three different object systems." [34]
• "Meh, S3 is nice and lightweight for a very particular kind of analysis interoperability. S4 isn't super useful IMHO. RC is very well thought out, and I've heard good things about R6. It might not be sane language design, but it works well for designing very different kinds of analytical procedures." [35]
• "As a long-time R user, I agree with all of these complaints. The language itself is ugly and actively tries to get in your way. I'll add that concepts like data frames are not really intrinsic, and you get needless complexities like "length", "nrow", "dim", each of which does the wrong thing in 90% of the scenarios of interest. The confusion of lvalues is another strange quirk -- a <- 0; length(a) <- 20 is totally valid, and you get things like class(a) <- 'foo' being preferred over the equivalent a$class <- foo. It has all sorts of odd concepts between lists and data.frames -- the double-bracket syntax, etc. The object model is very confusing, though most people seem to have converged on the S3 system, which is the oldest one. If you discipline yourself to learning "the good parts", especially by learning either data.tables or tidyverse or becoming a master of split/lapply/aggregate/ave, then it is very powerful. The modelling tools and plotting (both base graphics and ggplot2) are excellent. I'd love to see a NeoR arise at some point that fixes the strange historical inconsistencies (like what happens when you refer to vec[0], as noted by the author) in non-backward compatible ways." [36]
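Several of the quotes above describe R's OO as generic functions dispatching on classes (S3), and mention the "lvalue" quirk where `length(a) <- 20` or `class(a) <- 'foo'` are valid assignments. Both can be sketched in a few lines (the class name `temperature` is invented for the example):

```r
# S3: the generic owns dispatch; a "method" is just a function named
# generic.class. Objects don't own methods, methods don't own objects.
print.temperature <- function(x, ...) {
  cat(unclass(x), "degrees C\n")
  invisible(x)
}

t1 <- 21.5
class(t1) <- "temperature"  # class<- is itself a replacement function
print(t1)                   # dispatches to print.temperature

# Replacement functions are the "lvalue quirk": many properties are
# assignable through what looks like a function call on the left.
a <- 0
length(a) <- 5              # pads with NA: 0 NA NA NA NA
```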

Opinionated comparisons:

• https://matloff.wordpress.com/2014/05/21/r-beats-python-r-beats-julia-anyone-else-wanna-challenge-r/
• https://www.dataquest.io/blog/python-vs-r/
• "It's python that's really a mess for data science; you can't avoid it being a programming language first and a tool for data science a distant tenth. Syntax that only a programmer would like is necessary, and quite a bit of it at that. R is a much better fit for people who want to do statistics first, and as little programming as possible. Things like function parameters being promises make it far easier to deal with functions like optimizers where there really are 10+ tuning parameters or things you may want to tweak. Iterative languages are far easier to understand for people who don't want to be programmers. You cannot develop plyr or ggplot in a language agnostic way, because they need the purpose built syntax R has. Contrast to eg the fight in python to get an infix matrix multiplication operator." [37]
• "...I used R for statistics at college, but only base R, and it is verbose even for basic data manipulation. The scripts I made for my older data blog posts are effectively incomprehensible to me now. I ended up learning how to use Python/pandas/IPython because I had had enough and wanted a second option on how to do data analysis. Then the R package dplyr was released in 2013, alleviating most annoyances I had with R. dplyr/ggplot2 alone are strong reasons to stick with the R ecosystem. (not that Python is bad/worse; as I mention at the post, both ecosystems are worth knowing)" -- [38]
• "I use both (((R and Python))), but R for interactive analysis and reporting, Python for data transformations (ETL). While the syntax of Python is "cleaner" for backend scripts, R feels more straightforward when working with dataframes (dplyr) resulting in things to report on. The syntax for ggplot2 fits the same category. As much as having one languages for both categories would be nice, using both today seems like a better option. " -- [39]
• "...Python is way, way better for text. And I say that as a long-time R user. R really doesn't like things that can't be represented as datasets. " [40]
• "Yes a couple of years ago I did this project where I needed to get a ton of analysis done, typically with plots and tables as output. Did this in python and it was this huge mess with pandas and matplotlib. Re-did it in R with data.table and ggplot2 and it was just ridiculously easier, and I could expand upon the code much more easily, plus the output was much prettier."
• "If I need to run a quick analysis on a dataset, I'm grabbing R 9/10 times. If I'm building a production pipeline, I'm using Python 9/10 times." [41]
• "...this kind of code ((replying to a comment about 'answering questions in a rapid, interactive way')) is what R is optimized for, not general purpose programming (even though it can totally do it)." [42]
• "I think R-Studio (an R based IDE that turns it kinda into a more excel like experience) where you can inspect the data in memory (including matrix data) and graph making is where it really helps bring people into the R language. And with a set of instructions anyone can go load the analysis packages and do their data analysis. Compare this to python, where they have to go the unix shell set up the environment, load the libraries. When they come back reset everything and get back to where they started." [43]
• "...there is litterally no equivalent to dplyr and ggplot2 in Python. Those alone can make a huge difference in how many lines you need to write to do something." [44]
• "ggplot2 has plotnine (http://plotnine.readthedocs.io) which has a nearly identical API. I've found though it's not perfect, you can get closer to dplyr with JS-style method chaining on Pandas."
• "I've recently used plotnine, and it's been a relatively good experience, but Pandas is absolute garbage compared to the tidyverse, API-wise."
• "I tried plotnine before and its far from covering everything ggplot can do. And chaining on pandas can make things unreadable compared to dplyr."
• "Non-Computer Scientists seem to have a much easier time with Python than R, anecdotally. I think the reason is that R is not just a badly designed language, but in particular its design is inconsistent. That’s as confusing to newcomers as it is to people who care about PL design. I used R for almost a decade. Last year I switched to Python and Jupyter, never looked back. Can’t recommend the switch highly enough. R has great stats packages, but struggling with the language is just not worth it." [45]
• "R may not be the most "beautiful" language in a general perspective, but it certainly is more beautiful than Python when it comes to actual data analysis. There is nothing in R that is as ugly as even the best implemented pandas, numpy, and matplotlib code. All of the options in Python, which is generally pointed to as the "superior" language to R, feel tacked on and hackish." [46]
• "At any rate, at least it's not Pandas and matplotlib... " [47]
• "I think that Python is probably winning because being a decent language gives you a decent escape hatch, whereas no amount of great libraries can save you from having to go through the bizarro language. That said, R may be bizarro, but at least, once you learn it, it's predictable. Whereas I'm not sure even Pandas really knows whether a given call to .loc will copy or refer to the original data." [48]
• "I learned R coming from Java, Node, PHP and Python and I love it !!! It is awful as an application development programming language, but it was never designed for that purpose. It was designed for STATISTICS. Try to achieve advanced statistics with your traditional software engineer's preferred language and see which language you hate then. The only tricky R concepts to learn for newbies are: recycling, formulas and vectorized functions. Add RevoScaleR to R and it kicks major ass when dealing with big data manipulation. Oh yes, big time !!!" [49]
• "R is much more lisp-like than python: more functional, more emphasis on DSLs and metaprogramming, and the community is far more inviting to New comers: reminds me a lot of Racket in that regards." [50]
• " Having used Python, JSL, Julia, R and Matlab; I agree with most of the things in R. R is an extremely ugly language. It seems to be created by people who wear capris and uggs (both at the same time). But, R has incredible packages, especially the work done by Hadley Wickam. ggplot2 is beautiful. It is utterly gorgeous. It is what Ted Baker is to the capri guys that designed the language itself." [51]
• "My personal hack to deal with the unbearable ugliness of R is to use Rpy2 and call R packages from Python --- at least writing some boilerplate code in Python makes me happier than having to write in R." [52]
• "Second this. Pandas / Numpy / Numba until I hand off to ggplot, lmer, or whatever specialized R package." [53]
• "I am very clear I am an R "consumer" not an R developer. But, at this point, absent Shiny and a gui, I think that Python and Numpy has as much to offer me basically. Some people say the syntax is FP friendly. I have been trying to learn FP in Haskell and I think R is about the worst notation you could invent to sell FP." [54]
• "I use R most of the time and I find R notebooks very data exploration friendly. It makes it easy to back and forth just like Jupyter notebook. Producing HTML files from Rmarkdown files is also analysis friendly. 99% of the time I use tidyverse with no noticeable impact on the performance. For that occasional 1%, I must admit datatable package works out really well. tidyverse pipes are so unixy that makes it easy to transition to command such as cut, head, sort and column if needed without any mental contortion. I have used Python occasionally and with method chaining, it can almost simulate the "dplyr" like syntax. However, it is hard to find some obscure statistical test out of the box which is easy in R." [55]
• "Great guide. R syntax is awful, no matter how you slice it and would have been the tool of choice before Python could walk. Still, it is very powerful and great to have around." [56]
• "Every error message ever written in R: "Error". Hmmmmm, okay after 20 minutes of squinting at the script I see there's a lowercase letter at the header name of the last column somewhere in the middle of the 200 lines...now we can move on to debugging the next "Error"..." [57]
• "I’d rather do math in a general-purpose language than try to do general-purpose programming in a math language." [58]
• "I don't use R to write programs. I wrangle data, run analysis, and plot results. It has a great number of solid stat packages for mixed modeling, clustering, ordination, etc. That is why I use R." [59]

Lists and tours of libraries:

Gotchas:

• if you select a column from an empty dataframe without drop=FALSE, rather than returning an empty column, it returns a NULL [64]
• 'scalars' are just 1-element vectors
• "is.vector() does not test if an object is a vector. Instead it returns TRUE only if the object is a vector with no attributes apart from names. Use is.atomic(x) || is.list(x) to test if an object is actually a vector." [65]
• the 'c' atomic vector constructor always produces flat vectors, even if you nest them [66]
• "the pain of needing to specify `stringsAsFactors=FALSE` is solved in the tibble package by setting a sensible default." [67]
• "..here's a tongue in cheek summary of my experience with R. I write a script. It doesn't work. I don't know why. I look at the error message, and then google for 30 minutes to understand what it really means - which parts of the code broke, why, how to fix them. Because none of the 3 things (which, why, how) is easy to get to. OK, I fix it, having learned something new (like that there are infinite special cases with almost any functions). I commit it to repo, go for coffee. In the afternoon, a colleague asks how to run that code. Well, it was a simple script, half a page, what's the problem? I take a look, and on their machine it doesn't run. We don't know why. An hour later we discover she has some R profile file with a setting that changes behavior of some standard library... and she also has different encoding set as default, and so on, and so forth... whatever. I don't know why runtime environment encoding changes behavior of code that only deals with numbers, but hey! It's interesting at least. We fix it, we are happy. A few days later I run the script again. It works. The result doesn't look right though. It's mostly zeroes. Hmm. I run it a few more times, playing around with input, trying to figure out what's up. OK, after a few minutes I realize there's lots of red color that flashes on running the script on my screen - just so fast I barely see it. It turns out half the code isn't really running, the script just ignores it though (errors do NOT stop the code from running), and keeps going. It produces partial output happily announcing it finished. That is the most serious mindfuck. Everything is OK, says the prompt, here's your 1 megabyte result of the calculation, oh, just don't look at the numbers, because I haven't really run any of the code... I couldn't find one of the functions. I sit there wondering. Which is worse: the fact that every time I try launching the script something else is happening, or the fact that the runtime environment by default will return garbage with NO warning at the end (which is the only thing you see on screen) but with a million warnings in between (which you won't see unless you have really good reflexes...). Which is worse? I decided at some point, that I want a language to fail, and to always give me the same result. An error, an exception, this should kill the program and shout as loud as possible "Won't give you anything". Also I want code that ran yesterday to run today, and to run on my colleague's machine, and on a newer version of R. This was never our experience." [68]
• " Sounds like you aren't running your scripts as scripts. If you source a script or run it via Rscript it will halt when it hits a failure (unless you've changed a default). Copying and pasting in to the REPL will hide errors like you describe. The other part is that it sounds like you don't have a standardized R environment. I admit that R's tooling there isn't the best, but there are options, e.g. {packrat} & {lockbox}... or better yet a Docker image. " [69]
• "...some authorities prefer <- for assignment but I’m increasingly convinced that they are wrong; = is always safer. R doesn’t allow expressions of the form if (a = testSomething()) but does allow if (a <- testSomething())."..."you can intend to type if(b < -9) and instead type if(b <- 9). The latter always evaluates to true and assigns the value you're trying to compare with to your variable. This can be extremely difficult to catch and detect..." [70] and [71]
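Several of the gotchas above, condensed into a runnable sketch:

```r
m <- matrix(1:6, nrow = 3)          # 3x2 matrix; col 1 is 1 2 3

# Subsetting silently drops dimensions: one matching row collapses
# to a plain vector unless drop = FALSE is given.
r <- m[m[, 1] > 2, ]                # only row 3 matches
print(is.matrix(r))                 # FALSE: it became a vector
r2 <- m[m[, 1] > 2, , drop = FALSE] # stays a 1x2 matrix

# c() always flattens: nesting does not create nested vectors.
v <- c(1, c(2, c(3, 4)))
print(length(v))                    # 4, not 2

# 'scalars' are just 1-element vectors.
print(length(42))                   # 1

# is.vector() is about attributes, not shape.
x <- 1:3
attr(x, "note") <- "hi"
print(is.vector(x))                 # FALSE, despite x being a vector
print(is.atomic(x) || is.list(x))   # TRUE: the recommended test
```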

Types [72]:

• homogeneous:
  • atomic vector (1d) (note: 'scalars' are just 1-element vectors in R)
    • element types: logical, integer, double, character (string), complex, raw
  • matrix (2d)
  • array (n-d)
• heterogeneous:
  • list (1d)
  • data frame (2d)
• nullable types (NA); NAs are typed
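The taxonomy above in code, including the typed NAs:

```r
# Homogeneous: atomic vectors; matrices/arrays are vectors plus a dim
# attribute, so the underlying element type is unchanged.
print(typeof(c(TRUE, FALSE)))   # "logical"
print(typeof(1L))               # "integer"
print(typeof(1.5))              # "double"
print(typeof("a"))              # "character"
m <- matrix(1:6, nrow = 2)
print(typeof(m))                # "integer": still an atomic vector

# Heterogeneous: lists, and data frames (lists of equal-length columns).
l <- list(1, "a", TRUE)
df <- data.frame(x = 1:2, y = c("a", "b"))

# NAs are typed: there is a distinct NA for each atomic type.
print(typeof(NA))               # "logical"
print(typeof(NA_integer_))      # "integer"
print(typeof(NA_real_))         # "double"
print(typeof(NA_character_))    # "character"
```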

## Julia

### Julia features

"

• Multiple dispatch: providing ability to define function behavior across many combinations of argument types
• Dynamic type system: types for documentation, optimization, and dispatch
• Good performance, approaching that of statically-compiled languages like C
• Built-in package manager
• Lisp-like macros and other metaprogramming facilities
• Call Python functions: use the PyCall package
• Call C functions directly: no wrappers or special APIs
• Powerful shell-like capabilities for managing other processes
• Designed for parallelism and distributed computation
• User-defined types are as fast and compact as built-ins
• Automatic generation of efficient, specialized code for different argument types
• Elegant and extensible conversions and promotions for numeric and other types
• Efficient support for Unicode, including but not limited to UTF-8 " [73]

### Julia opinions

• "There is an obvious reason to choose Julia: it's faster than other scripting languages, allowing you to have the rapid development of Python/MATLAB/R while producing code that is as fast as C/Fortran...That sounds like it violates the No-Free-Lunch heuristic. Is there really nothing lost?...Julia is fast because of its design decisions. The core design decision, type-stability through specialization via multiple-dispatch....Type stability is the idea that there is only 1 possible type which can be outputted from a method...There are some "lunches lost" that we will have to understand....The upside is that Julia's functions, when type stable, are essentially C/Fortran functions. Thus ^ (exponentiation) is fast. However, ^(::Int64,::Int64) is type-stable, so what type should it output? 2^5 ((yields)) 32, ((but)) 2^-5 ((yields)) "DomainError: Cannot raise an integer x to a negative power -n. Make x a float by adding a zero decimal (e.g. 2.0^-n instead of 2^-n), or write 1/x^n, float(x)^-n, or (x//1)^-n." [74] "
• "...the choice of unbalanced ends for blocks and ::s for attaching types makes the Julia code appear unnecessarily noisy IMHO. It is often said that code is read more than it is written and in my opinion Julia has definitely room for improvement here. The expression syntax is more reasonable, but there is still some unorthodox choice of operators, punctuators and the syntax for multiline comments that creates WTF moments every now and then (who has come up with this stuff: #= \ \$ ?)." [75]
• "One-based indexing is another questionable design decision. While it may be convenient in some cases, it adds a source of mistakes and extra work when interoperating with popular programming languages that all (surprise!) use 0-based indexing..." [76]
• "And the last issue that I want to mention in this section is apidocs. The standard documentation system is a step back even compared to Doxygen, not to mention Sphinx. Instead of using semantic markup it relies on rudimentary Markdown-based format with focus on presentation. Apart from obvious limitations of Markdown, this makes documentation of heterogeneous projects more difficult." [77]
• "JNA- and ctypes-like FFI is convenient, there is no doubt about it. But making it the default way to interface with native APIs is a major safety issue. C and C++ have headers for a reason and redeclaring everything by hand is not only time-consuming, but also error-prone. A little mistake in your ccall and you just happily segfaulted, and that’s an optimistic scenario. And now try to correctly wrap strerror_r or similar...So this “feature” instead of eliminating boilerplate eliminates type checking. In fact, there is more boilerplate per function in ccall than in, say, pybind11. This is another area where Python and even Java with their C API win. There is a way to abuse FFI there too, but at least it’s not actively encouraged." [78]
• "Another area where Julia is lacking at the moment is libraries, including the standard library. As has been pointed out elsewhere “Base APIs outside of the niche Julia targets often don’t make sense” and the general-purpose APIs are somewhat limited." [79]
• "For example, text formatting is one of the most basic and commonly used language facilities one could probably think of and Julia is even behind C++98 there. The standard library provides @printf and @sprintf but they are not extensible. You can’t even make them format a complex number. There is a rudimentary string interpolation, but in its current form it only seems to be useful for very basic formatting." [80]
• "startup time...a trivial hello world program in Julia runs ~27x slower than Python’s version and ~187x slower than the one in C....it’s not just scripts, Julia’s REPL which should ideally be optimized for responsiveness takes long to start and has noticeable JIT (?) lags...In addition to that, Julia programs have excessive memory consumption...Possible reason for this is the use of LLVM for JIT. LLVM is great as a compiler backend for statically-typed compiled languages, but it has been known not to work equally well in the context of dynamic languages. Unladen Swallow and a recent migration of WebKit away from LLVM are notable examples." [81]
• "The libraries for unit testing are also very basic, at least compared to the ones in C++ and Java. FactCheck is arguably the most popular choice but, apart from the weird API, it is quite limited and hardly developed any more." [82]
• "To summarize, I see the following problems with the language and its infrastructure right now:
  • performance issues including long startup time and JIT lags
  • somewhat obscure syntax and problems with interoperability with other languages
  • poor text formatting facilities in the language and lack of good unit testing frameworks
  • unsafe interface to native APIs by default
  • unnecessarily complicated codebase and insufficient attention to bug fixing
  Despite all this, I think the language can find its niche as an open-source alternative to MATLAB because its syntax might be appealing to MATLAB users. I doubt it can seriously challenge Python as the de-facto standard for numerical computing." [83]

• "Here’s a language that gives near-C performance that feels like Python or Ruby with optional type annotations (that you can feed to one of two static analysis tools) that has good support for macros plus decent-ish support for FP, plus a lot more. What’s not to like? I’m mostly not going to talk about how great Julia is, though, because you can find plenty of blog posts that do that all over the internet." [84]
• "It’s not unusual to run into bugs when using a young language, but Julia has more than its share of bugs for something at its level of maturity. If you look at the test process, that’s basically inevitable. As far as I can tell, FactCheck is the most commonly used thing resembling a modern test framework, and it’s barely used. Until quite recently, it was unmaintained and broken, but even now the vast majority of tests are written using @test, which is basically an assert. It’s theoretically possible to write good tests by having a file full of test code and asserts. But in practice, anyone who’s doing that isn’t serious about testing and isn’t going to write good tests. Not only are existing tests not very good, most things aren’t tested at all." [85]
• "Something that goes hand-in-hand with the level of testing on most Julia packages (and the language itself) is the lack of a good story for error handling. Although you can easily use Nullable (the Julia equivalent of Some/None) or error codes in Julia, the most common idiom is to use exceptions. And if you use things in Base, like arrays or /, you’re stuck with exceptions. I’m not a fan, but that’s fine – plenty of reliable software uses exceptions for error handling...The problem is that because the niche Julia occupies doesn’t care2 about error handling, it’s extremely difficult to write a robust Julia program....There are problems at multiple levels....If I’m writing something I’d like to be robust, I really want function documentation to include all exceptions the function might throw. Not only do the Julia docs not have that, it’s common to call some function and get a random exception that has to do with an implementation detail and nothing to do with the API interface....Another problem is that catching exceptions doesn’t work (sometimes, at random)." [86]
• "Since we’re broadly on the topic of APIs, error conditions aren’t the only place where the Base API leaves something to be desired. Conventions are inconsistent in many ways, from function naming to the order of arguments. Some methods on collections take the collection as the first argument and some don’t (e.g., replace takes the string first and the regex second, whereas match takes the regex first and the string second)." [87]
• "More generally, Base APIs outside of the niche Julia targets often don’t make sense. There are too many examples to list them all, but consider this one: the UDP interface throws an exception on a partial packet. This is really strange and also unhelpful. Multiple people stated that on this issue but the devs decided to throw the exception anyway. The Julia implementers have great intuition when it comes to linear algebra and other areas they’re familiar with. But they’re only human and their intuition isn’t so great in areas they’re not familiar with. The problem is that they go with their intuition anyway, even in the face of comments about how that might not be the best idea." [88]
• "Another thing that’s an issue for me is that I’m not in the audience the package manager was designed for. It’s backed by git in a clever way that lets people do all sorts of things I never do. The result of all that is that it needs to do git status on each package when I run Pkg.status(), which makes it horribly slow; most other Pkg operations I care about are also slow for a similar reason. That might be ok if it had the feature I most wanted, which is the ability to specify exact versions of packages and have multiple, conflicting, versions of packages installed" [89]
• "There’s lots of friction that keeps people from contributing to Julia. The build is often broken or has failing tests. When I polled Travis CI stats for languages on GitHub, Julia was basically tied for last in uptime. This isn’t just a statistical curiosity: the first time I tried to fix something, the build was non-deterministically broken for the better part of a week because someone checked bad code directly into master without review. I spent maybe a week fixing a few things and then took a break. The next time I came back to fix something, tests were failing for a day because of another bad check-in and I gave up on the idea of fixing bugs." [90]
• "That tests fail so often is even worse than it sounds when you take into account the poor test coverage. And even when the build is “working”, it uses recursive makefiles, and often fails with a message telling you that you need to run make clean and build again, which takes half an hour. When you do so, it often fails with a message telling you that you need to make clean all and build again, which takes an hour. And then there’s some chance that will fail and you’ll have to manually clean out deps and build again, which takes even longer. And that’s the good case! The bad case is when the build fails non-deterministically. These are well-known problems that occur when using recursive make, described in Recursive Make Considered Harmful circa 1997." [91]
• "...the biggest barrier to contributing to core Julia...is that the vast majority of the core code is written with no markers of intent (comments, meaningful variable names, asserts, meaningful function names, explanations of short variable or function names, design docs, etc.)." [92]
• "The metaprogramming in julia is so good I wrote a verilog DSL that transpiles specially written julia into compilable and verifiable verilog - in 3 days." [93]
• "I'm a quite happy Julia user, however I feel there are still some warts in the language that should have warranted a bit more time before banging 1.0 on the badge. Exception handling in julia is poor, which reminds me of how exceptions are (not/poorly) handled in R. Code can trap exceptions, but not directly by type as you _would_ expect. Instead, the user is left to check the type of the exception in the catch block. Aside from creating verbose blocks of boilerplate at every catch, it's very error prone. Very few packages do it right, and like in R, exceptions either blow up in your face or they simply fail silently as the exception is handled incorrectly upstream by being too broad. Errors, warnings and notices are also often written as if the only use-case scenario is the user watching the output interactively. Like with R, it's possible but quite cumbersome to consistently fetch the output of a julia program and be certain that "stdout" contains only what >you< printed. As I use julia also as a general-purpose language to replace python, I feel that julia is a bit too biased toward interactive usage at times. That being said, I do love multiple dispatch, and julia overall has one of the most pragmatic implementations I've come across over time, which also makes me forget that I don't really like 1-based array indexes." [94]
• "Pkg.generate: brilliant" [95] from [96]
• "Readability...unicode operators? Brilliant...sometimes!" [97] from [98]
• "Array indexing...Arbitrary indexing? Brilliant!... 1-based indexing…eww!" [99] from [100]
• "REPL • Shell? Help? C++?!? Brilliant! • REPL as exploration vs REPL as dev tool • workspace()? Brilliant! • method redeﬁnition… boo! (Infamous #265)" [101] from [102]
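One of the quotes above complains about inconsistent argument order in Base (replace takes the string first, match takes the regex first). A minimal sketch of that inconsistency, using Julia 1.x's Pair-based replace syntax:

```julia
# Collection-style functions like replace take the string first...
replace("hello", r"l" => "L")   # string, then pattern
# ...while match takes the pattern first:
match(r"l+", "hello").match     # pattern, then string
```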

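The catch-block boilerplate complained about in the 1.0 quote above (no catch-by-type syntax, so the handler must inspect the exception itself) can be sketched minimally like this:

```julia
# Julia has no `catch SomeError` syntax; a careful handler checks the
# exception's type in the catch block and rethrows anything it does
# not recognize.
function safe_parse(s)
    try
        parse(Int, s)
    catch e
        if e isa ArgumentError  # malformed input: fall back to a default
            0
        else
            rethrow()           # unexpected exceptions propagate
        end
    end
end
```

With this definition, `safe_parse("42")` returns 42 while `safe_parse("oops")` returns the fallback 0; forgetting the `rethrow()` branch is exactly the "too broad" silent-failure mode the quote describes.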
"

lenticular 42 days ago

It's extremely expressive. Notably, Julia is homoiconic, with full lisp-style macros. It also has multiple dispatch, which is a far more general technique than OO single-dispatch. This makes it very easy to define modular interfaces that work much like statically-typed type classes in Haskell. This allows you, for example, to define a custom matrix type for your bespoke sparse matrix layout and have it work seamlessly with existing linear algebra types.

I've done a lot of work in both Python with Scipy/Numpy, and Julia. Python is painfully inexpressive in comparison. Not only this, but Julia has excellent type inference. Combined with the JIT, this makes it very fast. Inner numerical loops can be nearly as fast as C/Fortran.

Expanding on the macro system, this has allowed things like libraries that give easy support for GPU programming, fast automatic differentiation, seamless Python interop, etc.

tombert 42 days ago

I'm not sure how I feel about multi-dispatch...I've had a few headaches chasing down problems with multimethods in Clojure...I'd have to try using Julia full-time to see how it feels.

I was unaware that Julia was homoiconic...I'm somewhat of a Lisp fanboy so I might need to give the language another chance.

lenticular 42 days ago

There's pretty big differences in usage between multimethods in Clojure and Julia. I've used both a decent amount. All functions in Julia are multimethods by default. If you don't use type annotations, a new method will be generated whenever you call the function with new argument types. This explicit type specialization is a very important part of why Julia can have such a consistently fast JIT despite its dynamicity.

Errors from missing or conflicting methods tend to not happen much in practice.

https://docs.julialang.org/en/v0.7.0/manual/methods/

nicoburns 42 days ago

> If you don't use type annotations, a new method will be generated whenever you call the function with new argument types.

Damn. That's a pretty clever trade off between dynamic and static types.

vanderZwan 42 days ago

I vaguely recall a talk by Stefan Karpinski where he mentions meeting one of the big names in compiler land (working on V8 or something) and they said Julia's JIT is just a lazily evaluated AOT compiler, and as a result much simpler than the JITs typically seen in other languages.

tombert 42 days ago

Forgive a bit of ignorance here, but that doesn't sound terribly different than overloading functions in C++. Am I way off on that?

KenoFischer 42 days ago

The difference is run-time (semantically) vs compile time. If you had overloading at runtime in C++, you wouldn't need virtual functions or any of the thousands of OO "patterns" (visitor, factory, etc.), that are working around the lack of this capability. " -- [103] todo digest this
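A minimal sketch of the multiple dispatch discussed in this thread: the method is selected on the runtime types of all arguments, which is what distinguishes it from C++ overloading (resolved at compile time) and from OO single dispatch (resolved on the first argument only).

```julia
# Three methods of one generic function; dispatch picks among them
# by the runtime types of *both* arguments.
combine(a::Number, b::Number) = a + b
combine(a::AbstractString, b::AbstractString) = a * b  # * concatenates strings
combine(a, b) = (a, b)                                 # fallback for mixed types

combine(1, 2)        # numeric method
combine("ab", "cd")  # string method
combine(1, "x")      # fallback method
```

Because the types are examined at run time, the same call site can reach different methods depending on which values flow into it, which is the "runtime overloading" KenoFischer's comment contrasts with C++.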

"

kgwgk 42 days ago

> Julia is homoiconic

It depends on what you understand by homoiconic:

eigenspace 42 days ago

This is why the language creators usually avoid using the word 'homoiconic' because every time one uses that word there's a finite probability of being bogged down in an incredibly uninteresting semantic argument.

Instead, people prefer to say that julia code is just another (tree-like) data-structure in the language and it can be manipulated at runtime with functions or compile time with macros or at parse time with string macros and now with Cassette.jl[1] we can even manipulate the form of code that has already been written and shipped by other packages all with first class metaprogramming tools. It seems to me that even if Julia is not 'truly homoiconic', that we seem to get the touted benefits of homoiconicity to the point that it seems like an unimportant distinction.

Athas 41 days ago

Then why not just say 'Julia has macros'? Lightly perusing the description of Julia's features, that seems like a clear way of expressing what it is.

(I also vaguely recall Julia describing its type system as "dependent" in way that goes against convention. Maybe they just liked controversy in the early days!)

eigenspace 41 days ago

What I just said is why Julia people don’t tend to call it homoiconic or dependently typed. It led to so many semantic arguments that most just talk about actual features such as macros and various multiple dispatch features instead of using words like homoiconicity and dependent typing.

zem 41 days ago

"macros" are unfortunately used by C and lisp to describe two different things, and both usages are as widely popular as their parent languages (i.e. very). "

-- [104] todo digest this
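The "code is just another tree-like data structure" point from the thread above, as a minimal sketch:

```julia
# :( ... ) quotes code into an Expr tree instead of evaluating it.
ex = :(1 + 2 * x)
ex.head          # :call -- a function-call node, S-expr style
ex.args          # [:+, 1, :(2 * x)] -- callee first, then arguments
ex.args[3] = 4   # rewrite the subexpression 2*x into the literal 4
eval(ex)         # the modified tree evaluates as ordinary code: 1 + 4
```

This is the pairlist-like representation the R quotes at the top of the page describe as well: the first element is the name being invoked and the rest are arguments, so runtime construction and macro-style rewriting both operate on plain data.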

### Julia Internals and implementations

Core data structures: todo

#### Number representations

Integers

Floating points todo

#### array representation

variable-length lists: todo

multidimensional arrays: todo

limits on sizes of the above

## Mathematica

"symbolic manipulation and solving"

"continuous-time dynamic systems"

## Modelica

"mechanical, electrical, etc systems"

## Verilog-AMS

"analog and mixed-signal electronics"

## Esterel

"reactive control systems"

## SBOL

"synthetic biology systems"

## Church

http://v1.probmods.org/

## WebPPL

http://probmods.org/

## Relay IR

https://docs.tvm.ai/langref/index.html

"Relay is a functional, differentiable programming language designed to be an expressive intermediate representation for machine learning systems. Relay supports algebraic data types, closures, control flow, and recursion, allowing it to directly represent more complex models than computation graph-based IRs can. Relay also includes a form of dependent typing using type relations in order to handle shape analysis for operators with complex requirements on argument shapes."

part of the TVM project

## Weld IR

https://www.weld.rs/

"A Common Runtime for High Performance Data Analytics"

"Weld is a runtime for improving the performance of data-intensive applications. It optimizes across libraries and functions by expressing the core computations in libraries using a small common intermediate representation, similar to CUDA and OpenCL...For example, for Spark, NumPy, and TensorFlow, porting over a few Weld operators can increase performance by up to 30x even on some simple workloads!"

https://cs.stanford.edu/~matei/papers/2017/cidr_weld.pdf

https://github.com/weld-project/weld/tree/master/python/grizzly/grizzly https://www.weld.rs/grizzly https://pypi.python.org/pypi/pygrizzly/0.0.1 Grizzly is a subset of the Pandas data analytics library integrated with Weld

https://www.weld.rs/weldnumpy WeldNumpy is a subset of the NumPy numerical computing framework integrated with Weld

"Split annotations are a system that allow annotating existing code to define how to split, pipeline, and parallelize it. They provide the optimization that we found was most impactful in Weld (keeping chunks of data in the CPU caches between function calls rather than scanning over the entire dataset), but they are significantly easier to integrate than Weld because they reuse existing library code rather than relying on a compiler IR. This also makes them easier to maintain and debug, which in turn improves their robustness. Libraries without full Weld support can fall back to split annotations when Weld is not supported, which will allow us to incrementally add Weld support based on feedback from users while still enabling some new optimizations." -- [106] https://github.com/weld-project/split-annotations https://shoumik.xyz/static/papers/mozart-sosp19final.pdf

## TensorFlow MLIR

https://github.com/tensorflow/mlir/blob/master/g3doc/Dialects/Standard.md

Standard Types: "Standard types are a core set of dialect types that are defined in a builtin dialect and thus available to all users of MLIR."

• complex-type
• float-type
• function-type
• index-type
• integer-type
• memref-type
• none-type
• tensor-type
• tuple-type
• vector-type