proj-jasper-jasperLibrariesNotes1

"(Is Urbit German? Sadly, no. But all our noun print formats are URL-safe, which dot is and comma isn’t.)"

--

upvote

TylerE 37 days ago

link

You might be surprised just how much churn there is if you look at the source of, for instance, Haskell's Data.Map, which in the name of immutability is implemented as a balanced binary tree rather than the more common hash table. Many operations that are amortized O(1) with a hash table are O(log N) with a balanced binary tree, since, for instance, after an update all the sub-trees around the update site will be made afresh, because the balancing will change.


upvote

chongli 37 days ago

link

The balanced binary tree of Data.Map is nowhere near state of the art in terms of persistent data structures. Clojure (and Haskell's unordered-containers library) use Hash Array Mapped Tries (HAMTs), which have a much higher branching factor (32-ary) than binary trees. In practice, the difference from constant time is negligible.

-- -- -- --

upvote

zorked 37 days ago

link

Also, C++, which is the king of mutability, implements map as a tree instead of a hash table, probably trying to avoid the worst-case linear search.


upvote

bloodorange 36 days ago

link

The standards committee could not come to a complete agreement on hash tables, and therefore they were not a part of C++03. Hash tables were in consideration along with the tree-based map; it was not one versus the other. They wanted to have both, but only the tree-based map was ready in time.

Eventually by the time they came to an agreement on hash tables (C++11), a lot of vendors had their own versions of it as non-standard additions to the library. To avoid conflicts with these existing hash tables, they named it "unordered_map".


upvote

detrino 37 days ago

link

The C++ standard places a number of requirements (ordered, stable iterators/references, logarithmic operations) on std::map which really only lines up with balanced binary trees. As another poster pointed out, C++11 introduced std::unordered_map (A hash table).


upvote

cgag 37 days ago

link

Somehow the difference between O(1) and O(log32 n) has never actually been an issue for me in Clojure.

upvote

reycharles 26 days ago

link

Ah, I see. That doesn't matter in big-O notation, though, because log_32(n) = log_2(n) / log_2(32), that is the difference between two bases is a constant factor.

I was a bit confused because O(log_32(n)) = O(log(n)), and O(log(32n)) = O(log(32) + log(n)) = O(log(n)), so no matter how I parsed it I would just get O(log(n)) :)


upvote

strager 37 days ago

link

Of course, there's Data.HashMap.HashMap[1], which is just a Data.IntMap.{Lazy,Strict}.IntMap[2], whose "implementation is based on big-endian patricia trees".

> Many operations have a worst-case complexity of O(min(n,W)). This means that the operation can become linear in the number of elements with a maximum of W -- the number of bits in an Int (32 or 64).

[1] http://hackage.haskell.org/packages/archive/hashmap/1.3.0.1/... [2] http://hackage.haskell.org/packages/archive/containers/lates...

--

upvote

paddyforan 36 days ago

link

I think one of my favourite things about Go is that it makes it easy and obvious to code well. Generally the first thing that comes to my mind is going to be performant and extendable.

The reason this is the case, I think, is that Go strives to make algorithmic complexity clear: you know when you're allocating memory, but you don't need to jump through hoops to do it. You know the rough performance costs of what you're doing because the core team works hard to make it obvious. For example, some regular expressions features (lookahead, if I remember correctly) aren't implemented in the Go standard library's regular expression package, because they're impossible to achieve in O(n) time. https://groups.google.com/forum/#!topic/golang-nuts/7qgSDWPI...

This level of care in making simple, clearly-defined tools with known properties makes it easy to code well. Ruby, Python, PHP, NodeJS... you can shoot yourself in the foot and not even notice.

--

" A good collections library should have the choices required to get asymptotically optimal algorithms built with almost no fuss, with the choices between ordered/unordered and set/list/map and queue/stack/heap being easy to find. Java undeniably does this really well; even if Go still does this ok. "

---

how the time library works in Go:

http://blog.cloudflare.com/its-go-time-on-linux

---

need a std convention for whether various functions that return both a horizontal and vertical component do order y, x = fn(a) or x, y = fn(a)

and this should match a std convention for array ordering, e.g. if we choose the matrix convention that axis zero is vertical and axis 1 is horizontal, then it should be

  y, x = fn(a)
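a quick pure-Python sketch of what that convention would mean in practice (argmax2d is a made-up illustrative name, not a real library function): a function returning a position in a 2-D structure should yield (y, x) so the result lines up with a[y][x] indexing:

```python
def argmax2d(a):
    """Return (y, x) of the maximum, matching a[y][x] indexing
    (axis 0 vertical, axis 1 horizontal)."""
    best = (0, 0)
    for y, row in enumerate(a):
        for x, v in enumerate(row):
            if v > a[best[0]][best[1]]:
                best = (y, x)
    return best

a = [[1, 9],
     [3, 5]]
y, x = argmax2d(a)
assert (y, x) == (0, 1) and a[y][x] == 9
```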

---

i notice that in the Python world often there is an awesome library, like SciPy, and then an environment, like PyPy or AppEngine, that doesn't support or doesn't fully support C extensions, and then someone else makes a 'pure python' version or slight variant of the original library (or not, if it's a huge library like scipy).

so perhaps the standard in the Jasper world should be that a pure jasper version of a library should always be made available in conjunction with the library. this will not be mandatory except for recommended libs

---

languages seem to have a standard license for their libraries, which is the same as the language license.

we should use apache/BSD/MIT or similar. FSF says that Apache 2.0 is the best of these due to patent conveyance: https://www.gnu.org/licenses/license-recommendations.html#software . However note that Apache 2.0 is GPL2 (but not GPL3) incompatible: http://stackoverflow.com/questions/40100/apache-license-vs-bsd-vs-mit . BSD seems to be preferred over MIT although i haven't really looked into this(?).

so now we have to look into whether incompatibility with GPLv2 (but not GPLv3) is a substantial problem (arg). First, it is undisputed that a lot of software out there still uses GPLv2, although i dunno if most of those have the 'or any later version' clause, which would make this moot. But should they? GPL 3 is controversial due to anti-DRM provisions (see http://www.zdnet.com/blog/burnette/gplv3-myth-3-gpl-forbids-drm/354 ). Indeed, the Linux kernel is still GPL 2, possibly because of Linus and others' objections to the stronger anti-DRM provisions in an earlier GPL 3 proposal. See also https://en.wikipedia.org/wiki/Tivoization#GPLv3 . Linus seems to consider the final draft of GPL 3 acceptable: http://archive.is/UUfC . However, here Linus says he doesn't like it: http://lkml.iu.edu//hypermail/linux/kernel/0706.1/2214.html , http://news.softpedia.com/news/Linus-Torvalds-Says-No-To-GPLv3-75766.shtml (Jan 2008).

see also http://archive09.linux.com/feature/139885 (old stats) , https://lwn.net/Articles/560670/ (misc discussion), http://www.linuxtoday.com/developer/2010092000435OPKNMO (continuing relevance of tivoization) , http://unix.stackexchange.com/questions/49906/why-is-freebsd-deprecating-gcc-in-favor-of-clang-llvm (freeBSD's dislike of any GPL).

there is also some controversy about the patent provisions, see https://moodle.org/mod/forum/discuss.php?d=213610 , in which UCLA is not permitting contributions to Moodle for fear that someone will sneak unrelated patent-using code into the codebase in order to open those patents, which people say is very unlikely to happen but possible. UCLA later suggests an end-run around these provisions which would appear to be acceptable to everyone: https://moodle.org/mod/forum/discuss.php?d=213610#p1117261

another objection is the need for locked-down devices for safety-critical applications; see the right to repair, and GENIVI goes open source. imo the GPLv3 should have had an exception allowing lockdown of safety-critical applications.

my personal conclusion: linus and the other linux kernel devs (e.g. https://lkml.org/lkml/2014/4/1/892 ) are an anomaly. There are indeed many entities who don't want anti-tivoization clauses in the license (e.g. hardware vendors like Apple who like to create locked-down devices), but imo these days many of them would go for apache/BSD/MIT. Linus has a point that GPLv2 might be better if you want to legally force an open community for improving source code while also supporting locked-down devices (see also http://yarchive.net/comp/linux/gpl.html ), but in practice it seems to me that most people with those preferences just use apache/BSD/MIT despite the potential problems with companies forking and closing the resulting codebase. There are also many entities who want software freedom to some extent or another (and unlike UCLA most of these are also opposed to software patents in general, so i think that objection affects relatively few people, and even UCLA's lawyers found something they could live with anyhow) and i can't think of any reason aside from safety-critical that these people wouldn't want GPLv3 over GPLv2 (except for just the popularity of GPLv2, which will probably wane over time).

so, GPLv2 compatibility is not very important. So, apache is in the running.

Now, is Apache better than BSD and MIT? FSF thinks so, Android thinks so ( http://lucumr.pocoo.org/2013/7/23/licensing/#androidlicense ), Github considers it in the running (" As such I see GitHub's newly added license choosing helper dialog problematic. When you make a new repository it gives you a dialog to pick a license without any explanation of what the licenses mean. It even bolds some licenses for you. The ones that it deems more important than others are “Apache v2 License”, “GPLv2” and “MIT”. "). However "Twitter's Bootstrap is currently migrating from ASL2.0 to MIT precisely because some people still need GPLv2 compatibility. Among those projects that were affected were Drupal, WordPress, Joomla, the MoinMoin Wiki and others. And even that case shows that people don't care that much about licenses any more as Joomla 3 just bundled bootstrap even though they were not licensed in a compatible way (GPLv2 vs ASL 2.0).". Google Code's 2008 survey found Apache more popular than MIT and BSD: http://google-opensource.blogspot.com/2008/05/standing-against-license-proliferation.html

some ppl complain that Apache is long and like MIT b/c it's short.

one problem i have with Apache is "(b) You must cause any modified files to carry prominent notices stating that You changed the files; and"; wouldn't this encourage piling a lot of junk metadata into every source code file? i doubt anyone actually cares about that part, though. Perhaps just ignore it.

this dude claims Apache is unpopular: https://lukasa.co.uk/2012/05/GPL_vs_MIT_Which_License_To_Use/ . A guy who apparently founded Django likes Apache due to the patent thing: http://travisswicegood.com/2013/03/06/open-source-licenses/ .

http://www.blackducksoftware.com/resources/data/top-20-open-source-licenses currently has GPL 2 at 33%, Apache 2 at 13%, GPL 3 at 12%, MIT at 11%.

my current best guess: Apache is better b/c of the patent thing.

should check http://www.blackducksoftware.com/resources/data/top-20-open-source-licenses later, which shows foss license popularity

---

need a convention for whether, for things like np.unravel_index, np.ravel_multi_index, a tuple of index arrays is preferred, or an array of index tuples (e.g. when there are many items each with a fixed dimensionality of coordinates, do we group first by coordinate or by item?)
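to make the two layouts concrete, here is a pure-Python sketch (no numpy needed) of the same three 2-D index pairs grouped each way, and the zip(*...) conversion between them:

```python
# Two layouts for three 2-D index pairs:
by_coordinate = ([0, 2, 4], [1, 3, 5])   # tuple of index arrays (np.unravel_index style)
by_item = [(0, 1), (2, 3), (4, 5)]       # list of index tuples

# zip(*...) converts between the two groupings:
assert list(zip(*by_coordinate)) == by_item
assert [list(c) for c in zip(*by_item)] == [[0, 2, 4], [1, 3, 5]]
```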

--

interesting note: is y = zip(*x) the inverse of x = zip(y) in Python? Not quite: zip(y) adds an extra enclosing layer. Rather, zip(*x) appears to be its own inverse (it transposes). In one case where variable l held the results of a zip, zip(*zip(*l)) == l, but zip(zip(*l)) was not equal to l, because the outer zip wrapped each element of zip(*l) in an enclosing tuple.
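checking this in Python 3 (where zip returns an iterator, so list() is needed to compare):

```python
l = [(1, 2), (3, 4), (5, 6)]   # e.g. the result of an earlier zip

# zip(*x) transposes, so applying it twice is the identity:
assert list(zip(*zip(*l))) == l

# but zip() without the * wraps each element in a 1-tuple, adding a layer:
assert list(zip(zip(*l))) == [((1, 3, 5),), ((2, 4, 6),)]
```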

--

i found myself doing stuff like:

  def nearest_areas_for_points(points, areas, pure_python=False):
      """distances, indexes = nearest_areas_for_points(points, areas, pure_python=False)"""
      kdTree = nearest_areas_KDtree(areas, pure_python=pure_python)

to pass a pure_python argument along. since we are having a convention that pure_jasper should always be available, may as well have a mechanism (probably just a blessed global) to indicate this.
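a hedged sketch of the "blessed global" idea: one default flag (here fed by an environment variable) so pure_python=... need not be threaded through every call. All names below (JASPER_PURE, the impl functions) are made up for illustration:

```python
import os

# Hypothetical convention: a single blessed global default for pure-Python fallbacks.
PURE_PYTHON = os.environ.get("JASPER_PURE", "0") == "1"

def _pure_impl(areas):
    return ("pure", sorted(areas))      # stand-in for a pure-Python implementation

def _native_impl(areas):
    return ("native", sorted(areas))    # stand-in for a C-extension implementation

def nearest_areas_KDtree(areas, pure_python=None):
    if pure_python is None:
        pure_python = PURE_PYTHON       # consult the global default
    return _pure_impl(areas) if pure_python else _native_impl(areas)
```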

---

f = csv.reader(fileinput.input())

  from Python but would be useful

e.g. the 'file reader' iteration convention should be provided by a 'fileinput' library that reads from files whose names are provided on the commandline, OR from stdin. It would be better if you could also have optional arguments, b/c then you could write 'std unix style utilities' easily. In fact maybe there should be a standard argument-parsing library emulating Python's best-of-breed, whatever that is..
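a small sketch of the Python idiom being referenced, wrapped as a function so it can be tested with a temp file standing in for a command-line argument (read_rows is an illustrative name):

```python
import csv
import fileinput
import os
import tempfile

def read_rows(paths):
    """Read CSV rows from the named files, or from stdin if paths is empty,
    in the std-unix-utility style the note describes."""
    with fileinput.input(files=paths) as f:
        return list(csv.reader(f))

# demo: a temporary file plays the role of a command-line filename argument
fd, path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "w") as f:
    f.write("a,b\nc,d\n")
assert read_rows([path]) == [["a", "b"], ["c", "d"]]
os.remove(path)
```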

-- 'std unix style utilities':

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html

(i think this is part of the POSIX 2008 spec?)

--

our regexes should support basic data types directly, e.g. one should not have to say "[\d\.]+" to read in a floating-point (non-scientific-notation) number.
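the pain today, sketched in Python: you hand-write the float pattern yourself (this one handles plain decimal notation only, matching the note's scope):

```python
import re

# Hand-rolled pattern for a plain (non-scientific) decimal number:
FLOAT = r"[-+]?\d+(?:\.\d+)?"

nums = [float(m) for m in re.findall(FLOAT, "x=3.25 y=-7 z=0.5")]
assert nums == [3.25, -7.0, 0.5]
```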

--

we need to support decimals (precise arithmetic) almost everywhere so that a library client can request it in lieu of float.
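Python's decimal module shows the behavior a client would be requesting:

```python
from decimal import Decimal

# Binary floats accumulate rounding error:
assert 0.1 + 0.2 != 0.3

# Decimal gives exact decimal arithmetic, which a library client
# could request in lieu of float:
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
```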

--

ok this is pretty crazy:

http://stackoverflow.com/questions/1986152/why-python-doesnt-have-a-sign-function

they don't have a sign fn but they do have cmp(x, 0), which is equivalent..
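note that cmp was removed in Python 3 anyway; the usual one-liner replacement:

```python
def sign(x):
    # equivalent to Python 2's cmp(x, 0); cmp itself is gone in Python 3
    return (x > 0) - (x < 0)

assert [sign(v) for v in (-5, 0, 3)] == [-1, 0, 1]
```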

---

a platform-independent way of pausing until a keypress or timeout!

http://stackoverflow.com/questions/7606064/python-wait-for-a-key-press-or-until-timeout

"

up vote 3 down vote accepted

With Thomas's suggestion, I came up with this function:

  import signal

  def input_or_timeout(timeout):
      def nothing(sig, frame):
          pass
      signal.signal(signal.SIGALRM, nothing)
      signal.alarm(timeout)
      try:
          raw_input()
          signal.alarm(0)
      except (IOError, EOFError):
          pass

It waits for input for at most timeout seconds.

Under Windows, I suppose you could replace raw_input() with getch() from msvcrt.


answered Sep 30 '11 at 5:38 zneak


Under Windows you can't use this code at all because it doesn't support SIGALRM. – Michał Bentkowski Sep 30 '11 at 8:17 "

bayle: this doesn't seem to work in ipython either :(

but in ipython -pylab you can do pylab.waitforbuttonpress(timeout=3)

--

Common Lisp has a number of ways to call a function. Note the generality of 'apply' and the alternative form, 'funcall'. Not saying that Jasper should have so many alternatives, but these provide some alternatives that can be considered for the one that we choose:

" The following four expressions all have the same effect:

    (+ 1 2)
    (apply #'+ '(1 2))
    (apply (symbol-function '+) '(1 2))
    (apply #'(lambda (x y) (+ x y)) '(1 2))

In Common Lisp, apply can take any number of arguments, and the function given first will be applied to the list made by consing the rest of the arguments onto the list given last. So the expression (apply #'+ 1 '(2)) is equivalent to the preceding four. If it is inconvenient to give the arguments as a list, we can use funcall, which differs from apply only in this respect. This expression (funcall #'+ 1 2) has the same effect as those above. "
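for comparison, in Python the * argument-unpacking syntax plays apply's role, so the same equivalences look like:

```python
import operator

assert operator.add(1, 2) == 3                 # like (funcall #'+ 1 2)
assert operator.add(*[1, 2]) == 3              # like (apply #'+ '(1 2))
assert (lambda x, y: x + y)(*[1, 2]) == 3      # like (apply #'(lambda (x y) (+ x y)) '(1 2))
# apply's "cons leading args onto the final list" form:
assert operator.add(*([1] + [2])) == 3         # like (apply #'+ 1 '(2))
```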

--

common lisp's 'map' function is called 'mapcar':

" > (mapcar #'(lambda (x) (+ x 10)) '(1 2 3))
(11 12 13) "

and it can take multiple aligned lists of arguments and (sorta) 'zip' them:

" > (mapcar #'+ '(1 2 3) '(10 100 1000))
(11 102 1003) "
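Python's built-in map does the same thing with multiple aligned iterables:

```python
# single iterable, like the first mapcar example:
assert list(map(lambda x: x + 10, [1, 2, 3])) == [11, 12, 13]

# multiple aligned iterables, like mapcar with several lists:
assert list(map(lambda a, b: a + b, [1, 2, 3], [10, 100, 1000])) == [11, 102, 1003]
```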

--

annnnd 10 hours ago

link

<offtopic>

+1 just for mentioning requests (versus urllib). If someone hasn't tried this library yet, they really should. I replaced tons of libcurl / urllib / ... blocks with simple and legible oneliners. Incredible API!

EDIT: url for those who don't know the library yet: http://docs.python-requests.org/en/latest/ (I am not affiliated, just really impressed)

</offtopic>

reply

--

package manager:

python pip wheels is supposed to be cool

---

--

" Nowadays stuff like proper closures, immutability, a good async story, etc, is considered a necessity by discerning hackers. "

"

Remove the GIL. Or provide a compelling async story. Guido’s PEP 3156 might, or might not be that. Some primitives like Go’s channels would be nice to have.

Make it speedy. Seriously, if Javascript can be made fast, CPython could be made fast too. Or PyPy could mature enough to replace it (there should be only one). If it takes big bucks or Lars Bak to do it, start a Kickstarter — I'll contribute. Shame major companies into contributing too. Isn't Dropbox spending some money on its own LLVM based Python anyway?

Add types. Well, opt-in types. That you can use to speed some code up (a la Cython), or to provide assurances and help type-check (a la Dart). Add type annotations to everything in the standard library.

Shake the standard libraries. Get a team together to go through them and fix long standing annoyances, improve speed and fix bugs. Improve their API, and offer nice simpler interfaces for common stuff (think requests vs urllib) Offer the new improved library alongside the current standard library in a different namespace. Make it easy to switch over (perhaps with some automated tool).

Revamp the REPL. It’s 2014 already. Redo the REPL in a modern way. Add some color. Take notes from iPython. Make it a client/server thing, so IDEs and editors can embed it.

" -- https://medium.com/p/2a7af4788b10

"the author's priorities seem to line up exactly with my own thoughts on Python's shortcomings. Async, speed, types, stdlib, repl." -- https://news.ycombinator.com/item?id=7803432

--

"

Powerful. Because it’s so well designed, it’s easier than other languages to transform your ideas into code. Further, Python comes with extensive standard libraries, and has powerful datatypes such as lists, sets and dictionaries. These really help to organize your data.

Namespaces. Matlab supports namespaces for the functions that you write, but the core of Matlab is without namespaces; every function is defined in the global namespace. Python works with modules, which you need to import if you want to use them. (For example from skimage import morphology.) Therefore Python starts up in under a second. Using namespaces gives structure to a program and keeps it clean and clear. In Python everything is an object, so each object has a namespace itself. This is one of the reasons Python is so good at introspection.

Introspection. This is what follows from the object oriented nature of Python. Because a program has a clear structure, introspection is easy. Private variables only exist by convention, so you can access any part of the application, including some of Python’s internals. Of course, in good programming practice you would not use private variables of other classes, but it’s great for debugging!

String manipulation. This is incredibly easy in Python. What about this line of code which returns a right justified line of 30 characters: "I code in Matlab".replace('Matlab','Python').rjust(30) " -- http://www.pyzo.org/python_vs_matlab.html

---

http://www.infoq.com/articles/Java-8-Quiet-Features

8 Great Java 8 Features No One's Talking about

my summary:

1. Stamped Locks

optimistic concurrency locking ("stamp" because you have to pass your stamp in whenever you access the lock so that it knows if the value has changed after you got your stamp)

2. Concurrent Adders

If multiple threads are adding to the same counter, the old way to do it was to use an atomic CAS and then spin until it succeeds.

Concurrent adders (LongAdder) provide another way: try once and then if the CAS failed, just threadlocally keep a count of how many counts failed to be added, and then when the thread reads the count, (try again i think? and if still fails) just take the global counter value and add the threadlocal "to be added" count.

note: for jasper, should also offer an atomic CAS that spins, but until a timeout. Everything should have a timeout in jasper.
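a hedged Python sketch of that idea (not real Jasper or Java API; Cell and add_with_timeout are made-up names, with CAS emulated by a lock): the usual CAS spin loop, but bounded by a deadline:

```python
import threading
import time

class Cell:
    """A mutable cell with compare-and-set, emulated with a lock."""
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def compare_and_set(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

def add_with_timeout(cell, delta, timeout):
    """Spin on CAS like the classic counter pattern, but give up after a deadline."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        old = cell.get()
        if cell.compare_and_set(old, old + delta):
            return True
    return False   # timed out

c = Cell(0)
assert add_with_timeout(c, 5, 1.0) and c.get() == 5
```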

3. Parallel Sorting

Arrays.parallelSort(myArray) to automatically use multiple CPUs in parallel to sort something faster.

4. new Date API

apparently better and cleaner than the old one. Should take a serious look at this for Jasper.

5. Controlling OS Processes

"To help us with this Java 8 introduces three new methods in the Process class -

    destroyForcibly - terminates a process with a much higher degree of success than before.
    isAlive tells if a process launched by your code is still alive.
    A new overload for waitFor() lets you specify the amount of time you want to wait for the process to finish. This returns whether the process exited successfully or timed-out in which case you might terminate it."
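Python's subprocess module already has this shape, for comparison (assuming a POSIX "sleep" binary is available): wait with a timeout, then force-kill (roughly destroyForcibly) if it expires:

```python
import subprocess

p = subprocess.Popen(["sleep", "10"])
try:
    p.wait(timeout=0.2)        # like waitFor(timeout, unit)
except subprocess.TimeoutExpired:
    p.kill()                   # roughly destroyForcibly()
    p.wait()
assert p.poll() is not None    # isAlive() would now be false
```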

6. Exact Numeric Operations

"Java 8 has added several new “exact” methods to the Math class geared towards protecting sensitive code from implicit overflows, by throwing an unchecked ArithmeticException when the value of an operation overflows its precision.

int safeC = Math.multiplyExact(bigA, bigB); will throw ArithmeticException if result exceeds +-2^31 "
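a Python sketch of the same check (Python ints don't overflow, so the 32-bit range is checked explicitly; multiply_exact is an illustrative name):

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def multiply_exact(a, b):
    r = a * b
    if not (INT_MIN <= r <= INT_MAX):
        raise ArithmeticError("integer overflow")
    return r

assert multiply_exact(3, 4) == 12

overflowed = False
try:
    multiply_exact(2**20, 2**20)   # 2^40 exceeds +-2^31
except ArithmeticError:
    overflowed = True
assert overflowed
```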

7. Secure Random Generation

"So far selection of the Random Number Generation algorithms has been left to the developer. The problem is that where implementations depend on specific hardware / OS / JVM, the desired algorithm may not be available. In such cases applications have a tendency to default to weaker generators, which can put them at greater risk of attack.

Java 8 has added a new method called SecureRandom?.getInstanceStrong() whose aim is to have the JVM choose a secure provider for you. If you’re writing code without complete control of the OS / hardware / JVM on which it would run (which is very common when deploying to the cloud or PaaS?), my suggestion is to give this approach some serious consideration. "

8. Optional References

"Optional" parametric type (like Maybe in Haskell)

---

remember; everything should have a timeout in Jasper

---

some clues as to what the current best Java logging library is:

"

smackfu 2 days ago

link

Beyond actual capabilities of log4j vs JDK logging, it's not clear that anyone considered how adoption would work. At the time, log4j supported several back versions of the JDK, while JDK logging obviously required you to be on the newest version, v1.4. As a result, if a library wanted to support any older JDKs, they couldn't switch to JDK logging, or had to support two parallel logging frameworks for little gain. And if your libraries are sticking with log4j, what's the advantage to your program of using the JDK logging?

slf4j is the current solution for this kind of problem, a common interface to various different logging solutions.

reply

yawz 2 days ago

link

Logback (next gen log4j): http://logback.qos.ch/

reply

jhawk28 2 days ago

link

Log4j 2 (next gen logback): http://logging.apache.org/log4j/2.x/

reply

dclusin 2 days ago

link

Thanks for sharing this. Curious where you get your java framework/cool api news from? I hadn't heard about this.

reply

deepsun 2 days ago

link

Curious, why Ceki Gülcü (Log4J author) is no longer in team?

reply

burkaman 2 days ago

link

Ceki left Log4J to develop Logback, I think because he felt like he had lost control of the project.

"For me, starting a new project was a lot worse than just "disheartening". The SLF4J vote was just the straw that broke the camel's back. After putting many many hours of work into log4j, it became increasingly painful to waste time in arguments, where opinions got asserted by the one writing the longest email. Not fun."

reply

thescrewdriver 2 days ago

link

Just about every project used Log4j. When they added logging to the JDK they shipped logging support which was worse than what everyone was already using (log4j), so adoption was very limited.

reply

jburwell 2 days ago

link

Which in turn gave rise to the horrors of commons-logging (clogging) attempting to wrap both APIs and magically detect your logging framework with nasty classloader tricks. While slf4j is a very sane and useful wrapper, it shouldn't be necessary -- JDK logging should have been the wrapper API providing an SPI to plugin implementations (e.g. log4j, logback, etc).

reply

MrBuddyCasino 2 days ago

link

Right - fixing it in that case means "do the same as thing as with Joda, but do it with slf4j and logging instead".

reply

jburwell 1 day ago

link

At this point, it is attempting to close the barn door after the horses have run halfway around the world. Too many systems have been built up on clogging and SLF4J. "Fixing" it now by just adding a third wrapper to the mix would not be a Good Thing(tm), IMHO. Sadly, it's as fixed as it is ever going to get at this point.

reply

"

---

(at least sorta) how sets work in golang (i think):

  s := make(map[string]bool)   // the "set" is a map with bool values
  s[elementToAdd] = true
  if s[elementToTest] {
      // elementToTest is a member (missing keys yield false)
  }

--

"

Function round is absent. This seems like a small fish until you realize that many people implement round incorrectly like this:

    let round x = truncate (x +. 0.5) (* BROKEN! *)

(forgetting about the negative numbers) instead of the correct:

    let round x = int_of_float (floor (x +. 0.5))

Lists

    List.map is not tail-recursive!
    Lists are immutable, thus cannot be spliced - non-consing operation is impossible." -- http://www.podval.org/~sds/ocaml-sucks.html

---

Io> list(30, 10, 5, 20) detect(>10)

> 30

---

consider using simple english type things for some jasper names

--

digital audio in js:

http://wavepot.com/

https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specification.html

" chrisdevereux 22 hours ago

link

Realtime audio DSP seems like about the last thing that in-browser JavaScript was designed to do well.

Janky audio is unacceptable in a way that janky visuals aren't. It totally kills the experience.

Without very strong guarantees about GC latency and thread scheduling that are neither available nor on the horizon for in-browser js, it won't work for anything beyond fun hacks.

reply

chrislo 21 hours ago

link

Are you aware of the Web Audio API? It's an API designed especially to address those concerns, and achieves low latency and high performance where it is available (Chrome, Firefox and some versions of Safari currently).

https://dvcs.w3.org/hg/audio/raw-file/tip/webaudio/specifica...

reply

chrisdevereux 21 hours ago

link

Yes, and it works by mostly avoiding realtime DSP in javascript. It provides a high-level API for wiring together prebuilt audio processing nodes. Much like using browser animation APIs vs programmatic animation. My point is about the feasibility of JS signal processing, not the feasibility of using JS to glue together signal processors.

The WebAudio API does provide a ScriptProcessorNode for JS processing, but it appears to suffer from the problems I described. The spec seems to warn against using this for realtime processing, fwiw.

reply "

--

https://github.com/codemix/fast.js

" What?

Fast.js is a collection of micro-optimisations aimed at making writing very fast JavaScript programs easier. It includes fast replacements for several built-in native methods such as .forEach, .map, .reduce etc, as well as common utility methods such as .clone.

...

How?

Thanks to advances in JavaScript engines such as V8 there is essentially no performance difference between native functions and their JavaScript equivalents, providing the developer is willing to go the extra mile to write very fast code. In fact, native functions often have to cover complicated edge cases from the ECMAScript specification, which put them at a performance disadvantage.

An example of such an edge case is sparse arrays and the .map, .reduce and .forEach functions:

var arr = new Array(100); // a sparse array with 100 slots

arr[20] = 'Hello World';

function logIt (item) { console.log(item); }

arr.forEach(logIt);

In the above example, the logIt function will be called only once, despite there being 100 slots in the array. This is because 99 of those slots are empty. To implement this behavior according to spec, the native forEach function must check whether each slot in the array has ever been assigned or not (a simple null or undefined check is not sufficient), and if so, the logIt function will be called.

However, almost no one actually uses this pattern - sparse arrays are very rare in the real world. But the native function must still perform this check, just in case. If we ignore the concept of sparse arrays completely, and pretend that they don't exist, we can write a JavaScript? function which comfortably beats the native version:

var fast = require('fast.js');

var arr = [1,2,3,4,5];

fast.forEach(arr, logIt); // faster than arr.forEach(logIt)

By optimising for the 99% use case, fast.js methods can be up to 5x faster than their native equivalents.

Caveats

As mentioned above, fast.js does not conform 100% to the ECMAScript specification and is therefore not a drop in replacement 100% of the time. There are at least two scenarios where the behavior differs from the spec:

    Sparse arrays are not supported. A sparse array will be treated just like a normal array, with unpopulated slots containing undefined values. This means that iteration functions such as .map() and .forEach() will visit these empty slots, receiving undefined as an argument. This is in contrast to the native implementations where these unfilled slots will be skipped entirely by the iterators. In the real world, sparse arrays are very rare. This is evidenced by the very popular underscore.js's lack of support.
    Functions created using fast.bind() and fast.partial() are not identical to functions created by the native Function.prototype.bind(), specifically:
        The partial implementation creates functions that do not have immutable "poison pill" caller and arguments properties that throw a TypeError upon get, set, or deletion.
        The partial implementation creates functions that have a prototype property. (Proper bound functions have none.)
        The partial implementation creates bound functions whose length property does not agree with that mandated by ECMA-262: it creates functions with length 0, while a full implementation, depending on the length of the target function and the number of pre-specified arguments, may return a non-zero length.
    See the documentation for Function.prototype.bind() on MDN for more details.

In practice, it's extremely unlikely that any of these caveats will have an impact on real world code. These constructs are extremely uncommon.

"

---

danabramov 13 hours ago

link

Reminds me of this comment by Petka Antonov on native V8 Promises being way slower than Bluebird[1]:

>I'd expect native browser methods to be an order of magnitude faster.

Built-ins need to adhere to ridiculous semantic complexity which only gets worse as more features get added into the language. The spec is ruthless in that it doesn't leave any case as "undefined behavior" - what happens when you use splice on an array that has an indexed getter that calls Object.observe on the array while the splice is looping?

If you implemented your own splice, then you probably wouldn't even think of supporting holed arrays, observable arrays, arrays with funky setters/getters and so on. Your splice would not behave well in these cases but that's ok because you can just document that. Additionally, since you pretty much never need the return value of splice, you can just not return anything instead of allocating a wasted array every time (you could also make this controllable from a parameter if needed).

[1]: https://github.com/angular/angular.js/issues/6697#issuecomme...

[2]: https://github.com/v8/v8/blob/master/src/promise.js

--

AshleysBrain 5 hours ago

link

This itself reminds me of similar performance issues hidden in C++. The float->int cast on x86 with Microsoft's compiler called a helper library function 'ftol'. Turns out this does loads of stuff to be spec compliant. Replacing it with a single not-spec-but-works-for-us x86 assembly instruction to convert was way faster.

So it's not just JS - it seems language built-ins are often slowed down by bureaucratic spec compliance, and hand-rolling code can get you a speedup.

reply

--

https://github.com/petkaantonov/bluebird/wiki/Optimization-killers

--

AshleysBrain 5 hours ago

link

Having written a major HTML5 game engine, I've ended up micro-optimizing JS code after small functions really did show up high in profiling measurements. One example: calculating a bounding box from a quad involved code along the lines of Math.min(a, b, c, d) followed by Math.max(a, b, c, d). Replacing that with a tree of ifs to determine both the minimum and maximum at once was faster and moved the bottleneck elsewhere.

reply
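The comment's if-tree trick might look something like this (a hypothetical reconstruction, not the engine's actual code): one pass of comparisons produces both extremes, replacing two variadic Math.min/Math.max calls.

```javascript
// Compute both min and max of four values with at most six comparisons,
// instead of two separate variadic Math.min/Math.max calls.
function minMax4(a, b, c, d) {
  var lo1, hi1, lo2, hi2;
  if (a < b) { lo1 = a; hi1 = b; } else { lo1 = b; hi1 = a; }
  if (c < d) { lo2 = c; hi2 = d; } else { lo2 = d; hi2 = c; }
  return {
    min: lo1 < lo2 ? lo1 : lo2,
    max: hi1 > hi2 ? hi1 : hi2
  };
}

minMax4(3, 1, 4, 2);  // { min: 1, max: 4 }
```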

--

throwaway_yy2Di 15 hours ago

link

I think in most cases where you'd worry about JS array performance you should use actual numeric arrays [0] rather than the kitchen-sink Array(). Also, I think those function abstractions have a pretty significant overhead?

[0] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Type...

(edit): Yeah, the abstraction overhead is ridiculous. Here's the forEach() benchmark again, compared to an explicit for loop (no function calls):

    // new benchmark in bench/for-each.js    
    exports['explicit iteration'] = function() {
        acc = 0;
        for (var j=0; j<input.length; ++j) {
            acc += input[j];
        }
    }

    Native .forEach() vs fast.forEach() vs explicit iteration
    ✓  Array::forEach() x 2,101,860 ops/sec ±1.50% (79 runs sampled)
    ✓  fast.forEach() x 5,433,935 ops/sec ±1.12% (90 runs sampled)
    ✓  explicit iteration x 28,714,606 ops/sec ±1.44% (87 runs sampled)
    Winner is: explicit iteration (1266.15% faster)

(I ran this on Node "v0.11.14-pre", fresh from GitHub.)

reply

thegeomaster 14 hours ago

link

I used this in a Firefox OS app, inside an implementation of Dijkstra's algorithm and an accompanying binary heap, and while I haven't run any rigorous benchmarks, I can say the runtime felt way better on my test phone when I rewrote the algorithm to use the typed arrays.

This is very often overlooked but extremely useful for implementations of fast algorithms in JavaScript that need to scale to a lot of input data.

reply
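As a sketch of the pattern (hypothetical code, not the app's implementation), here is a binary min-heap whose storage is a preallocated Float64Array instead of a plain Array, which avoids boxing and reallocation in the hot loop of something like Dijkstra's algorithm:

```javascript
// Binary min-heap backed by a preallocated Float64Array.
function Heap(capacity) {
  this.data = new Float64Array(capacity);
  this.size = 0;
}

Heap.prototype.push = function (v) {
  var d = this.data, i = this.size++;
  d[i] = v;
  while (i > 0) {                     // sift up
    var parent = (i - 1) >> 1;
    if (d[parent] <= d[i]) break;
    var tmp = d[parent]; d[parent] = d[i]; d[i] = tmp;
    i = parent;
  }
};

Heap.prototype.pop = function () {
  var d = this.data, top = d[0];
  d[0] = d[--this.size];
  var i = 0;
  for (;;) {                          // sift down
    var l = 2 * i + 1, r = l + 1, smallest = i;
    if (l < this.size && d[l] < d[smallest]) smallest = l;
    if (r < this.size && d[r] < d[smallest]) smallest = r;
    if (smallest === i) break;
    var tmp = d[smallest]; d[smallest] = d[i]; d[i] = tmp;
    i = smallest;
  }
  return top;
};

var h = new Heap(8);
h.push(5); h.push(1); h.push(3);
h.pop();  // 1
```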

phpnode 14 hours ago

link

Regarding your edit, you're exactly right; of course a for loop will be faster. Sometimes you really do need a function call though, in which case fast forEach and map implementations become more useful.

The next step for fast.js is a set of sweet.js macros which will make writing for loops a bit nicer, because it's pretty painful to write this every time you want to iterate over an object:

    var keys = Object.keys(obj),
        length = keys.length,
        key, i;
    for (i = 0; i < length; i++) {
      key = keys[i];
      // ...
    }

I'd rather write:

   every key of obj {
     // ...
   }

and have that expanded at compile time.

Additionally there are some cases where you must use inline for loops (such as when slicing arguments objects, see https://github.com/petkaantonov/bluebird/wiki/Optimization-k...) and a function call is not possible. These can also be addressed with sweet.js macros.

reply
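One concrete case from that wiki: passing `arguments` to `Array.prototype.slice` leaks the arguments object and deoptimizes the function in V8, so the standard workaround is an explicit copy loop. A minimal sketch:

```javascript
// Copy `arguments` into a real array without passing the arguments object
// to another function (which the Bluebird wiki flags as an optimization
// killer in V8).
function restArgs() {
  var args = new Array(arguments.length);
  for (var i = 0; i < arguments.length; i++) {
    args[i] = arguments[i];
  }
  return args;
}

restArgs(1, 'a', true);  // [1, 'a', true]
```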

throwaway_yy2Di 14 hours ago

link

To be fair, it's not obvious if you're not a JS expert: coming from some other language, you could naively assume that the function call gets inlined, with no overhead.

reply

--

this article describes the things left out of Bionic, a simpler libc variant for Android:

http://codingrelic.geekhold.com/2008/11/six-million-dollar-libc.html

--

python and hoon both support absolute dates and durations (relative dates)

--

http://toolz.readthedocs.org/en/latest/

---

in python:

hcarvalhoalves 1 day ago

link

Split without argument is equivalent to:

    >>> filter(None, " quick   hack for  split".split(" "))
    ['quick', 'hack', 'for', 'split']

reply

using split with a ' ' argument can give surprising results when there are multiple spaces:

klibertp 1 day ago

link

It's not weird, it's wrong:

    In [1]: "a    string     r".split(" ")
    Out[1]: ['a', '', '', '', 'string', '', '', '', '', 'r']
    In [2]: "a    string     r".split()
    Out[2]: ['a', 'string', 'r']

reply

---

http://typelevel.org/

---

shyam says the plotting fn in mathematica is good

---

file directory stuff should be done via graphs, e.g. instead of doing 'ls()', do 'pwd()', which gives you a node for the current directory, and then look at the children of that node. If one of those children is 'dir2', instead of doing 'ls(dir2)', just look at dir2's children. Instead of 'mkdir()', just create a new directory node and attach it as a child of the result of 'pwd()'.

note that this structure is lazy. note that if symbolic links are followed (either choice should be allowed, perhaps as an option when you get the graph? or as a view? or level? i'm guessing a view), then cycles are possible?

---

https://github.com/facebook/immutable-js

---

https://github.com/facebook/immutable-js about https://news.ycombinator.com/item?id=8107447

http://swannodette.github.io/mori/ "a JavaScript API for using ClojureScript's persistent data structures:" "Mori also provides clojure like functionality for the new collections greatly simplifying how you would use them - if you don't use Clojure you might think of the additional functions as an underscore for immutable collections." [1]

---

i find i use Octave when i just want to plot a simple equation, e.g.

    $ octave
    octave:1> plot(1000*(1 + .05/3).^[1:180])

jasper should offer close to this level of convenience for that, too

---

https://news.ycombinator.com/item?id=8139174

rhettg 3 days ago

link

I think Arrow still leaves too much confusion around how to handle timezones. That's why I wrote DMC(https://github.com/rhettg/dmc).

Arrow: better dates and times for Python¶

http://crsmithdev.com/arrow/

calpaterson 3 days ago

link

Suggestion: rename arrow.now() to arrow.localnow() to make it even clearer that it does not generate utc. I've run into this mistake many times with datetime.now() vs datetime.utcnow()

reply

googletron 3 days ago

link

Arrow is ambiguous when dealing with timezones; a lot of the high-level functionality ideas were copied from Delorean. Delorean has many more sensible defaults, is clear about educating people about naive vs. localized datetimes, and is a lot more sane.

The documentation is very educational too, well worth the read. http://delorean.readthedocs.org/en/latest/

reply

stuaxo 2 days ago

link

It would be good if the manual had all the examples that moment.js has (which arrow was inspired by).

reply

fletchowns 3 days ago

link

Is there any language that got the date & time library right the first time? It seems like people always suggest to use a third-party library. JavaScript has Moment.js, Java has Joda-Time, and now Python has Arrow.

reply

jbarham 3 days ago

link

Go's time package is nice (http://golang.org/pkg/time/), although some people complain about parsing date/time strings using a reference string ("Mon Jan 2 15:04:05 -0700 MST 2006") vs. PHP or Python style format strings.

reply

sjy 3 days ago

link

That's a really cool idea. I'm always commenting my code with reference strings because format strings are so unreadable.

reply

spankalee 3 days ago

link

The Dart core library team was very conservative regarding dates and time for this reason.

They kept things down to just DateTime (timezone-aware, but only supports local and UTC) and Duration, and assumed that anything more advanced should be outside the core.

I'm not sure if they got it right or not, but we haven't seen anyone want to replace the built-in types yet. I think we've always expected a Joda-Time-like library eventually, but hopefully DateTime and Duration are good, so it's additive.

(I'm on the Dart team, but not on the core lib team)

reply

---

---

"

Q: What's the point of map in Haskell, when there is fmap?

Everywhere I've tried using map, fmap has worked as well. Why did the creators of Haskell feel the need for a map function? Couldn't it just be what is currently known as fmap and fmap could be removed from the language?

A:

Historical reasons.

First came map, because, hey, there were lists.

Then someone said: "Let there be functors!". And was somewhat miffed, b/c map was already taken. So they said "screw it, call it fmap."

And it was so.

Then Functor became a part of the standard library, and everybody said "this fmap name is lame, but we don't want to change the name of map, because that might break stuff."

So they did not.

Edit: Or, as the case actually is, I'm wrong: see augustss's comment below.

-- rampion

That's not actually how it happens. What happened was that the type of map was generalized to cover Functor in Haskell 1.3. I.e., in Haskell 1.3 fmap was called map. This change was then reverted in Haskell 1.4 and fmap was introduced. The reason for this change was pedagogical; when teaching Haskell to beginners the very general type of map made error messages more difficult to understand. In my opinion this wasn't the right way to solve the problem. – augustss Jul 26 '11 at 8:47 ... augustss is Lennart Augustsson, who for all practical purposes has been part of the Haskell community since before Haskell existed, cf. A History of Haskell " -- http://stackoverflow.com/questions/6824255/whats-the-point-of-map-in-haskell-when-there-is-fmap

seems to me that this (and ByteString) are exactly the sorts of cases that my 'everything is an interface' idea would help with

---

http://www.haskell.org/haskellwiki/Foldable_and_Traversable

---

http://stackoverflow.com/questions/3529439/haskell-coding-style-map-fmap-or

these 3 are all the same:

    map toLower "FOO"
    fmap toLower "FOO"
    toLower <$> "FOO"

... <$> is the same as `fmap`, and map is just a less general form of fmap (fmap specialized to lists, per the augustss comment above).

---

haskell's 'functor' should be called 'container' in jasper, and 'pointed' should be called 'defaultable'

---