https://blog.heroku.com/archives/2014/3/11/node-habits
--
probs with Django ORM:
https://speakerdeck.com/alex/why-i-hate-the-django-orm
peewee ORM:
http://peewee.readthedocs.org/en/latest/peewee/upgrading.html#upgrading
http://peewee.readthedocs.org/en/latest/
--
http://reinout.vanrees.org/weblog/2013/08/21/programmatical-all-range.html
--
" Prehistorical Python: patterns past their prime - Lennart Regebro¶
Tags: django, djangocon Dicts
This works now:
>>> from collections import defaultdict
>>> data = defaultdict(list)
>>> data['key'].append(42)
It was added in Python 2.5. Previously you'd do a manual check for whether the key exists and create it if it's missing.

Sets
Sets are very useful. Sets contain unique values. Lookups are fast. Before you’d use a dictionary:
>>> d = {}
>>> for each in list_of_things:
...     d[each] = None
>>> list_of_things = d.keys()
Now you’d use:
>>> list_of_things = set(list_of_things)
Sorting
You don’t need to turn a set into a list before sorting it. This works:
>>> something = set(...)
>>> nicely_sorted = sorted(something)
Previously you'd turn the set into a list, call some_list.sort() on it, and then turn it back.

Sorting with cmp
This one is old:
>>> def compare(x, y):
...     return cmp(x.something, y.something)
>>> sorted(xxxx, cmp=compare)
The new way is to use a key function. That gets you one call per item; the old comparison function takes two items, so you get a whole lot more calls. Here's the new way:
>>> def get_key(x):
...     return x.something
>>> sorted(xxxx, key=get_key)
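The same key function can also be written with operator.attrgetter from the standard library (my own addition, not from the talk):

```python
from collections import namedtuple
from operator import attrgetter

# A stand-in type for the x.something objects in the example above.
Item = namedtuple('Item', 'something')
items = [Item(3), Item(1), Item(2)]

# attrgetter('something') builds the same key function as get_key above.
nicely_sorted = sorted(items, key=attrgetter('something'))
# -> [Item(something=1), Item(something=2), Item(something=3)]
```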
Conditional expressions
This one is very common!
This old one is hard to debug because it breaks when blank_choice itself evaluates to something falsy:
>>> first_choice = include_blank and blank_choice or []
There’s a new syntax for conditional expressions:
>>> first_choice = blank_choice if include_blank else []
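A small sketch (my own, not from the talk) of why the old idiom is buggy: when blank_choice happens to be falsy, the and/or chain silently falls through to the default, while the conditional expression preserves it.

```python
include_blank = True
blank_choice = None  # falsy -- the troublesome case mentioned above

# Old and/or idiom: (True and None) is None, and (None or []) is [],
# so blank_choice gets silently replaced by the default.
old_style = include_blank and blank_choice or []

# Conditional expression: blank_choice is returned as-is.
new_style = blank_choice if include_blank else []

# old_style == [] while new_style is None
```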
Constants and loops
Put constant calculations outside of the loop:
>>> const = 5 * a_var
>>> result = 0
>>> for each in some_iterable:
...     result += each * const
Someone suggested this as an outdated pattern, claiming that you can put the calculation inside the loop and Python will detect it and run just as fast. He tried it out, and it turns out to depend a lot on the kind of calculation, so just stick with the example above.

String concatenation
Which of these is faster:
>>> ''.join(['some', 'string'])
>>> 'some' + 'string'
It turns out that the first one, which most of us use because it is supposedly faster, is actually slower for a case like this! So for a couple of strings, just use +.
Where does that join come from then? Here. This is slow:
>>> result = ''
>>> for text in make_lots_of_tests():
...     result += text
And this is fast:
>>> result = ''.join(make_lots_of_tests())
The reason is that in the first example, the result text is copied in memory over and over again.
So: use ''.join() only for joining lists. This also means that you can effectively do what looks good: nobody will concatenate lots of separate strings over several lines in their source code; you'd just use a list there. For just a few strings, just concatenate them. " -- http://reinout.vanrees.org/weblog/2013/05/17/prehistorical-python.html
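A quick way to check the claim above (my own sketch, not from the post) is to time both patterns with timeit:

```python
import timeit

def concat(texts):
    # Repeated += may copy the growing result string over and over.
    result = ''
    for text in texts:
        result += text
    return result

def join(texts):
    # str.join computes the total size once and builds the result in one pass.
    return ''.join(texts)

texts = ['chunk'] * 1000
print('+= loop:', timeit.timeit(lambda: concat(texts), number=1000))
print('join:   ', timeit.timeit(lambda: join(texts), number=1000))
```

(Note that CPython can sometimes resize a string in place when its reference count is 1, so the += loop is not always quadratic in practice.)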
---
" Introduce jsonb, a structured format for storing json.
The new format accepts exactly the same data as the json type. However, it is stored in a format that does not require reparsing the original text in order to process it, making it much more suitable for indexing and other operations. Insignificant whitespace is discarded, and the order of object keys is not preserved. Neither are duplicate object keys kept - the later value for a given key is the only one stored.
The new type has all the functions and operators that the json type has, with the exception of the json generation functions (to_json, json_agg etc.) and with identical semantics. In addition, there are operator classes for hash and btree indexing, and two classes for GIN indexing, that have no equivalent in the json type. "
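Python's json module happens to show the same last-duplicate-wins behavior for object keys that the commit message describes for jsonb (an analogy only; this sketch is mine, not from the commit):

```python
import json

# Duplicate object keys: only the later value for a given key survives.
parsed = json.loads('{"key": 1, "key": 2, "other": 3}')
# parsed == {'key': 2, 'other': 3}
```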
---
http://stackoverflow.com/questions/6430448/why-doesnt-gcc-optimize-aaaaaa-to-aaaaaa
--
http://spin.atomicobject.com/2014/03/25/c-single-member-structs/
summary:
(a) The C sizeof operator behaves surprisingly with arrays: applied to an actual array object (e.g. a local array declaration) it gives the size in bytes of the whole array, but applied to a pointer (which is all you have after a heap allocation, or once an array decays to a pointer when passed to a function) it gives the size of the pointer. This can be dealt with either by using an array typedef or by wrapping the array in a one-element struct; either way, sizeof applied to the type (or to a value of the struct type) reliably gives the full array size.
(b) Unwanted type coercion: say you have two types that are both ints but in different units, e.g. seconds and milliseconds, and you want the compiler to check that you aren't adding them together. A typedef won't do this; if you create 'second' and 'millisecond' typedefs that are both ints and then add values of these types together, the compiler looks past the typedefs, sees they are really both ints, and coerces them so the addition is allowed. But if you wrap each in a one-element struct, no operations are defined between them and you must define e.g. addition manually.
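The single-member-struct trick in (b) has a rough Python analogue (my own sketch, not from the article): wrap each unit in its own tiny class so values of different units cannot be mixed by accident.

```python
class Seconds:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Only allow adding matching units.
        if not isinstance(other, Seconds):
            return NotImplemented
        return Seconds(self.value + other.value)

class Milliseconds:
    def __init__(self, value):
        self.value = value

total = Seconds(2) + Seconds(3)   # fine: total.value == 5
# Seconds(2) + Milliseconds(3)    # raises TypeError: units don't mix
```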
--
Determinism Is Not Enough: Making Parallel Programs Reliable with Stable Multithreading
Junfeng Yang, Heming Cui, Jingyue Wu, Yang Tang, and Gang Hu, "Determinism Is Not Enough: Making Parallel Programs Reliable with Stable Multithreading", Communications of the ACM, Vol. 57 No. 3, Pages 58-69.
We believe what makes multithreading hard is rather quantitative: multithreaded programs have too many schedules. The number of schedules for each input is already enormous because the parallel threads may interleave in many ways, depending on such factors as hardware timing and operating system scheduling. Aggregated over all inputs, the number is even greater. Finding a few schedules that trigger concurrency errors out of all enormously many schedules (so developers can prevent them) is like finding needles in a haystack. Although Deterministic Multi-Threading reduces schedules for each input, it may map each input to a different schedule, so the total set of schedules for all inputs remains enormous.
We attacked this root cause by asking: are all the enormously many schedules necessary? Our study reveals that many real-world programs can use a small set of schedules to efficiently process a wide range of inputs. Leveraging this insight, we envision a new approach we call stable multithreading (StableMT) that reuses each schedule on a wide range of inputs, mapping all inputs to a dramatically reduced set of schedules. By vastly shrinking the haystack, it makes the needles much easier to find. By mapping many inputs to the same schedule, it stabilizes program behaviors against small input perturbations.
The link above is to a publicly available pre-print of the article that appeared in the most recent CACM. The CACM article is a summary of work by Junfeng Yang's research group. Additional papers related to this research can be found at http://www.cs.columbia.edu/~junfeng/
By Allan McInnes