notes-computer-whyILikePython

Python is my current favorite programming language, and i'm not sure why.

I'm going to write down some ideas i have about why i like it, in an attempt to make progress on finding out. Since i don't yet understand my own thoughts here, this will be more a stream of consciousness than a tight essay. Maybe later, after i understand better what i'm trying to say, i'll rewrite it.

Note: i think the 'Zen of Python', http://legacy.python.org/dev/peps/pep-0020/ , has a lot to do with what i'm talking about here.

Hacky vs. overengineered

Perhaps there is a dimension of design that ranges between 'hacky' on the one extreme and 'overengineered' on the other.

'Hacky' has something in common with 'worse is better' but i don't mean to include the focus on implementation simplicity over interface simplicity that the 'worse is better' essay has.

I don't know quite what i mean by 'Hacky', but roughly it means producing a system with some 'rough edges': a system that may sometimes need some manual hand-holding, but which is simple and doesn't have too many things that the operator (in this case, the programmer) needs to remember.

This sounds like it implies that implementation simplicity was preferred over interface simplicity (you might reason: the system needs manual hand-holding because making it do everything automatically would have required a complex implementation, so a simple implementation was chosen at the price of a slightly more complicated interface), but i don't think that's what i'm getting at. I think that sometimes making an interface without 'rough edges' may inherently require making the interface complex, regardless of the implementation difficulty.

In other words, perhaps a tradeoff of this sort between simplicity and other design goals may occur in the design of the interface even if you allow arbitrarily complex implementations.

Are there other ideas in the 'worse is better' article that might help us make the concept of 'rough edges' more concrete?

The 'worse is better' article subdivides the 'MIT approach' (i.e. 'The Right Thing') into two: the 'big complex system' and the 'diamond-like jewel'.

Both 'worse is better' (the 'New Jersey approach') and the 'diamond-like jewel' have the sort of simplicity that i am talking about here.

The 'worse is better' essay talks about 4 design goals: simplicity, correctness, consistency, completeness. Indeed the 'rough edges' i am talking about do seem to have to do with deprioritizing completeness, and slightly prioritizing simplicity over consistency.

I'm not sure if the 'rough edges' i am talking about are allowed to prioritize simplicity over correctness, as in the 'worse is better' essay.

One idea is to think about the cognitive load on the system operator (here, the programmer).

If the system is sometimes incorrect, the operator must always remember those scenarios.

I think this makes the system less simple, and so probably that's not the kind of thing i mean when i say 'rough edges'. It's okay for the system to be incomplete, but not incorrect. Note that one way out is for the system itself to recognize situations when it would be incorrect, and instead of running in those situations, to halt and issue an error, thereby turning incorrectness into incompleteness. Note that this solution exactly corresponds to Python's 'refuse the temptation to guess'.
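
To make this concrete, here is a small example of 'refuse the temptation to guess' in action (my own illustration, not from the Zen): where some languages would silently coerce, Python halts with an error, turning potential incorrectness into incompleteness:

    # Python refuses to guess what adding a number to a string should mean;
    # instead of silently coercing (the way, say, JavaScript evaluates
    # '1' + 2), it halts with an error.
    try:
        result = 1 + "2"
    except TypeError as e:
        print("refused to guess:", e)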

The tradeoff in terms of completeness is obvious. On the one hand, you could build a 'large' system with built-in functionality that perfectly matches every conceivable use case. The downside (neglecting the difficulty of building this) is that there will be a huge number of features for users to learn. You might say: so what, just use the features you need; the others don't hurt you. But since a computer programming language is a LANGUAGE, they do hurt you, because others might use the features you neglected; in order to be able to read and understand all source code written in the language, you have to learn the whole thing.

Which tangentially brings up the point that i think Python's READABILITY is one of the key attributes that cause me to like it so much.

So one would expect a 'hacky' system to provide not a comprehensive system that elegantly solves every use case, but something closer to a minimal basis set or toolbox that can solve every use case, though usually only with some fiddling. The cost of the fiddling must be traded off against the cost of making the underlying system more complex by adding a zillion features.
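
A small illustration of the 'minimal toolbox plus fiddling' idea (my example): Python's itertools doesn't ship a sliding-window primitive, but one can be fiddled together from the primitives it does provide; this is essentially the 'pairwise' recipe from the itertools documentation (and newer Pythons have since added itertools.pairwise itself):

    from itertools import tee

    def pairwise(iterable):
        # a sliding window of size 2, assembled by 'fiddling' with the
        # primitives tee() and zip() rather than provided as a built-in
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)

    print(list(pairwise([1, 2, 3, 4])))  # [(1, 2), (2, 3), (3, 4)]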

A 'diamond-like jewel' might push this to the logical extreme, actually including only a minimal basis set, or something close to it. But a 'hacky' system is more focused on practicality, and is willing to include redundant primitives if the complexity penalty of doing so can be justified by their practical usefulness.

Now, in terms of cognitive load, one would think that consistency would be a high priority. After all, a highly consistent system has fewer things that you need to learn in order to use it, right?

Yes, but here is where i think there is a key tradeoff in the interface. In fact, two tradeoffs. First of all, you might have to introduce more conceptual primitives in order to achieve consistency. So, you decrease the count of edge cases, but you increase the count of primitives.

Second, you might have to use conceptual primitives that are qualitatively more complex in order to achieve consistency. So, you decrease the count of edge cases, but you increase the cognitive load in learning and perhaps in using the system.

Now, i move on to another point. I've spoken of minimizing the number and weight of conceptual primitives and of edge cases. I've spoken of a 'minimal basis set' of primitives. In chapter 2 of "Beautiful Architecture: Leading Thinkers Reveal the Hidden Beauty in Software Design", Pete Goodliffe says one attribute of good design is "unifying concepts drawing the separate parts together". Peopleware, chapter 33, states: "Projects begin with planning and design, activities that are best carried out by a smallish team." There is a widely held sense that "design by committee" is a bad idea. There is a widely held sense that part of 'good design' is 'saying no'. Here's one theory for these things:

It's bad to have a high number of conceptual elements in the system. Why? Because this creates cognitive load on the operator, as well as adding complexity to the design of the implementation, thereby making that design harder to change (and also making the implementation more expensive to create, but we're not worrying about that here). One approach is to choose an orthogonal (minimal (non-redundant) but sufficient) set of conceptual primitives, and then possibly to throw in a few more elements for convenience. Taken to the extreme, this is the 'diamond-like jewel' approach. Taken to less of an extreme, this is part of Python's "only one obvious way to do it" approach, the opposite of Perl's "There's more than one way to do it" approach.

But often there is more than one possible way to break a system's functionality down into a basis set; that is, there may be multiple basis sets, each of which could, if necessary, serve as a basis for implementing any of the others. For example, a Turing-complete computer could be architected as a Turing machine, as lambda calculus, as combinatory logic, etc. Each of these architectures is a sort of 'minimal basis set' for computation, illustrating the point that there may be multiple, quite different 'minimal basis sets'. So the designer who wants to minimize the number of conceptual elements in the system probably wants to choose one (or a few) of these basis sets and then reject the rest as redundant, given that choice. But if there is a committee of designers, and different people on the committee have strong attachments to different basis sets, then the politically likely outcome is a compromise in which every basis set is included.
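
A toy illustration of 'multiple basis sets' (my example): the same function can be built on iteration, on recursion, or on higher-order functions, and a language designer could in principle privilege any one of the three as primitive:

    from functools import reduce

    def fact_loop(n):   # iteration as the basis
        result = 1
        for i in range(2, n + 1):
            result *= i
        return result

    def fact_rec(n):    # recursion as the basis
        return 1 if n < 2 else n * fact_rec(n - 1)

    def fact_fold(n):   # higher-order functions as the basis
        return reduce(lambda a, b: a * b, range(2, n + 1), 1)

    assert fact_loop(5) == fact_rec(5) == fact_fold(5) == 120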

In short: a simple design often privileges one set of conceptual primitives over others in order to reduce the number of conceptual primitives in the design.

Another thing i want to talk about is leaky abstractions. 'Hacky' designs tend not to provide even leaky abstractions, whereas 'overengineered' designs provide them in spades, leading to headaches when you try to debug a shaky tower of leaky abstractions. Hurray for those hacky systems that keep you 'close to the metal', away from all that 'enterprisey', 'frameworky' complexity, right? This also seems to line up with the 'implementation simplicity over interface simplicity' idea in the Worse is Better essay. But it is not entirely true. For example, Python, although somewhat 'hacky', provides garbage collection, various transparent optimizations, and abstraction of the call stack.

I think what you see here is related to the degree of leakiness of the abstraction. Some abstractions, such as language-provided garbage collection, are less leaky than others (such as remote procedure calls over a network). In fact, in terms of program semantics, disregarding runtime and memory requirements, garbage collection can be leakless (and if one only cares about asymptotic memory leaks, garbage collection as in Python is almost leakless; you only have to worry about a few edge cases, such as reference cycles of objects with destructors, and even there i think i heard about some recent progress, although i don't remember the details). Similarly, call-stack abstraction is 'almost leakless' unless the stack overflows. And some other abstractions are positively leakless (such as some provably correct optimizations that an interpreter or compiler does). The abstractions that annoy are the ones that are more leaky.
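
A sketch of that reference-cycle edge case (i believe the recent progress i'm half-remembering is PEP 442, which as of Python 3.4 lets the cycle collector finalize such objects; before that, cycles of objects with __del__ were parked uncollected in gc.garbage):

    import gc

    class Node:
        def __del__(self):
            print("finalized")

    a, b = Node(), Node()
    a.partner, b.partner = b, a   # a reference cycle
    del a, b                      # reference counting alone can't reclaim it
    gc.collect()                  # the cycle collector can; on Python 3.4+
                                  # this prints "finalized" twice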

But garbage collection is definitely a very 'frameworky' kind of abstraction, right?

The reason garbage collection fits well in Python, despite its 'frameworkyness', is because it almost solely subtracts operator complexity, adding very little.

So i think systems like Python seem to embrace the less leaky abstractions, especially the ones that subtract a lot of operator complexity, even if they lead to more implementation complexity. It's only the more leaky ones that they avoid.

comments on The Zen Of Python:

The Zen of Python

    Beautiful is better than ugly.
    Explicit is better than implicit.
    Simple is better than complex.
    Complex is better than complicated.
    Flat is better than nested.
    Sparse is better than dense.
    Readability counts.
    Special cases aren't special enough to break the rules.
    Although practicality beats purity.
    Errors should never pass silently.
    Unless explicitly silenced.
    In the face of ambiguity, refuse the temptation to guess.
    There should be one-- and preferably only one --obvious way to do it.
    Although that way may not be obvious at first unless you're Dutch.
    Now is better than never.
    Although never is often better than *right* now.
    If the implementation is hard to explain, it's a bad idea.
    If the implementation is easy to explain, it may be a good idea.
    Namespaces are one honking great idea -- let's do more of those!
    Beautiful is better than ugly. -- e.g. the first criterion is an ill-defined one, "beauty".
    Explicit is better than implicit. -- this is a core 'hacky' value; by leaving what the system is doing visible rather than hiding it, it is easier to read and understand.
    Simple is better than complex. -- simplicity.
    Complex is better than complicated. -- this one is hard to interpret, because it is unclear whether the author is using the precise dictionary definitions of 'complex' and 'complicated' or some other idiosyncratic understanding.
    Flat is better than nested. -- this is a specific prescription about data structures and namespaces.
    Sparse is better than dense. -- not sure what this one means, but i'm guessing it refers to syntax and the desire for (a) whitespace and (b) code spread across many lines, rather than a few very dense lines that each do many things.
    Readability counts. -- readability.
    Special cases aren't special enough to break the rules. -- consistency.
    Although practicality beats purity. -- exceptions to consistency.
    Errors should never pass silently.
    Unless explicitly silenced. -- this is a specific prescription, but it also accords with 'refuse the temptation to guess' and therefore with correctness (see the sketch after this list).
    In the face of ambiguity, refuse the temptation to guess. -- i covered this under 'correctness', above.
    There should be one-- and preferably only one --obvious way to do it. -- this relates to cognitive economy, as noted above, and indirectly, via cognitive economy, to readability.
    Although that way may not be obvious at first unless you're Dutch. -- an acknowledgement that the language's design choices will sometimes reflect the idiosyncrasies of the BDFL, van Rossum, rather than human universals.
    Now is better than never.
    Although never is often better than *right* now. -- this is a statement about how good a proposal has to be before it is added to the language. The procedure seems to be: if something is sorely needed, we think about it for a little while and then do the best thing we can think of, even if it's not clearly The Right Thing.
    If the implementation is hard to explain, it's a bad idea.
    If the implementation is easy to explain, it may be a good idea. -- this is a statement on the value of implementation simplicity. To me it suggests that implementation simplicity tends to relate to interface simplicity, although i may be misreading it.
    Namespaces are one honking great idea -- let's do more of those! -- this is a specific prescription.
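
A small sketch of 'errors should never pass silently / unless explicitly silenced' (my example, referred to above): silencing an error in Python takes a visible, deliberate construct in the source:

    import contextlib

    # explicit silencing via contextlib.suppress (Python 3.4+):
    with contextlib.suppress(FileNotFoundError):
        open("/no/such/file")

    # the bare equivalent is just as explicit:
    try:
        open("/no/such/file")
    except FileNotFoundError:
        pass  # deliberately ignored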

http://web.archive.org/web/20100310233706/http://www.python.org/dev/culture also says: "Python tries to keep things simple, to be orthogonal but not too much so, and to assist the programmer as much as possible."

"Do the simplest thing that can possibly work".

"Correctness and clarity before speed."

"The developers aren't interested in making the interpreter run faster at the expense of unreadable or hard-to-follow tricky code. In the past working patches have been rejected because they would have made the code too difficult to maintain."

todo: reread Self:notes-books-beautifulArchitecture and make other relevant comments here

todo: integrate into programming languages book, design chapter. Notably, this talks about hierarchical decomposition and about how things should be placed in the subpart where people expect them to be (implying non-redundancy and There's Only One Obvious Way To Do It, so that most people have the same expectations)

todo: read

    “Hints for Computer System Design”, by Butler W. Lampson
    “The Interaction of Architecture and Operating System Design”, by Thomas E. Anderson, et al.
    “Lisp: Good News, Bad News, How to Win Big”, by Richard P. Gabriel

http://webcache.googleusercontent.com/search?q=cache:g5zes4bEWmgJ:ece.ut.ac.ir/Classpages/S87/ECE404/Lectures/design-lec2.ppt+&cd=8&hl=en&ct=clnk&gl=us&client=ubuntu

note: in On Lisp, section 1.2, noted Lisp proponent Graham extols a style of software development in which you extend the language to your domain and iteratively refactor your program. But he also says "this style of development is better suited to programs which can be written by small groups". Why is that? He doesn't say. But in a footnote on the previous page, he notes that some people object that if you extend the language, then people have to learn the extensions in order to understand the program, and refers you to Section 4.8 for his rebuttal to this disadvantage. His rebuttal in 4.8 says that if you compare the program with the language extensions to the same program written straightforwardly (without language extensions), the straightforward implementation will necessarily be longer and more redundant (as if you had inlined library calls), provided that each language extension is used at least 3 times or so in different places. He implies (and states in his other essays) that it's better for code to be shorter, denser, and less redundant.
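
To make Graham's bookkeeping concrete in Python terms (my own sketch, not his example; retry() and flaky() are invented for illustration): a small 'extension' defined once and used at several call sites is shorter than inlining its logic at each site, which is essentially his inlined-library-calls point:

    import time

    def retry(f, attempts=3, delay=1.0):
        # the 'language extension': defined once, reused at many call sites
        for i in range(attempts):
            try:
                return f()
            except Exception:
                if i == attempts - 1:
                    raise
                time.sleep(delay)

    calls = {"n": 0}
    def flaky():
        # a stand-in for an operation that fails transiently
        calls["n"] += 1
        if calls["n"] < 3:
            raise OSError("transient failure")
        return "ok"

    print(retry(flaky, delay=0.0))  # 'ok' on the third attempt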

but the Lisp Curse essay suggests that there are problems with this state of affairs:

http://www.winestockwebdesign.com/Essays/Lisp_Curse.html

namely: because it is so easy to extend Lisp, it is easy to make frameworks, and therefore many competing frameworks are created, rather than one or two that become canonical, and so none of them receives enough contributor labor. Furthermore, because individuals can go further alone with Lisp, cooperation in building libraries and frameworks is less necessary, so Lisp differentially attracts lone-wolf types, which is another reason that a good library ecosystem is not created.

so another characteristic of Python that i like is that it creates good communal gravity, i.e. a tendency for the community to standardize on a small number of 'canonical' libraries and frameworks (ideally this would be balanced, in a power-law/long-tail way, with 'gravitational fluidity': a tendency for the community to smoothly migrate and re-standardize on new entrants that appear and displace existing canonical libraries and frameworks).

but i'd like to postulate another explanation for why Graham might have said "this style of development is better suited to programs which can be written by small groups" (i should really just email him and ask sometime). Consider a large codebase with a large team of people. Different people specialize in different hierarchically separated parts of the codebase, but they also periodically find themselves working on the parts they aren't specialized in, 'hacking' on code that they don't really understand too well. Assume that people are constantly being hired and fired and leaving for different jobs elsewhere, so that if you look at any piece of code, the probability that someone who authored or worked with that code is still on the team is low. In such a situation, readability is primary.

If a language extension is used 5 times, but in five different parts of the codebase, then any single programmer will probably only run into it once, so it's not worth their time to learn it.

Also, contrary to Graham, i think that denser code may not always be better for readability. Concepts at a higher level of abstraction are harder to think about than more concrete ones. Often you find a piece of English text that is hard to read, and you might think, 'Jeez, if they had taken a full page to explain this concretely instead of explaining it abstractly in a single paragraph, it would have actually allowed me to have read it more quickly, even though there would have been more text'. The same can be true with programming. Perhaps this is why the Zen of Python has "sparse is better than dense".
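
For instance (my example), the same computation written densely and then sparsely:

    groups = {"a": [("x", 1, True), ("y", 2, False)],
              "b": [("z", 3, True)]}

    # dense: one line doing many things at once
    totals = {k: sum(amt for (_, amt, ok) in v if ok) for k, v in groups.items()}

    # sparse: the same logic spread out, one idea per line
    totals = {}
    for key, items in groups.items():
        valid_amounts = [amt for (_name, amt, ok) in items if ok]
        totals[key] = sum(valid_amounts)

    print(totals)  # {'a': 1, 'b': 3}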

In addition, contrary to Graham, extending the language with control flow macros has a cost in terms of increasing the count of conceptual primitives, in a way that extending a program with subroutines does not.

In addition, in rebuttal to Graham: control flow macros must be mastered in order to even understand the structure of a program, as opposed to subroutines, which need only be understood in order to understand the program's content. So if you are purposefully browsing through a large codebase mostly written by others, you cannot effectively navigate unless you master each control flow macro that you come across in relevant code; the fewer of these there are to learn, the better.
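
The closest thing Python has to a control flow macro is probably a custom context manager, and even there the cost is visible (my sketch; ignoring() is invented for illustration): a subroutine call leaves the surrounding control flow intact, whereas ignoring() changes whether the following lines run at all, so a reader must master it before they can even navigate the code:

    from contextlib import contextmanager

    @contextmanager
    def ignoring(exc_type):
        # a control-flow 'extension': it silently decides whether the
        # code after a failing line executes at all
        try:
            yield
        except exc_type:
            pass

    def total(xs):
        # a plain subroutine: control flow at the call site is unchanged
        return sum(xs)

    with ignoring(ZeroDivisionError):
        print(1 / 0)         # raises, so...
        print("unreached")   # ...this line never runs; you must understand
                             # ignoring() to know that

    print(total([1, 2, 3]))  # 6; here the structure is obvious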

These are more motivations for Python's 'Only One Obvious Way To Do It', and most likely for van Rossum's resistance to adding anything like macros to Python.

(in my ideal language, you would have macros for when you needed them, but there would also be countervailing forces to discourage you from using them unnecessarily; e.g. just as, in the design of the language, you strive for cognitive economy and try not to introduce too many conceptual primitives, introducing a new conceptual primitive via a macro is a somewhat expensive decision in terms of readability and should be somehow discouraged, but not forbidden. E.g. if a macro would be rather abstract, would take only one line to write, and would be used 5 times, saving one line of code each time, that's probably not enough to justify its readability cost; but if it would be used 100 times and save 10 lines of code each time, that might be enough).
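
The bookkeeping behind that judgment, as a toy cost model (the numbers are the ones from the parenthetical above; the readability penalty itself is left unquantified):

    def net_lines_saved(uses, lines_saved_per_use, definition_length=1):
        # lines saved across the codebase, minus the cost of the definition
        return uses * lines_saved_per_use - definition_length

    print(net_lines_saved(5, 1))     # 4: probably not worth a new primitive
    print(net_lines_saved(100, 10))  # 999: probably worth it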

---

to further get at what i mean when i say that Python is somewhat 'hacky' and that this seeming weakness is actually a good tradeoff, explore the word 'hack':

http://paulbuchheit.blogspot.com/2009/10/applied-philosophy-aka-hacking.html?m=1 (seeing past mistaken assumptions/ways of ordering the world that we impose as a simplifying filter or via 'magical thinking', but which are not precisely in accord with low-level reality; seeing that the true goal isn't what it is thought to be, and taking a shortcut, e.g. Ender's Game)

http://www.paulgraham.com/hp.html

worse is better (what a dirty hack, you're a hack, it's just a hack)

practicality over purity

mit 'hacks'

hackers as people driven to understand how the system actually works

--

essential complexity vs accidental complexity vs jargon/docs vs learnability:

the following are somewhat mutated versions of the real concepts, but: essential complexity cannot be lowered without simplifying the task. Accidental complexity may, in theory, be lowered with proper design. Good or bad jargon/docs affect how easy it is to learn something, but they are a property of the community surrounding the system, not of the system itself. Learnability is how easy it is to learn something, and is inherent to the system itself (at least, relative to a human audience; e.g. in 2014, how to use a computer mouse can be assumed to be well-known except in the third world).

--

perhaps a lot of the sense that i'm imputing to the word 'hack' is captured by the Swedish word 'lagom'? not being a Swedish speaker, i'm not sure i understand that word, though.

but i think lagom also has social connotations, not bragging, not being greedy, etc. "A popular folk etymology claims that it is a contraction of "laget om" ("around the team"), a phrase used in Viking times to specify how much mead one should drink from the horn as it was passed around in order for everyone to receive a fair share." -- https://en.wikipedia.org/wiki/Lagom

--

also, 'hack' has a strong connotation of 'practicality over purity', which i'm not sure is what i'm trying to get at -- i'm saying that there are pure reasons for 'hacky' solutions (for example, if you recognize that a less hacky solution would impose significant cognitive load on the operator), not that i want to do what's cheap and easy for system implementors. To put it another way: i'm after 'practicality over purity' for the system operators (in this case, the programmers using the programming language), not for the system implementors (the programming language implementors).

--