proj-oot-ootPackagingNotes1

(mostly moved to [1])

---

" The one thing I wish it had would be that your dependencies were yours only. I think this is one of the big reasons Node.js got successful so fast. In node you can run different versions with-in the same app depending on the libraries. Elixir is more like ruby in that you can only have one. Well you can have two for swapping out without down time but that is it. I do think this is one of the limits to the Erlang VM. "

---

" Packages and build systems

In addition to GHC itself, we make use of a lot of open-source Haskell library code. Haskell has its own packaging and build system, Cabal, and the open-source packages are all hosted on Hackage. The problem with this setup is that the pace of change on Hackage is fast, there are often breakages, and not all combinations of packages work well together. The system of version dependencies in Cabal relies too much on package authors getting it right, which is hard to ensure, and the tool support isn't what it could be. We found that using packages directly from Hackage together with Facebook's internal build tools meant adding or updating an existing package sometimes led to a yak-shaving exercise involving a cascade of updates to other packages, often with an element of trial and error to find the right version combinations.

As a result of this experience, we switched to Stackage as our source of packages. Stackage provides a set of package versions that are known to work together, freeing us from the problem of having to find the set by trial and error. "

---

https://nylas.com/blog/packaging-deploying-python

summary: docker sounds cool but it's too new for us. Wound up using dh-virtualenv

discussion: https://news.ycombinator.com/item?id=9861127

discussion summary:

svieira 13 hours ago

Back when I was doing Python deployments (~2009-2013) I was:

Fast, zero downtime deployments, multiple times a day, and if anything failed, the build simply didn't go out and I'd try again after fixing the issue. Rollbacks were also very easy (just switch the symlink back and restart Apache again).

These days the things I'd definitely change would be:

Things I would consider:

reply
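A minimal sketch of the symlink-swap deploy/rollback svieira describes above, assuming each release is unpacked into its own versioned directory and a `current` symlink points at the live one (paths and names here are illustrative):

    import os

    def activate_release(release_dir, current_link="/srv/app/current"):
        """Atomically point the `current` symlink at release_dir.

        Rollback is the same call with the previous release directory,
        followed by an Apache/app-server reload.
        """
        tmp_link = current_link + ".tmp"
        if os.path.lexists(tmp_link):
            os.remove(tmp_link)
        os.symlink(release_dir, tmp_link)
        # os.replace() renames atomically on POSIX, so readers always see
        # either the old target or the new one, never a missing link.
        os.replace(tmp_link, current_link)

    # activate_release("/srv/app/releases/2013-05-01-build42")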

they should have used Docker anyway (paraphrased)

 Cieplak 17 hours ago

Highly recommend FPM for creating packages (deb, rpm, osx .pkg, tar) from gems, python modules, and pears.

https://github.com/jordansissel/fpm

reply

"http://pythonwheels.com/ solves the problem of building c extensions on installation. " "Pair this with virtualenvs in separate directories (so that "rollback" is just a ssh mv and a reload for whatever supervisor process)" "Also, are there seriously places that don't run their own PyPI? mirrors?"

localshop and devpi are local PyPI mirrors, apparently

 perlgeek 5 hours ago

Note that the base path /usr/share/python (that dh-virtualenv ships with) is a bad choice; see https://github.com/spotify/dh-virtualenv/issues/82 for a discussion.

You can set a different base path in debian/rules with export DH_VIRTUALENV_INSTALL_ROOT=/your/path/here

reply

 erikb 7 hours ago

No No No No! Or maybe?

Do people really do that? Git-pull their own projects onto the production servers? I spent a lot of time putting all my code into versioned wheels for deployment, even though I'm the only coder and the only user. Deployment and development are, and should be, two different worlds.

reply
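For contrast with git-pulling onto production, a minimal sketch of the "versioned wheels" approach: a setup.py with an explicit version, from which a wheel is built once and then installed on the servers (project name, version and dependency are illustrative):

    # setup.py -- hypothetical minimal project
    from setuptools import setup, find_packages

    setup(
        name="myapp",                 # illustrative name
        version="1.4.2",              # every deploy is a real, versioned artifact
        packages=find_packages(),
        install_requires=["requests>=2.0"],
    )

    # Build once, deploy the .whl everywhere (requires the `wheel` package):
    #   pip wheel . -w dist/
    #   pip install dist/myapp-1.4.2-py3-none-any.whl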

---

"The final question was about what he hates in Python. "Anything to do with package distribution", he answered immediately. There are problems with version skew and dependencies that just make for an "endless mess". He dreads it when a colleague comes to him with a "simple Python question". Half the time it is some kind of import path problem and there is no easy solution to offer." -- van Rossum, paraphrased in https://lwn.net/Articles/651967/

arturhoo 4 hours ago

I agree with Guido that the thing I hate the most in Python is packaging in general. I find Ruby's gems, bundler and Gemfile.lock to be a much more elegant solution.

On the other hand, I really like the explicit imports (when used properly). Less magic, which makes code navigation way easier.

reply

davexunit 28 minutes ago

As a distro packager, I find Python's packages to be much better and easier to integrate than Ruby gems. I've had no shortage of troubles with Ruby gems: requiring the git binary to build the gem (even in a release tarball), test suites not being included in the gem releases on rubygems.org, rampant circular dependencies, etc. Python's PyPI has caused me no such issues.

reply

---

dchesterton 2 hours ago

I find external dependencies much more reliable in the PHP world than JS. Most packages try to follow semver. Composer is one of the best dependency manager tools I've used and you can easily lock down dependency versions so you can install from a state which you know works.

People hate on PHP but there are a lot of quality packages out there and Composer is definitely considered best practice by most developers now.

reply

---

"Other points: Cargo is underrated. Traits over primitives is a huge win over Java's boxed collections."

---

JoshTriplett 11 hours ago

I'm curious if there were any significant responses to the tooling/ecosystem questions regarding packaging and integration with Linux distributions.

I'd like to provide an application written in Rust in major Linux distributions. "cargo install" will not work, both because it depends on network access, and because it pulls in dependencies from outside the distribution. Similarly, many distributions have policies against bundling library sources in the application. There needs to be a straightforward way to turn a source package with a Cargo.toml file into a package for Debian and Fedora that depends on other packages corresponding to its Cargo dependencies (and C library dependencies).

reply

bluejekyll 11 hours ago

There are people working on this, here's an announcement for a deb plugin to Cargo:

https://www.reddit.com/r/rust/comments/4ofiyr/announcing_car...

reply

wyldfire 10 hours ago

I hadn't seen this. Thank you for sharing.

reply

steveklabnik 11 hours ago

Cargo install was never intended to be used that way. It's for things useful for Rust developers, not for people who use a program that happens to be written in Rust.

There's a few things at play here. For "build from source" distros, we've been making some changes to make it easier to package rustc. Namely, instead of relying on a specific SHA of the compiler to bootstrap, starting with 1.10, it will build with 1.9, and 1.11 will build with 1.10. This is much, much easier for distros. As for Cargo dependencies on those packages, we'll see. There's a few ways it could go, but we need the compiler distribution sorted first.

A second is that we'd like to eventually have tooling to make giving you a .deb or .rpm or whatever easier: https://github.com/mmstick/cargo-deb is an example of such a tool. This won't necessarily be good enough to go into Debian proper; I know they tend to not like these kinds of tools, and want to do it by hand. But "Hey thanks for visiting my website, here is a thing you can download and install", these kinds of packages can be made with this kind of tooling.

In general, it's complex work, as these issues also differ per distro or distro family. We'll get there :)

reply

JoshTriplett 11 hours ago

I was more looking for a path to producing Policy-compliant packages for Debian and other distributions.

Do you know of anyone working on that right now, and how I can help?

reply

steveklabnik 11 hours ago

https://internals.rust-lang.org/t/perfecting-rust-packaging/2623 is the big internals thread about it; weighing in there is a good entry point. Thanks!

reply

---

some folks are making some noise about the TUF initiative for package signing:

https://theupdateframework.github.io/
http://freehaven.net/%7Earma/tuf-ccs2010.pdf
good overview: https://lwn.net/Articles/628842/ and https://lwn.net/Articles/629478/
http://legacy.python.org/dev/peps/pep-0458/
http://legacy.python.org/dev/peps/pep-0480/
https://github.com/rust-lang/cargo/issues/1281
https://github.com/rust-lang/crates.io/issues/75
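The core TUF idea (per the LWN overview above): a client accepts repository metadata only when it carries valid signatures from a threshold of that role's keys, and accepts a package only when it matches the hash and length recorded in that metadata, so one compromised key or mirror isn't enough. A rough conceptual sketch; `verify` stands in for a real signature check (e.g. Ed25519 from a crypto library) and is not any particular library's API:

    import hashlib

    def metadata_is_trusted(metadata_bytes, signatures, role_keys, threshold, verify):
        """Accept role metadata only if at least `threshold` distinct trusted
        keys signed it. `signatures` maps key id -> signature; `role_keys`
        maps key id -> public key; `verify(pubkey, sig, data)` is the
        (hypothetical) signature check."""
        valid_signers = {
            key_id
            for key_id, sig in signatures.items()
            if key_id in role_keys and verify(role_keys[key_id], sig, metadata_bytes)
        }
        return len(valid_signers) >= threshold

    def target_is_trusted(package_bytes, targets_entry):
        """A downloaded package must match the length and hash recorded in the
        (already threshold-verified) targets metadata."""
        return (len(package_bytes) == targets_entry["length"]
                and hashlib.sha256(package_bytes).hexdigest() == targets_entry["sha256"])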

---

"

regularfry 8 days ago [-]

NPM would be hugely improved if:

1) `npm install` on two different machines at the same point in time was guaranteed to install the same set of dependency versions.

1b) shrinkwrap was standard.

2) it was faster.

3) you could run a private repo off a static file server.

reply

jdlshore 7 days ago [-]

Check in your modules.

reply "

---

https://github.com/pypa/pipfile https://news.ycombinator.com/item?id=13011932

---

discussion on https://www.kennethreitz.org/essays/announcing-pipenv

cderwin 13 hours ago [-]

This is great, but sometimes I think that Python needs a new package manager from scratch instead of more tools trying to mix and match a bunch of flawed tools together in a way that's palatable to most of us. Python packaging sucks, the whole lot of it. Maybe I'm just spoiled by Rust and Elixir, but setuptools, distutils, pip, easy_install, all of it is really subpar. But of course everything uses PyPI and pip now, so it's not like any of it can actually be replaced. The state of package management in Python makes me sad. I wish there were a good solution, but I just don't see it.

Edit: I don't mean to disparage projects like this and pipfile. Both are great efforts to bring the packaging interface in line with what's available in other languages, and might be the only way up and out of the current state of affairs.

reply

renesd 12 hours ago [-]

I think python packaging has gotten LOTS better in the last few years. I find it quite pleasurable to use these days.

From binary wheels (including on different Linux architectures), to things like local caching of packages (taking LOTS of load off the main servers), to the pypa organisation on GitHub [0], to `python -m venv` working.

Also lots of work around standardising things in PEPs, and writing documentation for people.

I would like to applaud all the hard work people have done over the years on python packaging. It really is quite nice these days, and I look forward to all the improvements coming up (like pipenv!).

I'd suggest people check out fades [1] (for running scripts and automatically downloading dependencies in a venv), as well as conda [2], the alternative package manager.

[0] https://github.com/pypa/

[1] https://fades.readthedocs.io/en/release-5/readme.html#what-d...

[2] http://conda.pydata.org/docs/intro.html

reply

sametmax 11 hours ago [-]

+1. Relative to what we had before, it's so much better. But compared to the JS/Rust ecosystem, we are behind.

Now it's hard to compete with JS on some stuff: it's the only language on the most popular dev platform (the web) and it has one implicit standardized async model by default.

It's hard to compete with Rust on some stuff: it's compiled and fast, can provide stand-alone binaries easily, and has a checker that can avoid many bugs.

But this. The package manager. We can compete. And yet we are late.

It's partially my fault since it's a project I had in mind for years and never took the time to work on. It's partially everybody's fault I guess :)

reply

....

jessaustin 16 hours ago [-]

One suspects it's you who hasn't distributed or installed many modules on either python or node. So many of the problems that python has, simply don't exist for node, because it finds modules in a bottom-up hierarchical fashion. That allows a single app or module to use modules that in turn use different versions of other modules, and not to worry about what other modules are doing, or how other modules are installed, or how node is installed, or what version of node is installed. This prevents the traditional "dependency hell" that has plagued devs for decades. Thanks to tools like browserify and webpack, the browser may also benefit from this organization.

On top of all that, npm itself just does so many things right. It's quite happy to install from npm repos, from dvcs repos, from regular directories, or from anything that looks like a directory. It just needs to find a single file called "package.json". It requires no build step to prepare a module for upload to an npm repo, but it easily allows for one if that's necessary. package.json itself is basically declarative, but provides scripting hooks for imperative actions if necessary. At every opportunity, npm allows devs to do what they need to do, the easy way.

In a sense, node and npm are victims of their own quality. The types of "issues" (e.g. too many deps, too many layers of deps, too many versions of a particular dep, deps that are too trivial, etc.) about which anal code puritans complain with respect to node simply couldn't arise on other platforms, because dependency hell would cause the tower of module dependencies to collapse first. node happily chugs along, blithely ignoring the "problems".

Personally, I used to be able to build python packages for distribution, but since I've been spoiled by node and npm for several years I've found I simply can't do that for python anymore. It is so much harder.

reply

philsnow 15 hours ago [-]

npm has its own special problems. disclaimer: what I'm talking about in this post is at least six months old, which in node/npm/js world is ancient history.

> it finds modules in a bottom-up hierarchical fashion. That allows a single app or module to use modules that in turn use different versions of other modules, and not to worry about what other modules are doing

To my understanding, if your app transitively depends on package foo-1.2 in thirty different places [0], there will be thirty copies of foo-1.2 on disk under node_modules/ . Each package reads its very own copy of foo-1.2 when it require()s foo.

On a large app, that adds up to a lot of inodes ("why does it say my filesystem is full? there's only 10G of stuff on my 80G partition!" because it's used up all its inodes, not its bytes.) and a _lot_ of unnecessary I/O.
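A rough way to see the duplication described here is to walk node_modules and count physical copies of each name/version pair; a sketch (scoped @-packages ignored for brevity), where on pre-npm@3 layouts the top entries can easily reach dozens of copies:

    import json
    import os
    from collections import Counter

    def count_installed_copies(root="node_modules"):
        """Count physical copies of each (name, version) under node_modules."""
        copies = Counter()
        for dirpath, _dirnames, filenames in os.walk(root):
            # A package root is a directory holding package.json whose parent
            # directory is itself named node_modules (scoped packages skipped).
            if "package.json" in filenames and \
                    os.path.basename(os.path.dirname(dirpath)) == "node_modules":
                try:
                    with open(os.path.join(dirpath, "package.json")) as f:
                        pkg = json.load(f)
                    copies[(pkg.get("name"), pkg.get("version"))] += 1
                except (ValueError, OSError):
                    pass  # unreadable or malformed package.json
        return copies

    if __name__ == "__main__":
        for (name, version), n in count_installed_copies().most_common(10):
            print("%4dx  %s %s" % (n, name, version))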

((note: the next comment is referring to how npm 3 ditches the deep directories and puts everything in one directory, de-duplicated, as a post further below explains))

jessaustin 15 hours ago [-]

...what I'm talking about in this post is at least six months old...

Haha npm@3 was out June 2015. b^)

I agree that it would have been better, on balance, for previous versions to have created hard links to already-installed modules. Actually that wouldn't be a bad option to have even now, since debugging is often easier when one has a deep directory structure to explore rather than hundreds of random names in the top-level node_modules directory. That is, if I know the problem is in module foo, I can pushd to node_modules/foo, find the problematic submodule again, and repeat until I get all the way to the bottom. [EDIT: it occurs to me that having all these hard links would make e.g. dependency version updates easier, since un-updated dependencies wouldn't have to be recopied, unix-stow-style.]

To me, the more amusing file descriptor problem is caused by the module "chokidar", which when used in naive fashion tries to set up watches on all 360 files and directories created by itself and its own 55 dependencies. At that point it's real easy to run out of file watches altogether. Some of the utilities that call chokidar do so while ignoring node_modules, but many do not.

reply

...

sjellis 22 hours ago [-]

This is actually one of the big problems, I think: Python packaging involves knowing a number of different things and reading various resources to get the full picture.

Recently, I built a small CLI tool in Python, and learned all of the bits needed to build, test and package my application "the right way". I knew Python syntax before, but it was a lot of effort to set this up. The difference in the experience between Python and Rust or .NET Core is actually shocking, and most of it isn't down to anything that Python couldn't do, just the current state of the tooling.

reply

d0mine 17 hours ago [-]

Could you provide some specific examples of the "shocking" difference?

reply

sjellis 5 hours ago [-]

Python best practice: figure out the correct directory structure by reading docs and looking at GitHub repositories, learn how to write setup.py & setup.cfg & requirements.txt & MANIFEST.in files, set up py.test and tox (because Python 2 still lives), write your README in RST format (as used by nothing else ever), and for bonus points: write your own Makefile. Get depressed when you realize that target platforms either don't have Python or have the wrong version.

Rust: type "cargo new", README and doc comments in Markdown, type "cargo test" and "cargo build".

I'm being deliberately snarky, but you get the point: there has been a slow accretion of complexity over a very long time, and most of it is not the language itself.

reply

bastawhiz 13 hours ago [-]

`setup.py` is shockingly awful compared to most other solutions.

reply
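For reference, the boilerplate being complained about looks roughly like this: a setup.py that imports its own version, reads an RST README, and is still only one of several files (setup.cfg, MANIFEST.in, requirements.txt, tox.ini) the project needs; names below are illustrative:

    # setup.py -- roughly the minimum for a "properly" packaged project
    import os
    from setuptools import setup, find_packages

    from myapp import __version__          # single-source the version (illustrative package)

    here = os.path.abspath(os.path.dirname(__file__))
    with open(os.path.join(here, "README.rst")) as f:
        long_description = f.read()        # RST, since PyPI of that era didn't render Markdown

    setup(
        name="myapp",
        version=__version__,
        description="Example CLI tool",
        long_description=long_description,
        packages=find_packages(exclude=["tests"]),
        install_requires=["click>=6.0"],   # loose bounds here; requirements.txt pins for deploys
        entry_points={"console_scripts": ["myapp=myapp.cli:main"]},
    )
    # ...plus setup.cfg, MANIFEST.in, tox.ini and, for bonus points, the Makefile.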

...

ghshephard 1 day ago [-]

Okay - maybe I'm missing something, but pip is the only Python package manager I've ever used. And it's basically "pip install xxx", "pip install --upgrade xxx", "pip show xxx", "pip list", "pip uninstall xxx"

I'm curious what I've been missing about pip that makes it problematic - I've never used the other tools you mentioned (setuptools/distutils/ez_install) - so I can't comment on them, but, on the flip side, I've never had to use them, so maybe my requirements are somewhat more simple than yours.

reply

sametmax 1 day ago [-]

One thing is good dependency management. Right now if you want to upgrade your Python version, or one of your packages, it's a mountain of manual work. There is nothing in the stack helping you with the dependency graph.

Another thing is providing a stand-alone build. Something you can just ship without asking the client to run commands in the terminal to make it work. I use Nuitka (http://nuitka.net/) for this. It's a fantastic project, but man, it's a lot of work for something that works out of the box in Go or Rust.

One last thing is generating OS packages (msi/deb/rpm/dmg/snap). Your sysadmin will like you. Pex (https://pypi.python.org/pypi/pex) is the closest, but not very standard.

Other pet peeves of mine:

reply

scrollaway 19 hours ago [-]

Oh my god, you've described every single one of my issues with Python packaging.

The whole setup.py/setup.cfg situation really is ridiculous. Having to import the __version__, read() the README, no markdown support on pypi, MANIFEST / MANIFEST.in files, tox.ini, what a mess.

reply

schlumpf 1 day ago [-]

This. Particularly the need for a minimum standard project structure.

Pipenv shows its pedigree and looks like a great tool...that also overlaps significantly with conda. What are the use cases that Pipenv addresses better than/in lieu of conda?

reply

mdeeks 11 hours ago [-]

It looks like Pipenv does not handle the Python install itself or associated non-Python libraries. With Conda I can tell it to install Python 3.6 along with FreeTDS (for MSSQL). Conda lets me do this in one environment.yml file and have it work cross-platform. Separate Homebrew or apt-get steps are no longer necessary.

That said, pipenv still looks awesome. Any improvement to the Python packaging world is a welcome gift.

reply

sametmax 1 day ago [-]

pipenv allows you to completely ignore the virtualenv, like node_modules. It seems like a detail, but from giving a lot of Python and JS trainings, I came to realize newcomers need this kind of small help.

reply

daenney 1 day ago [-]

Not needing to install (ana/mini)conda just to get a package manager would be why I would use Pipenv over Conda. Miniconda alone requires somewhere close to 400MB of space and comes with a whole bunch of extra things I don't need just to manage packages and virtualenvs.

reply

kalefranz 19 hours ago [-]

The miniconda bootstrap of conda is ~20-30 MB (compressed) depending on platform. It contains only conda and its dependencies, like python and requests. It's how you install conda if you want only conda. The 400 MB number is for the Anaconda Distribution, which is a self contained, single-install, get-all package primarily aimed at scientists and engineers.

reply

sametmax 21 hours ago [-]

pip-tools doesn't solve the problem at all. It will update things to the latest version, cascading from package to package.

That doesn't guarantee your setup will work.

Dependency management is supposed to create a graph of all requirements, with lower and upper version bounds for the runtime and the libs, and find the most up-to-date combination of those.

If a combination can't be found, it should let you know that either you can't upgrade, or suggest alternative upgrade paths.

pip-tools will just happily upgrade your package and leave you with something broken, because it's based on pip, which does that. They don't check mutually exclusive dependency versions, deprecation, runtime compatibility and such. And they don't build a graph of their relations.

reply
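A toy sketch of the kind of resolution being asked for: accumulate lower/upper bounds as a constraint graph and backtrack until a mutually compatible combination is found, preferring newer versions (all package data here is made up):

    def resolve(root_requirements, available, deps):
        """Toy backtracking resolver.

        root_requirements: {pkg: [constraint functions, version -> bool]}
        available:         {pkg: [versions, newest first]}
        deps:              {(pkg, version): {dep_pkg: [constraints]}}
        Returns {pkg: version} or None if no combination satisfies everything.
        """
        def backtrack(constraints, chosen):
            # Constraints added later may invalidate an earlier choice.
            for pkg, version in chosen.items():
                if not all(ok(version) for ok in constraints.get(pkg, [])):
                    return None
            todo = [p for p in constraints if p not in chosen]
            if not todo:
                return chosen
            pkg = todo[0]
            for version in available.get(pkg, []):          # newest first
                if not all(ok(version) for ok in constraints[pkg]):
                    continue
                new_constraints = {p: list(cs) for p, cs in constraints.items()}
                for dep, dep_cs in deps.get((pkg, version), {}).items():
                    new_constraints.setdefault(dep, []).extend(dep_cs)
                result = backtrack(new_constraints, {**chosen, pkg: version})
                if result is not None:
                    return result
            return None                                     # dead end: caller tries an older version

        return backtrack({p: list(cs) for p, cs in root_requirements.items()}, {})

    # e.g. resolve({"django": [lambda v: v >= (1, 9)]},
    #              {"django": [(1, 10), (1, 9)], "lib": [(2, 0), (1, 0)]},
    #              {("django", (1, 10)): {"lib": [lambda v: v >= (2, 0)]},
    #               ("django", (1, 9)):  {"lib": [lambda v: v >= (1, 0)]}})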

ProblemFactory 20 hours ago [-]

It would be even better if the tool ran your project's tests when checking upgrade combinations.

Something that would say: "You can safely upgrade to Django 1.9.12. Upgrading to latest Django 1.10.5 breaks 20 tests."

reply

StavrosK 20 hours ago [-]

How can you have an upper bound on compatibility? When a library is released, it knows that it works with version 1.3.2 of its dependency, but how can it ever know it doesn't work with 1.4, unless the developer goes back and re-releases the app?

reply

AgentME 16 hours ago [-]

If the library follows semantic versioning, then you can always declare that you work with everything from the current version to before the next major version.

reply
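A small sketch of that semver assumption: a library tested against 1.3.2 of a dependency declares compatibility with everything >=1.3.2,<2.0.0, without ever having seen 1.4 (toy version parsing, pre-release tags ignored):

    def parse(version):
        """'1.3.2' -> (1, 3, 2); pre-release/build tags ignored for brevity."""
        return tuple(int(part) for part in version.split("."))

    def caret_compatible(known_good, candidate):
        """Semver assumption: anything from the tested version up to (but not
        including) the next major release is expected to keep working."""
        low = parse(known_good)
        high = (low[0] + 1, 0, 0)
        return low <= parse(candidate) < high

    assert caret_compatible("1.3.2", "1.4.0")       # new minor: assumed compatible
    assert not caret_compatible("1.3.2", "2.0.0")   # new major: not claimed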

StavrosK