Bayle Shanks's website: proj-oot-ootModuleNotes1

http://skilpat.tumblr.com/post/9411500320/a-modular-package-language-for-haskell

nadaviv 1 day ago

link

ES6's generators and the `yield` keyword plays very nicely with callback-based code, doesn't require any changes to the underlying libraries and gives you a much nicer syntax to work with.

See https://github.com/visionmedia/co

JoshGlazebrook? 1 day ago

link

I've been looking at another library like the one you linked. Essentially the "yield" keyword can be compared to "await" in C#.

https://github.com/bjouhier/galaxy

reply "

CyruzDraxs? 1 day ago

link

It was easier until we discovered even cleaner ways, and most of the Javascript community began to adopt them. But node chose to stick to the "old" ways. (Not really old, but not as shiny and new as the alternatives)

I feel like promises don't need to be part of core though. It's easy enough to wrap the standard library in userland and just have other things depend on the wrappers.

reply "

e12e 1 day ago

link

This seems to parallel some of the work VPRI is doing, eg from:

http://www.vpri.org/pdf/tr2010004_steps10.pdf

STEPS Toward Expressive Programming Systems, 2010 Progress Report Submitted to the National Science Foundation (NSF) October 2010 Alan Kay et al

"If computing is important—for daily life, learning, business, national defense, jobs, and more — then qualitatively advancing computing is extremely important. For example, many software systems today are made from millions to hundreds of millions of lines of program code that is too large, complex and fragile to be improved, fixed, or integrated. (One hundred million lines of code at 50 lines per page is 5000 books of 400 pages each! This is beyond human scale.) What if this could be made literally 1000 times smaller — or more? And made more powerful, clear, simple, and robust? This would bring one of the most important technologies of our time from a state that is almost out of human reach—and dangerously close to being out of control—back into human scale.

(...)

Previous STEPS Results

The first three years were devoted to making much smaller, simpler, and more readable versions of many of the prime parts of personal computing, including: graphics and sound, viewing/windowing, UIs, text, composition, cells, TCP/IP, etc. These have turned out well (they are chronicled in previous NSF reports and in our papers and memos). For example, essentially all of standard personal computing graphics can be created from scratch in the Nile language in a little more than 300 lines of code. Nile itself can be made in a little over 100 lines of code in the OMeta metalanguage, and optimized to run acceptably in real‐time (also in OMeta) in another 700 lines. OMeta can be made in itself and optimized in about 100 lines of code."

http://www.vpri.org/html/writings.php "

---

http://existentialtype.wordpress.com/2011/04/16/modules-matter-most/

"Modules are about name control and only secondarily, if at all, about types." -- http://existentialtype.wordpress.com/2011/04/16/modules-matter-most/#comment-733

---

(copied from ootTypesNotes2:)

Implementation dependencies are like dependencies in a package management system. Implementations can depend on types (all that is required is any implementation of the desired type), or 'concretely' on other implementations (we demand a specific implementation of a type, because we are going to break into its internals). For example, the Rational implementation can depend on the type Int (meaning that it makes use of Ints internally), or it could be eg a FFI binary containing optimized x86 code which calls subroutines from another binary package containing an implementation of Ints. Outside of FFI, a concrete implentation dependency is one which breaks encapsulation by accessing private variables in another implementation.

does Oot support concrete implementation dependencies aside from FFI at all? If so, does it force you to specify a specific version of the concrete dependency, to prevent the upstream author of the dependency from worrying about breaking code by changing Private variables? Perhaps it only supports concrete dependencies within the same module?

note that, as with the class Ord which can be generated from == and <, or == and >, or <= and <, etc, which implementations are 'primitive' can vary. The module system should do this dependency resolution.

---

i think everything in a module can access the privates of everything else in the same module, and otherwise privates cannot be accessed (except by debuggers and profilers, in debug mode)

---

perhaps all multi-file modules must be have version numbers? This is to prevent someone from extending someone else's module by adding another file to it, and then complaining when the 'parent' code changes private stuff and breaks the extension code. This way the parent code can increment their version number and then the 'child' code simply won't compile unless they keep around the source to the old module code and use that. This implies that a project might have multiple internal implementations of the same module. In this case, Oot should transparently translate values from one version to a later version by serializing and deserializing them; yes, this is slow (perhaps a facility could be provided for implementations to specify fast ways to do this between particular versions?).

---

note that, although the distinction between oot types (hasell typeclasses/java interfaces) and implementations is like C .h files and C .c files, that (a) we can mix both kinds of declarations in the same source code file, and (b) you only have to write the implementation; Oot will autogenerate a type for an implementation. So eg if you write a String implementation that is internally a linked list, and then someone else writes a String implementation that is a packed array of bytes (like Haskell Bytestrings), neither of you need to have a place in your source code that explicitly defines the String interface (type); this will be inferred (provided it matches). You CAN explicitly declare the type, though, and the compiler API has an option to generate the source code for the inferred type declaration.

You might have to explicitly declare types that have pattern matching views, eg if you want a List to be defined with two constructors, EmptyList? or cons(list, item), then you might have to declare that. But i'm not sure. When i speak of inferred type declarations, i'm thinking of things like https://docs.python.org/2/library/threading.html#semaphore-objects which are have a single constructor, have no 'algebraic'ish data type stuff (multiple constructors but also AADTs), and are just basically a list of supported methods with their signatures.

---

"Anything to do with package distribution... There are problems with version skew and dependencies that just make for an "endless mess". He dreads it when a colleague comes to him with a "simple Python question". Half the time it is some kind of import path problem and there is no easy solution to offer." -- summary of the Python founder, Guido van Rossum's answer to the question "what he hates in Python" [1]

---

"Hurdles to reuse include packaging models, entangled dependencies, import boiler plate, namespaces, hidden side-effects." -- [2]

---

" The go tool

GO15VENDOREXPERIMENT is now enabled by default.

How does it work?

/home/user/gocode/ src/ server-one/ main.go (import "github.com/gorilla/mux") server-two/ main.go (import "github.com/gorilla/mux") vendor/ github.com/ gorilla/ mux/ ...

server-one uses the mux package in $GOPATH/src/github.com/gorilla/mux.

server-two uses the mux package in vendor. " -- https://talks.golang.org/2016/state-of-go.slide#46

---

 uglycoyote 1 day ago

I have worked for game companies making large console games both with the STL and without. One thing I noticed about STL is that it really bloats the build times because the STL headers are so massive. I did some analysis of the preprocessor output one time and found that just including a couple of headers like vector.h and string.h would increase the size of the preprocessor output by 80000 to 100000 lines, even for a cop file that is otherwise trivially small. Typically almost every cpp file in the codebase would either directly use STL headers or bring them in as dependencies, and so in a codebase of a few thousand files you are talking about hundreds of millions of extra lines of code that the compiler has to churn through. This amounted to each cpp file talking several seconds to compile and the entire codebase taking most of an hour to rebuild. People would not have been able to survive that without using Incredibuild to farm off compilation to other machines. The company I currently work at does not use STL and so largely avoids this problem. I an curious to what extent EASTL has this problem or avoids it.

egoots 1 day ago

Isn't part of this addressed by taking advantage of the new C++ 11 extern template feature? http://www.stroustrup.com/C++11FAQ.html#extern-templates

maximilianburke 1 day ago

EASTL still has this problem. It's worse on Windows where the prerequisite system header (ie: yvals.h, etc.) can bring in a ton of bloat. Until modules are a reliable thing and we can take advantage of them in a reasonable cross platform way it will still remain a problem.

thinkdoge 1 day ago

The real solution is 'import' but we probably won't see that in ++17

---

for Python Flask: " Think about deployment. How is it getting to the server? egg, wheel, rpm? Will there be continuous integration? Are you using salt or puppet? How you deploy your application will determine what kind of structure you need and what kind of supporting utilites you may or may not have to write. "

---

callmevlad 15 hours ago

The fact that this is possible with NPM seems really dangerous. The author unpublished (erm, "liberated") over 250 NPM modules, making those global names (e.g. "map", "alert", "iframe", "subscription", etc) available for anyone to register and replace with any code they wish.

Since these libs are now baked into various package.json configuration files (some with 10s of thousands of installs per month, "left-pad" with 2.5M/month), meaning a malicious actor could publish a new patch version bump (for every major and minor version combination) of these libs and ship whatever they want to future npm builds. Because most package.json configs use the "^1.0.1" caret convention (and npm --save defaults to this mode), the vast majority of future installs could grab the malicious version.

@seldo Is there a plan to address this? If I'm understanding this right, it seems pretty scary :

[1] https://medium.com/@azerbike/i-ve-just-liberated-my-modules-...

coroutines 15 hours ago

So we need gpg signed packages :> And... all packages should be namespaced under the author who published them. And... I kind of want to say "once it's published, it's forever".

profmonocle 14 hours ago

> And... I kind of want to say "once it's published, it's forever".

This is effectively the norm with more traditional, curated package managers. Say I release a piece of open source software, and some Linux distro adds it to their package manager. Under a typical open source license, I have no legal right to ask them to stop distributing it. They can just say "sorry, you licensed this code to us under X license and we're distributing it under those terms. Removing it would break our users' systems, so we won't do it."

The difference is that NPM is self-service - publishers add packages themselves, and NPM has chosen to also provide a self-service option to remove packages. I honestly wouldn't have a problem with them removing that option, and only allowing packages to be removed by contacting support with a good reason. (Accidental private info disclosure, copyright violation, severe security bug, etc.)

echelon 14 hours ago

  I honestly wouldn't have a problem with them removing that option, and only
  allowing packages to be removed by contacting support with a good reason.
  (Accidental private info disclosure, copyright violation, severe security 
  bug, etc.)

Even Rust's Cargo won't allow you to revoke secrets [1]. I think this is the correct policy.

[1] http://doc.crates.io/crates-io.html#cargo-yank

drewgross 13 hours ago

Aside from secrets there is also sensitive data. If someone accidentally uploads some personal information, they need a way to remove it if, say, they receive a court order ordering them to remove it.

appleflaxen 6 hours ago

If they receive a court order, and there is no technical way to do that, then the court is out of luck. "A court might order it in the future" is not a design constraint on your decisions today.

sveiss 5 hours ago

Sure there's a technical way to do it: you unplug the server hosting it (or more likely, your hosting provider does that for you).

No court is going to shed any tears over fact this has wider consequences than if you'd been able to comply with a narrower takedown request.

---

sratner 20 hours ago

I suspect you aren't seeing much discussion because those who have a reasonable process in place, and do not consider this situation to be as bad as everyone would have you believe, tend not to comment on it as much.

For the sake of discussion, here is my set of best practices.

I review libraries before adding them to my project. This involves skimming the code or reading it in its entirety if short, skimming the list of its dependencies, and making some quality judgements on liveliness, reliability, and maintainability in case I need to fix things myself. Note that length isn't a factor on its own, but may figure into some of these other estimates. I have on occasion pasted short modules directly into my code because I didn't think their recursive dependencies were justified.

I then pin the library version and all of its dependencies with npm-shrinkwrap.

Periodically, or when I need specific changes, I use npm-check to review updates. Here, I actually do look at all the changes since my pinned version, through a combination of change and commit logs. I make the call on whether the fixes and improvements outweigh the risk of updating; usually the changes are trivial and the answer is yes, so I update, shrinkwrap, skim the diff, done.

I prefer not to pull in dependencies at deploy time, since I don't need the headache of github or npm being down when I need to deploy, and production machines may not have external internet access, let alone toolchains for compiling binary modules. Npm-pack followed by npm-install of the tarball is your friend here, and gets you pretty close to 100% reproducible deploys and rollbacks.

This list intentionally has lots of judgement calls and few absolute rules. I don't follow all of them for all of my projects, but it is what I would consider a reasonable process for things that matter.

[edit: I should add that this only applies to end products which are actually deployed. For my modules, I try to keep dependency version ranges at defaults, and recommend others do the same. All this pinning and packing is really the responsibility of the last user in the chain, and from experience, you will make their life significantly more difficult if you pin your own module dependencies.]

---

" Overview

npm allows packages to take actions that could result in a malicious npm package author to create a worm that spreads across the majority of the npm ecosystem. Description

npm is the default package manager for Node.js, which is a runtime environment for developing server-side web applications. There are several factors in the npm system that could allow for a worm to compromise the majority of the npm ecosystem:

    npm encourages the use of semver, or semantic versioning. With semver, dependencies are not locked to a certain version by default. For any dependency of a package, the dependency author can push a new version of the package.
    npm utilizes persistent authentication to the npm server. Once a user is logged in to npm, they are not logged out until they manually do so. Any user who is currently logged in and types npm install may allow any module to execute arbitrary publish commands.
    npm utilizes a centralized registry, which is utilized by the majority of the Node.js ecosys" --- https://www.kb.cert.org/vuls/id/319816

---

Being a "jack of all trades" means Scala has the superior module system. In Scala you can have abstract modules, the way you have in Ocaml. In Scala type-class instances are lexically scoped, whereas in Haskell they are global. Haskell's type-classes are anti-modular, which is why there are people avoiding type-classes. A big part of what makes Scala so good is OOP. Haskell needs extensions to achieve similar functionality in a half-baked way and modularity is the main complaint of people coming to Haskell from Ocaml. "

---

Javascript still has no good module story. A standard is in the work, but none of the main browsers have implemented it. In the meantime, a few solutions have popped up, such as Asynchronous Module Definition and CommonJS?.

Well, as it turns out, Facebook has its own AMD solution. Modules are defined with the usual __d(name, dependencies, factory). There ‘s also require and a requireLazy for importing modules. "

---

" A crate is a unit of compilation and linking, as well as versioning, distribution and runtime loading. A crate contains a tree of nested module scopes. The top level of this tree is a module that is anonymous (from the point of view of paths within the module) and any item within a crate has a canonical module path denoting its location within the crate's module tree. "

-- https://doc.rust-lang.org/reference.html#crates-and-source-files

---

" What passes for module management in Java is the JAR file - which is basically to say Java really doesn't have any module management. There is a new JSR which might solve this problem, but for the last decade we've lived with classpath hell. Java also suffers from some misguided marketing decisions which have resulted in a monolith J2SE that as of 1.6 weighs in at 44MB. The process to subset this monolithic monstrosity into J2ME moves at a glacial pace. Considering the brilliance in much of the core original Java technology, it is hard to understand why something as fundamental as modularity is lacking.

The .NET design was pretty serious about modularity, and at the high level it has a great design for versioning, GAC, etc. However when it comes to the details, .NET leaves a lot to be desired. Where Java chose ZIP as a simple, flexible way to package up files, .NET uses opaque DLLs with all sorts of Window's specific cruft that makes .NET module files difficult to work with. And to require a separate, undocumented debug pdb file to get meaningful stack traces is just plain wrong.

Everything in Fantom is designed around modular units called pods. Pods are the unit of versioning and deployment. They are combined together using clear dependencies. Like Java they are just ZIP files which can be easily examined. "

" Namespace versus Deployment

Java and .NET to a lesser degree separate the concepts of namespace and deployment. For example in Java packages are used to organize code into a namespace, but JAR files are used to organize code for deployment. The problem is there isn't any correspondence between these concepts. This only exacerbates classpath hell - you have a missing class, but the class name doesn't give you a clue as to what JAR file the class might live in.

This whole notion of type namespaces versus deployment namespaces does offer flexibility, but also seems like unnecessary complexity. Fantom takes a simple approach to managing the namespace using a fixed three level hierarchy "pod::type.slot". The first level of the namespace is always the pod name which also happens to be the unit of deployment and versioning. This consistency becomes important when building large systems through the assembly of pods and their types. For example, given a serialized type "acme::Foo", it is easy to figure out what pod you need. "

---

Olin Shivers and Jonathan Rees think Scheme 48's module system is cool:

https://scsh.net/docu/post/modules.html http://www.s48.org/1.9.2/manual/manual-Z-H-5.html#node_chap_4

" The module system was somewhat like SML's, but allowed modular macros and had another fairly cool feature: when you defined a module, clauses let you specify which files held the module's source. But *other* clauses let you specify which "reader" procedure to use to translate the character stream in the files to the s-expression tree handed to the compiler. So you could handle files with different concrete syntax -- R5RS syntax, scsh syntax, S48 syntax, PLT Scheme syntax, guile syntax, perhaps an infix syntax (as is so often discussed). That eliminated an annoying, low-level but persistent barrier to sharing code across different implementations of Scheme. "

" the module system, which started as a way to decide which files to load. I had worked on module systems before without coming up with anything I liked, so rather than invent anything this time around I just tried to rationalize and Schemize Common Lisp's defpackage, with a dash of ML modules thrown in to make it interesting. The modules started out as lists of files, some of which belonged to the cross-compilation system, some to the interpreter, and some to the "run-time system" (auxiliary Scheme code such as read that is byte-compiled and subsequently byte-interpreted). I introduced new modules as necessary so as to carve the system into parts to be built into the mobile robot system, those to go into the teledebugging system, those to go into the full non-robot system, and those to go into the various overlaps of these destinations. For a long time it was enough for the module system to keep track of which files were to be loaded into the global namespace, but eventually the module system acquired a notion of interface and provided namespace separation. "

---

About C and C++ header files, and the C++ modules proposal [3]:

header files are useful because it allows the #including module to be compiled (otherwise) separately, because the header file provides enough information to determine the in-memory representation of types
in C++, header files must be re-loaded many times, upon each #include, because they might have macros that affect the file #include'ing them, or they might compile-time-execute differently in different files (eg b/c of logic like #ifdef)
in C++, header files cause a significant increase in compilation times
in C++, the macros in header files make things difficult for tooling
- Holzmann's P10 rules recommends avoiding "Token pasting, variable argument lists (ellipses), and recursive macro calls" in macros, and requiring "All macros must expand into complete syntactic units". They recommend minimizing but not totally avoiding conditional compilation directives
another problem with header files is that they make you have to change stuff in two places, rather than one, when you change function signatures

recommendations:

abstractly, consider the type signatures of a module different from its content
in binary semi-compiled module files (OotB??) represent this explicitly (maybe with a hash, so you can easily see when it did not change?)
- this has implications for semver versioning; when the major version is above 0, patch version increases should not change the signature section hash; minor version increases can add new signatures but not alter existing ones. Major version increases can change everything.
this will be something like a module descriptor/interface/header file in order to allow separate compilation. Need enough information to determine in-memory representation of types.
avoid macros in that file
avoid any logic that would require the file to be loaded more than once if imported by different modules
the order of module imports should be irrelevant
private attributes of classes/structs should not be visible

---

should we have an #includeonce directive? No; this makes the ordering of module compilation matter, which means that separate compilation would no longer be possible. So we actually need something like #import (or 'import'), which can actually only load the module into its cache the first time it's encountered, yet has the same effect in every place (so ordering of module compiliation still doesn't matter).

---

i think we only need access modifiers on a per-module basis; eg either something is private to a module, or it's public (exported by the module)

---

:include-macros modifier to import, eg in clojure:

(require '[sablono.core :as sab :include-macros true])

---

npm apparently has a problem where if a package depends on another package, that other package will be installed in a subfolder of the original package, and then this operates recursively, so that dependencies of the dependency are installed in another subfolder within the dependency subfolder. This means that the nesting level on the filesystem of the folders for transitive dependencies can be arbitrarily deep, which is incompatible with popular filesystems, which have path length limits, eg:

https://github.com/nodejs/node-v0.x-archive/issues/6960

i guess the obvious solution to npm's problem is:

when fetching dependencies for a project, do not recursively create subfolders, rather download all dependencies into a single 'dependencies' directory for the given project. This 'dependencies' folder can have different versions of the same module in it. Note that there is still a different 'dependencies' folder for each project, eg by default nothing is installed into a user home directory (or, worse, a root-privileged /usr/share directory), rather they are installed into the project folder from which the user called the 'npm'-like tool.

This implies that the language must support having different versions of the same module in its path, and letting some loaded modules choose to depend on one version while at the same time other loaded modules depend on a different version.

---

Yarn

https://code.facebook.com/posts/1840075619545360

---

https://talks.golang.org/2017/state-of-go.slide#50

" Plugins

Define a plugin:

package main

No C code needed. import "C"

import "fmt"

var V int

func F() { fmt.Printf("Hello, number %d\n", V) }

Then build it:

go build -buildmode=plugin

Note: This currently works only on Linux. Plugins

    p, err := plugin.Open("plugin_name.so")
    if err != nil {
        panic(err)
    }

    v, err := p.Lookup("V")
    if err != nil {
        panic(err)
    }

    f, err := p.Lookup("F")
    if err != nil {
        panic(err)
    }

v.(*int) = 7 f.(func())() prints "Hello, number 7"

Plugins demo

Demo video: twitter.com/francesc

Source code: github.com/campoy/golang-plugins "

---

" And I agree with Bob Harper in his assessment that Modules Matter Most. Many languages have no module system at all, and many that have one have it only as a way of (say) managing namespaces or compilation order. More-powerful module systems exist -- you'll have run into some of the components if you've worked with dependent types, type classes, traits, signatures, functors -- but there's a bewildering array of design constraints to navigate (generativity, opacity, stratification, coherence, subtyping, higher-order-ness, first-class-ness, separate compilation, extensibility, recursion) when arriving at a practical, usable module system. Few if any languages have "done this right" in a convincing enough way that I'd say the problem is solved. The leading research in the field, at the moment, is probably Andreas Rossberg's work on 1ML. But there are decades of ground-work that you should really, really read basically all of if you're going to explore this space. "

---

" Racket has a decent module system, including build and phase separation (even separate phases for testing, cross-compilation or whatever you want), and symbol selection and renaming. Racket has typed modules, and a good interface between typed and untyped modules. While types in Racket do not compete with those of say Haskell, just yet, they are still evolving, fast, and that contract interface between typed and untyped is far ahead of anything the competition has. "

---

D has named scoped imports:

" Scoped Imports

Import declarations may be used at any scope. For example:

void main() { import std.stdio; writeln("bar"); } "

---

cmrx64 on Aug 19, 2017 [-]

1ML just shows that advanced (first-class) ML modules are really just System F_omega. That's just barely more than Haskell! More languages need to expose it nicely. But we already have most if not all the technology we need to make them efficient and worthwhile, vs just possible. This isn't a research problem. This is a design/HCI problem.

catnaroek on Aug 19, 2017 [-]

ML modules are great, but I'm not convinced that making them first-class (1ML) will be equally great. ML hits a sweet spot between automation (full type inference) and expressiveness that is only possible because the type language is first-order. This is IMO the point that most extensions on top of Hindley-Milner completely miss.

So, if anything, what I'd like to see is a dependently typed language whose type language is a first-order, computation-free (no non-constructor function application) subset of the value language.

ratmice on Aug 19, 2017 [-]

The 1ML papers discuss this, ML implementations in reality are already incomplete with regard to Hindley-Milner, a quote:

"We show how Damas/Milner-style type inference can be integrated into such a language; it is incomplete, but only in ways that are already present in existing ML implementations."

tomp on Aug 20, 2017 [-]

1ML is a conservative extension of HM - all valid ML programs are typable in 1ML without extra annotations.

Furthermore, OCaml essentially already has first-class modules - just the syntax is much more awkward than is possible with 1ML.

---

ehnto on Aug 19, 2017 [-]

I know I am basically dangeling meat into lions den with this question; How has PHP7 done in regards to the Modules section or modularity he speaks of?

I am interested in genuine and objective replies of course.

(Yes your joke is probably very funny and I am sure it's a novel and exciting quip about the state of affairs in 2006 when wordpress was the flagship product)

TazeTSchnitzel? on Aug 19, 2017 [-]

PHP 5.3 (2009) added a (simple, static) namespace system. Composer (2012) has built a sophisticated package management infrastructure on top of this.

However, PHP doesn't have a module system, for better or worse. Namespaces merely deal with name collisions and adding prefixes for you. Encapsulation only exists within classes and functions, not within packages. PHP has no concept of namespaced variables, only classes, functions and constants. Two versions of the same package cannot be loaded simultaneously if they occupy the same namespace.

There has been some relatively recent discussion about having, for example, intra-namespace visibility modifiers (e.g. https://externals.io/message/91778#92148). PHP may yet grow modules. The thing is, though, all sorts of things are suggested all the time. Many PHP 7 features had been suggested many years before (https://ajf.me/talks/2015-10-03-better-late-than-never-scala...). The reason PHP 7 has them is people took the initiative to implement them.

---

This article [4] makes a few good points:

imports had better either:
- be resolvable to a file being imported by the import statement alone (that is, without reading and compiling (or even preprocessing) every other file in the project and looking for a MODULE declaration that matches the name of the import)
- or, there needs to be a distinguished 'manifest' file that tells you which modules may be found in which file
you shouldn't be forced to (entirely?) compile the file implementing a module before you compile any file importing that module; this makes it hard to parallelize compilation (but surely you can be forced to compile just the header file?)
modules should also have the following characteristics (which c++ ones do):
- "top-down isolation - The “importer” of a module cannot affect the content of the module being imported. The state of the compiler (preprocessor) in the importing source has no bearing on the processing of the imported code."
- "Bottom-up isolation - The content of a module does not affect the state of the preprocessor in the importing code."
- "Lateral isolation - If two modules are imported by the same file, there is no “cross-talk” between them. The ordering of the import statements is insignificant"
- "Physical encapsulation - Only entities which are explicitly declared as exported by a module will be visible to consumers. Non-exported entities within a module will not affect name lookup in other modules (barring some possible strangeness with ADL. Long story...)"
- "Modular interfaces - The current module design enforces that for any given module, the entire public interface of that module is declared in a single TU called the “module-interface unit” (MIU). The implementation of subsets of the module interface may be defined in different TUs called “partitions.”"

---

great analysis and ideas here:

https://instagram-engineering.com/python-at-scale-strict-modules-c0bb9245c834

"We’ve run into a few pain points working with Python at that scale...

Pain area one: slow startup and reload

by just importing this simple eight line module (not even doing anything with it yet!), we are probably running hundreds, if not thousands of lines of Python code, not to mention modifying a global URL mapping somewhere else in our program.

So what? This is part of what it means for Python to be a dynamic, interpreted language. This lets us do all kinds of useful meta-programming. What's wrong with that?...Our server startup takes over 20s, and sometimes regresses to more like a minute if we aren't paying attention to keeping it optimized...In some ways, that's no different from waiting for another language to compile. But typically compilation can be incremental...But in Python, because imports can have arbitrary side effects, there is no safe way to incrementally reload our server. No matter how small the change, we have to start from scratch every time, importing all those modules, re-creating all those classes and functions, re-compiling all of those regular expressions, etc...

Pain area two: unsafe import side effects

Here’s another thing we often find developers doing at import time: fetching configuration from a network configuration source.

...

The root of the problem here is two factors that interact badly: 1) Python allows modules to have arbitrary and unsafe import side effects, and 2) the order of imports is not explicitly determined or controlled, it’s an emergent property of the imports present in all modules in the entire system (and can also vary based on the entry point to the system).

...

Pain area 3: mutable global state

...

The same thing can easily happen in tests, if people try to monkeypatch without a contextmanager like mock.patch. The effect here is pollution of all future tests run in that process, rather than pollution of all future requests. This is a huge cause of flakiness in our test suite. It's so bad, and so hard to thoroughly prevent, that we have basically given up and are moving to one-test-per-process isolation instead.

So that's a third pain point for us. Mutable global state is not merely available in Python, it's underfoot everywhere you look: every module, every class, every list or dictionary or set attached to a module or class, every singleton object created at module level. It requires discipline and some Python expertise to avoid accidentally polluting global state at runtime of your program.

Enter strict modules

...We have an idea: strict modules... Strict modules are a new Python module type marked with __strict__ = True at the top of the module...

Side-effect-free on import

Strict modules place some limitations on what can happen at module top-level. All module-level code, including decorators and functions/initializers called at module level, must be pure (side-effect free, no I/O). This is verified statically at compile time...

Let’s make that a bit more concrete with an example. This is a valid strict module:

"""Module docstring.""" __strict__ = Truefrom utils import log_to_networkMY_LIST = [1, 2, 3] MY_DICT = {x: x+1 for x in MY_LIST}def log_calls(func): def _wrapped(*args, kwargs): log_to_network(f"{func.__name__} called!") return func(*args, kwargs) return _wrapped@log_calls def hello_world(): log_to_network("Hello World!")

...

How do we know that the log_to_network or route functions are not safe to call at module level? We assume that anything imported from a non-strict module is unsafe, except for certain standard library functions that are known safe. If the utils module is strict, then we’d rely on the analysis of that module to tell us in turn whether log_to_network is safe.

In addition to improving reliability, side-effect-free imports also remove a major barrier to safe incremental reload...

...

Strict modules and classes defined in them are immutable after creation. ...

These changes greatly reduce the surface area for accidental mutation of global state, though mutable global state is still available if you opt-in via module-level mutable containers.

Classes defined in strict modules must also have all members defined in __init__ and are automatically given __slots__ by the module loader’s AST transformation, so it’s not possible to tack on additional ad-hoc instance attributes later.

...

These restrictions don’t just make the code more reliable, they help it run faster as well. Automatically transforming classes to add __slots__ makes them more memory efficient and eliminates per-instance dictionary lookups, speeding up attribute access. Transforming the module body to make it immutable also eliminates dictionary lookups for accessing top-level variables.

---

 rmind 3 days ago [-]

A lot C programmers prefer to keep structures within the C source file ("module"), as a poor man's encapsulation. For example:

component.h:

    struct obj;
    typedef struct obj obj_t;

    obj_t *obj_create(void);
    // .. the rest of the API

component.c:

    struct obj {
        int status;
        // .. whatever else
    };

    obj_t *
    obj_create(void)
    {
        return calloc(1, sizeof(obj_t));
    }

However, as the component grows in complexity, it often becomes necessary to separate out some of the functionality (in order to re-abstract and reduce the complexity) into a another file or files, which also operate on "struct obj". So, we move the structure into a header file under #ifdef __COMPONENT_PRIVATE (and/or component_impl.h) and sprinkle #define __COMPONENT_PRIVATE in the component source files. It's a poor man's "namespaces".

Basically, this boils down to the lack namespaces/packages/modules in C. Are you aware of any existing compiler extensions (as a precedent or work in that direction) which could provide a better solution and, perhaps, one day end up in the C standard?

P.S. And if C will ever grow such feature, I really hope it will NOT be the C++ 'namespace' (amongst many other depressing things in C++). :)

.h files allow for modular, incremental compilation but are a pain for the programming to keep two things in sync. What if we auto-generated something like .h files as part of the compilation process, like Python .pyc files?

---

https://deno.land/manual/linking_to_external_code/import_maps

---

in Zig, you can give #DEFINE settings like function arguments when importing a file

" const c = @cImport({ @cDefine("_NO_CRT_STDIO_INLINE", "1"); @cInclude("stdio.h"); });

pub fn main() void { _ = c.printf("hello\n"); } "

---

toread

https://www.stephendiehl.com/posts/exotic01.html

---

" I think overall it’s a good idea to design name resolution and module systems (mostly boring parts of a language) such that they work well with the map-reduce paradigm.

Require package declarations or infer them from the file-system layout
Forbid meta-programming facilities which add new top-level declarations, or restrict them in such way that they can be used by the indexer. For example, preprocessor-like compiler plugins that access a single file at a time might be fine.
Make sure that each source element corresponds to a single semantic element. For example, if the language supports conditional compilation, make sure that it works during name resolution (like Kotlin’s expect/actual) and not during parsing (like conditional compilation in most other languages). Otherwise, you’d have to index the same file with different conditional compilation settings, and that is messy.
Make sure that FQNs are enough for most of the name resolution.

The last point is worth elaborating. Let’s look at the following Rust example:

File: ./foo.rs trait T { fn f(&self) {} } File: ./bar.rs struct S;

File: ./somewhere/else.rs impl T for S {}

File: ./main.s use foo::T; use bar::S

fn main() { let s = S; s.f(); }

Here, we can easily find the S struct and the T trait (as they are imported directly). However, to make sure that s.f indeed refers to f from T, we also need to find the corresponding impl, and that can be roughly anywhere! " -- Three Architectures for a Responsive IDE

(see notes in chTradeoffs for more)

---

" Rust is one of the few languages which has first-class concept of libraries. Rust code is organized on two levels:

    as a tree of inter-dependent modules inside a crate

    and as a directed acyclic graph of crates

Cyclic dependencies are allowed between the modules, but not between the crates. Crates are units of reuse and privacy: only crate’s public API matters, and it is crystal clear what crate’s public API is. Moreover, crates are anonymous, so you don’t get name conflicts and dependency hell when mixing several versions of the same crate in a single crate graph.

This makes it very easy to make two pieces of code not depend on each other (non-dependencies are the essence of modularity): just put them in separate crates. During code review, only changes to Cargo.tomls need to be monitored carefully. " -- [5]

---

https://instagram-engineering.com/python-at-scale-strict-modules-c0bb9245c834?source=social.tw&gi=c39932b5c7cd

---

against static linking and pinning dependencies (motivation: static linking makes it hard for package maintainers to fix a security bug in a common library and immediately have all instances of that bug which are included (via static linking of that library into other packages) be fixed)

https://blogs.gentoo.org/mgorny/2021/02/19/the-modern-packagers-security-nightmare/ https://news.ycombinator.com/item?id=26203853

---

    26
    cgenschwap 21 hours ago | link | flag |

With programming language versions code that works in one version does not work in another (such as python 2 and 3). With Rust editions each library (or crate in Rust parlance) chooses which edition to be compiled with, so you can continue to use libraries written in, say, the 2015 edition while writing code using 2021 edition Rust.

---

https://lobste.rs/s/8bczn4/for_those_familiar_with_go_cargo_npm_etc

 "For those familiar with Go and (cargo | npm | etc) how is "minimal version selection" working out?"
 ---

a friend points out that MATLAB is not good for having many experimental projects that use different version of shared analysis code that evolves over time;

its PATH system is not amenable to having reusable local folders of code with subfolders -- it's difficult to do anything besides just put all the .m files in each 'packages folder
e's not sure if it has a pkging system but e suspects that it wouldn't be up to snuff for dealing with the versioning of the dependencies (e.g. maybe project A and B both use dependency X but A uses version 1 and B uses version 2 of X)

e also points out that it's the lack of autocomplete that drives short unreadable variable names in MATLAB (along with math's tendency to need lots of short-lived temporaries without obvious name-like meanings)

---

https://lobste.rs/s/5r5fsa/don_t_make_my_mistakes_common#c_qvn6kj

39 Garbi 7 hours ago

link

flag

Nobody knows how to correctly install and package Python apps.

That’s a relief. I thought I was the only one.

    6
    KevinMGranger 7 hours ago | link | flag |

Maybe poetry and pyoxidize will have a baby and we’ll all be saved.

One can hope. One can dream.

~ andyc edited 5 hours ago

link

flag

I think the same goes for running Python web apps. I had a conversation with somebody here… and we both agreed it took us YEARS to really figure out how to run a Python web app. Compared to PHP where there is a good division of labor between hosting and app authoring.

The first app I wrote was CGI in Python on shared hosting, and that actually worked. So that’s why I like Unix – because it’s simple and works. But it is limited because I wasn’t using any libraries, etc. And SSL at that time was a problem.

Then I moved from shared hosting to a VPS. I think I started using mod_python, which is the equivalent of mod_php – a shared library within Apache.

Then I used a CherryPy? server and WSGI. (mod_python was before WSGI existed) I think it was behind Apache.

Then I moved to gunicorn behind nginx, and I still use that now.

But at the beginning of this year, I made another small Python web app with Flask. I managed to configure it on shared hosting with FastCGI?, so Python is just like PHP now!!! (Although I wouldn’t do this for big apps, just personal apps).

So I went full circle … while all the time I think PHP stayed roughly the same :) I just wanted to run a simple app and not mess with this stuff.

There were a lot of genuine improvements, like gunicorn is better than CherryPy?, nginx is easier to config than Apache, and FastCGI? is better than CGI and mod_python … but it was a lot of catching up with PHP IMO. Also FastCGI? is still barely supported.

~ strugee 3 hours ago

link

flag

I’d make an exception to this point: “…unless you’re already a Python shop.” I did this at $job and it’s going okay because it’s just in the monorepo where everyone has a Python toolchain set up. No installation required (thank god).

~ tobin_baker 3 hours ago

link

flag

Why is Python’s packaging story so much worse than Ruby’s? Is it just that dependencies aren’t specified declaratively in Python, but in code (i.e. setup.py), so you need to run code to determine them?

    ~
    singpolyma 3 hours ago | link | flag |

Gemfile and gemspec are both just ruby DSLs and can contain arbitrary code, so that’s not much different.

One thing is that pypi routinely distributes binary blobs that can be built in arbitrarily complex ways called “wheels” whereas rubygems always builds from source.

    ~
    tobin_baker 3 hours ago | link | flag |

Thanks for the explanation, so what is the fundamental unfixable issue behind Python’s packaging woes?

    ~
    giffengrabber 23 minutes ago | link | flag |

        I could be wrong but AFAICT it doesn’t seem to be the case that the Ruby crowd has solved deployment and packaging once and for all.

~ technomancy 47 minutes ago

link

flag

    I dunno; if it were me I’d treat Ruby exactly the same as Python. (Source: worked at Heroku for several years and having the heroku CLI written in Ruby was a big headache once the company expanded to hosting more than just Rails apps.)

~ lattera 3 hours ago

link

flag

I just run pkg install some-python-package-here using my OS’s package manager. ;-P

It’s usually pretty straightforward to add Python projects to our ports/package repos.

    ~
    icefox 3 hours ago | link | flag |

Speaking from experience, that works great up until it doesn’t. I have “fond” memories of an ex-coworker who developed purely on Mac (while the rest of the company at the time was a Linux shop), aggressively using docker and virtualenv to handle dependencies. It always worked great on his computer! Sigh. Lovely guy, but his code still wastes my time to this day.

    ~
    lattera 3 hours ago | link | flag |

I guess I’m too spoiled by BSD where everything’s interconnected and unified. The ports tree (and the package repo that is built off of it) is a beauty to work with.

    ~
    icefox 3 hours ago | link | flag |

I mean, we used Ubuntu, which is pretty interconnected and unified. (At the time; they’re working on destroying that with snap.) It just often didn’t have quiiiiiite what we, or at least some of us, wanted and so people reached for pip.

    ~
    lattera 3 hours ago | link | flag |

Yeah. With the ports tree and the base OS, we have full control over every single aspect of the system. With most Linux distros, you’re at the whim of the distro. With BSD, I have full reign. :-)

    ~
    giffengrabber 20 minutes ago | link | flag |

                    But it could still be the case that application X requires Python 3.1 when application Y requires Python 3.9, right? Or X requires version 1.3 of library Z which is not backwards compatible with Z 1.0, required by Y?

~ joshbuddy 2 hours ago

link

flag

Fwiw, I’ve had good luck using Pyinstaller to create standalone binaries. Even been able to build them for Mac in Circleci.

---

https://blog.rust-lang.org/2022/01/13/Rust-1.58.0.html#reduced-windows-command-search-path

---

https://derw.substack.com/p/designing-an-ml-family-language-on

---

Conditional compilation in Ci " is modeled after C#. Conditional compilation symbols can only be given on the cito command line. Conditional compilation symbols have no assigned value, they are either present or not.

Example:

if MY_SYMBOL MyOptionalFunction?();
endif

A more complicated one:

if WINDOWS DeleteFile?(filename);
elif LINUX

UNIX

    unlink(filename);

else UNKNOWN OPERATING SYSTEM!
endif

The operators allowed in #if and #elif are !, &&,

" -- [6]

, == and !=. You may reference true, which is a symbol that is always defined. false should be never defined.

---

C/C++ #embed directive [7]

---

	Hacker News new | threads | past | comments | ask | show | jobs | submit 	bshanks (3330) | logout
	
	coolsunglasses on Aug 14, 2013

parent

context

favorite

on: The Future of Programming in Node.js

You have not used a good module system. Clojure's namespace system for example is really nice.

1 point by bshanks on Aug 15, 2013 [–]

What do you like about Clojure's system?

coolsunglasses on Aug 15, 2013

parent [–]

It's full, proper namespaces. Like the best of Python and Java but simpler, more powerful, and more general.

---

I’m curious if anyone has encountered any good solutions for “dependency hell”. Let’s assume that you are using a language with a vibrant community, and libraries aren’t massive monoliths (i.e. lots of libraries, composed of lots of other libraries, with a complex dependency graph). Do any language handle this well? Or if not languages, do any library managers handle it well? (Or …?)

It’s something I’ve lived through more than a few times, so I have my own proposed solution to it, but I’m curious how other people attack it.

    8
    chenghiz 21 hours ago (unread) | link | flag |

NPM (javascript) solves this problem by installing transitive dependencies underneath the libraries that depend on them, and preferring to resolve dependencies from a peer level node_modules folder first and walking towards root if a match is not found. So if I depended on left-pad, and also another library that depended on a different (non-semver compatible) version of left-pad, when I install dependencies the directory would look like

myapp

node_modules
left-pad@2.0.0
other-library
node_modules
left-pad@1.3.0

and my application would resolve left-pad@2.0.0, but other-library would resolve it to left-pad@1.3.0.

This results in the massive node_modules directories that everyone loves to complain about, but it does solve the problem and disk space is cheap.

~ jjdh 21 hours ago (unread)

link

flag

It’s perhaps too early to tell, but my experience is that the Go ecosystem has done well here. I don’t know if I’ve just happened to be lucky in picking libraries maintained by careful people, or if it’s actually some real difference. But, I’ve had a good experience, compared to other languages I’ve done significant amounts of code in.

Random thoughts that may affect experience

    Go encourages duplication over coupling, which perhaps leads to fewer relationships in the dependency graph / larger libraries on average?
    Go itself has touted the backwards compat guarantee of the 1.x language, and the community repeats that often
    The library dependency scheme they eventually landed at encourages, effectively, renaming a library if you break backwards compat (eg. the v1/v2/vN namespace scheme)
    The interface approach? You often see libraries declare SPIs rather than depending directly on external APIs, and then implementing shims, so things end up being less coupled, in a way?

~ sanxiyn 8 hours ago (unread)

link

flag

I think Rust works fine. Rust allows multiple versions of the same package in the dependency tree. There is no conflict because different versions get different name mangling.

    ~
    brendan 6 hours ago (unread) | link | flag |

Yeah… I think the alternative of keeping tabs on duplicated dependencies is by far preferable to being blocked on an impossible to resolve dependency conflict.

---