proj-oot-ootLibrariesNotes14

https://github.com/millsp/ts-toolbelt rec. by https://news.ycombinator.com/item?id=23122438

---

rozab on May 4, 2020 [–]

Other great tools in the python terminal sphere are colorama, blessings and urwid.

Colorama is just for cross platform colouring, blessings is a very elegant wrapper over curses which is still useful for not-fullscreen things, and urwid is a full-blown widget library for TUI stuff.

jquast on May 5, 2020 [–]

please also consider blessed, an API-compatible fork of blessings that adds Windows 10 support, 24-bit color, keyboard input, and more. https://blessed.readthedocs.io/en/latest/intro.html#brief-ov...

jnwatson on May 5, 2020 [–]

Don’t forget prompt-toolkit. Notably it is a dependency of the truly excellent IPython shell.

also rich

---

things to look at for dates:

https://js-joda.github.io/js-joda/

commentary from moment.js, which is popular:

https://momentjs.com/docs/#/-project-status/recommendations/

https://www.npmtrends.com/date-fns-vs-dayjs-vs-js-joda-vs-luxon https://www.npmtrends.com/date-fns-vs-js-joda-vs-luxon-vs-moment

https://medium.com/swlh/best-moment-js-alternatives-5dfa6861a1eb

https://www.skypack.dev/blog/2021/02/the-best-javascript-date-libraries/

https://blog.logrocket.com/more-alternatives-to-moment-js/

https://news.ycombinator.com/item?id=27661667 https://2ality.com/2021/06/temporal-api.html

usually i would say just go thru these and take the simplified/sorta-intersection but here my lack of experience with date issues makes me reluctant. Still, that's probably the best option, because we need to end up with something small.

one thing that ppl mention about some newer ones (eg temporal) and dislike about some older ones (eg moment) is immutability.

---

hyc_symas on April 14, 2020 [–]

The standard string library is still pretty bad. This would have been a much better addition for safe strcpy.

Safe strcpy

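    /* Copy s into [d, e) and return the new write position, so calls chain.
       If the buffer fills up (return value == e) no terminating NUL is written. */
    char *stecpy(char *d, const char *s, const char *e)
    {
     while (d < e && *s)
      *d++ = *s++;
     if (d < e)
      *d = '\0';
     return d;
    }
    int main(void) {
      char buf[64];
      char *ptr, *end = buf+sizeof(buf) ;
      ptr = stecpy(buf, "hello", end);
      ptr = stecpy(ptr, " world", end);
      return ptr == end; /* nonzero exit status if the buffer filled up */
    }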

Existing solutions are still error-prone, requiring continual recalculation of buffer len after each use in a long sequence, when the only thing that matters is where the buffer ends, which is effectively a constant across multiple calls.

What are the chances of getting something like this added to the standard library?

pascal_cuoq on April 14, 2020 [–]

For what it's worth, I personally like this approach, because there are some cases in which it requires less arithmetic in order to be used correctly. And it lends itself better to some forms of static analysis, for similar reasons, in the following sense:

There is the problem of detecting that the function overflows despite being a “safe” function. And there is the problem of precisely predicting what happens after the call, because there might be an undefined behavior in that part of the execution. When writing to, say, a member of a struct, you pass the address of the next member and the analyzer can safely assume that that member and the following ones are not modified. With a function that receives a length, the analyzer has to detect that if the pointer passed points 5 bytes before the end of the destination, the accompanying size is 5, if the pointer points 4 bytes before the end the accompanying size is 4, etc.

This is a much more difficult problem, and as soon as the analyzer fails to capture this information, it appears that the safe function a) might not be called safely and b) might overwrite the following members of the struct.

a) is a false positive, and b) generally implies tons of false positives in the remainder of the analysis.

(In this discussion I assume that you want to allow a call to a memory function to access several members of a struct. You can also choose to forbid this, but then you run into a different problem, which is that C programs do this on purpose more often than you'd think.)

msebor on April 14, 2020 [–]

There are many improved versions of string APIs out there, too many in fact to choose from, and most suffer from one flaw or another, depending on one's point of view. Most of my recent proposals to incorporate some that do solve some of the most glaring problems and that have been widely available for a decade or more and are even parts of other standards (POSIX) have been rejected by the committee. I think only memccpy and strdup and strndup were added for C2X. (See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2349.htm for an overview.)

AceJohnny2 on April 14, 2020 [–]

> Most of my recent proposals [...] have been rejected by the committee.

Does anyone have insight on why?

saagarjha on April 15, 2020 [–]

memccpy is a very welcome addition in the front of copying strings; what else were you thinking of proposing?

saagarjha on April 15, 2020 [–]

I recently looked at a number of string copying functions, as well as came up with an API a bit similar to yours: https://saagarjha.com/blog/2020/04/12/designing-a-better-strcpy/ (mine indicates overflow more clearly). memccpy, which is coming in C2X, makes designing these kinds of things finally possible.
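
A rough sketch of the memccpy idea, for reference. This is not the API from the linked post: `copy_until_end` is a hypothetical name, and the end-pointer convention is borrowed from stecpy above. It assumes a libc that exposes POSIX memccpy.

```c
#define _XOPEN_SOURCE 700
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into [dst, end).
   Returns a pointer to the written NUL (so the next append can start there),
   or NULL if the string did not fit. The buffer is NUL-terminated either way. */
char *copy_until_end(char *dst, const char *src, char *end)
{
    if (dst == NULL || dst >= end)
        return NULL;                      /* earlier failure or no room left */

    size_t room = (size_t)(end - dst);
    char *after_nul = memccpy(dst, src, '\0', room);
    if (after_nul == NULL) {              /* no NUL copied within room bytes */
        end[-1] = '\0';                   /* truncate but keep it a C string */
        return NULL;
    }
    return after_nul - 1;                 /* points at the terminating NUL */
}
```

A NULL result propagates through later calls, so a chain of appends can be checked once at the end.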

doublesCs on April 14, 2020 [–]

What's wrong with:

spc476 on April 14, 2020 [–]

Well, that should be `snprintf()` to start with, but even with that, there are issues. The return type of `snprintf()` is `int`, so it can return a negative value if there was some error, so you have to check for that case. That out of the way, a positive return value is (and I'm quoting from the man page on my system) "[i]f the output was truncated due to this limit then the return value is the number of characters which would have been written to the final string if enough space had been available." So to safely use `snprintf()` the code would look something like:

    int size = snprintf(NULL,0,"some format string blah blah ...");
    if (size < 0) error();
    if (size == INT_MAX)
      error(); // because we need one more byte to store the NUL byte
    size++;
    char *p = malloc(size);
    if (p == NULL)
      error();
    int newsize = snprintf(p,size,"some format string blah blah ...");
    if (newsize < 0) error();
    if (newsize >= size) // the return value excludes the NUL, so == size is truncated too
    {
      // ... um ... we still got truncated?
    }

Yes, using NULL with `snprintf()` if the size is 0 is allowed by C99 (I just checked the spec).

One thing I've noticed about the C standard library is that it seems averse to functions allocating memory (outside of `malloc()`, `calloc()` and `realloc()`). I wonder if this has something to do with embedded systems?
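
For reference, the same two-pass pattern as a complete function that takes real format arguments. A sketch only: `format_alloc` is a made-up name (glibc and the BSDs ship a non-standard asprintf that does much the same job), and it collapses every failure into a NULL return.

```c
#include <limits.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: format into a freshly malloc'ed string.
   Returns NULL on any formatting or allocation failure; caller frees. */
char *format_alloc(const char *fmt, ...)
{
    va_list args, args_copy;
    va_start(args, fmt);
    va_copy(args_copy, args);             /* each v*printf call consumes a list */

    int needed = vsnprintf(NULL, 0, fmt, args);   /* measuring pass */
    va_end(args);
    if (needed < 0 || needed == INT_MAX) {        /* error, or no room for NUL */
        va_end(args_copy);
        return NULL;
    }

    size_t size = (size_t)needed + 1;
    char *p = malloc(size);
    if (p != NULL) {
        int written = vsnprintf(p, size, fmt, args_copy);
        if (written < 0 || (size_t)written >= size) {   /* error or truncated */
            free(p);
            p = NULL;
        }
    }
    va_end(args_copy);
    return p;
}
```

Converting the checked int to size_t once, right after the range check, keeps the signed/unsigned mixing in one place.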

 wahern on April 14, 2020 [–]

Perhaps you meant snprintf. But snprintf can fail on allocation failure, fail if the buffer size is > INT_MAX, and in general isn't very light weight--last time I checked glibc, snprintf was a thin wrapper around the printf machinery and is not for the faint of heart--e.g. initializing a proxy FILE object, lots of malloc interspersed with attempts to avoid malloc by using alloca.

It can also fail on bad format specifiers--not directly relevant here except that it forces snprintf to have a signed return value, and mixing signed (the return value) and unsigned (the size limit parameter) types is usually bad hygiene, especially in interfaces intended to obviate buffer overflows.

---

on flaws in locales

https://github.com/mpv-player/mpv/commit/1e70e82baa9193f6f027338b0fab0f5078971fbe

---

eqvinox on April 14, 2020 [–]

I haven't read most of that rant, but a thread-local setlocale() would be a godsend. Not sure if that's ISO C or POSIX though.

wahern on April 14, 2020 [–]

POSIX has added _l variants taking a locale_t argument to all the relevant string functions. I can see how per-thread state would be convenient, but it's not a comprehensive solution. With the _l variants you can write your own wrappers that pass a per-thread locale_t object.

EdSchouten on April 23, 2020 [–]

That's uselocale().
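
A sketch of what that looks like in practice: POSIX.1-2008 uselocale() switches the locale of the calling thread only, leaving the global setlocale() state alone. The helper name `parse_double_c_locale` is made up, and this assumes a POSIX system whose <locale.h> declares newlocale/uselocale/freelocale (e.g. glibc or musl).

```c
#define _XOPEN_SOURCE 700
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: parse a double with "." as the decimal point,
   whatever the process-wide locale is. Returns 0 on success, -1 on failure. */
int parse_double_c_locale(const char *s, double *out)
{
    locale_t c_loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return -1;

    locale_t prev = uselocale(c_loc);     /* affects this thread only */
    char *end;
    double v = strtod(s, &end);
    uselocale(prev);                      /* restore before freeing c_loc */
    freelocale(c_loc);

    if (end == s || *end != '\0')         /* nothing parsed, or trailing junk */
        return -1;
    *out = v;
    return 0;
}
```

Creating and freeing the locale_t on every call is wasteful; a real wrapper would probably cache one per thread, which is the approach wahern describes for the _l variants.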

---

hsivonen on April 14, 2020 [–]

Does the committee have plans to deprecate (as in: give compiler license to complain such that compiler developers can appeal to the standard when users complain back) locale-sensitive functions like isdigit, which is useless for processing protocol syntax, because it is locale-sensitive, and useless for processing natural-language text, because it examines only one UTF-8 code unit?
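
Whatever one concludes about isdigit itself, protocol parsers can sidestep the question with ASCII-only predicates. A minimal sketch; both function names are made up. It also avoids a separate <ctype.h> trap: passing a negative char value (other than EOF) to those functions is undefined behavior.

```c
#include <stdbool.h>

/* Hypothetical helper: true only for the ten characters '0'..'9'.
   C guarantees these are contiguous in the execution character set. */
bool is_ascii_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Hypothetical helper: numeric value of an ASCII digit, or -1 otherwise. */
int ascii_digit_value(char c)
{
    return is_ascii_digit(c) ? c - '0' : -1;
}
```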

---

https://www.reddit.com/r/ProgrammingLanguages/comments/ojgr01/a_better_name_for_monad/

---

https://github.com/nvim-lua/plenary.nvim

---

buckminster on Feb 28, 2020 [–]

Rust still doesn't get this right. If I'm calling an NFS library, say, on Windows I need to use UNIX paths. Rust needs WindowsString and UnixString on every platform, with OsString as a synonym for whichever is most useful locally.

foota on Feb 29, 2020 [–]

In that case... You wouldn't be using the rust file system libraries though, right?

It seems like the simplest definition of an OsString is "the type used to interact with the OS file system API as implemented in rust".

buckminster on Feb 29, 2020 [–]

Rust has a policy of keeping the standard library minimal and this is completely reasonable. But sometimes they overdo it. In this case it's nuts that I need to implement my own UnixString because the standard library doesn't expose it, and when I run on Linux I have two incompatible versions of the same thing.

Another example: I wrote a command line app which takes a hostname/ip address plus an optional port number after a colon. And the whole thing's async using tokio. The way the hostname/IP address parsing is structured in tokio and the standard library meant I had to reimplement all of it to add the port number. This all feels like more effort than it should be.

---

"Other than that, BeOS system kits are elegant, so are its APIs for drivers and filesystems. Linux does instead have a huge, ugly, syscall interface."

---

"completion-based APIs are generally “better” and finally coming to Linux" [1]

---

https://tokio.rs/blog/2021-07-tokio-uring

viraptor edited 3 hours ago

There’s also hugely popular https://github.com/libuv/libuv/pull/2322

(Since libuv isn’t modular it hasn’t officially landed yet, but the way I understand it, both projects are at about the same level of completion)

---

https://www.cmyr.net/blog/gui-framework-ingredients.html https://lobste.rs/s/sjht7g/so_you_want_write_gui_framework

http://www.cmyr.net/blog/rust-gui-infra.html

---

https://github.com/emilk/egui egui: an easy-to-use immediate mode GUI in pure Rust https://news.ycombinator.com/item?id=28166663

---

binarynate 2 hours ago [–]

I'm a long-time Node user but decided to use Deno for a personal project and found that it's a pleasure to work with. Some highlights:

I haven't yet used Deno in a commercial product, but I believe I will in the future because Deno seems ready for primetime.

  [0]https://doc.deno.land/builtin/stable#Deno.readTextFile
  [1]https://deno.land/manual/standard_library

---

 kaycebasques 6 hours ago [–]

Putting aside the other controversial things that Graham says in this post, I think we could find common ground and have some fun technical discussion by focusing on this angle:

> What can you say in this language that would be impossibly inconvenient to say in others?

The first experience that comes to mind for me was XSLT. Early on in my technical writing career, I needed to convert the Doxygen output of a C library to a more barebones HTML output (so that the reference docs would have the same branding as the rest of our site). I used XSLT to extract only the bits of HTML that I needed and I transformed them into nice, straightforward, semantic HTML. It took me a while to wrap my head around XSLT's flow but I was amazed to find that a task that would have taken probably 50-100 lines of looping and condition checking could be accomplished in literally 1-3 lines of XSLT.

Would love to hear others' experiences along these lines (i.e. concrete examples).

sthatipamala 6 hours ago [–]

The whole tidyverse set of packages for R. It's a DSL for data analysis that makes heavy use of R macros. The vocabulary and mental model that it has about data transformation is superior to any other analysis package I've used.

Pandas/Numpy is close but R's macro/custom operators make everything much more seamless.

Const-me 39 minutes ago [–]

Reflection combined with runtime code generation is very powerful in C#. Here's a moderately complicated example https://github.com/Const-me/ComLightInterop/tree/master/ComL...

---

https://janet-lang.org/api/index.html

---

https://en.wikipedia.org/wiki/Graphics_BASIC

--- os ipc io

https://arcan-fe.com/2021/09/20/arcan-as-operating-system-design/

---

https://github.com/cksystemsteaching/selfie/blob/50ac893bd6b9c7b88e222d10d3ac931e01e1832c/machine/include/tinycstd.h and mb some other stuff in https://github.com/cksystemsteaching/selfie/blob/main/machine/ https://github.com/cksystemsteaching/selfie/blob/50ac893bd6b9c7b88e222d10d3ac931e01e1832c/machine/include/syscalls.h https://github.com/cksystemsteaching/selfie/blob/50ac893bd6b9c7b88e222d10d3ac931e01e1832c/machine/sbi_ecall.c

libcstar: starts at line 2454 with comment L I B R A R Y https://github.com/cksystemsteaching/selfie/blob/7e7738baa9600d5045d2bc06385b09bafacbde1b/selfie.c

---

By the way, Rust’s PathBuf is just a wrapper over OsString. There’s nothing fancy under the hood: https://doc.rust-lang.org/src/std/path.rs.html#1076-1078

quasi_qua_quasi edited 1 month ago

The problem isn’t the underlying implementation, the problem is that paths are not strings, they just happen to be represented as them. It’d be like if Rust used Vec<u8> as its string type instead of String/str, or if instead of std::time::Instant you had u32 or whatever.

soc edited 1 month ago

The point is that OsString has different implementations based on what the underlying operating system's APIs use.¹

This is a requirement for many cases including “OS paths allow a superset of bytes than what would be valid in the languages’ string encoding” down to avoiding “we helpfully converted the OS paths to UTF-8 for you and now we can’t find the file using that string anymore, because OS path → language string → OS path doesn’t result in the same bytes”.

¹ https://doc.rust-lang.org/std/ffi/struct.OsString.html

icefox 1 month ago

Rust’s PathBuf is also a gigantic pain in the ass to use. Paths should be lists of path components, IMO. The string is just a serialization format.

soc 1 month ago

Not to mention that the semantics of Path::join (i. e. PathBuf::push) are just crazy.

Yeah, I want to have an operation that does two completely different things without telling me which one actually happened! /s
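
For context: Rust documents that Path::join (and PathBuf::push) replaces the base entirely when the argument is absolute, and appends otherwise. Below is a sketch in C of the alternative where the absolute case is an error instead of a silent replacement. `path_join_relative` is a hypothetical helper that only understands '/'-separated paths.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "base/rel" into out.
   Returns 0 on success, -1 if rel is absolute or the result does not fit. */
int path_join_relative(char *out, size_t out_size,
                       const char *base, const char *rel)
{
    if (rel[0] == '/')
        return -1;                        /* refuse rather than replace base */

    size_t base_len = strlen(base);
    const char *sep =
        (base_len == 0 || base[base_len - 1] == '/') ? "" : "/";

    int n = snprintf(out, out_size, "%s%s%s", base, sep, rel);
    if (n < 0 || (size_t)n >= out_size)
        return -1;                        /* formatting error or truncation */
    return 0;
}
```

With this shape the caller has to decide explicitly what an absolute argument should mean, which is the complaint above turned into an API constraint.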

quasi_qua_quasi 1 month ago

What would the type of the individual components be?

soc 1 month ago

I built something like this a while ago:

I had AbsolutePaths and RelativePaths (to prevent invalid path operations at compile-time) with PathSegments that were either OsStrings or placeholders like <ROOT_DIR>, <HOME_DIR>, <CACHE_DIR> etc. (that the library understood to serialize and deserialize such that you could e. g. use these paths in config files without having to manually implement this for each use-case).

---

https://lib.rs/crates/stdx

---

pandas discussion https://news.ycombinator.com/item?id=22187121

---

http://trove4j.sourceforge.net/html/overview.html

---

https://www.fantom.org/doc/docIntro/WhyFantom describes at the bottom some learnings from Java, and also some ways in which their stdlib is more elegant

---

https://github.com/willmcgugan/textual

---

https://mdk.fr/blog/how-apt-does-its-fancy-progress-bar.html

https://lobste.rs/s/bpiqol/how_apt_does_its_fancy_progress_bar#c_8dndre "I remember when I was working on mtm, I was forever questing for the smallest commonly-distributed terminfo definition that was “useful”. That was the Sun color console terminfo definition (sun-color).

(I wanted mtm to emulate an already-existing terminal so that you wouldn’t have to install a new terminfo and have it “just work”. I ended up going with screen.) " "By default, mtm advertises itself as the widely-available screen-bce terminal type."

---

 "
Introduction

FreeRTOS versions prior to V9.0.0 allocate the memory used by the RTOS objects listed below from the special FreeRTOS heap. FreeRTOS V9.0.0 and onwards gives the application writer the ability to instead provide the memory themselves, allowing the following objects to optionally be created without any memory being allocated dynamically:

    Tasks
    Software Timers
    Queues
    Event Groups
    Binary Semaphores
    Counting Semaphores
    Recursive Semaphores
    Mutexes" [2]
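
The general pattern (caller supplies the memory, the library never allocates) can be sketched in plain C. This is not the FreeRTOS API; `static_queue` and its functions are hypothetical names used only to show the shape.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity FIFO whose storage belongs to the caller. */
typedef struct {
    unsigned char *storage;   /* caller-owned, capacity * item_size bytes */
    size_t item_size;
    size_t capacity;
    size_t head;              /* index of the oldest item */
    size_t count;             /* number of items currently queued */
} static_queue;

void static_queue_init(static_queue *q, void *storage,
                       size_t item_size, size_t capacity)
{
    q->storage = storage;
    q->item_size = item_size;
    q->capacity = capacity;
    q->head = 0;
    q->count = 0;
}

bool static_queue_push(static_queue *q, const void *item)
{
    if (q->count == q->capacity)
        return false;                     /* full: fail, never allocate */
    size_t tail = (q->head + q->count) % q->capacity;
    memcpy(q->storage + tail * q->item_size, item, q->item_size);
    q->count++;
    return true;
}

bool static_queue_pop(static_queue *q, void *out)
{
    if (q->count == 0)
        return false;
    memcpy(out, q->storage + q->head * q->item_size, q->item_size);
    q->head = (q->head + 1) % q->capacity;
    q->count--;
    return true;
}
```

Both the control struct and the backing array can then live in static storage or on the stack, so worst-case memory use is known at link time.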

--- time date datetime

https://ijmacd.github.io/rfc3339-iso8601/

---

https://git.suckless.org/sbase/file/README.html

technetium edited 25 hours ago

babymosesinabasket, was not expecting to see Google use suckless.org’s sbase, all with their minimalist software and quirky argument parsing (ahh, arg.h, my favorite) in the name of “simplicity” (since when did Google care about that?). Maybe because it’d be easier to port, being much smaller than, say, Busybox or GNU coreutils?

ianloic 20 hours ago

We also use a fork of musl for our libc. When you’re really not unix it’s way easier to get something very simple up and running. The things I did to get opensshd running are not pretty.

(I work on Fuchsia, I do not speak for my employer, etc)

---

https://github.com/nuta/kerla/blob/2a76ec27b43095607c516d6a31407533ab24de0f/kernel/syscalls/mod.rs

" Implements *NIX process concepts: context switching, signals, fork(2), execve(2), wait4(2), etc. Supports commonly used system calls like write(2), stat(2), mmap(2), pipe(2), poll(2), ... " [3]

---

https://www.boringcactus.com/2021/10/24/2021-survey-of-rust-gui-libraries.html

---

" Open source RTOS ports on RISC-V: Nitin Deshpande

    Ported FreeRTOS, MyNewt, and Huawei LiteOS.
    FreeRTOS: 32-bit version running on a RISC-V soft processor, 64-bit currently runs on Spike.
    MyNewt: RISC-V support was already available, added the BSP and MCU/HAL support.
    LiteOS: ported the kernel, BSP, and HAL. Already merged into upstream LiteOS GitHub.
    Had a positive experience with RISC-V.
    Mi-V is an ecosystem that aims to accelerate the adoption of RISC-V.

" [4]

---

https://devlog.hexops.com/2021/mach-engine-the-future-of-graphics-with-zig https://en.wikipedia.org/wiki/GLFW

---

semi-standardized exit codes:

https://www.freebsd.org/cgi/man.cgi?query=sysexits https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Process%20Exit%20Codes
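
A small sketch of using the sysexits codes from C. It assumes a libc that ships <sysexits.h> (the BSDs and glibc do); `run` is a hypothetical function, and the choice of which failure maps to which code is only an illustration.

```c
#include <stdio.h>
#include <sysexits.h>

/* Hypothetical CLI body: returns the exit status the process should use. */
int run(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s FILE\n", argc > 0 ? argv[0] : "prog");
        return EX_USAGE;                  /* 64: command line usage error */
    }

    FILE *f = fopen(argv[1], "r");
    if (f == NULL) {
        perror(argv[1]);
        return EX_NOINPUT;                /* 66: cannot open input */
    }

    fclose(f);
    return EX_OK;                         /* 0: success */
}
```

A main() would then be just `return run(argc, argv);`.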

---

https://github.com/klange/toaruos/blob/master/kernel/sys/syscall.c

---

"Also due to JS having a very small stdlib. Compare built-in functions to eg Kotlin, it's a different world. "

---

"Emacs’s internal APIs are much more powerful than before. One can see that Clojure certainly influenced a few of them (e.g. if-let, thread-first, seq.el, map.el, etc)."

---

" For example, despite the somewhat harsh words I had for JavaScript earlier in this post, JS has some excellent utility packages. The current market leader, lodash, is a direct or indirect dependency of nearly 9 out of 10 JS packages and does a very good job of aggregating many common functions that might otherwise be micro packages. As large as the dependency trees are in JavaScript-land, they’d doubtless be even larger without lodash, underscore, and similar utility packages. "

---

polybox fantasy game 3d cgi api:

" Basic draw

    void BeginObject(EPrimitiveType type);
    void EndObject();
    void Vertex(Vec3f vec);
    void Color(Vec4f col);
    void TexCoord(Vec2f tex);
    void Normal(Vec3f norm);
    void SetClearColor(Vec3f color);

Transforms

    void MatrixMode(EMatrixMode mode);
    void Perspective(float screenWidth, float screenHeight, float nearPlane, float farPlane, float fov);
    void Translate(Vec3f translation);
    void Rotate(Vec3f rotation);
    void Scale(Vec3f scaling);
    void Identity();

Texturing

    void BindTexture(const char* texturePath);
    void UnbindTexture();

Lighting

    void NormalsMode(ENormalsMode mode);
    void EnableLighting(bool enabled);
    void Light(int id, Vec3f direction, Vec3f color);
    void Ambient(Vec3f color);

Depth Cueing

    void EnableFog(bool enabled);
    void SetFogStart(float start);
    void SetFogEnd(float end);
    void SetFogColor(Vec3f color); " -- [5]

---

ztherion · 1 day ago

Anything in particular you miss from Bank Python when working on mainline Python programs?

timlardner · 1 day ago

We build our own grid compute API. You could scale out code across multiple compute nodes with one extra line:

    with grid_compute() as ctx:
        ctx.map(my_func, my_iterable)

Because code was accessed via the network, you didn't need to worry about ensuring that your code was deployed on whatever remote host you were using. The API would handle scheduling work from different users on the same compute nodes and these jobs may be running code from different git branches.

I've worked with similar APIs in the past, but never anything as seamless.

-- [6]

---

" 2. New http.MaxBytesHandler middleware ... MaxBytesHandler returns a Handler that runs h with its ResponseWriter and Request.Body wrapped by a MaxBytesReader.

The use case for this is if you’re exposing your server directly to the internet, you may want to put a cap on how large of requests you’ll process to avoid denial of service attacks. This could already be done inside a handler with http.MaxBytesReader, but by enforcing a limit at the middleware level, now you can ensure that it’s not accidentally forgotten in some neglected corner of your web server.

3. Unreasonably effective strings.Cut function ... strings.Cut is similar to str.partition in Python. It cuts a string into two pieces at the first place it finds the separator substring. As Russ Cox wrote in the issue introducing the function: " -- [7]
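
To see how small the operation is, here is a sketch of a Cut-like function in C. `str_slice` and `str_cut` are hypothetical names; the slices are views into the input, so nothing is copied or allocated. As with strings.Cut, when the separator is missing the whole input comes back as the first piece.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical non-owning view into a string. */
typedef struct {
    const char *ptr;
    size_t len;
} str_slice;

/* Split s around the first occurrence of sep.
   Returns true if sep was found; otherwise before is all of s, after is empty. */
bool str_cut(const char *s, const char *sep,
             str_slice *before, str_slice *after)
{
    const char *hit = strstr(s, sep);
    before->ptr = s;
    if (hit == NULL) {
        before->len = strlen(s);
        after->ptr = s + before->len;     /* points at the terminating NUL */
        after->len = 0;
        return false;
    }
    before->len = (size_t)(hit - s);
    after->ptr = hit + strlen(sep);
    after->len = strlen(after->ptr);
    return true;
}
```

Only the `after` slice is NUL-terminated; `before` has to be used together with its length.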

---

https://libs.suckless.org/libgrapheme/ https://lobste.rs/s/7wx1z4/libgrapheme_suckless_unicode_string

---

linux switches rng to blake2 https://news.ycombinator.com/item?id=29742977

---

https://ariadne.space/2021/12/29/glibc-is-still-not-y2038-compliant-by-default/

---

https://blog.sunfishcode.online/port-std-to-rustix/

---

https://ebiten.org/blog/native_compiling_for_nintendo_switch.html

---

" I cut my teeth on AppleSoft BASIC in the 1980s. The only affordance for “structured programming” was GOSUB and the closest thing there was to an integrated assembler was a readily accessible system monitor where you could manually edit memory. The graphics primitives were extremely limited. (You could enable graphics modes, change colors, toggle pixels, and draw lines IIRC. You might have been able to fill regions, too, but I can’t swear to that.) For rich text, you could change foreground and background color. Various beeps were all you could do for sound, unless you wanted to POKE the hardware directly. If you did that you could do white noise and waveforms too. I don’t have enough time on the CoCo to say so with certainty, but I believe it was closer to the Apple experience than what you describe.

The thing that I miss about it most, and that I think has been lost to some degree, is that the system booted instantly to a prompt that expected you to program it. You had to do something else to do anything other than program the computer. That said, manually managing line numbers was no picnic. And I’m quite attached to things like visual editing and syntax highlighting these days. And while online help/autocomplete is easier than thumbing through my stack of paper documentation was, I might have learned more, more quickly, from that paper. "

---

https://github.com/plasma-umass/browsix/blob/master/src/syscall-api/table.ts

---

https://mchow.com/posts/2020-02-11-dplyr-in-python/

---

" private enum EsSyscallType {
	// Memory.
	ES_SYSCALL_MEMORY_ALLOCATE
	ES_SYSCALL_MEMORY_FREE
	ES_SYSCALL_MEMORY_MAP_OBJECT
	ES_SYSCALL_MEMORY_COMMIT
	ES_SYSCALL_MEMORY_FAULT_RANGE
	ES_SYSCALL_MEMORY_GET_AVAILABLE
	// Processing.
	ES_SYSCALL_EVENT_CREATE
	ES_SYSCALL_EVENT_RESET
	ES_SYSCALL_EVENT_SET
	ES_SYSCALL_PROCESS_CRASH
	ES_SYSCALL_PROCESS_CREATE
	ES_SYSCALL_PROCESS_GET_STATE
	ES_SYSCALL_PROCESS_GET_STATUS
	ES_SYSCALL_PROCESS_GET_TLS
	ES_SYSCALL_PROCESS_OPEN
	ES_SYSCALL_PROCESS_PAUSE
	ES_SYSCALL_PROCESS_TERMINATE
	ES_SYSCALL_SLEEP
	ES_SYSCALL_THREAD_CREATE
	ES_SYSCALL_THREAD_GET_ID
	ES_SYSCALL_THREAD_SET_TLS
	ES_SYSCALL_THREAD_SET_TIMER_ADJUST_ADDRESS
	ES_SYSCALL_THREAD_STACK_SIZE
	ES_SYSCALL_THREAD_TERMINATE
	ES_SYSCALL_WAIT
	ES_SYSCALL_YIELD_SCHEDULER
	// Windowing.
	ES_SYSCALL_MESSAGE_GET
	ES_SYSCALL_MESSAGE_POST
	ES_SYSCALL_MESSAGE_WAIT
	ES_SYSCALL_CURSOR_POSITION_GET
	ES_SYSCALL_CURSOR_POSITION_SET
	ES_SYSCALL_CURSOR_PROPERTIES_SET
	ES_SYSCALL_GAME_CONTROLLER_STATE_POLL
	ES_SYSCALL_EYEDROP_START
	ES_SYSCALL_SCREEN_WORK_AREA_SET
	ES_SYSCALL_SCREEN_WORK_AREA_GET
	ES_SYSCALL_SCREEN_BOUNDS_GET
	ES_SYSCALL_SCREEN_FORCE_UPDATE
	ES_SYSCALL_WINDOW_CREATE
	ES_SYSCALL_WINDOW_CLOSE
	ES_SYSCALL_WINDOW_REDRAW
	ES_SYSCALL_WINDOW_MOVE
	ES_SYSCALL_WINDOW_TRANSFER_PRESS
	ES_SYSCALL_WINDOW_FIND_BY_POINT
	ES_SYSCALL_WINDOW_GET_ID
	ES_SYSCALL_WINDOW_GET_BOUNDS
	ES_SYSCALL_WINDOW_SET_BITS
	ES_SYSCALL_WINDOW_SET_CURSOR
	ES_SYSCALL_WINDOW_SET_PROPERTY
	// IO.
	ES_SYSCALL_NODE_OPEN
	ES_SYSCALL_NODE_DELETE
	ES_SYSCALL_NODE_MOVE
	ES_SYSCALL_FILE_READ_SYNC
	ES_SYSCALL_FILE_WRITE_SYNC
	ES_SYSCALL_FILE_RESIZE
	ES_SYSCALL_FILE_GET_SIZE
	ES_SYSCALL_FILE_CONTROL
	ES_SYSCALL_DIRECTORY_ENUMERATE
	ES_SYSCALL_VOLUME_GET_INFORMATION
	ES_SYSCALL_DEVICE_CONTROL
	// Networking.
	ES_SYSCALL_DOMAIN_NAME_RESOLVE
	ES_SYSCALL_ECHO_REQUEST
	ES_SYSCALL_CONNECTION_OPEN
	ES_SYSCALL_CONNECTION_POLL
	ES_SYSCALL_CONNECTION_NOTIFY
	// IPC.
	ES_SYSCALL_CONSTANT_BUFFER_READ
	ES_SYSCALL_CONSTANT_BUFFER_CREATE
	ES_SYSCALL_PIPE_CREATE
	ES_SYSCALL_PIPE_WRITE
	ES_SYSCALL_PIPE_READ
	// Misc.
	ES_SYSCALL_HANDLE_CLOSE
	ES_SYSCALL_HANDLE_SHARE
	ES_SYSCALL_BATCH
	ES_SYSCALL_DEBUG_COMMAND
	ES_SYSCALL_POSIX
	ES_SYSCALL_PRINT
	ES_SYSCALL_SHUTDOWN
	ES_SYSCALL_SYSTEM_TAKE_SNAPSHOT
	ES_SYSCALL_PROCESSOR_COUNT
	// End.
	ES_SYSCALL_COUNT} " -- [8]

---

"I remember haskell's monads really clicking for me when using megaparsec for the first edition of advent of code, s"

---

" And that there is the bare minimum amount that you have to do in order to use strtol safely. This is, and I can't stress this enough, a horrible interface. Is there anything in there that's better? Well, if you're on BSD then the OpenBSD folks had your back in 2004 with the release of OpenBSD 3.6, which came with the shiny new strtonum function, which is itself just a wrapper around the awful interface described above that is much more usable and, dare I say it, sane? But alas, as with many things BSD even when they do something that is obviously better than Linux it just never really gets adopted there. And that company from Redmond? Well they've been happily overflowing buffers with C++ for some decades now and are too busy patching and force-rebooting to have noticed something that happened in a hobbyist operating system 18 years ago. It's not in the POSIX standard. It made it into the other BSD's, and into libbsd, but unless you want to make your C code non-portable best to not use it. Also, did I mention that it's just a wrapper around that crap function strtoll to begin with? Have I mentioned that BSD libc is in every way superior to glibc? But I digress.. " -- [9]

---
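
A sketch of the checks the strtol quote above calls the bare minimum, wrapped in a strtonum-like helper. `parse_long` is a made-up name, not the OpenBSD function; it accepts whatever strtol accepts in base 10 (leading whitespace, optional sign) and rejects everything else.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse a base-10 long in [min, max].
   Returns true and stores the value in *out only if the whole string parsed. */
bool parse_long(const char *s, long min, long max, long *out)
{
    char *end;
    errno = 0;                            /* strtol reports overflow via errno */
    long v = strtol(s, &end, 10);

    if (end == s)
        return false;                     /* no digits at all */
    if (*end != '\0')
        return false;                     /* trailing garbage such as "42abc" */
    if (errno == ERANGE)
        return false;                     /* does not fit in a long */
    if (v < min || v > max)
        return false;                     /* outside the caller's range */

    *out = v;
    return true;
}
```

Four separate checks for one conversion is the interface problem the quote is complaining about.

---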

" Unlike other formatting functions in C++ and C libraries, std::to_chars is locale-independent, non-allocating, and non-throwing. Only a small subset of formatting policies used by other libraries (such as std::sprintf) is provided. This is intended to allow the fastest possible implementation that is useful in common high-throughput contexts such as text-based interchange (JSON or XML). " there's also from_chars

---

https://github.com/nicferrier/emacs-kv

https://github.com/emacs-mirror/emacs/blob/master/lisp/emacs-lisp/map.el

---

BeetleB 13 minutes ago

First: Very convenient to provide a numerical program to showcase.

Mainly:

> Certainly things like I/O are still problematic for the novice programmer in these languages.

Poor I/O basically killed my interest in programming in most languages when I was a novice. BASIC was great for this. QuickBasic was, and probably still is king. People new to programming want to do fun stuff, and it's very hard to do without I/O. Me and most of my peers spent a fair amount of their hobby programming writing very simple games, or doing fun graphics stuff (drawing lines, circles, etc).

After doing this for a while in BASIC and QB, "serious" languages like C/C++ were pure Hell. I could literally do none of the cool things I used to with only a rudimentary command of the language. Did not know about Perl/Python until much later.

In my opinion, if you want learning languages to be fun, provide one where:

Teach people these things, and they can have a lot of fun with loops, conditionals, and functions.

---

https://www.reddit.com/r/emacs/comments/7dp6oa/whats_the_preferred_way_to_filter_a_list_in_elisp/

---

https://blog.nindalf.com/posts/rust-stdlib/ https://lobste.rs/s/ihy9ce/rust_has_small_standard_library_s_ok

---

https://wiki.alopex.li/Dllicious

most popular "DLLs" (shared objects) used on Linux, in ascending order of usage frequency:

    802 libXcursor.so.1
    809 libXrandr.so.2
    818 libdw.so.1
    837 libcairo.so.2
    848 libXi.so.6
    855 libxcb-render.so.0
    857 libxcb-shm.so.0
    873 libXfixes.so.3
    875 libelf.so.1
    877 libpixman-1.so.0
    928 libgdk_pixbuf-2.0.so.0
   1009 libharfbuzz.so.0
   1016 libgraphite2.so.3
   1027 libXrender.so.1
   1035 libdbus-1.so.3
   1060 libfontconfig.so.1
   1061 libbz2.so.1.0
   1113 libjpeg.so.62
   1158 libsystemd.so.0
   1191 libXext.so.6
   1244 libfreetype.so.6
   1256 libexpat.so.1
   1257 libuuid.so.1
   1298 libcap.so.2
   1310 liblz4.so.1
   1362 libbrotlidec.so.1
   1371 libbrotlicommon.so.1
   1443 libgcrypt.so.20
   1482 libgpg-error.so.0
   1507 libpng16.so.16
   1527 libzstd.so.1
   1569 libX11.so.6
   1595 libgio-2.0.so.0
   1693 libxcb.so.1
   1694 libXdmcp.so.6
   1702 libXau.so.6
   1803 libmount.so.1
   1871 libblkid.so.1
   1951 libbsd.so.0
   1952 libmd.so.0
   2035 libselinux.so.1
   2059 libpcre2-8.so.0
   2106 librt.so.1
   2217 libresolv.so.2
   2224 libstdc++.so.6
   2258 libgmodule-2.0.so.0
   2302 libgobject-2.0.so.0
   2506 liblzma.so.5
   2685 libglib-2.0.so.0
   2693 libpcre.so.3
   2953 libffi.so.8
   3851 libgcc_s.so.1
   3956 libz.so.1
   5764 libm.so.6
   7065 libdl.so.2
   7314 libpthread.so.0
  13320 /lib64/ld-linux-x86-64.so.2
  13435 libc.so.6
  13452 linux-vdso.so.1

---

benhoyt 16 hours ago

... Go's regexp package does have a couple of advantages over Python, Perl, and so on: 1) it's guaranteed linear time in the length of the input, regardless of the regex, see https://swtch.com/~rsc/regexp/regexp1.html, and 2) it's a relatively simple implementation.

bboreham 14 hours ago

I have done some optimisations in Go regex recently; I have a talk coming up on Saturday:

https://fosdem.org/2022/schedule/event/go_finite_automata/

This repo collects all the changes so you can try them out: https://github.com/grafana/regexp/tree/speedup#readme

benhoyt 14 hours ago

That's excellent! Those all look like pretty nice small code changes that all add up. I especially like the very small "avoid copy" change (https://go-review.googlesource.com/c/go/+/355789) that adds up to a 30% speedup on many benchmarks. I hope they get included in Go 1.19. Good work!

---

https://github.com/google/zx

---

https://www.snoyman.com/blog/2019/11/boring-haskell-manifesto/ https://github.com/commercialhaskell/rio#readme

---

filesystem api example https://github.com/littlefs-project/littlefs

---

https://lwn.net/SubscriberLink/888043/1bd384391190f7d1/ Python finally offloads some batteries

---

" eranation 1 day ago

Some less mathematical / core CS and more cloud / system engineering:

But some of the more day to day productivity boosters: know your development tools (including all keyboard shortcuts, including multi line editing, refactoring, grow shrink AST based selection, know all git / bash commands without googling…)

Knowing your programming language / framework deeply is also a supposedly simple one that just needs practice and dedication.

"

---

https://wiki.xxiivv.com/site/varvara.html

---

"Jd has one or more databases. A database has one or more tables. A table is rows (unnamed) and columns (named). Data in a column is of the same type (integer, float, datetime, fixed length characters, variable length characters, etc.). Tables can be joined with other tables to function as if they were that large table. Rows and columns can be retrieved by queries on the joined tables. Retrieved data from a query can be aggregated, grouped, and sorted. Rows can be deleted, updated, and inserted. "

---

https://wrapt.readthedocs.io/en/latest/

---

https://hpyproject.org/blog/posts/2021/03/hello-hpy/ https://news.ycombinator.com/item?id=26625398

---

https://eugenkiss.github.io/7guis/tasks/

---

https://raphlinus.github.io/rust/gui/2022/05/07/ui-architecture.html

"

pornel 3 days ago

The background for this is that most existing UI toolkits are a poor fit for Rust.

Rust doesn't like having shared mutable state, but event-based UIs have a global event loop and can mutate anything at any time.

Rust works best with strictly tree-shaped data structures, but event handlers turn trees of widgets into arbitrary webs with possibility of circular references.

Rust prefers everything thread-safe, but many UI toolkits can't even be touched from a "non-main" thread.

Rust's object-oriented programming features are pretty shallow, and Rust doesn't have inheritance. That makes it awkward to model most toolkits that have deep hierarchies with a base View type and a dozen Button subclasses.

So instead of retrofitting mutable single-threaded OOP to a functional multi-threaded language, there's a quest to find another approach for UIs. This change of approach has worked for games. Rust wasn't nice for "class Player extends Entity" design, but turned out to be a great fit for the ECS pattern.

jchw 3 days ago

The thing is, there do exist UI frameworks that prefer composition over inheritance and strictly tree shaped components where data only flows one way: they're all the rage on the web.

And that's great, because they give plenty of useful insight into things that work and don't work when designing UI frameworks this way.

To be fair, it's not exactly like nobody realized this. More than one Rust desktop UI framework is explicitly React inspired, and it's not like FRP-based UI was non-existent prior to being popularized in web frameworks. Still, all the same... I suspect the best answers for how to do good UI in Rust are not far away from this paradigm.

"

---

https://github.com/tezc/sc https://news.ycombinator.com/item?id=31404201

---

LightMachine on Sept 28, 2015

on: Show HN: Caramel – a modern syntax for the lambda ...

The point of ADT syntax is to allow deriving many functions on different data structures automatically. For example, lists and some functions can be defined as:

    cons      = (x list cons nil -> (cons x (list cons nil)))
    nil       = (cons nil -> (id nil))
    map       = (f list cons -> (list (comp cons f)))
    head      = (list -> (list (a b -> a) nil))
    tail      = (list cons nil -> (list (h t g -> (g h (t cons))) (const nil) (h t -> t)))
    zip_with  = (f a b -> ((left a) (right f b)))
        left  = (foldr (x xs cont -> (cont x xs)) (const []))
        right = (f -> (foldr (y ys x cont -> (cons (f x y) (cont ys))) (const (const []))))
    length    = (flip comp const)

The very same definitions could be written more tersely as just:

    List     = #{Cons Type * | Nil}
    cons     = (Ctor 0 List)
    nil      = (Ctor 1 List)
    head     = (Getter 0 0 List)
    tail     = (Getter 0 1 List)
    zip_with = (ZipWith List)
    length   = (Size List)

Where Ctor/Getter/ZipWith/Size are proper derivers. It is possible because all the information you need is there, but it is very hard to come up with those derivers and I currently only have some very bloated derivers for Scott encoded structures, and none for Church encoded structures yet.

tikhonj on Sept 28, 2015

Coincidentally, if you're thinking of adding pattern matching to the underlying calculus (not just as syntactic sugar), the book I mentioned in another comment (Pattern Calculus) is worth a look. The author explores what a language could look like if patterns were first-class and used to express arbitrary computations.

At the very least, it could be fun to have an extended version of the system (or a plugin of some sort, perhaps) using his ideas about pattern matching.

---

https://docs.rs/crossbeam/latest/crossbeam/index.html

--- rust async runtime

https://github.com/smol-rs/smol

---

https://www.csie.ntu.edu.tw/~cyy/courses/introCS/14fall/lectures/handouts/lec16_OS_4up.pdf

--- ppl like loguru over the python stdlib log module: https://news.ycombinator.com/item?id=31945564

--- pandas replacement

https://www.pola.rs/

---

wasm syscalls(?) in golang-on-wasm implementation: wasmMove wasmZero wasmDiv wasmTruncS wasmTruncU exitThread osyield usleep currentMemory growMemory wasmExit wasmWrite nanotime walltime scheduleCallback clearScheduledCallback getRandomData -- https://github.com/neelance/go/blob/13b0931dc3fa8c8a6ab403dbdae348978a53c014/src/runtime/sys_wasm.s

---

syscalls that this 'pledge' port to linux bothered to include:

"
stdio allows close, dup, dup2, dup3, fchdir, fstat, fsync, fdatasync, ftruncate, getdents, getegid, getrandom, geteuid, getgid, getgroups, getitimer, getpgid, getpgrp, getpid, getppid, getresgid, getresuid, getrlimit, getsid, wait4, gettimeofday, getuid, lseek, madvise, brk, arch_prctl, uname, set_tid_address, clock_getres, clock_gettime, clock_nanosleep, mmap (PROT_EXEC and weird flags aren't allowed), mprotect (PROT_EXEC isn't allowed), msync, munmap, nanosleep, pipe, pipe2, read, readv, pread, recv, poll, recvfrom, preadv, write, writev, pwrite, pwritev, select, send, sendto (only if addr is null), setitimer, shutdown, sigaction (but SIGSYS is forbidden), sigaltstack, sigprocmask, sigreturn, sigsuspend, umask, socketpair, ioctl(FIONREAD), ioctl(FIONBIO), ioctl(FIOCLEX), ioctl(FIONCLEX), fcntl(F_GETFD), fcntl(F_SETFD), fcntl(F_GETFL), fcntl(F_SETFL).

rpath (read-only path ops) allows chdir, getcwd, open(O_RDONLY), openat(O_RDONLY), stat, fstat, lstat, fstatat, access, faccessat, readlink, readlinkat, statfs, fstatfs.

wpath (write path ops) allows getcwd, open(O_WRONLY), openat(O_WRONLY), stat, fstat, lstat, fstatat, access, faccessat, readlink, readlinkat, chmod, fchmod, fchmodat.

cpath (create path ops) allows open(O_CREAT), openat(O_CREAT), rename, renameat, renameat2, link, linkat, symlink, symlinkat, unlink, rmdir, unlinkat, mkdir, mkdirat.

dpath (create special path ops) allows mknod, mknodat, mkfifo.

flock allows flock, fcntl(F_GETLK), fcntl(F_SETLK), fcntl(F_SETLKW).

tty allows ioctl(TIOCGWINSZ), ioctl(TCGETS), ioctl(TCSETS), ioctl(TCSETSW), ioctl(TCSETSF).

recvfd allows recvmsg(SCM_RIGHTS).

fattr allows chmod, fchmod, fchmodat, utime, utimes, futimens, utimensat.

inet allows socket(AF_INET), listen, bind, connect, accept, accept4, getpeername, getsockname, setsockopt, getsockopt, sendto.

unix allows socket(AF_UNIX), listen, bind, connect, accept, accept4, getpeername, getsockname, setsockopt, getsockopt.

dns allows socket(AF_INET), sendto, recvfrom, connect.

proc allows fork, vfork, kill, getpriority, setpriority, prlimit, setrlimit, setpgid, setsid.

thread allows clone, futex, and permits PROT_EXEC in mprotect.

id allows setuid, setreuid, setresuid, setgid, setregid, setresgid, setgroups, prlimit, setrlimit, getpriority, setpriority, setfsuid, setfsgid.

exec allows execve, execveat, access, faccessat. On Linux this also weakens some security to permit running APE binaries. However on OpenBSD they must be assimilated beforehand. On Linux, mmap() will be loosened up to allow creating PROT_EXEC memory (for APE loader) and system call origin verification won't be activated.

execnative allows execve, execveat. Can only be used to run native executables; you won't be able to run APE binaries. mmap() and mprotect() are still prevented from creating executable memory. System call origin verification can't be enabled. If you always assimilate your APE binaries, then this should be preferred.
" -- [10]

---

tui ui lib

https://github.com/charmbracelet/gum

---

"

    If you haven't learned these, you really should. Here are the functions, to give you something to search:
    For running sub-processes:
    Platform 	Function
    Windows 	CreateProcess
    Linux 	fork, exec
    For dynamic loading:
    Platform 	Function
    Windows 	LoadLibrary, GetProcAddress
    Linux 	dlopen, dlsym
    If you want to load code without using dynamic linking, you'll want to learn about virtual memory and mmap() (Linux) or VirtualAlloc() (Windows).
    By using both sub-process execution and dynamic loading, you can have applications e.g. invoke a compiler to build a dynamic library, then immediately load that library into the same application. This is one way you could allow your users to modify and extend your application while it stays running." -- [11]
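
A small sketch of the two mechanisms from the quote, using Python's standard wrappers rather than the raw C calls. It assumes a POSIX system, where `ctypes.CDLL(None)` behaves like `dlopen(NULL)` and attribute lookup behaves like `dlsym`; the helper names `run_child` and `load_symbol` are invented for this note.

```python
import ctypes
import os
import subprocess
import sys


def run_child(code):
    """Run a Python snippet in a child process and return its stdout.

    subprocess hides the platform call: fork + exec on Linux,
    CreateProcess on Windows.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def load_symbol(name):
    """Look up a C function by name in the running process.

    Assumption: POSIX only. CDLL(None) opens the current process image,
    and getattr() resolves the symbol by name.
    """
    process_handle = ctypes.CDLL(None)
    return getattr(process_handle, name)
```

For example, `run_child("print(6 * 7)")` returns the child's output as a string, and `load_symbol("getpid")()` calls the C library's `getpid` directly. The compile-then-load workflow from the quote would chain the two: run a compiler as a child process, then pass the resulting library path to `ctypes.CDLL`.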

---

https://github.com/effil/effil

---

https://wintercg.org/

---

don't use "linear congruent generators" for pseudorandomness:

k749gtnc9l3w 11 hours ago

Just like the dead salmon it is something that needs an article as a part of making it well known, but it looks like the issue is «please don’t use linear congruent generators in their pure form».

I failed to find a comparison of inter-generator differences with variance between re-runs with the same generator while the seed changes; is there one? They do rerun the simulations with slightly different simulation parameters (but apparently don’t bother to distinguish pure random variance between Monte-Carlo runs and the impact of minor parameter changes on random behaviours)

madhadron 1 hour ago

I have mixed feelings about papers like this.

On one side, this is old news. Like, sixty or seventy year old news. The quality of your random number generator is key, and at least in computational physics paying attention to its properties is drilled into you. And seriously, who uses a linear congruent generator these days? The stdlib may still have one, but you shouldn’t use it. It’s for if you need a couple of random numbers for some miscellaneous reason, not if you need it to do heavy lifting.

On the other hand, apparently this isn’t common knowledge in molecular simulation, so maybe it’s an important thing to repeat again in this context?

kornel edited 13 minutes ago

"The stdlib may still have one, but you shouldn’t use it."

This is a thing that sets people up for failure. “Standard library” sounds like an endorsement, and usually standards are things you’re supposed to be using. A biologist shouldn’t need to know about language’s historical mistakes that lead to a minefield.

I’m going to start a petition to rename “standard libraries” to “code cemeteries”.

-- https://lobste.rs/s/ijzwtj/quality_random_number_generators
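
One well-known weakness of a "pure" linear congruential generator with a power-of-two modulus is that its low-order bits have very short periods. The sketch below illustrates that with the widely published constants a = 1664525, c = 1013904223, m = 2^32; it is only an illustration of the general problem, not a reproduction of whatever defect the linked paper found, and the helper names are made up.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield successive states of the recurrence x -> (a*x + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def low_bits(seed, count, bits=1):
    """Return the lowest `bits` bits of the first `count` outputs."""
    gen = lcg(seed)
    mask = (1 << bits) - 1
    return [next(gen) & mask for _ in range(count)]
```

With an odd multiplier and odd increment, the lowest bit simply alternates, and the lowest k bits repeat with period at most 2^k, so using `output % small_power_of_two` from such a generator gives a highly regular sequence.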

---

https://blog.m-ou.se/rust-cpp-concurrency/

---

bshanks on Feb 3, 2021

on: Go 1.16 will make system calls through Libc on Ope...

Could you give any pointers to what is broken about these, and how a better way to do it would be? (I'm thinking about implementing a small system call library interface and I'd like to not repeat the mistakes of the past)

convolvatron on Feb 3, 2021

user supplied read addresses - when a process does a write() it makes sense for it to supply the data as it filled it in. but it almost never makes the same sense for a read(), and using the user address requires a copyout. i'm convinced that having the read() results show up in a kernel-allocated buffer in userspace is a better idea, but this is somewhat subjective

getpwent(), the whole notion of users - we don't use computers the same way we used to in the 70s. talk, write, finger, wall - they aren't very fun anymore since it's either just me on my laptop, or one of the 100s of virtual machines floating around. more importantly, the attempts to glue unix system user identity to distributed identities (PAM) have really turned out to be a mess

filesystem permissions - these are clearly insufficient given the number of system-specific addons here.

signals are so riddled with constraints and incompatibilities that they are basically useless - except you have to fiddle with them for things like SIGPIPE

ttys were already kind of broken when they were relevant,

errno is actually a property of the libc, but the status interface is pretty broken - have you even grepped the kernel code to find out what might be issuing an EINVAL?

---

" I’ve been programming for over 30 years, and never been as productive as before. Need to load and decode a jpeg/png, I grab stb_image.h. Need to decode an ogg file, libogg. Need to decompress, libz. Need to decode video, libavformat. Need physics, libbullet. Need truetype fonts, freetype. Need a GUI, Qt. Need SSL, libssl." -- https://news.ycombinator.com/item?id=32496669

---

uxn/varvara

---

https://www.cliki.net/naming+conventions

---

 To print a string on UNIX systems, we must invoke the int write(fd, data, size) system call, where fd is normally specified as 1, which means standard output. Once again, due to the similarities of the post-shakeout operating system landscape, all four systems happen to implement the exact same function. However, in order to actually call it, there's another number we must specify that exists beneath the iceberg of what we see in terms of C code. That number is the system call magic number, or "magnum", and it's part of what's known as the Application Binary Interface (ABI). Unfortunately, operating systems don't agree on magnums. For example, on Linux write() is 1, but on the BSDs it's 4. So if we ran Linux's write() on BSD, it would actually call exit() – not what we want!

UNIX SYSCALL Magnums (x86-64)

    Function  Linux  FreeBSD  OpenBSD  NetBSD
    exit      60     1        1        1
    fork      57     2        2        2
    read      0      3        3        3
    write     1      4        4        4
    open      2      5        5        5
    close     3      6        6        6
    stat      4      n/a      38       439
    fstat     5      551      53       440
    poll      7      209      252      209

You'll notice there's a subset of numbers on which all systems agree, particularly among the BSDs. Those tend to be the really old functions that were designed for the original UNIX operating system written at Bell Labs. In fact, numbers such as 1 for exit() were copied straight out of the Bell System Five codebase. That's why we call it the System V Application Binary Interface. There even used to be more consensus if we look at past editions.
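
The mismatch described above can be made concrete with a lookup over the numbers quoted in the table. This is only a sketch over that quoted data (not an authoritative syscall table), and the names `SYSCALL_NUMBERS`, `number_for` and `name_for` are invented here.

```python
# x86-64 system call numbers as quoted in the table above; None marks "n/a".
SYSCALL_NUMBERS = {
    "exit":  {"linux": 60, "freebsd": 1,    "openbsd": 1,   "netbsd": 1},
    "fork":  {"linux": 57, "freebsd": 2,    "openbsd": 2,   "netbsd": 2},
    "read":  {"linux": 0,  "freebsd": 3,    "openbsd": 3,   "netbsd": 3},
    "write": {"linux": 1,  "freebsd": 4,    "openbsd": 4,   "netbsd": 4},
    "open":  {"linux": 2,  "freebsd": 5,    "openbsd": 5,   "netbsd": 5},
    "close": {"linux": 3,  "freebsd": 6,    "openbsd": 6,   "netbsd": 6},
    "stat":  {"linux": 4,  "freebsd": None, "openbsd": 38,  "netbsd": 439},
    "fstat": {"linux": 5,  "freebsd": 551,  "openbsd": 53,  "netbsd": 440},
    "poll":  {"linux": 7,  "freebsd": 209,  "openbsd": 252, "netbsd": 209},
}


def number_for(os_name, function):
    """Return the syscall number for a function on the given OS, or None."""
    return SYSCALL_NUMBERS[function][os_name]


def name_for(os_name, number):
    """Return which function a raw syscall number selects on the given OS."""
    for function, per_os in SYSCALL_NUMBERS.items():
        if per_os[os_name] == number:
            return function
    return None
```

Here `name_for("freebsd", number_for("linux", "write"))` gives `"exit"`, which is the failure the passage warns about: a binary that hard-codes Linux's number for write() would terminate instead of printing on a BSD kernel.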

---

alternative to / generalization of files : the "box"

"

1. is typed, 2. may contain inner boxes, and might be contained in outer boxes, 3. can be handled with three operations:

    copy(otherBox,aBox)
        which copies aBox to otherBox, (provided both boxes are type-compatible). 
    share(otherBox,aBox)
        which makes otherBox become logically the same as aBox (provided both boxes are type-compatible). 
    select(aBox,aSelector) => innerBox
        which returns a handle for a box inside the given one--according to the name or selector supplied. 
    When aBox and otherBox are not type-compatible, a type converter must be used to perform a copy or a share operation. Box operations are atomic, in the sense that concurrent operations on a given box will be serialized by the box implementation.

To operate on pairs of boxes which are not type-compatible, every application has a set of converters, including:

    A set of translators, used to automatically "adapt" the box type when needed, by doing some transformation (e.g. the LaTeX converter translates LaTeX document boxes into DVI boxes).
    A set of type compatibility declarations (e.g. a LaTeX document can be considered to be a text).

When an application issues a copy (or share) operation requiring type conversion, the (per-application) converter set is searched for a suitable converter. Converters and those system entities which are not simple "data" can also be seen as boxes. Converters can be modeled as boxes which, once given some input box(es), generate new contents for their output box(es). "

https://web.archive.org/web/20100408201601/http://plan9.escet.urjc.es/who/nemo/export/2kblocks/node2.html

they imagine that file creation would be handled by "select" and deletion would be handled by first selecting the box and then copying an "anti-box" which deletes stuff into the box [12]. reading from a file using a buffer would be done by "selecting" a box that provides additional buffer ops? [13]
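
A toy model of the three box operations, to make the difference between copy and share concrete. Everything here is hypothetical: the `Box` class, the converter table keyed by `(source_type, target_type)`, and the decision to leave out atomicity, anti-boxes, and copying of inner boxes are simplifications made for this sketch, not details from the paper.

```python
class Box:
    """A typed container with named inner boxes (hypothetical model)."""

    def __init__(self, box_type, value=None):
        self.box_type = box_type
        self._cell = [value]  # one-element list so share() can alias storage
        self.inner = {}       # selector -> inner Box

    @property
    def value(self):
        return self._cell[0]

    @value.setter
    def value(self, new_value):
        self._cell[0] = new_value


def _converter(src_type, dst_type, converters):
    """Find a conversion function; identical types need none."""
    if src_type == dst_type:
        return lambda v: v
    try:
        return converters[(src_type, dst_type)]
    except KeyError:
        raise TypeError(f"no converter from {src_type} to {dst_type}")


def copy(other_box, a_box, converters=None):
    """Copy a_box's contents into other_box, converting if the types differ."""
    convert = _converter(a_box.box_type, other_box.box_type, converters or {})
    other_box.value = convert(a_box.value)


def share(other_box, a_box):
    """Make other_box logically the same box as a_box (same storage)."""
    if other_box.box_type != a_box.box_type:
        raise TypeError("share needs type-compatible boxes in this sketch")
    other_box._cell = a_box._cell
    other_box.inner = a_box.inner


def select(a_box, selector):
    """Return a handle for the inner box named by selector."""
    return a_box.inner[selector]
```

After `share(alias, doc)`, a later write to `doc.value` is visible through `alias`, whereas a box filled with `copy` keeps its own snapshot.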

---

blackboard systems

javaspaces

"...the basics of the JavaSpaces API: write deposits an object, called an entry, into a space; read makes a copy of an entry in a space (but leaves the object there); and take removes an entry from a space. Let's see how this simple API incorporates the basics of space-based synchronization ... Any number of processes can read the entry in the space at a given time. But suppose a process wants to update the entry: the process must first remove the entry from the space, make the changes, and then write the modified entry back to the space:

    Message result = (Message)space.take(template, null, Long.MAX_VALUE);
    result.content = "Goodbye!";
    space.write(result, null, Lease.FOREVER);

It's important to note that a process needs to obtain exclusive access to an entry that resides in the space before modifying it. If multiple processes are trying to update the same entry using this code, only one obtains the entry at a time. The others wait during the take operation until the entry has been written back to the space, and then one of the waiting processes takes the entry from the space while the others continue to wait.

The basics of space-based programming synchronization can be quickly summarized: Multiple processes can read an entry in a space at any time. But when a process wants to update an entry, it first has to remove it from the space and thereby gain sole access to it. " -- [14]
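
The take/modify/write discipline can be sketched with an in-process space built on a condition variable. This is an assumption-laden toy, not the JavaSpaces API: entries are plain dicts, a template matches when all of its key/value pairs appear in the entry, and there are no leases, transactions, or timeouts.

```python
import threading


class Space:
    """A tiny in-process tuple space (hypothetical, dict-based entries)."""

    def __init__(self):
        self._entries = []
        self._cond = threading.Condition()

    def _find(self, template):
        """Return the first entry matching every key/value in template."""
        for entry in self._entries:
            if all(entry.get(k) == v for k, v in template.items()):
                return entry
        return None

    def write(self, entry):
        """Deposit an entry and wake any waiting readers or takers."""
        with self._cond:
            self._entries.append(entry)
            self._cond.notify_all()

    def read(self, template):
        """Block until a match exists, then return a copy; the entry stays."""
        with self._cond:
            entry = self._find(template)
            while entry is None:
                self._cond.wait()
                entry = self._find(template)
            return dict(entry)

    def take(self, template):
        """Block until a match exists, then remove and return it."""
        with self._cond:
            entry = self._find(template)
            while entry is None:
                self._cond.wait()
                entry = self._find(template)
            self._entries.remove(entry)
            return entry
```

Because an entry that has been taken is absent from the space, several threads running "take, increment, write" on the same counter entry serialize themselves without any explicit lock in the calling code.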

---

proposes a better replacement for malloc/free/realloc. Shows that C++23 already has a similar allocator API, and so does Rust.

https://www.foonathan.net/2022/08/malloc-interface/

comment on this on lobste.rs:

david_chisnall 20 hours ago

My perspective, from working on a high-performance memory allocator:

Alignment is actually pretty difficult to support in memory allocators. We align everything to 16 bytes but for anything larger than that we just round the requested size up to that power of two (alignment for power-of-two-sized allocations is easy to support). C11 / C++11’s aligned_alloc or POSIX’s posix_memalign don’t convey the fact that aligned allocations might get a lot more padding than you’d expect. Windows has a very interesting aligned allocation API that allocates such that a specific offset is aligned. This should be useful for OO languages that provide some kind of object header (e.g. a refcount) and so might want a 32-bit field in the header and then for the next field to be 16-byte aligned. Unfortunately, it actually does this by allocating something 16-byte aligned and giving you a pointer into the middle of the allocation, so you don’t actually save memory.

Metadata storage overhead is pretty low for sizeclass-based allocators. In snmalloc, there are two metadata structures:

    The pagemap, which is an array with one entry per chunk (by default, the chunk size is 16 KiB), reserved in the address space and lazily backed by real memory. This uses two pointers worth of space per 16 KiB.
    The slab metadata. This is a structure allocated per chunk for managing free lists. It is quite small and there’s one per 16 KiB chunk (for small allocations) or one shared across multiple 16 KiB chunks (for large allocations).

The sizeclass for a chunk is actually stored in the low bits of one of the pointers in the pagemap, so we’re storing <1 B for every 16 KiB of allocations. Calling this wasteful is nonsense. Storing a size_t outside of the allocator to track the size is far more wasteful.

Oh, and the size that the user knows is not always the useful one. There are two sizes: the size of the underlying allocation and the size that the user requested. The former is greater than or equal to the size of the latter. If the user gives us the latter, it’s about as expensive for us to get the former as it is for us to get it if they give us nothing. The only thing that we actually use this information for is sanity checking: if the user thinks they’re freeing a size that doesn’t match the sizeclass of the allocation, then we can abort.

For the third problem, malloc_usable_size is non-standard, but it (or some different spelling) is available basically everywhere. Jemalloc has an API that returns the allocated size (snmalloc has a compat implementation of this). I’d like to see this in the standard.

I completely disagree on the fourth point. The best thing to do with realloc is throw it away. Allocators that subdivide or merge allocations are useful only on very resource-constrained systems. For high-performance allocators, realloc is no more efficient than doing a malloc and a copy because that’s what it’s going to be doing almost all of the time. In snmalloc, if the new size is the same sizeclass as the old, realloc is a no-op, if it’s different then we do an alloc and a copy. It is a headache for performance because it’s either a very cheap or a very expensive operation depending on the two sizes. I would love to see realloc deprecated in C2y. It’s also very easy to get into UB-land accidentally with realloc. Because it is logically a free and a malloc, it is UB to use any pointer that you had before the allocation. The only safe way for it to work in C++ would be if the signature looked something like std::unique_ptr<T> realloc(std::unique_ptr<T> old, size_t newSize).

On the solutions, I agree that allocate_at_least is nice. Jemalloc has had this for years and snmalloc also has it, I believe other allocators probably do as well.
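
The realloc behaviour described in the comment above (a no-op within one sizeclass, otherwise allocate and copy) can be sketched as follows. The power-of-two sizeclasses with a 16-byte minimum are an assumption made for this sketch and are not snmalloc's actual sizeclass table; blocks are modelled as bytearrays whose length is their sizeclass.

```python
MIN_SIZE = 16  # assumed smallest sizeclass


def size_class(size):
    """Round a request up to the next power of two, at least MIN_SIZE."""
    n = MIN_SIZE
    while n < size:
        n *= 2
    return n


def allocate(size):
    """Return a block whose real length is the sizeclass, not the request."""
    return bytearray(size_class(size))


def realloc(block, new_size):
    """Return block unchanged if the sizeclass matches, else alloc + copy."""
    if size_class(new_size) == len(block):
        return block  # same sizeclass: nothing to do
    new_block = allocate(new_size)
    keep = min(len(block), new_size)
    new_block[:keep] = block[:keep]
    return new_block
```

The sketch also shows the two sizes the comment distinguishes: `allocate(20)` is asked for 20 bytes but hands back a 32-byte block, and it shows why the cost of realloc is bimodal: growing from 20 to 30 bytes is free, while growing from 20 to 40 copies the whole block.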

spc476 1 hour ago

I used to program under an OS that had a free() that required the size parameter (only it was called FreeMem() [1]). Later on, when I moved to programming on Unix, I copied that API and used it for a few years. I don’t recall why exactly I gave up on it, maybe it was just too much trouble tracking the size of all allocations, or having to call strlen() + 1 all the time when freeing memory [2].

Not mentioned is that realloc() can also shrink a memory allocation. It’s useful when you are reading dynamic data in and keep bumping up the allocated memory, then fix the final size once you finished reading the data in. But these days, over commit isn’t that bad as all you really allocate is memory addressing, not memory itself.

The functions I would like to see are:

    extern void *pool_default(void);                  /* default memory pool to allocate from */
    extern void *pool_init   (void *ptr,size_t size); /* initialize pool of memory */
    extern void *pool_malloc (void *pool,size_t nmemb,size_t size);
    extern void *pool_calloc (void *pool,size_t nmemb,size_t size);
    extern void *pool_realloc(void *pool,void *ptr,size_t oldnmemb,size_t newnmemb,size_t size);
    extern void  pool_free   (void *pool,void *ptr);
    extern void  pool_freeall(void *pool);            /* free all memory from pool */

I’m on the fence about letting pool_free() accept a NULL data pointer (like free() does). And what should pool_malloc(pool,0,x) return (NULL? A pointer of 0 bytes?). I like the nmemb parameter to avoid any multiplication overflow that might happen now with malloc() and realloc(). And it would be a godsend if there was a call to free all currently allocated memory.

[1] Bonus points if you know the OS.

[2] Yes, I know, I should use a better abstraction for strings other than NUL terminated character array.
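
A rough model of the pool interface wished for above, mainly to show the nmemb overflow check and the free-everything call. It is a sketch under several assumptions: `size_t` is taken to be 64 bits, handles are integers rather than pointers, and the open questions from the comment are settled arbitrarily (freeing `None` is a no-op and a zero-element request returns a valid empty block).

```python
SIZE_MAX = 2**64 - 1  # assumed 64-bit size_t


class Pool:
    """Hypothetical pool: every block can be released at once with free_all()."""

    def __init__(self):
        self._blocks = {}
        self._next_handle = 1

    def malloc(self, nmemb, size):
        """Allocate nmemb * size bytes, or return None if that would overflow."""
        if size != 0 and nmemb > SIZE_MAX // size:
            return None  # nmemb * size does not fit in size_t
        handle = self._next_handle
        self._next_handle += 1
        self._blocks[handle] = bytearray(nmemb * size)
        return handle

    def free(self, handle):
        """Release one block; None is accepted, mirroring free(NULL)."""
        if handle is None:
            return
        del self._blocks[handle]

    def free_all(self):
        """Release every block still allocated from this pool."""
        self._blocks.clear()

    def live(self):
        """Number of blocks currently allocated."""
        return len(self._blocks)
```

Taking the element count and element size separately lets the allocator reject a request such as `malloc(2**63, 4)` up front, instead of relying on every caller to check the multiplication.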

more comments: https://lobste.rs/s/ywsc55/malloc_free_are_bad_api