"
Brad Abrams posted an e-mail from Brian Harry written during development of the .NET framework. It details many of the reasons reference counting was not used, even though one of the early priorities was to keep semantic equivalence with VB6, which uses reference counting. It looks into possibilities such as having some types ref counted and not others (IRefCounted?!), or having specific instances ref counted, and why none of these solutions were deemed acceptable.
Because [the issue of resource management and deterministic finalization] is such a sensitive topic I am going to try to be as precise and complete in my explanation as I can. I apologize for the length of the mail. The first 90% of this mail is trying to convince you that the problem really is hard. In that last part, I'll talk about things we are trying to do but you need the first part to understand why we are looking at these options.
...
We initially started with the assumption that the solution would take the form of automatic ref counting (so the programmer couldn't forget) plus some other stuff to detect and handle cycles automatically. ...we ultimately concluded that this was not going to work in the general case.
...
In summary:
We feel that it is very important to solve the cycle problem without forcing programmers to understand, track down and design around these complex data structure problems.
We want to make sure we have a high performance (both speed and working set) system and our analysis shows that using reference counting for every single object in the system will not allow us to achieve this goal.
For a variety of reasons, including composition and casting issues, there is no simple transparent solution to having just those objects that need it be ref counted.
We chose not to select a solution that provides deterministic finalization for a single language/context because it inhibits interop with other languages and causes bifurcation of class libraries by creating language specific versions."
https://blogs.msdn.microsoft.com/brada/2005/02/11/resource-management/
---
http://stackoverflow.com/tags/garbage-collection/info
---
maybe someday ask StackOverflow for a survey of approaches to garbage collection in programming languages with the following goals:
I'm designing a programming language and i'm looking for suggestions for garbage collection/automatic memory management algorithms that meet the following criteria. Latency (GC pauses) must be minimized; but throughput performance is not important. The language is intended to be easy to reimplement on different platforms, so it is imperative that the GC technique in the reference implementation be simple. To reiterate, the algorithm under discussion could later be replaced with a more complicated, higher-performing GC algorithm on some implementations, but what i need for the reference implementation (ie what i am asking for here) is simplicity and ease of (correct) implementation, even at the expense of inefficiency. The data structures being managed will be concurrently accessed by multiple threads, and will contain reference cycles.
In more detail, design goals include:
The default garbage collection systems of language implementations such as the JVM are not suitable because (a) they can have long GC pauses, and (b) they are probably more concerned with efficiency than simplicity. Some of the other automatic memory management systems that i am currently looking at as potential models are Inferno's, Go's, and Python's; other suggestions, or comments and comparisons, are welcome.
http://stackoverflow.com/questions/tagged/programming-languages http://stackoverflow.com/questions/tagged/language-design http://softwareengineering.stackexchange.com/questions/tagged/programming-languages http://softwareengineering.stackexchange.com/questions/tagged/programming-languages+language-design
---
jeffdavis 6 hours ago
Tangent:
In databases, it's common to do arena based allocation. You allocate objects of similar lifetime (life of a query, life of a transaction, life of a procedure execution, etc.), and free them all at once when the lifetime is up.
When hacking postgresql, for instance, using pfree() is fairly uncommon. You just allocate, (which goes to the current arena by default), and then the arena is wiped out all at once later.
Of course, sometimes you might use a lot of memory temporarily in a loop, and you need to pfree() it each iteration. But that is the exception.
I think of each arena as a tiny heap (though of course they can grow large).
favorited 5 hours ago
This was also an optimization you could do in Objective-C on NeXTSTEP/OpenStep/macOS. Allocating methods took an optional "zone" parameter (`allocWithZone:`, `copyWithZone:`), which was basically an arena.
In this case, the bigger benefit wasn't the allocation/deallocation, but the fact that you would be working with memory on the same virtual memory page, IIUC.
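A minimal sketch of this kind of arena in C, as a bump allocator over one malloc'd block (the names are hypothetical; PostgreSQL's real interface is MemoryContexts with palloc/pfree, and real arenas chain blocks so they can grow):

    #include <stdlib.h>

    typedef struct {
        char  *base;   /* backing memory for this arena */
        size_t used;   /* bytes handed out so far */
        size_t cap;    /* total bytes available */
    } Arena;

    static Arena arena_create(size_t cap) {
        Arena a = { malloc(cap), 0, cap };
        return a;
    }

    /* Bump allocation: just advance an offset; there is no per-object free. */
    static void *arena_alloc(Arena *a, size_t n) {
        n = (n + 7) & ~(size_t)7;              /* keep 8-byte alignment */
        if (!a->base || a->used + n > a->cap) return NULL;
        void *p = a->base + a->used;
        a->used += n;
        return p;
    }

    /* "Wipe the tiny heap": everything allocated from the arena dies at once. */
    static void arena_reset(Arena *a)   { a->used = 0; }
    static void arena_destroy(Arena *a) { free(a->base); a->base = NULL; a->used = a->cap = 0; }

The per-lifetime pattern is then: create an arena at the start of a query/transaction/call, allocate from it freely, and reset or destroy it when that lifetime ends.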
---
https://medium.com/@octskyward/modern-garbage-collection-911ef4f8bd8e#.ohyo057ev
---
"Go’s garbage collector is a concurrent mark-sweep with very short stop-the-world pauses and it just seems to keep getting better. This is definitely a good thing for a garbage collected language. We did have some issues with unacceptable latencies, but were able to work them all out. No such luck with the Haskell project."
"...previously tried to use Haskell for the project I was working on, but eventually had to give up due to unpredictable latency caused by garbage collection pauses. GHC’s garbage collector is designed for throughput, not latency. It is a generational copying collector, which means that pause times are proportional to the amount of live data in the heap. To make matters worse, it’s also stop-the-world."
---
"Go is garbage-collected. I personally think Swift/Objective-C-style Automatic Reference Counting would have been a better choice for a statically typed systems language, since it gives you the brevity benefits without the GC pauses. I’m sure this has been argued to death elsewhere, so I’ll save my GC rant for a very boring dinner party."
---
pjmlp 1 day ago | on: C for All
> C does not have RAII-like memory management in any way.
Yes it does, as a language extension on gcc and clang.
It is called the cleanup attribute.
https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attribute...
https://clang.llvm.org/docs/LanguageExtensions.html#non-stan...
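A small illustration of the attribute as documented for gcc/clang; the helper functions here (free_ptr, close_file) are made up for the example:

    #include <stdio.h>
    #include <stdlib.h>

    /* The cleanup function is called with a pointer to the annotated variable
       when the variable goes out of scope, on every exit path. */
    static void free_ptr(void *p)    { free(*(void **)p); }
    static void close_file(FILE **f) { if (*f) fclose(*f); }

    int use_cleanup(const char *path) {
        __attribute__((cleanup(free_ptr)))   char *buf = malloc(4096);
        __attribute__((cleanup(close_file))) FILE *f   = fopen(path, "r");
        if (!buf || !f)
            return -1;      /* buf and f are still released here */
        /* ... use buf and f ... */
        return 0;           /* ... and here */
    }

This gives scope-tied destruction in the RAII style, though it only applies to variables that are explicitly annotated and, unlike C++ destructors, it is not run when the stack is unwound with longjmp.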
---
tomsmeding 1 day ago | on: C for All
C does not have RAII-like memory management in any way. Exceptions work beautifully with memory management like that, but if it's not there, you can't just say it should work because memory management should work like that.
So basically you're saying, before adding exceptions, add RAII-like memory management, and then actually add exceptions. I like both features, but am not sure how you'd wedge RAII into C. Any ideas on that?
---
sedachv 1 day ago
> C does not have RAII-like memory management in any way.
C does not have memory management in any way period. The C standard library does. How you get to something with dynamic scoping like RAII in C is to use a different library for managing memory. For example Thinlisp[1] and Ravenbrook's Memory Pool System[2] both provide dynamically-scoped region/pool allocation schemes.
[1] https://github.com/vsedach/Thinlisp-1.1
[2] https://www.ravenbrook.com/project/mps/
---
BruceIV 1 day ago
[On the Cforall team] For what it's worth, one of the features Cforall adds to C is RAII.
The exception implementation isn't done yet (it's waiting on (limited) run-time type information), but it already respects RAII.
RhysU 1 day ago
I have often wanted C99 with destructors for RAII purposes.
---
UncleMeat 1 day ago | on: C for All
Exceptions work great in garbage collected languages. Reasoning about exceptional control flow w.r.t. manual memory management is a total nightmare.
sedachv 1 day ago
> Reasoning about exceptional control flow w.r.t. manual memory management is a total nightmare.
Exceptions make manual memory management easier because a proper exception system has unwind-protect[1]. Exceptions are just movements up the stack - exceptions combine naturally with dynamic scoping for memory allocation (memory regions/pools). This kind of memory management was used in some Lisp systems in the 1980s, and made its way into C++ in the form of RAII. By extending the compiler you can add further memory management conveniences like smart pointers to this scheme.
Now if you want to talk about something that actually makes manual memory management a total nightmare, look at the OP's suggestion for adding closures to C.
[1] http://www.lispworks.com/documentation/HyperSpec/Body/s_unwind.htm
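A rough C sketch of that combination, using setjmp/longjmp as a stand-in for an exception system and a single malloc'd block as a stand-in for a region/pool; the point is just that the cleanup sits on both the normal and the unwound path, which is what unwind-protect guarantees:

    #include <setjmp.h>
    #include <stdlib.h>

    static jmp_buf handler;             /* the nearest enclosing "handler" */

    static void might_throw(int fail) {
        if (fail) longjmp(handler, 1);  /* the "throw" */
    }

    /* Everything allocated during this dynamic extent comes from one pool,
       and the pool is released whether we leave normally or by unwinding. */
    static int with_pool(int fail) {
        void *pool = malloc(1 << 16);   /* a real region allocator goes here */
        int ok;
        if (setjmp(handler) == 0) {
            might_throw(fail);
            ok = 0;                     /* normal exit */
        } else {
            ok = -1;                    /* unwound by longjmp */
        }
        free(pool);                     /* the protected cleanup */
        return ok;
    }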
---
reference counting may require write barriers (for pointers at least):
" while reference counting incurs a high cost — every pointer write (including those to stack frames) results in reference count updates "
" The incremental nature of reference counting is generally con- sidered to be its fundamental advantage. However, the cost of up- dating reference counts every time a new pointer is loaded into a register is typically much too high for high-performance applica- tions. As a result, some form of deferred reference counting [20], in which references from stack frames are accounted for separately, is used in most high-performance implementations of reference counting [3, 19]. However, this means that when an object’s reference count drops to zero, it can not be reclaimed immediately, since there might be a reference from the stack that is not accounted for. As a result, collection is deferred until the periodic scanning of the stack refer- ences. However, the result is delayed collection, floating garbage, and longer application pauses — the typical characteristics of trac- ing collectors! "
---
have some bits in each fat pointer (and every pointer should be fat) for the GC to use. This way it can mark pointer instances.
load and store barriers for the GC
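One way those two notes could fit together in C: a fat pointer with a field the collector owns, plus load/store barrier functions that the compiler would emit around every access to such a pointer (all names here are hypothetical):

    #include <stdint.h>
    #include <stddef.h>

    typedef struct {
        void   *addr;    /* the referent */
        size_t  len;     /* e.g. element count, if this is also a slice */
        uint8_t gcbits;  /* bits owned by the GC: mark/color for this pointer instance */
    } FatPtr;

    /* Store barrier: the collector sees every pointer store, e.g. to record it
       in a remembered set or to mark the old or new referent. */
    static void gc_store(FatPtr *slot, FatPtr value) {
        /* ... GC bookkeeping on *slot and value would go here ... */
        *slot = value;
    }

    /* Load barrier: the collector can intercept every pointer load, e.g. to
       follow a forwarding pointer if the referent has been moved. */
    static FatPtr gc_load(const FatPtr *slot) {
        FatPtr p = *slot;
        /* ... fix up p.addr here if the object was relocated ... */
        return p;
    }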
---
One reason that Java takes lots of memory, besides GC: lack of value types: "data structures can't be value types[1], consequently consuming large amounts of extra memory because everything sits behind 2 or 3 or 5 layers of unnecessary 8-byte pointers." [1]
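The overhead is easy to see in C terms: with value types the elements sit inline in one allocation, without them every element is a separate heap object reached through another pointer (and, in Java, carrying an object header as well). A sketch:

    #include <stdlib.h>

    typedef struct { double x, y; } Point;          /* 16 bytes of payload */

    /* Value-type layout: one allocation, elements stored inline. */
    Point *make_inline(size_t n) {
        return malloc(n * sizeof(Point));            /* ~16 bytes per element */
    }

    /* Reference layout: an array of pointers, each to its own heap object.
       Each element now costs a pointer slot plus a separate allocation (plus,
       in Java, an object header), and reading it adds an extra dereference. */
    Point **make_boxed(size_t n) {
        Point **a = malloc(n * sizeof(Point *));
        for (size_t i = 0; i < n; i++)
            a[i] = calloc(1, sizeof(Point));
        return a;
    }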
---
" jemalloc is removed by default
Long, long ago, Rust had a large, Erlang-like runtime. We chose to use jemalloc instead of the system allocator, because it often improved performance over the default system one. Over time, we shed more and more of this runtime, and eventually almost all of it was removed, but jemalloc was not. We didn't have a way to choose a custom allocator, and so we couldn't really remove it without causing a regression for people who do need jemalloc.
Also, saying that jemalloc was always the default is a bit UNIX-centric, as it was only the default on some platforms. Notably, the MSVC target on Windows has shipped the system allocator for a long time.
Finally, while jemalloc usually has great performance, that's not always the case. Additionally, it adds about 300kb to every Rust binary. We've also had a host of other issues with jemalloc in the past. It has also felt a little strange that a systems language does not default to the system's allocator.
For all of these reasons, once Rust 1.28 shipped a way to choose a global allocator, we started making plans to switch the default to the system allocator, and allow you to use jemalloc via a crate. In Rust 1.32, we've finally finished this work, and by default, you will get the system allocator for your programs. "
---
[2] [3] " drfuchs 9 months ago [-]
The proposal here is way too vague. And if you flesh it out, things start to fall apart: If nul-termination of strings is gone, does that mean that the fat pointers need to be three words long, so they have a "capacity" as well as a "current length"? If not, how do you manage to get a string variable on the stack if its length might change? Or in a struct? How does concatenation work such that you can avoid horrible performance (think Java's String vs. StringBuffer)? On the other hand, if the fat pointers have a length and capacity, how do I get a fat pointer to a substring that's in the middle of a given string?
Similar questions apply to general arrays, as well. Also: Am I able to take the address of an element of an array? Will that be a fat pointer too? How about a pointer to a sequence of elements? Can I do arithmetic on these pointers? If not, am I forced to pass around fat array pointers as well as index values when I want to call functions to operate on pieces of the array? How would you write Quicksort? Heapsort? And this doesn't even start to address questions like "how can I write an arena-allocation scheme when I need one"?
In short, the reason that this sort of thing hasn't appeared in C is not because nobody has thought about it, nor because the C folks are too hide-bound to accept a good idea, but rather because it's not clear that there's a real, workable, limited, concise, solution that doesn't warp the language far off into Java/C#-land. It would be great if there were, but this isn't it.
WalterBright 9 months ago
I happen to know the idea does work, and it has been working in D for 18 years now.
> If nul-termination of strings is gone, does that mean that the fat pointers need to be three words long, so they have a "capacity" as well as a "current length"?
No. You'll still have the same issues with how memory is allocated and resized. But, once the memory is allocated, you have a safe and reliable way to access the memory without buffer overflows.
> If not, how do you manage to get string variable on the stack if its length might change? Or in a struct? How does concatenation work such that you can avoid horrible performance (think Java's String vs. StringBuffer?)?
As I mentioned, it does not address allocating memory. However, it does offer one performance advantage in not having to call strlen to determine the size of the data.
> On the other hand, if the fat pointers have a length and capacity, how do I get a fat pointer to a substring that's in the middle of a given string?
In D, we call those slices. They look like this:
T[] array = ...
T[] slice = array[lower .. upper];
The compiler can insert checks that the slice[] lies within the bounds of array[].
> Am I able to take the address of an element of an array?
Yes: `T* p = &array[3];`
> Will that be a fat pointer too?
No, it'll be a regular pointer. To get a fat pointer, i.e. a slice:
slice = array[lower .. upper];
> How about a pointer to a sequence of elements?
Not sure what you mean. You can get a pointer or a slice of a dynamic array.
> Can I do arithmetic on these pointers?
Yes, via the slice method outlined above.
> If not, am I forced to pass around fat array pointers as well as index values when I want to call functions to operate on pieces of the array?
No, just the slice.
> How would you write Quicksort? Heapsort?
Show me your pointer version and I'll show you an array version.
> And this doesn't even start to address questions like "how can I write an arena-allocation scheme when I need one"?
The arena will likely be an array, right? Then return slices of it.
"
" MrBingley? 9 months ago [-]
I absolutely agree. Adding an array type to C that knows its own length would solve so many headaches, fix so many bugs, and prevent so many security vulnerabilities it's not even funny. Null terminated strings? Gone! Checked array indexing? Now possible! More efficient free that gets passed the array length? Now we could do it! The possibilities are incredible. Sadly, C is so obstinately stuck in its old ways that adding such a radical change will likely never happen. But one can dream ...
bluetomcat 9 months ago
> Adding an array type to C that knows its own length would solve so many headaches
C arrays know their length; it's always `sizeof(arr) / sizeof(*arr)`. It's just that arrays become pointers when passed between functions, and dynamically-sized regions (what is an array in most other languages) are always accessed via a pointer. "
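For reference, a sketch of what such a length-carrying array type could look like if added to C, along the lines of D's slices; this is illustrative only, not any actual proposal's design:

    #include <stdio.h>
    #include <stdlib.h>

    /* A slice: pointer plus length. Capacity is deliberately absent, so
       allocation and resizing stay the caller's problem, as with D's T[]. */
    typedef struct {
        int    *ptr;
        size_t  len;
    } IntSlice;

    /* Bounds-checked indexing is possible because the length travels with the pointer. */
    static int slice_get(IntSlice s, size_t i) {
        if (i >= s.len) {
            fprintf(stderr, "index %zu out of bounds (len %zu)\n", i, s.len);
            abort();
        }
        return s.ptr[i];
    }

    /* A sub-slice (D's array[lower .. upper]) is just an offset pointer and a new length. */
    static IntSlice slice_range(IntSlice s, size_t lo, size_t hi) {
        if (lo > hi || hi > s.len) abort();
        return (IntSlice){ s.ptr + lo, hi - lo };
    }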
---
---
could have reference counting and rely on the programmer to avoid reference cycles by using weak references. Apparently Swift does this.
i guess that makes our implementation simpler and more performant, but at the cost of making our programmers' lives more complex, so i don't really like it.
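A minimal C sketch of the idea: only "strong" edges own a reference count, and a back edge that would complete a cycle is left unowned. (Real weak references, as in Swift, are also zeroed automatically when the target dies; this sketch just leaves that pointer to be managed by convention, which is exactly the burden being pushed onto the programmer.)

    #include <stdlib.h>

    typedef struct Node {
        long refcount;
        struct Node *child;    /* strong edge: owns a count */
        struct Node *parent;   /* weak edge: owns no count, so parent<->child is not a cycle of owners */
    } Node;

    static Node *retain(Node *n) { if (n) n->refcount++; return n; }

    static void release(Node *n) {
        if (n && --n->refcount == 0) {
            release(n->child);     /* only strong edges are released */
            free(n);
        }
    }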
---
afaict the only way to have a truly pauseless compacting GC is to have a read barrier upon each dereference of any pointer. or maybe handles (pointers to pointers) are enough?
ok there's another way; have a load barrier upon load of any memory location containing a pointer; it's not truly pauseless but you only have to pause to scan the stack roots, so at least the max pause time does not grow as the heap grows.
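A sketch in C of the first option, a read barrier in the style of a Brooks forwarding pointer: every object header points at the object's current copy, and every dereference goes through that indirection, so the collector can move objects while the mutator keeps running:

    #include <stddef.h>

    /* Every object carries a forwarding pointer. Normally it points to the
       object itself; after the collector copies the object it points to the
       new location. */
    typedef struct Obj {
        struct Obj *forward;
        /* ... payload ... */
    } Obj;

    /* Read barrier, executed on every dereference of a heap pointer. Because
       the mutator always follows `forward`, it never sees a stale copy, which
       is what allows compaction to proceed without a global pause. */
    static inline Obj *read_barrier(Obj *p) {
        return p ? p->forward : NULL;
    }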
---