proj-plbook-plChError

Table of Contents for Programming Languages: a survey

Chapter : error handling

exceptions

In the 'design-by-contract' paradigm, some suggest that the semantics of an 'exception' should be that a function emits an exception when it cannot fulfill its contract with the caller [1].

checked or not

resumable or not (see also the more general 'conditions')

catch-all exception handlers (and how to represent arbitrary exceptions in statically typed languages; C++ allows any type to be thrown but perhaps the language should force them all to derive from Exception?)

'escape continuations'

golang panics: http://golang.org/doc/effective_go.html#errors . Note that golang's 'recover' also doubles as a 'was this defer invoked as a result of a panic within this function?' boolean function that is called within 'defer's. 'recover' returns nil (i.e. "I am not panicking as a result of a panic WITHIN THIS FUNCTION") when it is called within other functions that are called from the initial deferred function, rather than directly by the deferred function itself.

note the similarities to try..catch blocks. panics and recovers are similar to exceptions with try..catch in that the panic can pass information on what type of problem it is, and recovers can receive that information and 'catch' the panic, and uncaught panics 'bubble up' and can be 'caught' further up.

finally

finally vs. golang defer: defer allows the final code to be placed anywhere in the block corresponding to the 'try' block, so for example if you use it to close a file, you can put it right after the file open command. Note that defer executes immediately before the function enclosing it returns, not at the end of the enclosing block. Note also that in Go, since there is no 'try/catch', a 'recover' from a 'panic' must be inside a 'defer' block, since deferred functions are the only things evaluated as the stack is unwound (http://golang.org/doc/effective_go.html#errors).

finally vs. python with vs. C# using
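A minimal Python sketch of the placement difference between 'finally' and a defer-like facility. The function names and the 'log' list are made up for illustration. contextlib.ExitStack is used here as a stand-in for defer; unlike Go's defer, its callbacks run when the 'with' block exits, not when the enclosing function returns.

```python
import contextlib

log = []  # records the order of events, for illustration only


def with_finally():
    log.append("open")
    try:
        log.append("work")
    finally:
        # the cleanup sits at the end, far from the acquisition it pairs with
        log.append("close")


def with_defer_style():
    # ExitStack.callback registers cleanup right next to the acquisition;
    # registered callbacks run in last-in, first-out order at block exit
    with contextlib.ExitStack() as stack:
        log.append("open a")
        stack.callback(log.append, "close a")
        log.append("open b")
        stack.callback(log.append, "close b")
        log.append("work")
```

As with Go's defer, the cleanups run in reverse order of registration, so 'b' is closed before 'a'.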

(a) benefit to exceptions vs error codes: with error codes, you can silently neglect to check the error code; with exceptions, ignoring the exception is visible
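A small Python sketch of this point. Both parser functions are hypothetical; the first returns an error flag alongside its result, the second raises.

```python
def parse_port_code(text):
    """Error-code style: returns (value, ok)."""
    if text.isdigit():
        return int(text), True
    return 0, False


def parse_port_exc(text):
    """Exception style: raises ValueError on bad input."""
    return int(text)


# Neglecting the error flag is silent: the bogus 0 simply flows onward.
port, _ = parse_port_code("http")

# Ignoring the exception has to be written out, so it is visible in the source.
try:
    port2 = parse_port_exc("http")
except ValueError:
    port2 = None
```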

(an) objection to exceptions: non-local control flow

objection to exceptions: http://blogs.msdn.com/b/oldnewthing/archive/2004/04/22/118161.aspx http://blogs.msdn.com/b/oldnewthing/archive/2005/01/14/352949.aspx

todo look again at section "what's wrong with c++ exceptions?" in [2]

todo summarize [3] (upfront boilerplate, versioning, scalability)

the code for handling an exception can be (from specific to general):

retries

some languages support a 'retry' command in the exception handler (eg [4]). One could also have a "retry n" command, where n is an integer and may be a variable, for the common idiom of "retry up to n times". In a dynamic language this may cause run-time errors if the type of 'n' actually passed is not an integer; since in most cases the number of retries does not affect correctness, a language might have special handling for this case for the 'retry' command (for example, defaulting to 'retry 0' in such a case).
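Python has no 'retry' command, so the following is only a library-level sketch of the "retry n" idiom, including the suggested fallback to 0 retries when 'n' is not an integer. The names 'retry', 'operation' and 'exceptions' are assumptions, not an existing API.

```python
def retry(n, operation, exceptions=(Exception,)):
    """Call operation(); on failure, retry up to n more times.

    A value of n that is not a non-negative integer is treated as 0,
    since the retry count rarely affects correctness.
    """
    if not isinstance(n, int) or isinstance(n, bool) or n < 0:
        n = 0
    attempt = 0
    while True:
        try:
            return operation()
        except exceptions:
            if attempt >= n:
                raise  # out of retries: let the last failure propagate
            attempt += 1
```

Usage would look like retry(3, lambda: fetch(url)), where 'fetch' stands for any operation that may fail transiently.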

exception safety

" A piece of code is said to be exception-safe, if run-time failures within the code will not produce ill effects, such as memory leaks, garbled stored data, or invalid output. Exception-safe code must satisfy invariants placed on the code even if exceptions occur. There are several levels of exception safety:

    Failure transparency, also known as the no throw guarantee: Operations are guaranteed to succeed and satisfy all requirements even in presence of exceptional situations. If an exception occurs, it will not throw the exception further up. (Best level of exception safety.)
    Commit or rollback semantics, also known as strong exception safety or no-change guarantee: Operations can fail, but failed operations are guaranteed to have no side effects so all data retain original values.
    Basic exception safety: Partial execution of failed operations can cause side effects, but invariants on the state are preserved. Any stored data will contain valid values even if data has different values now from before the exception.
    Minimal exception safety also known as no-leak guarantee: Partial execution of failed operations may store invalid data but will not cause a crash, and no resources get leaked.
    No exception safety: No guarantees are made. (Worst level of exception safety)" -- https://en.wikibooks.org/wiki/C%2B%2B_Programming/Exception_Handling
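A Python sketch contrasting the basic and the strong ("commit or rollback") levels from the list above. The function names are invented for illustration, and the final 'extend' is assumed not to fail, which holds in practice unless memory runs out.

```python
def extend_basic(target, items, convert):
    """Basic guarantee: on failure, target is still a valid list,
    but it may already contain some of the new items."""
    for item in items:
        target.append(convert(item))


def extend_strong(target, items, convert):
    """Strong guarantee: do the fallible work off to the side,
    then commit in a single step."""
    staged = [convert(item) for item in items]  # may raise; target untouched
    target.extend(staged)                       # commit
```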

conditions

generalization of exceptions, where the handler runs in the scope that the exception was thrown in. This can allow resuming.

maybe

todo reread 6 ways to handle errors in haskell

also called Option or Optional in e.g. Scala (Option), Java 8 (Optional), Apple Swift (Optional)

Milewski opines that it's better to have monads rather than special-case optional chaining for these things

example of the usefulness of Option types:

" Bug: After a subtle, undocumented change, a platform API no longer allows nil strings. Wizard mysteriously crashed on OS X Mavericks when that operating system was first released. Why? Because a certain widget that displays strings underwent an imperceptible API change. Before Mavericks, if you passed nil to the widget, the widget displayed nothing. (Sometimes I wanted to display nothing, in which case I passed nil.) But with Mavericks, the widget unilaterally decided that the string argument was mandatory, and passing nil caused an exception instead. Even when compiling on Mavericks, the compiler and the static analyzer were completely silent about this.

The Swift Fix: With Swift, strings are values, and values are either optional or they're not, and methods can say whether they accept optional values. So if we lived in a Swift world, the method definition of the widget would have had a way of saying, "Hey buddy, I no longer accept optional values," and I would have gotten a compile-time error instead of a flood of emails from angry Mavericks users. " [5]
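The same idea can be approximated in Python with type hints, assuming a static checker such as mypy is run over the code; the interpreter itself does not enforce the annotations. The function names are made up for illustration.

```python
from typing import Dict, Optional


def find_user(users: Dict[str, str], key: str) -> Optional[str]:
    # the return type says out loud that the result may be absent
    return users.get(key)


def greeting(name: str) -> str:
    # 'str', not 'Optional[str]': this function does not accept None
    return "hello " + name


def greet_user(users: Dict[str, str], key: str) -> str:
    name = find_user(users, key)
    if name is None:
        return "hello stranger"
    # after the check above, a type checker treats 'name' as a plain str
    return greeting(name)
```

Passing the result of find_user straight to greeting, without the None check, is the kind of mistake a checker would be expected to flag before the program runs.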

nullable types

nan vs none vs missing/masked

http://lambda-the-ultimate.org/node/3186 "Null References: The Billion Dollar Mistake"

" The problem is that the null reference has been overloaded to mean at least seven different things:

    a semantic non-value, such as a property that is applicable to some objects but not others;
    a missing value, such as the result of looking up a key that is not present in a key-value store, or the input used to request deletion of a key;
    a sentinel representing the end of a sequence of values;
    a placeholder for an object that has yet to be initialized or has already been destroyed, and therefore cannot be used at this time;
    a shortcut for requesting some default or previous value;
    an invalid result indicating some exceptional or error condition;
    or, most commonly, just a special case to be tested for and rejected, or to be ignored and passed on where it will cause unpredictable problems somewhere else.

Since the null reference is a legitimate value of every object type as far as the language is concerned, the compiler has no way to distinguish between these different uses. The programmer is left with the responsibility to keep them straight, and it’s all too easy to occasionally slip up or forget. " -- Anders Kaseorg, http://qr.ae/CS2A6

Ruby, Lua, JavaScript, Perl, PHP, Python all have nullable types (the nil element is variously called nil, undefined, undef, NULL, and None). Common Lisp doesn't have a special nil element, but uses the empty list as a nil in many contexts ([6]).

uninitialized variables/memory

What value should a new variable or region of memory have just after it is allocated?

Some choices:

issue: handling inconsistent state after an error

Imagine that you have 3 data representations that must be kept in sync. You need to modify them. What if, when you try to modify the second one, an error occurs, leaving you unable to modify it at this time?

You have already modified the first one. Since modifying the second one is not possible, you must attempt to undo the change made to the first one.

Similarly, if you find that you are unable to modify the third one, you must undo the changes to both the first and the second one.

Using either error codes or exceptions, this leads to adding a lot of logic to your code. If you have n data representations to keep in sync, you end up writing n items of code to modify them, plus (1 + 2 + ... + (n-1)) = n(n-1)/2, or roughly 0.5*n^2, items of code to potentially undo the previous steps in case of an error!

Two ways to simplify the situation are transactions and ScopeGuard.
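One way to avoid the quadratic growth, sketched in Python under the assumption that every modification comes with its own undo action: keep a list of undo actions for the steps completed so far, and on failure run them in reverse. This needs n 'do' items and n 'undo' items in total. 'apply_all' is a hypothetical helper.

```python
def apply_all(steps):
    """steps is a list of (do, undo) pairs of zero-argument callables.

    Runs each do() in order. If one fails, runs undo() for every step
    that already completed, most recent first, then re-raises.
    """
    done = []
    try:
        for do, undo in steps:
            do()
            done.append(undo)
    except Exception:
        for undo in reversed(done):
            undo()
        raise
```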

reading nonexistent array elements

In many languages, reading nonexistent array elements causes an error or exception.

In Perl6, reading nonexistent array elements expands the array (this is called "autovivification").

In Ruby, Lua, JavaScript, Perl, and PHP, reading nonexistent array elements returns nil ([7]).
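For comparison, Python's standard containers cover all three behaviors, depending on which container and which operation is used. The variable names below are only for illustration.

```python
from collections import defaultdict

# Reading past the end of a list raises an exception.
items = ["a", "b"]
try:
    items[5]
    list_read = "no error"
except IndexError:
    list_read = "IndexError"

# dict.get returns None for a missing key instead of raising.
table = {"x": 1}
dict_read = table.get("y")

# Reading a missing key of a defaultdict creates the entry,
# which is loosely similar to autovivification.
counts = defaultdict(int)
counts["new"]
grew = "new" in counts
```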

Links:

transactions

ScopeGuard

Something that will be called when the current scope is exited, UNLESS it is dismissed before that point.

See http://www.drdobbs.com/cpp/generic-change-the-way-you-write-excepti/184403758

like golang defer, but dismissable

D has this facility via scope(failure):

http://dlang.org/statement.html#ScopeGuardStatement

http://dlang.org/exception-safe.html
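A minimal Python sketch of a dismissable guard, written as a context manager. The class and the 'add_user' example are hypothetical, not a port of the C++ ScopeGuard from the article above.

```python
class ScopeGuard:
    """Runs action() when the with-block exits, unless dismiss() was called."""

    def __init__(self, action):
        self._action = action
        self._active = True

    def dismiss(self):
        self._active = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if self._active:
            self._action()
        return False  # never swallow an exception


def add_user(names, index, name, fail=False):
    names.append(name)
    with ScopeGuard(names.pop) as guard:  # undo the append on any early exit
        if fail:
            raise KeyError(name)          # simulated failure of the second update
        index[name] = len(names) - 1
        guard.dismiss()                   # both updates succeeded: keep them
```

The standard library's contextlib.ExitStack offers something close to this: callbacks registered on the stack can be dismissed by calling pop_all().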

assertions

assertions/point invariants

preconditions (things you demand be true before a statement is reached)

postconditions (things you promise will be true after a statement has executed)

The D language uses 'in' and 'out' blocks of code for function preconditions and postconditions: [8]. You put assertions in them. They also have 'invariant' blocks associated with classes, which are run (i) after the constructor, (ii) before the destructor, and (iii) before and after every public member function (including exported library functions) [9]. Subclasses loosen preconditions and tighten postconditions; that is, between a class and its ancestors, only one of the precondition blocks in the inheritance chain must be satisfied, but all of the postcondition blocks in the chain must be satisfied [10].
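Python has no 'in'/'out' blocks, but function preconditions and postconditions can be roughly imitated with a decorator that wraps the call in assertions. The 'contract' decorator below is a hypothetical sketch, not a standard facility, and it does not model D's inheritance rules. As with any 'assert', the checks disappear when Python runs with -O.

```python
import functools
import math


def contract(pre=None, post=None):
    """pre(*args) is checked before the call; post(result, *args) after it."""
    def wrap(fn):
        @functools.wraps(fn)
        def checked(*args, **kwargs):
            if pre is not None:
                assert pre(*args, **kwargs), "precondition failed"
            result = fn(*args, **kwargs)
            if post is not None:
                assert post(result, *args, **kwargs), "postcondition failed"
            return result
        return checked
    return wrap


@contract(pre=lambda x: x >= 0,
          post=lambda r, x: r * r <= x < (r + 1) * (r + 1))
def integer_sqrt(x):
    return math.isqrt(x)
```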

region invariants (assertions which should be always true within some region)

preconditions and postconditions as contracts; exceptions as notifications that the contract has been broken

'eventually' as a modal modifier to an assertion or pre/post-conditions; which means that a thing may not yet be true, but if you wait long enough, it will eventually become true (seen in SPIN's Promela)

error levels: eg Python CRITICAL, ERROR, WARNING, INFO, DEBUG (but in Python, each corresponds to an integer error level, and other integer error levels lie between each one, allowing for customization)
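A short sketch of the Python case using the standard 'logging' module. The custom level 'NOTICE' at 25 and the logger name are arbitrary choices for illustration.

```python
import logging

# The five standard levels are plain integers, spaced 10 apart.
levels = [logging.DEBUG, logging.INFO, logging.WARNING,
          logging.ERROR, logging.CRITICAL]

# A custom level can be slotted in between INFO (20) and WARNING (30).
NOTICE = 25
logging.addLevelName(NOTICE, "NOTICE")

logger = logging.getLogger("plbook.example")
logger.setLevel(NOTICE)

info_enabled = logger.isEnabledFor(logging.INFO)        # below the threshold
warning_enabled = logger.isEnabledFor(logging.WARNING)  # above the threshold
```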

logging (note: logging is a side-effect but often one that you want to ignore)

Links:

erlang-style links

Erlang allows a bidirectional 'link' to be set up between two processes. When a process abnormally terminates, a signal is sent to each linked process; if the signal is not handled, the receiving process also abnormally terminates.

Links:

erlang-style monitors

Erlang allows one process to 'monitor' another. When a monitored process terminates (whether normally or abnormally), a 'DOWN' message is sent to each process monitoring it.

Links:

PIPEFAIL

" Unix shell pipelines have two usage patterns January 6, 2021

I've seen a variety of recommendations for safer shell scripting that use Bash and set its 'pipefail' option (for example, this one from 2015). This is a good recommendation in one sense, but it exposes a conflict; this option works great for one usage pattern for pipes, and potentially terribly for another one.

To understand the problem, let's start with what Bash's pipefail does. To quote the Bash manual:

    The exit status of a pipeline is the exit status of the last command in the pipeline, unless the pipefail option is enabled. If pipefail is enabled, the pipeline’s return status is the value of the last (rightmost) command to exit with a non-zero status, or zero if all commands exit successfully. [...]

The reason to use pipefail is that if you don't, a command failing unexpectedly in the middle of a pipeline won't normally be detected by you, and won't abort your script if you used 'set -e'. You can go out of your way to carefully check everything with $PIPESTATUS, but that's a lot of extra work.

Unfortunately, this is where our old friend SIGPIPE comes into the picture. What SIGPIPE does in pipelines is force processes to exit if they write to a closed pipe. This happens if a later process in a pipeline doesn't consume all of its input, for example if you only want to process the first thousand lines of output of something:

    generate --thing | sed 1000q | gronkulate

The sed exits after a thousand lines and closes the pipe that generate is writing to, generate gets SIGPIPE and by default dies, and suddenly its exit status is non-zero, which means that with pipefail the entire pipeline 'fails' (and with 'set -e', your script will normally exit).

(Under some circumstances, what happens can vary from run to run due to process scheduling. It can also depend on how much output early processes are producing compared to what later processes are filtering; if generate produces 1000 lines or less, sed will consume all of them.)

This leads to two shell pipeline usage patterns. In one usage pattern, all processes in the pipeline consume their entire input unless something goes wrong. Since all processes do this, no process should ever be writing to a closed pipe and SIGPIPE will never happen. In another usage pattern, at least one process will stop processing its input early; often such processes are in the pipeline specifically to stop at some point (as sed is in my example above). These pipelines will sometimes or always generate SIGPIPEs and have some processes exiting with non-zero statuses.

Of course, you can deal with this in an environment where you're using pipefail, even with 'set -e'. For instance, you can force one pipeline step to always exit successfully:

    (generate --thing || true) | sed 1000q | gronkulate

However, you have to remember this issue and keep track of what commands can exit early, without reading all of their input. If you miss some, your reward is probably errors from your script. If you're lucky, they'll be regular errors; if you're unlucky, they'll be sporadic errors that happen when one command produces an unusually large amount of output or another command does its work unusually soon or fast.

(Also, it would be nice to only ignore SIGPIPE based failures, not other failures. If generate fails for other reasons, we'd like the whole pipeline to be seen as having failed.)

My informal sense is that the 'consume everything' pipeline pattern is far more common than the 'early exit' pipeline pattern, although I haven't attempted to inventory my scripts. It's certainly the natural pattern when you're filtering, transforming, and examining all of something (for example, to count or summarize it). " -- https://utcc.utoronto.ca/~cks/space/blog/unix/ShellPipesTwoUsages

todo

"

What would you prefer: if the software of the X-Ray machine that is scanning you would throw an exception and shut down if it divided with zero for some reason, or just returned positive infinity and this would be the X-Ray dose you would get?

answered Jan 5 '09 at 18:59 Tamas Czinege

What do you prefer, a life support machine that halts that divides by zero, or just prints NaN? since it was for display purposes? We both can make up examples that assume poor testing or design. – Pyrolistical Jan 5 '09 at 19:05

 Pyrolistical: Obviously, on life-critical systems, robustness is extremely important. That's why a life support machine should throw an exception and reset if it divides by zero. Otherwise, it might kill the patient by using invalid data. It should never ignore division by zero. – Tamas Czinege Jan 5 '09 at 19:19 

" -- http://programmers.stackexchange.com/questions/119987/should-integer-divide-by-zero-halt-execution

checked exceptions

[11]

" masklinn on Feb 28, 2020 [–]

Checked exceptions were universally rejected not because they are intrinsically bad but because the language support was awful (e.g. could not wrap or abstract over a nested object possibly rethrowing), they were sitting right next to unchecked exception with limited clarity, guidance and coherence as to which was which, and they are so god damn ungodly verbose, both to (re)throw and to convert.

Results are so much more convenient it's not even funny, but even without that you could probably build a language with checked exceptions where they're not infuriatingly bad (Swift has something along those lines, though IIRC it doesn't statically check all the error types potentially bubbling up so you know that you have to catch something, not necessarily what).

Groxx on Feb 29, 2020 [–]

A very large part of that though is Java not being 'generic' over checked exception types. So if you e.g. build something that supports end-user callback code, you need to either throw Exception (accepting all code but losing all signal as to what's possible) or nothing (forcing RuntimeException boxing).

That's Java. And I agree it is a wildly painful and incomplete implementation. I wish we'd stop conflating it with checked exceptions as a language feature.

" -- [12]

https://ziglang.org/documentation/master/#errdefer

timeouts

cancel scopes

https://vorpus.org/blog/timeouts-and-cancellation-for-humans/#cancel-tokens-are-level-triggered-and-can-be-scoped-to-match-your-program-s-needs

error codes vs exceptions

https://yosefk.com/blog/error-codes-vs-exceptions-critical-code-vs-typical-code.html