" sehugg 3 hours ago
Great postmortem and good lessons to learn here:
- Don't manually modify a database without a well-tested procedure and another pair of eyes
- Don't leave persistent problems (e.g. memory problems) uninvestigated, or you will miss new problems with similar symptoms
- Don't push new code to production while an operational problem is ongoing (unless it addresses the operational problem)
I'm pretty sure I've repeated this exact same sequence before with similar results... "
"
There is a line from Futurama that perfectly applies to a lot of debugging.
Farnsworth: My God, is it really possible?
Fry: It must be possible, it's happening.
Fry: By the way, what's happening? "
---
random bugs:
- https://code.facebook.com/posts/1499322996995183/solving-the-mystery-of-link-imbalance-a-metastable-failure-state-at-scale/ https://news.ycombinator.com/item?id=8660994
- https://research.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html
- https://en.bitcoin.it/wiki/Value_overflow_incident
- https://news.ycombinator.com/item?id=11921793
- https://software.intel.com/en-us/blogs/2015/04/24/lets-play-a-game-find-bugs-in-popular-open-source-projects (and http://wayback.archive.org/web/20160425093114/http://q.viva64.com/start )
- http://hakipedia.com/index.php/Poison_Null_Byte
- http://hakipedia.com/index.php/Category:Vulnerabilities
- ethereum
- "memory leaks, infinite loops, SQL injection, XSS, and CSRF, or for that matter “billion-laughs”-like denial-of-service vulnerabilities, let alone more prosaic problems like the Flexcoin bankruptcy due to using a non-transactional data store."
- "as compiler optimizations exploit increasingly recondite properties of the programming language definition, we find ourselves having to program as if the compiler were our ex-wife’s or ex-husband’s divorce lawyer, lest it introduce security bugs into our kernels, as happened with FreeBSD a couple of years back with a function erroneously annotated as noreturn, and as is happening now with bounds checks depending on signed overflow behavior."
- assume that external binary libraries such as parsers might crash [1]
- in PHP, watch out for == vs === [2]
- https://googleprojectzero.blogspot.com/2016/10/taskt-considered-harmful.html
- https://bugs.chromium.org/p/nativeclient/issues/detail?id=245
- https://news.ycombinator.com/item?id=14373347
- https://isc.sans.edu/diary/A+new+fascinating+Linux+kernel+vulnerability/6820
- http://www.kb.cert.org/vuls/id/162289
- https://www.reddit.com/r/ethereum/comments/6oaqti/important_security_alert/dkg0pml/
- https://www.kingoftheether.com/contract-safety-checklist.html
- actually you could say that there was a second, additional bug, in that .delegatecall was used instead of whitelisting methods:
- http://hackingdistributed.com/2017/07/20/parity-wallet-not-alone/
- https://blog.zeppelin.solutions/augur-rep-token-critical-vulnerability-disclosure-3d8bdffd79d2
- https://www.theatlantic.com/technology/archive/2017/09/saving-the-world-from-code/540393/
- https://medium.freecodecamp.org/the-times-ive-messed-up-as-a-developer-3c0bcaa1afd6
- https://marcan.st/2017/12/debugging-an-evil-go-runtime-bug/
- list, sorta: https://people.csail.mit.edu/nickolai/papers/lazar-cryptobugs.pdf
- https://www.blackhat.com/docs/eu-17/materials/eu-17-Arnaboldi-Exposing-Hidden-Exploitable-Behaviors-In-Programming-Languages-Using-Differential-Fuzzing-wp.pdf
- http://gulftech.org/advisories/WDMyCloud%20Multiple%20Vulnerabilities/125
- https://randomascii.wordpress.com/2018/01/07/finding-a-cpu-design-bug-in-the-xbox-360/
- https://blog.bugsnag.com/bug-day-race-condition-therac-25/
- https://news.ycombinator.com/item?id=17740944
- https://hackernoon.com/how-to-not-destroy-millions-in-smart-contracts-pt-2-85c4d8edd0cf
- https://about.gitlab.com/2018/11/14/how-we-spent-two-weeks-hunting-an-nfs-bug/
- https://news.ycombinator.com/item?id=23053728
- https://thomask.sdf.org/blog/2019/11/09/take-care-editing-bash-scripts.html
- https://songlh.github.io/paper/go-study.pdf
- https://freedom-to-tinker.com/2013/10/09/the-linux-backdoor-attempt-of-2003/ (= instead of ==)
- https://offensi.com/2020/08/18/how-to-contact-google-sre-dropping-a-shell-in-cloud-sql/
- https://hackerone.com/reports/783877
- https://www.bbc.com/news/technology-54423988
- https://mksben.l0.cm/2020/10/discord-desktop-rce.html
- https://alaa.blog/2020/12/how-i-hacked-facebook-part-one/
- https://buttondown.email/cryptography-dispatches/archive/cryptography-dispatches-the-most-backdoor-looking/
- https://popey.com/blog/2021/01/null/
- https://googleprojectzero.blogspot.com/2021/01/the-state-of-state-machines.html
- https://secret.club/2021/01/15/bitlocker-bypass.html
- https://www.qualys.com/2021/01/26/cve-2021-3156/baron-samedit-heap-based-overflow-sudo.txt
- https://randomascii.wordpress.com/2021/02/16/arranging-invisible-icons-in-quadratic-time/
- https://news.ycombinator.com/item?id=26154572
- https://blog.wesleyac.com/posts/timezone-bullshit
- https://engineering.skroutz.gr/blog/uncovering-a-24-year-old-bug-in-the-linux-kernel/
- https://robertchen.cc/blog/2021/04/03/github-pages-xss
- https://news.ycombinator.com/item?id=26768299
- https://twitter.com/m13253/status/1371615680068526081 (accidental infinite loop is undefined behavior, so compiler doesn't generate function return and execution falls off the end of the function into the next function)
- https://news.ycombinator.com/item?id=27292219
- https://sin-ack.github.io/posts/jpg-loader-bork/
- summary: someone forgot that something was supposed to be ordered, and put it in a hashtable, and then used a hashtable iterator to iterate over it; the bug was masked because it happened to come out in the right order anyway; then the bug later appeared when something relating to malloc was changed which changed the hashtable ordering
- "LLVM’s built-in unordered containers have a mode where iteration order is reversed. This was added to catch exactly this kind of bug" -- david chisnall
- https://bhavukjain.com/blog/2020/05/30/zeroday-signin-with-apple/
- https://bugs.xdavidhu.me/google/2020/03/08/the-unexpected-google-wide-domain-check-bypass/
- https://rekken.github.io/2020/05/14/Security-Flaws-in-Adobe-Acrobat-Reader-Allow-Malicious-Program-to-Gain-Root-on-macOS-Silently/
- https://push32.com/post/dating-app-fail/
- https://news.ycombinator.com/item?id=23021637
- https://news.ycombinator.com/item?id=23022948
- https://news.ycombinator.com/item?id=23018710
- https://news.ycombinator.com/item?id=23027338
- https://news.ycombinator.com/item?id=23020137
- https://news.ycombinator.com/item?id=23018179
- https://news.ycombinator.com/item?id=23019491
- https://news.ycombinator.com/item?id=23018770
- http://beza1e1.tuxen.de/lore/index.html
- https://danluu.com/postmortem-lessons/
- https://www.rtcsec.com/post/2020/04/how-we-abused-slacks-turn-servers-to-gain-access-to-internal-services/ https://news.ycombinator.com/item?id=22804290
- https://blog.teddykatz.com/2019/11/12/github-actions-dos.html
- https://gist.github.com/0xabad1dea/be18e11beb2e12433d93475d72016902
- https://google.github.io/security-research/pocs/linux/cve-2021-22555/writeup.html
- https://blog.ryotak.me/post/cdnjs-remote-code-execution-en/
- https://www.openwall.com/lists/oss-security/2021/07/20/1
- https://twitter.com/kelvinfichter/status/1425217046636371969
- https://github.com/stong/how-to-exploit-a-double-free
- https://reinference.net/mp-talk.pdf
- https://blog.unity.com/technology/debugging-memory-corruption-who-the-hell-writes-2-into-my-stack-2
- https://blog.cloudflare.com/the-tale-of-a-single-register-value/?a
- https://increment.com/testing/i-test-in-production/
- https://news.ycombinator.com/item?id=29504755
- http://rachelbythebay.com/w/2021/12/24/mkdir/
- https://serpapi.com/blog/how-a-routine-gem-update-ended-up-charging/
- https://blog.sunfishcode.online/bugs-in-hello-world/
- https://dirtypipe.cm4all.com/
- https://gist.githubusercontent.com/plutooo/2aadbd4a718e269df474079dd2e584fb/raw/7b3af77b5202366c8934c88ef251f1e905967040/gistfile1.txt
- https://www.graplsecurity.com/post/attacking-firecracker
- https://probablydance.com/2022/09/17/finding-the-second-bug-in-glibcs-condition-variable/
- http://rachelbythebay.com/w/2022/12/02/25k/
- https://en.wikipedia.org/wiki/List_of_software_bugs
- https://github.blog/2023-09-26-getting-rce-in-chrome-with-incorrect-side-effect-in-the-jit-compiler/
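The hashtable-ordering bug summarized in the list above (something that was supposed to be ordered, iterated via a hashtable, and masked because the order happened to come out right) can be flushed out by running the same code under a reversed iteration order, in the spirit of the reverse-iteration mode from the Chisnall quote. A minimal Python sketch; `render_steps`, `render_steps_fixed`, `order_sensitive`, and the `steps` data are all hypothetical names made up for illustration, and this is not how LLVM implements its mode:

```python
def render_steps(steps, iterate=iter):
    """Buggy: emits step names in whatever order the container yields them."""
    return " -> ".join(iterate(steps))


def render_steps_fixed(steps, iterate=iter):
    """Fixed: sorts by the stored position instead of trusting container order."""
    return " -> ".join(sorted(iterate(steps), key=steps.__getitem__))


def order_sensitive(fn, container):
    """Return True if fn's result changes when iteration order is reversed."""
    forward = fn(container, iterate=iter)
    backward = fn(container, iterate=lambda c: reversed(list(c)))
    return forward != backward


# Maps step name -> required position (illustrative data only).
steps = {"parse": 0, "decode": 1, "render": 2}

print(order_sensitive(render_steps, steps))        # buggy version depends on order
print(order_sensitive(render_steps_fixed, steps))  # fixed version does not
```

The buggy version passes any test that only ever sees the "lucky" order; the reversed run is what exposes it.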
---
" Recent security stories confirm that errors like buffer overflow and use-after-free can have serious, widespread consequences when they occur in critical open source software. These errors are not only serious, but notoriously difficult to find via routine code audits, even for experienced developers. That's where fuzz testing comes in. By generating random inputs to a given program, fuzzing triggers and helps uncover errors quickly and thoroughly. In recent years, several efficient general purpose fuzzing engines have been implemented (e.g. AFL and libFuzzer), and we use them to fuzz various components of the Chrome browser. These fuzzers, when combined with Sanitizers, can help find security vulnerabilities (e.g. buffer overflows, use-after-free, bad casts, integer overflows, etc), stability bugs (e.g. null dereferences, memory leaks, out-of-memory, assertion failures, etc) and sometimes even logical bugs. OSS-Fuzz's goal is to make common software infrastructure more secure and stable by combining modern fuzzing techniques with scalable distributed execution. OSS-Fuzz combines various fuzzing engines (initially, libFuzzer) with Sanitizers (initially, AddressSanitizer) and provides a massive distributed execution environment powered by ClusterFuzz. " -- [4]
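The core idea in the quote (throw random inputs at a program and record what breaks) can be sketched in a few lines of Python. This is plain random fuzzing, not the coverage-guided approach AFL and libFuzzer use, and `parse_record` is a hypothetical buggy parser invented for the example. Python's own bounds checking stands in for what a sanitizer would catch in C:

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical parser under test: first byte is a length, rest is payload."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:]
    out = bytearray()
    # Bug: trusts the length byte without checking it against the payload size.
    for i in range(length):
        out.append(payload[i])
    return bytes(out)


def fuzz(target, runs=1000, max_len=8, seed=0):
    """Feed random byte strings to target; collect inputs that crash it."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(runs):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass  # clean rejection of malformed input is expected behavior
        except Exception as exc:
            crashes.append((data, exc))  # anything else counts as a finding
    return crashes


crashes = fuzz(parse_record)
print(f"{len(crashes)} crashing inputs found")
```

Each entry in `crashes` keeps the offending input, so a finding can be replayed as a regression test.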
---
" Change One Thing at a Time: On nuclear-powered subs, there's a brass bar in front of the control panel for the power plant. When status alarms begin to go off, the engineers are trained to grab the brass bar with both hands and hold on until they've looked at all the dials and indicators, and understand exactly what's going on in the system. What this does is help them overcome the temptation to start "fixing" things, throwing switches and opening valves. These quick fixes confuse the automatic recovery systems, bury the original fault beneath an onslaught of new conditions, and may cause a real, major disaster. It's more effective to remember to do something ("Grab the bar!") tha " -- ianmcgowan
---
https://rachelbythebay.com/w/2018/04/21/lb/
if one server is failing every request, but failing fast, the load balancer may divert a disproportionately large number of requests to that server
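A toy simulation of this effect, assuming the balancer uses a least-connections policy (an assumption for illustration; the function name `simulate` and all the numbers are made up). One request arrives per tick, healthy servers hold a request for 10 ticks, and the broken server "finishes" in 1 tick because it errors out immediately:

```python
def simulate(service_ticks, total_requests=1000):
    """Least-connections balancing with one request arriving per tick.

    service_ticks[i] is how many ticks server i holds a request before
    responding (successfully or with an error; the balancer cannot tell).
    Returns the number of requests each server received.
    """
    n = len(service_ticks)
    in_flight = [[] for _ in range(n)]  # completion ticks of open requests
    handled = [0] * n
    for tick in range(total_requests):
        # Retire requests that have completed by this tick.
        for i in range(n):
            in_flight[i] = [t for t in in_flight[i] if t > tick]
        # Route to the server with the fewest outstanding requests
        # (ties go to the lowest index).
        target = min(range(n), key=lambda i: len(in_flight[i]))
        in_flight[target].append(tick + service_ticks[target])
        handled[target] += 1
    return handled


# Servers 0-2 are healthy (10 ticks each); server 3 fails instantly (1 tick).
handled = simulate([10, 10, 10, 1])
print(handled)  # -> [100, 100, 100, 700]
```

Because the failing server always looks idle, it receives 70% of the traffic in this setup instead of its fair 25%, even though ties are broken in favor of the healthy servers.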
---
https://news.ycombinator.com/item?id=27175622
---
Garbi 13 hours ago
I don’t have all day so I just picked one at the bottom, this one https://blog.nelhage.com/post/a-cursed-bug/ . I didn’t even know that RDMA was a thing and now I’m sitting in the corner hugging my knees and rocking slowly.
aidancully 11 hours ago
Honestly, fork is more the problem, here, than RDMA is. (And I see that that’s also the name of the text that links there on the iceberg.) fork only works well for private memory mappings in single-threaded processes. Every other process configuration / OS resource is problematic. This issue https://github.com/smol-rs/polling/issues/37#issue-1122097454 we encountered is much less exotic, but a natural consequence of the trouble with mixing “everything is a file” behaviors with “fork”.
(Also, I’m amused to see that that link describes a bug very much like one we encountered. Our local upshot was that we couldn’t make our system compatible with rsocket, we had to fall back to using libibverbs directly, specifically so that we could control which memory got the MADV_DONTNEED treatment.)
---