D. Chris Rayner

Hi! I completed my PhD in computing science in 2014 at the University of Alberta, where I machine-learned pathfinding heuristics. Since then, I've worked as a data scientist on Dissolve's search engine, and then in applied machine learning at IBM's Global Chief Data Office… (see portfolio)

Notes

2020.11.26 "When numbers of different magnitudes are [added/subtracted], digits of the smaller-magnitude number are lost… when numbers very close to each other are subtracted, the result's less significant digits consist mostly of rounding errors." (The Floating-Point Guide - Error Propagation)
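
Both effects in the quote above are easy to reproduce with ordinary double-precision floats. The sketch below is only an illustration; the specific values (1e16 and 1e-8) are chosen here and do not come from the guide.

```python
# Absorption: near 1e16 adjacent doubles are 2 apart, so adding 1.0
# cannot change the larger number and the smaller addend is lost.
big = 1e16
absorbed = (big + 1.0) - big      # mathematically 1.0

# Cancellation: subtracting nearly equal numbers keeps only the
# low-order digits, which already carry the earlier rounding error.
x = 1e-8
naive = (1.0 + x) - 1.0           # mathematically 1e-8
relative_error = abs(naive - x) / x

print(absorbed)        # 0.0
print(naive, relative_error)
```

The relative error of the second result is many orders of magnitude larger than the roughly 1e-16 that a single rounded operation would introduce.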

2020.11.15 "Production versions of the hypot function … are much more complex than you might imagine." (Hypot – A story of a 'simple' function). "Floating-point math has a reputation for producing results that are unpredictably wrong [but] IEEE floating-point math is designed to, whenever practical, give the best possible answer." (There are Only Four Billion Floats – So Test Them All!)
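
One reason a production hypot is more complex than the textbook formula is intermediate overflow. A minimal sketch, assuming a hypothetical naive_hypot helper written here for comparison with Python's built-in math.hypot:

```python
import math

def naive_hypot(x, y):
    """Textbook formula; x * x can overflow even when the result fits."""
    return math.sqrt(x * x + y * y)

x, y = 3e200, 4e200
overflowed = naive_hypot(x, y)    # x * x exceeds the double range
careful = math.hypot(x, y)        # library version avoids the overflow

print(overflowed, careful)
```

The true answer, about 5e200, is comfortably representable; only the squared intermediates are not.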

2020.10.31 "Timid code is kind of like a poorly told story" – "it's uncertain, it's constantly second-guessing itself, it's filled with digressions and provisos… It's constantly mixing together input handling and business logic and error handling." (Confident Code)

2020.10.25 In pursuit of coverage, "splitting up functions [destroyed] system architecture and code comprehension along with it": "functions no longer encapsulated algorithms. It was no longer possible to reason about the execution context of a line of code." (Why Most Unit Testing is Waste)

2020.10.24 "A solid systems approach should not be based on 'but it works'" (Theo de Raadt Responds) but "contracts enforced by the compiler are usually better than contracts enforced by runtime checks, or worse, documentation-only contracts." (Make Interfaces Hard to Misuse)

2020.10.22 "Throw exceptions only to indicate exceptional conditions" and "avoid return values that demand exceptional processing. Clients will forget to write the special-case code, leading to bugs. For example, return zero-length arrays or collections rather than nulls." (Bumper-Sticker API Design)
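
The advice about zero-length collections carries over to Python, where None plays the role of null. A small sketch; find_matches and its data are hypothetical names invented for this illustration:

```python
def find_matches(words, prefix):
    """Return every word starting with prefix; an empty list if none do."""
    return [word for word in words if word.startswith(prefix)]

# The caller iterates without a special case for "nothing found".
words = ["apple", "avocado", "banana"]
total_length = sum(len(word) for word in find_matches(words, "c"))
print(total_length)    # 0
```

Had find_matches returned None for the empty case, the loop above would raise a TypeError unless every caller remembered to check first.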

2020.10.18 "It’s simpler to delete the code inside a function than it is to delete a function" (ibid). Compare with Hyrum's Law: "With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody."

2020.10.17 "Copy-paste code … to get a handle on how it will be used"; "keep the hard-to-delete parts as far away as possible from the easy-to-delete parts"; "[minimize] responsibilities of library code, even if we have to write boilerplate …" (Write code that is easy to delete, not easy to extend)

2020.10.10 "Don't include a sentence in documentation if its negation is obviously false." (Test of Negation)

2020.10.04 "…get better at copy editing the hard way: by practicing. Start by acquiring a half-baked piece of writing from someone else and trying to improve it. [S]ince you won’t be attached to it, your editing can be merciless." (You Might as Well Be a Great Copy Editor)

2020.09.13 Minimize cognitive load: "Naming consistency means both internal naming consistency (don't call 'dim' what is called 'axis' in other places) and consistency with established conventions for the problem domain." (Notes to Myself on Software Engineering)

2020.08.21 "Complex: state, objects, methods, syntax, inheritance, switch, vars, imperative loops, actors, ORM, conditionals. Simpler: values, functions, namespaces, data, polymorphism, managed refs, set functions, queues, declarative data manipulation, rules, consistency." (Simple != Easy)

2020.08.13 Performance is underrated – it is both "a feature [that] fundamentally alters how a tool is used" and can "simplify architecture [since] attempts to add performance to a slow system often add complexity…" (Reflections on software performance)

2020.08.12 "Original ideas are rarely born in full generality [and] their communication is not always a simple or straightforward task." E.g. "van Wijngaarden said that goto's were unnecessary [but] Dijkstra stretched the point to say that goto's were inconvenient." (Discoveries of Continuations)

2020.08.11 "The choice of the name 'computing science' instead of the more common 'computer science' was deliberate in order to indicate that computing rather than computers was to be the foundation of the discipline." (Computing Science at the University of Alberta)

2020.08.09 Assert configuration: "Configurations are by their nature ephemeral and less well tested. Assertions about configuration invariants can be critical to prevent mistakes." Aim to be able "to present visual side-by-side differences (diffs) of two configurations." (Configuration Debt)
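
A rough sketch of both ideas, assuming configurations are plain dictionaries. The keys, the invariant, and the helper names are made up for illustration, and difflib produces a unified diff rather than a true side-by-side view:

```python
import difflib
import json

def assert_invariants(config):
    """Fail fast on a configuration that violates a known invariant."""
    assert config["min_workers"] <= config["max_workers"], "worker bounds inverted"
    assert config["timeout_s"] > 0, "timeout must be positive"

def config_diff(old, new):
    """Render two configurations as a unified diff of their sorted JSON."""
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    return "\n".join(
        difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm="")
    )

old = {"min_workers": 1, "max_workers": 4, "timeout_s": 30}
new = {"min_workers": 1, "max_workers": 4, "timeout_s": 5}
assert_invariants(new)
print(config_diff(old, new))
```

Sorting the keys before diffing keeps the output stable when two configurations list the same settings in a different order.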

2020.08.08 "…neither unit tests nor assertions find all bugs, so we should use both." Assert math (e.g. "problems that are inherently much more difficult to solve than to verify"), preconditions, postconditions, invariants, and the spec. (Use of Assertions)
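
As a small illustration of asserting preconditions, invariants, and postconditions, here is an integer square root whose answer is found by search but checked with a single comparison. The function is a sketch written for this note, not code from the cited paper:

```python
def integer_sqrt(n):
    """Largest integer r with r * r <= n, found by binary search."""
    assert isinstance(n, int) and n >= 0, "precondition: non-negative integer"
    low, high = 0, n + 1
    while high - low > 1:
        # Invariant: low * low <= n < high * high
        assert low * low <= n < high * high
        mid = (low + high) // 2
        if mid * mid <= n:
            low = mid
        else:
            high = mid
    # Postcondition: verifying the answer is much cheaper than finding it.
    assert low * low <= n < (low + 1) * (low + 1)
    return low

print(integer_sqrt(10))    # 3
```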

2020.05.25 "…deliberately change notation in order to 'escape from the formalism' [and] highlight different aspects of a problem or solution." (Insights from Expert Software Design Practice)

2020.05.19 Externalize infrastructure using layers: "code cannot depend on layers further out from the core [i.e.] the Domain Model is the very center [and] the outer layer is reserved for things that change often. [Since] the application core needs implementation… we need some mechanism for injecting that code" (abstract interfaces, dependency injection). (Onion Architecture)
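
One way to picture the layering: the core declares the interface it needs, and an outer layer supplies the implementation at run time. Every name below (OrderRepository, loyalty_discount, InMemoryOrders) is hypothetical and chosen only for this sketch:

```python
from typing import Protocol

# Core layer: the domain rule and the interface it depends on.
class OrderRepository(Protocol):
    def total_spent(self, customer: str) -> float: ...

def loyalty_discount(orders: OrderRepository, customer: str) -> float:
    """Domain rule; knows nothing about databases or files."""
    return 0.10 if orders.total_spent(customer) >= 1000 else 0.0

# Outer layer: infrastructure that can change without touching the core.
class InMemoryOrders:
    def __init__(self, totals):
        self._totals = dict(totals)

    def total_spent(self, customer: str) -> float:
        return self._totals.get(customer, 0.0)

# Composition: the implementation is injected into the core function.
orders = InMemoryOrders({"ada": 1500.0, "bob": 20.0})
print(loyalty_discount(orders, "ada"), loyalty_discount(orders, "bob"))
```

The dependency points inward: InMemoryOrders conforms to an interface owned by the core, and the core never imports it.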

2020.04.27 A complicated system — software or otherwise — "not only makes our lives difficult in the present, but it accelerates the loss of knowledge over time… Each individual person knows a smaller percent… it's harder for them to transmit their knowledge onto people in the future… and deep knowledge becomes replaced by trivia." (Preventing the Collapse of Civilization)

2020.03.08 "Anything that isn’t crystal clear to a static analysis tool probably isn't clear to your fellow programmers, either", so "favor indexing over pointer arithmetic, try to keep your call graph inside a single source file, use explicit annotations, etc." (Static Code Analysis)

2020.01.26 "An increase in the score produced by a base model should not decrease the score of the ensemble"; "each model should either be an ensemble [or a base model] but not both"; "[calibrate the base models] so that changes [don't] confuse the ensemble." (Keep Ensembles Simple)

2020.01.19 "It can be tempting to add a new feature to a model that improves accuracy, even when the accuracy gain is very small… Research solutions that provide a tiny accuracy benefit at the cost of massive increases in system complexity are rarely wise practice." (Epsilon Features)

2020.01.11 Better survival characteristics: "The design must be simple, both in implementation and interface. It is more important for the implementation to be simple than the interface … Completeness must be sacrificed whenever implementation simplicity is jeopardized." (Worse is Better)

2020.01.04 "It is especially common in a research setting to evaluate a new idea with partial or primitive implementations of other parts of the system … it can be very difficult to assess whether the benefits the system claims will still manifest once the fat is removed." (Scalability! But at what COST?)

2019.12.31 Limited to averages: "…the average time between earthquakes [is] approximately normal [but] not everything we study is an average"; in fact, "the [raw] time between earthquakes is usually modeled by an exponential distribution…" (The Central Limit Theorem and its misuse)
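
A quick simulation makes the distinction concrete, using skewness as a crude symmetry check (a normal distribution has skewness 0, an exponential has skewness 2). The rate, sample sizes, and seed are arbitrary choices for this sketch:

```python
import random

def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return sum((v - mean) ** 3 for v in values) / (n * variance ** 1.5)

rng = random.Random(0)    # fixed seed so the run is repeatable
gaps = [rng.expovariate(1.0) for _ in range(100_000)]

# Average the raw gaps in consecutive groups of 50.
averages = [sum(gaps[i:i + 50]) / 50 for i in range(0, len(gaps), 50)]

raw_skew = skewness(gaps)
average_skew = skewness(averages)
print(raw_skew, average_skew)
```

The averages come out far more symmetric than the raw gaps, which is the central limit theorem at work; the raw gaps themselves stay heavily skewed no matter how many are collected.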

2019.12.28 "Since tests don't have tests, it should be easy for humans to manually inspect them for correctness, even at the expense of greater code duplication. This means that the DRY principle often isn’t a good fit for unit tests…" (Tests Too Dry? Make Them Damp!)

2019.12.27 Two identical functions can still be DRY when "the knowledge they represent is different. [For example] when two functions validate two separate things that just happen to have the same rules. That’s a coincidence, not a duplication." (The Evils of Duplication)

2019.12.24 "…once you have encapsulated some task in automation, anyone can execute the task. Therefore, the time savings apply across anyone who would plausibly use the automation. Decoupling operator from operation is very powerful." (Site Reliability Engineering)

2019.12.23 Data rollup "summarizes old, high-granularity data into a reduced granularity format for long-term storage" (Elasticsearch Reference). Spike erosion: "the single spike is averaged with an increasing number of 'normal' samples and hence decreases in height." (Statistics for Engineers)
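
Spike erosion is simple to reproduce by averaging one series at coarser and coarser granularity. The rollup helper and the sample values are invented for this illustration and are not Elasticsearch's API:

```python
def rollup(samples, width):
    """Summarize samples by averaging consecutive windows of the given width."""
    return [
        sum(samples[i:i + width]) / width
        for i in range(0, len(samples), width)
    ]

# Sixty "normal" readings of 10.0 with a single spike of 1000.0.
samples = [10.0] * 60
samples[30] = 1000.0

peaks = {width: max(rollup(samples, width)) for width in (1, 5, 60)}
print(peaks)    # {1: 1000.0, 5: 208.0, 60: 26.5}
```

Storing a per-window maximum alongside the mean is one common way to keep spikes visible after a rollup.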

2019.12.07 Data abstraction "easily leads to inappropriate reuse [and] will often cause un-needed, irrelevant data to be supplied to a function." "The data which does get used (and hence influences the result of a function) is hidden at the function call site." (Out of the Tar Pit)

2019.12.04 Not so regimented: "Abstract Factory, Builder, and Prototype [all] involve creating a new 'factory object' whose responsibility is to create… The 'factory object' and the prototype are the same object, because the prototype is responsible for returning the product." (Design Patterns)

2019.12.02 The fourth type of factory (2009) is a simple factory: "move [creation code] into another object that is only going to be concerned with creating." The 2008 factory might be viewed as "defining a simple factory as a static method." (Head-First Design Patterns)

2019.12.01 The third type of factory with "no direct equivalent in Design Patterns": "where a class seems to require multiple constructors with the same signature, replace the constructors with static factory methods and carefully chosen names…" (Creating and Destroying Java Objects)
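
The quote is about Java, but a Python analogue uses class methods as named constructors. Point and its factory names are hypothetical, picked because both factories take two floats and could not be told apart by signature alone:

```python
import math

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_cartesian(cls, x, y):
        """Named factory: arguments are x and y coordinates."""
        return cls(x, y)

    @classmethod
    def from_polar(cls, radius, angle):
        """Named factory: same signature, different meaning."""
        return cls(radius * math.cos(angle), radius * math.sin(angle))

a = Point.from_cartesian(2.0, 0.0)
b = Point.from_polar(2.0, 0.0)
print((a.x, a.y), (b.x, b.y))
```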

2019.11.29 "The syntax for functional languages tends to be verb then noun: f(x), whereas the syntax for object oriented languages tends to be noun then verb: x.f()"; "There's a big difference in usability though: auto-complete … Knowing the type of x narrows down the list." (Verb-noun vs noun-verb)

2019.11.23 "Objects are state data with attached behavior; Closures are behaviors with attached state data and without the overhead of classes." (Design Patterns in Dynamic Programming)
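
The same counter written both ways shows the duality. Both versions are illustrative sketches, not code from the cited talk:

```python
class Counter:
    """Object: state (count) with attached behavior (increment)."""
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

def make_counter():
    """Closure: behavior (increment) with attached state (count)."""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment

as_object = Counter()
as_closure = make_counter()
print(as_object.increment(), as_object.increment())    # 1 2
print(as_closure(), as_closure())                      # 1 2
```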

2019.11.06 "Exceptions eliminate collocation. You have to look somewhere else to answer a question of whether code is doing the right thing, so you’re not able to take advantage of your eye’s built-in ability to learn to see wrong code, because there’s nothing to see." (Making Wrong Code Look Wrong)

2019.10.27 "Suboptimally optimizing a sensible objective function is a more viable approach than optimizing a convex objective function that contains obvious flaws." (Dimensionality Reduction)

2019.10.23 "A 25-dollar term for a 5-cent concept": "use the [variable/function, i.e. dependency] that we were given rather than the one we [created, imported, etc]." (Dependency Injection)
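
In its smallest form the idea fits in a function signature. A sketch, with is_expired and its now parameter invented for illustration:

```python
import time

def is_expired(deadline, now=time.time):
    """Use the clock we were given rather than reaching for a global one."""
    return now() >= deadline

# Production code relies on the default; a test passes in a fixed clock.
print(is_expired(100.0, now=lambda: 150.0))    # True
print(is_expired(100.0, now=lambda: 50.0))     # False
```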

2019.10.20 "Sometimes you pass data to a routine or class merely so that it can be passed to another routine or class… Use of global variables can eliminate tramp data." (Code Complete/Global Data)

2019.10.08 "If I look at any small part of it, I can see what is going on … like a fractal, in which every level of detail is as locally coherent and as well thought out as any other" (The Quality Without a Name). "If a method call is not separately understandable, the reader of the code will have to jump to the implementation of the method in order to see what's going on." (Separate Understandability)

2019.10.05 "Building unit tests is itself an interesting test of orthogonality. What does it take to build and link a unit test? Do you have to drag in a large percentage of the rest of the system just to get a test to compile or link?" (Orthogonality)

2019.10.02 "Compression" in object-oriented code can hide context: "the primary feature for easy maintenance is locality: Locality is that characteristic of source code that enables a programmer to understand that source by looking at only a small portion of it." (Reuse Versus Compression)

2019.09.29 "There is a continuum of value in how pure a function is, and the value step from almost-pure to completely-pure is smaller than that from spaghetti-state to mostly-pure. Moving a function towards purity improves the code…" (Functional programming in C++)

2019.09.28 "The act of writing software is the act of naming, repeated over and over again. It’s likely that software engineers create more names than any other profession. Given this, it’s curious how little time is spent discussing names in a typical computer science education." (Elements of Clojure)

2019.09.22 "Implement new ideas in parallel with the old ones, rather than mutating the existing code"; "If the task you are working on can be expressed as a pure function that simply processes input parameters into a return structure, it is easy to switch it out…" (Parallel Implementations)

2019.09.21 Bad physical design: "A few innocent #includes sometimes expand to megabytes of header data once all the recursive inclusion is resolved"; "really easy to screw up, but requires concentrated effort and hard work to fix." Prevent with levelizing, e.g. "header files (.h) cannot include other header files." (Physical Design of the Machinery)

2019.09.18 "Minimize control flow complexity and 'area under ifs', favoring consistent execution paths and times over 'optimally' avoiding unnecessary work": "[this] usually takes more absolute time, but it reduces the variability in frame times, and eliminates a class of bugs." (Inlined Code)

2019.09.15 "When dealing with the wrong abstraction, the fastest way forward is back … re-introduce duplication by inlining the abstracted code back into every caller; within each caller, use the parameters being passed to determine the subset of the inlined code that this specific caller executes; delete the bits that aren't needed for this particular caller." (The Wrong Abstraction)

2019.09.11 "…design principle devised by Bertrand Meyer … every method should either be a command that performs an action (changes state), or a query that returns [an] answer to the caller (without changing the state or causing side-effects), but not both." (Command-Query Separation)
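
A stack is the usual illustration, because the familiar pop both changes state and returns a value. The sketch below splits it into a query and a command; the class and method names are chosen here for illustration:

```python
class Stack:
    def __init__(self):
        self._items = []

    # Commands: change state, return nothing.
    def push(self, item) -> None:
        self._items.append(item)

    def discard_top(self) -> None:
        self._items.pop()

    # Queries: return an answer, change nothing.
    def top(self):
        return self._items[-1]

    def is_empty(self) -> bool:
        return not self._items

stack = Stack()
stack.push("a")
stack.push("b")
print(stack.top(), stack.top())    # b b
stack.discard_top()
print(stack.top())                 # a
```

Because top is a pure query, calling it twice gives the same answer, which is exactly what the combined pop cannot promise.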

2019.09.09 One level of indentation per method; don’t use the ELSE keyword; wrap all primitives and strings; first class collections; one dot per line; don’t abbreviate; keep all entities small; no classes with more than two instance variables; no getters/setters/properties (Object Calisthenics).

2019.09.07 "…choose your return type based on what you need to continue fluent action … the problems of methods in a fluent interface is that they don't make much sense on their own" (Fluent Interfaces). "Always return self"; "Remove any getter methods [instead] create a command which will apply the data to the thing" (East-Oriented Code).

2019.09.05 "…create the object using a parameter-less constructor and then you set only fields which you want using mutators … [but then] fields cannot be declared final". Consider a Builder: "to make things more convenient, the builder returns itself, so you can chain the method calls (fluent interface)" (Telescoping Constructor Pattern alternatives).
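
The cited discussion is about Java, where the concern is final fields. A rough Python analogue pairs a frozen dataclass with a builder whose setters return self; Pizza and PizzaBuilder are hypothetical names for this sketch:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pizza:
    """Immutable product: fields cannot be reassigned after construction."""
    size: str
    cheese: bool
    toppings: tuple

class PizzaBuilder:
    def __init__(self, size):
        self._size = size
        self._cheese = False
        self._toppings = []

    def with_cheese(self):
        self._cheese = True
        return self            # returning self enables chaining

    def with_topping(self, name):
        self._toppings.append(name)
        return self

    def build(self):
        return Pizza(self._size, self._cheese, tuple(self._toppings))

pizza = PizzaBuilder("large").with_cheese().with_topping("basil").build()
print(pizza)
```

In everyday Python, keyword arguments with defaults often remove the need for a builder; the pattern earns its keep mainly when construction happens in several steps.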

2019.08.31 "…work with percentiles rather than the mean (arithmetic average) of a set of values. Doing so makes it possible to consider the long tail of data points, which often have significantly different (and more interesting) characteristics than the average." (Site Reliability Engineering)
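
A toy example of why the mean hides the tail, using the standard library's statistics module. The latency values are fabricated purely for illustration:

```python
import statistics

# 98 fast requests and two very slow ones (milliseconds).
latencies = [10] * 98 + [1000, 2000]

mean = statistics.mean(latencies)
median = statistics.median(latencies)
# quantiles(n=100) returns the 99 cut points between percentiles;
# index 98 is the 99th percentile.
p99 = statistics.quantiles(latencies, n=100)[98]

print(mean, median, p99)
```

The mean lands between the typical request and the slow ones and describes neither, while the median and the 99th percentile together show both the common case and the tail.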

2019.08.25 "No clear map or schedule for continuing to learn … while you tackle more immediate challenges; not creating many small katas, exercises, and projects while learning; not seeking one-on-one mentorship" (28 Pitfalls When Learning to Program).

2019.08.24 Experimenting with notes ("a collection without order").

Bookmarks

Rules of Machine Learning
43 best practices couched in standardized terminology, e.g. "The number of feature weights you can learn in a linear model is roughly proportional to the amount of data you have."
Best Kept Secrets of Great Programmers
39 software development tips as a Quora answer, e.g. "Code paths that handle failures are rarely tested/executed (for a reason). This makes them a good candidate for bugs."
Bumper-Sticker API Design
38 API design tips, e.g. "Keep APIs free of implementation details… be wary of overspecification."