From Startups to MegaCorp: Unlearning to fix everything

In the startup world when you discover an issue, whether it's a logic error or an environmental misconfiguration, being able to fix it quickly and, maybe, document the fix is a virtue.

A hard lesson that I'm learning is that in a different environment that which was a virtue in the startup world can actually be harmful. By fixing an environmental issue you can deprive the organization of the process correction necessary to handle it structurally.

This strikes me as analogous to the decision to *not* handle an error in code. Sometimes you intentionally don't handle an error even though it is anticipated because to do so masks a more fundamental problem. The "if this error occurs, something is deeply wrong in a way that can't be recovered automatically" is a widely established pattern in software construction.

The hard part is realizing that it generalizes to the organizational level. By fixing an environmental issue that another team is responsible for you may well deprive that team of a needed corrective. What happens when you're too busy working on problems that can only be handled by your team and the issue comes up again?

Expressivity vs Trickle-Down Layering

A basic architectural principle in software engineering about dependencies across layers is that the flow of dependencies should go in a single direction. For example, in a hypothetical 3 layer system composed of top, middle and bottom layers the top layer depends on the middle layer and the middle layer depends on the bottom layer.

If the middle layer reaches back up into the top layer then you've created a circular dependency. Circular dependencies make a system harder to understand (and therefore maintain). One way in which they do this is their tendency to create chicken-egg problems. Put another way, they introduce another axis of dependency (order/time) that may not be obvious from inspecting the code. Chicken-egg problems aren't necessarily intractable, clearly *something* came first, but by adding another dimension they exponentially increase the size of the solution space making for many more blind alleys in which to get lost.

Working on a large system built up over time by many different contributors has given me a much greater appreciation for the importance of this basic architectural principle. This is a bit of a hard-learned lesson because it tends against one of my favorite design principles: the composite pattern. A layer built based on the composite pattern can reference itself as well as lower layers. I'm playing a little loose with the terminology here because layers generally refer to larger pieces of a system while the composite pattern is usually applied to smaller pieces of a large system but bear with me.

Let's take a hypothetical system called Manuals. The Manuals system is maintained by the Manuals team. Manuals depend on Chapters. Chapters depend on Sentences. Sentences are the lowest level of the system - this is where the meat of the work is done. The higher layers (Chapters and Manuals) are primarily about organizing/ordering the work of Sentences.

A few months after the Manuals system ships Developer X encounters it and, being fresh full of ideas about design patterns, thinks it would be really useful if Chapters could be composites. That is, he'd like Chapters to be able to contain Sentences and Chapters. He thinks the system would be so much more expressive that way. And it would enable reuse at the Chapter level. Who doesn't love wringing more expressivity and reusability out of a system?

Fast forward 2 years. Half a dozen developers have rotated in and out of the project each with varying levels of expertise, varying exposure to the original designs and different constituents to satisfy. None of the original developers is still working on it. The design docs, which were originally shared on the shiny, linky (e.g., non-duplicative) intranet were lost long ago somewhere between conversions to Intranet v2 then v3 and now v4 (the current version). Even if they could be found they'd be horribly out of date as the codebase is 10 times its original size. You're tasked with what should be a simple fix: change the wording of a boilerplate sentence that appears in nearly every Manual produced since the project originally shipped.

Being new to the project what do you do? You look at the manuals you can find. You see that each Manual appears to be made up of Chapters and each Chapter appears to be made up of Sentences. So you start your search in the Chapter directory and change every instance of the boilerplate sentence in each of the Chapters.

In a system that maintains the downward flow of dependencies (Manuals -> Chapters -> Sentences) there's a very good chance that this particular bug has been quashed: the boilerplate sentence is updated everywhere it will appear.

But what if Chapters were Composites in the system: they contained both Sentences and Chapters? Although originally all the Chapters were in a single directory, someone once accidentally chose a Chapter name that was already in use (without knowing it) so there was a major refactoring project that moved Chapters contained by other Chapters into other directories. A few months after that refactoring there was a big push to support the creation of chapters by another team. The Manual team just didn't have the bandwidth to handle the demand for Manuals. But the other team stored their artifacts in a different location. So to keep them productive the Manuals team added support for chapter references - pointers to chapters defined in other locations.

Now the job of changing a single boilerplate sentence everywhere it appears has gone from being a grep/findstr in a single directory to a programming task in and of itself. One that crawls Chapter references and looks for boilerplate sentences in the pointed-to Chapters. And doesn't crash on stale references. And doesn't infinitely loop when circular Chapter references are found. All because the downward flow of dependencies was broken when the elegant composite pattern was introduced into the system.