Accidental or Essential? Understanding Complexity in Software Design
I've been guilty of this myself: walking into a new codebase, seeing some gnarly architecture, and immediately thinking "this is way over-engineered." It's a natural reaction– when you don't understand why something is complex, it's easy to assume it shouldn't be. But I've learned (sometimes the hard way) that not all complexity is accidental or unnecessary. Some of it is domain-driven, arising from real-world requirements and constraints that aren't immediately obvious to someone new to the system.
The challenge isn't just technical; it's learning to distinguish between complexity that serves a purpose and complexity that's just cruft. This requires understanding not just the code, but the organizational and business context that shaped it.
Essential vs. Accidental Complexity
Fred Brooks made this distinction famous: essential complexity comes from the problem domain itself, while accidental complexity comes from our tools and implementation choices[1]. Essential complexity is "caused by the problem to be solved, and nothing can remove it." If your banking system needs to handle 30 different regulatory requirements, that complexity has to live somewhere in your software.
Accidental complexity is the stuff we pile on top. Bad abstractions, overly clever code, or just choosing the wrong tools for the job. Think about the difference between writing assembly and writing a high-level language. Moving away from assembly eliminates tons of accidental complexity (manual memory management, register allocation) so you can focus on the actual problem you're trying to solve.
The key insight is that essential (domain-driven) complexity is irreducible– you can only redistribute it– whereas accidental complexity is theoretically avoidable. Complexity in systems tends to follow what Larry Wall called the "waterbed theory": if you push complexity out of one part of a system, it will pop up somewhere else[2]. You can't magically eliminate the intrinsic complexity of a hard problem; you can only shift it around.
But here's the thing: in practice, telling the difference is harder than it sounds. What looks like needless over-engineering to me might actually be solving a critical business problem I don't know about. I've seen this play out countless times– someone new to a team looks at a complex piece of code and assumes it's just bad design, when actually it's handling some edge case that cost the company real money to learn about.
Before I write off something as over-engineered, I try to ask: what problem might this be solving? Maybe that weird caching layer is there because the database can't handle the load. Maybe those extra validation steps exist because of some regulatory requirement. Maybe that convoluted state machine is modeling a genuinely complex business process. You won't know until you dig in.
The "Accidental Complexity" Trap: Don't Dismiss What You Don't Yet Grasp
I see this pattern all the time: a new engineer joins the team, looks at the codebase, and immediately starts talking about how they could rewrite everything in half the lines of code. "Why do we need all these layers? This is way over-engineered!" (It's me, this engineer is me.)
Sometimes they're right. But more often, they're missing crucial context. That "unnecessary" abstraction might be preventing a class of bugs that used to bite the team regularly. Those extra layers might be there because the system needs to handle edge cases that aren't obvious from reading the happy path code.
Here's something I wish someone had told me earlier in my career: the urge to immediately "fix" unfamiliar code is often more about ego than engineering. When I see something I don't understand, my first instinct used to be to assume it was wrong and try to simplify it. But I've learned that the best engineers approach complexity with curiosity first. Instead of asking "How can I make this simpler?" they ask "What problem was this solving?"
Learning from Real Examples
Let me give you a concrete example. Say you're looking at a system that uses a message queue between two services, when a simple HTTP call would seem to do the job. Your first reaction might be "Why all this infrastructure? Just call the service directly!"
But maybe that queue is there because the downstream service used to go down and take the whole system with it. Maybe it's handling traffic spikes that would otherwise overwhelm the backend. Maybe there's a business requirement that certain operations must eventually complete, even if parts of the system are temporarily unavailable. Remove that queue in the name of "simplicity" and you might find out the hard way why it was there in the first place.
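The buffering argument can be sketched in a few lines. This is a toy illustration, not real infrastructure– the in-process `queue.Queue`, the `call_downstream` stub, and the invoice payloads are all invented; a production system would use a durable broker:

```python
import queue

jobs = queue.Queue()

def call_downstream(payload, healthy):
    # Stand-in for an HTTP call to a service that sometimes goes down.
    if not healthy:
        raise ConnectionError("downstream unavailable")
    return f"processed {payload}"

def enqueue(payload):
    # Producers return immediately; they never block on the downstream.
    jobs.put(payload)

def drain(healthy):
    # A worker drains the queue, re-enqueueing anything the downstream rejects.
    done, pending = [], []
    while not jobs.empty():
        payload = jobs.get()
        try:
            done.append(call_downstream(payload, healthy))
        except ConnectionError:
            pending.append(payload)   # survives the outage, retried next drain
    for p in pending:
        jobs.put(p)
    return done

enqueue("invoice-1")
enqueue("invoice-2")
assert drain(healthy=False) == []       # outage: nothing done, nothing lost
assert len(drain(healthy=True)) == 2    # recovery: the work still completes
```

The extra moving part buys a specific guarantee: producers never observe downstream outages, and accepted work eventually completes.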
To ground this in a concrete example: at Mercury we originally handled background jobs with a mixture of Postgres tables, cron timers, and a thin internal queue library. It felt simple– no extra moving parts, everything ran in three EC2 instances, and every engineer could drop a row (or a few thousand) in the jobs table whenever they needed something asynchronous to happen.
This worked great... until it didn't. As we grew, the cracks started to show. Anyone could touch the shared job tables, so ownership got messy. Different teams implemented retries in subtly different ways. When something broke, debugging meant digging through a maze of ad-hoc scripts and cron jobs that nobody fully understood.
The solution looked like more complexity: we adopted Temporal and I ended up writing the Haskell SDK for it. Suddenly we had dedicated services, new concepts like workflows and task queues, and another system to operate. But here's the thing– that additional structure directly solved the problems we'd been papering over. Exactly-once execution, durable timers, clear ownership boundaries. What used to be scattered retry logic became explicit, type-checked workflow definitions.
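The point isn't Temporal specifically. Even a minimal explicit retry policy beats retry logic scattered across ad-hoc scripts. Here's a generic sketch– the names `RetryPolicy` and `run_with_retries` are invented for illustration and are not any SDK's actual API:

```python
import time

class RetryPolicy:
    def __init__(self, max_attempts=3, base_delay=0.0):
        self.max_attempts = max_attempts
        self.base_delay = base_delay

def run_with_retries(fn, policy):
    # One shared, explicit policy instead of each team reinventing retries.
    last_err = None
    for attempt in range(policy.max_attempts):
        try:
            return fn()
        except Exception as err:
            last_err = err
            time.sleep(policy.base_delay * (2 ** attempt))  # exponential backoff
    raise last_err

calls = {"n": 0}
def flaky():
    # Succeeds on the third attempt, like a briefly-unavailable dependency.
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("transient failure")
    return "ok"

assert run_with_retries(flaky, RetryPolicy(max_attempts=3)) == "ok"
assert calls["n"] == 3
```

Once the policy is a value you pass around, it can be reviewed, tested, and changed in one place– which is most of what the "subtly different retries" problem was costing us.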
A year later, engineers regularly tell me they're glad we made the switch. The operational experience is so much better, even though the system looks more complex on paper.
The lesson? Sometimes what looks like simplicity is actually just pushing complexity around. When multiple teams are stepping on each other in the same "simple" codebase, that coordination overhead can be way more expensive than a bit of extra infrastructure.
This insight aligns with research showing that high-performing teams don't just write better code– they design systems that minimize coordination costs[3]. When teams can work independently without constantly negotiating changes or waiting on others, they deliver faster and more reliably.
Domain-Driven Design and Perceived Complexity
I see this same pattern with Domain-Driven Design. Someone looks at a DDD codebase with its aggregates, domain events, and bounded contexts and thinks "why so many layers? This could just be a simple CRUD app!"
But those patterns might be the only thing standing between you and a complete mess. In complex domains like finance or healthcare, a "simple" anemic model often turns into spaghetti code– business rules scattered everywhere, teams stepping on each other's data, and nobody quite sure what the system is supposed to do when edge cases arise.
The DDD abstractions are carrying domain complexity in a structured way. As one discussion on software complexity noted, "The domain complexity is something you can't remove, but complexity you've chosen to introduce through tooling should be carefully thought about"[4]. In other words, focus on solving the domain problem, and be wary of any complexity that doesn't serve that end.
Building the Right Habits
When you see something that looks unnecessarily complex, start with curiosity instead of criticism. Ask the people who built it what problem they were solving. Dig into the commit history. You might discover that what looks like over-engineering is actually solving a real problem you didn't know about.
And when you're the one adding complexity, be ready to explain why. If someone asks "Do we really need this extra service / library / abstraction?" you should have a good answer. Something like "Yes, because it prevents X problem that we've been hit by before" or "Yes, because without it we can't handle Y requirement." If you can't explain why something is there, that's a pretty good sign it might not need to be.
The Organizational Dimension: How Teams Shape Systems
High-performing teams don't just manage technical complexity well– they actively design systems to minimize coordination overhead. The most successful organizations structure their software to match how teams actually work, creating clear boundaries that let teams operate independently.
Many seemingly technical problems are actually organizational problems in disguise– and sustainable solutions require addressing both dimensions simultaneously[5]. When we talk about complexity in software systems, we're really talking about complexity in sociotechnical systems where code, teams, and organizational structures all interact.
Bounded Contexts as Coordination Boundaries
This isn't just about microservices versus monoliths. It's about drawing the right boundaries around domains of knowledge and responsibility. When a team needs deep understanding of multiple non-overlapping domains to make changes, that's a form of complexity that often goes unacknowledged. Consider a team that needs to understand both payment processing and inventory management to implement a new feature. Even if the code is clean and well-structured, the cognitive load of mastering two distinct domains creates its own kind of complexity.
The solution is to establish clear bounded contexts– not just in the code, but in how teams are organized and how systems interact. A bounded context creates a natural boundary where one team's deep expertise ends and another's begins. Within their context, teams can make changes without needing to coordinate with others or understand the intricacies of other domains.
For example, in a financial system, you might have separate bounded contexts for:
- Payment processing (handling transactions, fraud detection)
- Account management (user profiles, authentication)
- Reporting (analytics, compliance)
Each of these domains has its own language, its own invariants, and its own failure modes. By drawing clear boundaries between them, you allow teams to develop deep expertise in their domain without being overwhelmed by the complexity of others.
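A sketch of what a "narrow boundary" looks like in code– the contexts and operations here are hypothetical, chosen only to show that Reporting consumes an explicit event shape and never reaches into Payments' internals:

```python
class Payments:
    """Owns transaction state; other contexts never touch it directly."""
    def __init__(self):
        self._ledger = []  # private to this context

    def charge(self, account_id, amount_cents):
        self._ledger.append((account_id, amount_cents))
        # The event dict is the public contract between contexts.
        return {"account_id": account_id, "amount_cents": amount_cents}

class Reporting:
    """Consumes events from Payments; knows nothing about its internals."""
    def __init__(self):
        self.totals = {}

    def record(self, event):
        acct = event["account_id"]
        self.totals[acct] = self.totals.get(acct, 0) + event["amount_cents"]

payments = Payments()
reporting = Reporting()
reporting.record(payments.charge("acct-1", 500))
reporting.record(payments.charge("acct-1", 250))
assert reporting.totals["acct-1"] == 750
```

The Payments team can rewrite `_ledger` however it likes; as long as the event contract holds, Reporting never needs to know.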
The Benefits of Clear Boundaries
Teams working within clear bounded contexts:
- Make changes faster because they don't need to coordinate across domain boundaries
- Have fewer production incidents because they understand their domain deeply
- Can onboard new team members more quickly because the scope of knowledge is contained
- Are more likely to maintain high code quality because the system's behavior is more predictable
These benefits aren't just nice-to-haves– they're essential for high performance. Teams that can work independently, without constant coordination, deliver faster and more reliably.
Organizational complexity follows predictable patterns as teams grow, and system architecture must evolve to match these patterns[6]. A key insight is that team size and structure directly influence the complexity of the systems they can effectively maintain. A team of 4 engineers can successfully manage a different level of system complexity than a team of 40, and the architecture should reflect this reality rather than fighting against it.
Treating Organizational Constraints as Design Constraints
Organizational constraints should be treated as design constraints– not obstacles to work around, but fundamental parameters that shape what solutions are viable[7]. This perspective radically changes how we evaluate complexity in software systems.
Consider a common scenario: a team inherits a system with what appears to be needlessly complex service boundaries. The immediate reaction might be to consolidate services to reduce operational overhead. But a systems thinking approach suggests asking different questions first: What organizational structure was this system designed to support? What coordination patterns was it optimizing for? What happens to team autonomy and delivery speed if we change these boundaries?
Several organizational patterns directly influence system complexity:
Team Topologies and System Architecture: The structure of your teams will inevitably be reflected in your system architecture (Conway's Law). Rather than fighting this, design both your teams and your systems to reinforce each other. If you have three teams working on a product, you'll likely end up with three major system components– and that's not necessarily bad if it matches the domain boundaries and enables team autonomy.
Coordination Costs vs. Autonomy: Every system design involves trade-offs between coordination costs and team autonomy. A monolithic architecture minimizes coordination between services but maximizes coordination between teams working on the same codebase. Microservices do the opposite. The "right" choice depends on your organizational context, not just your technical requirements.
Information Flow and Decision Rights: Complex systems often emerge when information flow and decision-making authority don't align with system boundaries. If Team A needs to understand Team B's internal implementation details to make changes, that's a sign that either the system boundaries or the organizational boundaries need adjustment.
By treating organizational constraints as first-class design constraints, we can make more informed decisions about complexity. Sometimes the "right" technical solution is the wrong organizational solution, and vice versa. As you grow as an engineer, the art is finding designs that work well for both.
The Migration Trap: When Transitions Become Permanent
Here's a complexity trap I've seen kill more projects than almost anything else: the incomplete migration. You start moving from system A to system B, get halfway there, and then... stop. Now you have all the complexity of both systems plus the glue code holding them together.
I've seen this play out so many times. A team decides to break up their monolith, extracts a few services, builds all the infrastructure to sync data between the old and new systems. The first couple services work great! But then other priorities come up, the migration gets deprioritized, and you're stuck in this weird hybrid state where 60% of your logic is still in the monolith.
Now you have the worst of both worlds:
- All the operational overhead of several services (multiple deployments, network calls, distributed debugging)
- All the coupling problems of the monolith (shared database, tangled dependencies)
- Plus all the complexity of keeping the two systems in sync
- And developers who need to understand both architectures to get anything done
This zombie state can persist for years because it's "working"– you can still ship features, even though the system is way more complex than either a clean monolith or clean microservices would be.
Why Migrations Stall and How to Complete Them
Why do migrations stall? I've seen the same patterns over and over:
Nobody wants to own the boring work: Migrations get treated as "technical debt cleanup" instead of essential business work. When feature deadlines loom, guess what gets cut?
The 80/20 problem: The first 80% of a migration is usually the easy stuff– clean, well-understood code. The last 20% is all the weird edge cases and legacy integrations that nobody has time to touch.
Good enough syndrome: Once the migration is partially done and things are working "well enough," it's hard to justify spending more time on it. The system is better than it was before, so why keep going?
Successful migration completion requires treating it as a first-class engineering and organizational priority:
- Executive Sponsorship: Migrations need sustained organizational commitment from leadership
- Dedicated Resources: Assign dedicated engineers to migration work rather than expecting it to happen in spare time
- Clear Metrics and Timelines: Track migration progress with concrete metrics and set realistic but firm deadlines
- Sunset Planning: Plan the decommissioning of old systems from the beginning, with explicit dates for when old systems will be turned off
Completing migrations is fundamentally about complexity management. Every incomplete migration represents a choice to accept ongoing complexity in exchange for avoiding short-term effort. But this trade-off rarely makes sense in the long term. Sustainable engineering organizations must be willing to invest in foundational work that doesn't directly deliver features but enables future productivity[8].
When Complexity is Just Bad Design
While we've discussed how complexity can be justified by domain needs or organizational structure, it's crucial to recognize when complexity is simply the result of poor design. Before labeling something as "bad complexity," gather objective data. As Michael Feathers emphasizes in "Working Effectively with Legacy Code," the first step is to understand what the code actually does and how it behaves in production.
Here are telltale signs that complexity is unjustified, along with how to measure and address each:
Inconsistent Patterns: When similar problems are solved in different ways throughout the codebase, it's often a sign of accidental complexity. If you have three different approaches to caching, or error handling follows different patterns in different modules, that's probably bad design rather than essential complexity.
Unclear Ownership: If no one can confidently say who owns a piece of code or who should make changes to it, that's a red flag. While bounded contexts can be complex, they should have clear ownership and responsibility.
Tight Coupling Without Justification: When components are tightly coupled but there's no clear domain reason for that coupling, it's likely poor design. If your user authentication system needs to know about the details of your payment processing, that's probably unnecessary complexity.
Complexity Without Benefits: If a system is complex but doesn't provide any corresponding benefits (better performance, clearer domain modeling, reduced coordination overhead), it's probably over-engineered. Every piece of complexity should have a clear purpose.
Historical Accidents: If the current design is the result of historical accidents or "temporary" solutions that became permanent, that's usually bad complexity. For example, if you're using a message queue because you once needed it for a specific feature but now it's just adding overhead, that's probably unjustified complexity.
The key question to ask is: "What would we lose if we simplified this?" If the answer is "nothing important" or "we'd actually gain clarity and maintainability," then the complexity is probably unjustified. On the other hand, if simplifying would mean losing important domain guarantees, breaking team boundaries, or sacrificing necessary performance characteristics, then the complexity might be essential.
Remember that good design often looks simple in retrospect. The best architectures make complex problems seem simple, not the other way around.
Tools for Assessing Complexity
How can developers assess whether a proposed complexity is worth it? Here are five complementary techniques: using a novelty budget, monitoring cognitive overhead, doing a cost/benefit analysis, considering long-term maintenance trade-offs, and applying systems thinking to organizational constraints.
The Novelty Budget: Spend Your Complexity Wisely
Every new technology or pattern you introduce into a project carries a cost. There's the immediate cost of learning it, integrating it, and the ongoing cost of debugging and maintaining it.
One concept that has proven valuable is the idea of a "novelty budget." The idea, as my very excellent friend Mark Wotton puts it, is to decide up front how much novelty (new tech or experimental approaches) your project can handle, and "apportion it out in the ways that give the biggest bang-for-buck"[9].
If you spend your entire budget on shiny new frameworks and tools, you risk overwhelming the project with unknowns. Instead, spend your novelty budget sparingly: pick one or two areas to innovate, and keep the rest of your stack as boring and battle-tested as possible. This approach reduces the risk that you'll get stuck fighting on too many fronts.
Dan McKinley framed this as "innovation tokens"– each team has a limited number to spend, say three, and every time you choose cutting-edge or unfamiliar tech, you spend one[10]. Want to write your backend in a brand-new language? That's a token. Use an experimental NoSQL database? Another token. Roll out your own homebrew analytics system? That's probably all your tokens gone.
The supply of tokens is not unlimited, especially for a young team or project. So spend them where it counts. If you're a payments company, maybe innovating in payment tech is worth it, but using the coolest UI framework probably isn't.
This philosophy helps avoid accidental complexity that stems from the lure of the new and shiny. Boring technology is underrated; as McKinley points out, boring often means well understood– not only its capabilities but its failure modes are known quantities. That predictability is valuable when you're trying to reduce accidental complexity.
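The token-counting discipline is simple enough to express as a toy ledger. Everything here is illustrative after McKinley's framing– the budget of three and the example choices are hypothetical, and no one actually runs this as code:

```python
class NoveltyBudget:
    """Toy ledger for 'innovation tokens': each unfamiliar tech costs one."""
    def __init__(self, tokens=3):
        self.tokens = tokens
        self.spent_on = []

    def spend(self, choice):
        if self.tokens == 0:
            raise RuntimeError(f"no novelty tokens left for: {choice}")
        self.tokens -= 1
        self.spent_on.append(choice)

budget = NoveltyBudget()
budget.spend("new backend language")
budget.spend("experimental NoSQL store")
budget.spend("homebrew analytics")
assert budget.tokens == 0

try:
    budget.spend("cutting-edge UI framework")  # fourth novelty: over budget
except RuntimeError:
    pass  # the point: the fourth bet forces an explicit conversation
```

The value isn't the arithmetic; it's that spending a token is a visible, discussable act rather than a quiet default.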
Mind the Cognitive Overhead
A critical aspect of system complexity is cognitive overhead: how much mental effort is required for a developer to understand, work on, and operate the system. You can think of it as the number of different "things" you need to keep in your head at once to make a change.
Modern software organization thinking emphasizes designing systems with team cognitive load in mind. Cognitive load can be categorized into three types[11]:
- Intrinsic load (inherent to the work, e.g. basic computer science concepts, language syntax)
- Germane load (related to the domain and problem at hand, essentially the essential complexity we want the team to focus on)
- Extraneous load (the overhead that isn't directly adding value, often corresponding to accidental complexity)
Our aim should be to minimize extraneous cognitive load so that developers can invest brainpower in the germane (domain-driven) complexities. If adding a fancy caching layer speeds up the app but forces every new contributor to spend days deciphering cache invalidation logic, that's a hefty cognitive tax.
One way to gauge cognitive overhead is to consider how many distinct concepts or components a developer must touch to implement a typical feature. If each requires pivoting between multiple complex mental contexts, the total load can become crushing.
For example, many teams have jumped on the microservices bandwagon believing smaller codebases automatically mean simpler systems. Then they discover that while each service is simple, the system as a whole is more complex and difficult to reason about. The complexity didn't vanish; it moved to the "space between" services– the network, the orchestrations, the deployments.
Martin Fowler warns that microservices come with a significant "microservice premium"– overhead that only pays off for systems complex enough to need independent scaling and deployments[12]. For a small application, that premium is just accidental complexity.
The best systems often have a kind of visual clarity: a new developer can draw a mental map with a few boxes and arrows and basically understand what talks to what. Aim for the simplest architecture that adequately addresses the problem. Simplicity isn't just about fewer lines of code– it's about how easily a human mind can build a correct model of the system's behavior.
Cost/Benefit Analysis: Is the Complexity Paying Its Way?
Weighing costs against benefits is second nature in many parts of business, but sometimes we engineers forget to apply it to technical choices. A disciplined cost/benefit analysis for complexity asks: What specific benefit are we getting by adding this piece of complexity, and is it worth the cost?
The Extreme Programming mantra YAGNI ("You Ain't Gonna Need It") advises us not to implement features or components until they are truly necessary. If the benefit is purely speculative ("we might need this when we have 10× users"), then maybe you ain't gonna need it... at least not now. If the benefit addresses a clear and present issue, then it's more justified.
Consider introducing a second, specialized database (say, a dedicated search engine) alongside your existing one. The benefit might be faster queries for a specific feature, or easier horizontal scaling for that subset of data. The costs include additional operational burden, data duplication or synchronization, more complex backup/restore processes, and developers needing expertise in both systems.
For a small app with modest traffic, a well-chosen index in your existing SQL database might achieve acceptable performance without a whole new system. But for a large app where search functionality is central, the benefits of a purpose-built search engine clearly outweigh the costs.
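The "try an index first" option is cheap to check empirically before committing to new infrastructure. A minimal sketch with SQLite (the `users` table and its columns are made up for illustration; the same experiment works in any SQL database via its query-plan facility):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [(f"user{i}@example.com",) for i in range(1000)])

def plan(sql):
    # EXPLAIN QUERY PLAN rows end with a human-readable detail column.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT id FROM users WHERE email = 'user500@example.com'"
assert "SCAN" in plan(query)              # full table scan without an index

conn.execute("CREATE INDEX idx_users_email ON users (email)")
assert "idx_users_email" in plan(query)   # the same query now hits the index
```

If the plan (and your latency numbers) improve enough with an index, you've avoided a whole second system; if they don't, you now have evidence for the more complex option.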
Treat complexity as an investment– demand a worthwhile return for every unit of complexity you add. If the return isn't there, don't spend it.
Long-Term Maintenance Trade-offs
When assessing complexity, consider the long-term maintenance trade-offs. Choices made today will echo in maintenance costs (or savings) for years to come. This is where "pay me now or pay me later" often comes up.
You can choose to keep a system simple today (low complexity) at the cost of potentially paying more in rework later, or you can invest more effort now (higher upfront complexity) to save yourself pain down the road. Both approaches can be valid, depending on context, but the decision should be conscious.
A classic example is technical debt. If you rush out an implementation with copy-pasted code everywhere and minimal tests, you avoided complexity in the short term. But as the codebase grows, the lack of abstraction leads to inconsistencies and harder changes, and the lack of tests makes refactoring risky. What was "simple" initially becomes very complex to deal with later.
One helpful approach is to regularly ask: "What happens if we don't do this now? What happens if we do?" Consider the worst-case scenario for a piece of complexity. If you add an external service and its vendor goes out of business, what's your fallback? If you build a highly complex in-house framework and the lone expert leaves, can the rest of the team pick it up?
Long-term trade-offs demand that we project our decisions into the future: How hard will this be to change later? Are we optimizing for now at the expense of later, or vice versa, and is that intentional?
Systems Thinking: The Organizational Lens
The final tool for assessing complexity is systems thinking: the ability to understand how technical decisions interact with organizational realities. This perspective asks not just "Is this technically sound?" but "How does this fit into the broader sociotechnical system?"
When evaluating complexity through a systems lens, consider:
Team Capacity and Growth: How does this complexity align with your team's current capabilities and expected growth? A solution that works perfectly for a team of 5 might become unmanageable for a team of 15.
Communication Patterns: Does this design support or hinder the communication patterns your organization needs? If your solution requires frequent coordination between teams that rarely interact, you're creating organizational friction.
Decision-Making Authority: Who has the authority to make changes to this system, and how does that align with who needs to make changes? Complex systems often emerge when decision rights and information don't flow to the same places.
Failure Modes: How does this complexity fail, and who bears the cost of those failures? A technically simple solution that creates organizational chaos during outages might be more expensive than a technically complex solution that fails gracefully.
The key insight from systems thinking is that sustainable complexity serves both technical and organizational needs[13]. When evaluating complexity, ask not just whether it solves the immediate technical problem, but whether it supports the kind of organization you're trying to build.
Navigating the Social Dynamics: When Someone Claims Your Solution is "Accidental Complexity"
One of the most challenging aspects of complexity management isn't technical– it's social. You've carefully designed a solution that addresses real domain needs, organizational constraints, and long-term maintainability. Then a colleague looks at your work and declares it "accidental complexity" that should be simplified or eliminated.
This scenario is particularly common when a new team member joins and sees unfamiliar patterns, or when someone from another team reviews your architecture. The temptation is to get defensive or dismissive, but defensive reactions rarely lead to good outcomes.
A Framework for Constructive Dialogue
Step 1: Listen and Understand Their Perspective
Before defending your solution, genuinely try to understand their viewpoint:
- "What specific aspects seem unnecessarily complex to you?"
- "What simpler approach do you have in mind?"
- "What's your experience with similar problems in other contexts?"
Often, their concerns reveal legitimate issues you hadn't considered, or they highlight areas where your solution could be better documented.
Step 2: Acknowledge Valid Points
If they've identified genuine issues, acknowledge them explicitly:
- "You're right that this configuration system is hard to understand. We should improve the documentation."
- "That's a fair point about the cognitive overhead. Let me walk through why we made this trade-off."
This demonstrates that you're open to feedback and focused on finding the best solution rather than defending your ego.
Step 3: Provide Context, Not Justification
The key difference between providing context and being defensive is the tone and intent. Context-sharing is educational and collaborative; defensiveness is protective and adversarial.
Instead of: "You don't understand the requirements. This complexity is necessary because of X, Y, and Z."
Try: "Let me share some context about the constraints we were working with. We discovered that X was a hard requirement because of [specific business need]. Y became necessary when we hit [specific scaling issue]."
Frame it as sharing information that will help them understand the problem space, not as proving them wrong.
Step 4: Explore Alternatives Together
Once you've shared context, invite them to help find better solutions:
- "Given these constraints, what approach would you suggest?"
- "Are there ways we could achieve the same guarantees with less complexity?"
This shifts the conversation from criticism to collaborative problem-solving.
Embrace an Experimental Mindset
One of the most effective ways to move past theoretical debates about complexity is to propose well-scoped experiments. Instead of arguing about whether approach A or B is better in the abstract, suggest trying one approach in a limited context to gather real data.
Frame experiments as learning opportunities, not commitments:
- "What if we try your approach for this one service and see how it works out?"
- "Let's implement both approaches for this feature and compare the maintenance overhead after a month."
Key principles for effective complexity experiments:
- Start small and contained: Choose a bounded context where the impact is limited
- Define clear success criteria upfront: What would make this experiment successful?
- Set a time boundary: "Let's try this for two sprints and then evaluate"
- Plan the rollback: Before starting, agree on how you'll revert if the experiment doesn't work out
This experimental approach transforms complexity discussions from adversarial debates into collaborative investigations. Instead of "your solution is too complex," the conversation becomes "let's figure out the right level of complexity for this problem together."
Senior engineers understand that the goal is the best outcome for the system, not protecting their personal contributions. This means being willing to acknowledge when your solution could be improved, even if it means admitting that your initial approach wasn't optimal. The ability to separate your professional worth from any particular technical decision is a hallmark of engineering maturity.
Conclusion
Look, complexity in software is unavoidable. Our job isn't to eliminate it (good luck with that), but to manage it thoughtfully. The key is learning to tell the difference between complexity that's solving real problems and complexity that's just making our lives harder.
Next time you (or a teammate) start complaining that something is "way too complicated," take a step back. Ask why it's there. Maybe you'll find out it's genuinely over-engineered and can be simplified. But maybe you'll discover it's handling some gnarly business requirement or preventing a class of failures you didn't know about. Either way, you'll make a better decision than if you just assumed it was bad design.
The best engineers I know have learned to separate their ego from their code. Your value isn't in how clever your abstractions are– it's in your ability to make good trade-offs that help the system (and the team) in the long run. Sometimes that means defending necessary complexity, sometimes it means ruthlessly cutting the cruft. Either way, it's not about protecting your personal contributions.
Using tools like a novelty budget keeps our penchant for new tech in check. Keeping an eye on cognitive load ensures we don't overload our fellow developers with too many moving parts. A sober cost/benefit analysis forces justification of every abstraction or component in terms of real value. Thinking about maintenance trade-offs reminds us that software is a marathon, not a sprint. And applying systems thinking helps us understand how technical decisions interact with organizational realities, ensuring our solutions work for both the code and the people who maintain it.
The bottom line: don't be afraid of complexity, but make sure you understand why it's there. Cut the stuff that isn't earning its keep, but don't shy away from complexity when the problem actually demands it. Good design isn't about having no complexity; it's about having the right complexity in the right places.
So next time you're looking at some code architecture that induces a sense of eldritch horror in your innermost being, resist the urge to immediately label it "over-engineered." Ask the hard questions first: Why is this here? What would break if we removed it? What problem is it actually solving?
Develop that intuition for when complexity is justified versus when it's just cruft, and you'll make way better decisions than if you just follow some blanket rule about "keeping things simple."
Because in the end, complexity isn't the enemy; unjustified complexity is. Learn to tell the difference, and your systems will be both simpler and more robust.
Footnotes
1. Fred Brooks, "No Silver Bullet – Essence and Accident in Software Engineering," 1986. Brooks defines essential complexity (inherent to the problem domain) versus accidental complexity (created by the software implementation or tools). Essential complexity is caused by the problem to be solved, and nothing can remove it. ↩
2. Larry Tesler's Law of Conservation of Complexity (also echoed by Larry Wall's "Waterbed Theory of Complexity"). Tesler's Law states that every system has an inherent amount of complexity that cannot be removed, only shifted around. If you make one part of a system simpler, some other part might have to absorb additional complexity. ↩
3. Nicole Forsgren, Jez Humble, and Gene Kim, "Accelerate: The Science of Lean Software and DevOps," 2018. Their research demonstrates that high-performing teams design systems to minimize coordination overhead, enabling independent work and faster delivery. ↩
4. Comment by user kthejoker2 on Hacker News, discussing essential vs. accidental complexity: "The domain complexity is something you can't remove, but complexity you've chosen to introduce through tooling should be hard thought about." ↩
5. Will Larson, "An Elegant Puzzle: Systems of Engineering Management," 2019. Larson's central thesis is that many seemingly technical problems are actually organizational problems in disguise, requiring solutions that address both technical and organizational constraints. ↩
6. Larson describes how team size and structure directly influence the complexity of systems they can effectively maintain. Architecture should reflect organizational reality rather than fighting against it. ↩
7. Larson argues that organizational constraints should be treated as design constraints: fundamental parameters that shape what solutions are viable, rather than obstacles to work around. ↩
8. Larson emphasizes that sustainable engineering organizations must be willing to invest in foundational work that doesn't directly deliver features but enables future productivity. ↩
9. Mark Wotton, "You Need a Novelty Budget," 2018. Introduces the idea of explicitly limiting how much new or unproven technology you adopt in a project: "you decide how much tolerance you have for library code you've never deployed before, and apportion it out." ↩
10. Dan McKinley, "Choose Boring Technology," 2015. McKinley introduces innovation tokens as a metaphor: "Let's say every company gets about three innovation tokens... If you choose to use MongoDB, you just spent one." Both sources advise spending your "complexity budget" on only a few high-impact innovations. ↩
11. Matthew Skelton & Manuel Pais, "Team Topologies," 2019. They emphasize designing software architectures to fit within a team's cognitive capacity, defining intrinsic load (basic required knowledge), germane load (domain-specific understanding), and extraneous load (unnecessary overhead). ↩
12. Martin Fowler, "Monolith First," 2015. Fowler notes that microservices come with a "Microservice Premium": overhead that only pays off for systems complex enough to need independent scaling and deployments: "Almost all successful microservice stories start with a monolith that got too big." ↩
13. Will Larson, "An Elegant Puzzle: Systems of Engineering Management," 2019. The key insight is that sustainable complexity serves both technical and organizational needs. When evaluating complexity, ask whether it supports the kind of organization you're trying to build. ↩