
Here's a thing that happened on a product we were building with a team last year.
Three engineers — each working with AI coding tools — shipped three features in parallel over two days. All three worked perfectly in isolation. They'd each written clean code, good tests, sensible architecture. Individually, each feature was exactly what we'd asked for.
When we merged them, the application broke in ways none of them could have predicted.
Not merge conflicts. Git handled the merge without complaints. The code compiled. The test suites passed. But the product was incoherent. Feature A had restructured the data model for a customer profile page. Feature B had built a notification system that referenced the old data model. Feature C introduced a settings panel with a UX pattern that directly contradicted the flow Feature A had established. Three well-built features that, together, created a product nobody would understand how to use.
This is the orchestration problem. And it's the one that's going to define the next era of software delivery.
For most of software's history, coordination happened implicitly. Developers worked on a shared codebase in rough sequence — one person picked up a feature, worked on it, committed it, and moved on. The next person's work started from a codebase that included the first person's changes. Context was shared through osmosis: overhearing conversations, seeing commits in the morning, picking up assumptions in standup. It wasn't formal. It didn't need to be, because the pace of work made informal coordination sufficient.
AI broke that model. Not because AI coordinates badly, but because it doesn't coordinate at all.
AI generates code within whatever context you give it. If that context is stale, incomplete, or siloed, the output will be technically correct and systemically wrong. And when you have multiple engineers working with AI simultaneously, each operating in their own context, you get parallel work that diverges fast.
This is a problem that distributed systems engineers recognized decades ago: when you parallelize, you need explicit coordination mechanisms. Eventual consistency, consensus protocols, conflict resolution strategies: these aren't academic concepts. They're the engineering response to a fundamental truth: parallel work without shared truth produces incoherent results.
The irony is sharp. The faster your team ships individual features, the more you need to invest in orchestration.
Speed without coordination doesn't produce a product. It produces a pile of features.
I've started classifying the conflicts I see on AI-assisted teams into three tiers, because not all conflicts are equal and they require different solutions.
Tier 1: Syntactic conflicts. These are your traditional merge conflicts: two people edited the same file, Git complains, someone resolves it. AI-assisted development produces more of these because more code is being written in parallel, but they're the easiest to handle. Tooling exists. Most teams manage these fine.
The syntactic tier is where most teams think the problem ends. It doesn't.
Tier 2: Semantic conflicts. Two features make incompatible assumptions about the system. The code merges cleanly and the tests pass, but the features don't work together correctly. Maybe they disagree about a data model. Maybe one feature assumes a user flow that the other feature changed. Maybe they handle an error case differently (one retries, one fails hard), and the interaction between them creates a behavior that neither developer intended.
Semantic conflicts are insidious because they're invisible to automated checks. Git doesn't know about your domain model. Your test suite tests each feature in isolation. The conflict only surfaces when a user encounters the intersection — or when a sharp-eyed engineer catches it in review, if the reviewer has enough context to see it.
On one product, we shipped a feature that looked perfect in testing but broke for about 15% of users. The cause: two AI-assisted features had been built simultaneously, and each made different assumptions about how a particular status field was used. Both were defensible interpretations. Neither was wrong individually. Together, they created a state that the application didn't know how to handle. We only found it because we were monitoring user behavior metrics after launch — not just error rates, but actual outcomes.
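A minimal sketch of how that kind of conflict can look in code. The `Account` type, the `"inactive"` value, and both features are hypothetical stand-ins, not the actual field or features from that product; the point is only that each side's isolated check passes while the combination fails quietly.

```python
from dataclasses import dataclass


@dataclass
class Account:
    email: str
    status: str  # hypothetical shared field: "active" or "inactive"


# Feature A (hypothetical): reads "inactive" as "the user closed the account",
# so it suppresses all outgoing email for those accounts.
def should_send_email(account: Account) -> bool:
    return account.status != "inactive"


# Feature B (hypothetical): writes "inactive" to mean "signed up but has not
# verified the address yet", and relies on a verification email going out.
def sign_up(email: str) -> Account:
    return Account(email=email, status="inactive")


def send_verification(account: Account, outbox: list) -> None:
    # Feature B routes through the shared email policy owned by Feature A.
    if should_send_email(account):
        outbox.append(("verify", account.email))


# Each feature passes its own isolated check.
closed = Account("old@example.com", "inactive")
assert should_send_email(closed) is False                 # Feature A's test
assert sign_up("new@example.com").status == "inactive"    # Feature B's test

# The intersection: a new user can never verify, because no email is sent.
outbox: list = []
send_verification(sign_up("new@example.com"), outbox)
print(outbox)  # []
```

Nothing raises and nothing logs an error here, which is why this class of problem tends to show up in outcome metrics rather than error rates.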
Tier 3: Strategic conflicts. This is the one nobody's talking about, and it's the most expensive. Two features optimize for different goals that were never made explicit. One feature optimizes for conversion: shorter flows, fewer clicks, faster to completion. Another feature optimizes for retention: more engagement, more depth, more surface area. Both are well-built. Both serve their stated purpose. Together, they pull the product in two directions. The user experience becomes incoherent not because anything is broken, but because the product is trying to be two things that don't fit together.
Strategic conflicts happen when intent isn't explicit. When two engineers (or two AI-assisted workflows) don't share a clear understanding of why they're building what they're building, they'll build things that pull in different directions. No amount of code review catches this, because each piece of code is correct within its own unstated assumptions. The conflict is in the assumptions, not the implementation.
The most effective pattern I've found for preventing the kinds of conflicts I just described is something I think of as contracts in the codebase. Not legal contracts — structural agreements about interfaces, data models, behavioral expectations, and boundaries.
This isn't a new concept. API contracts, type systems, and interface definitions have existed for as long as software has. What's new is that they become mandatory when parallel work is the default mode of operation.
Here's what this looks like in practice. Before two engineers (or two AI-assisted workflows) work on related parts of the system simultaneously, they spend time — thirty minutes, maybe an hour — defining the boundaries. What data flows between their features? What can each side assume about the other? What happens at the edges? What are the invariants that must hold no matter what either side does?
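One way the result of that conversation might be written down, as a sketch rather than a prescription. The `CustomerProfile` type, its fields, and the two invariants are illustrative assumptions; a real contract would carry whatever the two sides actually agreed on.

```python
from dataclasses import dataclass


# Hypothetical contract shared by a profile feature and a notification feature.
# Field names and rules are illustrative, not taken from a real system.
@dataclass(frozen=True)
class CustomerProfile:
    customer_id: str
    display_name: str
    notification_email: str

    def __post_init__(self) -> None:
        # Invariants both sides agreed must hold no matter who builds the object.
        if not self.customer_id:
            raise ValueError("customer_id must be non-empty")
        if "@" not in self.notification_email:
            raise ValueError("notification_email must look like an email address")


def build_profile(raw: dict) -> CustomerProfile:
    """Producer side: maps its own internal shape onto the contract."""
    return CustomerProfile(
        customer_id=raw["id"],
        display_name=f'{raw["first"]} {raw["last"]}',
        notification_email=raw["contact"]["email"],
    )


def notification_target(profile: CustomerProfile) -> str:
    """Consumer side: depends only on the contract, never on the raw shape."""
    return profile.notification_email


raw = {
    "id": "c-1",
    "first": "Ada",
    "last": "Lovelace",
    "contact": {"email": "ada@example.com"},
}
profile = build_profile(raw)
print(notification_target(profile))  # ada@example.com
```

The producer is free to restructure `raw` as long as `build_profile` still yields a valid `CustomerProfile`, and the consumer breaks loudly at the boundary instead of silently downstream.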
We've started requiring every piece of work to begin with a clear statement of intent: what we're building, why, what success looks like, and what's out of scope. The first time I introduced this on a team, the reaction was predictable: "This feels like overhead. Why can't we just start building?" By the end of the second week, the same engineers were saying "I can't imagine starting without this." Because they'd experienced the alternative: three days of rework because two features made incompatible assumptions that nobody caught until integration.
The analogy to microservices architecture is exact. The industry spent years learning, painfully, that you can't have independent services without explicit contracts between them. A service that makes assumptions about another service's internal state will eventually break when that state changes. The solution — clearly defined APIs, versioned contracts, explicit schemas — is now standard practice. But somehow, when it comes to parallel development within a team, we've been relying on vibes.
It's the same lesson. Parallel work without explicit contracts produces the same class of failures whether you're building microservices or features. The only question is whether you learn the lesson before or after it costs you a week of rework.
So if contracts prevent conflicts, who maintains the contracts? Who makes sure intent is clear, interfaces are defined, and the overall system stays coherent as multiple streams of work advance simultaneously?
A new role is emerging on the best teams I work with. It's not a project manager — project managers track timelines and tasks. It's not a scrum master — that role is dissolving for reasons I covered last week. It's closer to a delivery lead who owns three things: intent, outcomes, and learning.
They make sure every piece of work starts with a clear definition of what success looks like. They're the person who catches strategic conflicts before they happen: "Wait, Feature A is optimizing for conversion and Feature B is optimizing for retention. Are we aligned on which matters more right now?" They track whether shipped work actually produced the intended outcomes, and they feed that learning back into how the team plans the next cycle.
This role isn't about authority. It's about coherence.
The person filling it doesn't need to be the most senior engineer or the most experienced product person. They need to be the person who asks "are we still building the same thing?" often enough that the answer stays yes.
Beyond the role, the teams getting orchestration right share a handful of practices:
Decision records that prevent re-litigation. When a material decision is made (an architecture choice, a scope call, a priority tradeoff), they write down what was decided, what alternatives were considered, and why this option was chosen. Not a formal document. A few sentences in a shared place that anyone can find. The payoff is enormous: when someone (or some AI tool) encounters a question that was already answered, the answer is there. No re-argument. No "I thought we decided differently." No drift.
Structured context that agents can reference. AI tools are only as good as the context they're given. The teams that get the best output from AI-assisted development maintain a structured, current source of truth: the domain model, the system architecture, the constraints, the decisions that have been made. Not a wiki that someone wrote six months ago and nobody updates. A living reference that gets maintained as a first-class artifact. When an engineer gives their AI tool a task, the tool has access to the same context the rest of the team shares. This is the single biggest leverage point I've found for improving AI-assisted output: not better prompts, better context.
Safety gates that scale with risk. Not every change carries the same blast radius. A copy change and a payment flow change shouldn't go through the same process. Define tiers. Low-risk changes move fast: automated checks, maybe a quick async review. High-risk changes get more scrutiny: manual review, rollback validation, monitoring plan in place before deploy. The key is that the bar isn't arbitrary. It's calibrated to the actual risk of the change. This lets you be fast where speed is safe and careful where caution is warranted.
Outcome measurement that replaces status updates. The old question was "what did you do this sprint?" The better question is "did the system improve?" Not "did we ship"; shipping is trivially easy now. "Did users behave differently? Did the error rate change? Did the metric we cared about actually move?" This isn't a nice-to-have. It's the only way to know whether your team is building the right things or just building things fast.
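Of these practices, risk-scaled safety gates are the easiest to sketch in code. The version below is a rough illustration under stated assumptions: the path prefixes, the two-tier split, and the gate names are all hypothetical, and a real team would classify risk on more than file paths.

```python
from enum import Enum
from typing import List


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical path prefixes treated as high blast radius.
HIGH_RISK_PREFIXES = ("payments/", "auth/", "migrations/")

# Hypothetical gate names, mirroring the tiers described in the text.
GATES = {
    Risk.LOW: ["automated_checks", "async_review"],
    Risk.HIGH: [
        "automated_checks",
        "manual_review",
        "rollback_validation",
        "monitoring_plan",
    ],
}


def classify(changed_paths: List[str]) -> Risk:
    """A change is high risk if any touched path falls under a high-risk prefix."""
    if any(path.startswith(HIGH_RISK_PREFIXES) for path in changed_paths):
        return Risk.HIGH
    return Risk.LOW


def required_gates(changed_paths: List[str]) -> List[str]:
    """Return the gates a change must pass before it can deploy."""
    return GATES[classify(changed_paths)]


print(required_gates(["copy/homepage.md"]))
# ['automated_checks', 'async_review']
print(required_gates(["copy/homepage.md", "payments/checkout.py"]))
# ['automated_checks', 'manual_review', 'rollback_validation', 'monitoring_plan']
```

Keeping the tier rules in one reviewed place is what makes the bar non-arbitrary: anyone can see why a change landed in a given tier and argue for changing the rule rather than bypassing it.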
I want to close with the thing that surprises people most when I describe how the best AI-assisted teams work.
They have more upfront coordination, not less.
The temptation, when AI makes building fast, is to skip the coordination and just start shipping. Build first, integrate later, figure it out as you go. That works for demos and prototypes. It doesn't work for products. Every team I've seen try the "just ship and integrate" approach with parallel AI-assisted work has ended up spending more time on rework than they would have spent on upfront coordination, sometimes three or four times longer.
The best teams invest time in defining constraints before parallel execution begins. They spend thirty minutes on intent and boundaries to save three days of rework. They document decisions to prevent three weeks of re-litigation. They maintain shared context to prevent the slow, invisible drift that turns a coherent product into a collection of features that happen to share a codebase.
Clarity before speed. That's the principle. Not because speed doesn't matter — it matters enormously. But because speed without clarity produces chaos, and chaos at AI-assisted velocity creates messes that take longer to untangle than the speed saved.
We spent twenty years optimizing for developer productivity. AI just handed us a 10x on that metric, maybe more. The next twenty years will be about orchestration: making sure that 10x in individual output doesn't become 10x in systemic chaos. The teams that figure this out first will build things the rest of the industry can't. Not because they have better AI tools. Because they have better systems for making sure fast work stays coherent.
That's the bottleneck now. Not coding. Orchestration. And the teams that treat it with the same rigor they used to apply to engineering excellence are the ones that will pull away.
Written by Skip Marshall