Most Product Managers Are Writing Requirements That Break at Machine Speed
By Andrew Park | 2026-02-09
Last Thursday, Nicholas Carlini at Anthropic published something that made even AI skeptics pause: a functioning C compiler, built almost entirely by coordinated AI agents.
Not a demo. Not toy code. A real compiler (software that translates human-readable code into instructions a computer can execute) that could build real-world programs and pass serious test suites. When I first saw the headlines, I was genuinely amazed. Impressed. Surprised. Even for someone who has been deep in AI for a while, this felt like a real step change at first glance. But once I dug into the details, what impressed me changed. And what I found matters directly to every Product Manager, regardless of technical background. The real breakthrough wasn't the AI's raw capability. It was orchestration. And orchestration is something you already understand as a Product Manager.
A note on terminology: I use "Product Manager" to describe whoever is responsible for defining what to build and what success looks like - whether your organization calls this role Product Manager, Capability Manager, Product Owner, or embeds it in systems engineering. The orchestration skills I'm describing apply to whoever translates operational needs into requirements that development teams execute against.
Why this matters if you're not an engineer
You orchestrate constantly. You break ambiguous requests into clear stories. You define what success looks like. You set up feedback loops through sprint reviews and user testing. You create constraints that guide your team's decisions. You establish the structure that allows capable people to execute effectively. That's orchestration.
What the compiler experiment reveals is that orchestration precision determines everything. Vague requirements that used to slow teams down now break them at machine speed.
When teams use AI tools to execute faster and in parallel, vague problem framing that used to slow down one team now multiplies confusion across multiple engineers working simultaneously. Implicit success criteria that used to surface over weeks now create costly rework within days. Weak feedback loops that human teams could compensate for through conversation and context-sharing break down when AI tools are involved in the execution. The skills are familiar, but the level of rigor they now require leaves little room for error.
You need to be clearer and more precise than before. What used to count as "good enough" problem framing now creates expensive mistakes that happen much faster.
This particularly matters in the defense and national security context I work in, where imprecise requirements don't just create rework - they can mean systems that don't meet operational needs when they reach the field.
Here's what actually happened with the compiler, and why it illuminates where Product Managers should focus their development efforts.
What actually made the compiler possible
The 100,000-line compiler successfully builds real-world programs like sqlite3 and passes 99% of an industry torture test designed to break compilers [1]. It still relies on GCC for the most complex assembly and linking tasks, particularly 16-bit x86 boot code, but the achievement represents a significant milestone. Compilers are widely considered one of the most demanding software engineering challenges because they require understanding code structure, making programs run efficiently, and producing instructions for different types of computer processors simultaneously. As one academic text puts it, "Few software systems bring together as many complex and diverse components" [2]. Even experienced engineers often find building a compiler daunting [3].
The resulting code was ~6x longer than what an expert human team would typically write. That’s a common pattern with today’s AI. AI tends to generate lots of redundant code because it doesn’t naturally organize code the way experienced engineers do. But despite the verbosity, the AI system produced a large, consistent, working C compiler.
The breakthrough wasn't that the AI models suddenly became brilliant. What mattered was orchestration.
Carlini was explicit about this in his writeup [1]. The models were wrapped in structure. Work was scoped deliberately. Responsibilities were divided clearly. Outputs were continuously tested. Failures were fed back into the system. Progress was measured against objective correctness criteria. Critically, the AI didn't discover its own errors. The system used GCC as an "oracle" - running the AI's output through GCC to identify failures, then using binary search to isolate which parts of the AI-generated code were wrong. The AI executed within this framework, but humans designed the error-detection mechanism that made improvement possible.
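The oracle-plus-binary-search idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Carlini's actual harness: the function `isolate_faulty_unit`, the `build_passes` callback, and the file names are all hypothetical, and the sketch assumes exactly one faulty unit. The idea is that a trusted compiler builds everything except a chosen subset, so a failing build points at that subset.

```python
from typing import Callable, Sequence, Set


def isolate_faulty_unit(
    units: Sequence[str],
    build_passes: Callable[[Set[str]], bool],
) -> str:
    """Binary-search for the one unit that breaks the build.

    build_passes(candidate_units) is assumed to build the program with
    candidate_units compiled by the compiler under test and every other
    unit compiled by the trusted oracle, then report whether tests pass.
    Assumes exactly one faulty unit (a simplification).
    """
    suspects = list(units)
    while len(suspects) > 1:
        half = suspects[: len(suspects) // 2]
        if build_passes(set(half)):
            # The first half is clean, so the fault is in the remainder.
            suspects = suspects[len(suspects) // 2:]
        else:
            # The build broke with only the first half under test.
            suspects = half
    return suspects[0]


# Toy stand-in for a real build: the build fails whenever the
# (hypothetical) faulty file is compiled by the compiler under test.
FAULTY = "parser.c"


def fake_build_passes(candidate_units: Set[str]) -> bool:
    return FAULTY not in candidate_units


units = ["lexer.c", "parser.c", "codegen.c", "optimizer.c", "linker.c"]
culprit = isolate_faulty_unit(units, fake_build_passes)
print(culprit)
```

The point for Product Managers is less the search itself than the precondition: the loop only works because someone defined an objective pass/fail signal in advance.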
The experiment required significant compute resources - approximately $20,000 in API credits - underscoring that AI-augmented development shifts costs from human time to computational capital, a trade-off organizations must plan for.
Humans didn't disappear. They moved upstream.
They defined the problem space.
They chose a domain with clear rules.
They designed the orchestration.
They established what correctness meant.
They built the feedback loops that told the system when it was wrong.
The AI didn't actually build the compiler on its own. It didn't invent the architecture. It didn't determine what success meant. What it did was execute relentlessly inside a system that humans had done the hard work of making coherent first. Understanding that distinction reveals where Product Managers should invest their development efforts.
Why this will feel familiar to Product Managers
For many Product Managers, the compiler experiment might have seemed purely technical at first glance. But the underlying pattern is something you already know.

When vibe coding tools like Claude Code, Bolt, Lovable, Replit, or Vercel v0 first appeared, they felt like magic to most Product Managers. It seemed like the AI models must have suddenly leapt forward in capability. But that's not what happened. These tools don't feel powerful because the underlying AI models suddenly advanced. They feel powerful because the developers of these vibe coding tools built orchestration around the models, often with multiple AI agents working underneath the simple interface you see. The developers managed what information the AI sees. They structured work into clear steps. They tied feedback directly to whether things actually work. They constrained what happens next. It's the same basic trick the compiler team used. These tools extract more practical value from models that still have very real limitations.
The five limitations orchestration works around
In a January 2025 Agile Alliance talk, I outlined five core limitations of current Gen AI systems, limitations I was seeing consistently in production use across technical and non-technical work [4]. Those limitations still apply today. The compiler experiment didn't eliminate them. Orchestration just works around them.
Here are the five limitations and how the compiler experiment demonstrates each one:
1. Limited context windows. Current Gen AI systems can't reason over large systems holistically unless humans manage what information they see. The AI didn't hold the entire compiler in its head. Humans compensated by breaking the system into smaller, manageable pieces and controlling what information the AI worked with at any given time.
2. Lack of real-world understanding. They don't know why systems exist, how users depend on them, or which consequences matter outside the text they're given. The AI didn't understand why correctness mattered. Humans anchored success to real software builds and defined what failure meant.
3. Excel at tasks, not whole jobs. They perform well-scoped work but don't own end-to-end responsibility across discovery, delivery, and operations. The AI didn't own the compiler end to end. It executed tasks. Humans provided continuity and integration.
4. Struggle with decision complexity. They don't resolve tradeoffs. They optimize against criteria they're given, even when those criteria conflict. The AI didn't resolve tradeoffs between scope, correctness, and performance. Humans encoded those decisions into constraints and acceptance criteria.
5. Depend on human validation and iteration. Humans must define correctness, validate outputs, and guide iteration. This dependency is structural. The AI didn't know when it was right. Humans created the tests, validation loops, and iteration cycles that made progress possible.
We've watched these five patterns repeat across every context we've worked in, from integrating LLMs into our own products to helping teams adopt AI tools in their workflows. When a tool loses track of earlier decisions, that's the limited context window showing up. When it proposes something that violates user trust or business reality, that's lack of real-world understanding. When it excels at outputs but not coherence, that's task-level strength without end-to-end ownership. When it optimizes the wrong thing, that's unresolved tradeoff complexity. When a Product Manager reviews and corrects its output, that's human validation in action. These aren't flaws to fix. They're properties to design around. And designing around them requires orchestration work with the precision that great Product Managers have always applied.
Why this matters to Product Managers
The compiler itself isn't the point. The point is what happens when the bottleneck moves upstream from execution to orchestration.
For a long time, slow execution acted as a natural shock absorber around upstream product work. Ambiguous problem definitions took time to reveal themselves. Conflicting goals could coexist without being resolved. Success criteria stayed implicit. Discovery gaps were absorbed by delivery timelines and only surfaced late, when course correction was expensive. When building software took months, the cost of imprecision was delayed. AI changes that dynamic. Vague requirements that used to slow you down now multiply into confusion across every team or agent working in parallel.
Here's what's actually changed: not what good Product Management looks like, but how quickly imprecise Product Management creates problems in the post-AI world. The Product Managers who were already doing this work with rigor will accelerate. Those who were getting by with less precision will find that AI exposes gaps rather than compensates for them.
As execution accelerates, the quality of your orchestration translates into outcomes more directly and more quickly. Strong problem framing shows up immediately in results. Well-defined boundaries guide execution cleanly. Clear feedback loops sharpen results faster. Explicit success criteria reduce surprises. Thinking through the full lifecycle pays off earlier. This creates both challenge and opportunity. Product Managers who recognize this and invest in developing sharper orchestration skills will see compounding advantages. Those who don't will find the gap between their capabilities and what's needed widening over time.
In the defense and national security context I work in, these capabilities become even more critical. When systems must work the first time under combat conditions and multiple contractors must integrate components seamlessly, orchestration precision isn't just valuable - it's mission-essential.
Orchestration beats raw intelligence
As execution becomes cheaper and more parallel, raw intelligence matters less than how intelligence is applied. Value shifts away from model capability and toward the human inputs that shape the system around it.
Here's the shift: vague requirements used to be a tax on speed. Now they're a multiplier of waste. When one team executed sequentially, unclear boundaries slowed progress. When ten AI agents execute in parallel, unclear boundaries create ten different interpretations simultaneously.
When organizations ask us about adopting AI in their development efforts, they're usually focused on engineering velocity. Typically, though, their greater constraint is upstream work: both discovering the right capabilities to build and defining them with enough precision that teams can execute effectively. This article focuses on the second of these, orchestration precision: helping teams define clear problem boundaries, establish feedback mechanisms, and build validation criteria that translate across contractors and operational contexts.
For Product Managers, problem framing, boundary definition, feedback design, explicit correctness criteria, and lifecycle ownership have always mattered. What has changed is how directly they translate into outcomes and how quickly imprecision shows up as expensive mistakes. When execution was slow, imprecise orchestration was cushioned by long timelines. Now it shows up immediately. When execution was sequential, unclear framing hurt one workstream at a time. Now it multiplies across parallel efforts. The organizations that recognize this early and invest in developing Product Managers who orchestrate with this level of precision will have a significant competitive advantage.
What sharper orchestration looks like in practice
The difference isn't abstract. The bar for how clear and precise you need to be hasn't risen; it has just become unforgiving. Here's what meeting that bar looks like in practice:
Sharper problem framing means writing problem statements clear enough that someone (or something) could make the first three critical decisions correctly without coming back for clarification. Not just "improve system performance" but "reduce query response time for standard searches from 8 seconds to under 2 seconds while maintaining 99.9% uptime, prioritizing queries from active users over background batch processes."
Clearer boundary definition means being explicit about what's in scope and what's not, and why. Not assumptions that live in your head, but documented constraints that guide every decision downstream. When an engineer using AI tools or a junior Product Manager encounters an edge case, they should know immediately whether it's in or out of scope based on the boundaries you set.
Better feedback design means creating validation criteria explicit enough that progress can be assessed objectively without interpretation. Not "this should feel intuitive" but "task completion rate above 85%, average time under 45 seconds, error rate below 3%."
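One way to see why the measurable version is stronger: it can be written as a check that needs no interpretation. The sketch below is illustrative only. The thresholds come from the example above, while the `UsabilityResults` type, the `failed_criteria` function, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class UsabilityResults:
    task_completion_rate: float   # fraction of tasks completed, 0.0 to 1.0
    average_time_seconds: float   # mean time to complete a task
    error_rate: float             # fraction of attempts with an error


def failed_criteria(results: UsabilityResults) -> List[str]:
    """Return the criteria that were not met; an empty list means success."""
    failures = []
    if results.task_completion_rate <= 0.85:
        failures.append("task completion rate must be above 85%")
    if results.average_time_seconds >= 45:
        failures.append("average time must be under 45 seconds")
    if results.error_rate >= 0.03:
        failures.append("error rate must be below 3%")
    return failures


# Hypothetical measurements from two rounds of user testing.
passing = UsabilityResults(0.91, 38.0, 0.02)
failing = UsabilityResults(0.85, 38.0, 0.05)

print(failed_criteria(passing))
print(failed_criteria(failing))
```

"This should feel intuitive" cannot be turned into a function like this, which is exactly the problem when humans or AI agents need to judge progress on their own.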
A concrete example: consider a team building a data integration capability that needs to correlate information from multiple source systems. If the problem statement is simply "integrate data from multiple sources," engineers working with AI assistance could produce three different interpretations. One might focus on speed of ingestion, another on completeness of data mapping, a third on accuracy of deduplication. All three are "integrating data" but optimizing for different outcomes with incompatible architectures.
The sharper version: "Enable analysts to correlate data from 5+ source systems within a single query, returning results in under 3 seconds, with automated conflict resolution for duplicate records based on timestamp and source priority, and clear lineage showing data provenance for audit trail requirements." Same goal, but now there's one interpretation, clear tradeoffs prioritized (speed over perfect deduplication, automated resolution over manual review), and objective validation criteria.
In defense contexts, this precision becomes even more critical when those source systems span different organizational boundaries and contractor-provided capabilities - where imprecise requirements mean systems that can't interoperate when they reach operational environments.
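The conflict-resolution clause in the sharper requirement ("based on timestamp and source priority") is precise enough to sketch in code, and doing so surfaces one remaining decision: which of the two wins when they disagree. The sketch below assumes the newest record wins and source priority only breaks ties. That ordering, the `Record` type, the source names, and the priority table are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, Tuple

# Hypothetical priority table: a lower number means a more trusted source.
SOURCE_PRIORITY = {"system_a": 0, "system_b": 1, "system_c": 2}


@dataclass(frozen=True)
class Record:
    entity_id: str        # key used to detect duplicates
    source: str           # kept on the record so lineage stays visible
    updated_at: datetime
    value: str


def _rank(record: Record) -> Tuple[datetime, int]:
    # Assumed rule: newer timestamp wins; on a tie, the more trusted source wins.
    return (record.updated_at, -SOURCE_PRIORITY[record.source])


def resolve_duplicates(records: Iterable[Record]) -> Dict[str, Record]:
    """Keep one winning record per entity_id according to _rank."""
    winners: Dict[str, Record] = {}
    for record in records:
        current = winners.get(record.entity_id)
        if current is None or _rank(record) > _rank(current):
            winners[record.entity_id] = record
    return winners


records = [
    Record("E1", "system_b", datetime(2026, 1, 2), "new"),
    Record("E1", "system_a", datetime(2026, 1, 1), "old"),
    Record("E2", "system_c", datetime(2026, 1, 1), "low trust"),
    Record("E2", "system_a", datetime(2026, 1, 1), "high trust"),
]
resolved = resolve_duplicates(records)
print(resolved["E1"].value, "|", resolved["E2"].value)
```

If the intended rule were the reverse (trusted source first, timestamp as tiebreaker), only `_rank` would change, but the results would differ. That is the kind of decision a requirement should settle before ten agents each settle it differently.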
This level of precision has always been valuable. It's now becoming essential.
How to assess your orchestration precision
Here's a simple test: Take your last three problem statements or feature specs. Could a new junior Product Manager or engineer using AI tools read them and make the first three critical decisions without coming back to you?
If yes, your orchestration is likely precise enough for AI-augmented work.
If no, here's where the gaps typically show up:
Problem statements that describe what to build but not why, leaving the actual problem undefined. The solution is prescribed but the problem remains ambiguous, so anyone executing has to guess at intent.
Success criteria that use subjective language like "intuitive," "fast," or "simple" instead of measurable outcomes. These require interpretation, and different interpreters (human or AI) will optimize for different things.
Boundaries that exist only in your head, requiring tribal knowledge to interpret. When edge cases appear, there's no documented principle to guide the decision, so each case becomes a new negotiation.
Make implicit knowledge explicit. Document the constraints. Define success objectively. Write problem statements that include enough context for others to make decisions aligned with your intent.
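The second gap, subjective language in success criteria, is easy to screen for mechanically. The sketch below is a rough first-pass check, not a substitute for review: the term list starts from the three words named above, and the function name `find_subjective_terms` is hypothetical.

```python
import re
from typing import List

# Starting list taken from the examples above; extend it for your own specs.
SUBJECTIVE_TERMS = ("intuitive", "fast", "simple")


def find_subjective_terms(criterion: str) -> List[str]:
    """Return the subjective terms that appear as whole words in the text."""
    lowered = criterion.lower()
    return [
        term
        for term in SUBJECTIVE_TERMS
        if re.search(rf"\b{re.escape(term)}\b", lowered)
    ]


vague = "The search should feel fast and intuitive."
measurable = "Task completion rate above 85%, average time under 45 seconds."

print(find_subjective_terms(vague))
print(find_subjective_terms(measurable))
```

A flagged term does not prove a criterion is bad, and a clean result does not prove it is measurable. The check only tells you where to look first.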
The Product Managers who were already working this way will find AI amplifies their effectiveness. Those who weren't will find AI exposes their gaps quickly.
What this means for your development as a Product Manager
The most important takeaway from the compiler experiment isn't about AI. It's about where to focus your development efforts.
AI doesn't make Product Managers obsolete. But it does change what determines effectiveness. The Product Managers who invest in deepening their orchestration skills, who learn to frame problems with precision, define boundaries explicitly, design feedback loops that drive learning, make tradeoffs visible, and own the full lifecycle, will adapt successfully to this shift. These are trainable skills. They require deliberate practice to develop to the level now needed, but the path forward is clear.
Product Managers who deepen rigor in these areas will see the results directly. Their intent translates into outcomes faster. Their teams waste less effort. Their products behave more predictably. Their effectiveness becomes visible.
For organizations navigating this transition, the challenge isn't just understanding that these skills matter more. It's creating the conditions where Product Managers can develop them while shipping under pressure. This is where focused investment pays off quickly, whether through targeted coaching, deliberate practice in production work, or learning from those who have already navigated this transition.
What this moment offers isn’t a crisis to manage, but clarity about where to invest. The criteria for an effective Product Manager haven’t changed, but the importance of certain skills has increased sharply, and the consequences of lacking them have become more immediate. Those skills are identifiable, trainable, and within reach. The opportunity now is to invest in developing them deliberately.
The compiler experiment illuminates something important: when execution accelerates, orchestration becomes the primary determinant of outcomes. Vague requirements now break at machine speed, and the precision that always separated great Product Managers from good ones has become the difference between effective and ineffective.
Product Managers who recognize this and invest in meeting that bar will define what effectiveness looks like in this new reality. Organizations that invest in developing this capability across their Product Manager workforce will accelerate delivery, reduce costly rework, and deliver systems that meet requirements the first time.
That’s what the compiler experiment is really illuminating for Product Managers and why it matters for every organization building complex systems on accelerated timelines.
References
[1] Carlini, N. "Building a C Compiler with LLM Agents." Personal blog, January 2025. https://nicholas.carlini.com/writing/2025/building-a-c-compiler-with-llm-agents.html
[2] Cooper, K.D. and Torczon, L. "Compiler Construction." ScienceDirect Topics in Computer Science, 2024.
[3] Cornell University. "Introduction to Compiler Design." CS 4120 Course Notes, 2022.
[4] Park, A. Supercharging Agile Professionals with AI: A Roadmap to Greater Productivity and Skill Growth. Agile Alliance Online Session, January 2025.
