The Agentic Resistance: Why Critics Are Missing the Paradigm Shift

A retrospective on Project Vend-1: How Thomas Kuhn's framework explains why institutional skepticism of agentic AI reflects predictable paradigm resistance rather than technological failure.

Jul 15, 2025

When Anthropic released Claudius through its Project Vend-1 research initiative, the technology community watched as the system attempted to operate computer interfaces autonomously.

Claude (named “Claudius”) autonomously ran a small automated shop in Anthropic's office for about a month, managing real business operations like inventory, pricing, and customer service — one of the most ambitious attempts at genuine agentic AI to date. Claude had tools to search the web, email suppliers, track finances, and interact with customers via Slack, but ultimately failed to turn a profit due to mistakes like selling items at a loss, giving excessive discounts, and poor inventory management. The experiment included a notable "identity crisis" episode where Claude briefly believed it was a real person who could physically deliver products while wearing a blue blazer and red tie!

The results proved mixed, revealing both the potential and the current limitations of AI in economic roles. Critics quickly seized on these limitations as evidence that agentic AI remains fundamentally premature, with prominent voices arguing that the technology represents sophisticated automation rather than genuine artificial agency.

This reaction follows a predictable pattern that Thomas Kuhn identified in The Structure of Scientific Revolutions. When paradigm shifts emerge, established communities resist new frameworks not because they lack merit, but because they challenge fundamental assumptions about how systems should operate. The skepticism aimed at Claudius echoes the more public critiques leveled at other early agentic systems, from the mixed reception of the Rabbit R1 to the disillusionment that followed the initial hype around frameworks like Auto-GPT. The backlash against these projects reflects paradigm resistance rather than objective technological assessment, with profound implications for institutional investors and technology executives as the generative AI discontinuity continues to unfold.

The Structure of Scientific Resistance

Kuhn observed that scientific communities resist paradigm shifts through systematic skepticism of new approaches that fail to meet established criteria. Critics evaluate new paradigms using old metrics, dismiss early failures as evidence of fundamental inadequacy, and maintain faith in incremental improvements to existing frameworks rather than wholesale reconceptualization.

The criticism of Claudius exhibits these characteristics precisely. When the system struggles with complex web navigation, critics conclude that autonomous AI remains years from practical deployment. When safety constraints limit operational scope, organizations question whether agentic approaches offer genuine advantages over traditional automation. When implementations require extensive oversight, executives retreat to safer generative AI investments. And what is one to say about Claudius’s identity crisis?

This resistance misses the true nature of paradigm transitions. Early implementations of revolutionary technologies invariably appear inferior to established alternatives when evaluated by conventional metrics. The first automobiles proved slower than horses. Early computers required more effort than mechanical calculators. Initial internet applications seemed less efficient than existing communication methods.

Even prominent AI experts emphasize the gradual nature of this transformation. Andrej Karpathy recently cautioned: "When I see things like 'oh, 2025 is the year of agents!!' I get very concerned and I kind of feel like, this is the decade of agents. And this is going to [take] quite some time. We need humans in the loop. We need to do this carefully."

Organizations attempting to evaluate agentic systems like Claudius using generative AI expectations encounter systematic disappointment that reflects paradigmatic incompatibility rather than technological inadequacy.

The Complexity Behind Agentic Systems

The challenges facing Claudius illuminate the sophisticated technical architecture required for genuine autonomous operation. Unlike generative AI systems that respond to prompts with static outputs, agentic systems must integrate multiple complex capabilities: knowledge retrieval from diverse sources, tool utilization across different software environments, strategic planning for multi-step objectives, and iterative reasoning that adapts to changing circumstances.

Claudius demonstrates this complexity in practice. The system must retrieve relevant information about task environments, including website structures and contextual requirements. It needs access to appropriate interaction tools, from basic clicking to sophisticated form completion. The planning component requires breaking down high-level objectives into executable steps while maintaining awareness of dependencies. Finally, iterative reasoning enables adaptation when initial approaches prove inadequate or conditions change.
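To make the four capabilities concrete, here is a minimal sketch of how retrieval, tool use, planning, and iterative retry might fit together in one loop. It is an illustration only, not Anthropic's implementation: the `Agent` class, the toy tools, and the fixed two-step plan are all hypothetical stand-ins for what would be model-driven components in a real system.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical sketch: every name below is illustrative, not from Project Vend.

@dataclass
class Agent:
    knowledge: dict[str, str]                  # stand-in for knowledge retrieval
    tools: dict[str, Callable[[str], str]]     # stand-in for tool access
    max_attempts: int = 2                      # retries allowed per step
    log: list[str] = field(default_factory=list)

    def retrieve(self, topic: str) -> str:
        """Look up whatever context is available for the objective."""
        return self.knowledge.get(topic, "")

    def plan(self, objective: str) -> list[tuple[str, str]]:
        """Break the objective into ordered steps (fixed here for simplicity)."""
        return [("check_stock", objective), ("set_price", objective)]

    def run(self, objective: str) -> list[str]:
        context = self.retrieve(objective)
        for tool_name, arg in self.plan(objective):
            # Iterative reasoning reduced to its simplest form: retry on failure.
            for attempt in range(1, self.max_attempts + 1):
                try:
                    result = self.tools[tool_name](f"{arg}|{context}")
                    self.log.append(f"{tool_name}: {result}")
                    break
                except RuntimeError as err:
                    self.log.append(f"{tool_name} failed (attempt {attempt}): {err}")
        return self.log


def make_flaky_price_tool() -> Callable[[str], str]:
    """A toy tool that fails on its first call, then succeeds."""
    calls = {"n": 0}

    def set_price(arg: str) -> str:
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("supplier cost unavailable")
        return "priced above cost"

    return set_price


agent = Agent(
    knowledge={"snack": "unit cost 1.50"},
    tools={"check_stock": lambda arg: "in stock", "set_price": make_flaky_price_tool()},
)
log = agent.run("snack")
for entry in log:
    print(entry)
```

Even in this toy form, the hard part is visible: each step depends on context gathered earlier, and a failure midway has to be handled without abandoning the overall objective.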

This technical complexity represents a feature rather than a flaw. Kuhn noted that paradigm shifts often involve increased complexity as new frameworks attempt to account for phenomena that previous approaches could not address. The sophistication required for autonomous operation reflects the ambitious scope of problems that agentic systems aim to solve.

Organizations that dismiss agentic approaches based on this complexity misunderstand the nature of the capability being developed. The technical challenges are essential characteristics of systems designed to operate autonomously in complex, dynamic environments.

Infrastructure Requirements for Autonomous Operation

The implementation challenges facing Claudius reflect infrastructure gaps that organizations have yet to address systematically. Agentic systems require control planes designed for non-deterministic environments where autonomous actors must coordinate while adapting to changing conditions.

Traditional enterprise software assumes deterministic execution paths with predictable inputs and outputs. Claudius operates in fundamentally uncertain environments where it must navigate ambiguous user interfaces, interpret visual elements that vary across websites, and make decisions based on incomplete information about user intentions or system states.

The infrastructure requirements extend beyond technical architecture to encompass governance frameworks that enable autonomous operation while maintaining organizational control. Current enterprise governance assumes human decision-makers at critical junctions. Agentic systems require governance models that define autonomous decision boundaries, establish escalation protocols, and provide oversight mechanisms for distributed operations.
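One way to picture such a governance model is as a policy check that sits between the agent and the action it wants to take. The sketch below is hypothetical: the `Policy` thresholds, the `PriceAction` fields, and the three-way allow/escalate/deny outcome are assumptions chosen to mirror the pricing and discount mistakes described earlier, not a description of any real deployment.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical sketch: names and thresholds are illustrative assumptions.

class Decision(Enum):
    ALLOW = "allow"        # inside the agent's autonomous boundary
    ESCALATE = "escalate"  # needs human sign-off
    DENY = "deny"          # never permitted


@dataclass(frozen=True)
class Policy:
    max_discount_pct: float       # agent may discount up to this on its own
    hard_discount_cap_pct: float  # discounts beyond this are always refused


@dataclass(frozen=True)
class PriceAction:
    unit_cost: float
    list_price: float
    discount_pct: float


def review(action: PriceAction, policy: Policy, audit_log: list[str]) -> Decision:
    """Classify a proposed price and record the outcome for oversight."""
    final_price = action.list_price * (1 - action.discount_pct / 100)
    if final_price < action.unit_cost or action.discount_pct > policy.hard_discount_cap_pct:
        decision = Decision.DENY          # selling at a loss, or past the hard cap
    elif action.discount_pct > policy.max_discount_pct:
        decision = Decision.ESCALATE      # plausible, but outside autonomous bounds
    else:
        decision = Decision.ALLOW
    audit_log.append(f"{decision.value}: price {final_price:.2f}, cost {action.unit_cost:.2f}")
    return decision


policy = Policy(max_discount_pct=10, hard_discount_cap_pct=30)
audit_log: list[str] = []
decisions = [
    review(PriceAction(unit_cost=1.50, list_price=2.00, discount_pct=5), policy, audit_log),
    review(PriceAction(unit_cost=1.50, list_price=2.00, discount_pct=20), policy, audit_log),
    review(PriceAction(unit_cost=1.50, list_price=2.00, discount_pct=40), policy, audit_log),
]
print([d.value for d in decisions])
```

The design choice worth noting is that the boundary, the escalation path, and the audit trail are defined outside the agent, so they hold regardless of how the agent reasons.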

Organizations that recognize these infrastructure requirements and invest in building appropriate foundations will capture significant competitive advantages as agentic capabilities mature.

The Context and Workflow Foundation

The architectural challenges evident in Claudius implementations emphasize the critical importance of context and workflow design in agentic systems. Autonomous agents cannot operate effectively without rich contextual foundations that provide the environmental awareness necessary for intelligent decision-making.

Context encompasses far more than prompt engineering or knowledge base access. Effective agentic systems require deep integration with organizational data sources, real-time awareness of business conditions, and understanding of strategic objectives that guide autonomous behavior. The context layer must provide agents with sufficient environmental understanding to make decisions that align with organizational goals while adapting to changing circumstances.

The emergence of context engineering as a distinct discipline reflects the recognition that contextual sophistication represents a genuine competitive moat in agentic systems. Organizations that master the art of constructing rich, dynamic contextual environments for their autonomous agents will capture advantages that prove difficult for competitors to replicate.

Workflow redesign represents an equally critical requirement that most organizations underestimate. Traditional business processes assume human intelligence at decision points, with linear task sequences that accommodate human cognitive patterns. Agentic workflows must be architected around autonomous coordination, with parallel execution paths, dynamic exception handling, and coordination mechanisms that operate without constant human supervision.
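As a rough illustration of what parallel execution with dynamic exception handling could look like, the sketch below runs several restocking tasks concurrently and routes any failure to an escalation list instead of halting the whole workflow. The `restock` task, the item names, and the failure condition are hypothetical.

```python
import asyncio

# Hypothetical sketch: task names and the failure condition are illustrative.

async def restock(item: str) -> str:
    """Pretend to place a supplier order; one item is made to fail."""
    await asyncio.sleep(0)  # yield control, standing in for real I/O
    if item == "specialty item":
        raise ValueError("no supplier quote")
    return f"{item}: ordered"


async def run_workflow(items: list[str]) -> tuple[list[str], list[str]]:
    # Run all tasks in parallel; collect exceptions instead of raising them.
    results = await asyncio.gather(
        *(restock(item) for item in items), return_exceptions=True
    )
    done: list[str] = []
    escalated: list[str] = []
    for item, result in zip(items, results):
        if isinstance(result, Exception):
            escalated.append(f"{item}: {result}")  # hand off to a human
        else:
            done.append(result)
    return done, escalated


done, escalated = asyncio.run(run_workflow(["soda", "specialty item", "chips"]))
print("done:", done)
print("escalated:", escalated)
```

One failed branch does not block the others, and the exception becomes an item for human review rather than a crash, which is the opposite of the linear, human-paced sequence most existing processes assume.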

Organizations that successfully implement systems like Claudius recognize that workflow redesign must be foundational to the entire approach, requiring significant investment in reimagining business processes around autonomous capabilities.

Learning from Early Implementations

The mixed results from projects like Claudius provide valuable insights for organizations considering agentic AI investments. The key lesson is not that autonomous systems remain premature, but that successful implementation requires different approaches than those developed for generative AI deployment.

The technical challenges encountered by Claudius reflect the inherent complexity of autonomous operation rather than fundamental flaws in the agentic approach. Knowledge retrieval, tool utilization, planning, and iterative reasoning represent essential capabilities for any system designed to operate independently in complex environments. The difficulty of integrating these capabilities effectively should be expected rather than surprising.

The safety constraints that limit Claudius's operational scope demonstrate necessary development practices rather than technological limitations. The infrastructure requirements revealed by implementations highlight the need for comprehensive foundational investments rather than incremental automation improvements.

Organizations that attempt to deploy agentic systems without building appropriate control planes, governance frameworks, and contextual foundations will experience systematic failures regardless of the underlying technology's capabilities.

Strategic Implications of Claudius's Successes and Challenges

For institutional investors and technology executives, the current skepticism surrounding projects like Claudius represents both risk and opportunity. Organizations that dismiss agentic approaches based on early implementation challenges may miss critical shifts in competitive dynamics as autonomous systems mature and supporting infrastructure develops.

The trajectory of progress, while still in early stages, demonstrates accelerating capability development. Recent research from METR shows measurable improvements in AI systems' ability to complete long, complex tasks that require sustained autonomous operation. While significant challenges remain, the speed of advancement suggests that agentic systems represent an inevitable technological evolution rather than speculative research. Organizations that recognize this trajectory and invest accordingly will be positioned to capture competitive advantages as the technology matures.

The investment strategy should focus on companies developing the full "agentic stack" rather than those deploying individual autonomous tools. This means looking beyond the AI models themselves to the companies building the necessary control planes for non-deterministic environments, the governance frameworks for autonomous operation, and the contextual foundations that allow agents to function effectively within an enterprise. This approach requires identifying organizations that recognize these infrastructure requirements, invest in foundational development practices, and commit to the sustained effort necessary for effective agentic implementation.

The timeline for competitive differentiation through agentic capabilities will likely extend over several years as the technology matures and organizations develop operational competencies. Early movers that invest in building foundational capabilities during the current skepticism phase will be positioned to capture significant advantages as autonomous systems become more reliable and easier to deploy.

The Path Forward

The challenges facing systems like Claudius should be understood as normal growing pains in a genuine paradigm shift rather than evidence of technological inadequacy. Kuhn's framework suggests that revolutionary technologies typically encounter significant resistance and implementation challenges before achieving widespread adoption and competitive impact.

The development process must prioritize systematic capability building that addresses the unique requirements of autonomous systems while building the foundational infrastructure necessary for effective deployment. This approach requires sustained organizational commitment and willingness to invest in long-term competitive positioning rather than immediate productivity gains. The agentic revolution proceeds regardless of institutional resistance or early implementation setbacks, and the strategic opportunity lies in recognizing this reality while competitors remain focused on incremental improvements to existing approaches.

Thanks for reading and being part of this. Let's keep decoding!

Best,
Raphaëlle d'Ornano

Jurgen Appelo

I'm sorry to disagree. There are much more fundamental problems that need to be solved than just "we need an extra layer of context." The current technology stack is great for addressing complicated, solvable problems that can be benchmarked. But the current technologies are ill-suited for handling the complex, wicked problems that we encounter when actually running a business, as evidenced by Claudius. LLMs and the protocols on top of them will never achieve that. I side with Gary Marcus and other skeptics on that matter.

I *do* believe, however, that we *will* be able to address these problems, but other technological breakthroughs are needed first. Therefore, I find it unfair to dismiss critics as the "old guard" who simply don't understand the paradigm shift. On the contrary, we understand the paradigm shift very well! We're just less naive about the current tech stack and we think a different direction is needed to address truly wicked problems.

BTW, the vending cart problem was exactly what I predicted earlier this year.

https://substack.jurgenappelo.com/p/agicomplex-not-complicated

Decoding Discontinuity
