On February 2, 2025, Andrej Karpathy—OpenAI co-founder and former head of AI at Tesla—wrote on social media: "There's a new kind of coding I call vibe coding, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists."[1] Ten months later, this seemingly casual coinage was selected as Collins Dictionary's Word of the Year for 2025, after lexicographers observed a dramatic explosion in the term's usage across a 24-billion-word corpus.

One year on, by February 2026, 41% of all code globally was generated by AI, 92% of American developers used AI coding tools daily, and 87% of Fortune 500 companies had adopted some form of AI-assisted development platform.[2] In Y Combinator's Winter 2025 batch, 25% of startups had codebases that were 95% AI-generated—achieving $10 million in revenue with teams of fewer than ten people.[3] These figures paint the picture of a seemingly unstoppable technological revolution.

Yet a second body of data from the same period tells a starkly different story. METR's randomized controlled experiment found that senior developers were actually 19% slower when using AI tools; GitClear observed a 48% increase in copy-paste patterns and a 60% decrease in refactoring across 211 million lines of code; a Georgetown University report revealed that roughly 40% of AI-generated code samples contained known security vulnerabilities; and one in five organizations experienced a serious cybersecurity incident caused by AI-generated code. Through my experience leading AI software development at Meta Intelligence, and my prior technology governance research at the University of Cambridge, I have come to appreciate a profound truth: Vibe Coding is not merely a new programming methodology—it is triggering an existential crisis for the profession of software engineering.
I. The Birth and Evolution of Vibe Coding: From Word of the Year to Professional Methodology
When Karpathy introduced Vibe Coding, he was describing a personal, low-pressure AI-assisted programming experience—he used AI tools to build weekend projects without reviewing the generated code line by line, instead "just seeing if the results work." This attitude is perfectly reasonable in itself—much like a carpenter occasionally using power tools to quickly assemble a shelf without pursuing the ultimate craft of mortise-and-tenon joinery. However, Vibe Coding rapidly evolved from Karpathy's personal habit into an industry-wide movement. By the end of 2025, it was no longer a fringe subculture but the standard narrative Silicon Valley venture capitalists used to evaluate startups.
Y Combinator's data provides the most compelling evidence of this movement. In its Winter 2025 batch (W25), 25% of startups had codebases that were 95% AI-generated.[3] YC president Garry Tan declared: "This is not a fad, it's not going away, this is the mainstream way of programming in the future." The W25 batch achieved the fastest growth rate in YC's history—an overall 10% week-over-week growth rate—seemingly validating Vibe Coding's commercial viability. Yet a more precise question is: does the rapid growth of these startups stem from the quality of their AI code, or from their advantages in market timing, business models, and founder capabilities? Correlation, after all, is not causation.
Intriguingly, Karpathy himself declared Vibe Coding "passé" one year later, in February 2026, proposing "Agentic Engineering" as its mature successor.[4] His new thesis: "Programming via LLM agents is becoming the default workflow for professional practitioners—but with greater oversight and review." The conceptual evolution from Vibe Coding to Agentic Engineering reflects an important cognitive correction—the industry began to recognize that "forgetting the code even exists" might work for weekend projects, but not for production systems that underpin business operations.
The market size of AI coding tools confirms the depth of this transformation. GitHub Copilot has reached 20 million cumulative users, including 1.3 million paid subscribers—with 46% of code among active users generated by Copilot, a significant increase from the 27% at its 2022 launch.[5] Even more striking is Cursor (developed by Anysphere)—this AI-first code editor crossed $100 million in annual recurring revenue (ARR) in January 2025, surpassed $500 million by June, broke through $1 billion by year-end, and reached a valuation of $29.3 billion.[6] The overall AI coding tools market reached $7.37 billion in 2025 and is projected to grow to $325 billion by 2040.[2] The sheer scale of these figures makes clear that AI-assisted programming is not a marginal trend to be ignored—it has become foundational infrastructure for the software industry.
II. The Quality Crisis of AI-Generated Code: What the Data Reveals
Beneath the narrative of productivity gains, the empirical data on code quality presents a deeply unsettling picture.
GitClear's large-scale analysis provides the most comprehensive empirical study to date on AI's impact on code quality. By analyzing the change history of 211 million lines of code from 2020 to 2024, the study identified three structural quality degradation trends.[7] First, copy-paste patterns increased by 48%—developers (or AI) increasingly favored copying existing code over designing reusable abstractions. Second, refactoring activity decreased by 60%—meaning developers spent less time improving the structure and maintainability of existing code. Third, code churn (the rate at which newly written code is modified within two weeks) rose from 3.1% in 2020 to 5.7% in 2024—indicating that a growing share of code was found to be problematic and required modification shortly after submission. Of even greater structural significance, the proportion of code changes attributable to refactoring fell from 25% in 2021 to under 10% in 2024. Within the professional standards of software engineering, refactoring is a critical activity for maintaining the long-term health of a codebase—a sharp decline in refactoring is analogous to a city halting road maintenance: no problems in the short term, but systemic infrastructure decay in the long run.
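GitClear's exact methodology is proprietary, but the churn metric it reports is simple to state. The following is a minimal sketch (not GitClear's actual implementation) of how a churn-style measurement could be computed, assuming a simplified model in which each newly written line records when it was added and when, if ever, it was first modified:

```python
from datetime import datetime, timedelta

# "Churn" here follows GitClear's public definition: newly written code
# that is modified or reverted within two weeks of being committed.
CHURN_WINDOW = timedelta(days=14)

def churn_rate(lines):
    """lines: one (added_at, first_modified_at) pair per newly written
    line; first_modified_at is None if the line was never touched again."""
    if not lines:
        return 0.0
    churned = sum(
        1 for added, modified in lines
        if modified is not None and modified - added <= CHURN_WINDOW
    )
    return churned / len(lines)

# Toy history: 20 new lines, one rewritten after 4 days (churn), one
# refactored months later (not churn), the rest untouched.
history = [(datetime(2024, 3, 1), None)] * 18
history.append((datetime(2024, 3, 1), datetime(2024, 3, 5)))
history.append((datetime(2024, 3, 1), datetime(2024, 6, 20)))

print(f"churn rate: {churn_rate(history):.1%}")  # → churn rate: 5.0%
```

The point of the metric is that churned code represents wasted motion: work that was shipped and then almost immediately had to be redone.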
METR's randomized controlled experiment yielded perhaps the most disruptive finding.[8] The research team recruited 16 senior open-source developers—maintainers of repositories averaging over 22,000 GitHub stars and more than one million lines of code—for a rigorous randomized controlled trial. Developers were randomly assigned to complete tasks with or without AI tools (Cursor Pro with Claude 3.5/3.7 Sonnet). The result was counterintuitive: developers using AI tools completed tasks 19% slower. Even more thought-provoking was the cognitive bias the study revealed. Before the experiment, developers predicted AI tools would make them 24% faster; afterward, they self-reported believing they had been 20% faster—while the objective measurement showed a 19% slowdown. This 39-percentage-point gap between perceived speedup (+20%) and measured slowdown (−19%) exposes a dangerous cognitive trap: AI tools may make developers feel more productive, but for complex tasks requiring deep understanding, the actual effect can be the opposite.
The security vulnerability data is equally alarming. A research report from Georgetown University's Center for Security and Emerging Technology (CSET) found that approximately 40% of 1,689 programs generated by GitHub Copilot contained known security vulnerabilities from the MITRE CWE Top 25.[9] Aikido Security's survey of 450 organizations across the US and Europe found that one in five organizations experienced a serious cybersecurity incident caused by AI-generated code—the rate was even higher in the US at 43%, compared to 20% in Europe (a discrepancy that may reflect the protective effect of the EU's more stringent regulatory environment).[10] The most common vulnerability types included cross-site scripting (XSS), SQL injection, hardcoded credentials, and path traversal. What is particularly concerning is that these are all well-known, foundational vulnerabilities in the field of software security—AI is not introducing novel types of security flaws, but rather mass-reproducing the mistakes that human engineers had learned to avoid over decades.
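These vulnerability classes are decades old and well documented. As a minimal illustration (not drawn from the CSET dataset itself), here is the SQL-injection pattern (CWE-89) that such audits repeatedly flag in generated code, next to the parameterized form that avoids it, using Python's standard-library sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

# VULNERABLE (CWE-89): string interpolation lets user input rewrite the query.
def find_user_unsafe(name):
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'").fetchall()

# SAFE: a parameterized query treats the input strictly as data.
def find_user_safe(name):
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)).fetchall()

payload = "' OR '1'='1"              # classic injection string
print(find_user_unsafe(payload))     # matches every row: [('alice',)]
print(find_user_safe(payload))       # matches nothing: []
```

The fix has been standard practice for over twenty years, which is precisely the article's point: AI is reintroducing solved problems at scale.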
Developers' own attitudes reflect this contradiction. Survey data shows that only 3% of developers have high trust in AI-generated code; 71% refuse to merge AI code without human review; 63% report having spent more time debugging AI-generated code than it would have taken to write it from scratch; and 53% have discovered security issues that passed initial review.[2] These data points collectively paint a paradoxical reality: 92% of developers use AI tools daily, yet only 3% have high trust in their output. This is not the profile of a user base satisfied with its tools—it more closely resembles a group of professionals compelled by market pressure to adopt a technology that has not yet matured.
III. The Technical Debt Tsunami and Cognitive Debt: A Dual Crisis for Software Engineering
The quality issues in AI-assisted development, when projected across time, amplify into a tsunami of technical debt. Forrester predicts that by 2026, 75% of enterprises will face moderate to high severity technical debt—directly attributable to AI-driven rapid development.[11] Stack Overflow published a bluntly titled analysis in January 2026: "AI can 10x developers...in creating tech debt."[12]
The economics of technical debt merit close examination. While AI tool vendors claim a 50% increase in development speed, a detailed analysis of actual first-year costs reveals that AI-assisted development's total cost is approximately 12% higher than traditional development—owing to a 9% overhead in additional code review, a 1.7x testing burden, and a 2x code rewrite rate.[13] By the second year, without proactive technical debt management, maintenance costs will surge to four times those of traditional development. This means the "speed" of Vibe Coding is not free—it is essentially borrowing future maintenance costs to spend in the present, much like credit card spending: deferred payment does not mean no payment.
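The cited analysis does not publish its full cost model, so the following back-of-the-envelope sketch combines its reported multipliers with a baseline effort split that is my own illustrative assumption (chosen so the toy model lands near the reported ~12% figure); the point is the direction of the effect, not the exact number:

```python
# Baseline share of first-year effort by activity (ASSUMED, for
# illustration only -- the cited analysis does not publish its weights).
baseline = {"writing": 0.60, "review": 0.10, "testing": 0.20, "rewrites": 0.10}
assert abs(sum(baseline.values()) - 1.0) < 1e-9

# Multipliers as reported by the cited analysis.
ai_assisted = {
    "writing":  baseline["writing"] / 1.5,   # 50% faster initial coding
    "review":   baseline["review"] + 0.09,   # 9% extra review overhead
    "testing":  baseline["testing"] * 1.7,   # 1.7x testing burden
    "rewrites": baseline["rewrites"] * 2.0,  # 2x code rewrite rate
}

total = sum(ai_assisted.values())
print(f"relative first-year cost: {total:.2f}x baseline")  # → 1.13x
```

Under these assumed weights, the savings on initial coding are swamped by the added review, testing, and rewrite burden: the headline speedup buys a net first-year cost increase.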
However, deeper and harder to detect than technical debt is an entirely new concept proposed in February 2026 by Professor Margaret-Anne Storey of the University of Victoria—"cognitive debt."[14] Cognitive debt refers to the systemic erosion of human understanding of code when AI writes it on our behalf—the context behind design decisions, the interaction logic between system components, the boundary conditions for error handling—all gradually drain away. Unlike technical debt (which resides in the code and can be repaid through refactoring), cognitive debt resides in people's minds. Once a team has lost its shared understanding of a system, the only way to repay cognitive debt is to re-read and re-comprehend the entire codebase—which is often more time-consuming than writing it from scratch.
Storey cited a real-world case: a student team using AI tools progressed rapidly during the first six weeks of their project—AI quickly generated large volumes of code, and features shipped fast. But by weeks seven and eight, the team hit a wall—no member could explain how design decisions had been made or how the system's components worked together. They possessed a system that "could run," but no one "understood" it. This was not technical debt—the code itself may have been clean, functional, and even fully covered by tests. But the team had "lost the plot." Simon Willison and Martin Fowler, two thought leaders in the software engineering community, subsequently expanded on this concept, sparking widespread industry discussion.
The concept of cognitive debt has profound implications for corporate digital resilience strategy. When an enterprise's core systems increasingly depend on AI-generated code while the human capacity to understand that code simultaneously atrophies, the organization faces not merely technical risk but structural fragility in organizational resilience—should AI tools become unavailable (due to service outages, vendor policy changes, or regulatory restrictions), the enterprise may find itself in possession of a system that no one can modify. In my prior research on digital infrastructure conducted for the World Bank, we repeatedly emphasized that "system resilience" depends not only on the robustness of the technology but also on the depth of capability within the maintenance team—cognitive debt is eroding the latter.
IV. The Disappearance of Junior Engineers: How AI Is Dismantling the Software Talent Pipeline
If cognitive debt represents the hidden organizational cost of AI-assisted development, the contraction in junior engineer hiring is its structural consequence at the industry level—and its impact may prove more far-reaching than any technical issue.
The data is stark. Entry-level hiring at the world's top fifteen tech companies fell by 25% between 2023 and 2024.[15] Entry-level tech positions in the UK decreased by 46% in 2024, and are projected to decline by 53% by the end of 2026. Some US datasets show a 67% drop in junior developer positions. New graduate hiring at major global tech companies has fallen by over 50% within three years. Fully 54% of engineering leaders plan to reduce entry-level hiring. Meanwhile, the number of computer science graduates and coding bootcamp enrollees continues to rise—creating a dangerous "talent pipeline paradox": supply increasing while demand plummets.
On the surface, the logic behind this trend is straightforward: AI can perform most of the work previously done by junior developers (writing boilerplate code, fixing simple bugs, running tests, generating documentation), and it does so faster and at lower cost. But this logic overlooks a structural issue—junior developers are not merely "people who complete entry-level tasks" but rather the "pipeline for cultivating future senior developers." If companies drastically reduce entry-level hiring, they will face a severe shortage of senior engineers five to ten years from now. This is a classic collective-action problem—individually, each company's decision to cut junior hiring is a rational cost optimization; but when the entire industry does so collectively, it will destroy the talent regeneration mechanism of software engineering.
Stack Overflow's analysis pointed more directly to the generational impact: "AI vs. Gen Z"—AI has fundamentally altered the career trajectory of Generation Z developers.[16] In the past, a new developer's growth path proceeded from writing simple features to gradually participating in more complex system design, building over years of practice a deep understanding of software architecture, systems thinking, and engineering trade-offs. But in a world where AI writes most of the code, this "learning by doing" path has been severed—if you have never written a complete feature module from scratch, never traced a hard-to-reproduce bug by hand, never spent late nights poring over an obscure piece of code to decipher its design intent, how can you develop the professional intuition needed to judge the quality of AI output?
The depth of this problem lies in the fact that it affects not only individual career development but also the transmission of knowledge across the entire software engineering profession. Much of software engineering's core knowledge—architectural judgment, systems thinking, debugging intuition—constitutes tacit knowledge that cannot be conveyed through documentation or textbooks, but can only be acquired through extended practice and mentorship. When AI replaces the practice opportunities of junior engineers, these channels for transmitting tacit knowledge are severed. In my prior research on higher education reform and industry-academia collaboration, I repeatedly observed a pattern: when the apprenticeship system in a professional field is disrupted, that field typically experiences a systemic decline in quality one generation later. Software engineering is now facing the same risk.
V. From Programmer to System Orchestrator: Redefining the Software Engineering Profession
In response to AI's comprehensive impact on software engineering, the industry is attempting to redefine the profession. Gartner predicts that 80% of software engineers must upskill in AI-assisted development by 2027.[17] A widely discussed framework is the role transformation "from coder to orchestrator"—developers no longer primarily write code, but instead design system architectures, delegate tasks to AI agents, review the quality of AI output, and make the engineering judgments that AI cannot—precisely the core challenge facing CTO leadership in the AI era.
This transformation has already materialized at the tooling level. OpenClaw enables users to direct AI through messaging apps to complete entire development workflows—from bug fixes to pull request generation. Claude Code, as an in-terminal AI coding agent, provides full-codebase contextual understanding directly within the developer's working environment. Devin 2.0, positioned as an "autonomous software engineer," can independently manage Git repositories, write tests, and submit pull requests. On the SWE-bench benchmark, Devin autonomously solved 13.86% of real GitHub issues—a figure that may seem modest, but given that these are real engineering problems requiring deep understanding, it marks a significant advance in AI's autonomous programming capability.
However, the "system orchestrator" role transformation faces a fundamental paradox—how do you maintain a deep understanding of code without actually writing it? This is akin to someone who has never worked in a kitchen attempting to serve as head chef at a Michelin-starred restaurant—you may be able to describe the dish you want, but you lack the professional intuition to judge the quality of execution. The 2025 DORA report's findings corroborate this paradox: 59% of developers reported that AI had a positive impact on code quality, but the actual bug rate (measured as bugs per pull request) remained unchanged—you build faster, but bugs occur at the same rate.[18] In other words, AI amplifies productivity while proportionally amplifying the volume of errors produced.
The automated code review market reflects the industry's response to this quality gap. The automated code review market surged from $550 million to $4 billion in 2025. However, the most advanced AI code review tools currently available (such as CodeRabbit) achieve only a 46% accuracy rate in detecting real-world runtime bugs—meaning more than half of all issues still require human reviewers to catch.[19] A deeper challenge emerges: when AI generates code far faster than humans can review it, the bottleneck for quality assurance shifts from writing code to understanding code—this is the concrete manifestation of cognitive debt. Projections indicate that in 2026 the volume of code entering the pipeline will exceed reviewers' verification capacity by roughly 40%—a widening quality gap.
In my view, the core challenge facing software engineering is not "whether AI can write code"—it clearly can, and it will only get better. The real challenge is: after AI takes over an ever-greater share of the "hands-on work," how do humans maintain their judgment? This question extends beyond software engineering—it is a structural issue that all knowledge work will confront in the AI agent economy. A surgeon who only watches a surgical robot operate without ever holding the scalpel will see their judgment atrophy; a pilot who relies solely on autopilot without ever manually flying the aircraft will see their emergency response capabilities decline. A software engineer who only reviews AI output without ever writing code themselves will gradually see their professional intuition for judging code quality grow dull.
A prediction from Gartner adds an unsettling footnote to this discussion: by 2026, due to the erosion of critical thinking skills caused by generative AI use, 50% of global organizations will be forced to mandate "AI-free" skills assessments.[17] This is not anti-technology Luddism, but a pragmatic recognition: as tools grow ever more powerful, the people who use them need to understand the domain those tools operate in more, not less. The future of Vibe Coding is not a binary choice between "more AI" or "less AI," but rather how to design the right collaborative architecture between AI's formidable productivity and humanity's irreplaceable judgment. This requires not only technological innovation but also institutional design, educational reform, and organizational restructuring—a systemic engineering effort.
References
- [1] Karpathy, A. (2025). There's a new kind of coding I call 'vibe coding.' X/Twitter; CNN. (2025). Collins Word of the Year: Vibe Coding. cnn.com
- [2] Second Talent. (2026). Vibe Coding Statistics. secondtalent.com
- [3] TechCrunch. (2025). A quarter of YC W25 startups have 95% AI-generated codebases. techcrunch.com
- [4] The New Stack. (2026). Vibe Coding Is Passé. thenewstack.io
- [5] Quantumrun Foresight. (2025). GitHub Copilot Statistics. quantumrun.com
- [6] CNBC. (2025). Cursor AI startup Anysphere raises $2.3B at $29.3B valuation. cnbc.com
- [7] GitClear. (2025). AI Assistant Code Quality 2025 Research Report. gitclear.com
- [8] METR. (2025). Early 2025 AI Experienced Open-Source Developer Study. metr.org
- [9] Georgetown CSET. (2025). Cybersecurity Risks of AI-Generated Code. georgetown.edu
- [10] IT Pro / Aikido Security. (2026). AI-generated code is now the cause of one in five breaches. itpro.com
- [11] CFO Dive / Forrester. (2025). Tech debt tsunami building amid AI craze. cfodive.com
- [12] Stack Overflow. (2026). AI can 10x developers...in creating tech debt. stackoverflow.blog
- [13] Pixelmojo. (2026). Vibe Coding Technical Debt Crisis 2026-2027. pixelmojo.io
- [14] Storey, M.-A. (2026). Cognitive Debt: A New Challenge in AI-Assisted Development. margaretstorey.com
- [15] CIO. (2025). Demand for Junior Developers Softens as AI Takes Over. cio.com
- [16] Stack Overflow. (2025). AI vs. Gen Z. stackoverflow.blog
- [17] Gartner. (2024). 80% of Software Engineers Must Upskill in AI by 2027. gartner.com
- [18] DORA. (2025). 2025 DORA Report. dora.dev
- [19] Qodo. (2026). Best AI Code Review Tools 2026. qodo.ai