Public Knowledge Will Survive in Exactly One Place

How AI Eliminated the Incentive That Built the Internet, and Who Owns What Comes Next

Jun 05, 2026

In January 2026, Stack Overflow recorded 6,309 new questions. In November 2022, the month ChatGPT launched, it recorded 108,563. That is not a decline. It is a near-extinction: a 94% collapse in three years, regressing to 2009 levels even though the global developer population has expanded more than tenfold since then.

Stack Overflow did not fail. It was not outcompeted. It was the most successful developer knowledge platform ever built, and the thing that destroyed it was the thing it unknowingly helped create. Every question answered publicly on Stack Overflow became training data. Every debugging session, every architectural argument, every explanation of why the obvious solution does not work became part of the substrate that large language models learned from. The platform built the tool that made the platform unnecessary. And now the questions that once flowed publicly into a permanent, searchable, community-owned archive flow privately into a chat window that produces no residue the world can read.

This is not a platform story. It is a physics story.

The Incentive

The open web was never a cultural achievement. It was a coordination equilibrium held together by a single economic fact: if you wanted an answer, an audience, a reputation, or a correction, you had to think in public. There was no private alternative.

A question on Stack Overflow was not an act of community. It was a request for help from the only entity capable of providing it. A blog post was not a gift to the world. It was a bid for visibility in the only distribution system that existed. A Wikipedia edit was not civic virtue. It was the only way to fix something that was wrong where everyone could see it.

The behavior that built the internet’s knowledge layer was not generosity. It was necessity.

This is the founding fact the industry has not internalized about what AI breaks. Not a feature. Not a use case. The necessity. The moment an agent can answer a question privately, instantly, and without friction, the economic logic of public posting evaporates. The debugging session that once produced a blog post becomes a conversation with a model. The question that once justified a Stack Overflow thread becomes a private query resolved in seconds. The public sphere does not shrink because people lose interest in contributing. It shrinks because the incentive that made contribution rational disappears.

When the incentive disappears, the supply disappears with it. Not gradually. Structurally.

The Vortex

The collapse is not a linear event. It is a loop, and every node is already active.

AI absorbs queries privately. Public platforms lose traffic. Platforms lose contributors. New knowledge stops accumulating publicly. Training data stagnates. Models become less calibrated to current events and emerging patterns. Users rely more on AI to navigate uncertainty. AI absorbs more queries. The loop closes and tightens with each rotation.
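The loop can be made concrete with a deliberately crude toy model. The sketch below assumes a single variable, the share of questions answered privately by AI, and assumes it grows logistically: the larger the private share, the faster the remaining public questions migrate. The function name, the starting share, and the pull rate are illustrative assumptions, not measured values.

```python
def simulate_loop(rounds=8, private_share=0.10, pull=0.5):
    """Toy reinforcing loop: return the public share of questions at each round.

    Assumption (hypothetical): migration to private AI chat in each round is
    proportional to both the current private share and the remaining public share.
    """
    public_history = []
    for _ in range(rounds):
        public_history.append(1.0 - private_share)
        # Reinforcing step: more private usage pulls more public questions away.
        private_share += pull * private_share * (1.0 - private_share)
    return public_history


if __name__ == "__main__":
    for round_number, public in enumerate(simulate_loop()):
        print(f"round {round_number}: public share {public:.2f}")
```

With these assumed parameters, each early round removes more of the public share than the one before it, which is the tightening described above. The numbers themselves carry no empirical weight.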

To understand what this looks like at the level of actual behavior, consider what changed for a working developer between October 2022 and October 2023. In October 2022, if you encountered an obscure API error, you searched Stack Overflow. You found a thread from three years ago where someone had hit the same wall. You read the accepted answer, the dissenting comment below it, the edit that corrected the accepted answer eighteen months later. You added your own comment if the solutions failed. That interaction produced a public artifact that the next developer searching the same error would find. The question became part of the permanent record. The knowledge compounded.

By October 2023, the same developer opened a chat window instead. The question was answered in thirty seconds. The answer was never posted anywhere. The next developer with the same problem did the same thing. The thread that would have existed on Stack Overflow was never created. The public artifact was never produced. The knowledge was consumed privately and left no trace. Multiply that interaction by the 27 million software developers now using AI coding tools daily, and the 94% collapse in Stack Overflow question volume is not surprising. It is the mathematically inevitable result of removing the necessity that generated the questions in the first place.

Wikipedia lost over 1.1 billion monthly visits between 2022 and late 2025, a 23% decline the Wikimedia Foundation confirmed in October 2025 and attributed directly to AI search summaries satisfying information needs before users reach the site. New editor registrations fell 36% over nine years, from 317,000 per month in 2016 to 202,000 per month in 2025. Veteran editors compensated by increasing individual output: total edit volume rose 37%, and average edits per new user doubled. The result is a shrinking, aging cohort working harder to maintain a corpus fewer people need to visit. It produces no new contributors, because the novices who would have become contributors are getting their answers elsewhere and never develop the habit of public editing that would have drawn them in.
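The percentages quoted so far can be checked with one line of arithmetic each. The sketch below uses only figures stated in this article; the function name `percent_decline` is illustrative, and the Wikipedia baseline is a derived estimate rather than a reported number.

```python
def percent_decline(before, after):
    """Return the percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Stack Overflow new questions: November 2022 vs. January 2026 (figures from this article).
stack_overflow = percent_decline(108_563, 6_309)

# Wikipedia new editor registrations per month: 2016 vs. 2025 (figures from this article).
wikipedia_editors = percent_decline(317_000, 202_000)

# If 1.1 billion lost monthly visits is a 23% decline, the implied 2022 baseline
# is roughly 1.1 / 0.23 billion visits per month (derived, not a reported figure).
implied_baseline_billions = 1.1 / 0.23

print(f"Stack Overflow questions: -{stack_overflow:.1f}%")
print(f"Wikipedia new editors: -{wikipedia_editors:.1f}%")
print(f"Implied Wikipedia baseline: {implied_baseline_billions:.1f}B visits/month")
```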

The platforms that depended on that traffic are collapsing behind it. Business Insider cut 21% of staff in May 2025 after organic search traffic fell 55% over three years. The Planet D shut down entirely after traffic collapsed 90% following Google’s AI Overviews rollout. Chegg filed an antitrust lawsuit directly naming AI Overviews after reporting a 49% decline in non-subscriber traffic in a single year. Helen Havlak, publisher of The Verge: “The extinction-level event is already here.”

The referral mechanism that funded all of it is gone. SparkToro and Datos found 58.5% of all Google searches in the US end without a single click to the open web. A separate measurement puts the zero-click share higher still, rising from 56% to 69% between May 2024 and May 2025. On queries triggering AI Overviews: 83% zero-click. In ChatGPT Search: 82%. In Perplexity: 93%. For every 1,000 searches performed on Google, only 374 result in a user reaching an external website.

Each node feeds the next. The loop does not need to be accelerated. It only needs to continue.

The Price of Scarcity

When a resource stops being abundant, it acquires a price. The price is the proof the scarcity is real.

Reddit in 2021 was the canonical example of free public knowledge. Millions of threads. More than fifteen years of accumulated human experience across every domain imaginable. It was freely crawlable, freely indexable, and freely usable as training data. Its entire value proposition as a knowledge resource depended on it being open. That openness was not a policy choice. It was the product. Reddit was worth something as a data source precisely because its users had spent years posting publicly under the assumption that their posts were contributions to a commons.

Ahead of its March 2024 IPO, Reddit disclosed data licensing arrangements, entered into in January 2024, totaling $203 million in aggregate contract value. Google contracted approximately $60 million per year for real-time Reddit API access to feed its Gemini training pipeline. OpenAI signed a comparable deal at approximately $70 million per year. What had been free to access became a $130 million annual revenue line from just two buyers. The commons had been valued at zero because its contributors treated it as a commons. The moment its contributors stopped needing to contribute publicly, the platform they had built became a priced asset controlled by its corporate owners, not by them.

The publishing licensing market followed the same logic. News Corp signed with OpenAI for up to $250 million over five years. The New York Times contracts with Amazon for $20 to $25 million per year. Aggregate AI lab spending on data that was previously freely scrapable has crossed $1 billion and is accelerating. At the same time, 35.7% of the top 1,000 highest-trafficked websites now explicitly block GPTBot, choosing to withhold rather than license. The scraping era that built AI is over. What follows is an access economy where the highest-quality human text is either locked behind commercial agreements that small players cannot afford or blocked entirely.

Epoch AI’s June 2024 analysis estimated the total effective stock of quality-adjusted human-generated public text at approximately 300 trillion tokens, projected to be fully depleted between 2026 and 2032. Elon Musk stated at CES that AI has “basically used up all the real-world training data available” since 2024. Goldman Sachs’ chief data officer stated that AI models have run out of training data, warning it “could stunt the development of artificial intelligence.”
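A back-of-the-envelope version of the depletion argument: if the largest training datasets keep multiplying each year, they reach a fixed stock quickly. The sketch below takes the 300 trillion token stock from the estimate above. The starting dataset size (15 trillion tokens, the scale Meta reported for Llama 3 in 2024) and the yearly growth factors are assumptions chosen for illustration, and the function name is hypothetical. This is not Epoch AI's methodology.

```python
import math


def years_until_exhausted(stock_tokens, dataset_tokens, yearly_growth):
    """Whole years until a dataset growing `yearly_growth`x per year reaches the stock.

    Assumes yearly_growth > 1; returns 0 if the dataset already matches the stock.
    """
    if dataset_tokens >= stock_tokens:
        return 0
    return math.ceil(math.log(stock_tokens / dataset_tokens) / math.log(yearly_growth))


STOCK = 300e12          # quality-adjusted public text, per the estimate cited above
START_DATASET = 15e12   # assumption: a 2024 frontier-scale training set
START_YEAR = 2024

for growth in (1.5, 2.0, 3.0):  # hypothetical yearly growth factors
    year = START_YEAR + years_until_exhausted(STOCK, START_DATASET, growth)
    print(f"growth {growth}x/year -> stock reached around {year}")
```

Under these assumed growth rates the crossover lands between 2027 and 2032, inside the range quoted above. Slower growth or a larger usable stock would push it later.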

High-quality human prose is no longer a positive externality of the open web. It is a priced, closing asset. The labs that can afford to license it will. The open-source ecosystem that cannot will train on synthetic outputs generated by the very models that consumed the original data, which researchers warn leads to model drift and semantic collapse: models trained recursively on their own generation losing contact with the empirical texture of a world they no longer observe directly.

The Cognitive Monopoly

The queries that used to generate traffic and fund public knowledge production now resolve inside a handful of private interfaces. Google Search’s share of total digital queries has eroded to 77.9%. ChatGPT has captured 17.6%. Traditional Google search volume per user fell 20% year over year in Q4 2025. ChatGPT processes 2.5 billion daily prompts across 900 million weekly active users. Meta AI surpassed 1 billion monthly active users. AI-driven internet traffic grew 187% across 2025.

These are not usage statistics. They are a map of where human cognition is now externalizing itself.

The companies that own these interfaces own something without historical precedent: a monopoly on freshness. Not on distribution, compute, or infrastructure. On the only input to intelligence that cannot be synthesized or retroactively reconstructed: the current state of what humans are thinking, asking, misunderstanding, and discovering. Every new concept, every emerging scam, every cultural shift, every novel misunderstanding that once would have been posted publicly and indexed for anyone to find now flows into a private channel whose contents are visible only to the operator.

A model trained on yesterday’s world cannot understand tomorrow’s. The only way to maintain epistemic continuity with a moving world is to own the channels where that world now externalizes its cognition. Those channels are Google’s Gemini, embedded in Search and Android, where every query refines a training signal that competitors cannot access; Microsoft Copilot, wired into Windows and Office, sitting inside the workflow of hundreds of millions of enterprise users whose operational cognition now flows through Microsoft’s infrastructure; and OpenAI’s ChatGPT, the default cognitive endpoint for 900 million weekly users, processing 2.5 billion daily prompts that constitute the largest ongoing dataset of human uncertainty ever assembled.

Apple’s version of this monopoly is the most structurally defensible because it is the most architecturally distinct. Apple Intelligence does not aggregate user queries into a cloud training corpus. It operates primarily on-device, processing personal context across two billion devices: indexing local email, messages, calendars, photos, and contacts to resolve most queries without the underlying data leaving the device. For requests requiring larger compute, Apple routes to Private Cloud Compute infrastructure running on Apple Silicon, where user data is processed in transient memory under cryptographic attestation that makes it inaccessible even to Apple engineers. This architecture gives Apple access to the most intimate layer of personal cognition, the layer that lives in the actual contents of your inbox and messages rather than your search queries, while remaining structurally impossible for any competitor to replicate. You cannot license Apple’s on-device personal context. You cannot crawl it. You cannot buy access to it. It exists only on the device or in transient, attested memory, and it is cryptographically sealed.

Open models train on yesterday’s world. Closed models train on today’s. The benchmark parity that open-source AI achieved in 2025 and 2026 does not change this. Kimi K2.6 outperforms GPT-5.4 on SWE-bench Pro. DeepSeek R1 leads on AIME reasoning. Meta’s Llama 4 delivers GPT-4 level reasoning on consumer hardware. The capability gap has closed. But benchmarks measure performance on problems defined by the historical internet: academic papers, coding contests, and evaluation sets built from text that existed before the vortex began. They do not measure calibration to the current world: the new attack vectors, new APIs, new cultural patterns, new misunderstandings that emerge between training runs and live only in the private interfaces where humans now think. Parity on benchmarks hides divergence in calibration. The open-source community will not fail because it lacks brilliance. It will stagnate because the world it needs to learn from has stopped writing in public. The gap will be invisible until it isn’t, and by the time it becomes legible, the private cognition layer will have compounded its advantage for years.

The Last Commons

The open web was the last commons. Not because it was a noble experiment. Because it was the only viable surface on which public knowledge could accumulate at scale.

The incentive that held it together was not altruism. It was friction. Remove the friction, and the behavior that depended on it disappears. What replaces it is a private transaction: every question, every confusion, every correction routing through an interface owned by a company with no obligation to make the output visible, no incentive to preserve the process, and every incentive to use the interaction to improve a proprietary model compounding its advantage with each query.

For twenty years, the production of knowledge on the internet was a public act: contested in public, corrected in public, preserved in public, accessible to anyone with a browser. The architecture was not designed that way. It emerged that way because public posting was the only rational behavior when there was no private alternative. That condition no longer holds. The vending machine analogy from a different context applies here with precision: the hum was never permanent. It was always the sound of a system that existed because nothing better had been built yet. Something better was built. The hum stopped.

When necessity disappeared, so did the commons.

The decisive variable going forward is freshness. Who controls it is already visible in the ownership structure of the interfaces where cognition now flows. The repricing has begun. The benchmark scores just have not caught up yet.

Productics by Igor
