Artificial Intelligence

Why AI Isn't Sophisticated Copy-Paste

Look, I know that writing rebuttals to Register opinion pieces is low-hanging fruit. But when the claim is that AI is just “sophisticated copy and paste” that will never truly think, it’s worth picking that fruit.

The author invokes John Searle’s Chinese Room argument—a thought experiment where someone follows rules to respond to Chinese messages without understanding Chinese. Therefore, the author concludes, AI merely manipulates symbols without semantic understanding.

But here’s the problem: the Chinese Room is irrelevant to whether AI is actually doing copy-paste. That’s an empirical question, not a philosophical one.

If AI merely copies and pastes, where exactly is it copying these solutions from?

What Does Reasoning Look Like?

Before examining the evidence, let’s be clear about what I mean by reasoning: decomposing novel problems into sub-goals, generating solution steps, verifying correctness, and backtracking when wrong—all on problems outside the training distribution. Not retrieving answers. Not pattern matching against stored examples. Actually working through problems that have never been seen before.
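
To make that definition concrete, here’s a deliberately tiny sketch of the behaviour I mean, using a made-up number puzzle rather than anything a real model does internally: try a step, check whether the branch can still reach the goal, and abandon it when it can’t. Every name in it is invented for illustration.

```python
# A toy illustration of decompose / verify / backtrack, NOT how any LLM works
# internally. The puzzle (reach a target from a start value using only +3 and *2)
# is invented purely for this sketch.

def solve(value: int, target: int, steps=None, limit: int = 12):
    steps = [] if steps is None else steps
    if value == target:
        return steps                              # verified: goal reached
    if value > target or len(steps) >= limit:
        return None                               # this branch can't work: backtrack
    for op, nxt in (("+3", value + 3), ("*2", value * 2)):
        result = solve(nxt, target, steps + [op], limit)
        if result is not None:                    # a sub-goal succeeded: keep the step
            return result
    return None                                   # every candidate failed: backtrack further

print(solve(1, 20))   # ['+3', '+3', '+3', '*2']  (1 -> 4 -> 7 -> 10 -> 20)
```

The point of the sketch is the shape of the behaviour: a system that only retrieves stored answers has no reason to abandon a line of attack and try another, because it never evaluates whether its current attempt is working.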

If AI is just copy-paste, it should fail on genuinely novel tasks. Let’s test that claim.

Testing the Copy-Paste Claim

Let’s examine what modern AI systems actually accomplish, starting with problems that literally didn’t exist when the models were trained.

International Mathematical Olympiad Gold Medals (July 2025)

Both OpenAI and Google DeepMind models achieved gold medal performance on the 2025 IMO, solving 5 of 6 problems under official exam conditions: two 4.5-hour sessions, no internet access, natural language proofs. With 26 contestants scoring higher, the models tied for 27th place among 630 competitors—a 45-way tie that included some of the world’s most elite young mathematicians.

IMO problems require multi-page proofs in which every step has to be logically watertight, and solving them means checking each move for consistency along the way. The 2025 problems were written specifically for that competition. Historical IMO problems from 2000-2024 were in the training data, but the 2025 problems required proof strategies and combinations of ideas that had never been seen before.

There was nothing to copy. The problems didn’t exist in any training data. So where’s the source material?

Novel Algorithmic Discovery

Perhaps mathematics competitions could still involve pattern matching. What about discovering something genuinely new?

DeepMind’s AlphaTensor discovered matrix multiplication algorithms more efficient than the state of the art, improving on Strassen’s 50-year-old algorithm for the first time since its discovery. It started with zero knowledge about existing algorithms and through reinforcement learning discovered provably correct algorithms that outperform human-designed ones.
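
To make “more efficient” concrete: the cost of these algorithms is counted in scalar multiplications. The textbook 2x2 block product uses 8 of them; Strassen’s 1969 construction gets away with 7, and applied recursively that saving compounds on large matrices. AlphaTensor searched for recipes of exactly this kind and, as reported, found one for 4x4 blocks in modular arithmetic using 47 multiplications where recursive Strassen needs 49. Below is the classic Strassen recipe, standard textbook material shown only to illustrate what such a recipe looks like; it is not AlphaTensor’s output.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications instead of the naive 8 (Strassen, 1969)."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

print(strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]], same as the naive product
```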

Days after AlphaTensor’s release, mathematicians used its algorithms as starting points and improved them further—confirming they were genuinely novel.

The algorithms didn’t exist before. Mathematicians verified they were new. If this is copy-paste, what’s being copied?

Competitive Programming at Expert Level

Maybe these are one-off achievements in specific domains. What about sustained performance under time pressure?

o3 achieved an Elo rating of 2727 on Codeforces (using high-compute settings), placing it among the top 175 competitive programmers globally. Codeforces rounds are timed contests built around fresh algorithmic problems written for each round, so competitors face challenges they’ve never encountered before, with the clock running.
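
For a sense of scale: Codeforces uses its own Elo-style rating system, but under the textbook Elo formula a 2727-rated player would be expected to win roughly 98% of games against a 2000-rated opponent. A quick check of that arithmetic, using only the ratings mentioned above:

```python
def elo_expected_score(r_a: float, r_b: float) -> float:
    """Textbook Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

print(round(elo_expected_score(2727, 2000), 3))   # ~0.985
```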

Where’s the repository being copied from?

Adaptation to Genuinely Novel Tasks

The strongest test: benchmarks explicitly designed to prevent memorization.

o3 scored 75.7% on ARC-AGI-1, matching average human performance. ARC tests abstraction and reasoning on visual puzzles the model has never encountered. While this required significant compute and newer benchmarks remain challenging, ARC was specifically created to resist any form of pattern matching or retrieval.

If it’s copy-paste, these benchmarks shouldn’t be solvable at all.

How It Actually Works: Mechanisms of Reasoning

So how do these systems actually solve novel problems? What distinguishes reasoning from retrieval?

Backtracking and Self-Correction

Modern reasoning models learn to recognize and correct their mistakes. They break down difficult steps into simpler ones and try different approaches when the current one isn’t working. Backtracking sentences (like “Wait…”) cause models to revisit earlier conclusions, boosting final-answer accuracy.

This isn’t pattern matching. It’s active error detection and recovery: the model notices its own logical mistakes and revises its approach. Copy-paste systems don’t second-guess themselves, and they don’t backtrack.

General Reasoning, Not Domain Memorization

Nothing in the training pipeline specifically targeted geometry or number theory. Instead, models were trained to break tough goals into smaller claims, check each claim, and learn from failed attempts. The approach generalizes across mathematical domains without domain-specific modules.

If this were memorization, models would need explicit training on every problem type. Instead, we see transfer: principles learned in one domain applying to completely different domains.

When AI Actually Does Copy-Paste, We Can Tell

The author cites OpenAI’s recent Erdős problem embarrassment, in which a model was claimed to have cracked open problems but had merely found existing solutions in obscure papers. Fair point. That was sophisticated literature search, not original mathematics. OpenAI’s claims were retracted within hours after mathematician Thomas Bloom and others publicly corrected the record.

But that’s exactly why the IMO achievement matters. When AI does retrieval versus reasoning, we can tell the difference. The mathematical community recognized within hours that the Erdős case was retrieving existing work. The IMO gold medals are fundamentally different—those problems didn’t exist in any literature. There was nothing to retrieve.

About “Stealing” Content

The Register author mentions discovering that ChatGPT had “learned by stealing words” from their earlier Linux articles. This raises an important distinction between training and copying.

When models train on text, they don’t store articles to paste later. They learn statistical patterns across billions of documents: how technical concepts relate, how arguments are structured, how explanations flow. If a model’s output closely resembles a specific article, that’s a genuine attribution problem and should be addressed. But occasional memorization is a failure mode, not evidence that the system works by retrieving stored text.
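
For a feel for that training-versus-copying distinction, here’s a deliberately crude caricature: a bigram counter over three made-up sentences. Real models learn enormously richer statistics with gradient descent over billions of parameters, but the principle carries over: what survives training is a table of regularities, not a filing cabinet of the source documents.

```python
from collections import Counter, defaultdict

# A crude caricature of "learning statistical patterns" (a bigram counter), NOT
# how a real LLM is trained. The three sentences are invented for the example.
corpus = [
    "the kernel schedules the process",
    "the shell runs the process",
    "the kernel loads the module",
]

counts = defaultdict(Counter)
for line in corpus:
    words = line.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1        # record how often each word follows another

# What the "model" retains is these statistics; the sentences themselves are gone.
print(dict(counts["the"]))      # {'kernel': 2, 'process': 2, 'shell': 1, 'module': 1}
print(dict(counts["kernel"]))   # {'schedules': 1, 'loads': 1}
```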

The test is simple: ask the model to solve a problem that didn’t exist when it was trained. The 2025 IMO problems, the novel algorithms, the unseen Codeforces challenges—these are that test. And the results show reasoning, not retrieval.

The Chinese Room Misses the Point

The author leans on Searle’s Chinese Room to argue AI lacks “true understanding.” But the Chinese Room asks whether there is genuine understanding or phenomenal experience inside the system, whether there is “something it’s like” to be an AI processing information.

That’s orthogonal to whether AI can solve novel IMO problems, discover new algorithms, or debug code.

Whether there’s subjective experience inside the system is philosophically fascinating. But it’s practically irrelevant. A system that demonstrates mathematical reasoning on genuinely new problems is demonstrating reasoning capability, regardless of what the internal experience feels like—if there even is one.

The test isn’t consciousness. The test is capability.

Conclusion

The original article’s logic goes: “AI can’t truly understand because of a 45-year-old philosophy thought experiment, therefore everything it does must be copy-paste.” It invokes the Survival Game framework from one research group, projecting AGI by 2100. But this is one proposed evaluation method among many, not scientific consensus.

Meanwhile, AI systems solve competition mathematics that didn’t exist in training data. They self-correct logical errors. They discover novel algorithms that improve on decades-old human designs. They adapt to genuinely new problem classes.

When AI does just retrieve existing work (like the Erdős incident), we recognize it immediately. The IMO achievements are different. Those solutions came from reasoning, not retrieval.

Whether there’s “true understanding” inside remains an open philosophical question. But the empirical question—is it copy-paste?—has a clear answer.

No.