Does AI really think?
Skeptics say that LLMs can’t “actually think”; they are just computing probabilities. Proponents respond that that’s all thinking is, and that a model like GPT-4 is thinking just as much as we are, even if it’s not as good at some tasks (but better at others!).
The proponents have some good evidence in their favor. For instance, researchers have trained a small language model to add numbers, using the same predict-the-next-token setup as an LLM to answer questions like “35 plus 92 equals _____”. The resulting neural network appears to have developed circuits that perform arithmetic in a mathematically efficient way; in a very real sense, it learned to “understand” arithmetic. In another experiment, a model was trained on transcripts of Othello games and appeared to develop circuits representing board positions: without ever having the rules of Othello explained to it, it learned that there is an 8×8 board and learned to translate sequences of moves into an arrangement of pieces on that board.
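To make the first setup concrete, here is a minimal sketch of next-token training on arithmetic strings. It is illustrative only: the research in question used a small transformer, whereas this toy uses an LSTM, and the vocabulary, model size, and training schedule are arbitrary choices of mine, not the experiment’s.

```python
# Toy next-token prediction on arithmetic strings (illustrative sketch,
# not the architecture or hyperparameters from the cited research).
import random
import torch
import torch.nn as nn

VOCAB = "0123456789+= "          # character-level vocabulary
stoi = {c: i for i, c in enumerate(VOCAB)}

def sample() -> str:
    a, b = random.randint(0, 99), random.randint(0, 99)
    return f"{a}+{b}={a + b} "   # trailing space acts as end-of-sequence

def encode(s: str) -> torch.Tensor:
    return torch.tensor([stoi[c] for c in s])

class NextCharModel(nn.Module):
    def __init__(self, vocab: int = len(VOCAB), dim: int = 64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, x):
        h, _ = self.rnn(self.emb(x))
        return self.out(h)       # logits over the next character

model = NextCharModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(2000):
    seq = encode(sample()).unsqueeze(0)        # shape (1, L)
    logits = model(seq[:, :-1])                # predict each next character
    loss = loss_fn(logits.squeeze(0), seq[0, 1:])
    opt.zero_grad(); loss.backward(); opt.step()

@torch.no_grad()
def complete(prompt: str, max_new: int = 5) -> str:
    seq = encode(prompt).unsqueeze(0)
    for _ in range(max_new):
        nxt = model(seq)[0, -1].argmax()       # greedy next-character choice
        if VOCAB[nxt] == " ":
            break
        seq = torch.cat([seq, nxt.view(1, 1)], dim=1)
    return "".join(VOCAB[i] for i in seq[0])

print(complete("35+92="))   # ideally "35+92=127" after enough training
```

Completing “35+92=” with this model is exactly the fill-in-the-blank task above; the question the interpretability work asks is whether the trained weights implement a genuine addition algorithm rather than a memorized lookup table.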
If the proponents are correct, then future iterations of GPT and other large language models may continue to get better at deep understanding and avoiding hallucinations, until they match or surpass human beings in these regards. Alternatively, it may be that human-level understanding requires some other approach to designing and training a neural network. I lean toward the latter view, but I wouldn’t rule out either possibility.
Related Notes
- The Two-System Brain - Why LLMs Are Data-Hungry and Biological Learners Are Not — the deepest version of this question. LLMs have a general learner (cortex-equivalent) but lack the brain’s steering subsystem (subcortical reward functions). Whether that missing piece is “what thinking really is” or just an efficiency trick is the crux of the debate.
- RL Scaling Follows Pre-Training - The Generalization Inflection Ahead — RL adds goal-directed behavior to base models. If “thinking” requires pursuing objectives (not just predicting tokens), RL scaling is what closes the gap. Dario Amodei expects full SWE capability in 1-2 years.
- Jevons Paradox vs Cognitive Displacement - The Unresolved Tension — if AI does “think” well enough to substitute for knowledge workers, the displacement thesis gets stronger. If it merely pattern-matches without genuine understanding, Jevons holds (AI augments, humans still needed for judgment).
- Adversarial Domains Resist AI - Markets, Politics, and the Stable-Label Problem — domains where the answer changes in response to the prediction. If AI “thinks” by pattern-matching on historical data, adversarial domains are a structural limit. If it genuinely reasons, the limit weakens.
- TAM of Intelligence is Infinite — the bullish framing assumes AI cognition is real enough to substitute for human cognition across unbounded use cases.
- Verification Cost Inversion - AI Makes Creation Cheap and Trust Expensive — even if AI thinks, verifying that it thought correctly becomes the bottleneck. Trust costs rise as creation costs fall.
- Adam Marblestone on Dwarkesh Patel — the neuroscience perspective on what’s missing from current architectures.