
AlphaGo Beats Lee Sedol: AI's Deep Blue Moment

DeepMind's AlphaGo just defeated one of the greatest Go players in history, and this feels fundamentally different from Deep Blue

Last week, something happened that the AI community did not expect for at least another decade. DeepMind's AlphaGo defeated Lee Sedol, one of the greatest Go players in history, four games to one in a five-game match in Seoul. I have been glued to the commentary streams, reading every technical analysis I can find, and I still do not think the magnitude of this has fully registered.

Let me explain why this is not just another computer-beats-human-at-game story.

Why Go Is Different

When IBM's Deep Blue defeated Garry Kasparov at chess in 1997, it was a landmark moment, but it was also, in some sense, an engineering victory. Chess has roughly 10^47 possible game positions. That is an enormous number, but Deep Blue could evaluate 200 million positions per second using specialized hardware and sophisticated evaluation functions hand-tuned by chess grandmasters. It was brute force, refined by human expertise.

Go has approximately 10^170 legal board positions. That is not just a bigger number. It is a number so large that brute-force search is physically impossible. There are more possible Go positions than atoms in the observable universe. You cannot build a computer fast enough to search the Go game tree the way Deep Blue searched chess.
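A quick back-of-the-envelope calculation makes the point concrete. Using only the figures above (roughly 10^170 Go positions, Deep Blue's 200 million evaluations per second) plus a rough figure for the age of the universe, the gap is not one of engineering but of physics:

```python
# Back-of-the-envelope: why Deep Blue-style brute force cannot work for Go.
# Figures from the text: ~10^170 legal Go positions, ~200 million
# positions/second for Deep Blue. Universe age is a rough round number.

GO_POSITIONS = 10.0 ** 170
DEEP_BLUE_RATE = 2e8        # positions evaluated per second
UNIVERSE_AGE_S = 4.3e17     # seconds since the Big Bang, approximately

seconds_needed = GO_POSITIONS / DEEP_BLUE_RATE
universe_ages = seconds_needed / UNIVERSE_AGE_S

print(f"~{seconds_needed:.1e} seconds, or ~{universe_ages:.1e} ages of the universe")
```

Even granting hardware a trillion times faster, the exponent barely moves. Any Go program has to be selective about what it searches.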

This is why AI researchers have used Go as a benchmark for decades. It was supposed to be the game where human intuition had an unassailable advantage. Go masters describe their play in terms of "shape" and "feeling" and "thickness," concepts that resist precise mathematical definition. The best human players develop an intuitive sense for board positions that no amount of computation was supposed to replicate.

AlphaGo replicated it.

How AlphaGo Works

The technical approach is what makes this genuinely exciting. AlphaGo combines two deep neural networks with Monte Carlo tree search, and the result is something that plays Go in a way that looks almost human.

The first network, called the policy network, takes a board position as input and outputs a probability distribution over possible moves. It was initially trained on millions of moves from expert human games, then further refined by playing millions of games against itself. This network essentially gives AlphaGo "intuition" about which moves are worth considering.
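To make the interface concrete, here is a toy sketch of what a policy network does: take a position, score every legal move, and softmax the scores into a probability distribution. The scoring function below is a random stand-in (the real network is a deep convolutional net trained on expert games and self-play), and the function and move names are mine, not DeepMind's:

```python
import math
import random

def softmax(scores):
    """Convert raw move scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def toy_policy(board, legal_moves):
    """Stand-in for the policy network: score each legal move, then
    softmax the scores into move priors. Random scores here; AlphaGo's
    network learned its scores from human games and self-play."""
    scores = [random.random() for _ in legal_moves]
    return dict(zip(legal_moves, softmax(scores)))

priors = toy_policy(board=None, legal_moves=["D4", "Q16", "C3"])
assert abs(sum(priors.values()) - 1.0) < 1e-9  # a valid distribution
```

The payoff is that downstream search only needs to spend effort on moves with meaningful prior probability, which is exactly the "intuition" the text describes.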

The second network, called the value network, takes a board position and estimates the probability of winning from that position. Instead of searching all the way to the end of the game, which is computationally impossible, AlphaGo uses this network to evaluate intermediate positions.

Monte Carlo tree search then combines these networks to explore promising lines of play. The policy network narrows the search to plausible moves, and the value network, supplemented in the published system by fast rollout simulations, evaluates leaf positions without searching to the end of the game. The result is a system that searches deeply but selectively, much like a human expert who considers only a handful of candidate moves at each turn.
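The way the search balances the two networks can be sketched with a selection rule in the spirit of the PUCT formula AlphaGo used: each candidate move's score is its estimated value plus an exploration bonus proportional to its policy prior and shrinking as the move is visited. The constant, the move names, and the numbers below are illustrative assumptions, not values from the paper:

```python
import math

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Selection score in the spirit of AlphaGo's tree search: exploit
    moves with high estimated value (q, from the value network), while
    exploring moves the policy network considers plausible (prior),
    with the bonus decaying as a move accumulates visits."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + exploration

# Candidate moves at one node: (avg value, policy prior, visit count).
candidates = {
    "D4":  (0.52, 0.40, 120),
    "Q16": (0.55, 0.35, 90),
    "R5":  (0.30, 0.02, 3),   # implausible move: tiny prior keeps it rare
}
parent_visits = sum(v for _, _, v in candidates.values())
best = max(
    candidates,
    key=lambda m: puct_score(candidates[m][0], candidates[m][1],
                             parent_visits, candidates[m][2]),
)
```

Notice how "R5" is barely explored despite being a legal move: the policy prior prunes it implicitly, which is what lets the search go deep on the handful of lines that matter.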

Move 37

The moment that stunned the Go world came in Game Two. On move 37, AlphaGo placed a stone on the fifth line of the right side of the board, a move that virtually no human player would consider. The commentators were confused. The experts thought it was a mistake.

It was not a mistake.

Over the next hundred moves, the strategic implications of move 37 became apparent. It was a brilliant, creative, deeply unconventional play that gave AlphaGo a subtle positional advantage that compounded over the course of the game. Fan Hui, the European Go champion who had lost to AlphaGo five games to none the previous October, called it "beautiful."

This is the moment that separates AlphaGo from Deep Blue. Deep Blue played chess at a superhuman level, but its moves were recognizably within the space of human chess theory. AlphaGo played a move that expanded the space of what we thought was possible in Go. It was not just good; it was genuinely creative.

Lee Sedol's Victory

Lee Sedol won Game Four, and the way he did it was fascinating. On move 78, he played what has been called the "hand of God" move, a wedge play that AlphaGo apparently did not evaluate correctly. After that move, AlphaGo made several uncharacteristic errors, suggesting that the system's evaluation function had a blind spot in certain types of positions.

This tells us something important: AlphaGo is not omniscient. It has systematic weaknesses that a sufficiently creative opponent can exploit. But the fact that it took one of the best players in the world, playing at the absolute peak of his abilities, to find those weaknesses tells us just how strong the system is.

Lee Sedol's post-match comments were gracious and insightful. He said the experience changed how he thinks about Go. He learned from AlphaGo, particularly from move 37, and suggested that human Go players will study AlphaGo's games the way they study the games of past masters.

What This Means for AI

I wrote about Watson winning Jeopardy a few years ago, and I compared it to Deep Blue's chess victory. AlphaGo feels like a different category entirely.

Watson and Deep Blue were impressive feats of engineering, but they were essentially hand-engineered systems that combined human expertise with computational power. AlphaGo learned. The policy network was trained on human games, yes, but the real strength came from self-play: millions of games where AlphaGo played against itself and discovered strategies that humans had never considered.

The deep learning revolution that started with image recognition and speech processing has now reached into a domain that was supposed to require genuine intelligence: strategic reasoning under uncertainty with a vast search space. If neural networks can learn Go, what else can they learn?

I think we are going to look back on this week the way we look back on Deep Blue's victory in 1997, but with a crucial difference. Deep Blue showed that computers could outperform humans through speed and specialized engineering. AlphaGo showed that computers can learn to outperform humans through something that looks uncomfortably like understanding.

The Feeling in the Room

I am finishing my time in graduate school, and I spend my days surrounded by researchers working on machine learning and computational problems. The mood this week has been electric. People who were skeptical about deep learning's ability to handle complex strategic reasoning are reconsidering. People who thought artificial general intelligence was pure science fiction are quietly updating their estimates.

Nobody is saying machines are conscious or that general AI is around the corner. But something shifted. The boundaries of what we thought required human intelligence just moved significantly, and they moved in the direction of machines.

For someone about to enter the tech industry, this is both thrilling and sobering. The tools we will build in the next decade will be shaped by the kind of learning that AlphaGo demonstrated. And if move 37 taught us anything, it is that those tools might surprise us in ways we cannot predict.
