What AlphaZero ‘knows’ about Chess

I’m a lousy chess player. When I was 12, my 7-year-old brother beat me regularly. Fortunately chess ability doesn’t correlate with intelligence, as another pair of brothers will show. The younger brother could beat the older one at similar ages, but as the years passed, the older brother became a Rhodes scholar, while the younger brother, severely handicapped by encephalitis at one month of age, is currently living in a group home.

A fascinating paper [ Proc. Natl. Acad. Sci. vol. 119 e2206625119 ’22 ] opens the black box of AlphaZero, a neural net which is currently the world champion, to see what it ‘knows’ about chess as it relentlessly plays itself to build up expertise.

The paper is highly technical and I don’t claim to understand all of it (or even most of it), but it’s likely behind a paywall so you’ll have to content yourself with this unless you subscribe ($235/year for the online edition).

The first computer chess machines used a bunch of rules developed by expert chess players. Neural nets, by contrast, require training. For picture classification they required thousands and thousands of pictures, and feedback about whether they got each one right or wrong. Then the strengths of the connections between elements of the net (neurons) were nudged so that the right answer became more likely the next time. This is supervised learning.
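To make the idea of learning from right/wrong feedback concrete, here is a minimal sketch using a single artificial neuron and the classic perceptron rule. It is a stand-in chosen for brevity, not the method used by real image classifiers or by AlphaZero (those use many layers and gradient descent). The function names, the toy data (logical OR) and the learning rate are all assumptions made up for illustration.

```python
def predict(weights, bias, features):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else stay silent (0)."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0

def train_perceptron(examples, epochs=20, lr=0.1):
    """Perceptron rule: change the weights only when the answer was wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            # error is 0 when right, +1 when the neuron should have fired,
            # -1 when it fired but should not have.
            error = label - predict(weights, bias, features)
            for i, x in enumerate(features):
                weights[i] += lr * error * x
            bias += lr * error
    return weights, bias

# Hypothetical labeled training set: the logical OR of two inputs.
examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_perceptron(examples)
predictions = [predict(weights, bias, features) for features, _ in examples]
```

After training, `predictions` should match the labels in `examples`. The point is only that the feedback signal (right or wrong) is what drives the adjustment.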

Game-playing machines are not given labeled examples. They just play thousands and millions of games against themselves (AlphaZero played one million) and learn from who won, an approach known as reinforcement learning. Gradually they get better and better, so that they beat humans and the earlier rule-based machines. A net that has played 32,000 games beats the same net that has played 16,000 games 100 times out of 100. However, the net that has played 128,000 games beats the one that has played 64,000 only 64 times out of 100, so the gains from each doubling shrink.
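One common way to put such head-to-head scores on a single scale is the Elo rating formula used in chess. The post does not say how the paper scored these matches, so the sketch below is only an illustration: it assumes the win count can be treated as the expected score (draws ignored), and the function name `elo_gap` is made up here.

```python
import math

def elo_gap(win_fraction):
    """Rating difference implied by an expected score, per the standard Elo formula.

    Assumption: win_fraction is treated as the expected score, ignoring draws.
    """
    if win_fraction >= 1.0:
        return math.inf   # a perfect score implies an unbounded gap
    if win_fraction <= 0.0:
        return -math.inf
    return 400 * math.log10(win_fraction / (1 - win_fraction))

gap_128k_vs_64k = elo_gap(64 / 100)   # roughly 100 rating points
gap_32k_vs_16k = elo_gap(100 / 100)   # too lopsided to measure from 100 games
```

On this reading, the later doubling of training games is worth about 100 rating points, while the earlier one is so one-sided that 100 games cannot pin the gap down.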

They then had a world chess champion (V.K.) analyze how the machines were playing.

Between 16,000 and 32,000 games, the net began to understand the relative values of the pieces (anything vs. pawns, queen vs. rook, etc.).

Between 32,000 and 64,000 games, king safety appeared.

Between 64,000 and 128,000 games, a sense of which attack was most likely to succeed appeared.

Showing that there is no single perfect strategy, separate runs of the machine (1,000,000 games each) settled on two different variants of the (extremely popular) Ruy Lopez opening.

They also studied recorded human games (between experts, or they wouldn’t have been recorded) from the past 500 years. Initially most people played the same way, with variants appearing as the years passed. The neural net was just the opposite, trying lots of different things initially and subsequently settling on a few approaches.

All in all, a fascinating look inside the black box of a neural net.
