AI scholars win Turing Award for technique that made possible AlphaGo's triumph at Go
**Abstract:** Andrew G. Barto and Richard S. Sutton, two prominent scholars in the field of artificial intelligence (AI), have been awarded the prestigious Turing Award for their pioneering work in reinforcement learning. This technique, which they developed over decades, has become a cornerstone of modern AI, notably enabling AlphaGo's victory in the complex game of Go and AlphaZero's later achievements in chess.

**Key Events:**

- **Turing Award:** The Association for Computing Machinery (ACM) announced that Andrew G. Barto and Richard S. Sutton are the recipients of the 2024 A.M. Turing Award, often referred to as the "Nobel Prize of Computing." The award recognizes their foundational contributions to reinforcement learning (RL), a type of machine learning in which an AI system learns to make decisions by interacting with its environment through trial and error.
- **Development of Reinforcement Learning:** Barto and Sutton's work on RL dates back to the 1980s and 1990s, when the field was still in its infancy. Their research laid the groundwork for algorithms that learn from feedback signals (rewards), which are essential for training AI systems to perform tasks that require strategic decision-making.
- **AlphaGo's Triumph:** One of the most prominent applications of RL is AlphaGo, an AI program created by Google's DeepMind. In 2016, AlphaGo made headlines by defeating world champion Lee Sedol at the ancient board game of Go, a feat previously thought to be decades away. The victory demonstrated the power of RL in tasks with a vast number of possible positions and long-term consequences that are hard to evaluate.
- **Further Achievements in Chess:** Building on the success of AlphaGo, DeepMind developed AlphaZero, a more general successor that used RL to teach itself chess from scratch.
In 2017, AlphaZero outperformed the world's leading chess program, Stockfish, in a series of matches, showcasing the versatility of RL across different strategic domains.

**Key People:**

- **Andrew G. Barto:** A professor emeritus at the University of Massachusetts Amherst, Barto has been a leading figure in the development of RL. His research has focused on the theoretical foundations and practical applications of RL, contributing to its integration into various AI systems.
- **Richard S. Sutton:** Often described as the "father of reinforcement learning," Sutton is a professor at the University of Alberta and a former research scientist at DeepMind. His seminal work includes the development of temporal-difference learning, a fundamental RL method for estimating long-term value from successive predictions.

**Key Locations:**

- **University of Massachusetts Amherst:** Barto's academic home, where he conducted much of his early research on RL.
- **University of Alberta:** Sutton's academic base, where he has continued to push the boundaries of RL theory and application.
- **DeepMind:** A London-based AI research laboratory founded in 2010 and acquired by Google in 2014. DeepMind has been at the forefront of applying RL to challenging problems in games and beyond.

**Time Elements:**

- **1980s-1990s:** The period during which Barto and Sutton began their work on RL, laying the theoretical and practical foundations for the technique.
- **2016:** The year AlphaGo defeated Lee Sedol at Go, a milestone in the application of RL to complex strategic games.
- **2017:** The year AlphaZero, a more general successor to AlphaGo, outperformed the top chess program, Stockfish, further demonstrating the power of RL.

**Summary:** The 2024 Turing Award has been awarded to Andrew G. Barto and Richard S. Sutton for their groundbreaking contributions to reinforcement learning (RL), a key area in artificial intelligence.
RL is a machine learning approach in which algorithms learn to make decisions by interacting with their environment and receiving feedback in the form of rewards or penalties. The technique has been instrumental in building AI systems capable of strategic decision-making, such as AlphaGo, which in 2016 defeated the world champion at Go, and AlphaZero, which in 2017 surpassed the leading chess program, Stockfish. Barto, a professor emeritus at the University of Massachusetts Amherst, and Sutton, a professor at the University of Alberta and former research scientist at DeepMind, have spent decades refining and advancing RL, contributing to both its theoretical underpinnings and its practical applications. Their work has reshaped the field of AI and opened new possibilities for solving complex problems in a variety of domains.
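
The trial-and-error loop described above can be illustrated with a short sketch. It assumes a hypothetical five-cell corridor in which the agent earns a reward of 1 only on reaching the rightmost cell, and it uses tabular Q-learning, a standard textbook RL algorithm chosen here for brevity. The environment, constants, and function names are illustrative assumptions; this is not the method used in AlphaGo or AlphaZero, which combine RL with deep neural networks and tree search.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells; cell 4 is the goal.
N_STATES = 5
ACTIONS = (-1, +1)                       # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1    # step size, discount, exploration rate


def step(state, action):
    """Apply an action; reward is 1.0 on reaching the goal, otherwise 0.0."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    """Learn action values purely from interaction and reward feedback."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy choice: explore occasionally, break ties at random.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted best next value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-terminal state (ties go right)."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

After training, the greedy policy should choose `+1` (move right) in every non-terminal cell, even though the agent was never told where the goal is; it infers this from the reward signal alone.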
