My colleagues David Silver, Aja Huang, and others have just published their excellent work on Go in Nature. The system is called AlphaGo, and it combines Monte-Carlo tree search with deep neural networks, trained by supervised learning and by reinforcement learning from self-play. The landmark achievement was beating the human European champion 5-0 in tournament games…
Month: January 2016
UCL course – 2016
Together with Joseph Modayil, this year I am teaching the reinforcement learning part of the Advanced Topics in Machine Learning course at UCL. Lectures: Note that there will be two lectures about AlphaGo on March 24. We will talk about AlphaGo in the context of the whole course at the normal place and time (9:15am in Roberts 412), and in addition…
Learning to predict independent of span
Rich Sutton and I wrote a paper about how to efficiently learn predictions that can range over many time steps. The focus of the paper is on algorithms whose computational complexity does not grow with the time span of the predictions. This is important because many predictive questions have a large or even infinite span. Another contribution of the paper is that we…