Uncategorized – Hado van Hasselt

AlphaGo

My colleagues David Silver, Aja Huang, and others have just published their excellent work on Go in Nature. The system is called AlphaGo, and it combines Monte Carlo tree search with deep neural networks trained by supervised learning and by reinforcement learning from self-play. The landmark achievement was beating the human European champion 5-0 in tournament games…

Learning to predict independent of span

Rich Sutton and I wrote a paper about how to efficiently learn predictions that can range over many time steps. The focus of the paper is on algorithms whose computational complexity does not grow with the time span of the predictions. This is important because many predictive questions have a large or even infinite span. Another contribution of the paper is that we…
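
As a rough illustration of what per-step computation that does not grow with the span can look like, here is a minimal sketch of temporal-difference learning with accumulating eligibility traces, TD(λ), for a linear predictor. It is a standard textbook technique and is not necessarily the algorithm from the paper. The toy three-state cycle, the constant reward of 1, and all names and parameter values (`td_lambda_step`, `run_demo`, `alpha`, `gamma`, `lam`) are assumptions made for this example.

```python
import numpy as np


def td_lambda_step(w, z, x, reward, x_next, alpha, gamma, lam):
    """One TD(lambda) update for a linear predictor v(s) = w @ x(s).

    The work per call is O(len(w)), whatever the span of the prediction.
    """
    # Decay the eligibility trace and add the current feature vector.
    z = gamma * lam * z + x
    # Temporal-difference error: one-step target minus current prediction.
    delta = reward + gamma * (w @ x_next) - (w @ x)
    # Move the weights along the trace, scaled by the TD error.
    w = w + alpha * delta * z
    return w, z


def run_demo(steps=5000, n_states=3, alpha=0.05, gamma=0.9, lam=0.8):
    """Hypothetical toy problem: a deterministic cycle with reward 1 per step."""
    features = np.eye(n_states)  # one-hot features, one per state
    w = np.zeros(n_states)       # prediction weights
    z = np.zeros(n_states)       # eligibility trace
    state = 0
    for _ in range(steps):
        next_state = (state + 1) % n_states
        w, z = td_lambda_step(
            w, z, features[state], 1.0, features[next_state], alpha, gamma, lam
        )
        state = next_state
    return w


# The discounted sum of a constant reward of 1 is 1 / (1 - gamma) = 10,
# so every entry of w should approach 10.
w = run_demo()
```

Only the weight vector and the trace are kept between steps, so memory and computation stay the same if `gamma` is moved closer to 1 to make the prediction range over more time steps. No history of past observations is stored.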