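This article was written or reposted by 一条大鱼 and does not represent the position or views of this site. Copyright belongs to oursteps.com.au and the author, 一条大鱼. Reposts must credit the author and the source, include this notice, and keep the content complete.
Last edited by 一条大鱼 on 2020-4-23 22:42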
The documentary about AlphaGo on YouTube is an exciting masterpiece. To me, the three-layer structure of AlphaGo’s algorithm means more than just computer programming.
The algorithm has three layers, as described at around 47:20 in the clip:
1. The policy network is trained on high-level games to imitate those Go players.
2. The value network evaluates a board position to estimate the probability of winning from that particular position.
3. The tree search tries to figure out what would happen in the future.
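For anyone who wants to see how the three layers fit together, here is a toy sketch in Python. It is not AlphaGo's code. Everything in it is an assumption made for illustration: the game is a simple take-away game (remove 1 to 3 stones, whoever takes the last stone wins), `policy` and `value` are hand-written stand-ins for trained networks, and `search` is a plain depth-limited look-ahead rather than the Monte Carlo tree search that AlphaGo actually uses.

```python
from typing import Dict, List

MAX_TAKE = 3  # assumption: toy rule, each move removes 1 to 3 stones


def legal_moves(stones: int) -> List[int]:
    """All moves allowed from a pile of `stones`."""
    return [k for k in range(1, MAX_TAKE + 1) if k <= stones]


def policy(stones: int) -> Dict[int, float]:
    """Stand-in for the policy network: a prior probability for each move.
    A trained network would favor moves that strong players make;
    this placeholder is simply uniform."""
    moves = legal_moves(stones)
    return {m: 1.0 / len(moves) for m in moves}


def value(stones: int) -> float:
    """Stand-in for the value network: estimated win probability
    for the player about to move."""
    if stones == 0:
        return 0.0  # the opponent took the last stone, so the mover has lost
    return 0.5      # uninformed guess for every unfinished position


def search(stones: int, depth: int) -> float:
    """Look ahead `depth` moves, then fall back on value().
    Returns the win probability for the player about to move."""
    if stones == 0 or depth == 0:
        return value(stones)
    # After our move the opponent is the mover, so our chance is 1 - theirs.
    return max(1.0 - search(stones - m, depth - 1) for m in legal_moves(stones))


def best_move(stones: int, depth: int = 6, top_k: int = 3) -> int:
    """Let the policy propose candidates, then let the search pick among them."""
    prior = policy(stones)
    candidates = sorted(prior, key=prior.get, reverse=True)[:top_k]
    return max(candidates, key=lambda m: 1.0 - search(stones - m, depth - 1))
```

With a pile of 5 stones, `best_move(5)` takes 1 stone and leaves 4, a position from which the opponent loses against correct play. The policy narrows the candidates, the value function scores positions the search cannot finish, and the search connects the two by looking ahead.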
This algorithm is strikingly similar to the way of thinking followed by one great player.
1. Using the policy network, he read all the books in his hometown’s library to imitate the existing great players when he was only a child.
2. Using the valuation network, he knows where to fish, and his business partner knows it the other way around.
3. Using the tree search, he figures out what probably won’t change in the future.
The difference is that he figured the whole plan out 50 years ago. Luckily, he is still well known to most of us.