What Shogi Programs Still Cannot Do - A New Test Set for Shogi -
Reijer Grimbergen and Taro Muraoka, Department of Informatics, Yamagata University
GPW 2004, 2004/11/13

1 What Shogi Programs Still Cannot Do - A New Test Set for Shogi -
Reijer Grimbergen and Taro Muraoka
Department of Informatics, Yamagata University
GPW 2004, 2004/11/13

2 Outline
- The importance of testing
- Test sets for chess
- Test sets for shogi
- A new test set for shogi
- Problem area analysis
- Some new results
- Differences between humans and computers
- Conclusions and future work

3 The importance of testing
- Game programming
  - A program should play strongly
  - More common is the reverse approach: minimize the number of bad moves
  - Testing can help determine problem areas
- Incremental testing
  - Save positions that the program did not handle well
- Drawbacks
  - Test set is program-specific
  - Positions selected subjectively

4 The importance of testing
- The requirements of a test set
  - Testing a wide variety of potential problem areas
  - Not specific to one program
- Test design in games
  - Mainly done for chess
  - Current test sets for shogi have shortcomings
  - Shogi research is at a point where focusing the effort could be a great help
- Proposing a new test set for shogi

5 Test sets for chess
- The Bratko-Kopec test set
  - 12 tactical positions and 12 strategic positions
  - Designed to compare human and computer performance in chess
  - Thus far, no program can solve all positions
- Reinfeld’s Win at Chess
  - 300 tactical positions
  - Used as a first test for new programs
- LCT II
  - 35 positions
  - Good balance between strategic, tactical and endgame positions
  - An Elo rating can be calculated from the solved positions
- The Lindner test set
  - A set of positions that are considered hard for computers to solve

6 Test sets for shogi
- The Matsubara-Iida test set
  - 48 positions taken from professional games
  - Selected by an expert player
  - Aims at judging the strength of shogi programs
  - First given to human players to establish a connection with playing strength
- Problems with the Matsubara-Iida test set
  - Program strength can be judged more accurately by playing on the internet
  - No Elo calculation like in LCT II
  - Subjective selection leaves doubts about test balance
  - What is difficult for computers is not necessarily difficult for humans and vice versa, so the connection with playing strength is unreliable

7 Test sets for shogi
- Other test sets for shogi
  - Yamashita’s test set (10 positions)
  - Tanase’s test set (19 positions)
- Problems with these test sets
  - Too small
  - Program-specific
  - Unclear if there is only one solution

8 A new test set for shogi
- What do we want from a test set?
  1. As general as possible
  2. Points to as many problem areas as possible
- Find positions that cannot be solved by the best programs
- Finding weaknesses instead of measuring strength

9 A new test set for shogi
- Positions selected from Shukan Shogi
  - Every week six next-move problems
  - Middle game positions and endgame positions
  - Different tactical themes: winning material, attack, defense and mating
- Our goal: create a test set of 100 positions
- The programs we used
  - AI Shogi 2003
  - Todai Shogi 5
  - Gekisashi 2
- Conditions
  - 30 seconds on a 2 GHz Pentium 4
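The selection rule implied here (a problem enters the test set only if none of the three programs finds the solution) could be sketched as follows. This is only an illustration: the function name `is_unsolved_by_all` and the data layout are assumptions, the 30-second limit is not modeled, and the moves are the first moves reported for problems 750-3 and 935-2 elsewhere in this presentation.

```python
from typing import Dict

def is_unsolved_by_all(solution_first_move: str, replies: Dict[str, str]) -> bool:
    """Return True if no program's reply matches the first move of the solution."""
    return all(reply != solution_first_move for reply in replies.values())

# First move of the published solution, followed by each program's first reply.
problems = {
    "750-3": ("2四銀", {"Todai Shogi": "1五歩", "Gekisashi": "3二角成", "AI Shogi": "3五金"}),
    # 投了 means the program resigned instead of playing a move.
    "935-2": ("1三歩不成", {"Todai Shogi": "5二と", "Gekisashi": "8四桂", "AI Shogi": "投了"}),
}

test_set = [name for name, (solution, replies) in problems.items()
            if is_unsolved_by_all(solution, replies)]
print(test_set)
```

Both example problems pass the filter, since no reply matches the solution's first move.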

10 A new test set for shogi
- This was not easy!
  - More than 1500 positions needed to be checked to find our test set
- Additional feature
  - The percentage of respondents who solved the problem is given
  - Differences between what is difficult for humans and difficult for computers

11 Problem area analysis
- Why are the positions difficult?
  - Using the analysis tools in Todai Shogi, Gekisashi and AI Shogi to find problem areas
- Our first analysis indicates seven problem areas
  - Horizon effect due to consecutive checks
  - Not calling the tsume shogi solver deep in the search tree
  - Inaccurate evaluation function
  - Incorrect forward pruning
  - Mate with unpromoted pieces
  - Insufficient hardware speed
  - Problems with time allocation

12 Problem area analysis: Horizon effect and tsume shogi
- Problem 750-3
  - Solved: 16%
  - Solution: 2四銀, 1四玉 (同歩, 2三金, 同玉, 3二角成), 3五金
- Program replies
  - Todai: 1五歩 (losing position)
  - Gekisashi: 3二角成 (White has the advantage)
  - AI Shogi: 3五金

13 Problem area analysis: Horizon effect and tsume shogi
- The problem
  - Horizon checks after 2四銀, 1四玉, 3五金
  - The same position without horizon checks can be solved by all programs
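The effect of such delaying checks can be illustrated with a toy search tree. This is a minimal sketch, not a model of how Todai Shogi, Gekisashi or AI Shogi search: the tree, the scores and the `Node` and `negamax` names are invented for illustration, and the check extension is a commonly used remedy brought in here, not something the slides propose.

```python
from dataclasses import dataclass, field
from typing import List

MATED = -1000  # score for a side to move that has been mated

@dataclass
class Node:
    static_eval: int                      # score from the viewpoint of the side to move
    children: List["Node"] = field(default_factory=list)
    in_check: bool = False                # True if the side to move is in check

def negamax(node: Node, depth: int, extend_checks: bool = False, budget: int = 2) -> int:
    # Assumed remedy: a node where the side to move is in check gets one
    # extra ply, at most `budget` times along a single line.
    if extend_checks and node.in_check and budget > 0:
        depth += 1
        budget -= 1
    if depth <= 0 or not node.children:
        return node.static_eval
    return max(-negamax(child, depth - 1, extend_checks, budget)
               for child in node.children)

def attacker_mates() -> Node:
    # Attacker to move: material down after a sacrifice (-3), but mate is available.
    return Node(-3, [Node(MATED)])

quiet = Node(-1)                                       # safe alternative, attacker slightly better
escape = Node(3, [attacker_mates()])                   # defender to move after the check is answered
spite_check = Node(-3, [escape], in_check=True)        # attacker must answer a delaying check
sacrifice = Node(3, [attacker_mates(), spite_check])   # defender may defend normally or check
root = Node(0, [quiet, sacrifice])
root_no_checks = Node(0, [quiet, Node(3, [attacker_mates()])])

print(negamax(root, 4))                        # mate lies beyond the horizon
print(negamax(root, 4, extend_checks=True))    # mate is found
print(negamax(root_no_checks, 4))              # same line without the delaying check
```

At a nominal depth of 4 the plain search prefers the quiet move because the delaying check pushes the mate past the horizon. With the extension, or with the check removed from the tree, the sacrifice is scored as a forced mate.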

14 Problem area analysis: Horizon effect and tsume shogi
- Another problem: tsume shogi deep in the search tree
- Gekisashi with more time
  - 2四銀, 1四玉, 3五金, 7九銀, 同玉, 2五桂, 1五歩, 同馬, 同銀 (-1192)
  - White has a mate in 9 after 同玉 and Black has a mate in 3 after 2五桂!

15 Problem area analysis: Evaluation and forward pruning
- Problem 755-3
  - Solved: 51%
  - Solution: 2二金, 同金, 2三角成, 3三金, 同馬
- Program replies
  - Todai: 2一角成, 4一玉, 6一金 (winning position)
  - Gekisashi: 6八銀, 5六成銀, 3七桂, 6六銀, 2五桂, 5四歩, 2一角成, 4一玉 (Black is winning)
  - AI Shogi: 6八銀, 5八成銀, 2一角成, 4一玉

16 Problem area analysis: Evaluation and forward pruning
- The problem: an incorrect evaluation
  - After 2一角成, 4一玉 the white king can escape, but this cannot be assessed
  - Evaluating the chances of escaping an attack is difficult?
- Another problem: forward pruning
  - Consecutive sacrifices 2二金 and 2三角成
  - Multiple sacrifices not searched deep enough?

17 Problem area analysis: Unpromoted pieces
- Problem 935-2
  - Solved: 95%
  - Solution: 1三歩不成 (pawn moves without promoting), 2六銀直, (1四歩 is illegal) 1四玉
- Program replies
  - Todai: 5二と (losing position)
  - Gekisashi: 8四桂 (White is winning)
  - AI Shogi: resigns (!)

18 Problem area analysis: Unpromoted pieces
- The problem here seems to be a special case of forward pruning
  - Promoting a major piece or a pawn is almost always better than not promoting
  - Non-promotions of these pieces are pruned to improve search efficiency
- Not a high-priority problem, but it could have consequences for thinking in opponent time
  - When there is no difference between promoting and not promoting a piece, playing the non-promotion makes the opponent program's thinking in opponent time useless
  - My advice: play the non-promotion to win some time!
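The pruning rule described on this slide might look roughly like this inside a move generator. It is a hedged sketch only: `promotion_choices` and `PRUNE_NON_PROMOTION` are hypothetical names, none of the tested programs is known to use this code, and shogi's forced-promotion rules are deliberately left out.

```python
from typing import List

# Assumption taken from the slide: non-promotions are pruned for
# "a major piece or a pawn", i.e. pawn, bishop and rook.
PRUNE_NON_PROMOTION = {"pawn", "bishop", "rook"}

def promotion_choices(piece: str, can_promote: bool, prune: bool = True) -> List[bool]:
    """Return the promotion flags to search for one move (True = promote)."""
    if not can_promote:
        return [False]
    if prune and piece in PRUNE_NON_PROMOTION:
        return [True]            # the non-promotion is never generated
    return [True, False]         # search both the promotion and the non-promotion

print(promotion_choices("pawn", True))                # pruned: promotion only
print(promotion_choices("pawn", True, prune=False))   # both choices searched
print(promotion_choices("silver", True))              # silver is never pruned here
```

With pruning enabled, a move such as the unpromoted pawn move in problem 935-2 would never be generated, so the search could not find it however much time it is given.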

19 Problem area analysis: Other problem areas
- Insufficient hardware speed
  - Some positions could be solved by giving the program more time
  - Improved hardware speed will automatically solve these positions
- Time allocation
  - In some positions, the programs would play very quickly
  - These positions were deleted from our test set
  - However, it might be a different problem area: when to cut off the search?

20 Problem area analysis: Overview

Problem area                     Positions
Insufficient hardware speed      31
Inaccurate evaluation function   20
Incorrect forward pruning        19
Horizon effect                   18
Tsume shogi                      11
Mate using unpromoted pieces     6
Reason unclear                   7
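A quick tally of the overview table, using the counts exactly as listed. The counts add up to 112, more than the 100 positions in the test set. One possible reading, which the slides do not state, is that some positions were assigned to more than one problem area.

```python
# Counts copied from the overview table.
problem_areas = {
    "Insufficient hardware speed": 31,
    "Inaccurate evaluation function": 20,
    "Incorrect forward pruning": 19,
    "Horizon effect": 18,
    "Tsume shogi": 11,
    "Mate using unpromoted pieces": 6,
    "Reason unclear": 7,
}

total = sum(problem_areas.values())
largest = max(problem_areas, key=problem_areas.get)

print(total)     # 112
print(largest)   # Insufficient hardware speed
```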

21 Some new results
- New program versions have been released
  - Todai Shogi 6 and 7, Gekisashi 3 and AI Shogi 2004
- Results of Todai 6 on the test set
  - Solved 6 positions
  - The problem areas of these positions were different
    - Inaccurate evaluation function (2 positions)
    - Insufficient hardware speed (2 positions)
    - Horizon effect (1 position)
    - Reason unclear (1 position)

22 Differences between humans and computers
- How difficult are the positions for human players?
  - Almost half of the positions (46) can be solved by more than 50% of the human respondents
  - There are 14 positions that cannot be solved by computers but are solved by more than 80% of the humans

Human percentage   Positions
0 – 10%            0
11 – 20%           12
21 – 30%           18
31 – 40%           10
41 – 50%           13
51 – 60%           16
61 – 70%           7
71 – 80%           9
81 – 90%           9
91 – 100%          5
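The two figures quoted on this slide can be rechecked from the table with a few lines of Python. The tuple layout is only an illustration. Note that the bins as transcribed sum to 99 rather than 100, and the snippet does not try to explain the difference.

```python
# (lower bound %, upper bound %, number of positions), copied from the table.
bins = [
    (0, 10, 0), (11, 20, 12), (21, 30, 18), (31, 40, 10), (41, 50, 13),
    (51, 60, 16), (61, 70, 7), (71, 80, 9), (81, 90, 9), (91, 100, 5),
]

total = sum(count for _, _, count in bins)
over_50 = sum(count for low, _, count in bins if low > 50)
over_80 = sum(count for low, _, count in bins if low > 80)

print(total)     # 99
print(over_50)   # 46
print(over_80)   # 14
```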

23 Conclusions and future work
- We have proposed a set of 100 positions that is general and points to specific problem areas in computer shogi
- As more positions get solved, we intend to replace them with new positions
- Further investigation of the unsolved positions for which the problem could not be determined
- Making further comparisons between what is difficult for humans and difficult for computers

24 Finally
- Download the test set here: gamelab.yz.yamagata-u.ac.jp/RESEARCH/shogitestset.zip
- Let me know about your results

