Slide 1: Solving Awari using Large-Scale Parallel Retrograde Analysis
John W. Romein, Henri E. Bal
Vrije Universiteit, Amsterdam

Slide 2: introduction: awari
- 3500-year-old board game
- best-known mancala variant
  - wari, owari, wale, awale, ...
- determine the score for 889,063,398,406 positions
  - retrograde analysis
  - 144 CPUs, 72 GB RAM, 1.4 TB disks, Myrinet

Slide 3: outline
- rules of awari
- databases
- (parallel) retrograde analysis
- performance
- verification
- new game insights
- www: awari oracle

Slide 4: rules of awari
- sow counterclockwise
- capture if the last pit sown is an enemy pit that now contains 2 or 3 stones
- goal: capture the majority of the stones
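A minimal sketch of these rules in C; the pit layout, function names, and the simplified capture handling (no grand-slam exception) are assumptions, not taken from the slides:

```c
#include <stdio.h>

/* Pits 0..5 belong to South, 6..11 to North; sowing runs counterclockwise.
 * Simplified illustration: the "grand slam" exception and other rule
 * variants are ignored. */
typedef struct { int pit[12]; int captured[2]; } Board;

/* player: 0 = South, 1 = North; move: pit 0..5 on the player's own side */
static void play_move(Board *b, int player, int move)
{
    int from = player * 6 + move;
    int stones = b->pit[from];
    int p = from;

    b->pit[from] = 0;
    while (stones > 0) {                    /* sow counterclockwise */
        p = (p + 1) % 12;
        if (p == from) continue;            /* skip the emptied origin pit */
        b->pit[p]++;
        stones--;
    }

    /* capture: walk backwards while the last pits sown are enemy pits
     * that now contain 2 or 3 stones */
    int enemy_lo = (1 - player) * 6, enemy_hi = enemy_lo + 5;
    while (p >= enemy_lo && p <= enemy_hi &&
           (b->pit[p] == 2 || b->pit[p] == 3)) {
        b->captured[player] += b->pit[p];
        b->pit[p] = 0;
        p--;
    }
}

int main(void)
{
    Board b = { { 4,4,4,4,4,4, 4,4,4,4,4,4 }, { 0, 0 } };  /* 48-stone start */
    play_move(&b, 0, 2);                    /* South sows its third pit */
    printf("South has captured %d stones\n", b.captured[0]);
    return 0;
}
```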

Slide 5: awari databases
- build n-stone databases (n = 0, 1, ..., 46, 48)
- one database entry per board position
  - entry contains the score (-n ... +n)
  - south to move
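The slides do not show how a board maps to an entry; a common choice is a perfect index over all ways to distribute n stones across the 12 pits. A sketch of one such ranking follows (the combinatorial scheme and names are assumptions, not the authors' actual indexing):

```c
#include <stdint.h>
#include <stdio.h>

#define PITS 12

/* C(n, k), exact in 64 bits for the sizes that occur here */
static uint64_t binom(int n, int k)
{
    if (k < 0 || k > n) return 0;
    uint64_t r = 1;
    for (int i = 1; i <= k; i++)
        r = r * (uint64_t)(n - k + i) / (uint64_t)i;
    return r;
}

/* Lexicographic rank of a board (pit[0..11] summing to n) among all ways
 * to distribute n stones over 12 pits; range 0 .. C(n+11,11)-1. */
static uint64_t board_index(const int pit[PITS], int n)
{
    uint64_t index = 0;
    int remaining = n;

    for (int i = 0; i < PITS - 1; i++) {
        for (int v = 0; v < pit[i]; v++)    /* boards with fewer stones here */
            index += binom(remaining - v + PITS - i - 2, PITS - i - 2);
        remaining -= pit[i];
    }
    return index;
}

int main(void)
{
    int start[PITS] = { 4,4,4,4,4,4, 4,4,4,4,4,4 };  /* 48-stone start position */
    printf("index = %llu\n", (unsigned long long)board_index(start, 48));
    return 0;
}
```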

Slide 6: scores
- the best move depends only on the stones still on the board
  - not on the stones already captured!
- final result = captured stones + score
- score = eventual division of the remaining stones
  - example: score = +2, i.e. the remaining stones are divided 8 to 6 in favour of the side to move (south)

Slide 7: database construction: retrograde analysis
- MiniMax tree (actually a DCG, a directed cyclic graph)
- search the state space bottom-up, from the final states back towards the initial state

Slide 8: 10-bit retrograde analysis
- each entry: best score so far (7 bits) + number of children with unknown score (3 bits)
- inform the parents when an entry's score becomes known
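A sketch of how such a 10-bit entry could be packed; the slide only gives the bit budget (7 + 3), so the field layout, bias, and helper names below are assumptions:

```c
#include <stdint.h>

/* One 10-bit entry: 7-bit best score plus a 3-bit count of children whose
 * score is still unknown (awari has at most 6 moves, so 3 bits suffice). */
typedef uint16_t Entry;                  /* only the low 10 bits are used */

#define SCORE_BITS 7
#define SCORE_BIAS 64                    /* store score + 64: -64..+63 fits in 7 bits */

static Entry pack_entry(int score, unsigned unknown_children)
{
    return (Entry)(((unsigned)(score + SCORE_BIAS) & 0x7F) |
                   ((unknown_children & 0x07) << SCORE_BITS));
}

static int      entry_score(Entry e)   { return (int)(e & 0x7F) - SCORE_BIAS; }
static unsigned entry_unknown(Entry e) { return (e >> SCORE_BITS) & 0x07; }
```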

Slide 9: 2-bit retrograde analysis
- 2 bits per entry in RAM: Win / Draw / Loss / Unknown
- search n times with a widening window (-i, i)

  PROCEDURE CreateDatabase(n) IS
      FOR i IN 1 .. n DO
          Window := (-i, i);
          SetLeaves();        // handle terminal states and captures
          BottomUpSearch();
          CollectScores();
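A sketch of the 2-bit Win/Draw/Loss/Unknown representation, packed four entries per byte; the packing and names are assumptions, the slide only states 2 bits per entry in RAM:

```c
#include <stdint.h>
#include <stdlib.h>

enum { UNKNOWN = 0, WIN = 1, DRAW = 2, LOSS = 3 };    /* 2-bit state values */

static uint8_t *table;                   /* four 2-bit entries per byte */

static int init_table(uint64_t positions)
{
    table = calloc((positions + 3) / 4, 1);           /* all entries UNKNOWN */
    return table != NULL;
}

static void set_state(uint64_t index, unsigned value)
{
    unsigned shift = (unsigned)(index & 3) * 2;
    uint8_t *byte = &table[index >> 2];
    *byte = (uint8_t)((*byte & ~(3u << shift)) | ((value & 3u) << shift));
}

static unsigned get_state(uint64_t index)
{
    return (table[index >> 2] >> ((unsigned)(index & 3) * 2)) & 3u;
}
```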

Slide 10: bottom-up search
(figure: game tree labelled with W/L/U states, showing a loss propagating upward)

  PROCEDURE CheckState(node) IS
      IF state[node] = unknown AND AllChildrenAreWins(node) THEN
          state[node] := loss;
          SetParentsToWin(node);
          CheckStateOfGrandParents(node);
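A C rendering of the same propagation step, building on the 2-bit table sketched above; children() and parents() are placeholders for the real move and un-move generators:

```c
/* Builds on the 2-bit table above.  children()/parents() stand in for the
 * real (un)move generators: they fill `out` and return a count. */
extern int children(uint64_t node, uint64_t *out);
extern int parents(uint64_t node, uint64_t *out);

static void check_state(uint64_t node)
{
    uint64_t kids[16], pars[16];
    int nkids, npars;

    if (get_state(node) != UNKNOWN)
        return;

    nkids = children(node, kids);
    for (int i = 0; i < nkids; i++)      /* a loss only if every child is a win */
        if (get_state(kids[i]) != WIN)
            return;

    set_state(node, LOSS);

    /* every parent now has a losing child, hence is a win for its mover;
     * grandparents may in turn have become losses, so re-examine them */
    npars = parents(node, pars);
    for (int i = 0; i < npars; i++) {
        uint64_t gp[16];
        int ngp;

        if (get_state(pars[i]) == UNKNOWN)
            set_state(pars[i], WIN);

        ngp = parents(pars[i], gp);
        for (int j = 0; j < ngp; j++)
            check_state(gp[j]);
    }
}
```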

Slide 11: parallel retrograde analysis
(figure: the same game tree, now partitioned over processors)
- partition the database over the nodes
- each node has a receive queue with work
- migrate work (asynchronously)
- global termination detection
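A minimal sketch of the partitioning idea: each position is owned by the node given by a hash of its index, and updates to remote positions are shipped to the owner instead of being applied locally. The modulo hash and the plain MPI_Send are assumptions; the actual system used its own asynchronous messaging over Myrinet:

```c
#include <stdint.h>
#include <mpi.h>

typedef struct { uint64_t index; } WorkItem;   /* "this child became a win" */

static int rank, nprocs;     /* set with MPI_Comm_rank / MPI_Comm_size at startup */

/* Owner-computes partitioning: a simple modulo hash of the position index. */
static int owner(uint64_t index)
{
    return (int)(index % (uint64_t)nprocs);
}

/* Apply an update locally, or forward it to the node that owns the position.
 * The real system overlaps computation, communication and disk I/O; a plain
 * MPI_Send stands in here for the asynchronous message layer. */
static void propagate(uint64_t parent_index)
{
    WorkItem item = { parent_index };

    if (owner(parent_index) == rank) {
        /* local: e.g. decrement the unknown-children counter or run check_state() */
    } else {
        MPI_Send(&item, (int)sizeof item, MPI_BYTE,
                 owner(parent_index), /*tag=*/0, MPI_COMM_WORLD);
    }
}
```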

Slide 12: performance (1/3)
- 72 nodes, each with:
  - dual 1.0 GHz Pentium III
  - 1 GB RAM
  - 20 GB disk
  - 2.0 Gb/s Myrinet
- Myrinet switch

Slide 13: performance (2/3)
- 48-stone database: 15 hours
- total: 51 hours

Slide 14: performance (3/3)
- communication
  - 20 to 30 MB/s send + receive per SMP node
  - 1.4 to 2.1 GB/s through the switch
  - 130 TB in total = 1.0 Pb!
- disk I/O
  - about 10 TB in total

Slide 15: verification
- hardware:
  - ECC RAM, cache, and Myrinet memory
  - CRCs on communication and disk checksums
- software:
  - 2 algorithms give identical results (up to 41 stones)
  - recomputed using 64 SMPs
  - NegaMax integrity check
  - compared statistics with others (up to 36 stones)
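One plausible form of the NegaMax integrity check mentioned above: every stored score should equal the best, over all moves, of the stones captured by the move minus the child's stored score. All helper functions are placeholders, and the terminal-position rule of the real checker is not given on the slide:

```c
#include <stdint.h>
#include <assert.h>

extern int       num_moves(uint64_t pos);
extern uint64_t  child_after(uint64_t pos, int move);  /* side to move flips */
extern int       captured_by(uint64_t pos, int move);  /* stones the move captures */
extern int       db_score(uint64_t pos);               /* stored database score */

/* The stored score of a position should equal the best over its moves of
 * (stones captured by the move) - (stored score of the resulting position). */
static void check_entry(uint64_t pos)
{
    int n = num_moves(pos);
    int best = -128;                       /* below any legal awari score */

    for (int m = 0; m < n; m++) {
        int value = captured_by(pos, m) - db_score(child_after(pos, m));
        if (value > best)
            best = value;
    }
    /* terminal positions (n == 0) follow a separate end-of-game rule */
    assert(n == 0 || db_score(pos) == best);
}
```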

Slide 16: new awari insights
- awari is a draw
- best opening move: F4
  - other opening moves are losing!
- capturing is not always the best choice
  - in 22% of cases, it is not

Slide 17: the awari oracle
- web server (being worked on)
  - look up positions
  - interactive play
  - download
  - statistics
- requires 5 x 160 GB disks

Slide 18: conclusions
- awari is solved and is a draw
- parallel retrograde analysis
  - overlaps computation, communication, and disk I/O
- required:
  - score determination for 889,063,398,406 positions
  - a large parallel system
    - 51 hours of computation time
    - 1.0 Pb of communication