Pattern Matching in Computer Go Ling Zhao University of Alberta August 7, 2002.



Outline: Motivations; Formalization; A straightforward approach; DFA approach; Tree approach; Conclusion

Motivations: Patterns are a very important representation of knowledge. Human players also use patterns, but they are stronger at inexact matching, whereas computers are good at exact matching. Pattern databases can be very large (>10,000 patterns).

Formalization: A pattern is a set of points in an area of the board, where each point (x,y) has a state in {Empty, Black, White, OutBound, DontCare}. 2D -> 1D for both the board and patterns. GnuGo, Explorer (early version?)

2D->1D
Pattern:
?X?
.O?
?OO
X: black stone; O: white stone; .: empty point; ?: don’t care; *: don’t care, can even be out of bounds
[Figure: scanning path over the pattern]
Scanning path: OO?X.?*O*?*?

What is the problem? Given a set of patterns, find all matches on the board. Some issues: 1. Efficiency 2. Scalability

Isomorphic patterns: Patterns can be rotated and mirrored, and the colors of the stones can be reversed, so a pattern can be represented in 16 forms (8 rotations and reflections x 2 color assignments).
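The count of 16 can be checked by generating the forms directly. This is only a sketch: it assumes a pattern is stored as a list of equal-length strings in the X / O / . / ? notation, and the function names are made up for illustration rather than taken from any Go program.

```python
def rotate(rows):
    """Rotate a rectangular pattern 90 degrees clockwise."""
    return ["".join(col) for col in zip(*rows[::-1])]

def mirror(rows):
    """Mirror a pattern left to right."""
    return [row[::-1] for row in rows]

def swap_colors(rows):
    """Exchange black (X) and white (O) stones."""
    table = str.maketrans("XO", "OX")
    return [row.translate(table) for row in rows]

def all_forms(rows):
    """Return 16 forms: 4 rotations x 2 mirrorings x 2 colorings.

    Symmetric patterns produce duplicates, so the list may hold
    fewer than 16 distinct entries.
    """
    forms = []
    current = list(rows)
    for _ in range(4):
        for shape in (current, mirror(current)):
            forms.append(shape)
            forms.append(swap_colors(shape))
        current = rotate(current)
    return forms

pattern = ["?X?", ".O?", "?OO"]
forms = all_forms(pattern)
distinct = {tuple(form) for form in forms}
print(len(forms), len(distinct))
```

A real matcher would probably deduplicate the forms of symmetric patterns once, when the database is built, rather than at match time.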

Straightforward approach: For every point on the board, for every transformation of every pattern, try to match it. Outcome: a set of patterns matched on the board. With P the number of patterns and C the average cost to match a pattern, the computation is 361 * 16 * P * C (361 = 19 x 19 board points).

Pattern Matcher for Goliath: Pattern size: 5 X 5 (25 points), can be extended to 5 X. Types for a point: {White, Black, Empty, DontCare}. Each pattern is represented by an array of three 32-bit integers: one for the bitmap of black stones, one for white stones, and one for empty points. Mark Boon, "Pattern Matcher for Goliath", Computer Go 13, winter

Matching: Pattern test: if ((position & pattern) == pattern), or equivalently, if ((~position & pattern) == 0)
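As an illustration of the bitmap test, the sketch below packs a 5 x 5 window into three integers and applies the subset check to each. The bit order (row by row), the helper names, and the use of plain Python integers in place of 32-bit words are assumptions for this example, not details taken from Goliath.

```python
def pack(window, symbol):
    """Set bit i when the i-th point of the flattened 5x5 window holds symbol."""
    bits = 0
    for i, ch in enumerate("".join(window)):
        if ch == symbol:
            bits |= 1 << i
    return bits

def encode(window):
    """Return the (empty, black, white) bitmaps of a 5x5 window.

    DontCare points ('?') set no bit in any of the three bitmaps.
    """
    return tuple(pack(window, s) for s in ".XO")

def matches(position, pattern):
    """Every bit required by the pattern must also be set in the position."""
    return all((pos & pat) == pat for pos, pat in zip(position, pattern))

pattern = encode(["?????", "??X??", "?.O??", "??OO?", "?????"])
good = encode([".....", "..X..", "..O..", "..OO.", "....."])
bad = encode([".....", "..O..", "..O..", "..OO.", "....."])
print(matches(good, pattern), matches(bad, pattern))
```

Because `all` stops at the first failing bitmap and the empty-point bitmap comes first, most candidates are rejected after a single AND and compare.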

Implementation Issues: Try the mirrored and rotated patterns (8 of them). First match the integer for empty points; 99% of attempts will result in a mismatch. Then try matching the integers for white and black stones. Swap the white and black colors and try matching the integers for white and black stones again. Only 8 arrays are needed for every pattern.

Implementation Issues (cont’d): Incremental update: after a stone is added to the board, only the influenced positions need to be tried. Demand that every pattern has no DontCare points among the 5 interior points, leaving only 243 (3^5) possibilities. The database is organized as 243 lists of patterns. The result is 100 times faster in general.
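One way to picture the 243 lists is as buckets keyed by a base-3 number built from the five interior points. The sketch below is hypothetical: the slide does not say which five points are meant, so the center of the window and its four orthogonal neighbours are assumed here, and the names are invented for the example.

```python
from collections import defaultdict

# Assumption: the five "interior" points are the centre of the 5x5 window
# and its four orthogonal neighbours; the slide does not say which five.
INTERIOR = [(1, 2), (2, 1), (2, 2), (2, 3), (3, 2)]
DIGIT = {".": 0, "X": 1, "O": 2}

def interior_key(window):
    """Base-3 number in range(243) built from the five interior points.

    Raises KeyError if an interior point is DontCare ('?'), which is
    exactly the case the restriction on patterns rules out.
    """
    key = 0
    for y, x in INTERIOR:
        key = key * 3 + DIGIT[window[y][x]]
    return key

def build_index(patterns):
    """Group pattern names into buckets, one per interior configuration."""
    index = defaultdict(list)
    for name, window in patterns.items():
        index[interior_key(window)].append(name)
    return index

patterns = {"p": ["?????", "??X??", "?.O.?", "??O??", "?????"]}
index = build_index(patterns)
print(interior_key(patterns["p"]), dict(index))
```

At match time the same key is computed from the board window, and only the patterns in that one bucket need the full bitmap test.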

Problems: Scalability problem: think about 1,000,000 patterns. Lots of patterns may share the same prefix. Need to remove redundant comparisons.

DFA Approach: DFA: Deterministic Finite Automaton. A finite set of states and a set of transitions from state to state, which are caused by input symbols. For each state, there is a unique transition on each symbol. GnuGo Manual, , 2002

[Figure: a DFA to recognize ????..X] [Figure: a DFA to recognize ????..X and XXO]

More DFA

Construct DFA: Question: given two DFAs that recognize two patterns, how do we build one DFA that recognizes both patterns? Synchronized product: B = L X R. The states of B are the pairs (l,r) with l in L and r in R. The transitions of B are (l1,r1)-a->(l2,r2) whenever l1-a->l2 in L and r1-a->r2 in R.
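A small sketch of the product construction follows. Several things here are assumptions for the example rather than details from the GnuGo implementation: the dictionary representation, the `#` symbol for off-board points, and the use of `None` as an explicit dead state so that each DFA has a transition on every symbol and one pattern can keep matching after the other has failed. Only pairs reachable from the start pair are built.

```python
ALPHABET = ".XO#"   # empty, black, white, off-board (symbols assumed here)

def chain_dfa(pattern, name):
    """DFA for one 1D pattern; state i means i symbols matched, None is dead."""
    delta = {}
    for i, want in enumerate(pattern):
        for a in ALPHABET:
            ok = want == "*" or (want == "?" and a != "#") or want == a
            delta[(i, a)] = i + 1 if ok else None
    return {"start": 0, "delta": delta, "accept": {len(pattern): {name}}}

def step(dfa, state, a):
    """Total transition function: dead and finished states lead to dead."""
    return dfa["delta"].get((state, a))

def product(left, right):
    """Synchronized product restricted to pairs reachable from the start."""
    start = (left["start"], right["start"])
    delta, accept, todo, seen = {}, {}, [start], {start}
    while todo:
        l, r = pair = todo.pop()
        names = left["accept"].get(l, set()) | right["accept"].get(r, set())
        if names:
            accept[pair] = names
        for a in ALPHABET:
            nxt = (step(left, l, a), step(right, r, a))
            if nxt == (None, None):
                continue                 # both components dead: no transition
            delta[(pair, a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return {"start": start, "delta": delta, "accept": accept}

def run(dfa, text):
    """Feed text to the DFA and collect every pattern recognized on the way."""
    state = dfa["start"]
    found = set(dfa["accept"].get(state, ()))
    for a in text:
        state = step(dfa, state, a)
        if state is None:
            break
        found |= dfa["accept"].get(state, set())
    return found

both = product(chain_dfa("????..X", "p1"), chain_dfa("XXO", "p2"))
print(sorted(run(both, "XXO...X")))
```

The combined automaton reads each board symbol once, however many patterns were merged into it, which is where the savings over per-pattern matching come from.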

Example

Discussions: In the worst case, the size of the DFA is exponential in the number of patterns. In practical situations, the size tends to be stable. Optimization: find the minimum-state DFA that recognizes all patterns.

Tree Approach: Use sampling to differentiate patterns. Reduce board-pattern matching. Martin Mueller’s Ph.D. Thesis, 75-78, 1995

Tree Approach: Each pattern is covered by a grid of 4 X 4 tiles. Each point can have one of four values: Empty, Black, White, DontCare. A 32-bit integer is used to represent a tile. A Patricia tree index for differentiating the patterns is built.

Patricia Tree
empty -- initial state
insert ababb:
    ababb
insert ababa: 1st difference is at position 5
    [5] -- i.e. test position #5
    a: ababa
    b: ababb
insert ba:
    [1]
    a: [5]
        a: ababa
        b: ababb
    b: ba

Patricia Tree in Explorer: 1. Sample positions on the board and branch according to the value found. 2. For each node, traverse both the subtree with the matching color and the DontCare subtree. 3. Matching ends at either mismatch leaves or pattern leaves. Even if all sample points match, a full match over all points in the pattern is still needed.
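The three steps can be sketched with a simplified sampling tree. This is not Explorer's data structure and not a full Patricia implementation: patterns are assumed to be equal-length 1D strings, each inner node tests the first index at which its patterns differ, and the tuple layout and function names are invented for the example.

```python
def build(patterns, names=None):
    """Build a sampling tree over equal-length 1D patterns (dict name -> string)."""
    names = sorted(patterns) if names is None else names
    length = len(patterns[names[0]])
    for i in range(length):
        values = {patterns[n][i] for n in names}
        if len(values) > 1:                  # first index that tells patterns apart
            children = {
                v: build(patterns, [n for n in names if patterns[n][i] == v])
                for v in values
            }
            return ("test", i, children)
    return ("leaf", names)                   # identical patterns share a leaf

def full_match(pattern, board):
    """Compare every point; '?' in the pattern accepts any board value."""
    return all(p == "?" or p == b for p, b in zip(pattern, board))

def lookup(tree, patterns, board):
    """Return the names of all patterns matching the 1D board string."""
    if tree[0] == "leaf":
        # Sample points only narrowed the candidates; verify every point.
        return [n for n in tree[1] if full_match(patterns[n], board)]
    _, i, children = tree
    hits = []
    for key in (board[i], "?"):              # matching colour, then DontCare
        if key in children:
            hits += lookup(children[key], patterns, board)
    return hits

patterns = {"a": "X.O?", "b": "X.OX", "c": "?OO."}
tree = build(patterns)
print(sorted(lookup(tree, patterns, "X.OX")))
print(sorted(lookup(tree, patterns, "XOO.")))
```

A sampled value with no matching child and no DontCare child plays the role of a mismatch leaf: the whole subtree is skipped without any full comparison.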

Implementation Issues: Compare candidate patterns for each starting position and orientation of the board. Incremental update. Adaptive tree: after a pattern-board mismatch, replace the pattern leaf with a new node branching at the index where the mismatch occurs. Balance the Patricia tree.

Wild Thoughts: The three approaches have their own advantages: the straightforward one is simple, the DFA one is powerful, and the tree one is very flexible. Which one is better? How can they be combined?