Lambert Schomaker KI2 - 2 Kunstmatige Intelligentie / RuG
2 Outline
Date | 1st hour | 2nd hour
6 nov | Planning, N&R #11-13 (LS) | idem
13 nov | Knowledge-based symbolic methods (LS) #19.6, #21 | Example: geometric modeling & matching (MB)
20 nov | Statistical symbolic methods 1 (LS) #17 | Example: spam filter
27 nov | Statistical symbolic methods 2 (LS) | Example: autoclass
4 dec | Heterogeneous-information integration | Example: writer identification, sat. images
11 dec | Grammar induction | Articles
18 dec | Misc. topics | Misc. applications
jan | (exam) |
3 Knowledge-based symbolic methods
Assumption: the Turing / Von Neumann computer is a universal computation engine…
…therefore it can be used at all levels of information processing, provided an appropriate algorithm can be designed which operates on appropriate representations.
5 Knowledge-based symbolic methods
…provided an appropriate algorithm can be designed…
Mechanisms:
– recursion, hierarchic procedures
– search algorithms
– parsers
– matching algorithms
– string manipulation
– numerical computing
– signal processing
– image processing
– statistical processing
7 Knowledge-based symbolic methods
…which operates on appropriate representations…
– stacks
– linear strings and arrays
– matrices
– linked lists
– trees
…is indeed successful in many information processing problems.
Example: double spiral problem
In inner or outer spiral?
Answer: outside
Difficult for, e.g., neural nets
Example: double spiral problem
In inner or outer spiral? How?
– flood fill algorithm?
– other?
Example: double spiral problem
In inner or outer spiral?
– Find the right representation!
– Count the edges crossed on the way from the point to the outside. The odd/even count is not sensitive to shape variations of the spiral, so it is a general solution.
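The edge-counting idea can be sketched as follows. This is a minimal sketch, assuming (hypothetically) that the spiral outline is available as a closed polygon of (x, y) vertices; the function names are illustrative and do not come from the lecture. A horizontal ray is cast from the query point towards the outside and the crossed edges are counted: an odd count means inside, an even count means outside.

```python
def crossings_to_outside(point, polygon):
    """Count polygon edges crossed by a horizontal ray from point towards +x."""
    px, py = point
    count = 0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # The edge counts only if its end points lie on opposite sides of the ray's
        # y level; the strict/non-strict split avoids counting a shared vertex twice.
        if (y1 > py) != (y2 > py):
            # x coordinate where the edge meets the ray's y level
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                count += 1
    return count


def is_inside(point, polygon):
    """Odd number of crossings: inside. Even number: outside."""
    return crossings_to_outside(point, polygon) % 2 == 1


# Hypothetical U-shaped outline; the notch between the two arms is outside.
u_shape = [(0, 0), (5, 0), (5, 5), (4, 5), (4, 1), (1, 1), (1, 5), (0, 5)]
```

Only the parity of the count matters, so the outline may be wound or deformed arbitrarily without changing the procedure. Points lying exactly on an edge are not handled specially in this sketch.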
14 Culture
If it doesn’t work, you didn’t think hard enough
You have to know what you are doing
You have to prove that it works and why it works
Even neural networks run on top of the Turing/von Neumann engine (it will always win)
If you’re smart, you can often avoid NP-completeness
Use of probabilities is a sign of weakness
15 Strong points
Scalability is often possible
Convenience: little context dependence, no training
Reusability
Transformability (compilation)
Algorithmic refinement once it is known how to do a trick (e.g., graphics cards and DSPs in mobile phones: ugly code but highly efficient)
16 Challenges
Knowledge dependence is expensive
– not a problem in “IT” application design
– a challenge to AI
Uncertainty
Noise
Brittleness
17 Solutions
More and more representational weight (UML, Semantic Web, XML solves everything)
Symbolic learning mechanisms:
– induction: version spaces, grammar inference
– decision tree learning
– rewriting formalisms
Active hypothesis testing (what if…, assume X…)
18 Example
In Reading Systems (optical character recognition), only a small part of the algorithm concerns problems of image processing and character classification.
Most of the code is concerned with the structure of the text image:
– where are the blobs?
– are these blobs text, photo or graphics?
– how to segment into meaningful chunks: characters, words?
– what is the logical organization (reading order) in the physical organization of pixels?
Knowledge-based approaches are a necessity!
[Figure: document page with labelled blocks: name of conference; programme committee; brief description of conference; submission details]
23 Example of layout analysis
Knowing the type of a text block strongly reduces the number of possible interpretations
Example: “address block”
Address:
– name of person
– street, number
– postal code, city
prof dr. L.R.B. Schomaker
Grote Appelstraat
TS Groningen
Nederland
Amsterdam 7/7/2003
address
  person name: prof dr. L.R.B. Schomaker
  street: Grote Appelstraat
  codes+city: TS Groningen
  country: Nederland
address
  titles, initials, surname: prof dr. L.R.B. Schomaker
  street, digits: Grote Appelstraat
  4 digits, 2 upper case, city name: TS Groningen
  country name: Nederland
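The content side of this labelling can be sketched with one line pattern per field type. This is a minimal sketch, not the lecture's implementation: the regular expressions, the list of titles, and the sample address with its digits are all hypothetical, chosen only to mirror the field descriptions (titles, initials, surname; street and digits; 4 digits, 2 upper case, city name; country name).

```python
import re

# Hypothetical patterns, one per field type; tried in this order.
PATTERNS = {
    "person name": re.compile(r"^(?:(?:prof|dr|ir|mr)\.?\s+)*(?:[A-Z]\.)+\s+[A-Z][\w-]+$"),
    "street": re.compile(r"^[A-Z][\w .'-]*?\s+\d+\w?$"),
    "codes+city": re.compile(r"^\d{4}\s?[A-Z]{2}\s+[A-Z][\w '-]+$"),
    # Deliberately loose: any capitalised run of letters counts as a country name.
    "country": re.compile(r"^[A-Z][A-Za-z -]+$"),
}


def classify_line(line):
    """Return the first field type whose pattern matches the line, else None."""
    for label, pattern in PATTERNS.items():
        if pattern.match(line.strip()):
            return label
    return None


def looks_like_address(lines):
    """True if the lines classify, in order, as the four address fields."""
    expected = ["person name", "street", "codes+city", "country"]
    return [classify_line(line) for line in lines] == expected


# Hypothetical sample; street name, house number and postal code are made up.
sample = ["prof dr. L.R.B. Schomaker", "Voorbeeldstraat 12", "1234 AB Groningen", "Nederland"]
```

Knowing that a block is an address is what makes such narrow patterns usable: each line only has to be tested against a handful of expected field types instead of against every possible interpretation.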
….
(address
  (title is-left-of initials is-left-of surname)
  is-above (street name is-left-of number)
  is-above (city)
  is-above (country))
Content / Layout
prof dr. L.R.B. Schomaker
Grote Appelstraat
TS Groningen
Nederland
etc.
Helps text classification
Helps text segmentation
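The layout relations in the expression above can be checked on bounding boxes. The following is a minimal sketch under stated assumptions: text chunks are given as axis-aligned boxes in image coordinates (y grows downward), and the `Box` type, the helper names and the coordinates are hypothetical illustrations rather than part of the lecture.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box of a text chunk; y grows downward."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float


def _overlap(a0, a1, b0, b1):
    """True if the intervals (a0, a1) and (b0, b1) share some extent."""
    return min(a1, b1) > max(a0, b0)


def is_left_of(a, b):
    # a ends before b starts, and both sit on roughly the same text line
    return a.x1 <= b.x0 and _overlap(a.y0, a.y1, b.y0, b.y1)


def is_above(a, b):
    # a ends before b starts vertically, and both share some horizontal extent
    return a.y1 <= b.y0 and _overlap(a.x0, a.x1, b.x0, b.x1)


def enclosing(boxes):
    """Bounding box around all boxes of one text line."""
    return Box("line", min(b.x0 for b in boxes), min(b.y0 for b in boxes),
               max(b.x1 for b in boxes), max(b.y1 for b in boxes))


def matches_address_layout(rows):
    """rows: text lines from top to bottom, each a list of Box from left to right."""
    expected = [["title", "initials", "surname"], ["street name", "number"],
                ["city"], ["country"]]
    if [[b.label for b in row] for row in rows] != expected:
        return False
    in_row = all(is_left_of(a, b) for row in rows for a, b in zip(row, row[1:]))
    lines = [enclosing(row) for row in rows]
    stacked = all(is_above(a, b) for a, b in zip(lines, lines[1:]))
    return in_row and stacked


# Hypothetical coordinates for a four-line address block.
rows = [
    [Box("title", 0, 0, 30, 10), Box("initials", 35, 0, 60, 10), Box("surname", 65, 0, 120, 10)],
    [Box("street name", 0, 15, 90, 25), Box("number", 95, 15, 110, 25)],
    [Box("city", 0, 30, 70, 40)],
    [Box("country", 0, 45, 60, 55)],
]
```

The labels are assumed to come from a separate content classifier; this check covers only the geometric part of the address model.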