Relational Data Mining in Finance. Haonan Zhang, CFWin03-37, 03/04/2003.


Organization
- Motivation & Introduction
- Background
- Problem statement
- Solution
- Outcome
- Conclusion and future work

Motivation & Introduction
Motivation: analyze financial data to find and extract hidden patterns and relations in the data; the results then provide good support for decision making in finance.
Predicting financial markets is a complex and challenging task for several reasons:
- The dimensionality of the problem is high.
- The relationships between the independent and dependent variables are weak and non-linear.
Data mining is a process that analyzes data from different perspectives and explicitly shows the interactions in the data with a given confidence, which makes it well suited to predicting financial markets.

Motivation & Introduction cont..
Traditional data mining methods have limitations, such as weak knowledge representation and limited forms of background knowledge. Inductive Logic Programming (ILP) and Relational Data Mining (RDM) can overcome these limitations:
- ILP can naturally incorporate background knowledge and relations between objects into the learning process.
- RDM can discover hidden relations (general first-order relations) in numerical and symbolic data using background knowledge (a domain theory).

Background
A predicate is a Boolean-valued function. A predicate can be defined extensionally, as the set of tuples for which the predicate is true, or intensionally, as a set of (Horn) clauses for computing whether the predicate is true.
A literal is a predicate or its negation.
A Horn clause consists of two parts: a clause head and a clause body. The clause head is a single predicate; the clause body is a conjunction of literals.
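The following minimal Python sketch (an illustration, not part of the original slides) shows the distinction: parent is defined extensionally as a set of tuples, while grandparent is defined intensionally by the Horn clause grandparent(X, Y) :- parent(X, Z), parent(Z, Y).

```python
# Hypothetical illustration of extensional vs. intensional predicate definitions.

# Extensional definition: parent/2 is true exactly for these tuples.
parent = {("ann", "bob"), ("bob", "carol")}

# Intensional definition via a Horn clause:
#   grandparent(X, Y) :- parent(X, Z), parent(Z, Y).
# The clause head is grandparent(X, Y); the body is a conjunction of literals.
def grandparent(x, y):
    return any(px == x and (pz, y) in parent for (px, pz) in parent)

print(grandparent("ann", "carol"))  # True
print(grandparent("bob", "ann"))    # False
```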

Background cont..
FOIL (First Order Inductive Learning) algorithm:
- Input: a target relation to be learned, a set of positive examples and a set of negative examples of the relation, and a set of background relations.
- Learning approach: a separate-and-conquer approach. FOIL learns one clause at a time, removes the positive training examples covered by that clause, and then learns the next clause. Each clause is grown to cover as many positive training examples as possible while covering no negative training examples. The algorithm terminates when all positive training examples are covered.
- Output: a set of clauses that describe the target relation.
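A schematic Python sketch of this separate-and-conquer outer loop (an illustration under stated assumptions, not FOIL's actual implementation); learn_one_clause and covers are hypothetical helpers supplied by the caller, standing in for FOIL's gain-guided inner clause construction:

```python
# Separate-and-conquer outer loop (illustrative sketch only).
# Assumed helper signatures:
#   learn_one_clause(positives, negatives, background) -> clause
#   covers(clause, example, background) -> bool

def separate_and_conquer(positives, negatives, background,
                         learn_one_clause, covers):
    clauses = []
    remaining = set(positives)
    while remaining:                      # stop when every positive example is covered
        clause = learn_one_clause(remaining, negatives, background)
        clauses.append(clause)
        # "Separate": drop the positive examples this clause already covers.
        remaining = {ex for ex in remaining
                     if not covers(clause, ex, background)}
    return clauses
```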

Background cont..
The FOCL (First Order Combined Learner) algorithm extends FOIL in several ways:
- FOCL constrains the search by using variable typing and inter-argument constraints.
- FOCL uses background knowledge to improve the learning process, such as rules defined by a collection of examples and a partial, possibly incorrect rule that serves as an initial approximation of the predicate to be learned.

Background cont..
The MMDR (Machine Method for Discovering Regularities) algorithm focuses on generating probabilistic first-order rules and on measurement issues for numerical relational data:
- MMDR permits various forms of background knowledge, such as constraints, predefined predicates, and partial (possibly incorrect) rules.
- MMDR uses the statistical significance of hypotheses and the strength of data-type scales to limit the search space.

Problem statement
FOIL takes as input a target relation to be learned, a set of positive examples, a set of negative examples, and a set of background relations (background knowledge). The output is a set of clauses that describe the target relation in terms of the background knowledge.

Solution: separate-and-conquer approach

Solution cont..
Two kinds of literals can be appended to grow a clause.
- Gainful literals: literals that may increase the coverage of positive examples. Gainful literals are evaluated with an information-based heuristic. For a partial clause $T$ with $T^{+}$ positive and $T^{-}$ negative bindings, the average information provided by discovering that a binding is positive is $I(T) = -\log_2 \frac{T^{+}}{T^{+}+T^{-}}$. When a new literal is added, some bindings are excluded; if $k$ of the $T^{+}$ positive bindings are not excluded and the new partial clause is $T'$, the total gain is $\mathrm{Gain} = k \times \bigl(I(T) - I(T')\bigr)$.
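As a concrete check of this heuristic, here is a small Python sketch (not from the slides) that computes the gain formula above from binding counts:

```python
import math

def info(pos_bindings, neg_bindings):
    """Average information that a binding is positive: I(T) = -log2(T+ / (T+ + T-))."""
    return -math.log2(pos_bindings / (pos_bindings + neg_bindings))

def foil_gain(pos_before, neg_before, pos_after, neg_after, k):
    """k = number of positive bindings of the old clause still covered
    after the candidate literal is added."""
    return k * (info(pos_before, neg_before) - info(pos_after, neg_after))

# Example: 10+/10- bindings before, 8+/2- after, 8 positive bindings survive.
print(foil_gain(10, 10, 8, 2, 8))   # ≈ 8 * (1.0 - 0.32) ≈ 5.42
```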

Solution cont..
- Determinate literals: a determinate literal introduces a new variable such that the new partial clause has exactly one binding for each positive binding of the current clause and at most one binding for each negative binding (for example, a literal that binds a new variable Z to the unique father of X). Because such a literal need not exclude any negative bindings, a determinate literal sometimes has zero gain.

Solution cont..
Four forms of literals are considered for appearing in a clause: Q(V1, ..., Vk) and its negation not Q(V1, ..., Vk), where Q is a background or target relation, and the equality literals Vi = Vj and Vi ≠ Vj.

Solution cont..
To learn recursive theories without falling into infinite recursion, FOIL uses three mechanisms to ensure that recursive literals are risk free:
- Ordering constants: the algorithm can discover an ordering of the constants of a type.
- Ordering pairs of variables: once a type's constants are ordered, an ordering between a pair of variables Vi and Vj of that type in a partial clause may also be established.
- Ordering recursive literals: the ordering among variables can be extended to an ordering of literals involving a predicate and variables, so that each recursive call moves strictly "downward" in the ordering and the recursion terminates.
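An illustrative Python sketch of how such a safeguard could be checked (an assumption for illustration, not FOIL's actual code): a candidate recursive literal is accepted only if every variable pair it uses is already known to be strictly ordered, so the recursion is guaranteed to terminate.

```python
# Hypothetical termination check for a candidate recursive literal.

def recursive_literal_is_safe(ordered_pairs, recursive_args):
    """ordered_pairs: set of (Vi, Vj) pairs already established as Vi < Vj
    under the discovered constant ordering.
    recursive_args: (Vi, Vj) pairs used by the candidate recursive literal."""
    return all(pair in ordered_pairs for pair in recursive_args)

# Example: V2 < V1 is established, so recursing on (V2, V1) is allowed,
# while recursing on (V1, V2) is rejected.
print(recursive_literal_is_safe({("V2", "V1")}, [("V2", "V1")]))  # True
print(recursive_literal_is_safe({("V2", "V1")}, [("V1", "V2")]))  # False
```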

Flow Diagram

Outcome

Conclusion and future work
The FOIL algorithm can effectively find hidden relations between the target relation and the background knowledge, and it can represent the target relation in terms of that background knowledge.
FOIL uses a fairly complex recursive control and backup scheme, which increases the complexity of the algorithm; further implementation work requires a better understanding of these schemes.
Future work: implement the FOIL algorithm on parallel computers.