Discovering Interesting Patterns for Investment Decision Making with GLOWER (A Genetic Learner Overlaid With Entropy Reduction). Advisor: Dr. Hsu. Graduate: Min-Hong Lin. IDSL seminar, 2001/12/11.

Outline Motivation Objective Genetic Search for Rules Genetic Rule Discovery Comparison of GLOWER to Tree and Rule Induction Conclusions Comments

Motivation Prediction in financial domains is difficult to model. The dimensionality is high. Relationships among variables are weak and nonlinear. Variable interactions can be significant.

Objective Describe and evaluate several variations of a new genetic learning algorithm (GLOWER) on a variety of data sets. Comparison of GLOWER variants. Comparison of GLOWER to Tree and Rule Induction.

Genetic Search for Rules Genetic algorithms (Packard, 1989; Goldberg, 1989; Holland, 1992) have been shown to be well suited to learning rule-like patterns. They can search large spaces for patterns without resorting to heuristics that are biased against term interactions.

Limitations of Genetic Search Lack of speed. Randomness in creating the initial population. They can be myopic after finding one good solution.

Benefits of Genetic Search Scour a search space thoroughly Allow arbitrary fitness functions in the search

The failings of greedy search Tree induction (TI) algorithms are currently among the most widely used techniques in data mining, but the greedy search conducted by TI algorithms can overlook good patterns, as the following example shows.

An example

An example(Cont’d)

Entropy Reduction Tree induction algorithms typically determine split points based on a heuristic such as entropy reduction.

Entropy Reduction (cont’d) The entropy of a cluster i can be computed using the standard formula shown below. The gain from a split is computed from the difference between the entropy of the parent cluster and the entropy of the child clusters resulting from the split.
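The formula itself appears only as an image in the original slide; the standard entropy and gain definitions it refers to are presumably:

```latex
E_i = -\sum_{c} p_{i,c} \log_2 p_{i,c},
\qquad
\mathrm{Gain(split)} = E_{\mathrm{parent}} - \sum_{k} \frac{n_k}{n_{\mathrm{parent}}}\, E_k
```

where p_{i,c} is the fraction of examples in cluster i belonging to class c, and n_k is the number of examples in child cluster k.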

An example(Cont’d)

Evaluation of partial models Two commonly used metrics for measuring the goodness of individual rules are confidence and support. If N is the total number of examples in a data set, then, for a rule of the form A → B:
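The formulas are shown only as an image in the slide; the standard definitions are:

```latex
\mathrm{confidence}(A \rightarrow B) = \frac{|A \wedge B|}{|A|},
\qquad
\mathrm{support}(A \rightarrow B) = \frac{|A \wedge B|}{N}
```

where |A| is the number of examples satisfying the antecedent A, and |A ∧ B| is the number satisfying both the antecedent and the consequent.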

Genetic Rule Discovery Representation: Gene and Chromosome
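The representation slide is a figure; as a rough illustration only (the class and field names below are my own, not necessarily GLOWER's exact encoding), a conjunctive rule can be encoded as a chromosome whose genes are per-attribute value ranges:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    """One condition of a rule: the attribute must fall within [low, high]."""
    attribute: str
    low: float
    high: float

    def matches(self, example: dict) -> bool:
        return self.low <= example[self.attribute] <= self.high


@dataclass
class Chromosome:
    """A conjunctive rule: all genes must match for the rule to fire."""
    genes: List[Gene]
    predicted_class: str

    def covers(self, example: dict) -> bool:
        return all(g.matches(example) for g in self.genes)
```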

Genetic Rule Discovery Process An initial population of patterns (chromosomes) is created randomly. Each chromosome is evaluated and ranked. The higher-ranked chromosomes are selected to participate in “mating” and “mutation” to produce new offspring. Mating involves exchanging parts of chromosomes (genes) between pairs (crossover). Mutating involves changing the value of a gene.

Genetic Rule Discovery Process(Cont’d) By repeating these steps over and over, the search converges on populations of better patterns, as measured using some fitness function.
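A minimal sketch of the loop these slides describe (rank by fitness, mate the better chromosomes, mutate the offspring, repeat); the fitness, crossover, and mutate callables are placeholders, not GLOWER's actual operators:

```python
import random


def evolve(population, fitness, crossover, mutate, generations=50, elite_frac=0.5):
    """Generic GA loop: rank chromosomes, let the top fraction reproduce."""
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, int(len(ranked) * elite_frac))]
        offspring = []
        while len(offspring) < len(population):
            p1, p2 = random.sample(parents, 2)
            child = mutate(crossover(p1, p2))  # exchange genes, then perturb one
            offspring.append(child)
        population = offspring
    return max(population, key=fitness)
```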

Genetic Rule Discovery (Cont’d) Selection (or reproduction), crossover, mutation. [Slide diagram: probability of survival under these operators.]

Genetic Rule Discovery (Cont’d)

Two problems If A1 is the dominant pattern in the data set, chromosomes typically gather in the neighborhood of A1. If such a pattern dominates the early populations, the algorithm is likely to overlook the other patterns. Approaches (inductive strengthening): Sequential Niching (SN); Data Reduction (DR).

Sequential Niching (SN) Niching schemes have been described for dealing with multimodal problems (Goldberg and Richardson, 1987; Oei et al., 1991). Inductive strengthening (Provost & Buchanan, 1992): placing stronger restrictions on the search based on the rules that have already been induced. Genetic search can perform inductive strengthening by niching sequentially: after the search converges on a high-fitness schema, the evaluation function can be modified to penalize patterns corresponding to it.
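As a rough sketch (my own simplification, not the paper's exact penalty scheme), sequential niching can be implemented by wrapping the fitness function so that examples already covered by previously discovered rules contribute a penalty rather than a reward:

```python
def niched_fitness(rule, data, base_fitness, discovered_rules, penalty=0.5):
    """Penalize a candidate rule for re-covering examples that
    previously discovered rules already explain."""
    score = base_fitness(rule, data)
    already_covered = [x for x in data
                       if any(r.covers(x) for r in discovered_rules)]
    overlap = sum(1 for x in already_covered if rule.covers(x))
    return score - penalty * overlap / max(1, len(data))
```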

Sequential Niching: SN (Cont’d)

Data Reduction (DR) The most commonly used heuristic for inductive strengthening is the covering heuristic, and we can apply it to genetic search. Instead of penalizing an already-covered area when calculating the fitness function, the data corresponding to that area are removed from further consideration.
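A minimal sketch of this covering-style data reduction, assuming a run_ga helper that returns the best rule found on the current data (function names and stopping thresholds are assumptions, not from the paper):

```python
def discover_rules_with_data_reduction(data, run_ga, max_rules=10, min_remaining=20):
    """Iteratively run the GA, keep the best rule, and drop the covered examples."""
    rules = []
    remaining = list(data)
    while len(remaining) >= min_remaining and len(rules) < max_rules:
        best_rule = run_ga(remaining)          # GA search on the reduced data
        covered = [x for x in remaining if best_rule.covers(x)]
        if not covered:
            break                              # nothing new was explained
        rules.append(best_rule)
        remaining = [x for x in remaining if not best_rule.covers(x)]
    return rules
```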

Compare DR to SN

Compare DR to SN (Cont’d) SN is more likely to produce non-overlapping rules, whereas DR produces a hierarchy of rules.

Compare DR to SN (Cont’d)

Comparison of GLOWER variants Sampling design. GLOWER variants: sequential niching; sequential niching plus entropy reduction; data reduction plus entropy reduction.

Results

Comparison of GLOWER to Tree and Rule Induction To predict “earnings surprises”. Data set: a history of earnings forecasts and actual reported earnings for S&P 500 stocks between 1990/1 and 1998/9. 60 independent variables were chosen based on fundamental and technical analysis.

Results

Results (Cont’d) GLOWER is more thorough in its search than RL, which in turn is more thorough than TI. GLOWER outperforms in both confidence and support for predicting positive surprises. For negative surprises, the confidence levels are similar, but GLOWER is better in support. The GA is much better suited to capturing symmetry in the problem space; TI and RL find it harder to predict a negative earnings surprise than a positive one.

Discussion We are able to combine the speed of the traditional methods with the more comprehensive search of the GA. The GA employs useful heuristics for achieving higher levels of support while maintaining high accuracy. The GA is about two to three orders of magnitude slower than a TI algorithm. Explainability is an important consideration in using knowledge discovery methods.

Conclusions Genetic learning algorithms are more thorough in their search than other rule learning algorithms for hard problems. Genetic algorithms have the flexibility to accommodate variable fitness functions. The study provides a comparison of the heuristics used by different rule-mining techniques.

Comments GLOWER has the ability to uncover interesting patterns for difficult problems. Time complexity should perhaps also be considered in the experiments.