On the Detection of Concept Changes in


On the Detection of Concept Changes in Time-Varying Data Stream by Testing Exchangeability

Shen-Shyang Ho and Harry Wechsler
{sho, wechsler}@cs.gmu.edu
Department of Computer Science, George Mason University

Problem: In a data streaming setting, data points are observed one by one. The concepts to be learned from the data stream may change infinitely often. How do we detect the changes efficiently?

A martingale framework was proposed and two tests were suggested [Ho, ICML 2005].

The test (MT1) based on Doob's Inequality is shown to be an approximation to the sequential probability ratio test (SPRT).
The relationship between the threshold value and the size and power of the test is established.
The mean delay time before a change is detected is estimated.

The second test (MT2) is based on the Hoeffding-Azuma Inequality.
Under some assumptions, MT2 has a lower false positive rate (i.e., the rate of declaring a change when in fact there is none) than MT1.
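
The slides do not spell out the martingale or the strangeness measure, so the following is only a rough sketch of how a Doob-style threshold test such as MT1 might be run. It assumes a randomized power martingale M_n = prod_i (epsilon * p_i^(epsilon - 1)) built from p-values, with the distance of each point to the running mean as a stand-in strangeness score. The names p_value and detect_change, the values epsilon = 0.92 and threshold = 100, and the synthetic stream are all illustrative assumptions, not taken from the presentation. Under exchangeability, Doob's maximal inequality bounds the probability that such a martingale ever reaches a threshold lambda by 1/lambda, which is what ties the threshold to the false positive rate.

```python
import random


def p_value(scores, rng):
    """Randomized p-value of the newest strangeness score among all scores so far."""
    newest = scores[-1]
    greater = sum(1 for s in scores if s > newest)
    equal = sum(1 for s in scores if s == newest)  # counts the newest score itself
    return (greater + rng.random() * equal) / len(scores)


def detect_change(stream, threshold=100.0, epsilon=0.92, seed=0):
    """Return the index at which the martingale first reaches the threshold, else None.

    Assumptions (not from the slides): strangeness is the distance to the
    running mean, and the martingale is the randomized power martingale.
    """
    rng = random.Random(seed)
    seen = []
    martingale = 1.0
    for index, x in enumerate(stream):
        seen.append(x)
        mean = sum(seen) / len(seen)
        # Recompute every score against the current mean so that the scores
        # do not depend on the order in which the points arrived.
        scores = [abs(v - mean) for v in seen]
        p = max(p_value(scores, rng), 1e-12)  # guard against p == 0
        martingale *= epsilon * p ** (epsilon - 1.0)
        if martingale >= threshold:
            return index  # change declared at this observation
    return None


# Synthetic stream: the mean shifts from 0 to 5 at index 300.
data_rng = random.Random(1)
stream = [data_rng.gauss(0.0, 1.0) for _ in range(300)]
stream += [data_rng.gauss(5.0, 1.0) for _ in range(100)]
print(detect_change(stream))
```

With threshold = 100, the Doob bound keeps the chance of a false alarm on an exchangeable stream at or below 1/100. A lower threshold would shorten the detection delay at the cost of more false alarms.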