Published by Scot Hamilton. Modified over 9 years ago.
1
Predictive Modeling in Data Management
Byung S. Lee, Computer Science, University of Vermont
http://www.emba.uvm.edu/~bslee/homepage/
2
Cost UDF Overview
Funding: US Department of Energy.
Title: Generating Cost Functions of User-Defined Functions.
–Phase 1: preliminary studies.
–Phase 2: core modeling techniques.
–Phase 3: applications.
3
Cost UDF Problem
How long would this one take to run?
[Figure: a UDF and its cost]
4
Phase 1
Approaches:
–Off-line training with cost data sets generated in the same batch.
–On-line training with cost data sets generated in incremental batches (a.k.a. self-tuning).
Models:
–Parametric or nonparametric regression.
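The difference between the two training approaches can be sketched with a small parametric model. This is only an illustration, assuming a one-variable linear cost model (cost = a + b * x); the class name and the choice of a linear form are hypothetical and are not taken from the slides. Because the model keeps only running sums, feeding it one large batch (off-line) or several incremental batches (on-line) gives the same fit.

```python
class BatchLinearCostModel:
    """Least-squares fit of cost = a + b * x, updatable batch by batch.

    Hypothetical illustration: the linear form is an assumption, not
    the model used in the project.
    """

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, batch):
        """Fold a batch of (x, observed_cost) pairs into the running sums."""
        for x, y in batch:
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y

    def coefficients(self):
        """Return (a, b) from the closed-form least-squares solution."""
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two distinct x values")
        b = (self.n * self.sxy - self.sx * self.sy) / denom
        a = (self.sy - b * self.sx) / self.n
        return a, b

    def predict(self, x):
        a, b = self.coefficients()
        return a + b * x


# Synthetic cost data (made up for the example): cost = 2 + 3 * x.
data = [(x, 2 + 3 * x) for x in range(1, 7)]

offline = BatchLinearCostModel()
offline.update(data)          # one batch, trained once

online = BatchLinearCostModel()
online.update(data[:3])       # first incremental batch
online.update(data[3:])       # second incremental batch
```

The on-line variant can serve predictions between batches, which is what makes it self-tuning; the off-line variant has to wait for the full data set.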
5
Phase 1
UDFs:
–Financial time series aggregate functions: median(time series, start date, end date); nth moving window average(time series, start date, end date, window size).
–Keyword-based text search functions: “dog AND cat”; “dog OR cat”; “dog cat” within w words of each other.
–Spatial search operators: range(ref_point, distance); window(lower_left_point, upper_right_point); KNN(ref_point, K).
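To make the idea of a cost data point concrete, here is a rough sketch of one of the UDFs above together with a timing wrapper. It assumes the series is a plain list of numbers and that start and end are list indices rather than dates; the function names and the wrapper are hypothetical, and elapsed wall-clock time stands in for whatever cost measure the project actually recorded.

```python
import time


def moving_window_average(series, start, end, window):
    """Average of every full window of `window` values in series[start..end].

    Simplification (assumption): start and end are inclusive list
    indices, not dates.
    """
    values = series[start:end + 1]
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and the range length")
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        # Slide the window: add the new value, drop the oldest one.
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


def measure_cost(udf, *args):
    """Run a UDF and return (result, elapsed_seconds) as one cost sample."""
    started = time.perf_counter()
    result = udf(*args)
    return result, time.perf_counter() - started


result, seconds = measure_cost(moving_window_average, [1, 2, 3, 4, 5], 0, 4, 2)
```

Each call yields one training pair: the argument values (range length, window size) are the model inputs, and the measured time is the cost to be predicted.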
6
Phase 2
Approaches:
–On-line training with cost data points generated one at a time.
–Assume limited main memory.
Models:
–Nonparametric techniques using multidimensional index structures.
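One way to picture a nonparametric model built on a multidimensional index under a memory limit is a quadtree that stores only summaries (sum and count of observed costs) per node and stops splitting once a node budget is used up. The sketch below is an assumption-laden illustration, not the project's data structure: the class names, the split threshold, and the two-dimensional input space are all hypothetical.

```python
class QuadNode:
    """One square region holding a cost summary instead of raw points."""

    def __init__(self, x0, y0, x1, y1):
        self.bounds = (x0, y0, x1, y1)
        self.total = 0.0      # sum of costs observed in this region
        self.count = 0        # number of observations in this region
        self.children = None  # four QuadNode objects once split

    def split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        self.children = [
            QuadNode(x0, y0, mx, my), QuadNode(mx, y0, x1, my),
            QuadNode(x0, my, mx, y1), QuadNode(mx, my, x1, y1),
        ]

    def child_for(self, x, y):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        return self.children[(1 if x >= mx else 0) + (2 if y >= my else 0)]


class BoundedQuadtreeCostModel:
    """Quadtree cost model that never allocates more than max_nodes nodes."""

    def __init__(self, bounds, max_nodes=64, max_depth=8, split_threshold=2):
        self.root = QuadNode(*bounds)
        self.node_count = 1
        self.max_nodes = max_nodes
        self.max_depth = max_depth
        self.split_threshold = split_threshold

    def insert(self, x, y, cost):
        """Add one cost data point, refining the tree while budget remains."""
        node, depth = self.root, 0
        while True:
            node.total += cost
            node.count += 1
            if node.children is None:
                can_split = (node.count >= self.split_threshold
                             and depth < self.max_depth
                             and self.node_count + 4 <= self.max_nodes)
                if not can_split:
                    return
                node.split()
                self.node_count += 4
            node = node.child_for(x, y)
            depth += 1

    def predict(self, x, y):
        """Average cost of the deepest non-empty region containing (x, y)."""
        node = self.root
        if node.count == 0:
            return None
        while node.children is not None:
            child = node.child_for(x, y)
            if child.count == 0:
                break
            node = child
        return node.total / node.count


model = BoundedQuadtreeCostModel((0, 0, 100, 100), max_nodes=5)
for x, y, cost in [(10, 10, 1.0), (90, 90, 9.0), (10, 12, 3.0), (88, 90, 11.0)]:
    model.insert(x, y, cost)
```

Points are absorbed one at a time and then discarded, so memory use depends on the node budget rather than on how many cost observations have arrived. Regions with no data of their own fall back to the average of their parent.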
7
Phase 2
Core modeling techniques:
–Incremental edited k nearest neighbors.
–Memory-limited quadtrees.
Dr. Zhen He will give a quick overview of the recent phase 2 efforts.
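The slides do not spell out the editing rule, so the following is only a guess at the general shape of an incremental, edited k-nearest-neighbor cost model. It assumes that a new point is stored only when the current model mispredicts it by more than a relative tolerance, and that the oldest stored point is evicted when the memory budget is full; both rules, and all names, are hypothetical.

```python
import math


class IncrementalKNNCostModel:
    """k-NN cost predictor that learns one point at a time in bounded memory.

    The editing rule (skip well-predicted points) and the eviction rule
    (drop the oldest point) are assumptions made for this sketch.
    """

    def __init__(self, k=3, capacity=100, tolerance=0.1):
        self.k = k
        self.capacity = capacity
        self.tolerance = tolerance
        self.points = []  # (features, cost) pairs, oldest first

    def predict(self, features):
        """Mean cost of the k stored points nearest to `features`."""
        if not self.points:
            return None
        nearest = sorted(
            self.points, key=lambda p: math.dist(p[0], features)
        )[:self.k]
        return sum(cost for _, cost in nearest) / len(nearest)

    def observe(self, features, cost):
        """Consider one new cost data point; return True if it was stored."""
        predicted = self.predict(features)
        if (predicted is not None
                and abs(predicted - cost) <= self.tolerance * abs(cost)):
            return False  # already predicted well enough; edit it out
        if len(self.points) >= self.capacity:
            self.points.pop(0)  # memory limit reached: evict the oldest
        self.points.append((tuple(features), cost))
        return True


knn = IncrementalKNNCostModel(k=1, capacity=2, tolerance=0.1)
stored = [
    knn.observe((1.0,), 10.0),
    knn.observe((1.1,), 10.5),
    knn.observe((5.0,), 50.0),
    knn.observe((9.0,), 90.0),
]
```

The linear scan in predict keeps the example short; the slides pair such models with multidimensional index structures, which would replace that scan in practice.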
8
Phase 3
–Additional core modeling techniques.
–Abstraction of the problem to “efficient adaptive predictive modeling of incremental data.”
–Applications that need value predictions or class predictions.