Yang Hu University of Pittsburgh Department of Computer Science.

Presentation transcript:

Yang Hu University of Pittsburgh Department of Computer Science

* Introduction to SIS
* Topic Detection and Tracking (TDT)
  * Concept
  * Goals
  * Major Tasks
  * Methods
* TDT based Power Efficiency Web Server
  * Motivation
  * Implementation
* Conclusion

* A Slow Intelligence System (SIS) provides a software development framework that lets a general-purpose system with insufficient computing resources gradually improve its performance over time.

* It contains five stages: Enumeration, Elimination, Adaptation, Concentration, and Propagation.

* What is TDT?
* A DARPA-sponsored initiative to investigate the state of the art in finding trends in a stream of broadcast news stories.

1. Develop automatic techniques for finding topically related material in streams of data. This could be valuable in a wide variety of applications where efficient and timely access to information is important (e.g., CNN or Yahoo News).
2. Enable computers to map out data automatically: finding story boundaries, determining which stories belong together, and discovering when something new (unforeseen) has happened.

1. Story Segmentation - Detect changes between topically cohesive sections
2. Topic Tracking - Keep track of stories similar to a set of example stories
3. Topic Detection - Build clusters of stories that discuss the same topic
4. First Story Detection - Detect if a story is the first story of a new, unknown topic
5. Link Detection - Detect whether or not two stories are topically linked
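Two of the tasks above, Link Detection and First Story Detection, can be illustrated with a small sketch. It assumes a bag-of-words representation, cosine similarity, and a fixed threshold of 0.3; none of these choices come from the slides, and the function names are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Turn a story into a term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_linked(story_a, story_b, threshold=0.3):
    """Link Detection: are two stories topically linked? (threshold is an assumption)"""
    return cosine_similarity(bag_of_words(story_a), bag_of_words(story_b)) >= threshold


def first_story_flags(stories, threshold=0.3):
    """First Story Detection: flag a story if it is linked to no earlier story."""
    seen = []
    flags = []
    for story in stories:
        vector = bag_of_words(story)
        is_first = all(cosine_similarity(vector, old) < threshold for old in seen)
        flags.append(is_first)
        seen.append(vector)
    return flags


stories = [
    "earthquake hits city center",
    "earthquake damage in city center grows",
    "election results announced today",
]
flags = first_story_flags(stories)
```

In this toy stream the first and third stories are flagged as new topics, while the second is linked to the first. A real TDT system would use a stronger representation (for example TF-IDF weighting) and a tuned threshold.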

* General Linear Abstraction of Seasonality (GLAS)
* Henderson Filter (HF)
* Lowess (LW)
* Smoothing splines (SS)
* Kalman Filter (KF)

* GLAS is a package currently used at the Bank of England for seasonal adjustment and trend estimation.
* The trend series is constructed using a moving average of the data with a triangular-shaped weighting pattern.
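A moving average with triangular weights can be sketched as follows. This is a minimal illustration of the weighting idea only: the window half-width and the handling of end points (left as `None`) are assumptions, not the settings used by the GLAS package.

```python
def triangular_weights(half_width):
    """Weights that rise linearly to the centre and fall symmetrically, summing to 1."""
    raw = [half_width + 1 - abs(k) for k in range(-half_width, half_width + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def triangular_moving_average(series, half_width=2):
    """Centred moving average with triangular weights.

    Points too close to either end of the series are left as None
    (an assumed convention for this sketch).
    """
    weights = triangular_weights(half_width)
    n = len(series)
    trend = []
    for i in range(n):
        if i < half_width or i >= n - half_width:
            trend.append(None)
        else:
            window = series[i - half_width:i + half_width + 1]
            trend.append(sum(w * x for w, x in zip(weights, window)))
    return trend


trend = triangular_moving_average([1, 2, 3, 4, 5, 6, 7], half_width=2)
```

Because the weights are symmetric and sum to one, a purely linear series passes through the filter unchanged at interior points.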

* The Henderson filter is used in the X11-ARIMA and X-12-ARIMA packages, which are also currently used at the Bank of England.
* The rationale is the same as for GLAS, but with a different weighting pattern.
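The symmetric Henderson weights have a closed form, which the sketch below evaluates. The formula is the standard textbook expression for a filter of odd length, not something stated on the slides, and the function name is hypothetical.

```python
def henderson_weights(length):
    """Symmetric Henderson moving-average weights for an odd filter length >= 5.

    Uses the standard closed-form expression with m = (length - 1) / 2 and n = m + 2.
    """
    if length < 5 or length % 2 == 0:
        raise ValueError("length must be an odd integer >= 5")
    m = (length - 1) // 2
    n = m + 2
    denominator = (
        8 * n * (n ** 2 - 1) * (4 * n ** 2 - 1) * (4 * n ** 2 - 9) * (4 * n ** 2 - 25)
    )
    weights = []
    for j in range(-m, m + 1):
        numerator = (
            315
            * ((n - 1) ** 2 - j ** 2)
            * (n ** 2 - j ** 2)
            * ((n + 1) ** 2 - j ** 2)
            * (3 * n ** 2 - 16 - 11 * j ** 2)
        )
        weights.append(numerator / denominator)
    return weights


w5 = henderson_weights(5)
```

Unlike the triangular pattern, the outermost Henderson weights are negative, while the weights still sum to one.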

* Lowess identifies a certain number of nearest neighbors to a given point, x0, and assigns a weight to each neighbor based on the distance of that neighbor to the point. A value of the trend at x0 is then calculated based on these weights.
* The number of nearest neighbors used is the smoothing parameter.
* The bigger the number, the smoother the trend.
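The Lowess procedure described above can be sketched in a few lines. The tricube weight function and the local linear fit are the usual choices for Lowess but are assumptions here, since the slides do not specify them; robustness iterations are omitted.

```python
def lowess_point(xs, ys, x0, k):
    """Estimate the trend at x0 from its k nearest neighbours.

    Neighbours are weighted with the tricube function (assumed) and a
    weighted straight line is fitted through them.
    """
    neighbours = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x0))[:k]
    max_dist = max(abs(x - x0) for x, _ in neighbours)
    if max_dist == 0:
        return sum(y for _, y in neighbours) / len(neighbours)
    weights = [(1 - (abs(x - x0) / max_dist) ** 3) ** 3 for x, _ in neighbours]
    total = sum(weights)
    if total == 0:
        # All neighbours sit at the bandwidth edge; fall back to a plain mean.
        return sum(y for _, y in neighbours) / len(neighbours)
    mean_x = sum(w * x for w, (x, _) in zip(weights, neighbours)) / total
    mean_y = sum(w * y for w, (_, y) in zip(weights, neighbours)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, (x, _) in zip(weights, neighbours))
    if sxx == 0:
        return mean_y
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, (x, y) in zip(weights, neighbours))
    slope = sxy / sxx
    return mean_y + slope * (x0 - mean_x)


def lowess_trend(xs, ys, k):
    """Trend value at every observed x; larger k gives a smoother trend."""
    return [lowess_point(xs, ys, x0, k) for x0 in xs]


xs = list(range(10))
ys = [2 * x + 1 for x in xs]
fitted = lowess_trend(xs, ys, k=5)
```

On exactly linear data the local fits reproduce the line, which makes a convenient sanity check before applying the smoother to noisy series.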

* The smoothing spline smoother is derived as the explicit solution to a functional minimization problem.
* The smoothing parameter (conventionally written λ) sets the trade-off between the smoothness of the curve (the second derivative term in the integral) and the fidelity to the data (the residual sum of squares).
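For reference, the minimization problem is usually written in the following standard form. The symbols y_t, x_t, f, and λ are conventional notation assumed here, not taken from the slide.

```latex
\hat{f} = \arg\min_{f} \; \sum_{t=1}^{n} \bigl( y_t - f(x_t) \bigr)^2
          + \lambda \int \bigl( f''(x) \bigr)^2 \, dx
```

The first term is the residual sum of squares and the second penalizes curvature: λ = 0 lets the curve follow the data as closely as possible, while a very large λ forces the fit toward a straight line.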

* This approach employs the idea of structural time series modeling where the unobserved component of trend is assumed to follow a well-defined stochastic process.
* General form for the trend component is given below.
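As a minimal sketch, assume the simplest structural form, the local level model, in which the trend follows a random walk and each observation is the trend plus noise. This is an assumed special case, not necessarily the general form shown on the slide, and the variance values are illustrative.

```python
def local_level_filter(observations, obs_var, level_var, init_level=0.0, init_var=1e6):
    """Kalman filter for the local level model (an assumed special case).

    trend_t = trend_{t-1} + state noise (variance level_var)
    y_t     = trend_t + observation noise (variance obs_var)
    Returns the filtered trend estimate after each observation.
    """
    level, var = init_level, init_var
    trend = []
    for y in observations:
        # Prediction step: the random walk keeps the level, uncertainty grows.
        var = var + level_var
        # Update step: blend prediction and observation using the Kalman gain.
        gain = var / (var + obs_var)
        level = level + gain * (y - level)
        var = (1 - gain) * var
        trend.append(level)
    return trend


filtered = local_level_filter([5.0] * 20, obs_var=1.0, level_var=0.1)
```

The ratio of `level_var` to `obs_var` plays the role of a smoothing parameter: a smaller ratio yields a smoother estimated trend.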

* Server power consumption is rapidly becoming a hot topic in the IT industry.
* Over the last decade, power has emerged as a critical design constraint in modern computer architecture. In many cases system power consumption is increasing exponentially.

[Architecture diagram: User Request, SIS Based TDT, Knowledge Base, Dispatcher, SIS Coordinator]

* SIS based TDT
[Diagram: 1st KB, 2nd KB, Enumerator, Eliminator, Concentrator]

1st KB:
* Generate algorithm combinations
* Evaluate combinations based on KB records
* Save evaluation results
* Extensively select the combinations
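The four steps above can be sketched as a small enumerate, evaluate, and select loop. The knowledge-base format (a mapping from algorithm name to a past score), the averaging rule, and all function names are hypothetical, chosen only to illustrate the flow.

```python
from itertools import combinations


def enumerate_combinations(algorithms, size):
    """Generate every combination of the given size (Enumerator)."""
    return list(combinations(algorithms, size))


def evaluate(combo, kb_records):
    """Score a combination as the mean of its members' recorded scores (assumed rule)."""
    scores = [kb_records.get(name, 0.0) for name in combo]
    return sum(scores) / len(scores)


def select_combinations(algorithms, kb_records, size=2, keep=2):
    """Evaluate all combinations, save the results, and keep the best ones."""
    candidates = enumerate_combinations(algorithms, size)
    # Saved evaluation results, keyed by combination.
    results = {combo: evaluate(combo, kb_records) for combo in candidates}
    ranked = sorted(results, key=results.get, reverse=True)
    return ranked[:keep], results


# Hypothetical knowledge-base records: past score per trend-estimation method.
kb_records = {"GLAS": 0.6, "HF": 0.8, "LW": 0.4}
selected, results = select_combinations(["GLAS", "HF", "LW"], kb_records)
```

Here `selected` holds the two highest-scoring pairs and `results` is the full record that could be written back to the knowledge base for later rounds.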

* For most data centers, the cost of power has become a top budget item. In fact, in 2008, the average cost of power used by a server exceeded its purchase price (4).
* Nationally, the EPA estimated data center power consumption to cost over $4.5 billion a year in 2006, projected to grow to $7.4 billion in 2011 (5).
* One main reason is typically a lack of communication between the people who pay the power bill and the IT department that operates the servers.

1. Shih and Peng, "Building Topic/Trend Detection System based on Slow Intelligence".
2. Allan, J., Carbonell, J., Doddington, G., Yamron, J., and Yang, Y., "Topic detection and tracking pilot study: Final report".
3. Bianchi, M., Boyle, M., and Hollingsworth, D., "A comparison of methods for trend estimation".
4. Belady, Christian, "In the Data Center, Power and Cooling Costs More Than the IT Equipment it Supports." Electronics Cooling, Vol. 23, No. 1, February.
5. U.S. Environmental Protection Agency, "EPA Report to Congress on Server and Data Center Energy Efficiency".