Dunja Mladenić Marko Grobelnik Jožef Stefan Institute, Slovenia.

Slides:



Advertisements
Similar presentations
Di Yang, Elke A. Rundensteiner and Matthew O. Ward Worcester Polytechnic Institute VLDB 2009, Lyon, France 1 A Shared Execution Strategy for Multiple Pattern.
Advertisements

Big Data Management and Analytics Introduction Spring 2015 Dr. Latifur Khan 1.
A Fast Growing Market. Interesting New Players Lyzasoft.
Probabilistic Aggregation in Distributed Networks Ling Huang, Ben Zhao, Anthony Joseph and John Kubiatowicz {hling, ravenben, adj,
5 Complex Event Processing (CEP) is the continuous and incremental processing of event streams from multiple sources based on declarative query.
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
Elastic Burst Detection: Applications Discovering intervals with an unusually large numbers of events. –In astrophysics, the sky is constantly observed.
Business Intelligence Andrew Davis Andria Zippler Jana Krinsky Tiffany Ferris.
Interpret Application Specifications
©Ian Sommerville 2004Software Engineering, 7th edition. Insulin Pump Slide 1 An automated insulin pump.
DASHBOARDS Dashboard provides the managers with exactly the information they need in the correct format at the correct time. BI systems are the foundation.
EMBEDDED SOFTWARE Team victorious Team Victorious.
Chapter 2: Business Intelligence Capabilities
Copyright © 2014 Pearson Education, Inc. 1 It's what you learn after you know it all that counts. John Wooden Key Terms and Review (Chapter 6) Enhancing.
SERVER Betül ŞAHİN What is this? Betül ŞAHİN
IBM Research – Thomas J Watson Research Center | March 2006 © 2006 IBM Corporation Events and workflow – BPM Systems Event Application symposium Parallel.
Marko Grobelnik Jasna Škrbec Jozef Stefan Institute Social Context as a part of News-Archive-Explorer Web application for exploratory browsing of news.
Basic Concepts of Datawarehousing An Overview Prasanth Gurram.
Data Mining Techniques As Tools for Analysis of Customer Behavior
1.Knowledge management 2.Online analytical processing 3. 4.Supply chain management 5.Data mining Which of the following is not a major application.
INTELLIGENT SYSTEMS BUSINESS MOTIVATION BUSINESS INTELLIGENCE M. Gams.
Sensor Data Management: Challenges and (some) Solutions Amol Deshpande, University of Maryland.
Opening Keynote Presentation An Architecture for Intelligent Trading  Alessandro Petroni – Senior Principal Architect, Financial Services, TIBCO Software.
An approach to Intelligent Information Fusion in Sensor Saturated Urban Environments Charalampos Doulaverakis Centre for Research and Technology Hellas.
Anomaly detection with Bayesian networks Website: John Sandiford.
The McGraw-Hill Companies, Inc Information Technology & Management Thompson Cats-Baril Chapter 3 Content Management.
1 Types of Business Information Systems Different people have different information needs depending on their work.
Next-Generation IDS: A CEP Use Case in 10 Minutes 3rd Draft – November 8, nd Event Processing Symposium Redwood Shores, California Tim Bass, CISSP.
Marko Grobelnik Jozef Stefan Institute ( Ljubljana, Slovenia.
© 2012 IBM Corporation IBM Security Systems 1 © 2013 IBM Corporation 1 Ecommerce Antoine Harfouche.
Report on Intrusion Detection and Data Fusion By Ganesh Godavari.
John Plummer Technical Specialist Data Platform Microsoft Ltd StreamInsight Complex Event Processing (CEP) Platform.
Towards Low Overhead Provenance Tracking in Near Real-Time Stream Filtering Nithya N. Vijayakumar, Beth Plale DDE Lab, Indiana University {nvijayak,
Efficient Elastic Burst Detection in Data Streams Yunyue Zhu and Dennis Shasha Department of Computer Science Courant Institute of Mathematical Sciences.
BUSINESS ANALYTICS AND DATA VISUALIZATION
Copyright © 2012, SAS Institute Inc. All rights reserved. ANALYTICS IN BIG DATA ERA ANALYTICS TECHNOLOGY AND ARCHITECTURE TO MANAGE VELOCITY AND VARIETY,
Big Data Analytics Large-Scale Data Management Big Data Analytics Data Science and Analytics How to manage very large amounts of data and extract value.
Kaskad Technology Korrelera for Market Surveillance Candyce Edelen November 8, 2006.
+ Big Data IST210 Class Lecture. + Big Data Summary by EMC Corporation ( More videos that.
WebFOCUS Magnify: Search Based Applications Dr. Rado Kotorov Technical Director of Strategic Product Management.
Web-Mining …searching for the knowledge on the Internet… Marko Grobelnik Institut Jožef Stefan.
© 2006, National Research Council Canada © 2006, IBM Corporation Solving performance issues in OTS-based systems Erik Putrycz Software Engineering Group.
1/14/ :59 PM1/14/ :59 PM1/14/ :59 PM Research overview Koen Victor, 12/2007.
CISC 849 : Applications in Fintech Namami Shukla Dept of Computer & Information Sciences University of Delaware iCARE : A Framework for Big Data Based.
BMQ-Index: Shared and Incremental Processing of Border Monitoring Queries over Data Streams Jinwon Lee Y. Lee, S. Kang, S. Lee, H. Jin, B. Kim and J. Song.
Waqas Haider Bangyal. 2 Source Materials “ Data Mining: Concepts and Techniques” by Jiawei Han & Micheline Kamber, Second Edition, Morgan Kaufmann, 2006.
Introduction Web analysis includes the study of users’ behavior on the web Traffic analysis – Usage analysis Behavior at particular website or across.
Copyright © 2016 Pearson Education, Inc. Modern Database Management 12 th Edition Jeff Hoffer, Ramesh Venkataraman, Heikki Topi CHAPTER 11: BIG DATA AND.
Chapter 8: Web Analytics, Web Mining, and Social Analytics
Big Data Javad Azimi May First of All… Sorry about the language  Feel free to ask any question Please share similar experiences.
1 Creating Situational Awareness with Data Trending and Monitoring Zhenping Li, J.P. Douglas, and Ken. Mitchell Arctic Slope Technical Services.
Streaming Semantic Data COMP6215 Semantic Web Technologies Dr Nicholas Gibbins –
Chapter 3 Building Business Intelligence Chapter 3 DATABASES AND DATA WAREHOUSES Building Business Intelligence 6/22/2016 1Management Information Systems.
Continuous Monitoring of Distributed Data Streams over a Time-based Sliding Window MADALGO – Center for Massive Data Algorithmics, a Center of the Danish.
Lecture-6 Bscshelp.com. Todays Lecture  Which Kinds of Applications Are Targeted?  Business intelligence  Search engines.
Data Mining - Introduction Compiled By: Umair Yaqub Lecturer Govt. Murray College Sialkot.
Big Data in Technical and Vocational Education (TVE)
Big Data.
IBM Tivoli Web Site Analyzer Training Document
Remote Monitoring solution
Datamining : Refers to extracting or mining knowledge from large amounts of data Applications : Market Analysis Fraud Detection Customer Retention Production.
COS 518: Advanced Computer Systems Lecture 11 Michael Freedman
ece 627 intelligent web: ontology and beyond
Data Mining: Introduction
Introduction to Stream Computing and Reservoir Sampling
Data Warehousing Concepts
Big DATA.
Online Analytical Processing Stream Data: Is It Feasible?
COS 518: Advanced Computer Systems Lecture 12 Michael Freedman
Presentation transcript:

Dunja Mladenić Marko Grobelnik Jožef Stefan Institute, Slovenia ( Nov 27 th 2008, ICT 2008, Lyon

 Motivation ◦ Why?  Introduction ◦ Who? What?  Approaches ◦ How?  Applications ◦ …four example applications  Future Challenges  Further References

 Why one would need (near) real-time information processing? ◦ …because Time and Reaction Speed correlate with many target quantities – e.g.:  …on stock exchange with Earnings  …in controlling with Quality of Service  …in fraud detection with Safety, etc. ◦ Generally, we can say: Reaction Speed == Value  …if our systems react fast, we create new value!

 Who works with real time data processing? ◦ “Stream Mining” (subfield of “Data Mining”) dealing with mining data streams in different scenarios in relation with machine learning and data bases  ◦ “Complex Event Processing” is a research area discovering complex events from simple ones by inference, statistics etc. 

 What is Real-Time information processing? ◦ It is defined by a set of approaches enabling operations on the observed incoming stream of data: Transform QuerySegment Predict Model Capture Reality (Events)

 When dealing with streams is really a problem? ◦ …when we have an intensive data stream and complex operations on data are required!  In such situations usually… ◦ …the volume of data is too big to be stored ◦ …the data can be scanned thoroughly only once ◦ …the data is highly non-stationary (changes properties through time), therefore approximation and adaptation are key to success  Therefore, a typical solution is… ◦ …not to store observed data explicitly, but rather in the aggregate form which allows execution of required operations

 All typical applications are “mission critical” ◦ …they have intensive streams and complex queries  Example applications: ◦ Dynamic tracking of stock fluctuations ◦ Surveillance for frauds and money laundering ◦ Network traffic monitoring ◦ Sensor network data analysis ◦ Web click stream mining ◦ Power consumption measurement  Next slides show some concrete applications…

Source: Gehrke 07 and Cayuga application scenarios (Cornell University) Example Queries (Stream Triggers): Notify me when the price of IBM is above $83, and the first MSFT price afterwards is below $27. Notify me when some stock goes up by at least 5% from one transaction to the next. Notify me when the price of any stock increases monotonically for ≥30 min. Notify me whenever there is double top formation in the price chart of any stock Notify me when the difference between the current price of a stock and its 10 day moving average is greater than some threshold value Stock monitoring ◦ Stream of price and sales volume of stocks over time ◦ Technical analysis/charting for stock investors ◦ Support trading decisions

Alarms Server Alarms Explorer Server Live feed of data Operator Analysis of data Big board display Top predictions Telecom Network (~ devices) Alarms ~10-100/sec  Alarms Explorer Server implements three real-time scenarios on the alarms stream: 1.Root-Cause-Analysis – finding which device is responsible for occasional “flood” of alarms 2.Short-Term Fault Prediction – predict which device will fail in next 15mins 3.Long-Term Anomaly Detection – detect unusual trends in the network  …system is used in British Telecom

Log Files (~100M page clicks per day) Log Files (~100M page clicks per day) User profiles Content Stream of profiles Segments Trend Detection System Stream of clicks Trends and updated segments Campaign to sell segments $ Sales

Query Topic Trends Visualization Topics description US Elections US Budget Mid-East conflict NATO-Russia Result set

 Stream operators to become part of standard database systems  Dealing with streams of complex event types ◦ …e.g. documents and other less structured data  Dynamic social networks ◦ …i.e. when underlying data model is a graph getting updated with each event (hot research topic in data mining)  Semantic Streams ◦ …how to map low level stream events into higher level semantic concepts (e.g. in sensor networks)

 Data Stream Mining:  Complex Event Processing:  Real Time Computing:  Online Algorithms:  Worst Case Analysis:

 State of the Art in Data Stream Mining: Joao Gama, University of Porto ◦  Data stream management and mining: Georges Hebrail, Ecole Normale Superieure ◦