Christos Faloutsos CMU


SDM'07 Panel: Data Mining Research: Current Status and Future Opportunities. Christos Faloutsos, CMU

Questions. Q1: What are future challenges and opportunities for data mining that are not presently receiving as much attention as they deserve? Q2: Are there things we are doing now that we should be rethinking when considering future challenges and opportunities for data mining? SDM'07 C. Faloutsos, CMU

Past + current successes. Cross-disciplinarity: DM = Stat, ML, DB. Fascinating apps: bio-informatics, privacy, security, streams, social network mining, ...

Machine Learning to support Systems Biology: Subcellular Location (Bob Murphy). Cell images of many proteins; feature extraction, graphical models, clustering of proteins by pattern; combine to enable accurate simulation of cell behavior; generative models for each pattern.

Q1: Challenges to focus on. Scalability: mining Tera- and Peta-bytes; stream mining (anomaly and intrusion detection, sensors); graph mining (text/web mining, marketing, ...); autonomic systems; search engines; national security; ...

Scalability. Google: > 450,000 processors, in clusters of ~2000 processors each. Target: hundreds of TB, to several Peta-bytes. Barroso, Dean, Hölzle, "Web Search for a Planet: The Google Cluster Architecture," IEEE Micro, 2003.
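
As a rough check on the scale quoted above, the sketch below works out how many ~2000-processor clusters 450,000 processors would fill, and how much data each processor would hold if one petabyte were spread evenly across them. The decimal unit convention (1 PB = 10**15 bytes), the even-spread assumption, and the function names are illustrative assumptions, not taken from the slides; 450,000 is the slide's lower bound.

```python
# Back-of-envelope arithmetic for the figures quoted on the Scalability slide.
# Assumptions (not from the slides): decimal units, data spread evenly.

PROCESSORS = 450_000     # slide: "> 450,000 processors" (a lower bound)
CLUSTER_SIZE = 2_000     # slide: "~2000 processors each"
PETABYTE = 10**15        # bytes, decimal convention (assumed)


def cluster_count(processors: int, cluster_size: int) -> int:
    """Number of clusters needed to hold all processors, rounded up."""
    return -(-processors // cluster_size)  # ceiling division


def bytes_per_processor(total_bytes: int, processors: int) -> float:
    """Average share of the data per processor, assuming an even spread."""
    return total_bytes / processors


if __name__ == "__main__":
    clusters = cluster_count(PROCESSORS, CLUSTER_SIZE)
    share_gb = bytes_per_processor(PETABYTE, PROCESSORS) / 10**9
    print(f"clusters: {clusters}")
    print(f"GB per processor for 1 PB: {share_gb:.2f}")
```

Under these assumptions the numbers come to about 225 clusters and roughly 2.2 GB per processor per petabyte, which is only an order-of-magnitude illustration of the target scale.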

E.g.: self-* system @ CMU. >200 nodes; 40 racks of computing equipment; 774 kW of power. Target: 1 PetaByte. Goal: self-correcting, self-securing, self-monitoring, self-... PT bytes, self-*, Linux, gigabit link?

DM for Tera- and Peta-bytes. Two-way street: (<-) DM can use such infrastructures to find patterns; (->) DM can help such infrastructures become self-healing, self-adjusting, 'self-*'.

Q2: What to do differently. Emphasis on Systems-DM collaboration.