Mining Large Data at SDSC Natasha Balac, Ph.D.

Presentation transcript:

Mining Large Data at SDSC Natasha Balac, Ph.D.

A Deluge of Data
Astronomy, Life Sciences, Modeling and Simulation, Data Management and Mining, Geosciences, Preservation and Archiving
Today, data comes from everywhere:
- Scientific instruments
- Experiments
- Sensors and sensor nets
- New devices
And is used by everyone:
- Scientists
- Consumers
- Educators
- General public
IT environments must support unprecedented diversity, globalization, integration, scale, and use.
Turning the deluge of data into usable information requires an unprecedented level of integration, globalization, scale, and access.

Why DATA MINING?
Necessity is the mother of invention
Huge amounts of data
Electronic records of our decisions:
- Choices in the supermarket
- Financial records
- Our comings and goings
We swipe our way through the world; every swipe is a record in a database
Data rich, but information poor
Lying hidden in all this data is information!

What is DATA MINING?
- Extracting or “mining” knowledge from large amounts of data
- Data-driven discovery and modeling of hidden patterns (that we never knew existed) in large volumes of data
- Extraction of implicit, previously unknown and unexpected, potentially extremely useful information from data
Fundamental idea: learn rules, patterns, and relationships automatically from the data
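To make the fundamental idea concrete, here is a minimal Python sketch that learns co-occurrence patterns automatically from transaction data, in the spirit of the supermarket example. The basket contents, the function name `frequent_pairs`, and the `min_count` threshold are assumptions made up for illustration; none of them come from the presentation, and real tools use far more scalable algorithms.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_count):
    """Return item pairs that appear together in at least min_count transactions."""
    counts = Counter()
    for basket in transactions:
        # De-duplicate and sort so each unordered pair is counted once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= min_count}


# Hypothetical supermarket baskets, invented for this example.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
    ["bread", "milk", "beer"],
]

pairs = frequent_pairs(baskets, min_count=3)
print(pairs)
```

Nobody told the program which items go together; the pattern is found by counting what actually occurs in the records.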

Terminology
Gold Mining vs. Sand Mining
- Knowledge mining from databases
- Knowledge extraction
- Data/pattern analysis
- Knowledge Discovery in Databases (KDD)
- Predictive Modeling
- Machine Learning
- Business Intelligence

CRISP-DM (Cross-Industry Standard Process for Data Mining)
[Figure: CRISP-DM Process Model]

Data Mining Driven Engineering Product Design
Incorporate parallel computing and data mining capabilities into engineering and optimizing product design models
Complex challenges in new product design:
- Accurate acquisition and interpretation of raw customer data
- Integrating newly found knowledge into the engineering design process
- Developing analytical techniques that help reduce the computational time required to generate product portfolios
Mining paid-search online customer preference data

A Java-based Data Driven Product Design (DDPD) platform has been developed that integrates the supercomputing resources at SDSC with complex engineering design simulation platforms such as Matlab, in an effort to streamline the product design and development process.

Tools in the GUI
- Data mining algorithms: Weka, Parallel Weka, Parallel C4.5, and Parallel K-means
- The Data Driven Product Design platform uses Matlab’s powerful computation engine directly from the GUI
- Optimization choices available from the user interface include Matlab, Tomlab, Excel Solver, Star-P, Parallel Matlab, Parallel CPLEX, etc.
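As an illustration of one of the algorithms named above, the following is a minimal serial K-means sketch in plain Python. It is not the Parallel K-means used in the platform, whose implementation the slides do not describe; the function names, the sample points, and the fixed random seed are all assumptions for the example. The assignment step treats each point independently, which is the part that parallel variants typically distribute across processors.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Group each point with its nearest centroid (independent per point)."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, iterations=20, seed=0):
    """Serial K-means: alternate assignment and centroid update until stable."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    for _ in range(iterations):
        clusters = assign(points, centroids)
        new_centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroids[i]  # keep old centroid if cluster is empty
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Hypothetical 2-D data: two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, clusters = kmeans(points, k=2)
print(sorted(centroids))
```

On this toy data the two centroids settle at the centers of the two groups; on real data the result depends on the initial centroids, so several restarts are commonly used.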

Visual Representation of Data Mining results linking with serial optimization models

Thank You