SNS COLLEGE OF TECHNOLOGY


COIMBATORE – 64035
Intelligent Data Analysis
Big Data / Department of MCA

Intelligent Data Analysis (IDA)
- An interdisciplinary study concerned with the effective analysis of data.
- Used for extracting useful information, knowledge, or interesting patterns from large quantities of online data.
- Uses statistical, pattern recognition, machine learning, data abstraction, and visualization tools to analyze data and to discover the mechanism that created the data.

IDA – Processing Steps
1. Identify a problem, depending on the interest of the data analyst.
2. Identify the sources of information and generate a subset of data from the accumulated data.
3. Pre-process the data set by removing noise, handling missing information, and transforming it to an appropriate format.
4. Apply an IDA technique, or a combination of techniques, appropriate for the type of knowledge to be discovered to the derived data set.
5. Manipulate, evaluate, and interpret the discovered knowledge.
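The pre-processing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name preprocess, the plausible-range check used as the noise filter, the median fill for missing values, and the sample ages are all hypothetical choices, not something the slides prescribe.

```python
from statistics import median

def preprocess(readings, low, high):
    """Clean a list of numeric readings; None marks a missing value."""
    # Remove noise (assumed rule): values outside [low, high] are treated as missing.
    in_range = [x if x is not None and low <= x <= high else None for x in readings]
    # Handle missing information (assumed rule): fill gaps with the median
    # of the values that were actually observed.
    observed = [x for x in in_range if x is not None]
    fill = median(observed)
    filled = [fill if x is None else x for x in in_range]
    # Transform to an appropriate format: rescale every value to [0, 1].
    lo, hi = min(filled), max(filled)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [(x - lo) / span for x in filled]

# Hypothetical sample: one missing age and one implausible age (250).
ages = [23, 31, None, 27, 250, 35]
cleaned = preprocess(ages, low=0, high=120)
```

The sketch assumes that at least one valid reading exists; with no observed values the median call raises an error, which a real pipeline would need to handle.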

IDA – Tools
- See5: a program for analyzing data and generating classifiers in the form of decision trees and/or rule sets.
- Cubist: analyzes data and generates rule-based piecewise linear models, that is, collections of rules, each with an associated linear expression for computing a target value.
- ILLM: constructs classification models in the form of rules that represent knowledge about relations hidden in the data.
- Magnum Opus: finds association rules, providing competitive advantage by revealing underlying interactions between factors within the data.
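To make the idea of a rule-based piecewise linear model concrete, the sketch below represents such a model as a list of rules, each pairing a condition with a linear expression. It does not use or imitate the file formats or interfaces of Cubist or any other tool named above; the attribute name area, the thresholds, and the coefficients are invented for illustration.

```python
# Each rule pairs a condition with a linear expression (intercept plus weights).
# All conditions, attribute names, and coefficients are made up for illustration.
RULES = [
    (lambda r: r["area"] <= 100, {"intercept": 20.0, "area": 1.5}),
    (lambda r: r["area"] > 100, {"intercept": 50.0, "area": 1.25}),
]

def predict(record, rules=RULES):
    """Return the value of the linear expression of the first matching rule."""
    for condition, model in rules:
        if condition(record):
            return model["intercept"] + sum(
                weight * record[name]
                for name, weight in model.items()
                if name != "intercept"
            )
    raise ValueError("no rule covers this record")

small = predict({"area": 80})   # first rule applies: 20.0 + 1.5 * 80
large = predict({"area": 200})  # second rule applies: 50.0 + 1.25 * 200
```

Each rule covers one region of the input space, so the overall model is linear within a region and can change slope between regions.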

IDA – Disciplinary areas
Statistics
- Classification
- Prediction
- Modeling
- Pattern
- Regression
- Clustering
Machine Learning
- The study of algorithms that can learn from data and make predictions on it.
- The ability to learn without being explicitly programmed.
- Types: supervised learning, unsupervised learning, reinforcement learning.
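As a minimal sketch of supervised learning, the example below fits a straight line to labeled examples by ordinary least squares and then predicts a value for an unseen input. The function name fit_line and the hours/scores data are hypothetical; the slides do not name a specific algorithm.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: hours studied (input) and exam score (label).
hours = [1, 2, 3, 4]
scores = [52, 54, 56, 58]

slope, intercept = fit_line(hours, scores)
predicted = slope * 5 + intercept  # prediction for an unseen input of 5 hours
```

The relationship between hours and scores is never written into the program; it is estimated from the examples, which is the sense in which the model learns without being explicitly programmed.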

Thank You

Analytic processes and tools

Analytic processes and tools
- A process for obtaining raw data and converting it into information useful for decision-making.
- Data is collected and analyzed to answer questions and test hypotheses.
- Stages: data requirements, data collection, data processing, data cleaning, data analysis, result.

Analytic processes
Data requirements
- The data needed are specified based on the requirements of the analysis.
- Specific variables regarding a population (for example, age and income) may be numerical or categorical.
Data collection
- Data are collected from a variety of sources (sensors, CCTV, satellites, recording devices).
- Data may also be obtained through interviews and downloads from online sources.
Data processing
- Data are placed into rows and columns in a table format for further analysis (spreadsheet or statistical software).
Data cleaning
- Data may be incomplete, contain duplicates, or contain errors.
- Data cleaning is the process of preventing and correcting these errors.
- Typical tasks include record matching, deduplication, and column segmentation.
Analysis
- A variety of techniques can be applied.
- Mathematical formulas or models, called algorithms, may be applied to the data to identify relationships among the variables, such as correlation or causation.
Result
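The cleaning and analysis stages can be sketched as follows: deduplicate the rows, drop incomplete records, then compute the Pearson correlation between two variables. The helper names deduplicate and pearson and the age/income rows are assumptions made for this example only.

```python
from math import sqrt

def deduplicate(rows):
    """Drop exact duplicate rows, keeping the first occurrence of each."""
    seen, unique = set(), []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical (age, income) records with one duplicate and one missing age.
rows = [(25, 30000), (32, 42000), (25, 30000), (41, 58000), (None, 39000), (50, 61000)]

# Cleaning: remove duplicates, then drop rows with a missing value.
complete = [r for r in deduplicate(rows) if None not in r]

# Analysis: measure how strongly the two variables move together.
ages = [age for age, _ in complete]
incomes = [income for _, income in complete]
r = pearson(ages, incomes)
```

A coefficient close to 1 indicates a strong positive linear relationship. Correlation alone does not establish causation, which needs further analysis.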

Thank You

Modern data analytical tools
- Analytical tools are available in the market from different vendors, including Amazon, IBM, and Microsoft.
- Analytical tools comprise:
  - Business analytics: monitors the status of any relevant business component or characteristic on demand, in real time.
  - Data management: handles large amounts of data and different data types, including unstructured data.
  - Predictive analytics and performance management: helps to identify trends and characteristics, both positive and negative.
  - Data warehousing: can handle traditional processed data and unprocessed raw data, along with live data streams.
  - Business intelligence

Modern data analytical tools
Hadoop
- A Java-based framework that allows distributed processing of large data sets using commodity hardware.
- Open-source data management with scale-out storage and distributed processing.

Hadoop Ecosystem

Hadoop Architecture
It has two main components:
HDFS (Hadoop Distributed File System)
- Distributed across nodes
- NameNode and DataNodes manage storage
MapReduce (programming model for distributed processing)
- Splits a task across processors
- Runs near the data and assembles the result
- High bandwidth
- Clustered storage
- JobTracker and TaskTrackers manage processing
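The MapReduce programming model can be illustrated with a word count that runs in a single Python process. This is a conceptual sketch only: it does not use the Hadoop API, and the function names map_phase, shuffle, and reduce_phase and the two input splits are hypothetical. In a real cluster the map and reduce calls would run on different nodes, close to the data blocks they process.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single result."""
    return key, sum(values)

# Two hypothetical input splits, as if stored on different nodes.
splits = ["big data big", "data tools"]

mapped = [pair for line in splits for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

Because each split is mapped independently and each key is reduced independently, both phases can be spread across many processors and the partial results assembled at the end.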

Hadoop Architecture
Other major components are:
YARN
- A framework for job scheduling and cluster resource management.
Hadoop Common
- Contains the Java libraries and utilities required by other Hadoop modules.
- Provides file system and OS-level abstractions, and the Java files needed to start Hadoop.

Hadoop – File write

Hadoop – File read