SNS COLLEGE OF TECHNOLOGY


1 SNS COLLEGE OF TECHNOLOGY
COIMBATORE – 64035
Intelligent Data Analysis
Big Data / Department of MCA

2 Intelligent Data Analysis (IDA)
IDA is an interdisciplinary study concerned with the effective analysis of data. It is used to extract useful information, knowledge, or interesting patterns from large quantities of online data.
It uses statistical, pattern recognition, machine learning, data abstraction, and visualization tools to analyze data and to discover the mechanisms that created the data.

3 IDA – Processing Steps
A problem is identified, depending on the interest of the data analyst.
Sources of information are identified, and a subset of data is generated from the accumulated data.
The data set is pre-processed by removing noise, handling missing information, and transforming it to an appropriate format.
An IDA technique, or a combination of techniques, appropriate for the type of knowledge to be discovered is then applied to the derived data set.
The discovered knowledge is then manipulated, evaluated, and interpreted.
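The processing steps can be sketched in a few lines of Python. This is only an illustrative sketch: the records, the field names (`age`, `segment`), the valid range of 0 to 120, and the per-group mean used as the "IDA technique" are all assumptions made for the example and do not come from the slides.

```python
from statistics import mean

def select_subset(records, wanted_fields):
    """Generate a working subset from the accumulated data."""
    return [{f: r.get(f) for f in wanted_fields} for r in records]

def preprocess(records, field, low, high):
    """Remove noise (out-of-range values), fill missing values with the
    mean of the valid values, and transform the field to float."""
    valid = [r[field] for r in records
             if r[field] is not None and low <= r[field] <= high]
    fill = mean(valid)
    cleaned = []
    for r in records:
        value = r[field]
        if value is None:
            value = fill                      # handle missing information
        elif not (low <= value <= high):
            continue                          # remove noise
        cleaned.append({**r, field: float(value)})  # transform format
    return cleaned

def discover(records, field, label):
    """A deliberately simple stand-in technique: mean of `field` per group."""
    groups = {}
    for r in records:
        groups.setdefault(r[label], []).append(r[field])
    return {key: mean(values) for key, values in groups.items()}

# Hypothetical accumulated data.
raw = [
    {"id": 1, "age": 25, "segment": "a"},
    {"id": 2, "age": None, "segment": "a"},   # missing value
    {"id": 3, "age": 400, "segment": "b"},    # noise
    {"id": 4, "age": 35, "segment": "b"},
]

subset = select_subset(raw, ["age", "segment"])
clean = preprocess(subset, "age", low=0, high=120)
knowledge = discover(clean, "age", "segment")
print(knowledge)  # the result still has to be evaluated and interpreted
```

In practice the last step, evaluation and interpretation, is done by the analyst rather than by code.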

4 IDA – Tools
See5: a program for analyzing data and generating classifiers in the form of decision trees and/or rule sets.
Cubist: analyzes data and generates rule-based piecewise linear models, that is, collections of rules, each with an associated linear expression for computing a target value.
ILLM: constructs classification models in the form of rules that represent knowledge about relations hidden in the data.
Magnum Opus: finds association rules that can provide competitive advantage by revealing underlying interactions between factors within the data.
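To make the idea of a rule-based piecewise linear model concrete, here is a minimal Python sketch of rules that each carry a linear expression. It does not use Cubist or reproduce its file format or output; the variable `x`, the thresholds, and the coefficients are invented for illustration.

```python
# Each rule is (condition, coefficients). The coefficients define a linear
# expression: intercept + sum(weight * attribute value).
RULES = [
    (lambda row: row["x"] <= 10, {"intercept": 2.0, "x": 0.5}),
    (lambda row: row["x"] > 10, {"intercept": -3.0, "x": 1.0}),
]

def predict(row):
    """Return the target value from the first rule whose condition matches."""
    for condition, coef in RULES:
        if condition(row):
            return coef["intercept"] + sum(
                weight * row[name]
                for name, weight in coef.items()
                if name != "intercept"
            )
    raise ValueError("no rule covers this row")

low = predict({"x": 4})    # uses the first rule: 2.0 + 0.5 * 4
high = predict({"x": 20})  # uses the second rule: -3.0 + 1.0 * 20
print(low, high)
```

A tool such as Cubist learns the conditions and coefficients from data; here they are written by hand only to show the shape of the model.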

5 IDA – Disciplinary Areas
Statistics: classification, prediction, modeling, pattern, regression, clustering.
Machine learning: the study of algorithms that can learn from data and make predictions on it; the ability to learn without being explicitly programmed.
Types of machine learning: supervised learning, unsupervised learning, reinforcement learning.
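The difference between supervised and unsupervised learning can be shown with two tiny, hand-rolled examples in Python. Both are illustrative sketches with made-up one-dimensional data: a nearest-neighbour classifier stands in for supervised learning (labels are given), and a two-cluster k-means stands in for unsupervised learning (no labels). Reinforcement learning is not covered here.

```python
# Supervised: training examples come with labels.
TRAIN = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]

def nn_classify(x):
    """Predict the label of the closest training example."""
    return min(TRAIN, key=lambda pair: abs(pair[0] - x))[1]

# Unsupervised: only the values are given; groups must be discovered.
POINTS = [1.0, 2.0, 8.0, 9.0]

def two_means(points, iterations=10):
    """1-D k-means with k=2, starting from the smallest and largest point."""
    centers = [min(points), max(points)]
    for _ in range(iterations):
        groups = ([], [])
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            sum(g) / len(g) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

label = nn_classify(7.0)
centers = two_means(POINTS)
print(label, centers)
```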

6 Thank You

7 SNS COLLEGE OF TECHNOLOGY
COIMBATORE – 64035
Analytic processes and tools

8 Analytic processes and tools
Data analysis is a process for obtaining raw data and converting it into information useful for decision-making. Data is collected and analyzed to answer questions and test hypotheses.
The stages are: data requirements, data collection, data processing, data cleaning, data analysis, and result.

9 Analytic processes
Data requirements: the data needed are specified based upon the requirements of the analysis. Specific variables regarding a population (for example, age and income) may be numerical or categorical.
Data collection: data is collected from a variety of sources (sensors, CCTV, satellites, recording devices). It may also be obtained through interviews and downloads from online sources.
Data processing: data is placed into rows and columns in a table format for further analysis (spreadsheet or statistical software).
Data cleaning: data may be incomplete, contain duplicates, or contain errors. Data cleaning is the process of preventing and correcting these errors, with tasks such as record matching, deduplication, and column segmentation.
Analysis: a variety of techniques can be applied. Mathematical formulas or models, called algorithms, may be applied to the data to identify relationships among the variables, such as correlation or causation.
Result
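The processing, cleaning, and analysis stages can be illustrated with a short Python sketch that uses only the standard library. The CSV content, the column names, and the choice to drop incomplete rows (rather than fill them in) are assumptions made for this example.

```python
import csv
import io
from math import sqrt

# Hypothetical collected data: one duplicate row and one missing age.
RAW = """name,age,income
Asha,30,40000
Asha,30,40000
Ravi,40,52000
Mala,,61000
Kumar,50,64000
"""

def load_rows(text):
    """Data processing: place the data into rows and columns."""
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows):
    """Data cleaning: remove duplicates and incomplete rows, convert types."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(row.values())
        if key in seen:
            continue                      # deduplication
        seen.add(key)
        if not row["age"] or not row["income"]:
            continue                      # incomplete record
        cleaned.append({"name": row["name"],
                        "age": int(row["age"]),
                        "income": int(row["income"])})
    return cleaned

def pearson(xs, ys):
    """Analysis: Pearson correlation coefficient between two variables."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

rows = clean(load_rows(RAW))
r = pearson([row["age"] for row in rows], [row["income"] for row in rows])
print(len(rows), round(r, 3))
```

A high correlation coefficient shows that two variables move together; on its own it does not establish causation.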

10 Thank You

11 Modern data analytical tools
Analytical tools are available in the market from different vendors, including Amazon, IBM, and Microsoft. Analytical tools comprise:
Business analytics: monitors the status of any relevant business component or characteristic on demand, in real time.
Data management: handles large amounts of data and different data types, including unstructured data.
Predictive analytics and performance management: helps to identify trends and characteristics, both positive and negative.
Data warehousing: can handle traditional processed data and unprocessed raw data, along with live data streams.
Business intelligence

12 Modern data analytical tools
Hadoop: a Java-based framework that allows distributed processing of large data sets using commodity hardware.
It provides open-source data management with scale-out storage and distributed processing.

13 Hadoop Ecosystem

14 Hadoop Architecture
Hadoop has two main components:
HDFS (Hadoop Distributed File System): storage is distributed across nodes. The NameNode and DataNodes manage storage.
MapReduce: a programming model for distributed processing. It splits a task across processors located near the data and assembles the results. Characteristics: high bandwidth, clustered storage. The JobTracker and TaskTrackers manage processing.
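The MapReduce programming model can be illustrated with a word count written in plain Python. This sketch runs in a single process and does not use the Hadoop API; the input splits and the function names (`map_phase`, `shuffle`, `reduce_phase`) are assumptions chosen for the example. In Hadoop, the map and reduce tasks would run in parallel on the nodes that hold the data.

```python
from collections import defaultdict

def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    for word in split.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

# Hypothetical input, already divided into splits (one per node).
splits = ["big data needs big storage", "hadoop stores big data"]

mapped = [pair for split in splits for pair in map_phase(split)]
counts = dict(reduce_phase(key, values)
              for key, values in shuffle(mapped).items())
print(counts)
```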

15 Hadoop Architecture
Other major components are:
YARN: a framework for job scheduling and cluster resource management.
Hadoop Common: contains the Java libraries and utilities required by other Hadoop modules. It provides file system and OS-level abstractions, as well as the Java files needed to start Hadoop.

16 Hadoop – File write

17 Hadoop – File read

