1
CSCI 200 Data Mining Lecture 1
2
Recommended Textbooks and Resources
1. Applied Predictive Analytics: Principles and Techniques for the Professional Data Analyst, Dean Abbott, Wiley, ISBN-13:
2. Data Mining and Business Analytics with R, Johannes Ledolter, Wiley, ISBN:
3. Data Mining for Business Intelligence, Galit Shmueli, Nitin R. Patel, Peter C. Bruce, Wiley, ISBN-13:
3
Recommended Textbooks and Resources
4. QlikView Your Business: An Expert Guide to Business Discovery with QlikView and Qlik Sense, Oleg Troyansky, Tammy Gibson, Charlie Leichtweis, foreword by Lars Bjork
5. Discovering Knowledge in Data: An Introduction to Data Mining, 2nd Edition, Daniel T. Larose
6. Business Intelligence: A Managerial Approach, Efraim Turban, Ramesh Sharda, Dursun Delen, David King
4
Recommended Textbooks and Resources
7. Qlik website
8. RStudio, R
9. W. Eckerson, Smart Companies in the 21st Century: The Secrets of Creating Successful Business Intelligence Solutions
10. M. Garcia, B. Harmsen (2012), QlikView 11 for Developers, Birmingham: Packt Publishing
5
Data Mining Common Tasks Daniel T. Larose [5]
Data mining is the process of discovering useful patterns and trends in large data sets.
Common data mining tasks: description, estimation, prediction, classification, clustering, and association.
6
Description Daniel T. Larose [5]
Description means describing the patterns and trends that lie within the data. Exploratory data analysis is a graphical method of exploring the data in search of such patterns and trends.
Example: A pollster may uncover evidence that those who have been laid off are less likely to support the incumbent in the presidential election.
Descriptions of patterns and trends often suggest possible explanations for them. For example, those who were laid off are now less well off financially than before the incumbent was elected, and so would tend to prefer an alternative.
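The pollster example can be sketched as a small group-by summary. This is only an illustration: the poll records, the field layout, and the function name `support_rate_by_group` are made up for the sketch and do not come from Larose.

```python
from collections import Counter

# Hypothetical poll records: (employment_status, supports_incumbent)
poll = [
    ("laid_off", False), ("laid_off", False), ("laid_off", True),
    ("laid_off", False), ("employed", True), ("employed", True),
    ("employed", False), ("employed", True),
]

def support_rate_by_group(records):
    """Return {group: share of that group's respondents who support the incumbent}."""
    totals = Counter()
    supporters = Counter()
    for group, supports in records:
        totals[group] += 1
        if supports:
            supporters[group] += 1
    return {group: supporters[group] / totals[group] for group in totals}

rates = support_rate_by_group(poll)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.0%} support the incumbent")
```

A summary like this only describes the pattern; explaining why it occurs is a separate step.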
7
Estimation and Prediction Daniel T. Larose [5]
Estimation
Approximate the value of a numeric target variable using a set of numeric and/or categorical predictor variables.
Example: Estimating the amount of money a randomly chosen family of four will spend on back-to-school shopping this fall.
Example: Estimating the grade point average (GPA) of a graduate student, based on that student's undergraduate GPA.
Prediction
Similar to estimation, except that for prediction the results lie in the future.
Example: Predicting whether a particular molecule in drug discovery will lead to a profitable new drug for a pharmaceutical company.
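The GPA example can be sketched with a one-predictor least-squares line, which is one common way to estimate a numeric target. The GPA values below are invented for illustration, and the helper names `fit_line` and `estimate_grad_gpa` are assumptions of this sketch.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: undergraduate GPA -> graduate GPA
undergrad_gpa = [2.8, 3.0, 3.2, 3.5, 3.8]
grad_gpa = [3.0, 3.1, 3.4, 3.6, 3.9]

slope, intercept = fit_line(undergrad_gpa, grad_gpa)

def estimate_grad_gpa(undergrad):
    """Estimate graduate GPA from undergraduate GPA with the fitted line."""
    return intercept + slope * undergrad

print(f"Estimated graduate GPA for a 3.3 undergraduate GPA: {estimate_grad_gpa(3.3):.2f}")
```

The same fitted line becomes a prediction in the sense used above when it is applied to a student whose graduate GPA has not yet been earned.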
8
Classification, Daniel T. Larose [5]
Similar to estimation, except that the target variable is categorical rather than numeric. Categorical variables represent types of data which may be divided into groups.
Example: Income bracket, with the categories high income, middle income, and low income. Suppose the researcher would like to classify the income bracket of new individuals not in the current database, based on age, gender, and occupation.
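As a rough sketch of the income-bracket example, the code below uses k-nearest neighbors, which is just one of many classification methods and is not prescribed by the slides. The training records, the distance weighting (ten years of age counted like one categorical mismatch), and the function names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical labeled records: (age, gender, occupation, income_bracket)
training = [
    (25, "F", "clerk", "low"),
    (31, "M", "clerk", "low"),
    (38, "F", "engineer", "middle"),
    (42, "M", "engineer", "middle"),
    (50, "F", "executive", "high"),
    (55, "M", "executive", "high"),
]

def distance(a, b):
    """Mixed distance: scaled age gap plus 1 for each categorical mismatch."""
    age_gap = abs(a[0] - b[0]) / 10.0
    gender_mismatch = 0 if a[1] == b[1] else 1
    occupation_mismatch = 0 if a[2] == b[2] else 1
    return age_gap + gender_mismatch + occupation_mismatch

def classify(person, records, k=3):
    """Return the majority income bracket among the k nearest labeled records."""
    nearest = sorted(records, key=lambda r: distance(person, r[:3]))[:k]
    votes = Counter(r[3] for r in nearest)
    return votes.most_common(1)[0][0]

# A new individual who is not in the database
print(classify((40, "F", "engineer"), training))
```

The target here is a category (the bracket label), which is what separates this task from estimation.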
9
Clustering, Daniel T. Larose [5]
Clustering is the grouping of records, observations, or cases into classes of similar objects. A cluster is a collection of records that are similar to one another and dissimilar to records in other clusters.
Clustering differs from classification in that there is no target variable for clustering. The clustering task does not try to classify, estimate, or predict the value of a target variable.
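A minimal sketch of clustering without a target variable, using plain k-means on two numeric attributes. K-means is only one clustering algorithm, the customer records are invented, and seeding the centroids with the first k points is a simplification chosen to keep the example deterministic.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append((
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                ))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return assignments, centroids

# Hypothetical customers: (age, annual spend in thousands); note there are no labels
customers = [(22, 1.0), (45, 9.0), (25, 1.5), (47, 8.5), (23, 1.2), (50, 9.5)]
assignments, centroids = kmeans(customers, k=2)
print(assignments)
```

The cluster numbers carry no meaning of their own; interpreting each group is left to the analyst.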
10
Clustering, Daniel T. Larose [5]
Nielsen MyBestSegments is in the clustering business: it provides a demographic profile of each of the geographic areas in the country, as defined by zip code.
One of its clustering mechanisms is the PRIZM segmentation system, which describes every American zip code area in terms of distinct lifestyle types.
11
Association Daniel T. Larose [5]
The association task for data mining is the job of finding which attributes "go together." The task of association seeks to uncover rules for quantifying the relationship between two or more attributes.
Example: Investigating the proportion of subscribers to your company's cell phone plan that respond positively to an offer of a service upgrade.
Example: Finding out which items in a supermarket are purchased together, and which items are never purchased together.
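The supermarket example can be sketched by counting item pairs across baskets. Support and confidence are the standard measures for quantifying such rules; the baskets and the function names below are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets, one set of items per transaction
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "eggs"},
]

def pair_support(baskets):
    """Support of each item pair: the fraction of baskets containing both items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

def confidence(baskets, antecedent, consequent):
    """Confidence of 'antecedent -> consequent': P(consequent | antecedent)."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

def never_together(baskets):
    """Item pairs that never appear in the same basket."""
    items = sorted(set().union(*baskets))
    seen = pair_support(baskets)
    return [pair for pair in combinations(items, 2) if pair not in seen]

print(pair_support(baskets)[("bread", "milk")])
print(confidence(baskets, "diapers", "beer"))
print(never_together(baskets))
```

Pairs are stored in alphabetical order, so a lookup such as `("bread", "milk")` must use that order.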
12
Different Terms Same Meaning Dean Abbott
Analytics is the process of using computational methods to discover and report influential patterns in data. The goal of analytics is to gain insight and often to affect decisions.
The term became widespread around 2005 with the launch of Google Analytics, but the ideas behind analytics are not new.
Different terms for analytics: cybernetics, data analysis, neural networks, pattern recognition, statistics, knowledge discovery, data mining, and now even data science.
13
Business Intelligence Efraim Turban; Ramesh Sharda; Dursun Delen; David King
Business intelligence (BI) is an umbrella term that combines architectures, tools, databases, analytical tools, applications, and methodologies.
BI's objectives:
to enable interactive access (sometimes in real time) to data,
to enable manipulation of data,
to give business managers and analysts the ability to conduct appropriate analysis.
14
Business Intelligence Efraim Turban; Ramesh Sharda; Dursun Delen; David King
By analyzing historical and current data, situations, and performances, decision makers get valuable insights that enable them to make more informed and better decisions. The process of BI is based on the transformation of data to information, then to decisions, and finally to actions.
DATA -> INFORMATION -> Decisions -> Actions
15
Major Components of Business Intelligence
Data Warehouse
Business Analytics: a collection of tools for manipulating, mining, and analyzing the data in the data warehouse
Business Performance Management (BPM): for monitoring and analyzing performance
User Interface
16
Many Information Products
From Data to Information. A data warehouse extracts data from multiple transaction or operational systems, then integrates and stores the data in a dedicated database. This extraction and integration process turns data into a new product: information.
From Information to Knowledge. Users equipped with analytical tools then access and analyze the information in the data warehouse. Their analysis identifies trends, patterns, and exceptions. Analytical tools enable users to turn information into knowledge.
17
A BI environment takes raw material (data) and processes it into many information products.
From Knowledge to Rules. Armed with these insights, users then create rules from the trends and patterns they have discovered.
From Rules to Plans and Action. Users then create plans that implement the rules. The plans turn knowledge and rules into action.
Feedback Loop. Once the plan is executed, the cycle repeats itself.
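The data-to-action cycle can be sketched as a tiny pipeline. Everything here is hypothetical: the two source systems, the sales figures, the threshold rule, and the function names are illustrative stand-ins, not part of any BI product.

```python
# Hypothetical raw transactions from two operational systems
store_sales = [{"region": "north", "amount": 120.0}, {"region": "south", "amount": 40.0}]
web_sales = [{"region": "north", "amount": 80.0}, {"region": "south", "amount": 30.0}]

def integrate(*sources):
    """Data -> information: merge the sources into total sales per region."""
    totals = {}
    for source in sources:
        for row in source:
            totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals

def find_exceptions(totals, threshold):
    """Information -> knowledge: flag regions whose sales fall below a threshold."""
    return sorted(region for region, total in totals.items() if total < threshold)

def make_plan(weak_regions):
    """Knowledge -> rules -> plan: one action per flagged region."""
    return [f"Run a promotion in the {region} region" for region in weak_regions]

totals = integrate(store_sales, web_sales)
weak = find_exceptions(totals, threshold=100.0)
plan = make_plan(weak)
print(totals)
print(plan)
```

After the plan is executed, new transactions flow back into the source systems and the same steps run again, which is the feedback loop described above.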