Data Warehousing, Data Mining, Privacy. Farkas, CSCE 824 - Spring 2011.

Presentation transcript:

Data Warehousing Data Mining Privacy

Reading

Data Warehousing
Repository of data providing organized and cleaned enterprise-wide data (obtained from a variety of sources) in a standardized format
  – Data mart (single subject area)
  – Enterprise data warehouse (integrated data marts)
  – Metadata

OLAP Analysis
Aggregation functions
Factual data access
Complex criteria
Visualization
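
As a rough illustration of the aggregation functions used in OLAP analysis, the sketch below rolls a small fact table up along one dimension at a time. The table contents and the `roll_up` helper are hypothetical and not part of the slides; a real warehouse would do this inside the database engine.

```python
from collections import defaultdict

# Hypothetical fact table: (region, quarter, sales amount).
facts = [
    ("East", "Q1", 100),
    ("East", "Q2", 150),
    ("West", "Q1", 80),
    ("West", "Q2", 120),
]

def roll_up(rows, key_index):
    """Sum the measure (last column) grouped by one dimension column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[-1]
    return dict(totals)

sales_by_region = roll_up(facts, 0)   # aggregate away the quarter dimension
sales_by_quarter = roll_up(facts, 1)  # aggregate away the region dimension
```

Grouping by a different column index gives a different view of the same facts, which is the basic idea behind slicing a data cube by dimension.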

Warehouse Evaluation
Enterprise-wide support
Consistency and integration across diverse domains
Security support
Support for operational users
Flexible access for decision makers

Data Integration
Data access
Data federation
Change capture
Need ETL (extraction, transformation, load)
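
A minimal sketch of the three ETL steps, assuming a hypothetical in-memory source and an SQLite table standing in for the warehouse. The record layout, the cleaning rules, and the `sales` table name are illustrative assumptions, not taken from the slides.

```python
import sqlite3

# Hypothetical operational records; every field arrives as text.
source_rows = [
    {"id": "1", "name": " Alice ", "amount": "10.50"},
    {"id": "2", "name": "BOB", "amount": "n/a"},   # unparseable amount
    {"id": "3", "name": "carol", "amount": "7"},
]

def extract():
    """Extraction: read the raw records from the source system."""
    return list(source_rows)

def transform(rows):
    """Transformation: normalize names, parse numbers, drop bad records."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip records whose amount cannot be parsed
        cleaned.append((int(row["id"]), row["name"].strip().title(), amount))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
loaded = conn.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
```

Here the transformation step silently drops the bad record; whether to drop, repair, or quarantine such records is a policy decision that the sketch does not settle.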

Data Warehouse Users
Internal users
  – Employees
  – Managerial
External users
  – Reporting and auditing
  – Research

Data Mining
Databases to be mined
Knowledge to be mined
Techniques used
Applications supported

Data Mining Tasks
Prediction Tasks
  – Use some variables to predict unknown or future values of other variables
Description Tasks
  – Find human-interpretable patterns that describe the data

Common Tasks
Classification [Predictive]
Clustering [Descriptive]
Association Rule Mining [Descriptive]
Sequential Pattern Mining [Descriptive]
Regression [Predictive]
Deviation Detection [Predictive]
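
To make one of the descriptive tasks concrete, the sketch below computes the two standard measures used in association rule mining, support and confidence, for a single rule. The market-basket data and the rule {bread} → {milk} are made up for illustration.

```python
# Hypothetical transactions; each basket is a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the fraction that also
    contain the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)

rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

A mining algorithm would search for all rules above chosen support and confidence thresholds; this sketch only evaluates one rule that is already given.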

Security for Data Warehousing
Establish the organization's security policies and procedures
Implement logical access control
Restrict physical access
Establish internal control and auditing

Security for Data Warehousing (cont.)
Security Issues in Data Warehousing and Data Mining: Panel Discussion
Panel discussion of Bhavani Thuraisingham, The MITRE Corporation; Linda Schlipper, The MITRE Corporation; Pierangela Samarati, SRI International; T. Y. Lin, San Jose State University; Sushil Jajodia, George Mason University; Chris Clifton, The MITRE Corporation, xanadu.cs.sjsu.edu/~tylin/publications/pape rList/109_ security.ps

Integrity
Poor quality data: inaccurate, incomplete, missing meta-data
Source data quality vs. derived data quality

Access Control
Layered defense:
  – Access to processes that extract operational data
  – Access to data and processes that transform operational data
  – Access to data and meta-data in the warehouse
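
One way to picture the layered defense is a separate permission check at each layer, so that access to the warehouse does not imply access to the extraction or transformation processes. The principals, permission names, and layer names below are hypothetical; the slides do not prescribe a particular model.

```python
# Hypothetical permission assignments per principal.
PERMISSIONS = {
    "etl_service": {"extract", "transform", "load"},
    "analyst": {"query_warehouse"},
}

# Hypothetical mapping from each layer to the permission it requires.
LAYER_REQUIREMENTS = {
    "extraction": "extract",
    "transformation": "transform",
    "warehouse": "query_warehouse",
}

def is_allowed(principal, layer):
    """Return True only if the principal holds the permission for this layer.
    Unknown principals hold no permissions, so they are denied by default."""
    required = LAYER_REQUIREMENTS[layer]
    return required in PERMISSIONS.get(principal, set())

analyst_can_query = is_allowed("analyst", "warehouse")
analyst_can_extract = is_allowed("analyst", "extraction")
```

Denying by default for unknown principals is a design choice made in this sketch, not something stated in the slides.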

Access Control Issues
Mapping from local to warehouse policies
How to handle “new” data
Scalability
Identity management

Inference Problem
Data Mining: discover “new knowledge” → how to evaluate security risks?
Example security risks:
  – Prediction of sensitive information
  – Misuse of information
Assurance of “discovery”
Interesting Read: C. C. Aggarwal and P. S. Yu, Privacy-Preserving Data Mining: Models and Algorithms

Privacy
Large volume of private (personal) data
Need:
  – Proper acquisition, maintenance, usage, and retention policy
  – Integrity verification
  – Control of analysis methods (aggregation may reveal sensitive data)
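
The remark that aggregation may reveal sensitive data can be illustrated with a simple differencing attack: two aggregate queries that each look harmless together disclose one individual's value. The salary figures and the `total_salary` query interface are hypothetical.

```python
# Hypothetical sensitive data; users may only ask for aggregate totals.
salaries = {"alice": 50000, "bob": 62000, "carol": 58000}

def total_salary(names):
    """Aggregate query: total salary over a set of people. No single
    salary is ever returned directly."""
    return sum(salaries[name] for name in names)

# Two permitted aggregate queries whose groups differ by one person...
everyone = total_salary(["alice", "bob", "carol"])
without_bob = total_salary(["alice", "carol"])

# ...reveal that person's salary by subtraction.
inferred_bob = everyone - without_bob
```

This is why controlling which analyses may be run, and on which groups, matters even when raw records are never exposed.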

Privacy
What is the difference between confidentiality and privacy?
Identity, location, activity, etc.
Anonymity vs. accountability

Legislation
Privacy Act of 1974, U.S. Department of Justice ( )
Family Educational Rights and Privacy Act (FERPA), U.S. Department of Education ( dex.html )
Health Insurance Portability and Accountability Act of 1996 (HIPAA) ( http://en.wikipedia.org/wiki/Health_Insurance_Portability_and_Accountability_Act )
Telecommunications Consumer Privacy Act ( communications-privacy-act )

Online Social Network
Social relationship
Communication context changes social relationships
Social relationships maintained through different media grow at different rates and to different depths
No clear consensus on which medium is best

Internet and Social Relationships
Internet:
  – Bridges distance at a low cost
  – New participants tend to “like” each other more
  – Less stressful than face-to-face meeting
  – People focus on communicating their “selves” (except a few malicious users)

Social Network
Description of the social structure between actors
Connections: various levels of social familiarities, e.g., from casual acquaintance to close familial bonds
Support online interaction and content sharing

Social Network Analysis
The mapping and measuring of relationships and flows between people, groups, organizations, computers or other information processing entities
Behavioral Profiling
Note: Social Network Signatures
  – User names may change, family and friends are more difficult to change
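
The idea of a social network signature can be sketched as follows: even if a user re-registers under a new name, the overlap between the new account's contacts and a previously observed contact list may point back to the old identity. Jaccard similarity is used here as one plausible overlap measure; it is an assumption of this sketch, as are all account names.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical contact lists observed earlier under old user names.
known_profiles = {
    "old_alice": {"bob", "carol", "dave", "erin"},
    "old_frank": {"gina", "hank", "ivan"},
}

# Contact list of a freshly created account with an unrelated user name.
new_account_friends = {"bob", "carol", "dave", "zoe"}

# The known profile whose contacts overlap most is the likely match.
best_match = max(
    known_profiles,
    key=lambda name: jaccard(known_profiles[name], new_account_friends),
)
best_score = jaccard(known_profiles[best_match], new_account_friends)
```

A high score is only evidence of a match, not proof; a real analysis would need a threshold and would have to account for people who simply share many friends.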

Interesting Read: M. Chew, D. Balfanz, B. Laurie, (Under)mining Privacy in Social Networks, oc/summary?doi=

Next
Hippocratic Databases

Next Class
Stream Data