Published by Rolf Cole. Modified over 8 years ago.
1
Data Warehousing, Data Mining, Privacy
2
Reading
Farkas, CSCE 824 - Spring 2011
3
Data Warehousing
Repository of data providing organized and cleaned enterprise-wide data (obtained from a variety of sources) in a standardized format
  – Data mart (single subject area)
  – Enterprise data warehouse (integrated data marts)
  – Metadata
4
OLAP Analysis
Aggregation functions
Factual data access
Complex criteria
Visualization
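As an illustration of aggregation functions combined with selection criteria, here is a minimal roll-up sketch over a tiny in-memory fact table. The table contents, the dimension names, and the `roll_up` helper are all hypothetical and are not taken from the lecture; a real OLAP engine would do this with cube structures or SQL `GROUP BY`.

```python
from collections import defaultdict

# Hypothetical fact table: (region, product, quarter, sales_amount).
FACTS = [
    ("East", "Widget", "Q1", 100),
    ("East", "Widget", "Q2", 150),
    ("East", "Gadget", "Q1", 80),
    ("West", "Widget", "Q1", 120),
    ("West", "Gadget", "Q2", 60),
]
DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, group_by, where=lambda row: True):
    """Sum the sales measure per combination of the `group_by` dimensions.

    `where` is an optional row filter, standing in for selection criteria.
    """
    positions = [DIMENSIONS.index(name) for name in group_by]
    totals = defaultdict(int)
    for row in facts:
        if where(row):
            key = tuple(row[i] for i in positions)
            totals[key] += row[-1]  # the last column is the measure
    return dict(totals)


# Roll up to the region level (aggregating away product and quarter).
by_region = roll_up(FACTS, ["region"])

# Slice on quarter == "Q1", then aggregate per product.
q1_by_product = roll_up(FACTS, ["product"], where=lambda r: r[2] == "Q1")
```

Grouping by fewer dimensions corresponds to rolling up; adding dimensions to `group_by` corresponds to drilling down.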
5
Warehouse Evaluation
Enterprise-wide support
Consistency and integration across diverse domains
Security support
Support for operational users
Flexible access for decision makers
6
Data Integration
Data access
Data federation
Change capture
Need ETL (extraction, transformation, load)
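The ETL step can be sketched as three small functions. Everything below is an assumption made for illustration: the two source layouts, the country-code mapping, and the `customer` table are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Two hypothetical operational sources with inconsistent formats.
CRM_ROWS = [{"id": 1, "name": " Alice ", "country": "us"}]
BILLING_ROWS = [(2, "BOB", "United States")]

# Hypothetical standardization table for the country attribute.
COUNTRY_CODES = {"us": "US", "united states": "US"}


def extract():
    """Extraction: pull rows from each source into one common tuple shape."""
    for row in CRM_ROWS:
        yield row["id"], row["name"], row["country"]
    for row in BILLING_ROWS:
        yield row


def transform(rows):
    """Transformation: clean names and map countries to a standard code."""
    for customer_id, name, country in rows:
        code = COUNTRY_CODES.get(country.strip().lower(), "UNKNOWN")
        yield customer_id, name.strip().title(), code


def load(rows, conn):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer "
        "(id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
loaded = conn.execute(
    "SELECT id, name, country FROM customer ORDER BY id"
).fetchall()
```

Change capture is not shown here; this sketch reloads every row on each run.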
7
Data Warehouse Users
Internal users
  – Employees
  – Managerial
External users
  – Reporting and auditing
  – Research
8
Data Mining
Databases to be mined
Knowledge to be mined
Techniques used
Applications supported
9
Data Mining Tasks
Prediction tasks
  – Use some variables to predict unknown or future values of other variables
Description tasks
  – Find human-interpretable patterns that describe the data
10
Common Tasks
Classification [Predictive]
Clustering [Descriptive]
Association Rule Mining [Descriptive]
Sequential Pattern Mining [Descriptive]
Regression [Predictive]
Deviation Detection [Predictive]
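To make one of the descriptive tasks concrete, the sketch below computes the two standard measures used in association rule mining, support and confidence, for a single rule. The market-basket transactions are made up for illustration, and no frequent-itemset search (such as Apriori) is attempted.

```python
# Hypothetical market-basket transactions.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) as a ratio of supports.

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Rule under test: {diapers} -> {beer}
rule_support = support({"diapers", "beer"}, TRANSACTIONS)
rule_confidence = confidence({"diapers"}, {"beer"}, TRANSACTIONS)
```

Here the rule holds in 2 of 4 transactions (support 0.5) and in 2 of the 3 transactions that contain diapers (confidence 2/3).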
11
Security for Data Warehousing
Establish the organization's security policies and procedures
Implement logical access control
Restrict physical access
Establish internal control and auditing
12
Security for Data Warehousing (cont.)
Security Issues in Data Warehousing and Data Mining: Panel Discussion
Panel discussion of Bhavani Thuraisingham, The MITRE Corporation; Linda Schlipper, The MITRE Corporation; Pierangela Samarati, SRI International; T. Y. Lin, San Jose State University; Sushil Jajodia, George Mason University; Chris Clifton, The MITRE Corporation, xanadu.cs.sjsu.edu/~tylin/publications/paperList/109_security.ps
13
Integrity
Poor quality data: inaccurate, incomplete, missing meta-data
Source data quality vs. derived data quality
14
Access Control
Layered defense:
  – Access to processes that extract operational data
  – Access to data and processes that transform operational data
  – Access to data and meta-data in the warehouse
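One way to picture the layered defense is a separate allow-list per layer, with access denied by default. The layer names, role names, and policy table below are hypothetical and chosen only to mirror the three layers listed on the slide.

```python
# Hypothetical policy: which roles may touch each layer of the pipeline.
LAYER_POLICY = {
    "extract": {"etl_service"},
    "transform": {"etl_service", "data_engineer"},
    "warehouse": {"data_engineer", "analyst"},
}


def is_allowed(role, layer):
    """Default-deny check: unknown layers and unlisted roles are refused."""
    return role in LAYER_POLICY.get(layer, set())


def reachable_layers(role):
    """List the layers a role may access, in pipeline order."""
    return [layer for layer in LAYER_POLICY if is_allowed(role, layer)]


analyst_layers = reachable_layers("analyst")
engineer_layers = reachable_layers("data_engineer")
```

In this sketch an analyst can query the warehouse but cannot reach the extraction or transformation processes, so a compromised analyst account does not expose the operational sources directly.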
15
Access Control Issues
Mapping from local to warehouse policies
How to handle “new” data
Scalability
Identity management
16
Inference Problem
Data mining discovers “new knowledge”: how to evaluate the security risks?
Example security risks:
  – Prediction of sensitive information
  – Misuse of information
Assurance of “discovery”
Interesting read: C. C. Aggarwal and P. S. Yu, Privacy-Preserving Data Mining: Models and Algorithms, http://charuaggarwal.net/toc.pdf
17
Privacy
Large volume of private (personal) data
Need:
  – Proper acquisition, maintenance, usage, and retention policy
  – Integrity verification
  – Control of analysis methods (aggregation may reveal sensitive data)
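The remark that aggregation may reveal sensitive data can be illustrated with a differencing attack: two permitted SUM queries whose difference isolates one person. The salary table and the `guarded_total` size threshold are hypothetical; the threshold is shown only to make the point that a minimum query-set size by itself does not stop this attack.

```python
# Hypothetical sensitive table: individual salaries must not be disclosed.
SALARIES = {"ann": 50, "bob": 60, "cat": 70, "dan": 90}


def total(names):
    """Aggregate query: sum of salaries over a set of people."""
    return sum(SALARIES[name] for name in names)


def guarded_total(names, min_size=3):
    """Same query, but refuse query sets smaller than `min_size`."""
    if len(names) < min_size:
        raise PermissionError("query set too small")
    return total(names)


everyone = set(SALARIES)

# Both queries cover at least three people, so both pass the size check,
# yet their difference is exactly one individual's salary.
leaked = guarded_total(everyone) - guarded_total(everyone - {"dan"})

# Asking for the individual directly is blocked.
try:
    guarded_total({"dan"})
    direct_query_blocked = False
except PermissionError:
    direct_query_blocked = True
```

This is why control of analysis methods usually needs more than per-query checks, for example tracking overlap between queries or perturbing the results.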
18
Privacy
What is the difference between confidentiality and privacy?
Identity, location, activity, etc.
Anonymity vs. accountability
19
Legislation
Privacy Act of 1974, U.S. Department of Justice (http://www.usdoj.gov/oip/04_7_1.html)
Family Educational Rights and Privacy Act (FERPA), U.S. Department of Education (http://www.ed.gov/policy/gen/guid/fpco/ferpa/index.html)
Health Insurance Portability and Accountability Act of 1996 (HIPAA) (http://en.wikipedia.org/wiki/Health_Insurance_Portability_and_Accountability_Act)
Telecommunications Consumer Privacy Act (http://www.answers.com/topic/electronic-communications-privacy-act)
20
Online Social Network
Social relationship
Communication context changes social relationships
Social relationships maintained through different media grow at different rates and to different depths
No clear consensus on which medium is best
21
Internet and Social Relationships
Internet
Bridges distance at a low cost
New participants tend to “like” each other more
Less stressful than face-to-face meeting
People focus on communicating their “selves” (except a few malicious users)
22
Social Network
Description of the social structure between actors
Connections: various levels of social familiarity, e.g., from casual acquaintance to close familial bonds
Support online interaction and content sharing
23
Social Network Analysis
The mapping and measuring of relationships and flows between people, groups, organizations, computers, or other information processing entities
Behavioral profiling
Note: social network signatures
  – User names may change; family and friends are more difficult to change
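The note on social network signatures can be illustrated by comparing contact sets: an account that changes its user name but keeps most of its connections remains easy to link. The accounts, contact names, and the choice of Jaccard similarity as the matching measure are assumptions made for this sketch, not a method given in the lecture.

```python
# Hypothetical accounts and their contact sets.
CONTACTS = {
    "old_handle": {"mom", "dad", "sis", "coworker"},
    "new_handle": {"mom", "dad", "sis", "gym_buddy"},
    "stranger": {"x", "y", "mom"},
}


def jaccard(a, b):
    """Overlap of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(account, candidates):
    """Return the candidate whose contact set is most similar to `account`."""
    return max(
        candidates,
        key=lambda other: jaccard(CONTACTS[account], CONTACTS[other]),
    )


same_person_score = jaccard(CONTACTS["old_handle"], CONTACTS["new_handle"])
stranger_score = jaccard(CONTACTS["old_handle"], CONTACTS["stranger"])
linked = best_match("old_handle", ["new_handle", "stranger"])
```

The renamed account shares 3 of 5 distinct contacts with the old one (score 0.6), against 1 of 6 for the unrelated account, so the contact set acts as a signature even though the user name changed.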
24
Interesting Read:
M. Chew, D. Balfanz, B. Laurie, (Under)mining Privacy in Social Networks, http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.4468
25
Next
Hippocratic Databases
26
Next Class
Stream Data