Categorization Ethics: Questions about Truth, Privacy and Big Data

Presentation transcript:

Categorization Ethics: Questions about Truth, Privacy and Big Data Joseph Busch

Categorization overview
Classification goals: make sense, clear perception, trust
Classification bias: likes/dislikes, comfort/fear, culture, family, education

Statistical Bias
Is bias inherent?
Statistical bias: sampling error, measurement error
Epidemiology: selection bias
Media: source omission
Machine learning: unsupervised analysis

ProPublica: “Breaking the Black Box: How Machines Learn to Be Racist”
Jeff Larson, Julia Angwin and Terry Parris Jr. (October 19, 2016) https://www.propublica.org/article/breaking-the-black-box-how-machines-learn-to-be-racist?word=Trump

Inherent bias BIAS

How does automated categorization work?
[Figure: diagrams combining the categories CAT and CHAIR with the Boolean operators OR, AND and NOT, for example CAT OR CHAIR, CAT AND CHAIR and CAT NOT CHAIR.]
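
The Boolean combinations of CAT and CHAIR on this slide can be sketched in a few lines of Python. This is only an illustration, not the method used in the presentation: the `terms` and `matching_rules` helpers and the `RULES` table are hypothetical names, and a document is assumed to "contain" a category when the exact lowercase word appears in it.

```python
import re

def terms(text):
    """Return the set of lowercase word tokens found in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

# Hypothetical rule table: each rule is a predicate over a document's term set.
RULES = {
    "CAT OR CHAIR": lambda t: "cat" in t or "chair" in t,
    "CAT AND CHAIR": lambda t: "cat" in t and "chair" in t,
    "CAT NOT CHAIR": lambda t: "cat" in t and "chair" not in t,
}

def matching_rules(text):
    """List the names of every rule that the text satisfies."""
    t = terms(text)
    return [name for name, rule in RULES.items() if rule(t)]

print(matching_rules("The cat sat on the chair."))
print(matching_rules("A cat slept."))
```

Exact word matching is the simplest possible choice here; plurals such as "cats" would not match, which is one small way a categorizer's rules decide what gets counted.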

Natural language processing enables automated categorization
Feature extraction: identify nouns and noun phrases
Tokenization: count occurrences and co-occurrences
Weighting: weight term-frequency (tf) counts by document length (dl)
Output: tag documents in the collection, query, deliver IR services
[Figure: text collections feed NLP, which feeds automated categorization.]
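
The pipeline on this slide (tokenize, count occurrences, weight term frequencies by document length, tag documents) can be sketched as follows. This is a minimal illustration under stated assumptions, not the presenter's system: noun-phrase identification and co-occurrence counting are omitted, every non-stopword token is treated as a candidate term, and `STOPWORDS`, `CATEGORY_TERMS`, the function names and the `threshold` value are all hypothetical.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "a", "an", "on", "of", "and", "is", "in", "to"}

def tokenize(text):
    """Split text into lowercase word tokens, dropping stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def weighted_tf(text):
    """Term-frequency counts divided by document length (in tokens)."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    dl = len(tokens)
    return {term: n / dl for term, n in counts.items()} if dl else {}

def tag(text, category_terms, threshold=0.1):
    """Tag a document with each category whose summed term weight reaches the threshold."""
    weights = weighted_tf(text)
    tags = []
    for category, vocab in category_terms.items():
        score = sum(weights.get(term, 0.0) for term in vocab)
        if score >= threshold:
            tags.append(category)
    return tags

# Hypothetical category vocabularies.
CATEGORY_TERMS = {"animals": {"cat", "dog"}, "furniture": {"chair", "table"}}

print(tag("The cat sat on the chair and the cat purred", CATEGORY_TERMS))
```

Both the vocabularies and the threshold are choices made by whoever builds the categorizer, so they are also places where classification bias can enter.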

GDPR Article 5
Article 5 places important restrictions on commercial uses of personally identifying information (PII), including aggregated personal information, that has not been explicitly collected for a particular and personally approved purpose.
Restricts the nature of collections used for machine learning by excluding anything that might be PII.
Permits PII to be collected for specified, explicit and legitimate purposes.
Does not permit further processing beyond those purposes except “for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes.”
Does not apply to processing of public or published content collections such as news stories or Wikipedia articles.

Does GDPR have an impact on classification bias?
GDPR requires that personally identifying information be accurate and that, if an individual requests it, PII be corrected or deleted.
GDPR could have an unintended impact on selection bias: allowing deletion of PII can lead to incomplete or inadequate representation of a selection class.
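
The selection-bias concern can be made concrete with a toy calculation. The sketch below is purely illustrative: the ten records, the group labels "A" and "B", the erased ids and the helper names are invented for the example and do not come from any real data set.

```python
from collections import Counter

def class_shares(records):
    """Fraction of records that belongs to each group."""
    counts = Counter(r["group"] for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def erase(records, erased_ids):
    """Drop the records whose owners asked for deletion."""
    return [r for r in records if r["id"] not in erased_ids]

# Hypothetical collection: six records in group A, four in group B.
records = [{"id": i, "group": "A" if i < 6 else "B"} for i in range(10)]

before = class_shares(records)
# Assume three of the four group-B individuals request deletion.
after = class_shares(erase(records, {6, 7, 8}))

print(before)
print(after)
```

In this made-up case group B falls from four of ten records to one of seven, so a model trained on what remains would see that class far less often than it occurs in the original collection.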

Summary
GDPR provides some guidelines for aggregation of personally identifying information, but not on categorization bias itself.
For information aggregators and information analyzers, the guidelines for appropriate behavior are not always clear.
When errors and bias are commonly held, this can be reflected in the information ecology.
The responsibility for outcomes that result from errors and bias is not clear.

Discussion
Truth
Are morals subjective (like ice cream preference) or are they objective (like insulin)?
Do we create moral truth or discover it?
When morality is reduced to personal tastes, people exchange the question, “What is good?” for the pleasure question, “What feels good?”
Why is lying wrong? What harm do lies do? When is it OK to lie?
Big Data
What data is being collected, and about whom?
Who's being left out of that kind of data collection?
Who makes the decisions about what is being done with that data?
How much can we rely on it?

Resources
Jeff Catlin. “The Role of Artificial Intelligence in Ethical Decision Making.” Forbes Technology Council. (Dec 21, 2017) https://www.forbes.com/sites/forbestechcouncil/2017/12/21/the-role-of-artificial-intelligence-in-ethical-decision-making/#7d94a54f21dc
ProPublica. “Breaking the Black Box” series:
Julia Angwin, Terry Parris Jr. and Surya Mattu. “What Facebook Knows About You.” (September 28, 2016) https://www.propublica.org/article/breaking-the-black-box-what-facebook-knows-about-you
Julia Angwin, Terry Parris Jr. and Surya Mattu. “When Algorithms Decide What You Pay.” (October 5, 2016) https://www.propublica.org/article/breaking-the-black-box-when-algorithms-decide-what-you-pay
Julia Angwin, Terry Parris Jr., Surya Mattu and Seongtaek Lim. “When Machines Learn by Experimenting on Us.” (October 12, 2016) https://www.propublica.org/article/breaking-the-black-box-when-machines-learn-by-experimenting-on-us
Jeff Larson, Julia Angwin and Terry Parris Jr. “How Machines Learn to Be Racist.” (October 19, 2016) https://www.propublica.org/article/breaking-the-black-box-how-machines-learn-to-be-racist?word=Trump
Seth Earley. “The Problem with AI.” IT Professional 19(4) (July-Aug 2017), pp. 63-67. https://www.computer.org/csdl/mags/it/2017/04/mit2017040063.html

More Resources
Olivia Solon. “The Rise of ‘Pseudo-AI’: How Tech Firms Quietly Use Humans to Do Bots’ Work.” The Guardian (July 6, 2018) https://www.theguardian.com/technology/2018/jul/06/artifial-intelligence-ai-humans-bots-tech-companies/
Jelani Harper. “The Global Expansion of Master Data Management.” Information Management (January 31, 2018) https://www.information-management.com/opinion/the-global-expansion-of-master-data-management

Questions
Joseph Busch
jbusch@taxonomystrategies.com
joseph@semanticstaffing.com
(m) 415-377-7912