Objectivity of the Aleksandr Sinayev PhD Candidate, Quantitative Psychology Ohio State University.


About Me
Quantitative psychologist
Personally interested in applying statistical models to gain insight in any area

What Is Objectivity?
Traditional objectivity (report the truth)
  – Content
Pragmatic objectivity (Ward, 1999): reports are empirically valid and coherent
  – Content

What Is Objectivity?
Objectivity as pretense (Tuchman, 1972)
  – Form (and relationships)

Empirical Investigation
Can the form and content of objectivity be reliably measured and compared across content areas?

How Can We Say an Article is Subjective?
Could identify subjective elements according to the definition and prior work

How Can We Say an Article is Subjective?
Automated approach
A lot of data are available on movies

The Data
2,000 reviews
  – See Pang, Lee, & Vaithyanathan, 2002
  – 1,000 positive reviews and 1,000 negative
2,000 synopses
  – See Bamman, O’Connor, & Smith, 2013
All New York Times articles available online

Preprocessing
Common non-diagnostic words deleted
  – E.g., ‘a’, ‘on’
Numbers were changed to generic features
  – ‘1,023’ => ‘4digitnumber’
  – ‘4.8’ => ‘singledigitwdecimal’
  – Effort made to identify years, dates, etc.
Features were words and word bigrams
Articles were units of analysis
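The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the author's code: the stop-word list, the tokenizer regex, the feature names for cases the slides do not show (for example a single digit without a decimal), and the choice to build bigrams after stop-word removal are all assumptions. Year and date detection is not reproduced.

```python
import re

# Hypothetical stop-word list; the slides give only 'a' and 'on' as examples.
STOP_WORDS = {"a", "an", "on", "the", "of"}

# Assumed tokenizer: numbers (with optional thousands separators and decimals) or words.
TOKEN = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?|[a-z]+(?:'[a-z]+)?")


def number_feature(token):
    """Map a numeric token to a generic feature name.

    Reproduces the two examples from the slides:
    '1,023' -> '4digitnumber' and '4.8' -> 'singledigitwdecimal'.
    Names for other cases are guesses that follow the same pattern.
    """
    whole, _, frac = token.replace(",", "").partition(".")
    name = "singledigit" if len(whole) == 1 else f"{len(whole)}digitnumber"
    return name + "wdecimal" if frac else name


def features(text):
    """Return unigram and bigram features for one article."""
    tokens = []
    for tok in TOKEN.findall(text.lower()):
        if tok in STOP_WORDS:
            continue  # drop common non-diagnostic words
        tokens.append(number_feature(tok) if tok[0].isdigit() else tok)
    # Word bigrams are built from adjacent surviving tokens (an assumption).
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return tokens + bigrams
```

For instance, `features("I thought it grossed 1,023 on a 4.8 scale")` would contain both `4digitnumber` and the bigram `i_thought`.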

The Classifier
Naïve Bayes
Trained on 1,500 reviews and 1,500 synopses
Tested on 500 reviews and 500 synopses
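A Naïve Bayes classifier over word and bigram features might look like the sketch below. The slides do not say which variant or smoothing was used, so the multinomial model with add-one (Laplace) smoothing, the class name `NaiveBayes`, and the toy training data are assumptions made for illustration only.

```python
import math
from collections import Counter


class NaiveBayes:
    """Multinomial Naive Bayes with additive smoothing (an assumed variant)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant; 1.0 is add-one smoothing

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.feature_counts = {c: Counter() for c in self.classes}
        for feats, label in zip(docs, labels):
            self.feature_counts[label].update(feats)
        self.vocab = set().union(*self.feature_counts.values())
        return self

    def predict_proba(self, feats):
        """Return a dict mapping each class to its posterior probability."""
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            counts = self.feature_counts[c]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[c] / total_docs)  # log prior
            for f in feats:
                if f in self.vocab:  # ignore features never seen in training
                    score += math.log((counts[f] + self.alpha) / denom)
            scores[c] = score
        # Normalize in log space to avoid underflow on long documents.
        top = max(scores.values())
        exp = {c: math.exp(s - top) for c, s in scores.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}


# Toy data standing in for the 1,500 + 1,500 training documents.
train_docs = [
    ["i_thought", "is_certainly"],
    ["you_will", "i_thought"],
    ["were_hurt", "he_goes"],
    ["were_hurt", "she_finds"],
]
train_labels = ["review", "review", "synopsis", "synopsis"]
model = NaiveBayes().fit(train_docs, train_labels)
probs = model.predict_proba(["i_thought", "you_will"])
```

Under this reading, the probability assigned to the synopsis class would play the role of an objectivity score, and the probability of the review class the role of a subjectivity score.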

Examples
            “I thought”   “you will”   “is certainly”   “were hurt”
Synopses    [...]         [...]        [...]            [...],113
Reviews     985           1,757        1,265            400

Did it Work?
Classified 80% of the reviews and synopses it was tested on correctly.
Most (82%) of the NYT articles were classified as certainly subjective
Opinion pieces and editorials were almost always classified as subjective (receiving an average objectivity probability of .01)
Political news articles averaged .15
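The two kinds of numbers reported here, held-out accuracy and the average objectivity probability per section, could be computed along these lines. The function names and the sample values are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict


def accuracy(predicted, actual):
    """Fraction of test documents whose predicted label matches the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def mean_objectivity_by_section(scored_articles):
    """Average objectivity probability for each newspaper section.

    scored_articles is an iterable of (section, probability) pairs.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for section, prob in scored_articles:
        totals[section][0] += prob
        totals[section][1] += 1
    return {section: total / n for section, (total, n) in totals.items()}


# Made-up values for illustration only.
acc = accuracy(
    ["review", "synopsis", "review", "review", "synopsis"],
    ["review", "synopsis", "synopsis", "review", "synopsis"],
)
means = mean_objectivity_by_section(
    [("opinion", 0.01), ("opinion", 0.03), ("politics", 0.15)]
)
```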

Articles Remained Subjective over Time
Local, national and international news

What Else?
Business articles tended to be quite objective (.12)
Science and technology articles were more subjective (.01, .02)

What About Front Page?
Similar results if counting numbers

Positive or Negative?
Trained another classifier like the one above to distinguish between positive and negative reviews, again achieving over 80% accuracy.

Opinion Articles Became More Positive over Time

Other Articles Did Not

Conclusions
Objectivity appears to be measurable through simple word pairs
News articles appear to concentrate on positive subjective judgments, at least inasmuch as they resemble positive reviews
Positivity of articles across time appears to have little to do with positivity of the world across time
  – Dipped in the 90’s (also small dip in subjectivity)

Limitations
Emphasis on form; relationships completely ignored, content partly ignored
Objectivity harder to pin down than subjectivity
Absolute values of numbers to be taken with a grain of salt

Final Remarks
If you are interested and know the literature, help me write this up!