Published by Edgar Short. Modified over 9 years ago.
An Unbiased Measure of Media Bias Using Latent Topic Models
Lefteris Anastasopoulos, Aaron Kaufman and Luke Miratrix
UC Berkeley School of Information, Harvard University Department of Government and Department of Statistics
Contact Email: jasonanastas@ischool.berkeley.edu

Abstract
Systematic attempts to measure partisan bias in the media have tended to focus on estimating presentation bias, also known as “slant” or “spin,” using article keywords, phrases and citations. Groseclose and Milyo (2005) measure media bias by comparing interest group citations in news sources to interest group citations by members of Congress. Gentzkow and Shapiro (2010) measure bias by comparing newspaper article language with language in Congressional speeches. News in the era of Google and the Internet, however, is more prone to a form of agenda-setting bias, sometimes called selection bias, in which partisan narratives are constructed by selecting stories about topics that fit these narratives (Groeling 2013). While the expression of a partisan agenda among pre-Internet news organizations often required journalists to spin the same stories that their competitors were reporting on, contemporary news organizations can more easily write a series of stories that each appear to be “spin-free” yet together form a highly biased partisan narrative. In this paper, we use probabilistic topic models to measure bias among 13 of the top online news sources and rank these organizations accordingly.

Background

Is there Partisan Bias in the Mainstream Media?
Most research using article text says “yes.”
-Groseclose and Milyo (2005) – Liberal/Democratic bias.
-Gentzkow and Shapiro (2010) – Biased toward ideology of readers.
-Quinn and Ho (2008) – Liberal bias.

What is Media Bias?
Presentation Bias
-How something is covered.
-Referred to as “spin.”
Selection Bias
-What is covered.
-AKA “agenda setting bias.”

Presentation and Selection Bias
[Figure] Ferguson Coverage. Source: The Daily Mail
[Figure] Ferguson Coverage. Source: Huffington Post
[Figure] Drudge Report Coverage of “Obamaphones”
[Figure] Huffington Post Coverage of Vegan Diets

Presentation Bias: Traditional Framework
Comparison of words and phrases (n-grams) in known partisan/ideological sources with words and phrases in articles.
-Groseclose and Milyo (2005) – Think tank citations.
-Quinn and Ho (2008) – Supreme Court opinions.
-Gentzkow and Shapiro (2010) – Congressional speeches.
[Figure] Top “Partisan” words as measured using the Congressional Record. Source: Gentzkow and Shapiro (2010)

Selection Bias: Traditional Framework
At any time t, there exists a universe of unobserved stories S_t. News sources, N_dt, are conceptualized as sets of non-random draws from S_t. A news source is a set of stories that has an ideological/partisan valence. The goal was to restrict S_t enough to measure selection bias, e.g., Larcinese, Puglisi and Snyder (2011): S_t = periodically released government reports.

Reframing Media Bias
Bias of a news source d is a function of selection bias SB_d and presentation bias PB_d.

News Sources are Collections of Latent Topics
While the latent universe of potential news stories S_t will always be unknown, Latent Dirichlet Allocation (Blei, Ng and Jordan 2003) allows us to estimate a latent universe of K topics within each news source d over a time period t.

Selection Bias
I = Partisan/ideological “valence” of topic k.
θ_ad = Distribution of topic proportions for article a in news source d.
SB_ad = Selection bias of article a in source d.
E[SB_ad] = Expected selection bias of article a in source d.
[Equations (1)-(7) appear as images on the original poster.]

Estimating Media Bias Using Topic Models

Using LDA to Measure Selection and Presentation Bias

Overview
Corpus – Each of 13 top news website sources: CNN, NYT, Fox, LA Times, USA Today, WashPo, HuffPo, Chicago Sun Times, NY Daily News, ABC News, Wall St. Journal, NBC News.
Documents – Articles within each source over the course of several months.
Topics – Arbitrarily set to K = 30.

Step 1: Extract K = 30 topics from articles within each news source.
[Figure and equation] Graphical model and joint probability distribution for a K = 30 topic model of A articles in a single news source.

Step 2: Identify topics common to all outlets and topics unique to each outlet, using KL divergence to measure similarity between topic distributions (Kim and Oh 2011).
[Equation] Probability distribution over words for topic k in source d.
[Equation] KL divergence measure of similarity between topic k in source d and topic k in source d’.

Step 3: Estimate the partisan valence of each topic using partisan Mechanical Turk workers. Valence is measured as topic “interest” among Republican and Democratic Turkers.

Step 4: Measure absolute and relative selection bias of all top media.
Absolute Selection Bias – Partisan valence of topics common to all media sources.
Relative Selection Bias – Partisan valence of topics unique to each source.

Preliminary Results
[Figure] Labeled topics and top 10 topic words for 13 news sources collected over 1 month.
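Step 2 compares topics across sources by the KL divergence between their word distributions. Since the poster's own equations are not recoverable here, the following is only a minimal sketch of that idea, not the authors' implementation. It assumes each topic is a probability vector over a shared vocabulary. The symmetrized divergence, the smoothing constant `eps`, the `threshold` for calling a topic "common," and the toy distributions are all illustrative assumptions that do not come from the poster.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two word distributions over a shared vocabulary."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same vocabulary")
    # Smooth and renormalize so zero-probability words do not produce log(0).
    # The smoothing constant eps is an assumption made for this sketch.
    p = [x + eps for x in p]
    q = [x + eps for x in q]
    p_sum, q_sum = sum(p), sum(q)
    return sum(
        (pi / p_sum) * math.log((pi / p_sum) / (qi / q_sum))
        for pi, qi in zip(p, q)
    )


def symmetric_kl(p, q):
    """Average of KL(p || q) and KL(q || p); symmetrizing is an assumption."""
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def match_topics(topics_d, topics_d2, threshold=0.1):
    """For each topic in source d, find the nearest topic in source d'.

    Returns a list of tuples (k, nearest_index, divergence, is_common).
    The threshold is a hypothetical cutoff, not a value from the poster.
    """
    matches = []
    for k, p in enumerate(topics_d):
        divs = [symmetric_kl(p, q) for q in topics_d2]
        j = min(range(len(divs)), key=divs.__getitem__)
        matches.append((k, j, divs[j], divs[j] < threshold))
    return matches


if __name__ == "__main__":
    # Toy topic-word distributions over a four-word vocabulary (made up).
    source_d = [[0.70, 0.20, 0.05, 0.05], [0.05, 0.05, 0.20, 0.70]]
    source_d2 = [[0.25, 0.25, 0.25, 0.25], [0.65, 0.25, 0.05, 0.05]]
    for k, j, div, is_common in match_topics(source_d, source_d2):
        label = "common" if is_common else "unique"
        print(f"topic {k} in d -> nearest topic {j} in d' ({div:.4f}, {label})")
```

In a real pipeline the rows would be the fitted topic-word distributions from the per-source topic models of Step 1, and a topic would count as common to all outlets only if it finds a close match in every other source.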