Published by Darrell Marshall. Modified over 6 years ago.
1
Interdisciplinary Legal Research: Data Mining as Preliminary Research (I)
Dana Neacsu
2
Hypothetical 1: If fake news did indeed have a crucial impact on the 2016 presidential election, could we challenge its outcome? On what grounds?
3
ProQuest Statistical Insight
Search CLIO
Provides statistical data from U.S. government publications from 1973, state and private sources from 1980, and international organizations from 1983.
Tabular formats
Wide breadth of formats
Examples:
Evolving Role of News on Twitter and Facebook (Pew Research)
Role of News on Facebook: Common Yet Incidental (Pew)
Beyond Facebook, Small Portions Of The Public Learn About The Elections On Social Media [January 18-27, 2016 Survey] (Page no. 012, Table no. 002)
Internet Activities Of Adults By Geographic Community Type: 2010 To 2013 [Selected Months] (Page no. 750, Table no. 1177) (Statistical Abstract)
4
Election Results
ICPSR
United States General Election Exit Polls Series
American National Election Study: 2016 Pilot Study (ICPSR 36390)
The survey included questions about preferences in the presidential primary, stereotyping, the economy, discrimination, race and racial consciousness, police use of force, and numerous policy issues, such as immigration law, health insurance, and federal spending.
Sample question: "During the past 4 years, have you ever sent a message on Facebook or Twitter about a political issue, or have you not done this in the past 4 years?"
Dave Leip presidential election results, county by county
From DSSC; 2016 not available yet.
5
Polling data
Roper Center for Public Opinion Research
iPoll
Specializes in public opinion surveys and has local, state, national, and international polls
Data on American politics includes coverage of presidential approval, U.S. elections (including exit polls), and Congress
Access polls at question level
Examples:
1) Which one of the following is your main source of political news and information?...Television, Internet, newspaper, radio, social media like Facebook and Twitter, talking with others, do you not really follow politics...And, which is your next major source of political news and information? (NBC)
2) How much of what you post on Facebook is related to politics (including the 2016 elections)? (Pew)
General Social Survey (NORC)
Since 1972
A historical record of the concerns, experiences, attitudes, and practices of residents of the United States
Data Explorer: a variety of variables on internet use and opinions
6
A word on Social Media Data
Searching archival collections will typically require funding and programming ability
Capturing data moving forward may be more feasible
Twitter archiving Google sheet:
A list from NCSU Libraries:
7
Hypothetical 2: If you needed to find homicide data for US cities, how would you obtain it?
Secondary sources – Google searches
Primary sources – Data mining
8
Macro-management of interdisciplinary legal research
What data? From what databases? Free of charge or proprietary?
9
Web Scraping
Extracting and parsing formatted data from a web page (HTML, XHTML, JSON, etc.)
Automated or manual
Python Beautiful Soup
Toolkit for dissecting a document and extracting what you need
It doesn't take much code to write an application
Manages encodings
Sits on top of popular Python parsers like lxml and html5lib
Gathering election results example:
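Independent of the linked example, a minimal sketch of the parsing step might look like the following. It uses Python's standard-library html.parser as a stand-in for Beautiful Soup so that it runs without installing anything. The HTML snippet, the county names, the candidate labels, and the vote counts are all made up for illustration, and the names TableParser and parse_results are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical sample page: made-up counties and vote counts, for illustration only.
SAMPLE_HTML = """
<table id="results">
  <tr><th>County</th><th>Candidate A</th><th>Candidate B</th></tr>
  <tr><td>Example County</td><td>1,200</td><td>950</td></tr>
  <tr><td>Sample County</td><td>800</td><td>1,100</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_results(html):
    """Turn the table into a list of {'county': ..., 'votes': {...}} records."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    results = []
    for county, *counts in body:
        # Strip thousands separators before converting to integers.
        votes = {name: int(n.replace(",", "")) for name, n in zip(header[1:], counts)}
        results.append({"county": county, "votes": votes})
    return results


results = parse_results(SAMPLE_HTML)
for record in results:
    print(record["county"], record["votes"])
```

With Beautiful Soup installed, the same extraction is usually shorter: BeautifulSoup(html, "html.parser") builds the tree, find_all("tr") returns the rows, and get_text() returns each cell's text. For a live page, the HTML string would come from an HTTP request rather than an inline constant.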
10
Web Scraping: Automated tools (no programming)
Import.io
Cloud-based web application
Apparently no longer free, but a free trial is available
Webscraper.io
Chrome browser plug-in, available free
Sitemap building, data extraction, and export are all done within the browser
Have not used it, but there is YouTube:
11
Web Scraping: Premade tools (many applications on GitHub)
Example – NYPD Crash Data Band Aid
On GitHub
NYPD released the data through “idiotically obfuscated PDFs”
The tool is built in Python, on top of xpdf and wget
12
Training and Help
Lynda.com (through the libraries' license)
Python: Programming Efficiently
Codecademy
Python intro, in addition to a variety of web-based APIs
Digital centers in the libraries
Python open lab
R open lab
Collaborators at Columbia University
An appointment-based free consulting service for students and researchers at Columbia University that offers assistance with planning and executing data-driven research projects, including help with data visualization, analysis, and prediction, both in conceptual terms and with concrete software implementations.
13
The research for both hypotheticals is preliminary research necessary to build a case.
Then you move on to see whether that data fits the legal grounds your legal research has yielded.
14
Questions? Dana Neacsu: edn13@Columbia.edu
Many thanks to Eric Glass & Amy Nurnberger!