SoLoGlo: An Archival and Analysis Service for Social, Local, and Global News
Martin Klein, Peter Broadwell, Todd Grappone, Sharon Farb

Presentation transcript:

SoLoGlo: An Archival and Analysis Service for Social, Local, and Global News
Martin Klein, Peter Broadwell, Todd Grappone, Sharon Farb

SoLoGlo
Martin Klein, Peter Broadwell, Todd Grappone, Sharon Farb
#IIPCGA2015, Stanford, CA, April 28th 2015

Digital Ephemera Collections:
  Collected by researchers
  Donated by activists
  Images, audio, video, scanned documents, social media, web server logs

NewsScape:
  >244,000 hours of TV news archived
  Recorded 2005-present
  13 countries, 9 languages
  38 networks
  Searchable by captions, on-screen text, named entities

Social, Local, Global

Twitter’s Contribution:
  Twitter has made full archive of tweets available
  Indexed, searchable
  Not accessible via API
  How about deleted tweets?
  Real-time capture of embedded resources?

Tool Reuse 1/2: Social Feed Manager (Dan Chudnov, GWU)

Tool Reuse 2/2: twarc (Ed Summers, MITH)

How SoLoGlo is different:
  I. Real-time capture and archiving of tweets, referenced URIs, and embedded resources
  II. Rapid analysis, real-time opportunities
  III. Collection-agnostic linking
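To make point I concrete, here is a minimal sketch of pulling referenced URIs and embedded media out of a single tweet so they can be queued for capture. It is not SoLoGlo's code: the function name `extract_resources` is hypothetical, and the field names (`entities.urls[].expanded_url`, `entities.media[].media_url_https`, `extended_entities`) are assumed to follow the Twitter REST API v1.1 tweet JSON.

```python
from typing import Dict, List


def extract_resources(tweet: Dict) -> Dict[str, List[str]]:
    """Collect referenced URIs and embedded media URLs from one tweet dict.

    Assumes Twitter REST API v1.1 style JSON; this is an illustration,
    not the SoLoGlo implementation.
    """
    entities = tweet.get("entities", {})
    # When present, extended_entities lists every attached media item;
    # otherwise fall back to the media list inside entities.
    media_source = tweet.get("extended_entities", entities)

    uris = [
        u["expanded_url"]
        for u in entities.get("urls", [])
        if u.get("expanded_url")
    ]
    media = [
        m["media_url_https"]
        for m in media_source.get("media", [])
        if m.get("media_url_https")
    ]
    return {"uris": uris, "media": media}


# Hypothetical sample tweet, trimmed to the fields used above.
sample_tweet = {
    "id_str": "1",
    "text": "example",
    "entities": {
        "urls": [
            {"url": "https://t.co/abc", "expanded_url": "https://example.org/story"}
        ],
        "media": [{"media_url_https": "https://pbs.twimg.com/media/x.jpg"}],
    },
}
resources = extract_resources(sample_tweet)
```

Each returned list could then be handed to whatever crawler or downloader performs the actual archiving.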

SoLoGlo Collections 1/4:
  Twitter dataset about Egyptian revolution
  >400k tweets
  50k unique users
  Tweets originated from within 200 miles around Cairo
  25% of tweets contain references to external resources (web pages, images, videos, etc.)
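The "within 200 miles around Cairo" criterion can be illustrated with a great-circle distance check. The slides do not say how the radius was applied, so this is only a sketch using the haversine formula; the Cairo coordinates are approximate, and the names `haversine_miles` and `within_radius` are hypothetical.

```python
import math

CAIRO = (30.0444, 31.2357)  # approximate (lat, lon) of central Cairo; an assumption
EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(lat: float, lon: float, center=CAIRO, radius_miles: float = 200.0) -> bool:
    """True if the point lies within radius_miles of center."""
    return haversine_miles(lat, lon, center[0], center[1]) <= radius_miles


# Approximate city coordinates, used only as examples.
alexandria_inside = within_radius(31.2001, 29.9187)
aswan_inside = within_radius(24.0889, 32.8998)
```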

SoLoGlo Collections 1/4 (continued):
  20% of references are dead
  HTTP GET → 200 OK
  HTTP HEAD → 204 No Content
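The slide above pairs HTTP methods with response codes but does not spell out the rule used to call a reference dead. The sketch below is therefore one plausible check, not the authors' method: it assumes a reference is dead when neither a HEAD nor a GET request yields a status below 400. The names `fetch_status` and `is_dead` are hypothetical; the network calls use Python's standard `urllib`.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_status(uri: str, method: str = "HEAD", timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for uri, or None if no response was received."""
    request = urllib.request.Request(uri, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code.
        return err.code
    except (urllib.error.URLError, OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URI.
        return None


def is_dead(uri: str, fetch: Callable[[str, str], Optional[int]] = fetch_status) -> bool:
    """Assumed rule: dead means no status below 400 from either HEAD or GET."""
    status = fetch(uri, "HEAD")
    if status is None or status >= 400:
        # Some servers reject HEAD, so retry with GET before giving up.
        status = fetch(uri, "GET")
    return status is None or status >= 400
```

Passing `fetch` as a parameter keeps the classification rule testable without network access, for example `is_dead("http://example.org/", fetch=lambda uri, method: 404)`.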

SoLoGlo Collections 1/4 (continued):
  60% of these dead references are not archived

SoLoGlo Collections 1/4 (continued): This one is!

Need Another Example?
  URIs from Ed Summers's Ferguson dataset
  pink == not archived (Internet Archive)
  28%
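The slides do not say how archive coverage was measured. As one possible way to test whether a URI has a copy in the Internet Archive, the sketch below builds a query for the Wayback Machine availability endpoint and parses its JSON response. This is a swapped-in technique for illustration only; the helper names `availability_query` and `closest_snapshot` are hypothetical, and the sample responses are made up to match the endpoint's documented shape.

```python
import json
from typing import Optional
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(uri: str) -> str:
    """Build the availability lookup URL for uri."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": uri})


def closest_snapshot(response_body: str) -> Optional[str]:
    """Return the closest snapshot URL from a response body, or None if not archived."""
    data = json.loads(response_body)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Illustrative response bodies (not real lookups).
archived_body = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20150101000000/http://example.org/",
            "timestamp": "20150101000000",
            "status": "200",
        }
    }
})
missing_body = json.dumps({"archived_snapshots": {}})
```

Fetching `availability_query(uri)` and passing the body to `closest_snapshot` gives a per-URI archived or not-archived flag, from which a percentage like the one on the slide could be tallied.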

SoLoGlo Collection Linking

SoLoGlo Collections 2/4:
  Rapid analysis
  #keystoneXL

SoLoGlo Collections 3/4:
  Collection on Charlie Hebdo shooting on 01/07/2015
  >11 million tweets containing #CharlieHebdo or #JeSuisCharlie
  >4.5 million tweets contain embedded media
  4.5 million tweets reference URIs

SoLoGlo Collections 3/4 (continued):
  ~67k tweets contain lat/long coordinates
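Counting geotagged tweets like this comes down to checking each tweet for a point geometry. A minimal sketch, assuming Twitter REST API v1.1 style JSON in which `coordinates` is a GeoJSON `Point` ordered as `[longitude, latitude]`; the function names and sample tweets are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def tweet_lat_lon(tweet: Dict) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) for a geotagged tweet, or None if it has no point."""
    point = tweet.get("coordinates")
    if not point or point.get("type") != "Point":
        return None
    # GeoJSON orders a position as [longitude, latitude].
    lon, lat = point["coordinates"]
    return (lat, lon)


def geotagged(tweets: Iterable[Dict]) -> List[Tuple[float, float]]:
    """Collect (lat, lon) pairs for every tweet that carries coordinates."""
    return [pos for pos in map(tweet_lat_lon, tweets) if pos is not None]


# Hypothetical sample: one geotagged tweet, two without coordinates.
sample_tweets = [
    {"id_str": "1", "coordinates": {"type": "Point", "coordinates": [2.3522, 48.8566]}},
    {"id_str": "2", "coordinates": None},
    {"id_str": "3"},
]
positions = geotagged(sample_tweets)
```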

SoLoGlo Collections 4/4:
  Collection on AirAsia QZ8501 crash on 12/28/[...]
  [...] million tweets containing #AirAsia or #QZ[...]
  [...] million distinct users

SoLoGlo Collections 4/4 (continued):
  #8501qs
  Coined by CNN

SoLoGlo Collections, Encore
