Open data to fight corruption: Training for journalists

Presentation transcript:

Open data to fight corruption: Training for journalists Turkey, 11 November 2015 Jean Brice Tetka Data and Technology Coordinator, People Engagement Programme

What is Open Data? Open data is data that can be freely used, modified and shared by anyone for any purpose. http://blog.okfn.org/2013/10/03/defining-open-data/

key features of openness Availability and access The data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the internet. The data must also be available in a convenient and modifiable form.

key features of openness Reuse and redistribution The data must be provided under terms that permit reuse and redistribution including the intermixing with other datasets. The data must be machine-readable.

key features of openness Universal participation Everyone must be able to use, reuse and redistribute data. There should be no discrimination against fields of endeavour or against persons or groups.

Why open data? Transparency In a well-functioning, democratic society citizens need to know what their government is doing. To do that, they must be able to freely access government data and information and to share that information with other citizens. Transparency isn’t just about access, it is also about sharing and reuse: to understand material it needs to be analyzed and visualised and this requires that the material is open.

Why open data? Releasing social and commercial value In a digital age, data is a key resource for social and commercial activities. Everything from finding your local post office to building a search engine requires access to data, much of which is created or held by government. By opening up data, government can help drive the creation of innovative business and services that deliver social and commercial value.

Why open data? Participation and engagement Open data is needed for participatory governance and for businesses and organisations engaging with their users and audience. Much of the time citizens are only able to engage with their own government sporadically, maybe just at an election every 4 or 5 years. By opening up data, citizens are enabled to be much more directly informed and involved in decision-making. This is more than transparency: it’s about making a full “read/write” society, not just about knowing what is happening in the process of governance but being able to contribute to it.

Why Open Data is important for journalists Journalists can gather, filter and visualise what is happening beyond what the eye can see.

Why Open Data is important for journalists Journalists can analyze the dynamics of a complex situation like riots or political debates, show the fallacies and help everyone to see possible solutions to complex problems.

Why Open Data is important for journalists Journalists can write a strong, beautiful, impactful story and provide informative data visualisation. http://www.transparency.org/

Examples of articles using Open Data Politicians’ health priorities: This article is a mix of text, statistics and a picture https://www.washingtonpost.com/news/the-fix/wp/2015/02/02/why-measles-should-be-the-thing-that-freaks-politicians-out/

Examples of articles using Open Data Women and work: An article with some statistics and visuals http://www.washingtonpost.com/blogs/she-the-people/wp/2014/06/09/women-and-work-opt-out-or-pushed-out-the-story-in-data/

Examples of articles using Open Data How long would it take you to earn a top footballer’s salary? http://www.bbc.com/news/world-31110113

OPEN Data sets Available: TURKEY

National Data sets National Statistics http://www.turkstat.gov.tr/Start.do

National Data sets Government Budget http://www.bumko.gov.tr/EN,2677/statistics.html

National Data sets Legislation http://mevzuat.basbakanlik.gov.tr/KHK.aspx

National Data sets Company Register http://www.ticaretsicilgazetesi.gov.tr/english/sorgu_acik.php

International Data sets The World Bank http://data.worldbank.org/country/turkey

International Data sets The Organisation for Economic Co-operation and Development (OECD) https://data.oecd.org/turkey.htm

International Data sets Transparency International http://www.transparency.org/country/#TUR_DataResearch_SurveysIndices

Practice

Practice: Collect and Clean Data How to Collect Data Before you start collecting data you should be really clear about the topic you want to present. Identify which variables you need to present that topic. Check who collects those variables. These can be agencies or organisations, such as governments or corporations. You can also collect data from other journalists.

Practice: Collect and Clean Data How to Clean Data The information you find may contain omissions or be misleading. Before you start cleaning your data, save the original version and work on a copy. If your data is not in an Excel format, convert it to obtain a table that you can work on. Check for duplicate entries and empty entries. You can create default values where no information was held. Correct formatting errors (e.g. words instead of numbers). Standardise your data by assigning the same value to information meaning the same thing (e.g. BBC, B.B.C. and British Broadcasting Corporation). Fill in missing data from other sources, such as another dataset or a newspaper.
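
The cleaning steps above can also be sketched in a few lines of Python, as an alternative to doing them by hand in a spreadsheet. This is only an illustration under stated assumptions: the sample rows, the column names (`source`, `amount`), the `ALIASES` table and the default value `"unknown"` are all invented for the example and do not come from any real dataset.

```python
import copy

# Hypothetical raw rows; in practice these would come from your spreadsheet.
raw_rows = [
    {"source": "BBC", "amount": "1,200"},
    {"source": "B.B.C.", "amount": "300"},
    {"source": "British Broadcasting Corporation", "amount": "twelve"},
    {"source": "BBC", "amount": "1,200"},   # exact duplicate of the first row
    {"source": "", "amount": "50"},         # empty entry
]

# Assumed lookup table: every known spelling maps to one standard name.
ALIASES = {
    "bbc": "BBC",
    "b.b.c.": "BBC",
    "british broadcasting corporation": "BBC",
}


def parse_amount(text):
    """Turn '1,200' into 1200; return None when the cell holds words."""
    try:
        return int(text.replace(",", "").strip())
    except ValueError:
        return None


def clean(rows):
    rows = copy.deepcopy(rows)      # work on a copy, keep the original intact
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:             # drop exact duplicate entries
            continue
        seen.add(key)
        name = row["source"].strip()
        # Standardise the name, or use a default where nothing was recorded.
        row["source"] = ALIASES.get(name.lower(), name) or "unknown"
        # None marks a formatting error that needs checking by hand.
        row["amount"] = parse_amount(row["amount"])
        cleaned.append(row)
    return cleaned


cleaned_rows = clean(raw_rows)
for row in cleaned_rows:
    print(row)
```

Rows whose `amount` comes back as `None` are not silently corrected; they are flagged so that the missing figure can be looked up in another source.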

Practice: Collect and Clean Data Some tools Using OpenRefine to clean messy data: OpenRefine is downloadable software that can quickly sort and reconcile the imperfections in real-world data. http://openrefine.org/ Convert PDFs: Convert PDF documents into usable spreadsheets with Tabula. http://tabula.technology/

Practice: Analyse Data Analysing is about applying statistical techniques to the data. A popular tool is Excel, where you can use simple spreadsheet functions such as sort and filter functions and pivot tables.
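
For readers who prefer code to a spreadsheet, the same three operations (sort, filter, pivot table) can be sketched in plain Python. The records below, including the `ministry`, `year` and `amount` fields and every figure, are invented purely for illustration.

```python
from collections import defaultdict

# Hypothetical spending records (invented figures, for illustration only).
records = [
    {"ministry": "Health", "year": 2014, "amount": 120},
    {"ministry": "Defence", "year": 2014, "amount": 300},
    {"ministry": "Health", "year": 2015, "amount": 150},
    {"ministry": "Defence", "year": 2015, "amount": 280},
    {"ministry": "Health", "year": 2015, "amount": 30},
]

# Sort: largest amounts first.
by_amount = sorted(records, key=lambda r: r["amount"], reverse=True)

# Filter: keep only the 2015 rows.
only_2015 = [r for r in records if r["year"] == 2015]

# Pivot table: ministries as rows, years as columns, summed amounts as values.
pivot = defaultdict(lambda: defaultdict(int))
for r in records:
    pivot[r["ministry"]][r["year"]] += r["amount"]

years = sorted({r["year"] for r in records})
print("ministry", *years)
for ministry in sorted(pivot):
    print(ministry, *(pivot[ministry][y] for y in years))
```

The pivot step does what an Excel pivot table does with "Sum of amount": it groups rows by two fields and adds up the values in each cell.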

Practice: Analyse Data Tutorials https://www.youtube.com/watch?v=peNTp5fuKFg https://www.youtube.com/watch?v=9NUjHBNWe9M https://www.youtube.com/watch?v=g530cnFfk8Y https://www.youtube.com/watch?v=FyggutiBKvU

Practice: Extract pertinent information As a model, we will follow steps similar to those used in criminal investigations. Imagine that you are an inspector. You have a statement that you want to make and you have a lot of information. You will now start tracking down evidence.

Practice: Extract pertinent information Document Everything Keep a detailed record of all the information you have collected; don’t trust your memory. Organise all of your information carefully by topic, date and file type. Example folder labels: Corruption, Mining, Health, Defence; International statistics, National statistics; 2015, 2014; Excel, PDF.
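
One possible way to keep such a record consistent is to derive every file's location from its topic, year and file type, and to log where each file came from. The sketch below assumes a `topic/year/file-type` folder order and uses made-up file names; both are illustrative choices, not a prescribed layout.

```python
from pathlib import PurePosixPath

# Assumed mapping from file extension to a folder name for the file type.
TYPE_FOLDERS = {"xls": "excel", "xlsx": "excel", "pdf": "pdf"}


def archive_path(topic, year, filename):
    """Build a path of the form topic/year/file-type/filename."""
    suffix = PurePosixPath(filename).suffix.lower().lstrip(".")
    file_type = TYPE_FOLDERS.get(suffix, "other")
    return PurePosixPath(topic.lower()) / str(year) / file_type / filename


# Hypothetical files: (topic, year, file name, where it came from).
collected = [
    ("Health", 2015, "budget.xlsx", "national statistics"),
    ("Defence", 2014, "report.pdf", "international statistics"),
]

# The log is the written record, so nothing depends on memory.
log = [
    {"path": str(archive_path(topic, year, name)), "origin": origin}
    for topic, year, name, origin in collected
]

for entry in log:
    print(entry["path"], "<-", entry["origin"])
```

The function only computes paths and does not touch the disk, so it can be tried safely before any files are moved.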

Practice: Extract pertinent information Nail Down the Timeline All information needs to refer to a specific time. You must align all your information to a timeline, to be able to represent the chronology of how information was provided.
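
As a minimal sketch of aligning information to a timeline, the Python below sorts dated items chronologically and measures the gaps between them. The events and their dates are hypothetical, and the ISO `YYYY-MM-DD` date format is an assumption made for the example.

```python
from datetime import date

# Hypothetical events; each one carries the date it refers to.
events = [
    {"when": "2015-03-10", "what": "Contract signed"},
    {"when": "2014-11-02", "what": "Tender announced"},
    {"when": "2015-06-21", "what": "First payment made"},
]

# Put everything in chronological order.
timeline = sorted(events, key=lambda e: date.fromisoformat(e["when"]))

for e in timeline:
    print(e["when"], e["what"])

# Days between consecutive events; long gaps may point to missing information.
gaps = [
    (date.fromisoformat(b["when"]) - date.fromisoformat(a["when"])).days
    for a, b in zip(timeline, timeline[1:])
]
print(gaps)
```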

Practice: Extract pertinent information Follow every lead Now you have to investigate the relationships between pieces of information. Write down all hypotheses arising from your analysis and represent them on a chart showing all the actors involved.
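
Before drawing such a chart, it can help to list which actors appear together in which documents. The following is a rough sketch with invented document names and actors: it builds a list of actor pairs (the "edges" of a relationship chart) and records the documents that support each link.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical documents and the actors each one mentions.
documents = {
    "contract_2015.pdf": ["Ministry A", "Company X"],
    "register_entry": ["Company X", "Person Y"],
    "news_article": ["Ministry A", "Person Y", "Company X"],
}

# links[(a, b)] lists the documents in which both actors appear together.
links = defaultdict(list)
for doc, actors in documents.items():
    for a, b in combinations(sorted(set(actors)), 2):
        links[(a, b)].append(doc)

# Best-supported links first: pairs backed by the most documents.
for pair, docs in sorted(links.items(), key=lambda kv: -len(kv[1])):
    print(pair, len(docs), docs)
```

Appearing in the same document is only a lead, not proof of a relationship; each link still has to be verified. A pair list like this could then be drawn by hand or loaded into a graph tool.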

Practice: Extract pertinent information Verify your new assumptions The most important thing is not just to collect hypotheses or evidence, but to determine what is true. You should be able to justify the statistics or statements you make. Be ready to demonstrate to any audience how you moved from the data to your statement. Try to present your assumptions to a few people before you publish them.

Practice: Extract pertinent information Persevere It can be tricky to navigate multiple datasets to find information. You need to be patient and check whether you need more datasets or technical support to analyse your data. Sometimes, while looking for information to support a statement, you will find something else. The best approach is to write it down and keep looking for what you were originally searching for.

Practice: Extract pertinent information Some tools Graphs, networks, connections and relations: https://gephi.github.io/ Visualise a timeline: http://www.simile-widgets.org/timeline/ Extract text from images (OCR): http://scantailor.org/ A manual for investigative journalists: http://unesdoc.unesco.org/images/0019/001930/193078e.pdf

Sources http://datadrivenjournalism.net/news_and_analysis/from_idea_to_story_planning_the_data_journalism_story http://datajournalismhandbook.org http://onlinejournalismblog.com/2011/07/07/the-inverted-pyramid-of-data-journalism/ https://blog.infogr.am/4-steps-informative-data-visualization/ https://okfn.org/opendata/ http://opendatahandbook.org/guide/en/what-is-open-data/ http://www.datajournalismtools.net/

www.transparency.org facebook.com/transparencyinternational twitter.com/anticorruption blog.transparency.org © 2014 Transparency International. All rights reserved.