Open data to fight corruption: Training for journalists Turkey, 11 November 2015 Jean Brice Tetka Data and Technology Coordinator, People Engagement Programme
What is Open Data? Open data is data that can be freely used, modified and shared by anyone for any purpose. http://blog.okfn.org/2013/10/03/defining-open-data/
Key features of openness Availability and access The data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the internet. The data must also be available in a convenient and modifiable form.
Key features of openness Reuse and redistribution The data must be provided under terms that permit reuse and redistribution including the intermixing with other datasets. The data must be machine-readable.
Key features of openness Universal participation Everyone must be able to use, reuse and redistribute data. There should be no discrimination against fields of endeavour or against persons or groups.
Why open data? Transparency In a well-functioning, democratic society citizens need to know what their government is doing. To do that, they must be able to freely access government data and information and to share that information with other citizens. Transparency isn’t just about access, it is also about sharing and reuse: to understand material it needs to be analyzed and visualised and this requires that the material is open.
Why open data? Releasing social and commercial value In a digital age, data is a key resource for social and commercial activities. Everything from finding your local post office to building a search engine requires access to data, much of which is created or held by government. By opening up data, government can help drive the creation of innovative business and services that deliver social and commercial value.
Why open data? Participation and engagement Open data is needed for participatory governance and for businesses and organisations engaging with their users and audiences. Much of the time citizens are only able to engage with their own government sporadically, maybe just at an election every 4 or 5 years. By opening up data, governments enable citizens to be much more directly informed and involved in decision-making. This is more than transparency: it’s about making a full “read/write” society, not just about knowing what is happening in the process of governance but being able to contribute to it.
Why Open Data is important for journalists Journalists can gather, filter and visualise what is happening beyond what the eye can see.
Why Open Data is important for journalists Journalists can analyze the dynamics of a complex situation like riots or political debates, show the fallacies and help everyone to see possible solutions to complex problems.
Why Open Data is important for journalists Journalists can write a strong, beautiful, impactful story and provide informative data visualisation. http://www.transparency.org/
Examples of articles using Open Data Politicians’ health priorities: This article is a mix of text, statistics and a picture https://www.washingtonpost.com/news/the-fix/wp/2015/02/02/why-measles-should-be-the-thing-that-freaks-politicians-out/
Examples of articles using Open Data Women and work: An article with some statistics and visuals http://www.washingtonpost.com/blogs/she-the-people/wp/2014/06/09/women-and-work-opt-out-or-pushed-out-the-story-in-data/
Examples of articles using Open Data How long would it take you to earn a top footballer’s salary? http://www.bbc.com/news/world-31110113
Open data sets available: Turkey
National Data sets National Statistics http://www.turkstat.gov.tr/Start.do
National Data sets Government Budget http://www.bumko.gov.tr/EN,2677/statistics.html
National Data sets Legislation http://mevzuat.basbakanlik.gov.tr/KHK.aspx
National Data sets Company Register http://www.ticaretsicilgazetesi.gov.tr/english/sorgu_acik.php
International Data sets The World Bank http://data.worldbank.org/country/turkey
International Data sets The Organisation for Economic Co-operation and Development (OECD) https://data.oecd.org/turkey.htm
International Data sets Transparency International http://www.transparency.org/country/#TUR_DataResearch_SurveysIndices
Practice
Practice: Collect and Clean Data How to Collect Data Before you start collecting data you should be really clear about the topic you want to present. Identify what variables you need to present that topic. Check who collects those variables. This can be agencies or organisations, such as governments or corporations. You can also collect data from other journalists.
Practice: Collect and Clean Data How to Clean Data The information you find may have omissions or be misleading. Before you start cleaning your data, save the original version and work on a copy. If your data is not in an Excel format, convert it to obtain a table that you can work on. Check for duplicate entries and empty entries. You can create default values where no information was held. Correct formatting errors (e.g. words instead of numbers). Standardise your data by assigning the same value to information meaning the same thing (e.g. BBC and B.B.C. and British Broadcasting Corporation). Find missing data from other sources, such as other datasets or a newspaper.
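For readers who prefer a script to a spreadsheet, the cleaning steps above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the rows, the column names (`source`, `year`, `amount`) and the alias table are all hypothetical, and `None` is used as the default value for cells that are empty or contain words instead of numbers.

```python
import re

# Hypothetical raw rows, as they might look after converting a file to a table.
raw_rows = [
    {"source": "BBC", "year": "2014", "amount": "1200"},
    {"source": "B.B.C.", "year": "2014", "amount": "1200"},
    {"source": "British Broadcasting Corporation", "year": "2015", "amount": ""},
    {"source": "Reuters", "year": "2015", "amount": "three hundred"},
]

# Assumed mapping from known spellings to one standard value.
ALIASES = {
    "bbc": "BBC",
    "british broadcasting corporation": "BBC",
}


def standardise(name):
    """Give the same value to names that mean the same thing."""
    key = re.sub(r"[^a-z ]", "", name.lower()).strip()
    return ALIASES.get(key, name.strip())


def parse_amount(text):
    """Return an int, or None (the default) for empty or non-numeric cells."""
    text = text.strip()
    return int(text) if re.fullmatch(r"[0-9]+", text) else None


def clean(rows):
    """Standardise names, apply defaults and drop duplicate entries."""
    seen = set()
    cleaned = []
    for row in rows:
        record = (
            standardise(row["source"]),
            int(row["year"]),
            parse_amount(row["amount"]),
        )
        if record in seen:
            continue  # duplicate entry once names are standardised
        seen.add(record)
        cleaned.append(dict(zip(("source", "year", "amount"), record)))
    return cleaned


# The original list is left untouched; we work on a cleaned copy.
cleaned_rows = clean(raw_rows)
```

Cells that end up as `None` are the ones to look up in other sources or correct by hand; the sketch flags them rather than guessing a value.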
Practice: Collect and Clean Data Some tools Using “Open Refine” to clean messy data: Open Refine is downloadable software, which can quickly sort and reconcile the imperfections in real-world data. http://openrefine.org/ Convert PDFs: Convert PDF documents into usable spreadsheets with Tabula. http://tabula.technology/
Practice: Analyse Data Analysing is about applying statistical techniques to the data. A popular tool is Excel, where you can use simple spreadsheet functions such as sort and filter functions and pivot tables.
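The same three operations (sort, filter, pivot table) can also be tried outside Excel. The sketch below uses only the Python standard library; the records are invented for illustration and are not real budget figures.

```python
from collections import defaultdict

# Hypothetical spending records: (ministry, year, amount). Not real data.
records = [
    ("Health", 2014, 120),
    ("Defence", 2014, 300),
    ("Health", 2015, 150),
    ("Defence", 2015, 280),
    ("Health", 2015, 30),
]

# Sort: largest amounts first.
by_amount = sorted(records, key=lambda r: r[2], reverse=True)

# Filter: keep only the rows for 2015.
only_2015 = [r for r in records if r[1] == 2015]

# Pivot table: one row per ministry, one column per year, summing the amounts.
pivot = defaultdict(lambda: defaultdict(int))
for ministry, year, amount in records:
    pivot[ministry][year] += amount

for ministry in sorted(pivot):
    print(ministry, dict(sorted(pivot[ministry].items())))
```

A pivot table is simply this kind of grouped total: each cell answers "how much for this row label in this column label?".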
Practice: Analyse Data Tutorials https://www.youtube.com/watch?v=peNTp5fuKFg https://www.youtube.com/watch?v=9NUjHBNWe9M https://www.youtube.com/watch?v=g530cnFfk8Y https://www.youtube.com/watch?v=FyggutiBKvU
Practice: Extract pertinent information As a model, we will follow steps similar to those used in criminal investigations: Imagine that you are an inspector. You have a statement that you want to make and you have a lot of information. You will now start tracking down evidence.
Practice: Extract pertinent information Document Everything Keep a detailed record of all the information you have collected; don’t trust your memory. Organise all of your information carefully by topic, date and file type. [Diagram: example folder structure with the labels Corruption, Mining, International statistics, National statistics, 2015, 2014, Excel, PDF, Health, Defence]
Practice: Extract pertinent information Nail Down the Timeline All information needs to refer to a specific time. You must align all your information to a timeline, to be able to represent the chronology of how information was provided.
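Aligning information to a timeline can be as simple as attaching a date to every finding and sorting by it. The following is a minimal Python sketch; the findings, dates and source labels are hypothetical examples.

```python
from datetime import date

# Hypothetical findings, each tied to a specific date and to where it came from.
findings = [
    {"when": date(2015, 3, 2), "what": "Contract signed", "source": "company register"},
    {"when": date(2014, 11, 20), "what": "Tender announced", "source": "official gazette"},
    {"when": date(2015, 6, 15), "what": "Budget revised", "source": "budget report"},
]

# Put everything in chronological order.
timeline = sorted(findings, key=lambda f: f["when"])


def format_timeline(items):
    """Return one readable line per finding, oldest first."""
    return [
        f'{item["when"].isoformat()}: {item["what"]} ({item["source"]})'
        for item in items
    ]


for line in format_timeline(timeline):
    print(line)
```

Keeping the source next to each date makes it easier to show later how a statement was derived.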
Practice: Extract pertinent information Follow Every Lead Now you have to investigate the relationships between pieces of information. Write down all hypotheses arising from your analysis and represent them on a chart that names all the actors involved.
Practice: Extract pertinent information Verify Your New Assumptions The most important thing is not just to collect hypotheses or evidence, but to determine what is true. You should be able to justify the statistics or statements you make. Be ready to demonstrate to any audience how you moved from data to your statement. Try to present your assumptions to a few people before you publish them.
Practice: Extract pertinent information Persevere It can be tricky to navigate multiple datasets to find information. You need to be patient and check whether you need more datasets or technical support to analyse your data. Sometimes, while looking for information to support a statement, you will find something else. The best approach is to write it down and keep looking for what you were searching for.
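One way to keep such a chart manageable is to record each hypothesis as a link between two actors and let a short script show who is connected to whom. The Python sketch below assumes invented actors and relationships; none of them refer to real people or organisations.

```python
from collections import defaultdict

# Hypothetical hypotheses, written as (actor, relationship, actor).
links = [
    ("Company A", "won a contract from", "Ministry X"),
    ("Person B", "is a director of", "Company A"),
    ("Person B", "is related to", "Official C"),
    ("Official C", "works at", "Ministry X"),
    ("Person B", "owns shares in", "Company D"),
]

# Build the chart as a mapping: actor -> set of directly connected actors.
neighbours = defaultdict(set)
for left, _relationship, right in links:
    neighbours[left].add(right)
    neighbours[right].add(left)

# Rank actors by number of direct connections (ties broken alphabetically).
ranked = sorted(neighbours, key=lambda actor: (-len(neighbours[actor]), actor))

for actor in ranked:
    print(actor, "->", ", ".join(sorted(neighbours[actor])))
```

The actor at the top of the ranking is not proof of anything; it only suggests which lead may be worth following first.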
Practice: Extract pertinent information Some tools Graphs, networks, connections and relations https://gephi.github.io/ Visualise Timeline http://www.simile-widgets.org/timeline/ Clean up scanned pages before extracting text (OCR) http://scantailor.org/ A manual for investigative journalists http://unesdoc.unesco.org/images/0019/001930/193078e.pdf
Sources http://datadrivenjournalism.net/news_and_analysis/from_idea_to_story_planning_the_data_journalism_story http://datajournalismhandbook.org http://onlinejournalismblog.com/2011/07/07/the-inverted-pyramid-of-data-journalism/ https://blog.infogr.am/4-steps-informative-data-visualization/ https://okfn.org/opendata/ http://opendatahandbook.org/guide/en/what-is-open-data/ http://www.datajournalismtools.net/
www.transparency.org facebook.com/transparencyinternational twitter.com/anticorruption blog.transparency.org © 2014 Transparency International. All rights reserved.