RPI
Li Ding, Jim Hendler and Deborah McGuinness
Tetherless World Constellation, Rensselaer Polytechnic Institute
July 27, 2010
The Data-gov project is headed by Professors Jim Hendler and Deborah McGuinness and led by Li Ding. Other student team members include: Dominic DiFranzo, Sarah Magidson, James Michaelis, Alvaro Graves, Adam Bell, Jin Guang Zheng, Xian Li, Tim Lebo, Gregory Todd Williams, Peter Coons, Zhenning Shangguan, Devin Gaffney, William Cooper, Brian Zaik, and Johanna Flores.
2 Raw Government Data Now
– January 1, 2009: "Openness will strengthen our democracy and promote efficiency and effectiveness in Government." (President Obama)
– May 21, 2009: data.gov online
– June 30, 2009: Putting Government Data online
– December 8, 2009: Open Government Directive released
– January 19, 2010: data.gov.uk online
– May 21, 2010: data.gov relaunch with semantic web featured
– …
3 Semantic Web featured at data.gov
– leveraged contributions from the Tetherless World Constellation at RPI
– published 6.4 billion triples (almost doubled the LOD cloud: 13 billion triples in total)
– hosted triple store (Virtuoso) and open source RDF mashups
4 Data-gov Wiki: Portal for Innovations at RPI
The Data-gov Wiki explores and teaches the use of semantic web technologies, especially linked data, for producing, processing, and using government data from data.gov. The Data-gov Wiki is run by the Tetherless World Constellation at RPI, headed by Professors Jim Hendler and Deborah McGuinness and led by Li Ding.
– 40+ Demos
– 400+ Datasets
– Tutorials & Videos
5 Synopsis
– Open Data: available for public use
– Linked Data: easy to integrate
– Visualization: easy to understand data
– Mashups: enrich meaning of data
– Provenance: make mashups accountable
6 A Typical Mashup: CASTNET
Data sources: Data.gov CASTNET Ozone (CSV); epa.gov CASTNET Site (CSV)
Stages: Data Mashup, Web Application Mashup, Visualization Mashup (Exhibit Visualization API)
– convert raw datasets into linkable RDF
– query multiple RDF datasets via a SPARQL endpoint
– drill down for details
– surf to EPA applications
Created by Dominic DiFranzo, PhD student at RPI
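The first step of this mashup, converting a raw CSV dataset into linkable RDF, might look roughly like the sketch below. It is not the converter used by the Data-gov project: the base URI, the column names, and the sample rows are all hypothetical, and every cell is emitted as a plain string literal in N-Triples syntax.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical base URI, chosen only for illustration.
BASE = "http://example.org/castnet/"

def escape_literal(value: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))

def csv_to_ntriples(csv_text: str, key_column: str) -> list[str]:
    """Emit one subject per CSV row and one triple per non-empty cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The key column names the row; percent-encode it to keep the URI valid.
        subject = f"<{BASE}site/{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            if column == key_column or not value:
                continue  # skip the key itself and empty cells
            predicate = f"<{BASE}vocab/{quote(column, safe='')}>"
            triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples

# Made-up sample rows; real CASTNET files have different columns.
sample = "site_id,state,ozone\nABC123,NY,0.041\nXYZ789,PA,\n"
for line in csv_to_ntriples(sample, "site_id"):
    print(line)
```

Because each row gets a URI built from its key, a second dataset that uses the same site identifiers can refer to the same subjects, which is what makes the data linkable and queryable together via SPARQL.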
7 Mashup: AGI vs Medicare Claims
Created by Peter Coons
[Spatial Mashup] Data.gov (AGI + Medicare Claims + Population)
8 Mashup: US and UK Foreign Aid
Country | Aid | Major aid from US | Major aid from UK
Pakistan | US > UK | Economic/Security Assistance, | Health,
India | UK > US | Child Survival and Health | Health,
Created by James Michaelis, PhD student at RPI
Data Sources: [Spatial Mashup] Data.gov (USAID) + Data.gov.uk (DFID)
9 Social Mashup: US Wildland Fire
– Wildland fire (NIFC)
– Budget on wildfire, DOI and USDA (OMB)
– Category:Wildfires In The United States
Created by Li Ding, researcher at RPI
[Temporal Mashup] Data.gov (statistics + budget) + Wikipedia (famous fires)
10 Mashup: White House Visitor Search
Diagram labels: POTUS, dbpedia:Barack_Obama; whitehouse, Data-gov Wiki, Wikipedia
Created by Dominic DiFranzo
[Person Mashup (via Data-gov Wiki)] Data.gov (statistics) + Wikipedia (personal profiles)
11 Mashup: USPS Spending and News
Created by Sarah Magidson
[Temporal Mashup] Data.gov (spending and budget) + User-contributed Data (news)
12 Mashup: Supreme Court Justices
Created by Xian Li
[Person Mashup] Data.gov (budget) + SCDB (Voting History) + Wikipedia (personal profiles)
13 More Mashups: Using Web Tools
SPARQL results (XML) can be converted into other formats (e.g. JSON, CSV) and used as input to other Web tools: Yahoo Pipes, IBM Many Eyes, Microsoft Web n-gram Service, …
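As a rough illustration of that conversion, the sketch below parses a document in the W3C SPARQL Query Results XML Format and re-emits it as CSV and as a flat JSON array. The sample result and its variable names are invented for the example, and the JSON produced here is a simple list of rows, not the official SPARQL JSON results format.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Namespace of the W3C SPARQL Query Results XML Format.
NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

def parse_sparql_xml(xml_text: str):
    """Return (variable names, rows), each row a dict of variable -> string."""
    root = ET.fromstring(xml_text)
    variables = [v.get("name") for v in root.findall("sr:head/sr:variable", NS)]
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {name: "" for name in variables}  # unbound variables stay empty
        for binding in result.findall("sr:binding", NS):
            term = binding[0]  # the single <uri>, <literal>, or <bnode> child
            row[binding.get("name")] = term.text or ""
        rows.append(row)
    return variables, rows

def to_csv(variables, rows) -> str:
    """Serialize rows as CSV with one column per query variable."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=variables, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def to_json(rows) -> str:
    """Serialize rows as a flat JSON array of objects."""
    return json.dumps(rows, indent=2)

# Invented sample result; a real endpoint would return this over HTTP.
sample_xml = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="site"/><variable name="ozone"/></head>
  <results>
    <result>
      <binding name="site"><uri>http://example.org/site/ABC123</uri></binding>
      <binding name="ozone"><literal>0.041</literal></binding>
    </result>
  </results>
</sparql>"""

variables, rows = parse_sparql_xml(sample_xml)
print(to_csv(variables, rows))
print(to_json(rows))
```

The sketch drops datatype and language tags, which is usually acceptable for spreadsheet-style tools but loses information a typed consumer might need.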
14 More Mashups: Provenance
Critical to accountability
– Demo => Dataset => Agency: where does the data come from?
– Agency => Dataset => Comments: support user feedback
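The two provenance chains on this slide can be pictured as simple lookups over linked records. The sketch below is only an illustration: the identifiers, the demo-to-dataset and dataset-to-agency pairings, and the sample comment are all assumptions made for the example, not records from the Data-gov Wiki.

```python
# Illustrative records only: these identifiers and pairings are assumptions
# for this sketch, not data exported from the Data-gov Wiki.
DEMO_DATASETS = {
    "castnet-demo": ["castnet-ozone", "castnet-site"],
    "wildfire-demo": ["wildland-fire-stats", "wildfire-budget"],
}
DATASET_AGENCY = {
    "castnet-ozone": "EPA",
    "castnet-site": "EPA",
    "wildland-fire-stats": "NIFC",
    "wildfire-budget": "OMB",
}
DATASET_COMMENTS = {
    "castnet-ozone": ["Units for the ozone column are not documented."],
}

def trace_sources(demo):
    """Demo => Dataset => Agency: list (dataset, agency) pairs behind a demo."""
    return [(dataset, DATASET_AGENCY[dataset]) for dataset in DEMO_DATASETS[demo]]

def feedback_for_agency(agency):
    """Agency => Dataset => Comments: collect user comments per dataset."""
    return {
        dataset: DATASET_COMMENTS.get(dataset, [])
        for dataset, owner in DATASET_AGENCY.items()
        if owner == agency
    }

print(trace_sources("castnet-demo"))
print(feedback_for_agency("EPA"))
```

In an RDF setting the same two traversals would more naturally be SPARQL queries over provenance triples; plain dictionaries are used here only to keep the example self-contained.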
15 Conclusions
Now
– 6.4 billion triples from data.gov
– data + visualization + mashup is powerful
– Low-cost solutions available for education
Future
– Development
  More raw data, data catalog, links, RDFa
  More tools, esp. Web visualization, SPARQL endpoint
  More demos and applications in different domains
– Research
  Integration: link, search, social contribution, …
  Provenance: source, versions, trust, …
  Usability: scalable, quality, …