ODaF Europe 2009
Virtual Research and Collaborative Center
Pascal Heus, Open Data Foundation
Tim Mulcahy, National Opinion Research Center


Background
- Demand for socio-economic data has grown dramatically in the past decade
  - Connectivity / network speed
  - Globalization / economic crisis
- Access to microdata has improved
  - Better archiving / preservation
  - Adoption of metadata standards such as DDI and related practices
- But many challenges remain:
  - Discovery and access remain an issue (lack of visibility)
  - Usability: documentation is still an issue, and the complexity of datasets is a barrier
  - No community knowledge
  - Datasets are still typically made available through simple, static web-based interfaces
  - There is a lack of researcher tools that leverage metadata

Putting some ideas together…
- Internet technologies
  - Community-driven virtual spaces are now very common
  - Social networking is widely accepted
  - User-driven knowledge management works (for large groups)
- Social science
  - A large number of public datasets are available
  - Surveys can now easily be documented using the Data Documentation Initiative
  - Metadata-related XML technologies can significantly automate tasks and maintain linkages across the life cycle
- Researcher
  - User needs differ from the producers': researchers have a custom view of the data (their project)
  - Outputs should be preserved / captured / shared (not limited to a paper)
  - Need a community space to foster dialog and share knowledge (within and outside research projects)

A Virtual Research and Collaborative Center
- Go beyond the static web site to provide dynamic, virtual research within a collaborative environment
- Leverage Internet / XML technologies and metadata standards
- Provide virtual access to public use data (global)
  - Web-based remote access: for discovery, analysis, publication
  - Enhanced analytical tools: data and documentation customization
  - Advanced collaboration, communication and dissemination tools: community knowledge capture, collaboration, social networking, information sharing/reuse
- Approach
  - New tools based on DDI metadata and related standards
  - Leverage Web 2.0 technologies
  - Provide a research-oriented environment
  - Build upon open source solutions

Researcher Services
- My Datasets: Create custom view of the data for use in project or sharing with community
- My Projects: Bring together researchers in a virtual environment to share research ideas, data, documentation, and scripts
- My Publications: Package research outputs (papers, documents, scripts/programs, secondary data) for preservation, dissemination and sharing
- My Profile: Provide individual background information, research interests, set privacy options and configure notification services

Collaborative Space
- Wiki: Capture knowledge surrounding the data. Initial content will be seeded with survey metadata.
- Communication: Events and news, community-driven discussion groups, FAQ/Answers, chat
- Library: Searchable libraries of papers/references/documentation, scripts/programs, primary and secondary data. Most of the content is extracted automatically from the research space.
- Services: Researcher Directory, Project Directory, Call for collaboration, Notification, Support, Training

Infrastructure: Primary and researcher data and metadata storage, databases, security (access, backups), web services

Admin Services: System and data usage reports, data/metadata management, user administration, etc.

Home: Welcome, background information, contact, simple access to public data and documentation

General features
- Everything is publicly available (read only)
- Registered users can manage research projects and contribute to the content
  - Registration will likely be based on OpenID (no need to create a new account)
- Users will optionally provide (with privacy control)
  - Demographics: name, nickname, social networks
  - Affiliations: institutions, memberships
  - Academic background
  - Research interests

Analytical Tools: My Datasets
- Researchers rarely use the full set of variables available in a single survey
- Instead, they derive a virtual dataset from one or more data sources
- The description of a virtual dataset can be captured using DDI-like metadata
- Scripts to generate that particular view can then be created automatically for various statistical packages
- Benefits
  - Hides the complexity of merging, filtering, and recoding files
  - Independent of statistical package
  - Customized documentation can be produced dynamically
  - Virtual datasets can be versioned, shared with others, refreshed with new data, etc.
  - This also provides valuable usage information to the data provider
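The idea of generating package-specific scripts from a single virtual dataset description could be sketched roughly as follows. This is only an illustration, not the center's design: the `VirtualDataset` class, its fields, and the file names are hypothetical stand-ins for DDI-like metadata, and the filter expression is assumed to be valid in both Stata and SPSS as written.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualDataset:
    """Hypothetical, simplified stand-in for a DDI-like dataset description."""
    name: str                      # name of the derived (virtual) dataset
    source_file: str               # base name of the source data file
    variables: list[str]           # variables kept in the custom view
    filter_expr: str = ""          # optional row filter, e.g. "age >= 18"
    recodes: dict[str, dict[int, int]] = field(default_factory=dict)


def _recode_rules(mapping: dict[int, int]) -> str:
    # Both Stata and SPSS accept rules of the form (old=new).
    return " ".join(f"({old}={new})" for old, new in mapping.items())


def to_stata(ds: VirtualDataset) -> str:
    """Render the description as a Stata do-file."""
    varlist = " ".join(ds.variables)
    lines = [f'use {varlist} using "{ds.source_file}.dta", clear']
    if ds.filter_expr:
        lines.append(f"keep if {ds.filter_expr}")
    for var, mapping in ds.recodes.items():
        lines.append(f"recode {var} {_recode_rules(mapping)}")
    lines.append(f'save "{ds.name}.dta", replace')
    return "\n".join(lines)


def to_spss(ds: VirtualDataset) -> str:
    """Render the same description as SPSS syntax."""
    varlist = " ".join(ds.variables)
    lines = [f"GET FILE='{ds.source_file}.sav' /KEEP={varlist}."]
    if ds.filter_expr:
        lines.append(f"SELECT IF ({ds.filter_expr}).")
    for var, mapping in ds.recodes.items():
        lines.append(f"RECODE {var} {_recode_rules(mapping)}.")
    lines.append(f"SAVE OUTFILE='{ds.name}.sav'.")
    return "\n".join(lines)


# Hypothetical example: adults only, with sex recoded to 0/1.
example = VirtualDataset(
    name="adults",
    source_file="survey2008",
    variables=["age", "sex"],
    filter_expr="age >= 18",
    recodes={"sex": {1: 0, 2: 1}},
)
print(to_stata(example))
print(to_spss(example))
```

Because the description is kept separate from the rendering functions, supporting another statistical package would only mean adding another renderer; the stored description itself would not change.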

Analytical Tools: My Projects
- Provides a virtual space for the research team
- Brings together virtual datasets, documents, scripts, outputs, collaborative tools
- Primary Investigator can bring in collaborators
- Knowledge exchange tools: blog, IM, optional wiki
- File sharing tools:
  - Documents: referenced, research, outputs
  - Citations: within and outside project
  - Scripts: shared research processes
  - Secondary data: microdata and aggregates
  - Can be marked for preservation / dissemination (see My Publications)
  - Can draw from community libraries
- Project description contains topics that provide valuable metadata for usage and collaboration

Dissemination Tools: My Publications
- Typically research output is a PDF
  - This is insufficient to meet Gary King's Replication Standard
  - Leads to poor preservation and reuse
- Need a tool to package outputs as enhanced publications
  - For preservation: contains everything that needs to be archived (from My Projects)
  - For dissemination: contains all necessary information to reproduce the research process (not just the paper)
- Files in projects can be marked for archiving and/or dissemination
  - Extra metadata can be provided for each file (Dublin Core citation, etc.)
  - Archived files will be stored for several years
  - Dissemination package will be made available on the web
- Research paper
  - Can be circulated for peer review
  - Will be shared with the community, can be automatically sent to libraries, citation repositories, integrated into printed publications, etc.
- Scripts can be automatically tagged with header, author, etc.
- Data can be marked as intermediate, final, public, etc.
- Public usage, comments, ratings will be reported to the PI
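As a rough illustration of marking project files for archiving and/or dissemination, the sketch below builds a simple package manifest from flagged files. The `ProjectFile` class, the manifest layout, and the sample file names are assumptions made for this example; only the metadata keys (`title`, `creator`, `type`) are taken from the Dublin Core element set.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectFile:
    """Hypothetical record for a file held in a research project."""
    path: str
    archive: bool = False          # marked for preservation
    disseminate: bool = False      # marked for public dissemination
    dublin_core: dict[str, str] = field(default_factory=dict)


def build_package(files: list[ProjectFile], purpose: str) -> dict:
    """Collect the files flagged for the given purpose into a manifest."""
    if purpose not in ("archive", "disseminate"):
        raise ValueError(f"unknown purpose: {purpose!r}")
    selected = [f for f in files if getattr(f, purpose)]
    return {
        "purpose": purpose,
        "files": [{"path": f.path, "metadata": f.dublin_core} for f in selected],
    }


# Hypothetical project content.
project_files = [
    ProjectFile("paper.pdf", archive=True, disseminate=True,
                dublin_core={"title": "Example paper", "creator": "A. Researcher",
                             "type": "Text"}),
    ProjectFile("analysis.do", archive=True, disseminate=True,
                dublin_core={"title": "Analysis script", "type": "Software"}),
    ProjectFile("intermediate.dta", archive=True,
                dublin_core={"title": "Intermediate data", "type": "Dataset"}),
    ProjectFile("notes.txt"),
]

archive_manifest = build_package(project_files, "archive")
public_manifest = build_package(project_files, "disseminate")
```

In this sketch the archive package holds the paper, script and intermediate data, while the dissemination package holds only the files explicitly marked for public release.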

Discovery Tools: My Profile
- Looking for data or documents is a significant effort for researchers
- A metadata-driven system can greatly alleviate this by bringing the information to the user (rather than the other way around)
- Researcher profile will provide various subscription and notification tools based on research interests
- Examples:
  - A document becomes available on a specific topic or from a particular author/group
  - New or updated data becomes available on a specific topic
  - A new research paper is published using a specific dataset
  - A research project is looking for collaborators or reviewers
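One simple way to picture profile-based notification is to match the metadata attached to each new item against the interests stored in researcher profiles. The sketch below assumes hypothetical `Subscription` and `Event` records and a plain set-overlap rule; a real system might use richer matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    """Hypothetical interests taken from a researcher profile."""
    researcher: str
    topics: frozenset[str] = frozenset()
    authors: frozenset[str] = frozenset()
    datasets: frozenset[str] = frozenset()


@dataclass(frozen=True)
class Event:
    """Hypothetical metadata describing something new in the center."""
    kind: str                      # e.g. "document", "data", "paper", "call"
    title: str
    topics: frozenset[str] = frozenset()
    authors: frozenset[str] = frozenset()
    datasets: frozenset[str] = frozenset()


def matches(sub: Subscription, event: Event) -> bool:
    """True when the event shares a topic, author or dataset with the profile."""
    return bool(
        sub.topics & event.topics
        or sub.authors & event.authors
        or sub.datasets & event.datasets
    )


def recipients(subs: list[Subscription], event: Event) -> list[str]:
    """Names of the researchers who should be notified about the event."""
    return [s.researcher for s in subs if matches(s, event)]


# Hypothetical profiles and a newly published paper.
subs = [
    Subscription("alice", topics=frozenset({"health"})),
    Subscription("bob", datasets=frozenset({"survey2008"})),
    Subscription("carol", topics=frozenset({"education"})),
]
new_paper = Event(
    kind="paper",
    title="Example paper",
    topics=frozenset({"labor"}),
    datasets=frozenset({"survey2008"}),
)
print(recipients(subs, new_paper))
```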

Collaborative: Catalogs
- The center's community space will contain several catalogs, libraries, and directories
- Content will be derived automatically from research projects or contributed by users/providers
- Data catalog: simple and complex search for datasets / variables based on survey, time, geography, topics, etc.
- Document library: searchable collections of research papers, survey documentation, references/methodologies, etc.
- Script library: statistical programs shared by projects/users, searchable by dataset, language, etc.
- Researcher directory: look up other researchers by interest, profile, expertise, etc.
- Project directory: completed, ongoing and future research projects. Also a place to advertise research opportunities

Collaborative: Tools
- Wiki: classic community-driven knowledge capture
  - Some of the content will be seeded automatically from DDI metadata to create pages per survey, file, variable, etc.
- Classic tools: FAQ, news, events/calendar, chat, discussion forums
- Collaborative tagging:
  - Folksonomies to capture researcher perspective/feedback at the survey, dataset, variable level
  - Rating/comments on papers, datasets, etc.
- And likely more…
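Seeding wiki pages from survey metadata could look roughly like the sketch below. It reads a small, namespace-free XML fragment that borrows element names from the DDI Codebook structure (`codeBook`, `stdyDscr`, `dataDscr`, `var`, `labl`); the fragment, the page-naming scheme, and the `seed_pages` function are assumptions for illustration, and real DDI instances are namespaced and far richer.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified DDI-Codebook-like fragment.
SAMPLE = """<codeBook>
  <stdyDscr>
    <citation><titlStmt><titl>Example Survey 2008</titl></titlStmt></citation>
  </stdyDscr>
  <dataDscr>
    <var name="age"><labl>Age of respondent</labl></var>
    <var name="sex"><labl>Sex of respondent</labl></var>
  </dataDscr>
</codeBook>"""


def seed_pages(xml_text: str) -> dict[str, str]:
    """Return a mapping of wiki page title to initial page text."""
    root = ET.fromstring(xml_text)
    title = root.findtext("stdyDscr/citation/titlStmt/titl",
                          default="Untitled survey")
    pages = {title: f"Survey page for {title}."}
    # One page per variable, named "<survey title>/<variable name>".
    for var in root.iterfind("dataDscr/var"):
        name = var.get("name", "unnamed")
        label = var.findtext("labl", default="")
        pages[f"{title}/{name}"] = f"{name}: {label}"
    return pages


pages = seed_pages(SAMPLE)
for page_title, text in pages.items():
    print(page_title, "->", text)
```

Researchers would then edit these seeded pages to add the community knowledge that the metadata alone does not carry.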

Administration
- Various management tools will be implemented
- Reporting
  - User demographics
  - Data usage: most used variables, popular research topics, quality feedback, etc.
  - System usage: hits/visits, number of active projects, new papers, secondary datasets, etc.
- Management
  - Data / metadata maintenance
  - User/Group management

Implementation strategy
- Based on metadata standards
- Built as an open source product (and leveraging OSS)
- Web service based architecture
- Virtual / cloud server environment to ensure scalability (processing and storage)
- Modular system to allow for incremental development
- Build upon other ongoing initiatives
- Not only a technological challenge: organizational / legal issues also need to be addressed

Status / Next steps
- Project at initial stage (concept note)
- Partnership: NORC, ODaF and other agencies
- Will likely start at NORC using the General Social Survey (GSS) and possibly other public use files
  - In discussion with other producers
- Planning for prototype in 4Q 2009
- Other options being considered:
  - Use for non-public datasets
  - Add harmonization/comparability features
  - Extend functionalities to aggregate data (SDMX)
  - Link to geography (ISO and others)
  - Integrate statistical engine
  - Integrate disclosure control features

Conclusion
- Proposal to build innovative tools to provide a dynamic environment to perform research on survey microdata
- Based on metadata and open technology standards to ensure a generic solution
- Promotes sharing and reuse
- Facilitates preservation and dissemination of research outputs
- Fosters collaboration and supports a community-driven knowledge base
- Provides better understanding of the usage of the data
- For further information, contact
  - Tim Mulcahy, National Opinion Research Center (NORC)
  - Pascal Heus, Open Data Foundation (ODaF)

XML metadata specifications for socio-economic data
- Statistical Data and Metadata Exchange (SDMX)
  - Macrodata, time series, indicators, registries
- Data Documentation Initiative (DDI)
  - Microdata (surveys, studies)
- ISO
  - Semantic modeling, concepts, registries
- ISO
  - Geography
- Dublin Core
  - Resources (documentation, images, multimedia)

The Data Documentation Initiative (DDI)
- International XML-based specification for the documentation of social and behavioral data
  - Started in 1995, now driven by the DDI Alliance (30+ members)
  - Became an XML specification in 2000 (v1.0)
  - Current version is 2.1, with a focus on archiving (survey/codebook)
- New Version 3.0 (2008)
  - Focus on the entire survey life cycle
  - Provides comprehensive metadata on the entire survey process and usage
  - Aligned with other metadata standards (DC, MARC, ISO 11179, SDMX, …)
  - Includes machine-actionable elements to facilitate processing, discovery and analysis