The Astrolabe Project: Identifying and Curating Astronomical ‘Dark Data’ through Development of Cyberinfrastructure Resources Gretchen Stahlman, PhD Candidate.

Presentation transcript:

The Astrolabe Project: Identifying and Curating Astronomical ‘Dark Data’ through Development of Cyberinfrastructure Resources. Gretchen Stahlman, PhD Candidate, University of Arizona School of Information. Library and Information Services in Astronomy (LISA) VIII, June 7, 2017.

Astrolabe. Astrolabe is a new data repository and computational environment being created at the University of Arizona (UA). Partners include: UA School of Information; UA Department of Astronomy and Steward Observatory; UA University Libraries; CyVerse (formerly the iPlant Collaborative); American Astronomical Society. Astrolabe has been funded by: UA Office for Research & Discovery (now RDI); National Science Foundation ACI.

The “Astrolabe” Project. Astrolabe has a mission to: collect, preserve, and disseminate; provide tools for analysis and data sharing; expose research data.

Archive Management. Image credits: Digital Curation Centre, www.dcc.ac.uk; http://data-archive.ac.uk/create-manage/life-cycle

Lifecycle Preservation and Access. Curation “is the active management and appraisal of digital information over its entire life cycle” (Pennock, 2007). Curation requires insightful knowledge of data and communities. Resources must be developed to support publication of (and links between) research AND data.

“Dark Data” in the Long Tail. Large projects have well-planned data stores, while large amounts of data remain uncurated (Heidorn, 2008). “Like dark matter, this dark data on the basis of volume may be more important than that which can be easily seen” (p. 281). Long Tail data require institutions, practices and policies to make these data useful to researchers.

Long Tail Distribution in Astronomy

The “Top 20%”

Curating the Long Tail with Astrolabe

General Properties of “Dark Data”. “Dark Data” are typically: heterogeneous; generated through unique procedures; curated by individual scientists; not maintained; obscured or protected; seldom reused; currently unnoticed.

Astronomical Data. Common data types: sky images; light curves; spectroscopy; catalogs. Common data format: FITS (Flexible Image Transport System). Culture of open access.
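A FITS file begins with a header made of fixed-width, 80-character ASCII "cards", each typically holding a keyword, a value, and an optional comment. As a rough illustration of that layout, the sketch below parses a few cards using only the Python standard library. The `parse_card` and `make_card` helpers and the sample cards are made up for this example, and the parser ignores many cases covered by the FITS standard (string values containing a slash, CONTINUE cards, and so on); real work would normally rely on an established library such as Astropy.

```python
def parse_card(card):
    """Parse one 80-character FITS header card into (keyword, value, comment).

    Simplified sketch: assumes the value indicator '= ' sits in columns 9-10
    and that string values do not contain a '/' character.
    """
    keyword = card[:8].strip()
    if card[8:10] != "= ":
        # Commentary-style cards (COMMENT, HISTORY, END, blank) carry no value.
        return keyword, None, card[8:].strip()
    value_part, _, comment = card[10:].partition("/")
    value = value_part.strip()
    if value.startswith("'"):
        value = value.strip("'").rstrip()   # string value, trailing pad removed
    elif value in ("T", "F"):
        value = value == "T"                # logical value
    else:
        try:
            value = int(value)
        except ValueError:
            value = float(value)
    return keyword, value, comment.strip()


def make_card(text):
    """Pad a card to the fixed 80-character width (hypothetical helper)."""
    return text.ljust(80)


# Illustrative cards only; not taken from any real observation.
cards = [
    make_card("SIMPLE  =                    T / conforms to FITS standard"),
    make_card("BITPIX  =                  -32 / 32-bit floating point"),
    make_card("NAXIS   =                    2 / number of axes"),
    make_card("OBJECT  = 'M31     '           / target name"),
    make_card("END"),
]

header = {}
for card in cards:
    keyword, value, comment = parse_card(card)
    if keyword == "END":
        break
    header[keyword] = value

print(header)
```

Because the keyword and value positions are fixed, even heterogeneous long-tail files in this format expose a small amount of machine-readable metadata that a repository could index.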

American Astronomical Society (AAS). Key professional society for astronomers in the US; hosts two major conferences each year; non-profit organization; publishes four major journals: The Astronomical Journal (AJ), The Astrophysical Journal (ApJ), The Astrophysical Journal Letters (ApJL), The Astrophysical Journal Supplement Series (ApJS).

CyVerse.org. Discovery Environment: use hundreds of apps and manage data in a simple web interface. Atmosphere: a custom cloud-based scientific analysis platform, or use a ready-made one for your area of scientific interest. Data Store: store, manage, access, and share all the data related to research.

CyVerse Services

Astrolabe Organizational Model

July 2015 Workshop Outcomes. Identify mission and clear science use cases; take advantage of CyVerse cyberinfrastructure and the longevity of the University of Arizona; obtain community buy-in and manage expectations; focus on “low-hanging fruit” such as data not curated elsewhere and data behind figures in journals; develop a follow-on workshop for additional feedback.

July 2016 Workshop Outcomes. Physical format of dark data (e.g., historical data stored on tapes); author websites archiving data (not typically long-lived); LSST time domain and serendipitous data cases (follow-up to LSST observations and discovery through historical data); searching the literature for references to dark data (for indicative text, broken links, etc.).
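The literature search mentioned in the workshop outcomes could be sketched roughly as follows. This is a minimal illustration, not the Astrolabe team's actual method: the phrase list, the `find_dark_data_hints` function name, and the sample sentence are all assumptions made for the example. Checking whether a link is actually broken would need a network request, which is left out here.

```python
import re

# Hypothetical phrases that may signal data never deposited in an archive.
INDICATIVE_PHRASES = [
    "available upon request",
    "available on request",
    "available from the author",
    "data not shown",
]

# Simple URL pattern; stops at whitespace and common closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")


def find_dark_data_hints(text):
    """Return (phrases, urls) found in a block of article text."""
    lowered = text.lower()
    phrases = [p for p in INDICATIVE_PHRASES if p in lowered]
    # Drop trailing sentence punctuation that the pattern may have captured.
    urls = [u.rstrip(".,;") for u in URL_PATTERN.findall(text)]
    return phrases, urls


# Made-up sample text for illustration only.
sample = (
    "The reduced spectra are available upon request. "
    "Light curves are hosted at http://example.edu/~author/lightcurves."
)
phrases, urls = find_dark_data_hints(sample)
print(phrases)
print(urls)
```

Each extracted URL could then be tested for availability in a separate step, and articles with matches queued for manual review or author contact.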

Astrolabe Timeline. 2013: AAS Strategy Meeting. 2015: Workshop #1 in Tucson, funded by UA Start for Success seed grant. 2016: UA Accelerate for Success awarded for one-year pilot (collaborators include iSchool, Steward, UA Libraries, CyVerse, AAS); changed name from Arizona Astronomical Data Hub (AADH) to Astrolabe to focus beyond AZ; established a Board of Directors; Workshop #2 focusing on specifying requirements for the Astrolabe system; NSF ACI grant awarded to develop WorldWide Telescope as the Astrolabe front end and visualization tool, with the idea that this could scale to other repositories.

WorldWide Telescope (WWT). A screenshot from the WWT HTML5 web client, worldwidetelescope.org

Current Status of Astrolabe: 2017 Activities and Objectives. Searching for uncurated or “at-risk” data by mining the literature, and by contacting authors individually based on our team’s review of particular types of publications; recently contracted a developer to accomplish objectives specified in the recently awarded NSF grant (award #1642446), and will hire additional developers; working on funding proposals for system development, including a project to create protocols for migrating data from obsolete media into Astrolabe; collaborating with CyVerse to develop and optimize interfaces, apps, metadata templates and indexing, cone search and VO; installed Montage for conversion of FITS to JPEG to TOAST; designing a website as an interface to the CyVerse Data Store to facilitate data deposition and reuse.
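A cone search, listed among the interfaces under development, selects catalog sources that lie within a given angular radius of a sky position. The sketch below shows the basic idea in plain Python under stated assumptions: the function names, the in-memory list of dictionaries, and the sample coordinates are hypothetical, and a production service would follow the IVOA Simple Cone Search protocol and use spatial indexing instead of a linear scan.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    d_ra = ra2 - ra1
    d_dec = dec2 - dec1
    a = (math.sin(d_dec / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin(d_ra / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(sources, ra, dec, radius_deg):
    """Return the sources within radius_deg of (ra, dec), by linear scan."""
    return [
        s for s in sources
        if angular_separation_deg(ra, dec, s["ra"], s["dec"]) <= radius_deg
    ]


# Made-up catalog entries for illustration only.
catalog = [
    {"name": "A", "ra": 10.00, "dec": 20.00},
    {"name": "B", "ra": 10.05, "dec": 20.02},
    {"name": "C", "ra": 50.00, "dec": -5.00},
]

matches = cone_search(catalog, ra=10.0, dec=20.0, radius_deg=0.1)
print([s["name"] for s in matches])
```

The haversine form is used here because it stays numerically stable for the small separations typical of cone searches.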

Our Team. Principal Investigators: Bryan Heidorn, PhD, UA School of Information; Dennis Zaritsky, PhD, UA Department of Astronomy. AAS Affiliate: Julie Steffen, AAS Director of Publishing. WWT Developer: Jonathan Fay, AAS Contractor and Microsoft Software Engineer. Postdoctoral Researcher: Huanian Zhang, PhD, UA Department of Astronomy. Graduate Research Associate: Gretchen Stahlman, UA School of Information. Astrolabe Advisory Board Members: Robert Hanisch (NIST); Chris Lintott (Oxford/AAS); Barbara Kern (U of Chicago); Julie Steffen (AAS); Frank Timmes (AZ State/AAS); Benjamin Weiner (Steward/UA); Edwin Henneken (ADS); Henry “Trae” Winter, Astrolabe Advisory Board Chair (CfA).

Thank you! http://astrolabe.arizona.edu This material is based upon work supported by the National Science Foundation under Grant No. 1642446.