Arctos/TACC Collaboration Chris Jordan Texas Advanced Computing Center


Arctos: A 15-year history
MVZ: 1995 - Hired Stan Blum to develop relational data model (following modeling by Assoc. Systematic Collections).
MVZ: 1997 - Hired John Wieczorek to implement model (desktop application) using Sybase and Versata. Partial implementation (e.g., no loans).
UAM: 1998-2000 - John W. migrated mammal data to Oracle, set up Versata.
UAM: 2002 - Dusty McDonald replaced Versata with ColdFusion, implemented full model (first web-based instance, aka Arctos).
MSB: 2003 - Joined Arctos at UAM (first multi-hosting instance).
MVZ and MCZ: 2005-2007 - Implemented separate instances of Arctos at Berkeley and Harvard (MVZ: first Postgres, then Oracle).
MVZ: 2009 - Moved hosting of data to Alaska (Virtual Private Database version).

Major repositories using the Arctos database: (34 collections of specimens or observations, 1.3M records)

TACC and TeraGrid
10-year history of Research Cyberinfrastructure
Supercomputing, Visualization and Storage
Supported by NSF to provide research resources
TACC expansion of Data-focused support
1 Petabyte dedicated online disk
10 Petabytes offline archive
National network of replication resources

Data Diversity at TACC
Image Collections (Natural History, Art, etc.)
Structured Data (Economics, Public Health)
BioMolecular Data (DNA, RNAseq, etc.)
Physical Sciences/Simulation Data
Geographic data (Climate, Disaster Preparedness)
Integrated Infrastructure Supports Diverse Collections

Arctos is… A versatile online collections management system
Cataloged Items (ID, attributes, parts, etc.; batch uploading, downloading, editing; encumbrances)
Localities & Collecting Events (mapping, media, history)
Transactions (loans, accessions, borrows, permits; email reminders)
Usage (publications, projects, sponsors, GenBank)
Curatorial (object tracking, parts, condition, relations, etc.)
Determination history (identification, georef, attributes)
All the usual stuff that every museum DB-ish creation does in one manner or another.

Breadth of Data in Arctos
Fish, amphibians, reptiles, mammals, birds and bird eggs/nests, plants, arthropods, fossils, molluscs
Specimens and observations
Media (images, audio)
Publications, fieldnotes
Arctos is constantly evolving to incorporate new kinds of data, e.g.:
Better representation of non-publication documents (fieldnotes, correspondence)
Cultural collections (art, anthropology...)
Nearly all that is known about an object (or observation) can be included in Arctos.

Arctos/TACC Partnership
Arctos hosts web/database resources
TACC hosts media collections (images, recordings, etc.)
Simple workflows for automated generation of thumbnails, JPG versions, MP3s, OCR
Replication policies automatically replicate to various storage locations
Images directly served from TACC to browsers
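
As a rough illustration of the workflow on this slide, the sketch below plans derivative and replication tasks for a newly deposited media file. It is not the Arctos or TACC implementation: the extension-to-derivative mapping, the site names, and the `plan_tasks` function are all hypothetical. The slide names only the derivative types (thumbnails, JPG versions, MP3s, OCR) and says that replication is policy-driven.

```python
from pathlib import PurePosixPath

# Hypothetical rules: which derivatives to generate for each master file type.
# The slide lists the derivative kinds but not the rules behind them.
DERIVATIVES = {
    ".tif": ["thumbnail", "jpg", "ocr"],
    ".tiff": ["thumbnail", "jpg", "ocr"],
    ".dng": ["thumbnail", "jpg"],
    ".wav": ["mp3"],
}

# Hypothetical replica locations standing in for "various storage locations".
REPLICA_SITES = ["site-a", "site-b"]


def plan_tasks(path):
    """Return the list of (action, target, path) tasks for one deposited file."""
    ext = PurePosixPath(path).suffix.lower()
    # Derivative generation depends on the file type; unknown types get none.
    tasks = [("derive", kind, path) for kind in DERIVATIVES.get(ext, [])]
    # Every master file is replicated, regardless of type.
    tasks += [("replicate", site, path) for site in REPLICA_SITES]
    return tasks


if __name__ == "__main__":
    for task in plan_tasks("herbarium/sheet_001.TIF"):
        print(task)
```

Keeping the rules in plain data tables means a new media type or storage site can be added without changing the planning logic.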

Arctos/TACC History
Initial work with UAF Herbarium in 2008
Brought on MVZ Collections in 2009
Ongoing work on web audio, OCR
New collections from UAF, UNM, others
Currently >300,000 digital objects under management
Support >100,000 downloads of original scans each year

Advantages for Collections
Lower cost and management overhead
Highly reliable, large-scale infrastructure
No scalability issues
Longer-term partnerships promote technical collaboration to add capabilities over time
Provides built-in “Data Management Plan”

Long-Term Sustainability
TACC plan is to be a permanent research data resource
Arctos will evolve over time but the collections have permanent value
Infrastructure foundation is stable
Agency funding future is uncertain
Develop diverse funding sources and models to support robust, long-term operation

Ongoing Efforts
Expansion of storage resources at TACC (~10PB online disk)
Greater engagement in data management activities
Working with BRC, ADBC awards and associated data
iPlant Data/Genetic resources – link to specimen records?

Thanks for your Time
Steffi Ickert-Bond, UAF
Gordon Jarrell, UNM
Carla Cicero, MVZ
Michelle Koo, MVZ
Dusty McDonald, Arctos