Research Data Storage Resources at IU

Presentation transcript:

Research Data Storage Resources at IU Anurag Shankar University Information Technology Services Indiana University March 2, 2012

Outline
University Information Technology Services, November 13, 2018
Data Storage Use Cases - Types of research data and the storage they require
Data Storage Services - Where/how to store your data
Storage of HIPAA Regulated Data - Storing sensitive data
Real World Examples - How people are using the storage services

Data Types and Desired Storage Characteristics
Type of Data | Volume | Throughput | Access Speed | Criticality
Data being acquired | MB - TB | MB/second | Fast | High (not easy to reproduce)
Data in analysis | | MB - GB/s | Very fast | Low - High (reproducible)
Data being published/shared | MB - GB | KB - MB/s | Moderate | Low (reproducible)
Archival data | MB - PB | | Slow | High if not also stored elsewhere

Research Data Storage Services
Data Capacitor
Research File System (RFS)
Scholarly Data Archive (SDA)
Research Database Complex (RDC)
Alfresco Share
REDCap
Slashtmp

How IU’s Research Data Storage Services Fit Data Types
Type of Data | Resource/Service | Space Available | Eligibility | Duration
Data being acquired | RFS, Data Capacitor | GB - 100s of TB | IU | Days - Months
Data in analysis | RFS, Data Capacitor on Big Red/Quarry | MB - TB | | Days - Months
Data being published/shared | Server disk, Alfresco Share, REDCap, Slashtmp | MB - GB | IU, PU, ND, outside users | Months - Years
Archival data | SDA | GB - PB | | Years
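The mapping from data type to service above can be written down as a small lookup, which may be handy in a lab's data-management scripts. This is only an illustrative sketch: the short keys (`acquired`, `analysis`, `published`, `archival`) and the function name `suggest_services` are hypothetical labels, not part of any IU tool; only the service names come from the slide.

```python
# Hypothetical short labels for the four data types on the slide.
# The service names are taken from the slide; nothing here queries IU systems.
SERVICES_BY_DATA_TYPE = {
    "acquired": ("RFS", "Data Capacitor"),
    "analysis": ("RFS", "Data Capacitor"),
    "published": ("Server disk", "Alfresco Share", "REDCap", "Slashtmp"),
    "archival": ("SDA",),
}


def suggest_services(data_type: str) -> tuple[str, ...]:
    """Return the services the slide lists for a given data type label."""
    key = data_type.strip().lower()
    if key not in SERVICES_BY_DATA_TYPE:
        known = ", ".join(sorted(SERVICES_BY_DATA_TYPE))
        raise ValueError(f"unknown data type {data_type!r}; expected one of: {known}")
    return SERVICES_BY_DATA_TYPE[key]
```

A call such as `suggest_services("archival")` returns the single-entry tuple for SDA; an unrecognized label raises `ValueError` rather than guessing.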

Data Storage Services by Use
Use | Service | Access | Backed Up?
High Performance Storage | Data Capacitor | File System on Big Red/Quarry | No
Storage for In-Work Data | RFS | Mapped Drive, Web, SFTP, OpenAFS client | Yes
Structured Data Storage | RDC (Oracle, MySQL) | Applications |
Shared Document Storage | Alfresco Share | Web, WebDAV |
Shared Storage for HIPAA data | REDCap | Web |
Archival Storage | SDA | Web, Mapped Drive, SFTP, Parallel FTP |

Data Storage Services by Use
Service | Targeted For | Not Good For
RFS | Storing relatively small files that are updated and/or accessed frequently or need group access | Storing database files, backups
SDA | Storing large files or small files aggregated (zipped) into large files, long-term storage | Storing small files, files requiring frequent/quick access, in-work data
RDC | Relational databases | Storing unstructured data
Alfresco Share | Sharing Word, Excel, PDF, text files | Storing data
REDCap | Storing & sharing HIPAA data | General storage
Data Capacitor | Temporary data being read or written on Big Red/Quarry requiring the fastest speeds available |
Slashtmp | Temporary space to exchange files too large to send as email attachments |
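Since the SDA is suited to large files and to small files aggregated into large ones, a collection of small files is best bundled before it is archived. The following is a minimal sketch of that bundling step using Python's standard `tarfile` module. The function name `bundle_small_files` and the choice of gzip-compressed tar (rather than zip) are assumptions made for illustration; the slide does not prescribe a tool, and the transfer to the SDA itself is not shown.

```python
import tarfile
from pathlib import Path


def bundle_small_files(source_dir: str, archive_path: str) -> int:
    """Pack every regular file under source_dir into one .tar.gz file.

    Returns the number of files added. Write archive_path somewhere
    outside source_dir so the archive does not try to include itself.
    """
    source = Path(source_dir)
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                # Store names relative to source_dir so the bundle unpacks cleanly.
                tar.add(path, arcname=path.relative_to(source).as_posix())
                count += 1
    return count
```

The resulting single archive is the kind of large file the SDA is targeted for, and it can be unpacked later with any standard tar tool.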

Storage Resource/Service Details
Service | Technology | Capacity
RFS | OpenAFS | 60TB
SDA | High Performance Storage System (HPSS) | 15 PB tape, 150TB disk
RDC | Oracle, MySQL | 200TB
Alfresco Share | | 1TB
REDCap | |
Data Capacitor | Lustre | 360TB

Storage Resource/Service Details
Service | Default Quota | Account Request
RFS | 100GB | http://itaccounts.iu.edu
SDA | None |
RDC | 10GB |
Alfresco Share | | http://www.indianactsi.org/alfrescorequest
REDCap | | http://www.indianactsi.org/redcapacr
Data Capacitor | | Big Red/Quarry account
Slashtmp | 4GB | No Slashtmp account needed, only IU login to use
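Before requesting space, it can help to check whether a local dataset fits within a service's default quota listed above. The sketch below sums file sizes under a directory and compares the total with those defaults. It assumes decimal units (1 GB = 10**9 bytes), which the slide does not specify, and the names `DEFAULT_QUOTA_BYTES`, `directory_size`, and `fits_default_quota` are hypothetical. It inspects only the local disk and does not contact any IU service.

```python
import os

# Default quotas from the slide, converted to bytes.
# Assumption: decimal units (1 GB = 10**9 bytes); the slide does not say.
DEFAULT_QUOTA_BYTES = {
    "RFS": 100 * 10**9,
    "RDC": 10 * 10**9,
    "Slashtmp": 4 * 10**9,
}


def directory_size(path: str) -> int:
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symbolic links so linked data is not counted.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def fits_default_quota(path: str, service: str) -> bool:
    """Report whether the data under path fits the service's default quota."""
    return directory_size(path) <= DEFAULT_QUOTA_BYTES[service]
```

For example, `fits_default_quota("my_dataset", "RFS")` reports whether a hypothetical local `my_dataset` directory is within the 100GB RFS default.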

Storage Resource/Service Details
Service | Web Access URL | More Help at
RFS | http://rfsweb.iu.edu | http://kb.iu.edu/aroz.html
SDA | http://www.sdarchive.iu.edu | http://kb.iu.edu/aiyi.html
RDC | Application specific | http://kb.iu.edu/awmv.html
Alfresco Share | http://alfresco.uits.iu.edu | http://www.indianactsi.org/kb/alfresco
REDCap | http://redcap.uits.iu.edu | http://www.indianactsi.org/kb/redcap
Data Capacitor | N/A (accessed from the Unix command line) | http://kb.iu.edu/data/avvh.html
Slashtmp | http://slashtmp.iu.edu | http://kb.iu.edu/data/angt.html

Storage of HIPAA Regulated Data
The HIPAA (Health Insurance Portability and Accountability Act) Security Rule regulates electronic protected health information (ePHI), i.e. identifiable patient information.
It mandates physical, administrative, and technical controls for storing ePHI.

HIPAA Data …
To support IU School of Medicine (IUSM) researchers, RT initiated a project in 2008 to align its systems and services with HIPAA.
The project was overseen by a committee consisting of IU’s compliance office, IT security and policy offices, the IUSM CIO, faculty, and IT staff.
Alignment included gap and risk analyses by an outside expert, filling gaps, and the creation of an ongoing risk management plan.

HIPAA Data …
In 2009, the compliance office blessed RT as capable of handling ePHI.
As of Dec. 31, 2011, this has resulted in the following (starting from zero):
Number of biomedical user accounts on RT systems: 2800
Volume of biomedical data stored on RT systems: 500TB
Use of computing cycles on RT supercomputers: 1 million SUs
Number of biomedical databases: 450
Number of new RT services developed specifically for biomedical researchers: 10
Number of major NIH grants we are written into: 5
Number of FTEs these grants have funded: 6

Real World Examples
A research group in the IUSM Dept. of Radiology was running out of space in the department to archive digital X-ray images (100-200MB/image). They were able to use the SDA to store tens of thousands of these images and now rely solely on SDA as their image archive. They have over 10TB of data currently stored.

Real World Examples …
A research group needed to use an application to view a certain collection of data at the same time. They stored it in a group area in RFS, mapped it to drive R: on their individual Windows desktops, and accessed it simultaneously from various campus locations as well as from home or while traveling (using VPN).

Real World Examples …
The state of Indiana collected geospatial data when it flew the state in 2005. Because of the size of the data set, no one in the state had the capacity to make these data available to the public. The SDA was used to store all 20TB of orthoquads and currently serves them over the web (see http://gis.iu.edu).

Real World Examples …
A researcher in the School of Library Information Sciences at IUB wanted to explore relationships between fields of science in scientific journal publications. She was able to use the RDC to host nearly a TB of data in Oracle and do her relational work.

Real World Examples …
An IU researcher had an urgent need for a shared space to store work documents for collaborators within and outside IU. He could not wait for affiliate accounts to be created for external users. He was able to use Alfresco Share to set up multiple collaboration “sites” for the project within the hour and invite collaborators to these sites. Each site provides not only a shared document library, but also a shared wiki, blogs, etc.

Real World Examples …
The Division of Biostatistics at IUPUI wanted to help a clinical researcher at IUSM migrate data in spreadsheets to a central, web-accessible database, and allow her to share the database with collaborators at University of Maryland who needed to add patient data (ePHI) they were acquiring. They used REDCap to accomplish all this AND export the data in a format ready for the SAS statistics package to analyze.

Contact
Your single point of contact for all things RT: Anurag Shankar, ashankar@iu.edu, 812-325-8629
Local Contact: Carol Wood, cfwood@iun.edu, 219-980-7758