1
Research Data Storage Resources at IU
Anurag Shankar University Information Technology Services Indiana University March 2, 2012
2
Outline
University Information Technology Services November 13, 2018
Data Storage Use Cases - Types of research data and the storage they require
Data Storage Services - Where/how to store your data
Storage of HIPAA Regulated Data - Storing sensitive data
Real World Examples - How people are using the storage services
3
Data Types and Desired Storage Characteristics
Type of Data | Volume | Throughput | Access Speed | Criticality
Data being acquired | MB - TB | MB/s | Fast | High (not easy to reproduce)
Data in analysis | | MB - GB/s | Very fast | Low - High (reproducible)
Data being published/shared | MB - GB | KB - MB/s | Moderate | Low (reproducible)
Archival data | MB - PB | | Slow | High if not also stored elsewhere
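The volume and throughput columns together determine how long it takes to move a data set. A minimal sketch of that arithmetic follows; the 1 TB volume and 100 MB/s rate are illustrative assumptions, not figures from the slides, and the function name is hypothetical.

```python
# Illustrative unit constants (decimal units).
MB = 10**6
TB = 10**12


def transfer_seconds(volume_bytes: float, throughput_bytes_per_s: float) -> float:
    """Return the time in seconds needed to move a volume at a sustained rate."""
    if throughput_bytes_per_s <= 0:
        raise ValueError("throughput must be positive")
    return volume_bytes / throughput_bytes_per_s


# Assumed example: 1 TB moved at a sustained 100 MB/s.
hours = transfer_seconds(1 * TB, 100 * MB) / 3600
print(f"{hours:.1f} hours")  # prints "2.8 hours"
```

The same volume at KB/s rates would take months, which is why data being acquired or analyzed needs faster storage than data that is only published or archived.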
4
Research Data Storage Services
Data Capacitor
Research File System (RFS)
Scholarly Data Archive (SDA)
Research Database Complex (RDC)
Alfresco Share
REDCap
Slashtmp
5
How IU’s Research Data Storage Services Fit Data Types
Type of Data | Resource/Service | Space Available | Eligibility | Duration
Data being acquired | RFS, Data Capacitor | GB - 100s of TB | IU | Days - Months
Data in analysis | RFS, Data Capacitor on Big Red/Quarry | MB - TB | | Days - Months
Data being published/shared | Server disk, Alfresco Share, REDCap, Slashtmp | MB - GB | IU, PU, ND, outside users | Months - Years
Archival data | SDA | GB - PB | | Years
6
Data Storage Services by Use
Use | Service | Access | Backed Up?
High Performance Storage | Data Capacitor | File System on Big Red/Quarry | No
Storage for In-Work Data | RFS | Mapped Drive, Web, SFTP, OpenAFS client | Yes
Structured Data Storage | RDC (Oracle, MySQL) | Applications |
Shared Document Storage | Alfresco Share | Web, WebDAV |
Shared Storage for HIPAA data | REDCap | Web |
Archival Storage | SDA | Web, Mapped Drive, SFTP, Parallel FTP |
7
Data Storage Services by Use
Service | Targeted For | Not Good For
RFS | Storing relatively small files that are updated and/or accessed frequently, need group access | Storing database files, backups
SDA | Storing large files or small files aggregated (zipped) into large files, long-term storage | Storing small files, files requiring frequent/quick access, in-work data
RDC | Relational databases | Storing unstructured data
Alfresco Share | Sharing Word, Excel, PDF, text files | Storing data
REDCap | Storing & sharing HIPAA data | General storage
Data Capacitor | Temporary data being read or written on Big Red/Quarry requiring the fastest speeds available |
Slashtmp | Temporary space to exchange files too large as attachments |
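Because the SDA is aimed at large files or small files aggregated (zipped) into large ones, a collection of small files would be bundled before archiving. The sketch below shows one way to do that with Python's standard zipfile module. The function name and paths are hypothetical, and the step that actually uploads the archive to the SDA is not shown, since the slides do not describe it.

```python
import zipfile
from pathlib import Path


def bundle_directory(src_dir: str, archive_path: str) -> int:
    """Aggregate every file under src_dir into one zip archive.

    Returns the number of files added.
    """
    src = Path(src_dir)
    archive = Path(archive_path).resolve()
    count = 0
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in src_dir.
            if path.is_file() and path.resolve() != archive:
                # Store paths relative to src_dir so the archive unpacks cleanly.
                zf.write(path, arcname=path.relative_to(src).as_posix())
                count += 1
    return count
```

A call such as bundle_directory("xray_images", "xray_images.zip") would produce a single file suited to long-term storage, at the cost of having to retrieve and unpack the whole archive to read any one file.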
8
Storage Resource/Service Details
Service | Technology | Capacity
RFS | OpenAFS | 60TB
SDA | High Performance Storage System (HPSS) | 15 PB tape, 150TB disk
RDC | Oracle, MySQL | 200TB
Alfresco Share | | 1TB
REDCap | |
Data Capacitor | Lustre | 360TB
9
Storage Resource/Service Details
Service | Default Quota | Account Request
RFS | 100GB |
SDA | None |
RDC | 10GB |
Alfresco Share | |
REDCap | |
Data Capacitor | | Big Red/Quarry account
Slashtmp | 4GB | No Slashtmp account needed, only IU login to use
10
Storage Resource/Service Details
Service | Web Access URL | More Help at
RFS | |
SDA | |
RDC | Application specific |
Alfresco Share | |
REDCap | |
Data Capacitor | N/A (accessed from the Unix command line) |
Slashtmp | |
11
Storage of HIPAA Regulated Data
The HIPAA (Health Insurance Portability and Accountability Act) Security Rule regulates electronic protected health information (ePHI), i.e. identifiable patient information.
It mandates physical, administrative, and technical controls for storing ePHI.
12
HIPAA Data …
To support IU School of Medicine (IUSM) researchers, RT initiated a project in 2008 to align its systems and services with HIPAA.
The project was overseen by a committee consisting of IU’s compliance office, IT security and policy offices, the IUSM CIO, faculty, and IT staff.
Alignment included gap and risk analyses by an outside expert, filling gaps, and the creation of an ongoing risk management plan.
13
HIPAA Data …
In 2009, the compliance office approved RT as capable of handling ePHI.
As of Dec. 31, 2011, this has resulted in the following (starting from zero):
Number of biomedical user accounts on RT systems: 2800
Volume of biomedical data stored on RT systems: 500TB
Use of computing cycles on RT supercomputers: 1 million SUs
Number of biomedical databases: 450
Number of new RT services developed specifically for biomedical researchers: 10
Number of major NIH grants we are written into: 5
Number of FTEs these grants have funded: 6
14
Real World Examples
A research group in the IUSM Dept. of Radiology was running out of space in the department to archive digital X-ray images ( MB/image). They were able to use the SDA to store tens of thousands of these images and now rely solely on SDA as their image archive. They have over 10TB of data currently stored.
15
Real World Examples …
Members of a research group needed to view the same collection of data with an application at the same time. They stored it in a group area in RFS, mapped it to drive R: on their individual Windows desktops, and accessed it simultaneously from various campus locations as well as from home or while traveling (using VPN).
16
Real World Examples …
The state of Indiana collected geospatial data when they flew the state in Because of its size, no one in the state had the capacity to make these data available to the public. The SDA was used to store all 20TB of orthoquads and serves them currently over the web (see
17
Real World Examples …
A researcher in the School of Library Information Sciences at IUB wanted to explore relationships between fields of science in scientific journal publications. She was able to use the RDC to host nearly a TB of data in Oracle and do her relational work.
18
Real World Examples …
An IU researcher had an urgent need for a shared space to store work documents for collaborators within and outside IU. He could not wait for affiliate accounts to be created for external users. He was able to use Alfresco Share to set up multiple collaboration “sites” for the project within the hour and invite collaborators to these sites. Each site provides not only a shared document library, but also a shared wiki, blogs, etc.
19
Real World Examples …
The Division of Biostatistics at IUPUI wanted to help a clinical researcher at IUSM migrate data in spreadsheets to a central, web-accessible database, and allow her to share the database with collaborators at University of Maryland who needed to add patient data (ePHI) they were acquiring. They used REDCap to accomplish all this AND export the data in a format ready for the SAS statistics package to analyze.
20
Contact
Your single point of contact for all things RT: Anurag Shankar
Local Contact: Carol Wood