Data uploading and sharing with CyVerse

Slides:



Advertisements
Similar presentations
KEYS TO SUCCESS DATA PREPARATION AND ORGANIZATION
Advertisements

IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Overview of the iPlant Data Store.
Backing Up Your Computer Hard Drive Lou Koch June 27, 2006.
Putting yourself online Making Web 2.0 work for you By Jonathan Smith Overseas School of Colombo.
Managing Data with iPlant Introduction to Uploading, Downloading, Sharing, and Metadata in the Data Store.
The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop Discovery Environment Overview.
1 iPlant Data Store (iDS) Supporting the Lifecycle of Data Nirav Merchant 1.
DDN & iRODS at ICBR By Alex Oumantsev History of ICBR  Campus wide Interdisciplinary Center for Biotechnology Research  Core Facility  Funded by the.
The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop iCommands and Other Data Store Resources.
Preserving Digital Culture: Tools & Strategies for Building Web Archives : Tools and Strategies for Building Web Archives Internet Librarian 2009 Tracy.
IPlant cyberifrastructure to support ecological modeling Presented at the Species Distribution Modeling Group at the American Museum of Natural History.
| nectar.org.au NECTAR TRAINING Module 3 Common use cases.
-- Don Preuss NCBI/NLM/NIH
Section 9: Configuring Roaming Profiles and Folder Redirection Managing User Profiles Configuring Folder Redirection Using Folder Redirection and Roaming.
The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop Discovery Environment Overview.
IPlant Collaborative Hands-on Cyberinfrastructure Workshop – Part 2 R. Walls University of Arizona Biodiversity Information Standards (TDWG) Sep. 29, 2015,
Moving files to the server You MUST save any files that you want to access in the future to the server. Go to “My Computer”. Select the “C” drive to see.
The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop iPlant Data Store.
Public Relations Interim Image Archive Goal: Provide and INTERIM image archive solution for Public Relations 2 to 4 TB of images currently spread across.
The iPlant Collaborative Using iPlant for sharing, managing, and analyzing ecological data Ramona Walls Presented at ESA 2014 – Ignite session August 12,
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Overview of the iPlant Data Store.
The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop Discovery Environment Overview.
IPlant Collaborative Hands-on Cyberinfrastructure Workshop - Part 1 R. Walls University of Arizona Biodiversity Information Standards (TDWG) Sep. 28, 2015,
Backup. What to back up? Pictures Documents (include s if using Off Line Program) Video’s Music System Image – Programs – Registry – Operating System.
Knowledge Management Platform Communities of Practice User Guide for CoP users Copyright © 2010 Group Technology Solutions. All Rights Reserved.
The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop iPlant Data Store – Managing Your ‘Big’ Data.
You can use Springnote for… Personal journals Note taking Online discussions Springnote can be accessed from any computer with internet access, making.
Communications & Networks National 4 & 5 Computing Science.
MICROSOFT ONENOTE ADVANCED MODULE 1 EXPLORE ONENOTE 2010  Navigate in the OneNote program window  Work in the OneNote program window  Explore.
CyVerse-enabled NCBI Sequence Read Archive (SRA) Submission Pipeline
STORAGE LOCAL OR ONLINE. DATA STORAGE: DATA YOU STORE ONLINE FILES SUCH AS IMAGES, SPREADSHEETS, VIDEO OR MUSIC. ONLINE DATA STORAGE: WHEN FILES ARE STORES.
The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop Data Demo and MAKER-P.
IPlant Collaborative Tools and Services Workshop Overview of the iPlant Discovery Environment Sriram Srinivasan.
CloudBerry Explorer for S3. CB Explorer Free to use Browse and manage files PowerShell functions Open and edit files  CloudBerry Explorer is an easy.
Transforming Science Through Data-driven Discovery Tools and Services Workshop Atmosphere Joslynn Lee – Data Science Educator Cold Spring Harbor Laboratory,
Transforming Science Through Data-driven Discovery Tools and Services Workshop Data Store Overview.
CyVerse Workshop Discovery Environment Overview. Welcome to the Discovery Environment A Simple Interface to Hundreds of Bioinformatics Apps, Powerful.
Transforming Science Through Data-driven Discovery Workshop Overview Ohio State University MCIC Jason Williams – Lead, CyVerse – Education, Outreach, Training.
CyVerse Focus Form: Collaborating with Image Data in Bisque Presenters: Kris Kvilekval, Dmitry Fedorov (Bisque, UC Santa Barbara) Ramona Walls (CyVerse,
Transforming Science Through Data-driven Discovery Tools and Services Workshop Data Store – Managing your ‘Big’ Data Joslynn Lee, Ph.D. – Data Science.
Transforming Science Through Data-driven Discovery Tools and Services Workshop Data Store – Managing your ‘Big’ Data Joslynn Lee – Data Science Educator.
CyVerse Data Store Managing Your ‘Big’ Data. Welcome to the Data Store Manage and share your data across all CyVerse platforms.
Transforming Science Through Data-driven Discovery Using CyVerse Cyberinfrastructure to Enable Data Intensive Research, Collaboration, and Education Atmosphere.
Joslynn S. Lee, PhD, Data Science Educator Cold Spring Harbor Laboratory, DNA Learning Center Transforming Science Through Data-driven Discovery.
Course: Cluster, grid and cloud computing systems Course author: Prof
Core ELN Training: Office Web Apps (OWA)
Backing Up Workstations: How to Protect Yourself on the Cheap
SharePoint ESSENTIALS TOOLKIT 2017 – Product Demo
CyVerse Tools and Services
Tools and Services Workshop
Joslynn Lee – Data Science Educator
CyVerse Discovery Environment
Amazon Storage- S3 and Glacier
The importance of being Connected
MANAGING, SHARING, AND PUBLISHING DATA WITH THE CYVERSE DATA STORE
A Few Questions Before We Begin
Tools and Services Workshop
Tools and Services Workshop Overview of the iPlant Data Store
SRA Submission Pipeline
Windows Operating Systems (Cont.)
BASICS 1 Windows XP.
Cyberinfrastructure for the Life Sciences
BusinessObjects IN Cloud ……InfoSol’s story
Storing and Accessing G-OnRamp’s Assembly Hubs outside of Galaxy
A quick ‘how to’ guide Lya Noon – H808
MCBIOS 2016 – University of Memphis, TN
Introducing MagicInfo 6
What is UiPATH? For more details visit this link online-training.
TERMS AND CONDITIONS   These PowerPoint slides are a tool for lecturers, and as such: YOU MAY add content to the slides, delete content from the slides,
Presentation transcript:

Data uploading and sharing with CyVerse Data Store – Managing Your ‘Big’ Data

Welcome to the Data Store Manage and share your data across all CyVerse platforms

Working with Big Data Challenges: the scope and scale of life sciences data continue to grow Big data a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time Big data sizes are a constantly moving target currently ranging from a few dozen terabytes (TB) to many petabytes of data in a single data set. Snijders, C.; Matzat, U.; Reips, U.-D. (2012). "'Big Data': Big gaps of knowledge in the field of Internet". International Journal of Internet Science 7: 1–5. “‘Big Data': Big gaps of knowledge in the field of Internet". International Journal of Internet Science 7: 1–5.

Working with Big Data Challenges: data generation is cheaper and faster http://www.genome.gov/sequencingcosts/

Biologists work with and require access to diverse data types Working with Big Data Challenge: biology encompasses more than sequence data Advanced Imaging Geospatial Network Biologists work with and require access to diverse data types

Changes in scale introduce quantitative and qualitative complications Working with Big Data Challenges: changes in data require changes in tools Difficult / slow transfers Expense for storage / backup Difficult to share and publish Analysis Metadata (What Is metadata?) Changes in scale introduce quantitative and qualitative complications

Data Store Overview The Data Store services all CyVerse platforms Access your data from multiple CyVerse services Automatic backup (redundant between University of Arizona and University of Texas Default 100 GB allocation, > 1 TB allocations available with justification

Data lifecycle support Discovery Upload Data Commons Repository (DCR), Elasticsearch Discovery Environment, iCommands, Cyberduck Metadata Add, delete, copy; metadata templates; bulk metadata Publication Analysis Data Commons Repository (DCR), NCBI-SRA Discovery Environment, Atmosphere, Agave API, BisQue, DNA Subway Sharing Community Data folders, Data Commons, quick share links

Data Store Overview Store any type of files related to your research Benefits Get Science Done Store any type of files related to your research An evolving “Data Commons” lets you access important datasets Reproducibility Metadata captures information needed for reproducibility Automatic backup and accessibility support your project’s data management plan Productivity iRODS makes high-speed transfers possible (100 GB in ~30 min) Share data instantly with collaborators within CyVerse

Key component of your data management Data Store Overview Some important things we will not “see” in the demo Data Backups Data Transfer Source Destination Copy Method Time (seconds) CD My Computer cp 320 Berkeley Server scp 150 External Drive 36 USB 2.0 Flash 30 Data Store iget 18 15 Arizona Texas Key component of your data management Worry-free Closer to optimum conditions: transfers between University of Arizona and UC Berkeley 100 GB: 26m15s, 1 GB 17.5s

Local connections and institutional policies limit data transfer Data Store Overview Some important things we will not “see” in the demo http://www.speedtest.net/ Local connections and institutional policies limit data transfer

Getting Data into CyVerse

CyVerse Data Store Point-and-click Command line Multiple ways to access Point-and-click Command line Cyberduck Discovery Environment iCommands

Discovery Environment Simple upload/download for small files Bulk upload files and folders (<10GB) Import from URL (no size limit) Advantage + Disadvantage - Covers most upload/download sharing needs Some size/speed limitations

Cyberduck Drag and drop files and folders No size limit, file editing/previews Easy Desktop functionality Advantage + Disadvantage - More like desktop file systems No permissions/metadata control

Requires some command line expertise iCommands Full flexibility Ability to script and automate Access from terminal/server Advantage + Disadvantage - Customizability Requires some command line expertise