iPlant Collaborative: Powering a New Plant Biology

Presentation transcript:

1 iPlant Collaborative: Powering a New Plant Biology

2 The Egalitarian Gene: Agarose Gel Electrophoresis, 1973. 1958: Matt Meselson & Ultracentrifuge, $500,000. 1973: Sharp, Sambrook, Sugden; Gel Electrophoresis Chamber, $250.

3 The Egalitarian Genome: Next Generation Sequencing, 2005. Bacterial colonies; PCR colonies (clusters, features). Hundreds of millions of…

4 The Egalitarian Genome: Next-Next Generation Sequencing, 2013? 2003: ABI 3730 Sequencer. Human Genome: $2.7 Billion, 13 Years. 2013?: Oxford Nanopore MinION. Human Genome: $900, 6 Hours.
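
The scale of the change on slide 4 is easier to see as ratios. The following is a minimal arithmetic sketch using only the four figures on the slide; note that the slide itself marks the 2013 figures with a question mark, so they are projections rather than measurements. The variable names are illustrative, and the year is taken as 365 days.

```python
# Figures as stated on the slide; the 2013 values are marked "?" there.
COST_2003_USD = 2_700_000_000   # Human Genome: $2.7 billion
COST_2013_USD = 900             # Human Genome: $900 (projected)
HOURS_2003 = 13 * 365 * 24      # 13 years, ignoring leap days
HOURS_2013 = 6                  # 6 hours (projected)

cost_ratio = COST_2003_USD / COST_2013_USD
time_ratio = HOURS_2003 / HOURS_2013

print(f"Cost reduction: about {cost_ratio:,.0f}-fold")
print(f"Time reduction: about {time_ratio:,.0f}-fold")
```

Under these figures the cost falls by a factor of about three million and the elapsed time by a factor of roughly nineteen thousand.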

7 The Big Data Problem: Storage and Analysis. “BGI, based in China, is the world’s largest genomics research institute, with 167 DNA sequencers producing the equivalent of 2,000 human genomes a day. BGI churns out so much data that it often cannot transmit its results to clients or collaborators over the Internet or other communications lines because that would take weeks. Instead, it sends computer disks containing the data, via FedEx.”
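
A back-of-the-envelope calculation shows why network transfer could plausibly "take weeks" in the BGI quote. The sketch below is illustrative only: the 2,000 genomes per day comes from the quote, but the data volume per genome (100 GB) and the sustained link speed (1 Gbit/s) are assumptions chosen for the example, not figures from the presentation, and `transfer_days` is a hypothetical helper.

```python
def transfer_days(total_bytes: float, bits_per_second: float) -> float:
    """Days needed to move total_bytes over a link sustaining bits_per_second."""
    seconds = total_bytes * 8 / bits_per_second
    return seconds / 86_400  # seconds per day


GENOMES_PER_DAY = 2_000     # from the quote
BYTES_PER_GENOME = 100e9    # ASSUMPTION: ~100 GB of raw data per genome
LINK_BPS = 1e9              # ASSUMPTION: sustained 1 Gbit/s link

daily_bytes = GENOMES_PER_DAY * BYTES_PER_GENOME
days = transfer_days(daily_bytes, LINK_BPS)
print(f"One day of output takes about {days:.1f} days to transmit")
```

With these assumed values, a single day of sequencing output (200 TB) would need about 18.5 days on the wire, so the backlog grows faster than it can be sent. Different assumptions change the number but not the qualitative point.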

8 Biology’s Other Big Data: Phenomics, Visualization

9 Paradigm Shift. Data-limited > data-unlimited world. Hypotheses underdetermined by data > data underdetermined by hypotheses. Reductive biology > constructive biology.

10 “The useful lifetime of our analysis toolchains is now 6 months.” (Matthew Trunnell, Broad Institute) This requires a *platform* that can support diverse and constantly evolving needs. Cyberinfrastructure is the platform for a biological “App Store” that lets scientists run the tools and workflows they need.

11 iPlant Collaborative: An NSF project to develop a computer infrastructure to apply computational thinking to solve biological problems. Virtual organization; high performance computing; data and data analysis; learning and workforce.

13 iPlant Collaborative: A virtual organization. UA, TACC, CSHL.

14 High Performance Computing: Texas Advanced Computing Center (TACC). Dan Stanzione, Deputy Director. Two of the three largest parallel computers in the XSEDE (formerly TeraGrid) system; 90,000 compute cores; up to 1 TB shared memory; growing to ~500,000 cores by end of 2012.

15 iPlant Audiences: Converge on the Middle Ground. High: bioinformaticians and computational biologists. Mid: bright biological researchers who need to solve problems, but who aren’t bioinformaticians or don’t know one down the hall. Low: high school and college faculty engaged primarily in teaching.

16 iPlant Collaborative

17 Ways to Access iPlant. iPlant Data Store: all data, large and small. Atmosphere: virtual hosting of web apps, sites, and databases. Discovery Environment: integrated web apps. MyPlant: social networking. DNASubway: annotation and more. Standalone apps: TNRS, TreeViewer, PhytoBisque, etc. The API: for programmers embedding iPlant capabilities. Command line: for experts (through TeraGrid/XSEDE).

18 Data Store: Texas Advanced Computing Center. Dan Stanzione: “We hit a billion files about a year ago, so when people ask us what we’re going to do about a billion files. The answer is we’re going to do this.” 100,000 terabytes of disk and tape. The Data Store moves files larger than 2 GB with ease.

19 Atmosphere: Cloud Computing for Biology. Handle those big data. Analogous to Amazon Elastic Compute Cloud (EC2). The default virtual machine (VM) has 6 CPUs with 16 GB of RAM, compared to a desktop or laptop with 1-2 CPUs and 1-4 GB of RAM. A VM with up to 16 CPUs and 32 GB of RAM can be assigned on request. Co-localize with your data from the iPlant Data Store. Configure a machine and data transformations to share with collaborators, or as a use case for students.

20 Discovery Environment: A rich web client. Consistent interface to a range of bioinformatics tools. Integrated, extensible system of applications and services. Add tools, build custom workflows.

21 Other major projects are beginning to adopt the iPlant CI as their underlying infrastructure (some completely, some in limited ways): CoGe (auth service, hosting); BioExtract (web service platform); CIPRES (computation); Gates Integrated Breeding Platform (hosting, development); Galaxy (storage, for now).

22 iPlant APIs Resources The Biology App Store

