Bioinformatics Community of CNGrid: A New Approach to Utilizing Grids


1 Bioinformatics Community of CNGrid: A New Approach to Utilizing Grids
Yongwei Wu, Tsinghua University
Participants
Sponsorship
Beijing Institute of Genomics, CAS
Tsinghua University

2 Outline Background Our Approach Achievements Concluding Remarks

3 Exponential Growth of Bio Data
We need more storage and processing power to store and analyze these data!

4 Grids Draw Much Attention
Bioinformatics is an important application domain of grid computing around the world!

5 Problems with Existing Bioinformatics Grids Practice
Professionalism and sharing are not well balanced
  Resources are limited in environments built under the leadership of domain scientists.
  Scale is also limited in environments built only by bioinformatics researchers.
  Environments built on general infrastructure are usually not specialized and are hard to use.
No support for sharing GUI software
Data support for computation is not highlighted
  Data synchronization, backup and storage are beyond the ability of domain users, whereas IT developers know little about application requirements.
Only part of the research activities is covered
  No support for daily communication, results sharing, …

6 Outline Background Our Approach Achievements Concluding Remarks

7 Key Points
Domain scientists lead the development of the bioinformatics community
Develops Nova to support GUI software sharing
  Nova is a toolkit for customizing the application environment
Highlights data support for computation
  Storage can be attached to the computing environment
Introduces new functionalities
  Knowledge repository, data/software sharing, Q&A system

8 Nova: A Virtual Computing Toolkit
Nova aims to provide facilities for users to utilize physical infrastructures in an easier and more productive way.
Diagram labels: Customized Host, Customized Cluster, Customized Services, Nova

9 Nova Architecture & Work Procedure
Components shown: Master Node, Worker Nodes, Information Service, Configuration, Worker Selection, Data Storage, KVM/XEN Hypervisor, VM, VM Monitor
Work procedure: ① Request, ② Query, ③ Create VM, ④ OS Image, ⑤ Start VM, ⑥ App Image, ⑦ Notification, ⑧ VNC Remote Desktop
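
The numbered steps ① to ⑧ on this slide can be read as the lifecycle of one user request. Below is a minimal sketch in Python, assuming the steps run in the order they are numbered. The names `Worker`, `Master`, `select_worker` and `handle_request`, the least-loaded selection rule, and the `vnc://` address format are all hypothetical and are not taken from Nova itself.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """A worker node as seen by the master (hypothetical model)."""
    name: str
    running_vms: int = 0


@dataclass
class Master:
    """Master node that records each step of the work procedure."""
    workers: list
    log: list = field(default_factory=list)

    def select_worker(self):
        # Step 2 (Query): look up the current state of the workers.
        # Assumption: pick the least-loaded worker; the slide does not
        # say which selection policy Nova actually uses.
        self.log.append("2 Query")
        return min(self.workers, key=lambda w: w.running_vms)

    def handle_request(self, user, app):
        self.log.append("1 Request")
        worker = self.select_worker()
        self.log.append("3 Create VM")
        self.log.append("4 OS Image")      # base OS image for the VM
        self.log.append("5 Start VM")
        worker.running_vms += 1
        self.log.append("6 App Image")     # image holding the requested tool
        self.log.append("7 Notification")
        self.log.append("8 VNC Remote Desktop")
        # Hypothetical address the user's browser would connect to.
        return f"vnc://{worker.name}/{user}-{app}"


master = Master([Worker("w1", running_vms=2), Worker("w2")])
print(master.handle_request("alice", "clustalx"))
print(master.log)
```

The sketch only records the order of events; image transfer and hypervisor calls are left out because the slide gives no detail on them.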

10 Nova Features
Install-/configuration-free client
  Users only need a Web browser to use the system
High productivity
  Pre-virtualized software and one-click configuration
Inherent integration with storage cloud
  After a VM is created, the user's personal space in the storage cloud can be automatically attached as an independent drive, which then acts as the source of input data and the destination of produced data

11 GUI Software Sharing by Nova
Diagram labels: User Request, Nova Core Services, Worker Selection, Image Loading

12 Providing Workflow Support
To further improve research efficiency, both a workflow definition tool and workflow services are supplied.
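
As an illustration of what a workflow definition might look like, here is a minimal sketch in Python. It assumes a workflow is a set of named steps with dependencies between them, which the slide does not state. The step names are borrowed from the tool categories listed in this deck, and `WORKFLOW` and `run_order` are hypothetical names, not part of the community's workflow services.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
WORKFLOW = {
    "format_conversion": set(),
    "sequence_alignment": {"format_conversion"},
    "evolution_analysis": {"sequence_alignment"},
}


def run_order(workflow):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(workflow).static_order())


for step in run_order(WORKFLOW):
    print(step)
```

Declaring dependencies rather than a fixed sequence lets independent steps be scheduled in any order, which is one common reason to use a workflow tool instead of a script.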

13 Knowledge Repository
Hot Research Topics
Important Journals
Important Conferences/Workshops
Famous Scholars
Important Research Institutes
Influential Surveys/Papers
Important Organizations/Associations

14 Other Useful Resources
Q & A System
  For users to help each other
System Announcement
  Seminar
  Research Breakthrough
  Conference/Workshop CFP
  Newly added functions

15 Outline Background Our Approach Achievements Concluding Remarks

16 The Community Is in Service

17 234 tools are integrated!
DNA Analysis (28): SequenceViewer, WinGene, etc.
RNA Analysis (19): miRanda, RNAshapes, etc.
Protein Analysis (17): InterproScan, InterViewer, etc.
Protein Structure (38): Protein Explorer, RasTop, etc.
Evolution Analysis (41): MEGA, GeneTree, etc.
Sequence Assembly and Alignment (74): BLAST, ClustalX, BioEdit, etc.
Sequence Format Conversion (17): SeqVerter, DataConvert, etc.
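
The per-category counts on this slide can be checked against the stated total of 234. The short Python sketch below copies the counts from the slide; the names `TOOL_COUNTS`, `total_tools` and `largest_category` are introduced here for illustration only.

```python
# Tool counts per category, copied from the slide.
TOOL_COUNTS = {
    "DNA Analysis": 28,
    "RNA Analysis": 19,
    "Protein Analysis": 17,
    "Protein Structure": 38,
    "Evolution Analysis": 41,
    "Sequence Assembly and Alignment": 74,
    "Sequence Format Conversion": 17,
}


def total_tools(counts):
    """Sum the tools over all categories."""
    return sum(counts.values())


def largest_category(counts):
    """Return the name of the category with the most tools."""
    return max(counts, key=counts.get)


print(total_tools(TOOL_COUNTS))       # 234, matching the slide's total
print(largest_category(TOOL_COUNTS))  # Sequence Assembly and Alignment
```

The seven categories add up to exactly 234, so the slide's total and its breakdown agree.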

18 47 databases are provided!
UCSC Genome full mirror + following

19 Community Usage
More than 100 users now
More than 60 institutes are involved.
Users' preference for the resources provided
  Software tools: 77.38%
  Databases: 67.86%
  Knowledge Repository and others: 32.14%

20 Scenario for HnNn Analysis

21 Outline Background Our Approach Achievements Concluding Remarks

22 Concluding Remarks
User- and task-oriented design is important
  The key to making cloud computing successful
  The reason why we chose a design led by domain scientists
Value-added services are important
  The key to attracting and retaining users
  The reason why we provide workflow support
Challenges still ahead
  How to survive the data deluge?
  How to support new requirements?

23 Thanks!
