Dawei Lin, Ph.D., Director, Bioinformatics Core, UC Davis Genome Center. July 20, 2008. SLIMS (Solexa sequencing Laboratory Information Management System).

Dawei Lin, Ph.D. Director, Bioinformatics Core UC Davis Genome Center July 20, 2008, SLIMS (Solexa sequencing Laboratory Information Management System)

Next Gen Sequencing Applications
- Deep sequencing (de novo, resequencing)
- SNP discovery
- ChIP-Seq
- SAGE
- Run-through sequencing
- Digital expression profiling
- ……

Illumina Sequencing Data
- 800 GB
- 200 GB
- Hundreds of thousands of files
- 17 hours of copying to a USB drive
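The copy time on this slide implies a rough transfer rate. The sketch below works it out under an assumption the slide does not state explicitly: that the 17 hours refers to copying the 800 GB figure. The function name and the decimal unit convention (1 GB = 1000 MB) are illustrative choices, not part of SLIMS.

```python
def effective_throughput_mb_s(size_gb: float, hours: float) -> float:
    """Average transfer rate in MB/s, using decimal units (1 GB = 1000 MB)."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return size_gb * 1000 / (hours * 3600)


# Assumption: the 17-hour copy moved the full 800 GB.
rate = effective_throughput_mb_s(800, 17)
print(f"{rate:.1f} MB/s")
```

Under that assumption the average rate is about 13 MB/s, which is one way to see why moving whole data sets onto portable drives is slow.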

Core Facility Specific Issues
- Stable and reliable infrastructure
- Privacy (multiple customers)
- Data sharing
- Web access
- Interoperability
- Recharge
- Each lane can belong to a different customer
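The point that each lane can belong to a different customer affects both privacy and recharge. A minimal sketch of that idea follows. The `Lane` record, the function names, and the customer labels are hypothetical and are not taken from the SLIMS schema; the eight-lane flow cell is assumed here as the Genome Analyzer layout.

```python
from dataclasses import dataclass
from typing import Dict, List

LANES_PER_FLOW_CELL = 8  # assumed Genome Analyzer flow cell layout


@dataclass(frozen=True)
class Lane:
    run_id: str
    number: int    # 1..LANES_PER_FLOW_CELL
    customer: str  # hypothetical customer label


def lanes_visible_to(customer: str, lanes: List[Lane]) -> List[Lane]:
    """Privacy: a customer sees only the lanes assigned to them."""
    return [lane for lane in lanes if lane.customer == customer]


def recharge_counts(lanes: List[Lane]) -> Dict[str, int]:
    """Recharge: count the lanes to bill to each customer."""
    counts: Dict[str, int] = {}
    for lane in lanes:
        counts[lane.customer] = counts.get(lane.customer, 0) + 1
    return counts


# One run shared by two hypothetical customers: lanes 1-5 and lanes 6-8.
run = [Lane("run_001", n, "lab_a" if n <= 5 else "lab_b")
       for n in range(1, LANES_PER_FLOW_CELL + 1)]
print([lane.number for lane in lanes_visible_to("lab_b", run)])
print(recharge_counts(run))
```

Keying access and billing on the lane rather than on the run is what lets a single run be split across customers.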

Solexa Sequencing Data Flow
(This infrastructure can hold two copies of the data for at least three months.)
- Illumina Genome Analyzer (GA): 1 TB per data set per 3 days
- Solexa server for image processing and base calling (2 Intel Xeon E5345 quad-core 2.33 GHz, 16 GB RAM, ~8 TB): processing time ~30 hours per data set; data retention time up to 4 weeks (no long-term storage)
- Copy on the fly
- Online data access server, Sun Thumper x4500 (48 TB): data retention time up to 3 months
- Disk-to-disk backup / redundant server
- 1st copy, 2nd copy
- Web access, Secure Shell access
- 2 months of free access
- Linux cluster: alignment and assembly
- Sun StorageTek tape backup library
- Mobile hard drive: data retention time user-specified; self-service; recharge
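The retention figures in this data flow can be checked with simple arithmetic. The sketch below assumes the worst case in which the full 1 TB produced every 3 days is kept on a server; the slide does not say how much of each data set is actually copied, so the results are upper bounds. Function and parameter names are illustrative.

```python
def storage_needed_tb(retention_days: float,
                      tb_per_dataset: float = 1.0,
                      days_per_dataset: float = 3.0,
                      copies: int = 1) -> float:
    """Storage needed to keep every data set produced during the retention window."""
    return retention_days / days_per_dataset * tb_per_dataset * copies


def max_retention_days(capacity_tb: float,
                       tb_per_dataset: float = 1.0,
                       days_per_dataset: float = 3.0) -> float:
    """Longest retention window a server of the given capacity can sustain."""
    return capacity_tb / tb_per_dataset * days_per_dataset


# Three months (taken as 90 days) of one instrument's output, single copy.
print(storage_needed_tb(90))   # TB needed
print(max_retention_days(48))  # days a 48 TB server could hold
```

With these assumptions, 90 days of output needs 30 TB, which fits on a 48 TB server with headroom. A second copy on the same terms would need another 30 TB on the backup server.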

SLIMS workflow
- GA operation
- MySQL
- Central storage
- Access VM
- rsync
- Web
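The workflow relies on rsync to move run data between servers. The sketch below shows one way such a transfer step could be wrapped. It is not the SLIMS implementation: the paths and host name are hypothetical, and the choice of `--archive` and `--partial` is an assumption about what a large, resumable copy would want.

```python
import subprocess
from typing import List


def build_rsync_command(source: str, destination: str,
                        dry_run: bool = False) -> List[str]:
    """Build an rsync command line that mirrors a run directory."""
    cmd = ["rsync", "--archive", "--partial"]
    if dry_run:
        cmd.append("--dry-run")  # report what would be copied, copy nothing
    # A trailing slash on the source copies the directory's contents
    # rather than nesting the directory itself inside the destination.
    cmd += [source.rstrip("/") + "/", destination]
    return cmd


def sync(source: str, destination: str) -> bool:
    """Run rsync and report success; the caller can record this as the sync status."""
    result = subprocess.run(build_rsync_command(source, destination), check=False)
    return result.returncode == 0


# Hypothetical paths, shown without running rsync.
print(" ".join(build_rsync_command("/data/run_001", "storage:/archive/run_001")))
```

Separating command construction from execution keeps the command easy to log and test, and the boolean result is the kind of value a status page for transfers between servers could display.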

Future Directions
- Open source
- OpenID
- Integrated with different pipelines
- BioCloud

Acknowledgement
- Adam Schaal, DB Programmer
- Brad Sickler, System Programmer
- Charlie Nicolet, Director of DNA Technology Core
- Dawei Lin

Run view

Lane view

Summary

View files folder

Create a run

Status of rsync between different servers

Documentation