SLIMS (Solexa sequencing Laboratory Information Management System)
Dawei Lin, Ph.D.
Director, Bioinformatics Core
UC Davis Genome Center
July 20, 2008
Next Gen Sequencing Applications
  Deep Sequencing (de novo, resequencing)
  SNP discovery
  ChIP-Seq
  SAGE
  Run-through Sequencing
  Digital Expression Profiling
  ……
Illumina Sequencing Data
  800 GB
  200 GB
  Hundreds of thousands of files
  17 hours of copying to a USB drive
Core Facility Specific Issues
  Stable and reliable infrastructure
  Privacy: multiple customers
  Data sharing
  Web access
  Interoperability
  Recharge
  Each lane can belong to a different customer
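The per-lane ownership point above can be sketched in a few lines of Python. This is only an illustration, not the SLIMS schema: the `Lane` class, the customer names, and the flat `cost_per_lane` rate are all hypothetical. The only fixed fact used is that a Genome Analyzer flow cell has 8 lanes.

```python
from collections import defaultdict
from dataclasses import dataclass

LANES_PER_FLOW_CELL = 8  # a Genome Analyzer flow cell has 8 lanes


@dataclass(frozen=True)
class Lane:
    number: int    # lane position on the flow cell, 1..8
    customer: str  # hypothetical customer identifier


def recharge_by_customer(lanes, cost_per_lane):
    """Sum a flat per-lane charge for each customer on one run."""
    totals = defaultdict(float)
    for lane in lanes:
        if not 1 <= lane.number <= LANES_PER_FLOW_CELL:
            raise ValueError(f"invalid lane number: {lane.number}")
        totals[lane.customer] += cost_per_lane
    return dict(totals)


def lanes_visible_to(lanes, customer):
    """Return only the lane numbers that belong to one customer (privacy)."""
    return sorted(lane.number for lane in lanes if lane.customer == customer)


# One run shared by three hypothetical customers.
run = [Lane(1, "lab_a"), Lane(2, "lab_a"), Lane(3, "lab_b"), Lane(4, "lab_c"),
       Lane(5, "lab_b"), Lane(6, "lab_b"), Lane(7, "lab_a"), Lane(8, "lab_c")]

# The rate of 100.0 is a placeholder, not a real recharge rate.
charges = recharge_by_customer(run, cost_per_lane=100.0)
```

Keying both billing and visibility on the lane rather than the run is what lets one run serve several customers without exposing one customer's data to another.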
Solexa Sequencing Data Flow
(This infrastructure can hold two copies of the data for at least three months)
  Illumina Genome Analyzer (GA)
    1 TB per data set, every 3 days
  Solexa Server for image processing and base calling
    2 Intel Xeon E5345 quad-core 2.33 GHz, 16 GB RAM, ~8 TB
    Processing time: ~30 hours per data set
    Data retention time: up to 4 weeks (no long-term storage)
  Copy on the fly
  Linux Cluster: alignment & assembly
  Sun StorageTek Tape Backup Library
  Online Data Access Server: Sun Thumper X4500 (48 TB)
    Data retention time: up to 3 months
    Web access
    Secure Shell access
  Disk-to-disk backup / Redundant Server
  Mobile hard drive
    Data retention time: user specified
  Mobile hard drive
    Self-service recharge
  Other diagram labels: 1st copy, 2nd copy, 2 months free access
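The retention figures on this slide can be checked with a rough calculation. The sketch below assumes a single instrument producing data at a constant rate and treats 3 months as 90 days. Both are simplifying assumptions, not statements from the slides.

```python
TB_PER_DATA_SET = 1.0    # from the slide: 1 TB per data set
DAYS_PER_DATA_SET = 3.0  # from the slide: one data set per 3 days


def tb_accumulated(retention_days, copies=1):
    """Storage needed to keep every data set for retention_days."""
    data_sets = retention_days / DAYS_PER_DATA_SET
    return data_sets * TB_PER_DATA_SET * copies


def days_until_full(capacity_tb):
    """Days of continuous output that fit in capacity_tb."""
    return capacity_tb / TB_PER_DATA_SET * DAYS_PER_DATA_SET


solexa_server_tb = tb_accumulated(28)      # 4 weeks of raw output
thumper_tb = tb_accumulated(90)            # 3 months, taken as 90 days
solexa_server_days = days_until_full(8.0)  # the ~8 TB processing server

print(f"4 weeks of output:  {solexa_server_tb:.1f} TB")
print(f"3 months of output: {thumper_tb:.1f} TB")
print(f"~8 TB fills in:     {solexa_server_days:.0f} days")
```

Under these assumptions, three months of output is about 30 TB, which fits on the 48 TB access server. Four weeks of raw output is a little over 9 TB, slightly more than the ~8 TB on the processing server, so the "up to 4 weeks" figure would be an upper bound there.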
SLIMS workflow
  GA operation
  MySQL
  Central Storage
  Access VM
  rsync
  Web
Future Directions
  Open Source
  OpenID
  Integrated with different pipelines
  BioCloud
Acknowledgement
  Adam Schaal, DB Programmer
  Brad Sickler, System Programmer
  Charlie Nicolet, Director of DNA Technology Core
  Dawei Lin
Run view
Lane view
Summary
View files folder
Create a run
Status of rsync between different servers
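The slides do not show how SLIMS derives this status. One possible approach, sketched here as an assumption, is to run rsync in dry-run mode with itemized output and count the files it would still transfer. The function names, paths, and host in this sketch are hypothetical. The rsync options used (`-a`, `--dry-run`, `--itemize-changes`) are standard.

```python
import subprocess


def build_status_command(source, destination):
    """Build an rsync command that reports differences without copying."""
    return ["rsync", "-a", "--dry-run", "--itemize-changes", source, destination]


def pending_files(itemized_output):
    """Extract file paths that rsync would still transfer.

    In --itemize-changes output, a line starting with '<f' or '>f'
    marks a regular file to be sent or received.
    """
    pending = []
    for line in itemized_output.splitlines():
        parts = line.split(" ", 1)
        if len(parts) == 2 and parts[0][:2] in ("<f", ">f"):
            pending.append(parts[1])
    return pending


def sync_status(source, destination):
    """Return a short status string for one source/destination pair."""
    result = subprocess.run(
        build_status_command(source, destination),
        capture_output=True, text=True, check=True,
    )
    files = pending_files(result.stdout)
    return "in sync" if not files else f"{len(files)} file(s) pending"


# Hypothetical usage; the path and host name are placeholders:
# print(sync_status("/data/runs/", "storage.example.org:/data/runs/"))
```

A dry run leaves both servers untouched, so a check like this could be polled from a web page without interfering with the transfers themselves.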
Documentation