Bioinformatics Outline


Bioinformatics Outline
What is bioinformatics?
Who are bioinformaticians?
Hardware
Software

What is bioinformatics?

What is bioinformatics?
Someone to analyze my data
The boring stuff I do between experiments
Someone to help me think about my data
People sitting in a dark room analyzing data
A person who writes complex algorithms
perl, python, R, linux, java, C++, bash, ruby, HTML
A person who knows what an HMM is
That bloke who fixes my computer
Someone who builds websites

Who are bioinformaticians?
Scientists trying to get tenure, get grants, publish papers, train students
Scientists trying to help others analyze their data

Who are bioinformaticians? YOU!

Hardware

Torrent Server (recommended configuration)
Processors - Two six-core processors
RAM - 48 GB
HDD Capacity - Eight 2 TB hard drives in RAID 5 with 12 TB usable
Network - Quad-port gigabit NIC
GPU - NVIDIA graphics processing unit
Chassis - Dell Precision T7500 tower. No rack mount available.
Monitor/Keyboard - Not included; file access available via SSH or web service
Price - $12,500

Computers
My cluster
51-node cluster
Most nodes: 16 CPUs with 8 cores each, 132 GB RAM, 1 TB local storage (/usr/data), InfiniBand interconnects
Totals: 6,528 cores; 6,732 GB RAM; 50 TB scratch storage
192 TB Lustre FS connected to most nodes via InfiniBand
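The aggregate figures on this slide follow from the per-node numbers. The short check below is only a sketch: it assumes every one of the 51 nodes has the "most nodes" configuration, and the function and variable names are made up for illustration.

```python
# Per-node figures taken from the slide; a uniform configuration is assumed.
NODES = 51
CPUS_PER_NODE = 16
CORES_PER_CPU = 8
RAM_GB_PER_NODE = 132
LOCAL_TB_PER_NODE = 1


def cluster_totals(nodes, cpus_per_node, cores_per_cpu, ram_gb_per_node, local_tb_per_node):
    """Return aggregate cores, RAM (GB) and local storage (TB) for the cluster."""
    return {
        "cores": nodes * cpus_per_node * cores_per_cpu,
        "ram_gb": nodes * ram_gb_per_node,
        "local_tb": nodes * local_tb_per_node,
    }


totals = cluster_totals(NODES, CPUS_PER_NODE, CORES_PER_CPU, RAM_GB_PER_NODE, LOCAL_TB_PER_NODE)
print(totals)  # {'cores': 6528, 'ram_gb': 6732, 'local_tb': 51}
```

The core and RAM totals match the slide exactly. The local storage comes out at 51 TB under the uniform-node assumption, slightly above the 50 TB quoted, which fits the slide's wording that only most nodes share this configuration.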

Computers
rambox: 24 processors with 6 cores each, 198 MB RAM
edwards.sdsu.edu (lab web server): 24 processors with 6 cores each, 50M RAM, 19 TB RAID 6 storage (18 TB used)

Computers
File servers and backup servers
4 secret servers!
48 TB backups and archival storage

Software

Software
Locally installed software
Remote (web) software

Local Software
bioperl, biopython, bowtie2, cdhit, crass, diamond, fastQC, focus, FOCUS, FragGeneScan, genemark, groopm, idba_ud, jellyfish, last, masurca, mauve, metabat, metagenemark, mira, MUMmer, Muscle, PEAR, phylip, prinseq, qiime, qudaich, rapsearch, scaffold_builder, seed-servers, spades, tagcleaner, tRNAscan-SE, velvet

Metagenomics Processing
Merge paired-end reads
Preprocessing
Functional assignments
Taxonomic assignments
Contamination removal
Gene prediction
Contig clustering
Binning reads
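As an illustration of what the preprocessing step can involve, here is a minimal sketch of a read quality filter. It is not the algorithm of any tool named in these slides. It assumes four-line FASTQ records with Phred+33 quality encoding, and the thresholds and function names are hypothetical.

```python
import io


def parse_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(c) - offset for c in qual) / len(qual)


def quality_filter(records, min_mean_q=20, min_len=5):
    """Keep reads that are long enough and have a high enough mean quality."""
    for header, seq, qual in records:
        if len(seq) >= min_len and mean_phred(qual) >= min_mean_q:
            yield header, seq, qual


# Three toy reads: good quality, poor quality, and too short.
EXAMPLE = (
    "@read1\nACGTACGT\n+\nIIIIIIII\n"
    "@read2\nACGTACGT\n+\n########\n"
    "@read3\nACG\n+\nIII\n"
)

kept = [header for header, _, _ in quality_filter(parse_fastq(io.StringIO(EXAMPLE)))]
print(kept)  # ['@read1']
```

Only read1 survives: read2 has a mean quality of 2, and read3 is shorter than the minimum length.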

Metagenomics
Quality control: Prinseq, statistics, Deconseq
Annotation: FOCUS, Real time metagenomics, mg-rast, Super FOCUS
Statistics: STAMP
Population genomes: crAss, metabat, ContigClustering

Metagenomics Processing
Contig clustering: AbundanceBin, CompostBin, concoct, crAss, tetra
Preprocessing: FASTQC, FastX Toolkit, fitGCP, NGS QC Toolkit, Non-pareil, Prinseq, QC-Chain, Streaming Trim
Gene prediction: FragGeneScan, GlimmerMG, MetaGeneAnnotator, MetaGeneMark, MetaGun, Orphelia, Prodigal
Taxonomic and functional assignment: CARMA, myTaxa, FOCUS, PhylopythiaS, KRAKEN, phymmbl, LMAT, RAIphy, MEGAN, TACOA, Metaplan, Taxy, CLAMS, Sequedex, DiScRIBinATE, SORT-ITEMS, genometa, SPANNER, GSMer, SPHINX, PPLACER, TaxSOM, RTMg, Treephyler
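One common idea behind contig clustering is that sequences from the same genome tend to have similar tetranucleotide composition. The sketch below shows that idea only, not the method of any specific tool listed here. The function names and toy contigs are made up, and it counts a single strand, whereas real tools often also account for the reverse complement.

```python
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freq(seq):
    """Return the tetranucleotide frequency vector of a sequence (one strand only)."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    total = 0
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # skips windows containing N or other ambiguity codes
            counts[kmer] += 1
            total += 1
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def euclidean(a, b):
    """Euclidean distance between two frequency vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy contigs: a and b are AT-rich with the same repeat, c is GC-rich.
contig_a = "ATATATATATATATATATAT"
contig_b = "TATATATATATATATATATA"
contig_c = "GCGCGGCCGCGCGGCCGCGC"

d_ab = euclidean(tetra_freq(contig_a), tetra_freq(contig_b))
d_ac = euclidean(tetra_freq(contig_a), tetra_freq(contig_c))
print(d_ab < d_ac)  # True
```

Contigs with a small distance between their frequency vectors would be candidates for the same bin. Practical binning usually combines composition with other signals such as read coverage.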