1
Bioinformatics Outline
What is bioinformatics?
Who are bioinformaticians?
Hardware
Software
2
What is bioinformatics?
3
What is bioinformatics?
Someone to analyze my data
The boring stuff I do between experiments
Someone to help me think about my data
People sitting in a dark room analyzing data
A person who writes complex algorithms
perl, python, R, linux, java, C++, bash, ruby, HTML
A person who knows what an HMM is
That bloke who fixes my computer
Someone who builds websites
4
Who are bioinformaticians?
Scientists trying to get tenure, get grants, publish papers, train students
Scientists trying to help others analyze their data
5
Who are bioinformaticians?
YOU!
6
Hardware
7
Torrent Server Recommended
Processors - Two six-core processors
RAM - 48 GB
HDD Capacity - Eight 2 TB hard drives in RAID 5 with 12 TB usable
Network - Quad port gigabit NIC
GPU - NVIDIA graphics processing unit
Chassis - Dell Precision T7500 tower. No rack mount available.
Monitor/Keyboard - not included; file access available via SSH or web service
Price - $12,500
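As a rough check on the storage figure, the nominal usable capacity of a parity RAID array can be sketched as below. The function name `raid_usable_tb` and its parameters are hypothetical, chosen for illustration. Nominal RAID 5 across all eight 2 TB drives would give 14 TB; the slide states 12 TB usable and does not explain the difference. Reserving one drive as a hot spare is one assumption that reproduces 12 TB, but formatting overhead or vendor reservation could equally account for it.

```python
def raid_usable_tb(drives: int, drive_tb: float, parity_drives: int, hot_spares: int = 0) -> float:
    """Nominal usable capacity: drives that hold neither parity nor act as spares.

    parity_drives is 1 for RAID 5 and 2 for RAID 6.
    Ignores filesystem overhead and TB-vs-TiB differences.
    """
    data_drives = drives - hot_spares - parity_drives
    if data_drives < 1:
        raise ValueError("not enough drives for this layout")
    return data_drives * drive_tb


# Eight 2 TB drives, RAID 5, every drive in the array.
raid5_all_drives = raid_usable_tb(drives=8, drive_tb=2, parity_drives=1)

# Same drives, RAID 5, one drive held back as a hot spare (an assumption,
# not something the slide states).
raid5_with_spare = raid_usable_tb(drives=8, drive_tb=2, parity_drives=1, hot_spares=1)

print(f"RAID 5, all drives:    {raid5_all_drives} TB")
print(f"RAID 5, one hot spare: {raid5_with_spare} TB")
```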
8
Computers
My cluster: 51 node cluster with a 192 TB lustre FS
Most nodes: 16 cpus, 8 cores each, 132 GB RAM, 1 TB local storage (/usr/data), infiniband interconnects
Totals: 6,528 cores; 6,732 GB RAM; 50 TB scratch storage
192 TB lustre FS connected to most nodes via infiniband
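The cluster totals follow from the per-node figures. The short sketch below reproduces that arithmetic, assuming for simplicity that all 51 nodes share the "most nodes" configuration; the variable names are illustrative only. Under that assumption the core and RAM totals match the slide, while local scratch comes out at 51 TB against the slide's 50 TB, which fits with not every node having the same configuration.

```python
# Per-node figures taken from the slide; applying them to every node is
# a simplifying assumption, since the slide says "most nodes".
NODES = 51
CPUS_PER_NODE = 16
CORES_PER_CPU = 8
RAM_GB_PER_NODE = 132
LOCAL_TB_PER_NODE = 1

total_cores = NODES * CPUS_PER_NODE * CORES_PER_CPU
total_ram_gb = NODES * RAM_GB_PER_NODE
max_scratch_tb = NODES * LOCAL_TB_PER_NODE  # upper bound if every node had 1 TB

print(f"cores:   {total_cores:,}")
print(f"RAM:     {total_ram_gb:,} GB")
print(f"scratch: at most {max_scratch_tb} TB")
```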
9
Computers
rambox: 24 processors with 6 cores each, 198 MB RAM
edwards.sdsu.edu (lab web server): 24 processors, 6 cores each, 50M RAM, 19TB RAID 6 storage, 18TB used
10
Computers
File servers and backup servers: 4 secret servers!
48TB backups and archival storage
11
Software
12
Software
Locally installed software
Remote (web) software
13
Local Software
bioperl, biopython, bowtie2, cdhit, crass, diamond, fastQC, focus, FOCUS, FragGeneScan, genemark, groopm, idba_ud, jellyfish, last, masurca, mauve, metabat, metagenemark, mira, MUMmer, Muscle, PEAR, phylip, prinseq, qiime, qudaich, rapsearch, scaffold_builder, seed-servers, spades, tagcleaner, tRNAscan-SE, velvet
14
Metagenomics Processing
Merge paired-end reads
Preprocessing
Functional assignments
Taxonomic assignments
Contamination removal
Gene prediction
Contig clustering
Binning reads
15
Metagenomics
Quality control: Prinseq, Deconseq
Annotation: FOCUS, Real time metagenomics, mg-rast, Super FOCUS
Statistics: STAMP
Population genomes: crAss, metabat, ContigClustering
16
Metagenomics Processing
Contig clustering: AbundanceBin, CompostBin, concoct, crAss, tetra
Preprocessing: FASTQC, FastX Toolkit, fitGCP, NGS QC Toolkit, Non-pareil, Prinseq, QC-Chain, Streaming Trim
Gene Prediction: FragGeneScan, GlimmerMG, MetaGeneAnnotator, MetaGeneMark, MetaGun, Orphelia, Prodigal
Taxonomic assignment: CARMA, myTaxa, FOCUS, PhylopythiaS, KRAKEN, phymmbl, LMAT, RAIphy, MEGAN, TACOA, Metaplan, Taxy
CLAMS, Sequedex, DiScRIBinATE, SORT-ITEMS, genometa, SPANNER, GSMer, SPHINX, PPLACER, TaxSOM, RTMg, Treephyler
Functional assignment