Presentation transcript:

Scott Emrich, Assistant Professor, Computer Science and Engineering; Scientific Manager, VectorBase; University of Notre Dame. A flexible, scalable genomics framework for integrating heterogeneous vector sequence data

Assembly required…

VectorBase is here to help (especially with -omics data). Please see me and/or Dan Lawson (EBI) anytime during this meeting.

Anopheles gambiae M & S. Lawniczak, Emrich et al. (2010, Science)

Some genomic regions display footprints of strong, recent selection. Lawniczak, Emrich et al. (2010, Science)

FlexReseq tool for integrating diverse sequence data
[Figure: alignment of 13-base example sequences labeled Reference, Sample_1, and Sample_2. Sequences in slide order: A C G T C G T T A C T G C; A C G T C G A T A C T G C; A C G T C G T T A T T G C; A C G T C G A T A T T G C; A C G T C G A T A C T G C; A C G T C G T T A T T G C.]
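The kind of per-site comparison the figure illustrates can be sketched in a few lines of Python. This is a minimal illustration, not FlexReseq code: the function name `site_differences` is hypothetical, and the pairing of the labels Reference, Sample_1, and Sample_2 with the first three sequences on the slide is an assumption, since the extracted text does not preserve which label belongs to which row.

```python
def site_differences(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]

# Sequences taken from the slide; the label-to-sequence pairing is assumed.
reference = "ACGTCGTTACTGC"
samples = {
    "Sample_1": "ACGTCGATACTGC",
    "Sample_2": "ACGTCGTTATTGC",
}

# Compare every sample against the reference, site by site.
differences = {name: site_differences(reference, seq) for name, seq in samples.items()}

for name, diffs in differences.items():
    print(name, diffs)
```

Under that assumed pairing, each sample differs from the reference at a single site, and the fourth sequence on the slide carries both differences at once.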

FlexReseq implementation. Genome Analysis Toolkit (GATK): a Map-Reduce framework that allows efficient access to large resequencing data sets (McKenna et al., Genome Research, 2010). FlexReseq: a module for GATK. A configurable interface allows easy data exploration. A modular implementation of rules allows for easy extension of the software. Saves you from lots of scripting (Perl) code!
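The combination of a map-reduce traversal with modular rules can be illustrated with a small Python sketch. This is not the GATK or FlexReseq API (GATK itself is written in Java); the site dictionaries, the rule functions `low_coverage` and `non_reference`, and the `run` helper are all hypothetical names chosen for illustration, and the three sites are toy data.

```python
from collections import Counter
from functools import reduce

# A rule is a plain function: it inspects one site and returns a label or None.
def low_coverage(site, min_depth=3):
    return "low_coverage" if len(site["bases"]) < min_depth else None

def non_reference(site):
    return "non_reference" if any(b != site["ref"] for b in site["bases"]) else None

# Adding a rule means adding a function to this list; the traversal is untouched.
RULES = [low_coverage, non_reference]

def map_site(site, rules=RULES):
    """Map step: apply every rule to one site and count the labels produced."""
    labels = (rule(site) for rule in rules)
    return Counter(label for label in labels if label is not None)

def reduce_counts(total, partial):
    """Reduce step: merge per-site counts into a running total."""
    return total + partial

def run(sites, rules=RULES):
    return reduce(reduce_counts, (map_site(s, rules) for s in sites), Counter())

# Toy data (invented for this example): observed read bases at three sites.
sites = [
    {"pos": 7, "ref": "T", "bases": ["A", "A", "T", "A"]},
    {"pos": 8, "ref": "T", "bases": ["T", "T"]},
    {"pos": 10, "ref": "C", "bases": ["C", "C", "C"]},
]
summary = run(sites)
print(dict(summary))
```

Keeping each rule as an independent function is one way to get the "easy extension" the slide describes: the map and reduce steps never need to change when a rule is added or removed.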

A malaria use-case for FlexReseq. Why are some parasites drug-resistant? How did drug resistance evolve? Goal: we want to connect genotype (genome) to phenotype (drug response). Samarakoon, Regier, et al., BMC Genomics, 2011

1. Whole genome shotgun sequencing: genetic cross (Wellems et al. [24]); parents HB3 and Dd2; progeny recombinants SC05 and 7C126; shotgun libraries, GS-FLX technology (454/Roche); NCBI Trace Archive [28].
2. Reference genome mapping: parental genomes and progeny genomes [shotgun libraries] mapped with SSAHA2 (anger.ac.uk); reference genome (3D7), PlasmoDB (v5.4) [27].

A more detailed map of P. falciparum
[Figure: panels (A) 7C126 and (B) SC05; labels Dd2 and HB3; axes: chromosome position, chromosome.]
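One way to read a map like this is as parent-of-origin calls for each progeny at sites where the two parents differ. The sketch below is an assumption about the general approach, not the method of Samarakoon, Regier, et al.; the function names `parent_of_origin` and `crossover_intervals` are hypothetical, and the five-base sequences are toy data invented for the example.

```python
def parent_of_origin(dd2, hb3, progeny):
    """Label each site 'Dd2', 'HB3', or None (uninformative or matching neither parent)."""
    calls = []
    for d, h, p in zip(dd2, hb3, progeny):
        if d == h:
            calls.append(None)      # parents agree, so the site is uninformative
        elif p == d:
            calls.append("Dd2")
        elif p == h:
            calls.append("HB3")
        else:
            calls.append(None)      # matches neither parent (possible error)
    return calls

def crossover_intervals(calls):
    """Return (i, j) index pairs of consecutive informative sites whose label changes."""
    informative = [(i, c) for i, c in enumerate(calls) if c is not None]
    return [
        (i, j)
        for (i, a), (j, b) in zip(informative, informative[1:])
        if a != b
    ]

# Toy data (invented): aligned bases for the two parents and one progeny.
dd2_bases = "AACGT"
hb3_bases = "ATCCA"
progeny_bases = "AACCA"

calls = parent_of_origin(dd2_bases, hb3_bases, progeny_bases)
breaks = crossover_intervals(calls)
print(calls)
print(breaks)
```

In this toy case the progeny switches from Dd2-like to HB3-like between the second and fourth sites, which is the kind of recombination breakpoint a denser map can localize more precisely.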

Association of 2La with clines of aridity in Nigeria… 24,000 mosquitoes; 194 sampling localities. Modified from Coluzzi et al. (1979)

High-throughput sequencing. Data from Besansky lab: Illumina Genome Analyzer; 4 population pools (S-form); SHRiMP alignment (BWA also works). C. Cheng et al., unpublished

Differential mapping biases do exist

Population haplotyping

In situ error isolation: has been shown to be important in ancient DNA-based ecology

Thanks to… VectorBase (NIH/NIAID); Dr. Nora Besansky (ND); Dr. Frank Collins (ND); Rory Carmichael, Andrew Shehan, Nate Konopinski, Dave Campbell (ND), and others; Notre Dame Bioinformatics Lab, Summer 2010; Anopheles genome cluster group; i5K Arthropod Genomics Consortium steering committee