Whole Genome Sequencing: Background and View of Current Efforts Larry Kuehn, Ph.D. Research Geneticist USDA, ARS, U.S. Meat Animal Research Center The.

Slides:



Advertisements
Similar presentations
Lecture 2 Strachan and Read Chapter 13
Advertisements

Reliable genomic evaluations across breeds and borders Sander de Roos CRV, the Netherlands.
Genetic Analysis of Genome-wide Variation in Human Gene Expression Morley M. et al. Nature 2004,430: Yen-Yi Ho.
Coming soon to a genetics lab near you! NBCEC Beef Genetic Workshop Clay Center, NE March 27, 2004 Marker adjusted EPDs.
Pooling DNA to Investigate Cattle Infertility Tara McDaneld USDA/ARS Meat Animal Research Center Clay Center, NE, 68933, USA.
Use of the genomic data o Reconstruction of metabolic properties o Nature’s Microbiome o NGS in Population Genetics.
Matt Spangler University of Nebraska- Lincoln DEVELOPMENT OF GENOMIC EPD: EXPANDING TO MULTIPLE BREEDS IN MULTIPLE WAYS.
Bob Weaber, Ph.D. Cow-Calf Extension Specialist Assistant Professor Dept. of Animal Sciences and Industry
Next-generation sequencing
Fokkerij in genomics tijdperk Johan van Arendonk Animal Breeding and Genomics Centre Wageningen University.
Genetic Selection Tools in the Genomics Era Curt Van Tassell, PhD Bovine Functional Genomics Laboratory & Animal Improvement Programs Laboratory Beltsville,
NHGRI/NCBI Short-Read Archive: Data Retrieval Gabor T. Marth Boston College Biology Department NCBI/NHGRI Short-Read.
Bull selection based on QTL for specific environments Fabio Monteiro de Rezende Universidade Federal Rural de Pernambuco (UFRPE) - Brazil.
Mining SNPs from EST Databases Picoult-Newberg et al. (1999)
Genome Browsers Ensembl (EBI, UK) and UCSC (Santa Cruz, California)
Something related to genetics? Dr. Lars Eijssen. Bioinformatics to understand studies in genomics – São Paulo – June Image:
Restriction Fragment Length Polymorphisms (RFLPs) By Amr S. Moustafa, M.D.; Ph.D. Assistant Prof. & Consultant, Medical Biochemistry Dept. College of.
Quantitative Genetics
Wiggans, 2013RL meeting, Aug. 15 (1) Dr. George R. Wiggans, Acting Research Leader Bldg. 005, Room 306, BARC-West (main office);
BEEF CATTLE GENETICS By David R. Hawkins Michigan State University.
Training, Validation, and Target Populations Training, Validation, and Target Populations Mark Thallman, Kristina Weber, Larry Kuehn, Warren Snelling,
Beef Relationships using 50K Chip Information L.A. Kuehn, J.W. Keele, G.L. Bennett, T.G. McDaneld, T.P.L. Smith, W.M. Snelling, T.S. Sonstegard, and R.M.
Introduction to Animal Breeding & Genomics Sinead McParland Teagasc, Moorepark, Ireland.
What’s coming next in genomics? Ben Hayes, Department of Primary Industries, Victoria, Australia.
Matt Spangler Beef Genetics Specialist University of Nebraska-Lincoln.
WiggansARS Big Data Workshop – July 16, 2015 (1) George R. Wiggans Animal Genomics and Improvement Laboratory Agricultural Research Service, USDA Beltsville,
Epigenome 1. 2 Background: GWAS Genome-Wide Association Studies 3.
ARC Biotechnology Platform: Sequencing for Game Genomics Dr Jasper Rees
Introduction to BST775: Statistical Methods for Genetic Analysis I Course master: Degui Zhi, Ph.D. Assistant professor Section on Statistical Genetics.
DEPARTMENT OF PRIMARY INDUSTRIES 1 Discovering Genes for Beef Production Mike Goddard University of Melbourne and Department of Primary Indusries, Victoria.
G.R. Wiggans Animal Improvement Programs Laboratory Agricultural Research Service, USDA Beltsville, MD G.R. WiggansAg Discovery.
Module 1 Section 1.3 DNA Technology
1 Application of Molecular Technologies in Beef Production Dan W. Moser, Ph.D Department of Animal Sciences and Industry Kansas State University, Manhattan.
An Efficient Method of Generating Whole Genome Sequence for Thousands of Bulls Chuanyu Sun 1 and Paul M. VanRaden 2 1 National Association of Animal Breeders,
13-1 Changing the Living World
Breeding Jersey cattle for Africa in the era of genomics Prof Norman Maiwashe 1,2 (PhD, Pri. Sci. Nat) 1 ARC-Animal Production Institute Private Bag X2.
Bovine Genomics The Technology and its Applications Gerrit Kistemaker Chief Geneticist, Canadian Dairy Network (CDN) Many slides were created by.
Warm-Up #33 Answer questions #1-5 on Text page 321, Section Assessment.
Use of DNA information in Genetic Programs.. Next Four Seminars John Pollak – DNA Tests and genetic Evaluations and sorting on genotypes. John Pollak.
CHANGING THE LIVING WORLD OBJECTIVES: 13.1 Explain the purpose of selective breeding. Describe two techniques used in selective breeding. Tell why breeders.
E XOME SEQUENCING AND COMPLEX DISEASE : practical aspects of rare variant association studies Alice Bouchoms Amaury Vanvinckenroye Maxime Legrand 1.
1 DNA Polymorphisms: DNA markers a useful tool in biotechnology Any section of DNA that varies among individuals in a population, “many forms”. Examples.
Lecture 6. Functional Genomics: DNA microarrays and re-sequencing individual genomes by hybridization.
Genetic Cancer Susceptibility (GCS) Genomics in I4C James McKay.
P.M. VanRaden and D.M. Bickhart Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD, USA
What have we learned from sequencing efforts to date? Jared Decker Assistant Professor Beef Genetics Specialist Computational Genomics.
Fine Mapping and Discovery of Recessive Mutations that Cause Abortions in Dairy Cattle P. M. VanRaden 1, D. J. Null 1 *, T.S. Sonstegard 2, H.A. Adams.
G.R. Wiggans Animal Improvement Programs Laboratory Agricultural Research Service, USDA Beltsville, MD G.R. WiggansADSA 18.
Whole genome selection and the 2000 bull project at USMARC Larry Kuehn Research Geneticist.
INTERPRETING GENETIC MUTATIONAL DATA FOR CLINICAL ONCOLOGY Ben Ho Park, M.D., Ph.D. Associate Professor of Oncology Johns Hopkins University May 2014.
G.R. Wiggans Animal Improvement Programs Laboratory Agricultural Research Service, USDA Beltsville, MD Select Sires‘ Holstein.
G.R. Wiggans Animal Improvement Programs Laboratory Agricultural Research Service, USDA Beltsville, MD 2011 National Breeders.
Jack Ward American Hereford Association October 31, 2012.
Big Data in Biology: A focus on genomics. Bioinformatics and Genomics O Applications: O Personalized cancer medicines O Disease determination O Pathway.
Human Genomics Higher Human Biology. Learning Intentions Explain what is meant by human genomics State that bioinformatics can be used to identify DNA.
Title: Studying whole genomes Homework: learning package 14 for Thursday 21 June 2016.
Across Breed EPD and multi-breed genetic evaluation developments
My vision for dairy genomics
Gene Editing Opportunity for Accelerating Gain
Lesson: Sequence processing
Ch 15 DNA Technology/ Genetic Engineering
Nucleotide variation in the human genome
The is a Critical Resource for Developing and Refining Trait-Predictive DNA Tests Cameron Peace, Daniel Edge-Garza, Terry Rowland, Paul Sandefur.
Bellwork: What is the human genome project. What was its purpose
Content and Labeling of Tests Marketed as Clinical “Whole-Exome Sequencing” Perspectives from a cancer genetics clinician and clinical lab director Allen.
DNA Polymorphisms: DNA markers a useful tool in biotechnology
Genomic Evaluations.
Genomic Selection in Dairy Cattle
Jan – Dec RuminOmics Connecting the animal genome, the intestinal microbiome and nutrition to enhance the efficiency of ruminant.
Presentation transcript:

Whole Genome Sequencing: Background and View of Current Efforts Larry Kuehn, Ph.D. Research Geneticist USDA, ARS, U.S. Meat Animal Research Center The USDA is an equal opportunity employer.

Genome sequence Trying to read the base pairs (the code) along the whole genome genome.gov

Whole genome sequencing Loosely defined as reading and recording all 3 billion bases in the cattle genome –Introns Do not code for any protein Over 98% of genome Likely some regulatory function –Exons Remainder Coding regions

First cattle genome sequence Dominette –USDA-ARS – Miles City Montana –Inbred Hereford female Impact –Reference Bovine genome –Cost over $53 million –Led to DNA chips (50K chip, etc.) –Base for further sequencing projects

1992 Pharmacia manual slab gel – about 20,000 bases wk -1 machine -1, about $7 per 1,000 base read; used for microsatellite sequencing ($7,000 per Mb) 1997 ABI 377 gel-based system – about 380,000 bases wk -1 machine -1, about $3 per 800 base read; used for small scale gene sequencing and marker development ($3,750 per Mb) 1999 ABI 3700 capillary system – about 3.5 Mb wk -1, about $2 per 600 base read; used for “large scale” EST sequencing, marker development and start of SNP discovery program ($3,300 per Mb) 2003 ABI 3730 capillary system – about 7.5 Mb wk -1, about $0.80 per 600 base read; used for EST sequencing, BAC-targeted SNP discovery, lots of amplicon resequencing in breed panels ($1,300 per Mb) USMARC sequence technology progression:

Next generation sequencing Multiple platforms available

Next generation sequencing Accuracy ranges from 87% to 99% –General trend is decreased accuracy with longer read-length –Generally read 50 to bases per read –Easier to assemble longer reads Net effect is cheaper genome sequencing –Costs from $0.05 to $10 per million bases –Enough sequence for $150 whole genome

Some complications Initial cost –Sequencer costs from $80,000 to $1 million Genome assembly –Overlap helps Long reads better Less likely sequence similarities occur by chance Helps to compare to base sequence (other cattle)

Complications Tremendous amount of data –Hard to store, let alone manipulate –Makes assembly even tougher –Have to develop tools based on results For example, genotyping based on new markers discovered in sequence (50K chip) Hard to decide what ‘differences’ among animals are important –As with chips, phenotypes are still very important

Complications Multiple reads required –Reading 3 billion bases does not mean whole genome read Each animal has two full genomes (diploid) Genome is fragmented randomly into small pieces Same section of genome likely read numerous times Some advocate sequencing 20-50x when targeting whole genome –Focusing on one animal reduces impact of examining multiple animals (diversity, discovery, etc.)

Current efforts So we have potential for a lot of data Now what? Current efforts at USMARC and some selected programs elsewhere

Why bother? From a geneticist perspective: –Interested in sequencing to improve our chance at finding causal variation Examine differences in sequence Leads to new markers and mutations Mutations may change protein structure or protein regulation –Ultimately differences we see due to genetics lie with differences in the genome or its regulation

Two different strategies Compare genomes of diverse animals to see where they are different –Marker development, mutational candidates –Follow up with genotyping platform Associate differences in sequence among several animals with trait variation –Requires large numbers of sequenced animals First strategy cheaper, more restrictive

GPE Cycle VII Population AI Sires: AN, HH, AR, SM, CH, LM, GV Base Cows: AN, HH, MARC III  F 1 CowsF 1 Steers  F 1 2 CowsF 1 2 Steers F 1 Bulls AN & HH only

GPE low coverage sequencing Foundational bulls from cycle VII –151 AI bulls –45 F 1 bulls with most progeny Targeting 2x coverage on each bull –Use information from relationships to improve individual bull coverage –Assuming higher value from broadly covering all base animals vs. a few bulls

GPE low coverage sequencing Plan is to impute large portions of bull sequence to phenotyped descendents –Population is currently imputed to 770K Functional variants are first target –Alison detailed some examples counts in first portion of presentation

Functional variants ImpactEffect HighSPLICE_SITE_ACCEPTORFRAME_SHIFT SPLICE_SITE_DONORSTOP_GAINED START_LOSTSTOP_LOST EXON_DELETEDRARE_AMINO_ACID ModerateNON_SYNONYMOUS_CODINGCODON_CHANGE CODON_INSERTIONCODON_DELETION UTR_5_DELETEDUTR_3_DELETED CODON_CHANGE_PLUS_CODON_INSERTION CODON_CHANGE_PLUS_CODON_DELETION LowSYNONYMOUS_STARTNON_SYNONYMOUS_START SYNONYMOUS_CODINGNON_SYNONYMOUS_STOP SYNONYMOUS_STOPSTART_GAINED ModifierAll other effects Congolani et al., 2012

Next steps Same population as well as extreme animals for feed intake and growth are being targeted for whole exon capture –Recall exon refers to coding regions –Less costly to generate higher coverage –Further contribution to loss of function mutations

Other efforts Other projects are conducting large sequencing projects to detect further genomic variation University of Missouri leading a grant aimed at sequencing for variation in fertility –Around animals with whole genome sequencing –Loss of function mutations in exons are likely candidates for prenatal death

Other efforts Genome Canada –Approximately 30 animals sequenced at 5X coverage for several breeds 1,000 bovine genomes project –Ben Hayes (Australia) lead – –Ultimately ‘public’ exchange of sequence Imputation and mutation discovery objectives

Other sequencing applications Metagenomics –Loosely defined as whole genome sequencing of all organisms in a sample/population –Microorganisms –Gut, soil, skin, nasal mucosa, etc. –Whole genome cost-prohibitive Often target variable genomic regions that can be used to classify microbes

Nasal microbiome

Other sequencing applications Using sequencing to genotype –Can result in significant cost savings with large numbers of animals and specific targets –Currently a collaboration between USMARC (Mark Thallman, Amanda Lindholm-Perry) and Eureka Genomics –Must be able to target and maintain identity of animal in the sample

Unlimited animal ID barcodes added with two PCR primers RHS 3’ LH S 5’ Left Animal Barcode; L1 to L8 Right Animal Barcode; 1 to 384

Read Counts per Locus for a Single Animal Currently have a working parentage panel using a standard set of markers

Next step Sparse genome scan –Run sparse genome scan on most or all of the animals in the breed –Run the 50K chip on most or all AI sires in the breed –Individually sequence as many of the highly influential sires in the breed as is feasible

Sparse Genome Scan

Summary Mention of a trade name, proprietary product, or specific equipment does not constitute a guarantee or warranty by the USDA and does not imply approval to the exclusion of other products that may be suitable. Sequencing offers many possibilities – Move toward causal variation – Increase selection opportunities – Detect microbial interactions with economically relevant traits – Lower genotyping costs Questions