The Bovine Genome Sequence: potential resources and practical uses. Nicola Hastings, Andy Law and John L. Williams * * Department of Genetics and Genomics,


The Bovine Genome Sequence: potential resources and practical uses

Nicola Hastings, Andy Law and John L. Williams*
* Department of Genetics and Genomics, Roslin Institute, Roslin, Midlothian, Scotland EH25 9PS

Introduction

We are currently characterising quantitative trait loci (QTL) involved in resistance to mastitis. This EC-funded study requires new microsatellite markers specifically targeted to the QTL regions on bovine chromosomes. One potential source of new microsatellites is the bovine genome sequence. The first assembly of the bovine genome sequence has been released and consists of short contig assemblies. However, this preliminary assembly does not assign contig sequences to chromosomes. To develop the targeted microsatellites, we combined the publicly available trace files from the bovine genome sequencing project with the full assembly of the human genome sequence and radiation hybrid mapping data in a series of in silico steps. The technique described below allows new microsatellite markers to be discovered on targeted bovine chromosomes in a timely and cost-effective manner using only bioinformatic tools.

Identification of microsatellite-containing sequences

Six million bovine genomic sequences were downloaded from the Ensembl Trace Archive on 5th May 2004. From these we constructed a BLAST-searchable database, which was searched using a sequence file containing all possible microsatellite motifs. The BLAST output files were processed with a filter script written in Python to identify 'good' microsatellite hits. To be classified as 'good', a hit had to span at least 13 bases or at least 4 copies of the motif, whichever was longer; e.g. (A)13, (CA)7, (GCG)5 and (AATG)4 would all qualify. Multiple hits to the same sequence separated by gaps of less than 35 base pairs were merged into a single record. This process resulted in sequences being identified as containing microsatellite sequences.
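The filtering rule and the merging of nearby hits can be sketched as follows. This is a minimal illustration, not the authors' script: the function names, the representation of a hit as a half-open (start, end) interval, and the constant names are all assumptions made for this example; only the thresholds (13 bases, 4 copies, 35 bp) come from the text.

```python
from typing import List, Tuple

MIN_BASES = 13     # minimum hit length in bases (from the text)
MIN_COPIES = 4     # minimum number of motif copies (from the text)
MERGE_GAP = 35     # hits closer than this many bp are merged (from the text)


def is_good_hit(motif: str, hit_length: int) -> bool:
    """Return True if a hit of hit_length bases passes the 'good' rule.

    The required length is the longer of 13 bases and 4 copies of the motif.
    """
    required = max(MIN_BASES, MIN_COPIES * len(motif))
    return hit_length >= required


def merge_hits(hits: List[Tuple[int, int]],
               gap: int = MERGE_GAP) -> List[Tuple[int, int]]:
    """Merge hits on one sequence that are separated by fewer than gap bases.

    Each hit is a hypothetical half-open interval (start, end).
    """
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(hits):
        if merged and start - merged[-1][1] < gap:
            # Close enough to the previous hit: extend that record.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged
```

Under this reading of the rule, a tetranucleotide motif needs 16 bases (4 copies) to qualify, whereas mono-, di- and tri-nucleotide motifs are governed by the 13-base minimum, which matches the examples (A)13, (CA)7, (GCG)5 and (AATG)4.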
Results

This approach yielded approximately 4000 new microsatellite sequences per bovine chromosome. Sequences that had multiple high-scoring hits against the human genome, and sequences that corresponded to previously defined bovine microsatellites, were removed from the list, leaving 471 di-, 315 tri-, 157 tetra- and 382 penta-nucleotide repeats (see Table 1). The quality of the new markers was assessed manually. To date, approximately 25 markers have been mapped by radiation hybrid mapping to confirm their position, and their polymorphic information content has been assessed in a panel of cattle. 70% of the new microsatellites were successfully mapped onto the predicted bovine chromosomes, 25% did not map to their predicted chromosomes and 15% proved difficult to map. This means that approximately 10,000 new markers are potentially available across the genome for fine-mapping studies.

Conclusions

1. The publicly available bovine genome sequence trace files are a rich source of potential microsatellite sequences.
2. It is possible to use the human genome sequence to selectively extract microsatellites targeted to specific bovine chromosome regions, based on previous knowledge of conservation of synteny between the two species.
3. The rate-limiting steps in this procedure are:
   a. the radiation hybrid mapping required to confirm the targeted chromosomal location;
   b. the confirmation of usable polymorphisms in the population under study.
4. This method successfully allowed the project to develop microsatellite markers from available in silico resources well in advance of a 'full' genome build.

Chromosome targeting

To localise the markers to specific chromosomes, we followed a comparative genomics-based strategy. The genomic sequences of the regions of the human chromosomes corresponding to the bovine chromosomes known to contain putative QTL for mastitis resistance were extracted from Ensembl.
These human sequences were then combined into a BLAST-searchable database, and the bovine sequences identified as containing apparent microsatellites were searched against it, having first been masked using xnun to prevent false matches against the microsatellite motifs (Figure 1).

Table 1. Total number of sequences containing di-, tri-, tetra- and penta-nucleotide repeats. "No. of single hits" is the total once sequences with multiple hits were removed. The bottom row is the number remaining after removal of multiple hits and of known accession numbers, as these may already have been used for microsatellite discovery; it is therefore also the total number of filtered, usable microsatellite sequences.

Type of repeat               Di-nucleotide   Tri-nucleotide   Tetra-nucleotide   Penta-nucleotide
Total no.                    [not preserved in transcript]
No. of single hits           [not preserved in transcript]
Removal of known acc. no.    471             315              157                382

Figure 1: Diagram describing the method used to discover new microsatellites on any chosen bovine chromosome using both the human genome sequence and the newly sequenced bovine genome.

Starting point: QTL chromosome with mapped markers. Problem: gap without markers.
Bovine genome sequence: extract trace files into a single database (DB1).
Search DB1 for all sequences containing microsatellite motifs.
Extract these sequences and create a database (DB2).
Human genome sequence: create a database with the homologous human chromosome (DB3).
BLAST search DB2 against DB3.
Output: sequences containing microsatellites targeted to the chosen bovine chromosome.

References: Human genome sequence; Bovine genome sequence; ftp://ftp.ensembl.org/pub/traces/bos_taurus

Funded by EC QLK5-CT: New breeding tools for improving mastitis resistance in European dairy cattle
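The masking step used before the search against the human sequence can be illustrated with a short sketch. It does not reproduce the masking program named in the text; it is a plain regular-expression stand-in, and the function name, the choice of `N` as the masking character and the reuse of the 13-base / 4-copy thresholds are assumptions made for this example.

```python
import math
import re


def mask_microsatellites(seq: str, max_unit: int = 5) -> str:
    """Replace perfect tandem repeats (unit size 1..max_unit) with N.

    A run is masked when it spans at least 13 bases and at least
    4 copies of the unit, so the flanking sequence alone drives
    any later similarity search.
    """
    masked = seq.upper()
    for unit in range(1, max_unit + 1):
        # Smallest whole number of copies that satisfies both thresholds.
        copies = math.ceil(max(13, 4 * unit) / unit)
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, copies - 1))
        masked = pattern.sub(lambda m: "N" * len(m.group(0)), masked)
    return masked
```

Masking preserves sequence length, so coordinates of any match found in the flanking sequence still refer to positions in the original read.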