Center of Statistical Genetics University of Pisa Traceability of cattle breeds by DNA analysis Silvano Presciuttini Limoges, 29 juin 2007.



What is traceability in the food chain?
Traceability of animal products to their source breed is fundamental to guaranteeing food quality, safety and authenticity, and it protects both consumers and producers from fraud.
In 2002, a European Union regulation defined traceability as “the ability to trace and follow food, feed and ingredients through all stages of production, processing and distribution”.

Three levels of traceability
INDIVIDUAL
- At the time of slaughter, a small sample of tissue is taken from each carcass, and this sample provides a unique DNA profile of the animal.
POPULATION OR BREED
- Breed traceability is essential whenever products with Protected Designation of Origin (PDO) or Protected Geographical Indication (PGI) are obtained from animals of particular breeds, and cost, logistic or technical constraints make individual traceability inconvenient or impossible.
SPECIES
- Methods based on DNA fragment analysis help to identify species in sterilized fish products, e.g. canned tuna.

The importance of breed traceability
Breed names are increasingly used as brand names, and food producers are showing growing interest in the ability to assign anonymous samples to known populations. Tests for breed identity would therefore be a valuable means of validating the quality and origin of livestock products.
Tracing the breed of origin of animal products is an opportunity to promote local genetic resources, with benefits for the local economy, breed valorization and sustainable conservation of biodiversity.
For these reasons breed traceability is an important research topic, particularly in Mediterranean countries (Italy, Spain and France), where many typical products are declared to be mono-breed.

Breed assignment by DNA analysis
Two different approaches are possible:
1) BASED ON ANONYMOUS MARKERS
- If allele frequencies vary substantially among breeds, a large number of loci typed in an individual may provide sufficient statistical power to assign it to its true breed of origin. This approach requires a database of allele frequencies in the relevant breeds.
2) BASED ON BREED-SPECIFIC TRAIT LOCI
- If we identify the loci and alleles responsible for the phenotypic characteristics of a breed, we may assign individuals to their breed by inferring the phenotype from the genotype.

Use of microsatellites to assign individuals to populations: an example from humans

Background
- Brescia has one of the highest proportions of registered immigrants from non-EU countries among Italian towns (about 10% of the local resident population).
- Most violent crimes in this area happen within ethnically defined groups.
- A test able to assign a biological stain to a member of a particular population would be very welcome.

Population samples
Unrelated individuals from the historical series of the Institute of Legal Medicine of the University of Brescia, including a large number of immigrants from a variety of countries, were selected for the present analysis.
In addition, blood samples from subjects of known ethnicity were collected from the local hospital, following IRB approval.
The composition of the final sample is shown below.
Subjects were typed with the Profiler Plus™ and SGM Plus™ kits (totaling 13 loci), following standard protocols.

Population assignment test (1)
In theory, assigning an individual to a population on the basis of a multilocus genotype is surprisingly easy.
- The frequency of the D19S433 2/8 genotype is 38-fold higher in Blacks than in Italians.
- The likelihood of the BL-160 multilocus genotype is 3,700-fold higher in Blacks than in Italians.
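
The likelihood comparison described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the slides do not say how genotype frequencies were computed, so the sketch assumes Hardy-Weinberg proportions within each population and independence between loci. The locus names, allele labels and frequencies are hypothetical, not data from the study.

```python
from math import prod

def genotype_frequency(allele_a, allele_b, freqs):
    """Single-locus genotype frequency under Hardy-Weinberg proportions (assumed)."""
    p, q = freqs[allele_a], freqs[allele_b]
    return p * p if allele_a == allele_b else 2 * p * q

def multilocus_likelihood(genotype, pop_freqs):
    """Product of single-locus genotype frequencies (loci assumed independent)."""
    return prod(
        genotype_frequency(a, b, pop_freqs[locus])
        for locus, (a, b) in genotype.items()
    )

# Hypothetical allele frequencies in two populations at two made-up loci.
pop_1 = {"locusA": {"10": 0.40, "11": 0.60}, "locusB": {"7": 0.50, "8": 0.50}}
pop_2 = {"locusA": {"10": 0.10, "11": 0.90}, "locusB": {"7": 0.20, "8": 0.80}}

# Hypothetical individual: homozygous at locusA, heterozygous at locusB.
genotype = {"locusA": ("10", "10"), "locusB": ("7", "8")}

lik_1 = multilocus_likelihood(genotype, pop_1)
lik_2 = multilocus_likelihood(genotype, pop_2)
likelihood_ratio = lik_1 / lik_2  # values above 1 favour pop_1
```

With many loci the product becomes very small, so a practical version would probably sum log-likelihoods instead of multiplying frequencies.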

Population assignment test (2)
A computational problem (a division by zero) arises when a particular allele is absent from a population sample; in this case, an arbitrary frequency must be assigned to that allele in that population.

Blacks vs Italians
We assigned a frequency of 1/200 (= 0.005) to the missing alleles and calculated the likelihoods and the likelihood ratios using the allele frequencies estimated from all available data (about 100 subjects typed for each locus in both samples).
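
A minimal sketch of the missing-allele rule follows. The floor of 1/200 is the value quoted on the slide; the function name, allele labels and frequencies are hypothetical and serve only to show how the floor avoids a division by zero.

```python
MISSING_ALLELE_FREQUENCY = 1 / 200  # value quoted on the slide (0.005)

def allele_frequency(allele, sample_freqs, floor=MISSING_ALLELE_FREQUENCY):
    """Return the sample frequency, or the floor if the allele was never observed."""
    freq = sample_freqs.get(allele, 0.0)
    return freq if freq > 0 else floor

# Hypothetical frequencies at one locus in two population samples.
sample_a = {"12": 0.30, "13": 0.70}
sample_b = {"13": 1.00}  # allele "12" was not observed in this sample

# Without the floor this ratio would be a division by zero.
ratio = allele_frequency("12", sample_a) / allele_frequency("12", sample_b)
```

Note that this simple substitution leaves the other frequencies untouched, so the frequencies at the locus no longer sum exactly to 1.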

Simulated samples
To estimate the statistical power of the assignment test more accurately, we simulated 10,000 individuals for each of the Italian and Black samples.
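
The slides do not describe the simulation procedure, so the following is only one plausible sketch: each simulated individual receives two alleles per locus, drawn independently from the population allele frequencies (which amounts to assuming Hardy-Weinberg proportions and unlinked loci). The frequencies and the seed are hypothetical.

```python
import random

def simulate_genotype(pop_freqs, rng):
    """Draw two alleles per locus independently from the allele frequencies."""
    genotype = {}
    for locus, freqs in pop_freqs.items():
        alleles = list(freqs)
        weights = [freqs[a] for a in alleles]
        genotype[locus] = tuple(rng.choices(alleles, weights=weights, k=2))
    return genotype

def simulate_sample(pop_freqs, n_individuals, seed=0):
    """Simulate a sample of multilocus genotypes with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [simulate_genotype(pop_freqs, rng) for _ in range(n_individuals)]

# Hypothetical allele frequencies at two made-up loci.
hypothetical_freqs = {
    "locusA": {"10": 0.4, "11": 0.6},
    "locusB": {"7": 0.5, "8": 0.5},
}
simulated = simulate_sample(hypothetical_freqs, 10_000)

# Observed frequency of allele "10" in the simulated sample, as a sanity check.
copies = sum(g["locusA"].count("10") for g in simulated)
observed_freq = copies / (2 * len(simulated))
```

Each simulated genotype can then be scored under both populations, and the distribution of likelihood ratios gives an estimate of the power of the test.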

Statistical power of discriminating between the two simulated samples (Blacks vs Italians)

Conclusions
The 13 STR loci included in two commercial kits provided a limited but significant power to infer the ethnicity of immigrant groups.
Not surprisingly, the highest level of discrimination was achieved by contrasting Blacks with resident Whites.
When two alternative hypotheses about the ethnic origin of a sample can be formulated with confidence, a population assignment test can already be applied to real cases.

The objective of the present study was to assess the practicability of assigning individuals among four cattle breeds using STR markers. This goal was divided into three major tasks:
1) validating the markers used in the assignment tests through analysis of genetic heterogeneity;
2) calculating the likelihood that each animal originated from its true breed as well as from any of the others;
3) performing a statistical analysis of the assignment tests in terms of sensitivity and specificity.
- Chianina (N = 67) is a large-size, high-priced beef breed, which originated in central Italy and is the source of the renowned “Florentine steak”.
- Charolaise (N = 69) and Limousine (N = 67) are beef breeds of French origin, which hold an important share of the Italian beef market.
- The Italian Friesian (N = 66) is the main dairy breed reared in Italy, but it is also a relevant source of meat.

Estimate of allele frequencies
When one allele is missing from a putative source breed, the multilocus likelihood is zero, and the value of the LR is undetermined. Several solutions have been proposed:
- adding the test genotype to all samples
- assigning arbitrarily low values to the missing alleles
- replacing them with the inverse of the number of gene copies in each sample
- using a uniform prior distribution of allele frequencies
Since the practical application of assignment tests may ultimately imply charges of fraud, we devised a conservative method of estimating allele frequencies:
p_i = (f_i + 1) / (n_i + a)
where f_i is the number of copies of an allele observed in breed i, n_i is the number of gene copies for that locus in that breed (equal to twice its sample size), and a is the number of alleles at that locus observed in the total sample.
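
The estimator above can be written as a short function. The formula is the one given on the slide; the function name, allele labels and counts are hypothetical. Adding 1 to every count and a to the denominator keeps the frequencies summing to 1 while ensuring that no allele has zero frequency.

```python
def conservative_frequencies(allele_counts, all_alleles):
    """Apply p = (f + 1) / (n + a) to every allele seen in the total sample."""
    n = sum(allele_counts.values())  # gene copies typed in this breed
    a = len(all_alleles)             # alleles observed at the locus in the total sample
    return {
        allele: (allele_counts.get(allele, 0) + 1) / (n + a)
        for allele in all_alleles
    }

# Hypothetical counts at one locus for a breed with 50 animals (100 gene copies).
counts = {"A1": 60, "A2": 40}
alleles_in_total_sample = ["A1", "A2", "A3"]  # "A3" seen only in other breeds

freqs = conservative_frequencies(counts, alleles_in_total_sample)
# The unobserved allele "A3" receives 1 / (100 + 3) instead of 0.
```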

Discrimination between two cattle breeds using 15 STR
Limousine vs. Charolaise
True positives:
False positives:
Probability of assignment:
Charolaise vs. Limousine
True positives:
False positives:
Probability of assignment: 0.963
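
The slide does not define how true and false positives were counted, so the sketch below rests on an assumed rule: an animal is assigned to breed A when its likelihood ratio (breed A over breed B) exceeds a threshold of 1. Sensitivity is then the fraction of breed A animals assigned to A, and specificity the fraction of breed B animals not assigned to A. All LR values are hypothetical.

```python
def classification_rates(lr_true_breed, lr_other_breed, threshold=1.0):
    """Sensitivity and specificity for the rule 'assign to breed A if LR > threshold'."""
    true_pos = sum(lr > threshold for lr in lr_true_breed)
    false_pos = sum(lr > threshold for lr in lr_other_breed)
    sensitivity = true_pos / len(lr_true_breed)
    specificity = 1.0 - false_pos / len(lr_other_breed)
    return sensitivity, specificity

# Hypothetical LR values (breed A over breed B), not taken from the study.
lr_for_breed_a_animals = [120.0, 8.5, 0.6, 45.0]
lr_for_breed_b_animals = [0.01, 0.3, 2.0, 0.05]

sensitivity, specificity = classification_rates(
    lr_for_breed_a_animals, lr_for_breed_b_animals
)
```

Raising the threshold trades sensitivity for specificity, which matters when an assignment may support a charge of fraud.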

A more complete picture of individual allocation among four cattle breeds

Breed allocation using coat colour loci
Whereas anonymous microsatellite markers have been used extensively for breed allocation (and more recently anonymous SNPs have been proposed), the use of coat colour genes has received little attention so far.
However, coat colour has been used as a trademark for different cattle breeds in Europe for at least the past 200 years, so that systematic selection has been applied to particular alleles expressed at the level of the colour phenotype.
As a consequence, some breeds carry specific alleles that are directly related to their morphological identification, for example at loci associated with coat colour.

Private alleles and fixed alleles
Breed allocation based on breed-specific trait loci relies on the concepts of “private alleles” and “fixed alleles”.
- Private alleles: alleles found only in a single population.
- Fixed alleles: alleles for which all members of a population under study are homozygous, so that no other allele at this locus segregates in that population.
When we identify an allele that is connected to a phenotypic trait, is private to a particular breed, and is also fixed in that breed, the identification of that breed as the source of any of its products from which DNA can be amplified is virtually certain.
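
The two definitions can be checked mechanically from a table of allele frequencies. The sketch below is illustrative only: the breed names, allele labels and frequencies are hypothetical, and an allele is treated as absent when its sample frequency is zero, which in practice depends on sample size.

```python
def private_alleles(freqs_by_breed):
    """Alleles with non-zero frequency in exactly one breed."""
    result = {}
    for breed, freqs in freqs_by_breed.items():
        for allele, p in freqs.items():
            absent_elsewhere = all(
                other.get(allele, 0) == 0
                for name, other in freqs_by_breed.items()
                if name != breed
            )
            if p > 0 and absent_elsewhere:
                result.setdefault(breed, []).append(allele)
    return result

def fixed_alleles(freqs_by_breed):
    """Breeds in which a single allele has frequency 1 at the locus."""
    return {
        breed: allele
        for breed, freqs in freqs_by_breed.items()
        for allele, p in freqs.items()
        if p == 1.0
    }

# Hypothetical allele frequencies at one locus in three breeds.
locus = {
    "breed_X": {"a1": 1.0},
    "breed_Y": {"a2": 0.7, "a3": 0.3},
    "breed_Z": {"a2": 1.0},
}

private = private_alleles(locus)
fixed = fixed_alleles(locus)
# Alleles that are both private and fixed identify their breed unambiguously.
diagnostic = {b: a for b, a in fixed.items() if a in private.get(b, [])}
```

In this toy table only breed_X carries an allele that is both private and fixed; breed_Z is fixed for an allele it shares with breed_Y.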

A pilot study at the University of Limoges
The goal of the analysis was to explore the feasibility of using coat color genes for breed traceability in cattle.
A total of 819 animals from 22 French cattle breeds had been typed by Labogena for three coat color genes: MC1R, Silver, and Agouti.
After some data-cleaning (removing duplicated records, animals with one or more blank loci, and breeds with <25 animals), the final database included 624 animals from 18 breeds, or 34.7 animals per breed on average.

Allele frequencies at the three loci
This allele is both fixed and private!

Breeds excluded based on presence/absence of alleles
Based on the allele frequencies, the genotypes of each breed were checked for compatibility with the alleles present in all other breeds.
When a genotype included an allele that was missing in another breed, it was declared to be incompatible with that breed.
The Table shows, for each breed, the number and percentage of other breeds with which all animals of that breed are incompatible.
The only breed for which all animals are incompatible with all others is Charolais (100% incompatible breeds), but Normande (82% incompatibilities) and Blanc Bleu Belge, Prim' Holstein, and Tarentaise (76% incompatibilities) are also well discriminated.
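
The presence/absence check can be sketched as follows. The data structures, breed names and alleles are hypothetical; the rule itself follows the slide: a genotype is incompatible with a breed if it carries an allele never observed in that breed, and a breed counts as excluded only when all animals of the source breed are incompatible with it.

```python
def is_compatible(genotype, breed_alleles):
    """True if every allele in the genotype was observed in the breed."""
    return all(
        allele in breed_alleles[locus]
        for locus, pair in genotype.items()
        for allele in pair
    )

def fully_incompatible_breeds(animals, source_breed, alleles_by_breed):
    """Other breeds with which all animals of source_breed are incompatible."""
    return [
        breed
        for breed, alleles in alleles_by_breed.items()
        if breed != source_breed
        and not any(is_compatible(g, alleles) for g in animals)
    ]

# Hypothetical sets of alleles observed per locus in three breeds.
alleles_by_breed = {
    "breed_X": {"locus1": {"a"}, "locus2": {"c"}},
    "breed_Y": {"locus1": {"b"}, "locus2": {"c", "d"}},
    "breed_Z": {"locus1": {"a", "b"}, "locus2": {"c"}},
}

# Hypothetical genotypes of two animals from breed_X.
breed_x_animals = [
    {"locus1": ("a", "a"), "locus2": ("c", "c")},
    {"locus1": ("a", "a"), "locus2": ("c", "c")},
]

excluded = fully_incompatible_breeds(breed_x_animals, "breed_X", alleles_by_breed)
percent_excluded = 100 * len(excluded) / (len(alleles_by_breed) - 1)
```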

Breeds excluded based on genotype likelihood
A more refined measure of the usefulness of these three markers for cattle breed traceability is based on the calculation of multilocus likelihoods.
Based on the genotype at the three loci, the likelihood of each animal's genotype was calculated under its own breed and under every other breed, and these likelihoods were then converted into posterior probabilities assuming equal priors.
Taking a mean posterior probability below 1%, averaged over all animals of a breed, as evidence of exclusion, the number of breeds with which any given breed is incompatible changes as shown in the Table.
Both Charolais and Normande show 100% incompatibilities. The next highest values are those of Prim’Holstein (94% incompatibilities) and Blanc Bleu Belge (82% incompatibilities).
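
With equal priors, the posterior probability of each breed is simply its likelihood divided by the sum of the likelihoods over all candidate breeds. The sketch below applies this to a few animals of one breed and then uses the 1% threshold on the mean posterior mentioned on the slide. The likelihood values and breed names are hypothetical.

```python
def posterior_probabilities(likelihoods):
    """Normalise per-breed likelihoods; equal priors cancel out of Bayes' rule."""
    total = sum(likelihoods.values())
    return {breed: lik / total for breed, lik in likelihoods.items()}

def mean_posteriors(per_animal_likelihoods):
    """Average the posterior of each candidate breed over all animals."""
    posteriors = [posterior_probabilities(l) for l in per_animal_likelihoods]
    breeds = per_animal_likelihoods[0].keys()
    return {b: sum(p[b] for p in posteriors) / len(posteriors) for b in breeds}

# Hypothetical genotype likelihoods for two animals sampled from breed_X.
breed_x_likelihoods = [
    {"breed_X": 0.08, "breed_Y": 0.0196, "breed_Z": 0.0004},
    {"breed_X": 0.06, "breed_Y": 0.0394, "breed_Z": 0.0006},
]

means = mean_posteriors(breed_x_likelihoods)
# Breeds whose mean posterior falls below 1% are treated as excluded.
excluded = [b for b, p in means.items() if p < 0.01]
```

This sketch assumes that at least one candidate breed gives a non-zero likelihood for every animal, which holds when allele frequencies are estimated so that no allele has zero frequency.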

Breeds clustered by coat colour genes
The average probabilities of assignment of the animals of each breed may easily be converted into a similarity matrix, from which a neighbour-joining dendrogram can be obtained.
Six major clusters of breeds can be distinguished.
The genotypes at the three investigated loci make it possible to assign animals easily to a cluster, but not to a breed within a cluster.
The figure also shows the prevalent genotypes that are mostly responsible for the observed clustering.

Perspectives
In conclusion, this work shows that the three typed loci could form a reasonable basis to implement a system of traceability for French cattle breeds.
More work is necessary to increase the breed sample size (ideally, at least 100 animals from each breed should be typed to enter a validated database for estimating allele frequencies more precisely).
In addition, other genes responsible for variation in coat color could be typed, thus increasing the discrimination capacity of a test that can be easily implemented by the industry.