Refined pea core collection based on qualitative and quantitative characteristics Clarice J. Coyne, USDA-ARS Western Regional Plant Introduction, Washington.


Clarice J. Coyne, USDA-ARS Western Regional Plant Introduction, Washington State University, Pullman, WA
Poster 537. ASA-CSSA-SSSA Annual Meeting, Denver, CO. November 1-6

References

Coyne, Razai, Baik, Grusak. 2003. Variation for protein in the USDA pea core collection. NAPIA Biannual Meeting Abstracts.
Grusak, Knewtson, Ibrikci, Muehlbauer. Potential for Improving Micronutrient Nutrition in Cool Season Legumes. International Plant & Animal Genomes XI Conference, pag.org/11/abstracts/W17_W110_XI.html.
McPhee and Muehlbauer. Biomass production and related characters in the core collection of Pisum germplasm. Genetic Resources and Crop Evolution 48:
Rohlf. 2000. NTSYSpc: Numerical Taxonomy and Multivariate Analysis System, version 2.1. Exeter Software, NY.
Simon and Hannan. Development and use of core subsets of cool-season food legume germplasm collections. HortScience 30:907.
Sneath and Sokal. 1973. Numerical Taxonomy. W.H. Freeman and Company, San Francisco.

Materials and methods

The STAND module (NTSYSpc) was used to standardize the variables. STAND performs a variety of linear transformations of the variables in a data matrix; its default options correspond to the usual standardization of a matrix in numerical taxonomy. The module was run before SIMINT to reduce the effects of different scales of measurement among characters. The linear transformation has the form

y' = (y - a)/b - c

where several optional values for a, b, and c are provided. By choosing the proper codes for a and b, several different standardizations are possible.

Dissimilarity coefficients for interval-measure (quantitative) data were generated with the SIMINT module (NTSYSpc). The input was the standardized rectangular data matrix and the output was a symmetric dissimilarity matrix. The default coefficient, DIST (average taxonomic distance), was used to generate the matrix.
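As a rough illustration of the standardization and distance steps, the sketch below works in plain Python rather than NTSYSpc. It assumes that the "usual standardization" means a = column mean, b = column standard deviation and c = 0, and that average taxonomic distance is the square root of the mean squared difference over the traits. The function names and the trait values are hypothetical and are not taken from the poster.

```python
import math
import statistics


def standardize(matrix):
    """Column-wise y' = (y - a) / b with a = mean, b = sample SD (assumed choice)."""
    columns = list(zip(*matrix))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]
    return [
        [(y - m) / s for y, m, s in zip(row, means, sds)]
        for row in matrix
    ]


def average_taxonomic_distance(x, y):
    """Square root of the mean squared difference over the n traits."""
    n = len(x)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / n)


def distance_matrix(rows):
    """Symmetric accession-by-accession dissimilarity matrix."""
    return [[average_taxonomic_distance(r, s) for s in rows] for r in rows]


# Hypothetical data: 4 accessions (rows) x 2 traits (columns) on different scales.
traits = [
    [20.0, 150.0],
    [24.0, 210.0],
    [28.0, 180.0],
    [32.0, 240.0],
]
z = standardize(traits)   # each trait now has mean 0 and SD 1
d = distance_matrix(z)    # symmetric matrix with a zero diagonal
```

Standardizing first keeps a trait measured in large units from dominating the distances, which is the stated reason for running STAND before SIMINT.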
A dendrogram was generated with the NTSYSpc SAHN module, which performs sequential, agglomerative, hierarchical, and nested (SAHN) clustering, using the unweighted pair-group method with arithmetic averages (UPGMA) (Sneath and Sokal, 1973; Rohlf, 2000). The EUCLID coefficient was used to generate the dissimilarity matrix of Euclidean distances for the new core selected. Random numbers were assigned to accessions in similar clusters and used to select the accessions for the refined core (Figure 2).

Table 1. Trait data used to refine the pea core collection.

Figure 1. Frequency histogram of total seed protein in the first USDA Pisum core collection, showing a two-fold difference between the lowest and highest protein concentrations (Coyne et al. 2003). A normal distribution for the trait is also indicated.

The Pisum germplasm collection contains 3918 accessions. The first Pisum core collection contained 504 accessions, selected on geographical origin and flower color (Simon and Hannan). Extensive phenotypic data have since been entered into the National Plant Germplasm System Germplasm Resources Information Network (GRIN) database by cooperators, and 26 quantitative traits were used to select two new core collections. The new pea core collection consists of 310 accessions, a subset of the original 504. The new core and a mini-pea core collection of 50 accessions were created using yield component data published by McPhee (390 accessions), mineral nutrient data collected by Grusak (481 accessions), and seed protein content from Coyne (481 accessions). A comparison of means, minimum and maximum values indicates no loss of genetic variation in the trait values used in this analysis.

Next, the complete data set from GRIN will be analyzed to further refine the new core collection. Approximately 550 pea accessions are genetic stocks, so a 10% core of the remaining 3368 accessions would be 336 accessions.

Figure 2. Refined pea core collection of 310 accessions based on 26 quantitative traits.
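The clustering and selection steps from Materials and methods can be sketched in plain Python as follows. This is not the NTSYSpc SAHN procedure itself: it assumes the dendrogram is cut at a fixed number of clusters k and that one accession is drawn at random from each cluster, neither of which the poster specifies. All names and data values are hypothetical.

```python
import math
import random
from itertools import combinations


def euclid(x, y):
    """Euclidean distance between two trait vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def upgma_clusters(rows, k):
    """Agglomerate by average pairwise distance (UPGMA) until k clusters remain."""
    clusters = [[i] for i in range(len(rows))]

    def avg_dist(c1, c2):
        total = sum(euclid(rows[i], rows[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        # Find the pair of clusters with the smallest average distance and merge it.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return clusters


def pick_core(clusters, rng):
    """Draw one accession at random from each cluster (assumed sampling rule)."""
    return sorted(rng.choice(c) for c in clusters)


def summary(rows, trait):
    """Minimum, mean and maximum of one trait, for comparing core with full set."""
    values = [r[trait] for r in rows]
    return min(values), sum(values) / len(values), max(values)


# Hypothetical standardized data: 6 accessions x 2 traits, forming 3 tight pairs.
rows = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0], [10.0, 0.2]]
clusters = upgma_clusters(rows, k=3)
core = pick_core(clusters, random.Random(0))   # seeded for reproducibility
full_stats = summary(rows, trait=0)
core_stats = summary([rows[i] for i in core], trait=0)
```

Comparing `full_stats` with `core_stats` trait by trait mirrors the poster's check of means, minimum and maximum values between the full set and the selected core.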