ULUSAL TARIM KONGRESİ, 26-29 EKİM 2013, ANTALYA APPLICATION OF CLASSIFICATION AND REGRESSION TREE METHODS IN AGRICULTURE Ecevit EYDURAN1 Adile TATLIYER2.


Ecevit EYDURAN1, Adile TATLIYER2, Mohammad Masood TARIQ3, Abdul WAHEED4
1 Iğdır University, Animal Science Department, Biometry and Genetics Unit, Iğdır
2 Süleyman Demirel University, Animal Science Department, Biometry and Genetics Unit, Isparta
3 Center for Advanced Studies in Vaccinology and Biotechnology (CASVAB), University of Balochistan, Quetta, Balochistan, Pakistan
4 Faculty of Veterinary Sciences, Bahauddin Zakariya University, Multan, Pakistan
*Corresponding author: ecevit.eyduran@gmail.com

Abstract
The aims of this investigation were to apply classification and regression tree methods to different agricultural data sets and to illustrate how to interpret the results both statistically and agriculturally. These methods, which present their results in visual form, can be used instead of the General Linear Model (GLM) in the presence of multicollinearity, outliers, and missing data. The investigation emphasizes that applying classification and regression tree methods to reveal homogeneous sub-groups can yield more detailed information on the data sets examined.

Key Words: Agricultural Sciences, Classification Tree, Regression Tree.

Introduction
In agricultural sciences, describing the complex relationships between important measurable characteristics and yield characteristics matters for attaining the desired genetic progress in the yield characteristics emphasized in animal and plant breeding. These relationships have generally been examined with well-known statistical techniques such as Pearson correlation, simple linear regression, multiple linear regression, ridge regression, and path analysis, each of which rests on certain assumptions.
Investigating only the bivariate relationship between a yield characteristic and another characteristic with the first two techniques is not an efficient approach and causes a loss of agricultural information. Multiple linear regression, for instance, may not give a good interpretation in the presence of multicollinearity and outliers (Jahan et al., 2013; Eyduran et al., 2013). For these reasons, flexible statistical techniques such as Classification and Regression Tree methods should be applied to assess and interpret such complex relationships properly. Classification and Regression Tree methods give more advantageous results in the presence of multicollinearity, outliers, and non-linearity, and especially when a large number of independent variables must be evaluated statistically (Mendes and Akkartal, 2009; Karabag et al., 2010). Such methods have rarely been applied in agricultural sciences, but they may gain prominence. The aim of the current investigation was to apply Classification and Regression Tree methods, which present statistical results in visual form, to agricultural data sets.

Material and Method
Data on Mengali lambs in Pakistan were used to apply the classification and regression tree methods. The variables were as follows:
BWT: birth weight (continuous variable)
TOB: type of birth (discrete variable: single and twin)
YEAR: year of birth (discrete variable: 2006, 2007, 2008 and 2009)
SEX: sex (discrete variable: male and female)
DAM AGE: dam age (continuous variable)
DAM WEIGHT: dam weight at birth (continuous variable)

Motivating examples

Classification tree method
The first example applies the classification tree method to data from animal science, one branch of agricultural sciences. The resulting classification tree diagram is shown in Fig. 1.
In the classification tree diagram, sex was the dependent variable and the data of 138 Mengali lambs were evaluated. At Node 0, the root node at the top of the diagram, 55.1% of these lambs were female and 44.9% were male. Node 0 was divided into child Nodes 1 and 2. Node 2, the group of lambs with BWT > 3.75 kg, was 63.3% male and 36.7% female. This indicates that male lambs were heavier at birth than female ones. Node 1, the group of lambs with BWT < 3.75 kg, was branched into two nodes (Nodes 3 and 4) on the basis of type of birth (TOB). For instance, single lambs with BWT < 3.75 kg were 28.6% male and 71.4% female.

Fig 1. The diagram of Classification Tree Method.

Regression Tree Method
The second example applies the regression tree method to data from animal science, an important branch of agricultural sciences. The regression tree diagram is shown in Fig. 2. As the most influential variable, TOB significantly affected BWT (P<0.01). Node 0 was divided into Node 1 (single lambs) and Node 2 (twin lambs) according to TOB. Sex had a statistically significant effect on BWT only in single lambs (P<0.01). Node 2 is called a terminal node because it was not divided further. Node 1 was branched into Node 3 (single-female) and Node 4 (single-male) according to sex. BWT of single-female lambs in Node 3 was significantly influenced by year of birth, whereas dam age affected BWT of single-male lambs (P<0.01). The average BWT (4 kg) of male Mengali lambs born from dams with age > 59 months (Node 10) was higher than that of Node 8 (male lambs born from dams with age < 36 months) and Node 9 (male lambs born from dams with 36 < age < 59 months). Average BWT of single-female lambs increased over the years 2006 to 2009, ranging from 3.391 to 3.728 kg. In conclusion, the heaviest average BWT (4 kg) was obtained from Mengali lambs born from dams with age > 59 months (Node 10).

Fig 2. The diagram of Regression Tree Method.

Conclusion
The use of Classification and Regression Tree methods is recommended in the presence of multicollinearity, outliers, and non-linearity, and especially for assessing complex data sets with a large number of independent variables. Such methods have been applied only rarely in agricultural sciences and deserve consideration.

References
Jahan, M., M. M. Tariq, M. A. Kakar, E. Eyduran and A. Waheed (2013). Predicting Body Weight from Body and Testicular Characteristics of Balochi Male Sheep in Pakistan using Different Statistical Analyses. J. Anim. Plant Sci. 23(1).
Eyduran, E., I. Yilmaz, M. M. Tariq and A. Kaygisiz (2013). Estimation of 305-d Milk Yield using Regression Tree Method in Brown Swiss Cattle. J. Anim. Plant Sci. 23(3).
Mendes, M. and E. Akkartal (2009). Regression tree analysis for predicting slaughter weight in broilers. Italian J. Anim. Sci. 8: 615-624.
Karabağ, K., S. Alkan and M. Mendeş (2010). Classification Tree Method for Determining Factors that Affecting Hatchability in Chukar Partridge (Alectoris chukar) Eggs. Kafkas Univ Vet Fak Derg, 16(5): 723-727.
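As a rough illustration of how a regression tree of the kind described under Regression Tree Method might choose its first split, the sketch below scores each candidate split by how much it reduces the within-node sum of squares of BWT. The poster does not state which splitting criterion or software was used, and the reported P-values suggest a significance-based criterion may have been applied instead, so the CART-style variance reduction here is an assumption. The function names `sse` and `best_split` and the eight lamb records are hypothetical and invented for illustration; they are not the Mengali data.

```python
from statistics import mean


def sse(values):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, predictors, target):
    """Find the binary split 'predictor == level' versus the rest that most
    reduces the sum of squares of the target.

    Returns (predictor, level, reduction), or None if no split is possible.
    """
    parent = sse([r[target] for r in rows])
    best = None
    for p in predictors:
        for level in sorted({r[p] for r in rows}):
            left = [r[target] for r in rows if r[p] == level]
            right = [r[target] for r in rows if r[p] != level]
            if not left or not right:
                continue  # a split must leave records on both sides
            reduction = parent - sse(left) - sse(right)
            if best is None or reduction > best[2]:
                best = (p, level, reduction)
    return best


# Hypothetical toy records (made-up values, NOT the Mengali lamb data).
lambs = [
    {"TOB": "single", "SEX": "male", "BWT": 4.0},
    {"TOB": "single", "SEX": "female", "BWT": 3.6},
    {"TOB": "single", "SEX": "male", "BWT": 3.9},
    {"TOB": "single", "SEX": "female", "BWT": 3.5},
    {"TOB": "twin", "SEX": "male", "BWT": 3.0},
    {"TOB": "twin", "SEX": "female", "BWT": 2.9},
    {"TOB": "twin", "SEX": "male", "BWT": 3.1},
    {"TOB": "twin", "SEX": "female", "BWT": 3.0},
]

# First split at the root node, then a second split within single lambs only.
root_split = best_split(lambs, ["TOB", "SEX"], "BWT")
singles = [r for r in lambs if r["TOB"] == "single"]
single_split = best_split(singles, ["SEX"], "BWT")

print(root_split[0], round(root_split[2], 3))
print(single_split[0], round(single_split[2], 3))
```

In this toy data the root split falls on TOB, and the single lambs are then split on SEX, which mirrors the order of splits reported in the poster. A full tree would repeat the search within every child node and stop according to a rule such as a minimum node size.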