G2 Crop CIS meeting Ispra, May 14 – 15, 2012 Presented by: Institute of Geodesy and Cartography.


Utility assessment of BioPar products for wheat yield forecasting in Europe. Crop yield estimation: detailed description of methods and comparison of results on MARSOP and BioPar data.

Utility assessment: IGiK contribution
Objective: to test the performance of MARS and BioPar indicators for yield forecasting over a European window. The purpose is to show and assess their practical use in crop monitoring and yield forecasting. The work compares the yield estimation accuracy obtained from the two data sets.

European agro-climatic zones
Source: Iglesias, A., Garrote, L., Quiroga, S., Moneo, M.: Impacts of climate change in agriculture in Europe. PESETA-Agriculture study. EUR EN; DOI /33218; EC 2009.

Another grouping of regions: by the mean ordinal number of the decade (ten-day period) in which the annual maximum of NDVI occurred.

Statistical model: Partial Least Squares Regression (PLSR)
PLSR chooses a few components that are linear combinations of the explanatory variables X and performs linear regression of the response variable Y on these components, instead of regressing on all X variables.
Y: response variable (yield value)
X_n: explanatory variables (values of vegetation indices)
n: sequential number of the ten-day period taken into account
d_beg, d_end: numbers of the ten-day periods corresponding to the beginning and the end of the growing season, respectively (different for different agro-climatic zones)
c_n: coefficients of the function f, generated by the PLS regression algorithm
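The model equation itself did not survive in the transcript. Given the symbols defined above and the statement that the regression is linear, one plausible form is sketched below. The intercept c_0 is an assumption and is not named on the slide.

```latex
Y = f\left(X_{d_{beg}}, \dots, X_{d_{end}}\right)
  = c_0 + \sum_{n = d_{beg}}^{d_{end}} c_n \, X_n
```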

Statistical model: Partial Least Squares Regression (PLSR)
- a generalization of multiple regression
- suited to many (correlated) predictor variables and few observations
- derives orthogonal components using the cross-covariance matrix between the response variable and the explanatory variables
- a dimension reduction technique similar to Principal Component Regression (PCR)
  - PCR: the coefficients reflect the covariance structure between the predictor variables X
  - PLSR: the coefficients reflect the covariance structure between the predictor variables X and the response variable Y
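To make the idea of components derived from the X-Y cross-covariance concrete, here is a minimal NumPy sketch of single-response PLS (PLS1) using the standard NIPALS deflation scheme. It is an illustration only, not the implementation used in the study; the function name `pls1_fit` and its signature are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model (single response) with the NIPALS deflation scheme.

    Returns (coef, intercept) so that predictions are X @ coef + intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean          # centred working copies
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                        # direction of maximal covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                     # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                           # component scores
        tt = t @ t
        p = Xk.T @ t / tt                    # X loadings
        c = yk @ t / tt                      # regression of y on the component
        Xk = Xk - np.outer(t, p)             # deflate X: later scores are orthogonal to t
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return np.zeros(X.shape[1]), y_mean
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # coefficients expressed on the original X
    intercept = y_mean - x_mean @ coef
    return coef, intercept
```

With as many components as X has independent columns, this sketch reproduces ordinary least squares. The benefit in the setting described above comes from using only a few components when predictors are many and correlated and observations are few.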

Model evaluation: leave-one-out cross-validation
- for each year of data, the PLS regression model was built with that year excluded
- the yield prediction for the excluded year was made
- predicted and actual yield values were compared
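The three steps above can be sketched as a short loop. The model is passed in as a pair of `fit` / `predict` callables; the ordinary least-squares stand-in shown here is an assumption made to keep the example self-contained and is not the PLS model from the slides. All function names are hypothetical.

```python
import numpy as np

def leave_one_year_out(X, y, fit, predict):
    """Return one out-of-sample prediction per year (row) of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    preds = np.empty_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i             # every year except year i
        model = fit(X[keep], y[keep])             # build the model without year i
        preds[i] = predict(model, X[i:i + 1])[0]  # predict the excluded year
    return preds

def fit_ols(X, y):
    """Stand-in model: least squares with an intercept (not PLS)."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(A, y, rcond=None)[0]

def predict_ols(beta, X):
    return beta[0] + X @ beta[1:]
```

The returned predictions can then be compared with the observed yields year by year.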

Model evaluation: leave-one-out cross-validation
Performance was evaluated in terms of cross-validation mean errors:
- Mean Percentage Error (MPE)
- Mean Absolute Percentage Error (MAPE)
- Root Mean Square Error (RMSE)
where Yield_obs_i is the actual yield in year i, Yield_pred_i is the yield prediction made for year i, and N is the number of observations (years) taken into account.
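The formulas for the three error measures are not preserved in the transcript. The sketch below uses their usual textbook definitions. The sign convention for MPE (predicted minus observed, relative to observed) and the expression of MPE and MAPE in percent are assumptions.

```python
import numpy as np

def mpe(obs, pred):
    """Mean Percentage Error in percent; positive means over-prediction (assumed sign)."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return 100.0 * np.mean((pred - obs) / obs)

def mape(obs, pred):
    """Mean Absolute Percentage Error in percent."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return 100.0 * np.mean(np.abs((pred - obs) / obs))

def rmse(obs, pred):
    """Root Mean Square Error, in the units of the yield data."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return np.sqrt(np.mean((pred - obs) ** 2))
```

MPE lets errors of opposite sign cancel, so it measures bias, whereas MAPE and RMSE measure the typical size of the error.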

Results: cross-validation, grouping by agro-climatic zones (figure comparing BioPar and MARS)

Results: cross-validation, grouping by maxNDVI (figure comparing BioPar and MARS)

Chosen regions
For each European NUTS region:
WA: wheat area harvested (from Eurostat, mean value of the 11 considered years)
TA: total arable land area (from the arable land mask)

Chosen regions
- DK, Atlantic Central
- ES, Mediterranean North
- DE, Continental North
- DEE, Continental North
- ES, Mediterranean North
Labels: mean, lowest

Prediction errors (two slides of figures)

Year 2009 yield prognosis (five slides of figures)

Year 2009 prediction errors (three slides of figures)

Year 2009 regression coefficients (three slides of figures)

Models for aggregated data
A strategy to increase the number of observations by grouping the NUTS regions. As the number of years of yield data is small, the possibility of building PLS regression models for aggregated data was investigated.
Levels of NUTS-2 region aggregation considered:
o agro-climatic zone
o country
o country / agro-climatic zone
o NUTS-1 / agro-climatic zone

Models for aggregated data
For each NUTS-2 region, yield data were standardized:
yield_standardized = (yield - mean) / standard deviation
Standardized yield values and vegetation index values from all NUTS-2 regions constituting one aggregated region were used to build the PLS regression model for that aggregated region.
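A minimal sketch of the per-region standardization and its inverse follows. The use of the sample standard deviation (`ddof=1`) is an assumption, since the slide does not say which estimator was used; the function names are hypothetical.

```python
import numpy as np

def standardize(yields):
    """Return (z, mean, std) for one region's yield series."""
    y = np.asarray(yields, dtype=float)
    mean = y.mean()
    std = y.std(ddof=1)          # sample standard deviation (assumed)
    return (y - mean) / std, mean, std

def destandardize(z, mean, std):
    """Map standardized values back to the region's yield units."""
    return np.asarray(z, dtype=float) * std + mean
```

Standardizing each region separately removes differences in yield level and variability between regions, which is what allows their observations to be pooled into one model.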

Models for aggregated data: cross-validation
The predictive ability of the model for an aggregated region was assessed with cross-validation. For each year of the data:
- The PLS regression model was built on data that excluded the year considered (the standardization procedure for each NUTS-2 region was repeated).
- For each NUTS-2 region constituting the aggregated region, the standardized yield for the year considered was predicted and the destandardized yield value was calculated.
- The predicted yield value was compared with the observed yield.
Cross-validation MAPE, MPE, and the Nash–Sutcliffe coefficient were calculated.
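The procedure above can be sketched as follows. Note how the mean and standard deviation of each region are recomputed from the training years only, so the excluded year does not leak into the standardization. The least-squares stand-in model, the dictionary layout of `regions`, and all function names are assumptions for illustration; the study used PLS regression.

```python
import numpy as np

def fit_linear(X, z):
    """Stand-in for the PLS fit: least squares with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(A, z, rcond=None)[0]

def predict_linear(beta, X):
    return beta[0] + X @ beta[1:]

def aggregated_cv(regions, fit=fit_linear, predict=predict_linear):
    """regions: dict name -> (X, y), with X of shape (years, periods), y of shape (years,).

    All regions are assumed to cover the same years in the same order.
    Returns dict name -> array of destandardized out-of-sample predictions.
    """
    n_years = len(next(iter(regions.values()))[1])
    preds = {name: np.empty(n_years) for name in regions}
    for year in range(n_years):
        train = np.arange(n_years) != year
        stats, X_parts, z_parts = {}, [], []
        for name, (X, y) in regions.items():
            X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
            mean, std = y[train].mean(), y[train].std(ddof=1)  # training years only
            stats[name] = (mean, std)
            X_parts.append(X[train])
            z_parts.append((y[train] - mean) / std)
        # One pooled model for the whole aggregated region.
        model = fit(np.vstack(X_parts), np.concatenate(z_parts))
        for name, (X, _) in regions.items():
            X = np.asarray(X, dtype=float)
            z_hat = predict(model, X[year:year + 1])[0]
            mean, std = stats[name]
            preds[name][year] = z_hat * std + mean             # back to yield units
    return preds
```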

Models for aggregated data: Nash–Sutcliffe model efficiency coefficient
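The formula shown on this slide is missing from the transcript. The standard definition of the Nash–Sutcliffe coefficient is given below, written here with the yield notation used earlier in the presentation; the overbar denotes the mean of the observed yields.

```latex
NSC = 1 - \frac{\sum_{i=1}^{N} \left(\mathrm{Yield\_obs}_i - \mathrm{Yield\_pred}_i\right)^2}
               {\sum_{i=1}^{N} \left(\mathrm{Yield\_obs}_i - \overline{\mathrm{Yield\_obs}}\right)^2}
```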

Models for aggregated data
Nash–Sutcliffe coefficient (NSC) values can range from −∞ to 1:
- NSC = 1: a perfect match of modeled to observed data
- NSC = 0: the model predictions are as accurate as the mean of the observed data
- NSC < 0: the observed mean is a better predictor than the model; in other words, the residual variance (the numerator of the ratio in the coefficient) is larger than the data variance (its denominator)
The closer the model efficiency is to 1, the more accurate the model is.
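The three regimes can be checked numerically with a few lines of Python. This is a sketch using the standard definition of the coefficient; the function name is hypothetical.

```python
import numpy as np

def nash_sutcliffe(obs, pred):
    """1 minus the ratio of residual sum of squares to the sum of squares about the observed mean."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    residual = np.sum((obs - pred) ** 2)
    spread = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - residual / spread

obs = [2.0, 4.0, 6.0]
perfect = nash_sutcliffe(obs, obs)                  # predictions equal observations
mean_only = nash_sutcliffe(obs, [4.0, 4.0, 4.0])    # always predict the observed mean
reversed_ = nash_sutcliffe(obs, [6.0, 4.0, 2.0])    # worse than the observed mean
```

Here `perfect` is 1, `mean_only` is 0, and `reversed_` is negative.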

Aggregation for agro-climatic zones
Table columns: number of regions; MPE, MAPE, and NSC for each of NDVI_MARS, fAPAR_MARS, NDVI_BioPar, and fAPAR_BioPar.
Table rows (agro-climatic zones): Alpine, Atlantic Central, Atlantic North, Atlantic South, Boreal, Continental North, Continental South, Mediterranean North, Mediterranean South.

Country / agro-climatic zone
Table columns: country; agro-climatic zone; number of NUTS regions; NSC for MARS (VCI, FCI) and for BioPar (VCI, FCI).
Table rows:
Austria: AT_Alpine, AT_Continental North
Belgium: BE_Atlantic Central
Germany: DE_Atlantic Central, DE_Continental North
Denmark: DK_Atlantic Central
Spain: ES_Atlantic South, ES_Mediterranean North, ES_Mediterranean South
Finland: FI_Boreal
France: FR_Atlantic Central, FR_Atlantic South, FR_Mediterranean North
Hungary: HU_Continental North, HU_Continental South
Ireland: IE_Atlantic North
Italy: IT_Alpine, IT_Mediterranean North, IT_Mediterranean South
Lithuania: LT_Continental North
Netherlands: NL_Atlantic Central
Poland: PL_Continental North
Portugal: PT_Atlantic South
Romania: RO_Continental South
Sweden: SE_Atlantic Central, SE_Boreal
Slovakia: SK_Continental North
Great Britain: UK_Atlantic Central, UK_Atlantic North

THANK YOU VERY MUCH