Nairobi 1-2 October
Some Approaches to Agricultural Statistics
Main approaches to agricultural statistics (1)
  Expert subjective estimations
    In each administrative unit, a local expert fills in a form with his or her assessment
  Census
    Farm census
    General population census (households)
  List frame surveys
    Statistical sampling
      – Small administrative units can be used as first sampling stage
    Selected farms (“purposive sampling”)
Main approaches to agricultural statistics (2)
  Area frame sampling
  Observations on the ground
    – Crop area
    – Yield
        Expert eye estimations
        Objective measurements
  Remotely sensed observations
    – Photo-interpretation
    – Classified images
    – Vegetation or yield indicators
  Agro-meteorological models
Expert subjective estimates
  Advantages
    Cheap if there is a network of agricultural experts
    No sampling error
    Easy to manage
    All items can be addressed
      – Crop area and yield
      – Livestock
      – Means of production
  Disadvantages
    No idea of the accuracy of the data
    Quality (non-sampling error) is difficult to control unless a sample survey is made
    Changes are often underestimated
  Generally not recommended, but in some situations it can be the only alternative.
  Can be used as covariables to improve the accuracy of a sample survey
    Regression estimator or similar
Farm census
  Advantages
    No sampling error
    Detailed information on the farm structure
  Disadvantages
    Expensive: in general it can be carried out only every 10 years.
      – Heavy to manage in many countries.
      – Items that change every year are not included (e.g. crop area and yield)
    Only farms above a size threshold are included: bias (sometimes a systematic underestimate of more than 10% for a recent census).
    Possible additional bias if farmers think that the data can be used for tax purposes.
  Can be used as list frame for sample surveys
Population census (agricultural holdings)
  Subset of the population census: holdings with some agricultural activity.
  Compared to farm census:
  Advantages
    Agricultural and non-agricultural activity (income) can be analysed together
    Smaller bias (part-time farming included)
  Disadvantages
    Farm structure is more difficult to analyse
    More problematic to use as list frame for sample surveys
List frame surveys from census
  A statistical sample is selected from the census (farms or households)
  Advantages
    Flexible: general or specialised surveys possible.
    Stratification can be very efficient if farms have heterogeneous sizes or are specialised in different productions.
    Often smaller bias than census (quality control is easier)
  Disadvantages
    Bias can be important if:
      – Census is incomplete or not updated
      – Answers of farmers are not fully reliable
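To illustrate why stratification helps when farm sizes are heterogeneous, here is a minimal sketch of the textbook stratified estimator of a total with simple random sampling inside each stratum. The function name `stratified_total` and the input layout are assumptions made for this example; they do not come from the presentation.

```python
import math
import statistics


def stratified_total(strata):
    """Estimate a population total from a stratified simple random sample.

    strata: list of (N_h, sample_values), where N_h is the number of farms
    listed in stratum h of the census frame (hypothetical input layout).
    Returns (estimated_total, standard_error).
    """
    total = 0.0
    variance = 0.0
    for N_h, sample in strata:
        n_h = len(sample)
        mean_h = statistics.fmean(sample)
        s2_h = statistics.variance(sample)  # sample variance, needs n_h >= 2
        total += N_h * mean_h
        # Variance of the stratum total, with finite population correction
        variance += N_h ** 2 * (1 - n_h / N_h) * s2_h / n_h
    return total, math.sqrt(variance)


# Illustrative numbers only: many small farms, few large farms (crop area in ha)
strata = [(100, [1.0, 2.0, 3.0]), (10, [50.0, 60.0])]
total, se = stratified_total(strata)
print(f"estimated total: {total:.1f} ha, standard error: {se:.1f} ha")
```

Because the variance is computed within each stratum, the large difference between small and large farms does not inflate the standard error, which is the efficiency gain the slide refers to.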
Two-stage list frame if census is unavailable
  A statistical sample of (small) administrative units is selected
  A “mini-census” is made in each of the selected administrative units
  A statistical sample is selected in each of the mini-censuses
  Advantages compared to list frame survey on a census
    Easier to update the sampling frame (smaller bias)
  Disadvantages
    Less flexible than sampling on a proper census
    Less efficient stratification
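The three steps above can be sketched as an estimator, assuming simple random sampling at both stages (the slides do not specify the selection scheme, so this is only one possible choice). The name `two_stage_total` and its arguments are hypothetical.

```python
def two_stage_total(n_units_in_population, sampled_units):
    """Estimate a population total from a two-stage sample.

    n_units_in_population: number of administrative units in the country/region.
    sampled_units: list of (M_i, values), where M_i is the number of farms
    counted by the mini-census in sampled unit i, and values are the
    observations on the farms sampled within that unit.
    Assumes simple random sampling at both stages (an assumption of this sketch).
    """
    n = len(sampled_units)
    # Expand each sampled unit to its own estimated total
    unit_totals = [M_i * sum(values) / len(values) for M_i, values in sampled_units]
    # Expand the sampled units to the whole population of units
    return n_units_in_population / n * sum(unit_totals)


# Illustrative numbers only: 50 administrative units, 2 of them sampled
estimate = two_stage_total(50, [(20, [1.0, 3.0]), (10, [2.0, 2.0, 2.0])])
print(estimate)
```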
Purposive sampling of farms
  A set of farms is selected without a proper statistical method.
  Advantages
    Provides an emergency solution if a proper statistical method is not applicable
      – Crisis situations
    Avoids high rates of non-response
      – Data difficult to provide or sensitive (accountancy)
  Disadvantages
    Requires a good knowledge of covariables in the population for extrapolation
    Very high risk of bias
  May produce acceptable results on the inter-annual change rates.
Area Frame Sampling
  The sampling frame is not a list of farms, but the geographic space divided into sampling units:
    Segments: portions of territory, generally 9 ha to 400 ha.
      – Physical boundaries: segments are delimited by roads, rivers or field limits
      – Geometric shape: squares, ...
    Points: in practice a “point” is conceived as a piece of land (3 x 3 m)
    Transects: straight lines of a given length
      – Often used for environmental observations.
  Observation mode:
    Direct on the ground: crop, yield estimation, ...
    Interview with the farmers who manage the selected fields
  Sampling techniques:
    Random or systematic
    Clustered or unclustered
    Stratified or non-stratified, ...
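As a minimal sketch of one of the options listed above (unclustered, non-stratified systematic sampling of points), the code below lays a square grid with a random start over a rectangular region and estimates a crop area from the share of points observed with that crop. The function names, the rectangular region and the crop labels are assumptions of this example.

```python
import random


def systematic_points(x_min, y_min, x_max, y_max, step, seed=None):
    """Square grid of sample points with a random start inside the first cell."""
    rng = random.Random(seed)
    x0 = x_min + rng.random() * step
    y0 = y_min + rng.random() * step
    points = []
    j = 0
    while y0 + j * step < y_max:
        i = 0
        while x0 + i * step < x_max:
            points.append((x0 + i * step, y0 + j * step))
            i += 1
        j += 1
    return points


def crop_area_estimate(labels, crop, region_area_ha):
    """Area estimate: share of points observed with the crop times region area."""
    share = sum(1 for label in labels if label == crop) / len(labels)
    return share * region_area_ha


# Illustrative use: a 10 km x 10 km region with one point per km (coordinates in km)
points = systematic_points(0.0, 0.0, 10.0, 10.0, 1.0, seed=1)
print(len(points), "points to visit on the ground")
# Labels would come from the field visit; these four are made up
print(crop_area_estimate(["maize", "maize", "other", "wheat"], "maize", 10000.0), "ha")
```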
Area Frame Sampling (2)
  Advantages
    The sampling frame coincides quite precisely with the population
      – No (few) missing elements in the frame
      – No repeated elements
    Sampling units easy to define
      – Except for segments with physical boundaries
    Objective (if direct observations)
    Can be combined with remote sensing for further improvement
  Drawbacks
    For direct observation, the date of the field visit can be critical. Problems appear if
      – Crop not yet emerged
      – Already harvested and insufficient traces left
      – Not clear if it will be harvested
    Locating the units (segments or points) requires reliable field survey material
      – Aerial photographs with a proper enlargement
      – GPS
          Daily access to a reliable power supply
          Technical ability to operate the device
          Sometimes limitations due to security-military concerns
Yield observations on the ground (1)
  Expert eye estimates on a statistical sample
    Possibly with the help of a table: number of ears per m², number of grains per ear, size of the grains, ...
  Advantages
    Cheap
    Possibility of providing geo-referenced data to combine with satellite images
    Good results if the experts are reliable
  Drawbacks
    Difficult to assess accuracy.
    Possible strong bias (there may be an interest in higher or lower estimates: aid, ...)
    Need for quality control
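The table-based estimate mentioned above can be sketched as a product of yield components. This example assumes that "size of the grains" is expressed as a thousand-grain weight in grams, which is a common convention but is not stated in the slides; the function name is hypothetical.

```python
def yield_from_components(ears_per_m2, grains_per_ear, thousand_grain_weight_g):
    """Rough yield in t/ha from visually assessed yield components.

    Assumption of this sketch: grain size is given as thousand-grain weight (g).
    """
    grams_per_m2 = ears_per_m2 * grains_per_ear * thousand_grain_weight_g / 1000.0
    # 1 g/m2 = 10 kg/ha = 0.01 t/ha
    return grams_per_m2 * 0.01


# Illustrative values only
print(round(yield_from_components(400, 30, 40.0), 2), "t/ha")
```

Such a figure inherits any bias in the expert's counts, so it does not replace the quality control mentioned among the drawbacks.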
Yield observations on the ground (2)
  Objective measurements.
    The crop is cut in a square of a given size (e.g. 1 m²)
    Precise weighing in laboratory.
  Advantages
    Statistical error can be computed
    Possibility of providing geo-referenced data to combine with satellite images
    Good results if the enumerators are meticulous
  Drawbacks
    Difficult to be precise in the application of the rules of sample collection.
      – Enumerators tend to avoid parts of the field with lower yield
    Possible bias (over-estimation), even applying coefficients for harvest loss.
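A minimal sketch of how crop-cut weights could be turned into a mean yield with a standard error. It treats the plots as a simple random sample, which ignores the actual sampling design, and the optional `harvest_loss` coefficient is a hypothetical parameter standing in for the loss coefficients mentioned above.

```python
import math
import statistics


def crop_cut_yield(weights_g, plot_area_m2=1.0, harvest_loss=0.0):
    """Mean yield (t/ha) and its standard error from crop-cut plot weights.

    weights_g: grain weight in grams for each cut plot.
    harvest_loss: fraction of the plot yield assumed lost at real harvest
    (hypothetical coefficient, to be set from local experience).
    Plots are treated as a simple random sample (simplifying assumption).
    """
    yields = [w / plot_area_m2 * 0.01 * (1.0 - harvest_loss) for w in weights_g]
    mean = statistics.fmean(yields)
    se = statistics.stdev(yields) / math.sqrt(len(yields))
    return mean, se


mean, se = crop_cut_yield([400.0, 500.0, 600.0])
print(f"yield: {mean:.2f} t/ha, standard error: {se:.2f} t/ha")
```

The standard error only measures sampling variability. If enumerators avoid low-yield parts of the field, the resulting over-estimation is a bias that this computation cannot detect.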
Remotely sensed observations (satellite images)
  Can provide information on area or yield
  As covariable: combined with a consistent ground survey
  For area estimation
    Should not be used as a substitute for a ground survey, except in particular cases:
      – Conflict (dangerous to go to the field)
      – No authorization (illegal crops, North Korea, ...)
      – Very high accuracy in the identification of crops (>95%), e.g. large fields of rice
  For yield estimation
    Vegetation indexes in arid regions give good indications of inter-annual change
    Co-variable to be combined with geo-referenced measurements
Remote sensing: cost-efficiency assessment
  The accuracy of a (normal) ground survey + remote sensing = accuracy of a more intense ground survey.
  Which option is cheaper?
  Elements needed to assess:
    – Cost structure of the ground survey
        Fixed cost
        Cost per additional Primary Sampling Unit (administrative unit?) in the sample
        Cost per additional Secondary Sampling Unit (farms, segments, points, fields)
    – Cost of remote sensing
        Images
        Image processing
        Combining remote sensing with ground data
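The comparison above can be sketched with a linear cost model. This example assumes that remote sensing is used through a regression estimator, whose large-sample variance is roughly (1 - rho²) times that of the ground survey alone, and that variance is inversely proportional to the number of primary sampling units. All names and figures are illustrative assumptions, not values from the presentation.

```python
import math


def ground_cost(fixed, cost_per_psu, cost_per_ssu, n_psu, ssu_per_psu):
    """Linear cost model: fixed cost + cost per PSU + cost per SSU."""
    return fixed + n_psu * (cost_per_psu + cost_per_ssu * ssu_per_psu)


def compare_options(fixed, cost_per_psu, cost_per_ssu, n_psu, ssu_per_psu,
                    rho, remote_sensing_cost):
    """Cost of (ground survey + remote sensing) vs. a larger ground survey.

    rho: correlation between ground data and the remote sensing covariable.
    Assumes variance reduction by a factor (1 - rho**2) and variance
    inversely proportional to the number of PSUs (both are simplifications).
    """
    relative_efficiency = 1.0 / (1.0 - rho ** 2)
    n_equivalent = math.ceil(n_psu * relative_efficiency)
    cost_with_rs = ground_cost(fixed, cost_per_psu, cost_per_ssu,
                               n_psu, ssu_per_psu) + remote_sensing_cost
    cost_ground_only = ground_cost(fixed, cost_per_psu, cost_per_ssu,
                                   n_equivalent, ssu_per_psu)
    return cost_with_rs, cost_ground_only, n_equivalent


with_rs, ground_only, n_eq = compare_options(
    fixed=10000, cost_per_psu=200, cost_per_ssu=20,
    n_psu=100, ssu_per_psu=10, rho=0.8, remote_sensing_cost=30000)
print(f"ground + remote sensing: {with_rs}, ground only ({n_eq} PSUs): {ground_only}")
```

The answer is sensitive to `rho`: with a weakly correlated covariable, the equivalent ground sample is only slightly larger and the ground-only option tends to win.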
Agro-meteorological models
  Require relatively complex information
    Soil map (water capacity)
    Phenological calendar (planting, flowering, ...)
    Coefficients describing the physiology of the plant.
    “Clean” meteorological data in near real time (10 days, ...)
    But simplified versions are also possible.
  For yield estimates, results need to be combined with historical statistical data
    Historical results of the agro-met model needed (long process)
  Inter-annual yield change indicators can be good without historical statistical data
  Geographical analysis of areas of concern
  Possibility to combine with coarse resolution satellite images (vegetation indexes, ...)
Some vocabulary
  Bias and standard error
  Bias ~ non-sampling error
    Systematic tendency
    No reduction with a higher amount of data
      – Cannot be removed with an exhaustive census, classified image of the whole territory, etc.
    Usually difficult to compute
      – In general no formula available
  Standard error ~ sampling error
    Due to the randomness of the sample
    Decreases when the sample grows
    In general formulas are available
      – Sometimes very complicated: simulation possible (bootstrap)
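When no manageable formula for the standard error is available, the bootstrap mentioned above can be sketched as follows. This is the plain resampling-with-replacement version for a simple random sample; a real survey would need a bootstrap adapted to its design. The function name and defaults are assumptions of this example.

```python
import random
import statistics


def bootstrap_se(sample, estimator, n_boot=2000, seed=0):
    """Bootstrap standard error of estimator(sample).

    Resamples the data with replacement n_boot times and returns the
    standard deviation of the resulting estimates.
    Assumes the sample behaves like a simple random sample.
    """
    rng = random.Random(seed)
    n = len(sample)
    estimates = [estimator(rng.choices(sample, k=n)) for _ in range(n_boot)]
    return statistics.stdev(estimates)


# Illustrative data: the estimator can be any function of the sample
data = [float(i) for i in range(1, 51)]
print("bootstrap SE of the mean:  ", bootstrap_se(data, statistics.fmean))
print("bootstrap SE of the median:", bootstrap_se(data, statistics.median))
```

The bootstrap only addresses the sampling error. A systematic bias in the observations is reproduced in every resample and stays invisible.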
Variables and co-variables
  Use of these terms in the context of this presentation (and often in sampling survey techniques)
  We have a targeted result (e.g. crop area)
  Variable (or main variable): usually refers to a magnitude that (nearly) coincides conceptually with the targeted result (direct observation), measured on a sample of units
    – farms,
    – households,
    – small administrative units,
    – territorial segments,
    – points,
    – fields
    Measurements nearly unbiased
  Co-variable: usually refers to more biased measurements known for the whole population or a very large sample
    Subjective estimates
    Classified images
    Vegetation indexes.
Variables and co-variables (2)
  Variables and co-variables can be combined
    Regression estimators and similar (difference, ratio, ...)
    Calibration estimators
    Small area estimators
  If the main variable is (nearly) unbiased, the combined estimator is (nearly) unbiased
    Even if the covariable is biased
  If the variable and the co-variable are well correlated, the combined estimator has a smaller standard error
    – But the gain is limited
    – It is important that the estimator based on the main variable has a decent standard error
    – Good ground or farm survey.
    – Quality control
  When combining a variable (known for a sample) and a covariable (exhaustive knowledge), it is important that the covariable has the same quality in the sample and out of the sample
    – Do not improve the co-variable in the sample if you cannot improve it out of the sample.
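A minimal sketch of the regression estimator of a mean, to show how a main variable observed on a sample can be combined with a co-variable known for the whole population. It uses the usual large-sample variance approximation without finite population correction; the function name and the sample figures are made up for illustration.

```python
import math
import statistics


def regression_estimate(x, y, x_pop_mean):
    """Regression estimator of the population mean of y.

    x: co-variable on the sampled units (e.g. area from classified images)
    y: main variable on the same units (e.g. area observed on the ground)
    x_pop_mean: mean of the co-variable over the whole population
    Returns (estimate, approximate standard error, slope b).
    """
    n = len(x)
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    s_xx = statistics.variance(x)
    s_yy = statistics.variance(y)
    b = s_xy / s_xx
    r2 = s_xy ** 2 / (s_xx * s_yy)
    estimate = y_bar + b * (x_pop_mean - x_bar)
    # Large-sample approximation: variance shrinks by the factor (1 - r^2)
    se = math.sqrt(max(0.0, 1.0 - r2) * s_yy / n)
    return estimate, se, b


x = [1.0, 2.0, 3.0, 4.0]   # co-variable on sampled units (made-up values)
y = [2.1, 3.9, 6.2, 7.8]   # ground observation on the same units
estimate, se, b = regression_estimate(x, y, x_pop_mean=3.0)
print(f"estimate: {estimate:.2f}, approx. SE: {se:.3f}, slope: {b:.2f}")
```

The co-variable enters only through the difference between its population mean and its sample mean. A constant bias in the co-variable therefore cancels out, provided it is measured in the same way inside and outside the sample, which is the point of the last bullet above.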