Introduction to Niche Modeling

A small bit of theory regarding niches
How niche modeling works: G-space and E-space
How it came to be
Uses in ecology and evolution:
- present, past and future modeling of species distributions
- predicting disease spread
- predicting invasive species spread
- niche conservation
Note: some material on niche modeling pedagogy has been drawn from internet sources, so thanks to Arthur Chapman, Town Peterson, Enrique Martinez-Meyer and others.

Niche Distinctions: Grinnellian, Eltonian, Hutchinsonian
Grinnellian
- Spatially explicit
- Focus on non-interactive requirements for populations to thrive
- Measurable from distribution
Eltonian
- Focus on community impacts and biotic interactions, i.e. species' functional roles
Hutchinsonian
- Also focuses on non-interactive requirements
- Defined the Fundamental Niche: mostly what we think of as environmental variables
- Defined the Realized Niche: the subset of the Fundamental Niche left once biotic interactions are taken into account

Chthamalus and Balanus in the intertidal
Two barnacle species, Chthamalus and Balanus, occur in the intertidal.
- Balanus cannot stand exposure to air: similar fundamental and realized niche.
- Chthamalus cannot compete with Balanus, but if Balanus is removed it can survive lower in the intertidal: different fundamental and realized niche.
This slide gives a sense of the difference between the fundamental and the realized niche. Balanus occurs in the lower intertidal and Chthamalus in the high intertidal. Balanus has a physical limiting factor: it cannot be exposed to much air, so it is restricted to the lower intertidal. Chthamalus is not limited in its distribution by any physical factor; it could occur in the lower intertidal, but it does not show up there because it is outcompeted by Balanus. Its realized niche is therefore smaller than its fundamental niche.

HOW CAN WE RECONSTRUCT THE FUNDAMENTAL NICHE?
(We can start by looking at where a species occurs.)
Poecile gambeli, the Mountain Chickadee
[Map: dots are occurrences of Poecile gambeli across its range]

How Can We Model the Fundamental Niche?
[Figure: ecological niche modeling links Geographic Space (occurrence points on the current distribution) to Ecological Space (a model of the niche in ecological dimensions; axes: temperature and precipitation)]
We can take occurrence points in geographic space and ask what environmental conditions are like where those occurrences are located. We can then generate a graph of something like temperature against precipitation, delimiting where the species falls in that space.
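The move from G-space to E-space described above can be sketched in a few lines of Python. This is only an illustration: the two rasters, the grid origin, the cell size and the occurrence coordinates are all made-up values, and `cell_index` is a hypothetical helper rather than part of any GIS package.

```python
import numpy as np

# Hypothetical 4 x 5 rasters covering longitudes 0-5 and latitudes 0-4
# with 1-degree cells. Row 0 is the northernmost row.
temperature = np.arange(20, dtype=float).reshape(4, 5)
precipitation = temperature * 10.0

def cell_index(lon, lat, x_min=0.0, y_max=4.0, cell_size=1.0):
    """Return the (row, col) of the grid cell containing a lon/lat point."""
    col = int((lon - x_min) // cell_size)
    row = int((y_max - lat) // cell_size)
    return row, col

# Hypothetical occurrence records as (longitude, latitude) pairs.
occurrences = [(0.5, 3.5), (2.2, 1.7), (4.9, 0.1)]

# E-space coordinates: one (temperature, precipitation) pair per occurrence.
e_space = np.array([
    (temperature[cell_index(lon, lat)], precipitation[cell_index(lon, lat)])
    for lon, lat in occurrences
])
print(e_space)
```

Each row of `e_space` is one occurrence plotted in the temperature/precipitation plane, which is the raw material a niche model works from.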

[Figure: as on the previous slide, ecological niche modeling links occurrence points on the current distribution in Geographic Space to a model of the niche in ecological dimensions (temperature, precipitation) in Ecological Space. The niche model is then projected back onto geography to give the current range prediction, and onto climate landscapes at the Last Glacial Maximum to give the Last Glacial Maximum prediction.]
From Peterson and Soberon

SOME TERMINOLOGY
Geographic Space
- G = the geographic space, typically composed of 2-D pixels
- Ga, Gp = the abiotically suitable area (potential distribution)
- Gb = the biotically suitable area
- Gm = the area accessible through dispersal
- Gi = the invadable distributional area
- Go = the occupied distributional area
- Gdata = the set of observations (presences and, if existing, true absences)
Environmental Space
- E = the environmental space of environmental variables
- Ea = the scenopoetic fundamental niche
- Ei = the invadable niche space
- Eo = the occupied niche space
- Ep = the biotically reduced niche

Example Mapping Between Geographic Space and Environmental Space
[Figure labels: Ea; Eo; "Why is this not occupied?"]
Go is shown as gray shading, and Ga is “white”.
Note: this area is occupied but not sampled (because you are omniscient in this example; work with me).

General species’ distribution modeling approach
Notice that the diagram uses the same hypothetical case as in the previous slide. A modelling technique (e.g. GARP, Maxent) is used to characterize the species’ niche in environmental space by relating observed occurrence localities to a suite of environmental variables. Notice that, in environmental space, the model may not identify either the species’ occupied niche or fundamental niche; rather, the model identifies only that part of the niche defined by the observed records.
When projected back into geographical space, the model will identify parts of the actual distribution and potential distribution. For example, the model projection labeled 1 identifies the known distributional area. Projected area 2 identifies part of the actual distribution that is currently unknown; however, a portion of the actual distribution is not predicted because the observed occurrence records do not identify the full extent of the occupied niche (i.e. there is incomplete sampling; see area D on the previous slide). Similarly, modeled area 3 identifies an area of potential distribution that is not inhabited (the full extent of the potential distribution is not identified because the observed occurrence records do not identify the full extent of the fundamental niche due to, for example, incomplete sampling, biotic interactions, or constraints on species dispersal; see areas D and E on the previous slide).
Modified from NCEP module Species distribution modeling for conservation educators and practitioners.

Key factors determining the degree to which observed localities can be used to estimate the niche or distribution:
Equilibrium (more or less BIOLOGICAL): A species is said to be at equilibrium with current environmental conditions if it occurs in all suitable areas whilst being absent from all unsuitable areas. What causes disequilibrium?
Sampling adequacy (more or less EFFORT/LOGISTIC, based on human limitations, not biology): The extent to which the observed occurrence records provide a sample of the environmental space. The importance of this cannot be overestimated. How could you possibly know?
Regarding equilibrium: the degree of equilibrium for different groups of organisms in Europe varies considerably, with more dispersive species (e.g. birds) relatively closer to equilibrium than less dispersive species (e.g. reptiles) (Araújo & Pearson 2005).
Regarding sampling adequacy: in some instances very few occurrence records may be available, perhaps due to limited survey effort. In such cases the available records are unlikely to provide a sample of available environments sufficient to identify the full range of conditions occupied by the species. In other cases, surveys may have provided extensive occurrence records that give a fairly accurate picture of the environments inhabited by a species in a particular region (e.g. European plant data).
Each of these factors (equilibrium and sampling adequacy) should be carefully considered to ensure appropriate use of a species’ distribution model.
Modified from NCEP module Species distribution modeling for conservation educators and practitioners.

The Ideal Scenario: at equilibrium and good sampling
In the ideal case, the species of interest would be at equilibrium and we would have complete sampling of the environment. In such a case the actual and potential distributions would be identical and we would expect to model both accurately. However, how useful is the model under these circumstances?
Modified from NCEP module Species distribution modeling for conservation educators and practitioners.

Suppose high equilibrium but poor sampling (in both geographical and environmental space)
In this case, the model identifies part of the species' actual distribution that is unknown. GREAT for predicting new areas to survey!
Modified from NCEP module Species distribution modeling for conservation educators and practitioners.

Suppose high equilibrium and poor sampling in geographical space, but good sampling in environmental space
Note that there is not necessarily a direct relationship between sampling adequacy in geographical space and in environmental space. It is quite possible that poor sampling in geographical space could still result in good sampling in environmental space. For example, if a geographic area that has not been sampled has environmental conditions that are similar to those in an area that has been sampled, then sampling adequacy in environmental space will not be affected.
Modified from NCEP module Species distribution modeling for conservation educators and practitioners.

Suppose low equilibrium but good sampling
[Figure labels: Potential Distribution; Fundamental Niche]
In this case the model identifies an area of potential distribution that is environmentally similar to where the species has been observed, but which is not inhabited. This type of prediction may be very useful for identifying sites suitable for the reintroduction of an endangered species, identifying areas where the species may become invasive, or guiding field surveys toward the discovery of unknown species that are closely related. Distribution models may therefore prove useful even in cases where species’ equilibrium is low.
Modified from NCEP module Species distribution modeling for conservation educators and practitioners.

Circle A represents the area where abiotic conditions are right for a species to occur (Ga).
Circle B represents the area where lack of competition and disease, and the occurrence of mutualists, allow populations to grow.
Circle M is the area within which individuals and populations are capable of moving due to lack of dispersal barriers.
Go is the occupied area; Gi is the invadable area.
Note: niche modeling pulls occurrences from the intersection of A, B and M (the occupied area, Go).

Circle A represents the area where abiotic conditions are right for a species to occur (= fundamental niche, Ea).
Circle B represents the area where lack of competition and disease, and the occurrence of mutualists, allow populations to grow.
Circle M is the area within which individuals and populations are capable of moving due to lack of dispersal barriers.
The intersection of A and B is the biotically reduced niche (Ep).
The intersection of A, M and B is the occupied niche space (Eo).
From Soberon and Peterson, 2005, Biodiversity Informatics
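The three-circle diagram can be read as plain set algebra. The sketch below uses Python sets of made-up grid-cell IDs; treating the invadable area as the part of A and B that lies outside M is my reading of the diagram, not something the slide states explicitly.

```python
# Hypothetical grid-cell IDs for each circle of the diagram.
A = {1, 2, 3, 4, 5, 6}   # abiotically suitable cells
B = {3, 4, 5, 6, 7, 8}   # cells with favorable biotic interactions
M = {1, 2, 3, 4, 9}      # cells reachable by dispersal

occupied = A & B & M       # suitable on both counts and accessible
invadable = (A & B) - M    # suitable on both counts but not yet reached
biotically_reduced = A & B # intersection of A and B

print(sorted(occupied), sorted(invadable))
```

With these toy sets, cells 3 and 4 are occupied, while cells 5 and 6 would be suitable if the species could get there.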

SOME POSSIBLE OUTCOMES
- Best case: weak, diffuse biotic interactions and lack of dispersal barriers create general overlap.
- No dispersal barriers, but the area of “correct” biotic interactions differs from the area of correct abiotic conditions. An estimate of the fundamental niche (FN) using occurrence data should be carefully examined.
- Dispersal limitations: the FN (and potential distribution) will be much larger than the actual distribution.
From Soberon and Peterson, 2005, Biodiversity Informatics

What abiotic factors determine the fundamental niche?
- The answer is complicated (but important)
- Species have physiological tolerances, migration limitations and evolutionary forces that limit adaptation
- A starting point for physiology may be traits
- A starting point for abiotic factors is often climate
- Climate variables often also correlate with other variables (elevation, land cover)

“Easy” in theory, but how does it work in practice?
The development of spatial ecological modeling approaches occurred in the 1990s, but it has origins in ongoing innovations from the 1970s onward.
A bit of history…

How do we, in practice, model the “scenopoetic” ecological niche? And how do we determine a species' distribution (actual and potential), and what is the difference?

Around 1990 three things happened
I. Large databases of species presences (mainly computerized scientific collections) began to be accessible in significant quantities

II. GIS… Geographical Information Systems technology became widely accessible to ecologists and biogeographers

IV. Worldwide Environmental Data Layers
- Remote sensing data
- Land cover/land type
- Vegetation
- Terrain
- Ocean SST, chlorophyll
- Slope, aspect, flow rate hydrology data
- Climatology databases: Worldclim (what we’ll use in this class)
- Models of worldwide past and future climates (IPCC)
- All other ancillary data layers (roads, human population density, etc.)

Which leads to an NCEAS Working Group
Title: Choosing (and making available) the right environmental layers for modeling how the environment controls the distribution and abundance of organisms
Aim: To generate co-registered environmental data layers at 1km resolution representing climate, vegetation/landcover, hydrology/topography, marine.

A TOUGH GIG (Actually this meeting was a lot of work!)

WORLDWIDE MEAN ANNUAL TEMPERATURES (GREEN=cold, RED=hot)
[Maps: NOW; LGM (based on General Circulation Models)]
We have amazing resources now for looking at worldwide climate. These maps show reconstructions of climate across the world, now and at 18,000 years before present (the Last Glacial Maximum, LGM), based on general circulation models. The red colors are warm and the green colors cold. If you are interested in general circulation models and how they are constructed, talk to me after class. Note that at the LGM it was much colder in north temperate regions than now.

WORLDWIDE MEAN ANNUAL TEMPERATURES (GREEN=cold, RED=hot)
[Maps: NOW, North America; Double CO2, 2100 CE, North America (CCM models)]
The upper panel is a view of climate conditions now in North America. The lower panel shows predicted temperatures if CO2 concentrations in the atmosphere double over the next one hundred years. Note further warming, especially in the north.

Inputs into a niche model:
- a stack of environmental data layers (e.g. precipitation, temperature, elevation, soils)
- a set of occurrence records representing presences

NICHE AND DISTRIBUTION MODELING
Inputs: species presences; environmental data layers
CAN WE PREDICT NICHE AND DISTRIBUTION FROM SUCH DATA? (answer: maybe!)
From Maxent presentation by Pearson

The outcome of a niche model is a prediction of suitable habitats for that taxon (based on the input data). The suitability output can be a yes/no value or a probability function from 0 to 100.
Panel B: input data points in black and suitable habitat in the western US for Neotoma cinerea.
Panel D: close-up of suitable/unsuitable areas in the Great Basin of western North America.
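To make the yes/no versus 0-100 distinction concrete, here is a minimal sketch of converting a continuous suitability grid into a binary map. The grid values and presence cells are invented, and the rule used (the lowest score at any known presence) is only one of several possible thresholding rules, chosen here for illustration.

```python
import numpy as np

# Hypothetical suitability scores (0-100) on a 2 x 3 grid.
suitability = np.array([[5.0, 40.0, 80.0],
                        [10.0, 55.0, 95.0]])

# Hypothetical (row, col) cells where the species has been recorded.
presence_rows = np.array([0, 1, 1])
presence_cols = np.array([2, 1, 2])

# Threshold rule assumed here: the lowest score at any known presence.
threshold = suitability[presence_rows, presence_cols].min()

# Yes/no map: True where the cell is at least as suitable as that presence.
binary_map = suitability >= threshold
print(threshold)
print(binary_map)
```

A stricter or looser threshold changes how much of the map is called suitable, so the rule should be reported along with the map.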

PART 1: Idealized workflow for building and validating a species distribution model
Species data:
- Acquire species occurrence data (e.g. fieldwork, museum voucher specimens, observations, surveys)
- Map/vet the species’ distribution data, especially if coordinates are from third-party sources (e.g. removing geographic and environmental outliers)
Environmental data:
- Collate a GIS database of environmental layers (e.g. temperature, precipitation, soil type)
- Process environmental layers to generate predictor variables important in defining species’ distributions (e.g. maximum daily temperature, frost days, soil water balance) and convert to appropriate formats
Modeling:
- Apply modeling algorithm (e.g. Bioclim, Maxent, artificial neural network, generalized linear model, boosted regression tree)
- Model calibration (select suitable parameters, test importance of alternative predictor variables)

PART 2: Idealized workflow for building and validating a species distribution model
- Create map of current modeled distribution
- Test model performance through additional fieldwork or a statistical approach (e.g. AUC, Kappa, or null model comparisons)
- Model the species’ distribution in a different region (e.g. for an invasive species) or for a different time period (e.g. under a future climate scenario)
- If possible, test the model against observed data, such as occurrence records in an invaded region or distribution shifts over recent decades
The steps shown here follow those on the previous slide and are for testing the model and making a prediction once the model has been calibrated. The first step is to test how well the model is able to predict the known distribution, and the second step is to make a prediction (e.g. into a new region or for a different climate scenario) and, if possible, test if the predictions agree with observed data.
Modified from NCEP module Species distribution modeling for conservation educators and practitioners.
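As an illustration of the statistical testing step, AUC can be computed directly from model scores as the chance that a randomly chosen test presence scores higher than a randomly chosen comparison point (background or absence), counting ties as half. The scores below are invented, and `auc` is a hypothetical helper written for this sketch.

```python
def auc(presence_scores, comparison_scores):
    """Fraction of (presence, comparison) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in presence_scores:
        for c in comparison_scores:
            if p > c:
                wins += 1.0
            elif p == c:
                wins += 0.5
    return wins / (len(presence_scores) * len(comparison_scores))

# Hypothetical model scores at held-out presences and at comparison points.
test_presences = [0.9, 0.6, 0.4]
comparison = [0.5, 0.3, 0.2, 0.1]
print(auc(test_presences, comparison))
```

A value of 0.5 means the model ranks presences no better than chance, and 1.0 means every presence outranks every comparison point.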

A stopping point

SOME ISSUES WITH MODELING
Determining species distributions, given that:
- Most occurrence data available for the vast majority of species are presence-only
- Sampling effort across most species’ distributional ranges is uneven and eco-geographically biased
- We do not know what environmental variables are relevant for each species
Adapted from a presentation by Enrique Martinez-Meyer and others

Modeling Niches
All niche modeling approaches model a function approximating the true relationship between the environment (i.e., the niche) and species' geographic occurrences/distribution.

Modeling Niches, Part 2
All approaches want to estimate a function f = μ(Gdata, E): the result of applying an algorithm μ to the occurrence data Gdata, given an environmental space E, to estimate G (the distribution).
Different algorithms have different data requirements:
- True presence-only
- Presence-absence
- Presence-background (the background can be any sample from within the environment)
- Presence-pseudoabsence (a pseudoabsence cannot be where a species is known to occur)
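The difference between background points and pseudoabsences comes down to which cells are eligible for sampling. A minimal sketch, with hypothetical cell IDs and helper names:

```python
import random

def sample_background(all_cells, n, rng):
    """Presence-background: draw from every cell in the study area, presences included."""
    return rng.sample(sorted(all_cells), n)

def sample_pseudoabsences(all_cells, presence_cells, n, rng):
    """Presence-pseudoabsence: draw only from cells with no known presence."""
    candidates = sorted(set(all_cells) - set(presence_cells))
    return rng.sample(candidates, n)

rng = random.Random(0)            # fixed seed so the draw is repeatable
all_cells = set(range(100))       # hypothetical grid-cell IDs
presence_cells = {3, 17, 42}      # hypothetical cells with occurrence records

background = sample_background(all_cells, 10, rng)
pseudoabsences = sample_pseudoabsences(all_cells, presence_cells, 10, rng)
print(background, pseudoabsences)
```

Background samples may land on presence cells; pseudoabsences never do, by construction.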

Algorithms Applied to the Problem
Method(s) | Model/software name | Species data type
Climatic envelope | BIOCLIM | Presence-only
Gower Metric | DOMAIN | Presence-only
Ecological Niche Factor Analysis (ENFA) | BIOMAPPER | Presence/background
Maximum Entropy | MAXENT | Presence/background
Genetic algorithm | GARP | Presence/pseudo-absence
Regression: Generalized linear model (GLM) and Generalized additive model (GAM) | GRASP | Presence/absence
Artificial Neural Network (ANN) | SPECIES | Presence/absence
Classification and regression trees (CART), GLM, GAM and ANN | BIOMOD | Presence/absence
Boosted decision trees | (implemented in R) | Presence/absence
Multivariate adaptive regression splines (MARS) | (implemented in R) | Presence/absence
From Richard Pearson et al. 2006

Niche Modeling Has Problems, Part 2: tradeoffs with algorithms
- Many algorithms do not handle asymmetric data (e.g. GLM, GAM)
- Many do not handle interaction effects (e.g. BioClim)
- Some of them do not handle nominal environmental variables such as soil classes (e.g. BioClim, ENFA)
- Many stochastic algorithms present different solutions even under identical parameterization and input data (e.g. GARP)
- We do not know the ‘real’ distribution of species, so we do not know when models are making mistakes and when they are filling knowledge gaps.

Modeling Approaches
- Presence-only (bioclimatic envelopes or Mahalanobis distance): points inside the envelope are suitable, or suitability is scored by the distance of a point from the mean values (farther away equals less suitable).
- Presence-absence (GAMs, GLMs, MARS, CART): use a link function or a set of logical statements describing the multivariate relationship between the mean of the response variable and the predictor variables. Note: best for determining the occupied distribution (not the potential distribution).
- Presence-background (Maxent): finds the probability distribution that is most spread out, or closest to uniform, subject to constraints given by the observed occurrence records and the environmental conditions across the study area. All regression techniques work with background as well.
- Presence-pseudoabsence (GARP): rule-set predictions.
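The distance-based flavor of presence-only modeling can be sketched with NumPy. The environmental values below are invented, and `mahalanobis` is a hypothetical helper; the point is only that a cell is scored by how far its environment lies from the mean of the presences, scaled by their covariance.

```python
import numpy as np

# Hypothetical environment at presence sites:
# columns are temperature (deg C) and precipitation (mm).
presences = np.array([[10.0, 500.0],
                      [12.0, 650.0],
                      [14.0, 600.0],
                      [16.0, 800.0],
                      [18.0, 750.0]])

mean = presences.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(presences, rowvar=False))

def mahalanobis(point):
    """Distance of one environmental point from the presence mean."""
    d = np.asarray(point, dtype=float) - mean
    return float(np.sqrt(d @ cov_inv @ d))

# Farther from the mean of the presences means less suitable.
print(mahalanobis([14.0, 660.0]), mahalanobis([30.0, 100.0]))
```

Unlike a rectangular envelope, this score accounts for correlation between variables, so a warm-and-dry cell can be far away even if each value alone is within range.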

Example of Presence-Only Envelope Approach: BioClim
- Heuristic-based model
- Works with presence-only data
- Simple to use
- 35-dimensional hypercube in climate space (19 in Diva-GIS)
- Tends to over-predict
- Works with a small number of records
- Will work in batch mode
- Can’t make quantitative predictions or provide confidence levels
- Used for predicting potential distributions
- Versions incorporated into Diva-GIS

BioClim Type Modeling
The dot-dash square is the BioClim fit of the data (shown for two dimensions). It defines the range of values occupied by a species on each environmental axis, across all environmental variables. Anything in this box might be considered “suitable”.
From Peterson et al. ms., Ecological Niches and Geographic Distributions: A Modeling Perspective
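A rectangular envelope of this kind is simple enough to write out. The sketch below uses the minimum and maximum of each variable at the presence sites as the box limits; that choice, the data values and the `in_envelope` helper are assumptions for illustration, and a real BioClim run may trim the box (for example to percentiles) rather than use the full range.

```python
import numpy as np

# Hypothetical environment at presence sites:
# columns are temperature (deg C) and precipitation (mm).
presences = np.array([[10.0, 500.0],
                      [12.0, 650.0],
                      [14.0, 600.0],
                      [18.0, 750.0]])

# The envelope: the range occupied on each environmental axis.
lower = presences.min(axis=0)
upper = presences.max(axis=0)

def in_envelope(cells):
    """True for each cell whose value on every axis falls inside the box."""
    cells = np.atleast_2d(np.asarray(cells, dtype=float))
    return np.all((cells >= lower) & (cells <= upper), axis=1)

# One cell inside the box, one too hot.
print(in_envelope([[12.0, 600.0], [25.0, 600.0]]))
```

Because every axis is tested independently, the box ignores interactions between variables, which is one reason envelope models tend to over-predict.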

Presence-Background Modeling
No known absences.
How, then, do we distinguish false absences from true absences?
Solution (of sorts): compare presences with the background, where the background is the set of grid cells used in modeling.
Note: these points include the input true presences.
Question: what does this mean for model validation?

Modeling with Maxent
Assume presence records come from some unknown probability distribution called p.
How do we estimate this probability function over a set of grid cells, G?
What is the probability that any one grid cell, g, is suitable for a species?

Modeling in Maxent
We can join the presence records for a taxon to the underlying environmental variables and determine means and SDs in terms of experienced climate.
Temperature profiles for Acacia orites
Columns: annual | minimum coolest month | maximum warmest month | range | coolest quarter | warmest quarter | wettest quarter | driest quarter
Mean: 17.2 6.2 26.1 19.9 12.3 21.3 20.0 13.8
S.D.: 1.8 2.0 1.6 2.1 3.6
Min: 12.1 0.2 23.9 18.1 5.8 18.3 10.6
5%-ile:
25%-ile: 16.4 6.1 24.6 18.5 11.8 20.2 2.5
75%-ile: 7.2 12.8 2.8 14.8
95%-ile: 19.6 9.2 29.0 23.0 15.2 23.7 23.4 17.6
Max: 29.4 25.4 23.8 23.6

Modeling with Maxent
- Each grid cell has a set of “features” defined by the environment.
- Features can be the raw environment or some more complex function of those environmental variables (linear, quadratic, logistic).
- Values at grid cells with presences can be summarized to determine means and SDs across all environmental variables in order to estimate p.
- Constraint: the means of the probability distribution match the observed means.
- Among distributions meeting that constraint, find the flattest one (the one that maximizes entropy).

Modeling with Maxent
- Maxent is an iterative approach.
- It starts with a fully uniform distribution over all grid cells.
- It conducts an optimization routine to maximize “gain”.
- Gain is a likelihood statistic: it measures the probability of the presences given the input data, relative to the background data.
- Gain will asymptote (maximizing fit), leading to the final probability distribution.
- That distribution becomes the basis for the fitted predictor variable coefficients.
- These coefficients are used to assess probability of presence.
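The iterative idea can be illustrated with a toy fit in NumPy. This is not the Maxent software's algorithm: it uses plain gradient ascent with no regularization, a single made-up feature, and invented presence cells. It does show the same ingredients: start from a uniform distribution, adjust a coefficient until the model's expected feature mean matches the observed mean at the presences, and track the gain relative to uniform.

```python
import numpy as np

# Hypothetical study area: 50 grid cells, one standardized environmental feature.
features = np.linspace(-2.0, 2.0, 50).reshape(-1, 1)
presence_idx = np.array([30, 35, 38, 40, 44])       # cells with occurrence records
observed_mean = features[presence_idx].mean(axis=0)

def gibbs(lmbda):
    """Distribution over cells proportional to exp(lambda . features)."""
    z = features @ lmbda
    w = np.exp(z - z.max())     # subtract the max for numerical stability
    return w / w.sum()

lmbda = np.zeros(1)             # lambda = 0 gives the uniform starting distribution
for _ in range(2000):
    q = gibbs(lmbda)
    expected_mean = q @ features
    # Gradient of the mean log-likelihood of the presences.
    lmbda += 0.1 * (observed_mean - expected_mean)

q = gibbs(lmbda)
# Gain: mean log-probability at presences, relative to the uniform distribution.
gain = np.log(q[presence_idx]).mean() - np.log(1.0 / len(features))
print(lmbda, gain)
```

At convergence the fitted distribution reproduces the observed feature mean, and the gain is positive because the presences sit in cells the model now favors over a uniform guess.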

Maxent
Maxent is run by first selecting a set of input environmental data layers in a common GIS format (gridded .ASC files).
Next, select a set of species occurrence locations defined by lat/lon.
It is important to subset the data into training and testing sets: training data build the model, and testing data are used for validation.
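The training/testing subset can be made with a simple random split. In this sketch the occurrence coordinates are invented, the 25% test fraction is an arbitrary choice, and `split_occurrences` is a hypothetical helper, not a Maxent function.

```python
import random

def split_occurrences(records, test_fraction=0.25, seed=0):
    """Shuffle occurrence records and split them into (training, testing) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical occurrence records as (longitude, latitude) pairs.
records = [(-110.0 + i * 0.1, 40.0 + i * 0.05) for i in range(20)]
train, test = split_occurrences(records)
print(len(train), len(test))
```

Fixing the seed keeps the split repeatable, which matters when comparing model runs.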

More on Maxent
Maximum spread = maximizing the log likelihood of the data associated with the presence sites minus a penalty term (think AIC).
The penalty term is basically a weighting based on how much information the environmental data add to the model.
The best weighting term is discovered through a sequential updating algorithm run for a specified number of iterations (you can change this parameter).
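Written out, the "log likelihood minus a penalty" idea takes roughly the following form. The symbols are my own notation rather than the slide's: m presence sites x_i, feature functions f_j with weights λ_j, per-feature regularization constants β_j, and a normalizing constant Z_λ taken over the background cells.

```latex
q_{\lambda}(x) = \frac{\exp\!\left(\sum_{j} \lambda_j f_j(x)\right)}{Z_{\lambda}},
\qquad
\max_{\lambda} \;\; \frac{1}{m} \sum_{i=1}^{m} \ln q_{\lambda}(x_i)
\;-\; \sum_{j} \beta_j \,\lvert \lambda_j \rvert
```

The first term rewards fitting the presences; the second shrinks the weights of features that add little, which is the role the slide compares to AIC.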

More on Maxent
The Maxent regularization parameter determines the “penalty function”: smaller values tend to overfit models (typically leading to smaller geographic distributions) and larger values do the opposite.
You can choose cumulative versus logistic outputs. Logistic is interpreted as probability of presence (i.e. what you most often want).
Definitely create response curves.
What about features?

More on Maxent
What are features?
The environmental layers are used to produce "features", which constrain the probability distribution that is being computed.
The available feature types are linear, quadratic, product, threshold and hinge/discrete.
Some features give Maxent a lot of latitude in deriving response variables.
You can choose to include different types of features.
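To give a feel for what these feature types do to a raw environmental variable, here is a rough sketch in NumPy. The knot values and the scaling of the hinge to the 0-1 range are assumptions made for illustration; the Maxent software chooses its own knots and scaling.

```python
import numpy as np

def linear(x):
    return x

def quadratic(x):
    return x ** 2

def product(x1, x2):
    return x1 * x2

def threshold(x, knot):
    """Step feature: 0 below the knot, 1 at or above it."""
    return (x >= knot).astype(float)

def hinge(x, knot):
    """Flat at 0 up to the knot, then rising linearly to 1 at the maximum of x."""
    return np.clip((x - knot) / (x.max() - knot), 0.0, 1.0)

# Hypothetical temperature and precipitation values for four grid cells.
temp = np.array([0.0, 10.0, 20.0, 30.0])
precip = np.array([100.0, 200.0, 300.0, 400.0])

print(quadratic(temp), product(temp, precip))
print(threshold(temp, 15.0), hinge(temp, 10.0))
```

Threshold and hinge features let the fitted response bend or jump at particular values, which is why they give the model more latitude than linear terms alone.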

More on Maxent
What does a Maxent run produce?
- An HTML file showing run outputs
- A grid file importable into a GIS
- CSV files containing omission and prediction details
Focus on the HTML file, which contains:
- A picture of the map
- A table of different thresholds *
- A model validation statistical summary *
- An explanation of importance of variables
- Response curves
* we’ll discuss model validation tomorrow