MIXING MODELS AND END-MEMBER MIXING ANALYSIS: PRINCIPLES AND EXAMPLES
Mark Williams and Fengjing Liu
Department of Geography and Institute of Arctic and Alpine Research, University of Colorado, Boulder, CO 80309

OUTLINE OF LECTURE
• OVERVIEW OF MIXING MODEL
• OVERVIEW OF END-MEMBER MIXING ANALYSIS (EMMA)
  -- PRINCIPAL COMPONENT ANALYSIS (PCA)
  -- STEPS TO PERFORM EMMA
• APPLICATIONS OF MIXING MODEL AND EMMA
  -- GREEN LAKES VALLEY
  -- LEADVILLE MINE SITUATION

PART 1: OVERVIEW OF MIXING MODEL
• Definition of Hydrologic Flowpaths
• 2-Component Mixing Model
• 3-Component Mixing Model
• Generalization of Mixing Model
• Geometrical Definition of Mixing Model
• Assumptions of Mixing Model

HYDROLOGIC FLOWPATHS

MIXING MODEL: 2 COMPONENTS
• One conservative tracer
• Mass balance equations for water and tracer
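
The slide's equations did not survive in this transcript. As a minimal sketch, assuming the standard two-component form (water balance f_1 + f_2 = 1 and tracer balance f_1 C_1 + f_2 C_2 = C_stream), the fractions can be computed as follows. The function name and the δ18O values are hypothetical and chosen only for illustration.

```python
def two_component_fractions(c_stream, c_1, c_2):
    """Solve f_1 + f_2 = 1 and f_1*c_1 + f_2*c_2 = c_stream for (f_1, f_2)."""
    if c_1 == c_2:
        # The tracer cannot separate components with identical concentrations.
        raise ValueError("components must differ in tracer concentration")
    f_1 = (c_stream - c_2) / (c_1 - c_2)
    return f_1, 1.0 - f_1


# Hypothetical d18O values (per mil): new water -20, old water -16, stream -17.
f_new, f_old = two_component_fractions(-17.0, -20.0, -16.0)
print(f_new, f_old)
```

With these illustrative numbers the stream sits one quarter of the way from old water toward new water, so new water contributes 25% and old water 75%.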

MIXING MODEL: 3 COMPONENTS (Using Specific Discharge)
• Two conservative tracers
• Mass balance equations for water and tracers
• Simultaneous equations
• Solutions
Notation: Q = discharge; C = tracer concentration; subscripts = component number; superscripts = tracer number

MIXING MODEL: 3 COMPONENTS (Using Discharge Fractions)
• Two conservative tracers
• Mass balance equations for water and tracers
• Simultaneous equations
• Solutions
Notation: f = discharge fraction; C = tracer concentration; subscripts = component number; superscripts = tracer number
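
The equations on this slide were not preserved. Under the notation listed (subscripts for components, superscripts for tracers), the standard three-component system would read as below. The subscript t for streamflow is an assumption of this sketch, not taken from the slide. The specific-discharge version follows by substituting f_i = Q_i / Q_t.

```latex
\begin{aligned}
f_1 + f_2 + f_3 &= 1 \\
f_1 C_1^{1} + f_2 C_2^{1} + f_3 C_3^{1} &= C_t^{1} \\
f_1 C_1^{2} + f_2 C_2^{2} + f_3 C_3^{2} &= C_t^{2}
\end{aligned}
```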

MIXING MODEL: Generalization Using Matrices
• One tracer resolves 2 components and two tracers resolve 3 components;
• Can N tracers resolve N+1 components? Yes;
• However, solving by hand becomes too difficult for more than 3 components, so matrix operations are necessary;
• The simultaneous equations are written in matrix form and solved with the inverse matrix. Note: C_x^-1 is the inverse of the matrix C_x;
• This procedure can be generalized to N tracers for N+1 components.
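
A minimal sketch of the matrix approach, assuming the coefficient matrix is built from a row of ones (the water balance) stacked on the tracer concentrations of each component. The function name, array layout, and numbers are hypothetical. `np.linalg.solve` is used here in place of forming the inverse explicitly, which gives the same answer for a well-conditioned system.

```python
import numpy as np


def mixing_fractions(components, stream):
    """Solve an N-tracer, (N+1)-component mixing model for one sample.

    components: (N+1, N) array, one row per component, one column per tracer.
    stream:     (N,) tracer concentrations in the streamflow sample.
    """
    components = np.asarray(components, dtype=float)
    stream = np.asarray(stream, dtype=float)
    n_comp, n_tracers = components.shape
    if n_comp != n_tracers + 1:
        raise ValueError("need exactly N tracers for N+1 components")
    # First row enforces sum(f) = 1; the other rows are the tracer balances.
    coeff = np.vstack([np.ones(n_comp), components.T])
    rhs = np.concatenate([[1.0], stream])
    return np.linalg.solve(coeff, rhs)


# Hypothetical concentrations of two tracers in three components.
components = np.array([[-20.0, 10.0], [-16.0, 40.0], [-17.0, 90.0]])
true_f = np.array([0.5, 0.3, 0.2])
stream = true_f @ components          # synthetic, perfectly mixed sample
fractions = mixing_fractions(components, stream)
print(fractions)
```

Because the synthetic sample is an exact mixture, the recovered fractions match the ones used to build it.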

MIXING MODEL: Geometrical Perspective
• For a 2-tracer, 3-component model, for instance, the mixing subspace is defined by the two tracers.
• If plotted, the 3 components should be the vertices of a triangle and all streamflow samples should be bounded by the triangle.
• If the samples are not well bounded, either the tracers are not conservative or the components are not well characterized.
• The fractions f_x can be found geometrically, but this is more difficult than solving algebraically.

ASSUMPTIONS FOR MIXING MODEL
• Tracers are conservative (no chemical reactions);
• All components have significantly different concentrations for at least one tracer;
• Tracer concentrations in all components are temporally constant or their variations are known;
• Tracer concentrations in all components are spatially constant or treated as different components;
• Unmeasured components have the same tracer concentrations or don't contribute significantly.

A QUESTION TO THINK ABOUT
• What if we have many more conservative tracers than the number of components we seek, say, 6 tracers for 3 components?
• This case is called the over-determined situation.
• The solution to this case is EMMA, which follows the same principle as mixing models.

PART 2: EMMA AND PCA
• EMMA Notation
• Over-Determined Situation
• Orthogonal Projection
• Notation of Mixing Spaces
• Steps to Perform EMMA

DEFINITION OF END-MEMBER
• For EMMA, we use end-members instead of components to describe water contributing to the stream from various compartments and geographic areas.
• End-members are components that have more extreme solute concentrations than streamflow [Christophersen and Hooper, 1992].

EMMA NOTATION (1)
• Hydrograph separations using multiple tracers simultaneously;
• Use more tracers than necessary to test consistency of tracers;
• Typically use solutes as tracers.
Modified from Hooper, 2001

EMMA NOTATION (2)
• Measure p solutes; define the mixing space (S-space) to be p-dimensional;
• Assume that there are k linearly independent end-members (k < p);
• B, matrix of end-members, (k × p); each row b_j is (1 × p);
• X, matrix of streamflow samples, (n observations × p solutes); each row x_i is (1 × p).

PROBLEM STATEMENT
• Find a vector f_i of mixing proportions, (1 × k), such that x_i = f_i B;
• Note that this equation is the same as the generalized one for the mixing model; the change of symbols is for simplification and consistency with the EMMA references;
• Also note that this equation is over-determined because k < p, e.g., 6 solutes for 3 end-members.

SOLUTION FOR OVER-DETERMINED EQUATIONS
• Must choose an objective function: minimize the sum of squared errors;
• The solution is the normal equation [Christophersen et al., 1990; Hooper et al., 1990]: f_i = x_i B^T (B B^T)^-1;
• Constraint: all proportions must sum to 1;
• Individual proportions may still come out > 1 or < 0; this issue will be elaborated later.
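
A minimal sketch of a least-squares solution with the sum-to-one constraint, assuming the constraint is imposed with a Lagrange multiplier (one common way to do it; the cited papers may formulate it differently). The function name and the example end-member matrix are hypothetical. Fractions are not forced into the range 0 to 1, so values outside that range can still occur for real data.

```python
import numpy as np


def constrained_fractions(B, x):
    """Minimize ||x - f @ B||^2 subject to sum(f) == 1.

    B: (k, p) end-member matrix; x: (p,) streamflow sample.
    """
    B = np.asarray(B, dtype=float)
    x = np.asarray(x, dtype=float)
    k = B.shape[0]
    # Stationarity of the Lagrangian: 2*B*B^T f + lambda*1 = 2*B x, with 1^T f = 1.
    kkt = np.zeros((k + 1, k + 1))
    kkt[:k, :k] = 2.0 * (B @ B.T)
    kkt[:k, k] = 1.0
    kkt[k, :k] = 1.0
    rhs = np.concatenate([2.0 * (B @ x), [1.0]])
    return np.linalg.solve(kkt, rhs)[:k]


# Hypothetical case: 3 end-members described by 6 solutes (k < p).
B = np.array([[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
              [6.0, 5.0, 4.0, 4.0, 2.0, 1.0],
              [2.0, 9.0, 1.0, 7.0, 3.0, 8.0]])
true_f = np.array([0.5, 0.3, 0.2])
x = true_f @ B                         # synthetic sample, an exact mixture
f = constrained_fractions(B, x)
print(f, f.sum())
```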

ORTHOGONAL PROJECTIONS
• Following the normal equation, the predicted streamflow chemistry is [Christophersen and Hooper, 1992]: x_i* = f_i B = x_i B^T (B B^T)^-1 B;
• Geometrically, this is the orthogonal projection of x_i onto the subspace defined by B, the end-members.
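
The projection can be illustrated with a small sketch. It assumes the unconstrained normal equation (no sum-to-one constraint) and uses a toy end-member matrix whose rows span a plane, so the result is easy to check by eye. All names and numbers are hypothetical.

```python
import numpy as np


def project_onto_end_members(B, x):
    """Orthogonally project sample x (p,) onto the row space of B (k, p)."""
    B = np.asarray(B, dtype=float)
    x = np.asarray(x, dtype=float)
    # Solve (B B^T) f = B x rather than forming the inverse explicitly.
    f = np.linalg.solve(B @ B.T, B @ x)
    return f @ B


# Toy end-members spanning the plane of the first two solutes.
B = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
x = np.array([2.0, 3.0, 5.0])
x_hat = project_onto_end_members(B, x)
residual = x - x_hat
print(x_hat, residual)
```

The residual is perpendicular to every end-member row, which is what makes the projection orthogonal.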

This slide is from Hooper, 2001

OUR GOALS ACHIEVED SO FAR?
• We measure the chemistry of streamflow and end-members. Then we can derive the fractions of end-members contributing to streamflow using the equations above.
• So, are our goals achieved? Not quite, because we also want to test the end-members as well as the mixing model.
• We need to define the geometry of the solute "cloud" (S-space) and project the end-members into S-space!
• How? Use PCA to determine the number and orientation of axes in S-space.
Modified from Hooper, 2001

EMMA PROCEDURES
1. Identification of conservative tracers: bivariate solute-solute plots to screen data;
2. PCA performance: derive eigenvalues and eigenvectors;
3. Orthogonal projection: use eigenvectors to project the chemistry of streamflow and end-members;
4. Screen end-members: calculate the Euclidean distance between each end-member's original values and its S-space projection;
5. Hydrograph separation: use the orthogonal projections and the generalized equations for the mixing model to get solutions!
6. Validation of mixing model: predict streamflow chemistry using the results of hydrograph separation and the original end-member concentrations.

STEP 1 - MIXING DIAGRAMS
• Look familiar? This is the same diagram used for the geometrical definition of the mixing model (components changed to end-members);
• Generate plots for all pair-wise combinations of tracers;
• The simple rule for identifying conservative tracers is to check whether streamflow samples are bounded by a polygon formed by potential end-members, or scatter around a line defined by two end-members;
• Be aware of outliers and curvature, which may indicate chemical reactions!

STEP 2 - PCA PERFORMANCE
• For most cases, if not all, we should use the correlation matrix rather than the covariance matrix of conservative solutes in streamflow to derive eigenvalues and eigenvectors;
• Why? This makes each variable unitless and treats all variables as equally important;
• How? Standardize the original data set using routine software, or subtract the mean and then divide by the standard deviation;
• To check that you have done this correctly, the mean should be zero and the variance should be 1 after standardization!
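
A minimal sketch of this step with NumPy, assuming the sample standard deviation (ddof=1) is used for standardization and that eigenvectors are stored as rows. The data are random numbers standing in for streamflow solute concentrations, and the function names are hypothetical.

```python
import numpy as np


def standardize(X):
    """Subtract the column mean and divide by the column standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    return (X - mean) / std, mean, std


def pca_correlation(X):
    """Eigenvalues (descending) and eigenvectors (rows) of the correlation matrix."""
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)    # eigh: symmetric matrix, ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order].T


# Hypothetical data: 50 samples of 4 solutes with very different units and scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4)) * [1.0, 10.0, 100.0, 0.1] + [5.0, 50.0, 500.0, 0.5]
Z, mean, std = standardize(X)
eigvals, V = pca_correlation(X)
print(Z.mean(axis=0), Z.var(axis=0, ddof=1))
print(eigvals)
```

The eigenvalues of a correlation matrix sum to the number of solutes, which is a quick sanity check in addition to the zero-mean, unit-variance check named on the slide.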

APPLICATION OF EIGENVALUES
• Eigenvalues can be used to infer the number of end-members that should be used in EMMA. How?
• Sum up all eigenvalues;
• Calculate each eigenvalue as a percentage of the total;
• The percentage should decrease from PCA component 1 to p (remember p is the number of solutes used in PCA);
• How many eigenvalues must be added up to reach 90%? (Somewhat subjective! There are no objective criteria for this!)
• Let this number be m, the number of PCA components to retain (sometimes called the number of mixing spaces);
• (m + 1) is the number of end-members we use in EMMA.
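
The counting rule above can be sketched as follows. The 90% threshold is the slide's rule of thumb and is exposed as a parameter. The eigenvalues and the function name are hypothetical.

```python
import numpy as np


def n_components_to_retain(eigvals, threshold=0.90):
    """Return m, the smallest number of leading eigenvalues whose
    cumulative share of the total reaches the threshold."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # Small tolerance so a share that equals the threshold is not missed
    # because of floating-point rounding.
    m = int(np.searchsorted(cumulative, threshold - 1e-12)) + 1
    return m, cumulative


# Hypothetical eigenvalues from a PCA of 6 solutes.
eigvals = [3.6, 1.9, 0.3, 0.1, 0.06, 0.04]
m, cumulative = n_components_to_retain(eigvals)
n_end_members = m + 1
print(m, n_end_members, cumulative)
```

Here the first two eigenvalues account for about 92% of the total, so m = 2 and three end-members would be used.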

STEP 3 - ORTHOGONAL PROJECTION
Project streamflow samples: U = X V^T
• X: standardized data set of streamflow, (n × p);
• V: eigenvectors from PCA, (m × p); remember that only the first m eigenvectors are used here!
Project end-members:
• Use the same equation as for streamflow; X now represents a (1 × p) vector for each end-member;
• Remember that X here must be standardized by subtracting the streamflow mean and dividing by the streamflow standard deviation!
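
A minimal sketch of the projection, assuming U = X V^T with V holding the first m eigenvectors as rows. The key point is that the end-members are standardized with the streamflow mean and standard deviation, not their own. The data are random placeholders and the function name is hypothetical.

```python
import numpy as np


def project_to_u_space(samples, stream_mean, stream_std, V_m):
    """Standardize with streamflow statistics, then project: U = Z @ V_m.T."""
    Z = (np.asarray(samples, dtype=float) - stream_mean) / stream_std
    return Z @ np.asarray(V_m, dtype=float).T


rng = np.random.default_rng(1)
stream = rng.normal(size=(30, 5))        # hypothetical: n = 30 samples, p = 5 solutes
end_members = rng.normal(size=(3, 5))    # hypothetical: 3 candidate end-members

stream_mean = stream.mean(axis=0)
stream_std = stream.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(stream, rowvar=False))
top = np.argsort(eigvals)[::-1][:2]      # indices of the m = 2 largest eigenvalues
V_m = eigvecs[:, top].T                  # shape (m, p)

U_stream = project_to_u_space(stream, stream_mean, stream_std, V_m)
U_end_members = project_to_u_space(end_members, stream_mean, stream_std, V_m)
print(U_stream.shape, U_end_members.shape)
```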

STEP 4 - SCREEN END-MEMBERS
• Make a scatter plot of streamflow samples and end-members using the first and second PCA projections;
• Eligible end-members should be vertices of a polygon (a line if m = 1, a triangle if m = 2, and a quadrilateral if m = 3) and should bound the streamflow samples in a convex sense;
• Calculate the Euclidean distance between the original chemistry and its projection for each solute, both algebraically and geometrically; j denotes each solute and b_j is the original solute value;
• These steps should lead to identification of eligible end-members!
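
The distance equations on this slide were not preserved. The sketch below is one plausible reading, stated as an assumption: project the standardized end-member onto the retained eigenvectors, map it back to concentration units, and report the per-solute difference as a percentage of the original value b_j, plus the overall Euclidean distance in standardized units. Names and numbers are hypothetical.

```python
import numpy as np


def end_member_distance(b, stream_mean, stream_std, V_m):
    """Compare an end-member b (p,) with its projection onto the rows of V_m (m, p)."""
    b = np.asarray(b, dtype=float)
    V_m = np.asarray(V_m, dtype=float)
    z = (b - stream_mean) / stream_std                 # standardize with streamflow stats
    coeffs = np.linalg.solve(V_m @ V_m.T, V_m @ z)     # coordinates in the retained space
    z_proj = coeffs @ V_m                              # projection, standardized units
    b_proj = z_proj * stream_std + stream_mean         # back to concentration units
    percent_diff = 100.0 * (b_proj - b) / b            # assumed per-solute measure
    euclidean = float(np.linalg.norm(z_proj - z))      # overall distance, standardized
    return b_proj, percent_diff, euclidean


# Toy case: retained space is the plane of the first two solutes.
V_m = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])
b = np.array([2.0, 4.0, 5.0])
b_proj, percent_diff, euclidean = end_member_distance(
    b, stream_mean=np.zeros(3), stream_std=np.ones(3), V_m=V_m)
print(b_proj, percent_diff, euclidean)
```

A small distance suggests the candidate end-member lies close to the mixing space defined by the stream samples. A large one suggests it does not fit.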

STEP 5 - HYDROGRAPH SEPARATION
• Use the retained PCA projections of streamflow and end-members to derive the flowpath solutions!
• Mathematically, this is the same as a general mixing model (m projections for m + 1 end-members) rather than the over-determined situation.
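
A minimal sketch of the separation in the projected space, assuming m retained projections and m + 1 end-members, so that the system is square. The projected coordinates are made-up numbers and the function name is hypothetical.

```python
import numpy as np


def emma_fractions(U_stream, U_end_members):
    """Solve for end-member fractions of every sample in the projected space.

    U_stream:      (n, m) projected streamflow samples.
    U_end_members: (m+1, m) projected end-members.
    Returns an (n, m+1) array of fractions; each row sums to 1.
    """
    U_stream = np.atleast_2d(np.asarray(U_stream, dtype=float))
    U_end_members = np.asarray(U_end_members, dtype=float)
    k, m = U_end_members.shape
    if k != m + 1:
        raise ValueError("need m + 1 end-members for m retained projections")
    coeff = np.vstack([np.ones(k), U_end_members.T])               # (m+1, m+1)
    rhs = np.vstack([np.ones(U_stream.shape[0]), U_stream.T])      # (m+1, n)
    return np.linalg.solve(coeff, rhs).T


# Hypothetical projected end-members (m = 2) and two synthetic stream samples.
U_end_members = np.array([[-2.0, -1.0], [3.0, -1.0], [0.0, 4.0]])
true_F = np.array([[0.2, 0.3, 0.5],
                   [0.6, 0.1, 0.3]])
U_stream = true_F @ U_end_members
F = emma_fractions(U_stream, U_end_members)
print(F)
```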

STEP 6 - PREDICTION OF STREAMFLOW CHEMISTRY
• Multiply the results of hydrograph separation (usually fractions) by the original solute concentrations of the end-members to reproduce streamflow chemistry for conservative solutes;
• Comparing the prediction with the observation provides a test of the mixing model.
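
A minimal sketch of the validation step. The fractions, end-member concentrations, and "observed" values are all invented for illustration, and root-mean-square error is only one of several comparison measures that could be used.

```python
import numpy as np


def predict_stream_chemistry(fractions, end_member_conc):
    """fractions: (n, k); end_member_conc: (k, p) in original units -> (n, p)."""
    return np.asarray(fractions, dtype=float) @ np.asarray(end_member_conc, dtype=float)


# Hypothetical: 2 samples, 3 end-members, 2 conservative solutes.
fractions = np.array([[0.2, 0.3, 0.5],
                      [0.6, 0.1, 0.3]])
end_member_conc = np.array([[10.0, 1.0],
                            [40.0, 2.0],
                            [90.0, 5.0]])
predicted = predict_stream_chemistry(fractions, end_member_conc)

# Invented observations, offset from the prediction by a known amount.
observed = predicted + np.array([[1.0, -0.1],
                                 [-1.0, 0.1]])
rmse = np.sqrt(np.mean((observed - predicted) ** 2, axis=0))
print(predicted)
print(rmse)
```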

PROBLEM WITH OUTLIERS
• PCA is very sensitive to outliers;
• If any outliers are found in the mixing diagrams of PCA projections, check whether there are physical reasons;
• Outliers yield fractions that are negative or > 1;
• See the next slide for how to resolve outliers using a geometrical approach for an end-member model.

RESOLVING OUTLIERS
• A, B, and C are 3 end-members;
• D is an outlier streamflow sample;
• E is the projection of D onto line AB;
• a, b, d, x, and y represent distances between pairs of points;
• We use the Pythagorean theorem to resolve it. The basic rule is to force f_C = 0; f_A and f_B are then calculated from the position of E on AB [Liu et al., 2003].
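
The slide's formulas for f_A and f_B were not preserved. The sketch below uses a vector projection, which locates the same point E that the Pythagorean construction would, and then splits the flow between A and B by the position of E along AB. Clipping E to the segment AB is an added assumption of this sketch, as are the function name and coordinates.

```python
import numpy as np


def resolve_outlier(A, B, D):
    """Project outlier D onto line AB and return fractions with f_C forced to 0."""
    A, B, D = (np.asarray(p, dtype=float) for p in (A, B, D))
    AB = B - A
    # t is the position of E along AB: 0 at A, 1 at B.
    t = float(np.dot(D - A, AB) / np.dot(AB, AB))
    t = min(max(t, 0.0), 1.0)        # assumption: keep E between A and B
    E = A + t * AB
    return {"f_A": 1.0 - t, "f_B": t, "f_C": 0.0, "E": E}


# Hypothetical coordinates in the projected mixing diagram.
result = resolve_outlier(A=[0.0, 0.0], B=[4.0, 0.0], D=[1.0, -1.0])
print(result)
```

In this example E lands a quarter of the way from A to B, so A contributes 75% and B 25%.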

APPLICATION IN GREEN LAKES VALLEY: RESEARCH SITE
Sample collection:
• Stream water: weekly grab samples
• Snowmelt: snow lysimeter
• Soil water: zero-tension lysimeter
• Talus water: biweekly to monthly
Sample analysis:
• δ18O and major solutes
Green Lake 4

GL4: δ18O IN SNOW AND STREAMFLOW

VARIATION OF δ18O IN SNOWMELT
• δ18O becomes enriched by 4‰ in snowmelt from the beginning to the end of snowmelt at a lysimeter;
• The snowmelt regime controls the temporal variation of δ18O in snowmelt because of isotopic fractionation between snow and ice;
• With f defined as the total fraction of the snowpack that has melted, δ18O values are highly correlated with f (R² = 0.9, n = 15, p < 0.001);
• The snowmelt regime at a point differs from that of a real catchment;
• So, we developed a Monte Carlo procedure to stretch the dates of δ18O in snowmelt measured at a point to the catchment scale using the streamflow δ18O values.

GL4: NEW WATER AND OLD WATER Old Water = 64%

STREAM CHEMISTRY AND DISCHARGE

MIXING DIAGRAM: PAIRED TRACERS

FLOWPATHS: 2-TRACER 3-COMPONENT MIXING MODEL

MIXING DIAGRAM: PCA PROJECTIONS
PCA results: the first 2 eigenvalues account for 92% of the total, so 3 EMs appear to be correct!

FLOWPATHS: EMMA

DISTANCE OF END-MEMBERS BETWEEN U-SPACE AND THEIR ORIGINAL SPACE (%)

EMMA VALIDATION: TRACER PREDICTION

LEADVILLE CASE STUDY
• Rich mining legacy
• Superfund site: over $100M so far
• Complicated hydrology: mine shafts, faults, drainage tunnels. We know nothing about mountain groundwater!
• What are the water sources to the drainage tunnel?
• Complicated, rigorous test

COMPLICATED GEOLOGY, HYDROLOGY

APPLICATION AT LEADVILLE

δ18O IN VARIOUS SAMPLES
GW: from BMW-3 to YT-BH; SFW: from CG-03 to PWCW; SPR: from EFS-1 to SPR-23
Note: * means outlier

TRITIUM IN VARIOUS SAMPLES
GW: from BMW-3 to YT-BH; SFW: from CG-03 to PWCW; SPR: from EFS-1 to SPR-23

VARIATION OF TRITIUM AND δ18O
• Seasonal variation of tritium and δ18O is less marked at INF-1 than at EMET;
• The hydrological regime (flowpath) appears to be different at INF-1 and EMET.

MIXING DIAGRAMS
• Potential end-members are clustered and circled;
• Unique end-members generally cannot be identified;
• The bigger the circle, the higher the uncertainty in identifying a unique end-member;
• Recall from the last slide that tritium increased by 4 TU from Nov '02 to Feb '03 at EMET;
• This identifies Elkhorn as an unambiguous EM.

MIXING DIAGRAMS
• Each EM used in the triangle is only a representative from its circle and is not our current recommendation;
• The number of EMs and the EMs themselves may change from time to time because of sampling problems;
• The δ18O value at EMET in June 2003 may be due to an analytical problem, mixing with rainwater, or mixing with water from Marion, which generally has higher δ18O.

MIXING DIAGRAMS
The mixing diagram of δ18O and tritium for July 2003 is somewhat problematic; the circles overlap.

SUMMARY FOR MIXING DIAGRAMS OF TRITIUM AND δ18O
• EMs may change from time to time within a water year;
• Except for Elkhorn, unique EMs cannot be identified at this time;
• However, EM clusters are usually consistent from time to time;
• One cluster includes: WO3, CT, YT, and WCCPZ-1;
• The other cluster generally includes: SPR-23, PWBEINF, SDDS, SDDS-2, SHG07A, EFS-1, BMW-4, CG-03, CG-04;
• In particular, some EMs could be excluded from the list of potential EMs: OG1TMW-1, BMW-3, MAB, and SPR-20.

PCA RESULTS: EIGENVALUES The first 2 PCA components explain 80% and 85% of total variance at INF-1 and EMET, respectively; The first 3 PCA components explain 95% of total variance at both sites; Either 3 or 4 EMs appear to be appropriate in EMMA.

PCA MIXING DIAGRAMS FOR INF-1
• PCA conducted with 10 tracers: δ18O, 3H, alkalinity, temperature, conductance, Ca2+, Mg2+, Na+, SO4 2-, and Si;
• Note that the conservative behavior of the tracers used here has not been verified with pair-wise mixing diagrams.

PCA MIXING DIAGRAMS FOR INF-1 Same as the last one, but enlarged by eliminating some EMs; Unique EMs still cannot be identified; One EM appears to be missing.

PCA MIXING DIAGRAMS FOR EMET Use 9 tracers without Alkalinity; Unique EMs cannot be identified this time.

SUMMARY FOR PCA AND EMMA
• Unique EMs cannot be identified at this time;
• However, some potential end-members, such as Elkhorn, CT, and CG-03, are consistent with the mixing diagrams of tritium and δ18O;
• Future work is needed to plot mixing diagrams for all tracers so that non-conservative tracers can be eliminated.

IMPLICATION FOR FUTURE SAMPLING SCHEME
• A monthly or bi-monthly sampling scheme does capture the seasonal signal within a water year;
• But this scheme may miss temporal variation within each season;
• The hydrological regime may change from season to season and within seasons;
• So, a temporally intensive sampling scheme may be needed to capture within-season variation in order to unambiguously identify EMs using EMMA.

SUMMARY: MIXING MODEL VS EMMA
General Mixing Model:
• Easy to understand and manipulate!
• Doable with limited measurements of solutes!
• But different tracers may yield different results!
EMMA:
• Uses more tracers than necessary, leading to consistent results;
• Provides a framework for analyzing watershed chemical data sets;
• Generates testable hypotheses that focus future field efforts!

REFERENCES
• Hooper, R., 2001, http: //
• Christophersen, N., C. Neal, R. P. Hooper, R. D. Vogt, and S. Andersen, Modeling stream water chemistry as a mixture of soil water end-members - a step towards second-generation acidification models, Journal of Hydrology, 116, 1990.
• Christophersen, N. and R. P. Hooper, Multivariate analysis of stream water chemical data: the use of principal components analysis for the end-member mixing problem, Water Resources Research, 28(1), 1992.
• Hooper, R. P., N. Christophersen, and N. E. Peters, Modeling stream water chemistry as a mixture of soil water end-members - an application to the Panola Mountain catchment, Georgia, U.S.A., Journal of Hydrology, 116, 1990.
• Liu, F., M. Williams, and N. Caine, in review, Source waters and flowpaths in a seasonally snow-covered catchment, Colorado Front Range, USA, Water Resources Research, 2003.