1
Source Apportionment of Particulate Matter Using Principal Component Analysis and Positive Matrix Factorisation
Md Firoz Khan, Mohd Talib Latif, Norhaniza Amil
School of Environmental and Natural Resource Sciences, Universiti Kebangsaan Malaysia, Bangi, Malaysia
Research Centre for Tropical Climate Change System, Institute of Climate Change, Universiti Kebangsaan Malaysia, Bangi, Malaysia
Institute for Environment and Development (Lestari), Universiti Kebangsaan Malaysia, Bangi, Malaysia
Atmospheric Chemistry and Air Pollution Research Group
2
Determination of Air Pollution Sources
Figure: Schematic representations of the different methods for source identification (EU 2014)
3
Receptor Model
4
Principal Component Analysis
Principal component analysis (PCA) is a way of identifying patterns in data and of expressing the data so that their similarities and differences stand out. It emphasizes variation and brings out strong patterns in a dataset, and it is often used to make data easier to explore and visualize.
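As a minimal sketch of what PCA computes, the snippet below standardizes a small data set, takes the eigenvalues and eigenvectors of its correlation matrix, and reports the share of variance carried by each component. It assumes NumPy is available; the function name `pca` and the data values are hypothetical and are not taken from the presentation.

```python
import numpy as np

def pca(data):
    """Correlation-matrix PCA of `data` (rows = samples, columns = variables).

    Returns eigenvalues (largest first), eigenvectors and component scores.
    """
    data = np.asarray(data, dtype=float)
    # Standardize each variable: zero mean, unit sample standard deviation.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = z @ eigvecs
    return eigvals, eigvecs, scores

# Hypothetical data: 6 samples of 3 variables; the first two rise together.
samples = [
    [1.0, 2.1, 5.0],
    [2.0, 3.9, 4.0],
    [3.0, 6.2, 6.0],
    [4.0, 8.1, 3.0],
    [5.0, 9.8, 5.5],
    [6.0, 12.1, 4.5],
]
eigvals, loadings, scores = pca(samples)
explained = 100 * eigvals / eigvals.sum()
print(np.round(explained, 1))  # percent of variance per component
```

Because the first two variables move together, most of the variance ends up in the first component, which is the kind of "strong pattern" PCA is meant to expose.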
5
Toy Example
Pretend we are studying the motion of the physicist's ideal spring. This system consists of a ball of mass m attached to a massless, frictionless spring. The ball moves back and forth along a single axis, so the underlying dynamics can be expressed as a function of a single variable x.
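The toy example can be sketched numerically. The snippet below simulates the spring along its own axis, records it with a hypothetical camera rotated by 30 degrees (so every frame has two coordinates), and checks how much variance the first principal component captures. The amplitude, frequency, camera angle and the helper name `principal_variances_2d` are all assumptions made for illustration.

```python
import math

def principal_variances_2d(xs, ys):
    """Eigenvalues (largest first) of the 2x2 sample covariance matrix of (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    centre = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return centre + spread, centre - spread

# Ideal spring: position along its own axis is x(t) = A * cos(omega * t).
amplitude, omega = 1.0, 2.0  # assumed values
times = [0.05 * i for i in range(200)]
x = [amplitude * math.cos(omega * t) for t in times]

# Hypothetical camera rotated by 30 degrees: two recorded coordinates per frame.
theta = math.radians(30)
u = [xi * math.cos(theta) for xi in x]
v = [xi * math.sin(theta) for xi in x]

first, second = principal_variances_2d(u, v)
share = first / (first + second)
print(f"variance captured by the first component: {share:.3f}")
```

Two coordinates are recorded, but essentially all of the variance lies along one direction, which is how PCA recovers the single underlying variable x.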
6
Diagram labels: source compositions; receptor models (e.g., PMF, CMB, PCA); monitor
7
Source Apportionment Models
Widely used models and other available models:
PCA/Absolute principal component score (APCS)
EPA's Chemical Mass Balance (CMB)
Unmix
Positive Matrix Factorization (PMF)
Artificial Neural Networks - source-receptor modelling
Notes:
PCA/APCS: a simplified model
Weighted APCS: deals with the "zero score" problem but lacks a non-negativity requirement
PMF: a complicated but robust model
PMF: lower uncertainty, avoids producing zero factor scores, and requires component loadings and scores to be non-negative
Capable of identifying sources without any prior knowledge of the sources
8
PCA/APCS/PMF/MLRA address the following formula
[Equation image; labelled terms: normalized data, source contribution, source profile, measurement error]
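The equation on this slide was an image and is not recoverable from the transcript. A common way of writing the receptor-model mass balance that matches the four labelled terms is sketched below. The symbols are assumptions, not the authors' notation: x_ij is the (normalized) concentration of species j in sample i, g_ik is the contribution of source k to sample i, f_kj is the profile of source k for species j, e_ij is the measurement error (residual), and p is the number of sources.

```latex
x_{ij} = \sum_{k=1}^{p} g_{ik}\, f_{kj} + e_{ij}
\qquad\text{or, in matrix form,}\qquad
X = G F + E
```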
9
Diagram labels: data matrix; source contribution; profiles
10
Preparation of Database
Common problems
Systematic bias: analysis by different labs or with different methods
Presence of data below the MDL
Presence of coelution (non-target analytes that elute at the same time as a target analyte)
Data entry errors and outliers to identify
Noisy data
Missing data
Exclude a variable if more than 50% of its values are missing
11
Preparation of Database (continued)
Replace data below the MDL with MDL/2
Replace missing data with the average of nearby data, or simply with the average concentration of the variable
Normalize the data, i.e. convert them into unitless values with a zero-centred mean
Use an adequate number of data points and variables
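The clean-up rules above can be sketched for a single variable as follows. This is only an illustration: the function name `prepare_variable`, the use of `None` for a missing value, and the choice to compute the replacement average after the MDL/2 substitution are assumptions, and the concentrations are made up.

```python
def prepare_variable(values, mdl):
    """Clean one variable (one chemical species).

    values: concentrations, with None marking a missing measurement.
    mdl: method detection limit for this variable.
    Returns the cleaned list, or None if more than 50% of the values are missing.
    """
    missing = sum(v is None for v in values)
    if missing > 0.5 * len(values):
        return None  # exclude the variable from the data set
    # Replace values below the MDL with MDL/2; keep missing values for now.
    cleaned = [None if v is None else (mdl / 2 if v < mdl else v) for v in values]
    # Replace missing values with the average of the available values.
    present = [v for v in cleaned if v is not None]
    average = sum(present) / len(present)
    return [average if v is None else v for v in cleaned]

# Hypothetical concentrations for two species.
lead = prepare_variable([4.0, 0.2, None, 6.0], mdl=1.0)
arsenic = prepare_variable([None, None, None, 0.5], mdl=0.1)
print(lead)     # cleaned values
print(arsenic)  # None, because 3 of 4 values are missing
```

Replacing a gap with the average of nearby samples instead of the overall average would need the sampling order, which this sketch does not model.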
12
Step 1: Get Data
Suitable data (N)
Dependent (y) and independent (x) variables
Missing values
13
Adequate Number of Data Points
The number of data points must exceed the number of variables
The number of data points should be at least 5 times the number of variables
N ≥ 100 samples (PK Hopke)
N > 30 + (p + 3)/2, where p is the number of variables (Henry et al. 1984)
N = 50 (source unknown)
N = 30 (magic number!)
Suitability test (KMO and Bartlett's test): our suggestion!
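A small helper can check a data set against these rules of thumb. In this sketch the Henry et al. (1984) rule is read as N > 30 + (p + 3)/2, and the bare values N = 50 and N = 30 are treated as minimum sample counts; both readings are assumptions, as is the function name `sample_size_checks`.

```python
def sample_size_checks(n_samples, n_variables):
    """Evaluate the listed rules of thumb for N samples and p variables."""
    n, p = n_samples, n_variables
    return {
        "more samples than variables": n > p,
        "at least 5 samples per variable": n >= 5 * p,
        "N >= 100 (Hopke)": n >= 100,
        "N > 30 + (p + 3)/2 (Henry et al. 1984)": n > 30 + (p + 3) / 2,
        "N >= 50": n >= 50,
        "N >= 30": n >= 30,
    }

# Hypothetical data set: 60 samples of 20 chemical species.
checks = sample_size_checks(n_samples=60, n_variables=20)
for rule, passed in checks.items():
    print(f"{rule}: {'ok' if passed else 'not met'}")
```

The rules can disagree for the same data set, which is one reason the slide recommends a formal suitability test instead.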
14
Optimization of the Number of Factors
Eigenvalue > 1
Variance explained (%) of about 10 or more
Interpretable factor profiles
At least one variable should respond significantly to each factor
Exclude a variable if it does not respond to any factor
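The first two criteria are easy to apply once the eigenvalues are known. The sketch below keeps factors with an eigenvalue above 1 and flags any retained factor that explains less than 10% of the variance. The eigenvalues are hypothetical and the function name `retained_factors` is an assumption; interpretability of the profiles still has to be judged by the analyst.

```python
def retained_factors(eigenvalues):
    """Select factors with eigenvalue > 1 and report their share of variance.

    For a PCA on standardized data the eigenvalues sum to the number of
    variables, so each share is eigenvalue / sum(eigenvalues).
    """
    total = sum(eigenvalues)
    ranked = sorted(eigenvalues, reverse=True)
    return [
        (rank, value, 100 * value / total)
        for rank, value in enumerate(ranked, start=1)
        if value > 1
    ]

# Hypothetical eigenvalues for a six-variable data set (they sum to 6).
kept = retained_factors([2.7, 1.5, 0.9, 0.5, 0.3, 0.1])
for rank, value, percent in kept:
    note = "" if percent >= 10 else " (below 10% of variance)"
    print(f"factor {rank}: eigenvalue {value}, {percent:.1f}% of variance{note}")
```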
15
Step 1: Get Data
Suitable data (N)
Missing values
16
Step 2: Normalize the Data in Excel
Normalized value = (X - Average) / Stdev
Use "$" to lock the cell references for the Average and the Standard Deviation
Paste the formula, e.g. =(H3-H$632)/H$633
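The same normalization can be written outside Excel. This sketch assumes the sample standard deviation (the quantity Excel's STDEV function returns); the slide does not say which form was used, and the concentrations are made up.

```python
from statistics import mean, stdev

def normalize(column):
    """Return (x - average) / standard deviation for every value in a column."""
    avg = mean(column)
    sd = stdev(column)  # sample standard deviation (n - 1 in the denominator)
    return [(x - avg) / sd for x in column]

pm25 = [12.0, 18.0, 25.0, 31.0, 14.0]  # hypothetical concentrations
z = normalize(pm25)
print([round(value, 3) for value in z])
```

After normalization every variable has mean 0 and standard deviation 1, so species measured on very different concentration scales carry equal weight in the PCA.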
17
Upload data into SPSS
18
Step 3: Suitability of the Data
KMO and Bartlett's test
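The presentation runs both tests in SPSS. As a rough sketch of what they compute, the snippet below uses the textbook formulas: Bartlett's statistic is -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom, and the overall KMO compares squared correlations with squared partial correlations. It assumes NumPy; the function names and the synthetic data are hypothetical. A p-value for Bartlett's statistic would additionally need a chi-square distribution function, for example `scipy.stats.chi2.sf`.

```python
import numpy as np

def bartlett_sphericity(data):
    """Chi-square statistic and degrees of freedom of Bartlett's test of sphericity."""
    data = np.asarray(data, dtype=float)
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    _sign, logdet = np.linalg.slogdet(corr)  # log of the determinant of R
    statistic = -(n - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) // 2
    return statistic, dof

def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations follow from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off = ~np.eye(corr.shape[0], dtype=bool)  # mask of off-diagonal elements
    r2 = np.sum(corr[off] ** 2)
    q2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + q2)

# Hypothetical data: 50 samples of 4 variables sharing one common factor.
rng = np.random.default_rng(0)
common = rng.normal(size=(50, 1))
data = common + 0.5 * rng.normal(size=(50, 4))
statistic, dof = bartlett_sphericity(data)
kmo_value = kmo(data)
print(f"Bartlett chi-square = {statistic:.1f} with {dof} degrees of freedom")
print(f"KMO = {kmo_value:.2f}")
```

A KMO close to 1 and a large Bartlett statistic both indicate that the variables are correlated enough for PCA to be worthwhile.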
19
KMO and Bartlett’s test
20
Step 4: Run PCA
21
Run PCA
22
PCA Results
23
PCA Results
24
Step 5: Copy the Factor Scores from Step 4 and Paste Them into an Excel Sheet
25
Step 6: Prepare a New Raw Data Set by Adding a Zero Sample as the Last Row
26
Step 7: Run PCA for the Second Time
27
Step 8: Copy the Factor Scores from Step 7 and Paste Them into an Excel Sheet
28
Step 9: Subtract the Factor Score of the Zero Sample from the Factor Score of Each Sample in Step 8
The revised factor scores are referred to here as APCS
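The zero-sample procedure in Steps 6 to 9 can be sketched in a few lines. This is an illustration under stated assumptions, not the SPSS workflow itself: it uses NumPy, unrotated components (any rotation applied in SPSS is not reproduced), scores scaled to unit variance, and random hypothetical concentrations. The function name `absolute_scores` is also an assumption.

```python
import numpy as np

def absolute_scores(raw, n_factors):
    """Absolute principal component scores (APCS) for each real sample."""
    raw = np.asarray(raw, dtype=float)
    # Step 6: append an artificial sample with zero concentration everywhere.
    with_zero = np.vstack([raw, np.zeros(raw.shape[1])])
    # Normalize every variable, including the zero sample.
    z = (with_zero - with_zero.mean(axis=0)) / with_zero.std(axis=0, ddof=1)
    # Step 7: PCA on the correlation matrix, keeping the largest components.
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(z, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_factors]
    # Step 8: component scores scaled to unit variance.
    scores = z @ eigvecs[:, order] / np.sqrt(eigvals[order])
    # Step 9: subtract the zero sample's score, then drop the zero sample.
    return scores[:-1] - scores[-1]

# Hypothetical concentrations: 30 samples of 5 species.
rng = np.random.default_rng(1)
raw = rng.uniform(1.0, 10.0, size=(30, 5))
apcs = absolute_scores(raw, n_factors=2)
print(apcs.shape)
```

The sign of each component is arbitrary, so a whole APCS column may come out negated; a later regression on the APCS absorbs that sign in its coefficient.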
29
Step 10: Run MLR Using PM2.5 Mass as the Dependent Variable and Each APCS as an Independent Variable
30
Step 11: Convert the APCS into Factor Mass by Multiplying Each by Its Respective Regression Coefficient
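The regression and the conversion to mass can be sketched together. The example assumes NumPy and an ordinary least-squares fit with an intercept; the APCS values are hypothetical, and the PM2.5 mass is constructed synthetically so that the regression has a known answer. The function name `apportion_mass` is an assumption.

```python
import numpy as np

def apportion_mass(apcs, pm_mass):
    """Regress PM mass on the APCS and convert each APCS to a mass contribution."""
    apcs = np.asarray(apcs, dtype=float)
    pm_mass = np.asarray(pm_mass, dtype=float)
    # Design matrix with an intercept column; the intercept is the mass
    # that no factor accounts for.
    design = np.column_stack([np.ones(len(apcs)), apcs])
    coeffs, *_ = np.linalg.lstsq(design, pm_mass, rcond=None)
    intercept, slopes = coeffs[0], coeffs[1:]
    # Factor mass = APCS multiplied by its regression coefficient.
    factor_mass = apcs * slopes
    return intercept, slopes, factor_mass

# Hypothetical APCS for 6 samples and 2 factors, with a synthetic PM2.5 mass
# built as 2 + 3*APCS1 + 5*APCS2.
apcs = np.array([[1.0, 0.5], [2.0, 1.5], [0.5, 2.0],
                 [3.0, 1.0], [1.5, 2.5], [2.5, 0.2]])
pm25 = 2.0 + 3.0 * apcs[:, 0] + 5.0 * apcs[:, 1]
intercept, slopes, factor_mass = apportion_mass(apcs, pm25)
predicted = intercept + factor_mass.sum(axis=1)
print(np.round(slopes, 3), round(float(intercept), 3))
```

Each column of `factor_mass` is the estimated mass contribution of one factor to every sample; adding the intercept to the row sums gives the modelled PM2.5 mass that can be compared with the measured mass.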
32
Demonstration of US EPA PMF 5.0
33
Upload input files
34
Execution of the PMF model
35
Responses of PMF 5.0
Figure: factor profiles and a fit line of PMF against HVS.
Species groups shown: {Mg, Zn, Cu, Ni, Ca2+}; {Pb, NH4+, K+, As, Cd, Zn, Ni, V}; {NO3-}; {As, Ba, Sr, Se}; {Na+, Cl-, SO42-}
Fit line (PMF vs HVS): slope = 0.91, R2 = 0.88, p < 0.01
36
Acknowledgement
School of Environmental and Natural Resource Sciences, Universiti Kebangsaan Malaysia, Bangi, Malaysia
Research Centre for Tropical Climate Change System, Institute of Climate Change, Universiti Kebangsaan Malaysia, Bangi, Malaysia
Atmospheric Chemistry and Air Pollution Research Group