Md Firoz Khan, Mohd Talib Latif, Norhaniza Amil


1 Md Firoz Khan, Mohd Talib Latif, Norhaniza Amil
Source Apportionment of Particulate Matter Using Principal Component Analysis and Positive Matrix Factorisation
Md Firoz Khan, Mohd Talib Latif, Norhaniza Amil
School of Environmental and Natural Resource Sciences, Universiti Kebangsaan Malaysia, Bangi, Malaysia
Research Centre for Tropical Climate Change System, Institute of Climate Change, Universiti Kebangsaan Malaysia, Bangi, Malaysia
Institute for Environment and Development (Lestari), Universiti Kebangsaan Malaysia, Bangi, Malaysia
Atmospheric Chemistry and Air Pollution Research Group

2 Determination of Air Pollution Sources
Schematic representations of the different methods for source identification (EU 2014)

3 Receptor Model

4 Principal Component Analysis
Principal component analysis (PCA) is a way of identifying patterns in data and expressing the data so as to highlight their similarities and differences. It emphasizes variation, brings out strong patterns in a dataset, and is often used to make data easier to explore and visualize.
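As a rough illustration of the idea, the following sketch runs PCA on the correlation matrix of a small synthetic data set. NumPy, the function name `pca`, and the made-up data are assumptions for illustration; the slides themselves run PCA in SPSS.

```python
import numpy as np

def pca(data):
    """PCA on the correlation matrix (rows = samples, columns = variables)."""
    # Standardize each variable to zero mean and unit sample standard deviation.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()        # share of total variance
    scores = z @ eigvecs                       # samples projected on components
    return eigvals, eigvecs, explained, scores

# Made-up data: two variables share a common signal, the third is pure noise.
rng = np.random.default_rng(0)
common = rng.normal(size=200)
data = np.column_stack([
    common + 0.1 * rng.normal(size=200),
    2.0 * common + 0.1 * rng.normal(size=200),
    rng.normal(size=200),
])
eigvals, eigvecs, explained, scores = pca(data)
print(np.round(explained, 3))
```

The first component should capture most of the variance shared by the two correlated variables, which is the "strong pattern" PCA is meant to bring out.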

5 Toy Example
Pretend we are studying the motion of the physicist's ideal spring. This system consists of a ball of mass m attached to a massless, frictionless spring. Because the ball moves along a single direction, the underlying dynamics can be expressed as a function of a single variable x.

6 Diagram labels: source compositions; receptor models (e.g., PMF, CMB, PCA); monitor

7 Source Apportionment Models
Widely used models and other available models:
PCA/Absolute principal component score (APCS)
Weighted APCS
EPA's Chemical Mass Balance (CMB)
Unmix
Positive Matrix Factorization (PMF)
Artificial Neural Networks-Source receptor modelling
Notes:
PCA/APCS: simplified model
Weighted APCS: deals with the "zero score" but lacks a non-negativity requirement
PMF: a complicated but robust model
PMF: lower uncertainty, avoids producing zero factor scores, and requires component loadings and scores to be non-negative
Capable of identifying sources without any prior knowledge of sources

8 PCA/APCS/PMF/MLRA address the following formula
Labelled terms of the formula: normalized data, source contribution, source profile, measurement error

9 Data matrix
Diagram labels: data matrix, source contribution, profiles
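The terms labelled on these two slides (normalized data, source contribution, source profile, measurement error) match the bilinear receptor model as it is commonly written. The symbols below are assumed, since the slides' own notation is not in the transcript: x_{ij} is the normalized concentration of species j in sample i, g_{ik} the contribution of source k to sample i, f_{kj} the share of species j in the profile of source k, e_{ij} the measurement error or residual, and K the number of sources.

```latex
x_{ij} = \sum_{k=1}^{K} g_{ik}\, f_{kj} + e_{ij}
\qquad\Longleftrightarrow\qquad
\mathbf{X} = \mathbf{G}\,\mathbf{F} + \mathbf{E}
```

In matrix form, the data matrix X (samples by species) is the product of the source contribution matrix G (samples by sources) and the profile matrix F (sources by species), plus the error matrix E.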

10 Preparation of Database
Common problems:
Systematic bias: analysis by different labs or with different methods
Presence of data below the method detection limit (MDL)
Presence of coelution (non-target analytes that elute at the same time as a target analyte)
Data entry errors and outliers that need to be identified
Noisy data
Missing data: exclude a variable if more than 50% of its values are missing

11 Preparation of Database
Continued:
Replace data below the MDL with MDL/2
Replace missing data with the average of nearby data, or simply the average concentration of the variable
Normalize the data, i.e. convert them to unitless values with a zero-centered mean
Ensure an adequate number of data points and variables
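The substitution rules above can be sketched for a single variable as follows. The function name `prepare_column`, the use of `None` for missing values, and the order of operations (MDL substitution before computing the fill-in average) are assumptions; the slides do not specify them.

```python
from statistics import mean

def prepare_column(values, mdl):
    """Clean one variable; `values` holds floats, with None for missing data."""
    present = [v for v in values if v is not None]
    # Exclude the variable when more than 50% of its values are missing.
    if len(values) - len(present) > 0.5 * len(values):
        return None
    # Replace concentrations below the detection limit with MDL/2.
    substituted = [None if v is None else (mdl / 2 if v < mdl else v)
                   for v in values]
    # Fill missing entries with the average of the variable (assumed variant;
    # the average of nearby samples is the alternative the slides mention).
    fill = mean(v for v in substituted if v is not None)
    return [fill if v is None else v for v in substituted]

# Made-up concentrations with one value below the MDL and one missing value.
cleaned = prepare_column([0.02, 0.5, None, 0.7], mdl=0.1)
dropped = prepare_column([None, None, None, 1.0], mdl=0.1)
print(cleaned, dropped)
```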

12 Step 1: Get Data
Suitable data (N)
Dependent (y) and independent (x) variables
Missing values

13 Adequate Size of the Data Set
The number of data points must exceed the number of variables
The number of data points should be 5 times the number of variables
N >= 100 samples (PK Hopke)
N > 30 + (p + 3)/2, with p the number of variables (Henry et al. 1984)
N = 50 (source unknown)
N = 30 (magic number!)
Suitability test (KMO and Bartlett's test): our suggestion
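A quick way to see how these rules of thumb compare is to evaluate them side by side. This is only a sketch: the function name is hypothetical, "N = 50" and "N = 30" are read as minimum sample counts, and the Henry et al. (1984) criterion is coded in its commonly cited form N > 30 + (p + 3)/2, where p is the number of variables.

```python
def sample_size_checks(n_samples, n_variables):
    """Evaluate each sample-size rule of thumb and report pass/fail."""
    p = n_variables
    return {
        "more samples than variables": n_samples > p,
        "5 samples per variable": n_samples >= 5 * p,
        "N >= 100 (Hopke)": n_samples >= 100,
        "N > 30 + (p + 3)/2 (Henry et al. 1984)": n_samples > 30 + (p + 3) / 2,
        "N >= 50": n_samples >= 50,
        "N >= 30": n_samples >= 30,
    }

# Hypothetical data set: 60 samples, 20 measured species.
checks = sample_size_checks(60, 20)
for rule, passed in checks.items():
    print(f"{rule}: {'ok' if passed else 'not met'}")
```

The rules can disagree for the same data set, which is one reason a data-driven suitability test is a useful complement.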

14 Optimization Factor
Eigenvalue > 1
Variance (%) ~10 or > 10
Interpretable factor profiles
At least one variable should respond significantly
Exclude a variable if it does not respond to any factor
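The two numerical criteria can be applied to a list of eigenvalues as in the sketch below. Combining them with a logical AND, treating "~10" as "at least 10", and the eigenvalues themselves are assumptions for illustration; interpretability still has to be judged by inspecting the factor profiles.

```python
def retained_factors(eigenvalues, min_eigenvalue=1.0, min_variance_pct=10.0):
    """Return (factor number, eigenvalue, variance %) for factors worth keeping."""
    total = sum(eigenvalues)
    kept = []
    for index, value in enumerate(eigenvalues, start=1):
        variance_pct = 100.0 * value / total
        # Keep a factor only if it passes both screening criteria.
        if value > min_eigenvalue and variance_pct >= min_variance_pct:
            kept.append((index, value, round(variance_pct, 1)))
    return kept

# Made-up eigenvalues from a PCA of 7 variables.
eigenvalues = [4.2, 2.1, 1.3, 0.9, 0.8, 0.4, 0.3]
kept = retained_factors(eigenvalues)
print(kept)
```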

15 Step 1: Get Data
Suitable data (N)
Missing values

16 Step 2: Normalize the Data in Excel
Normalized value = (X - Average) / Stdev
Use "$" to lock the row of the Average and Standard Deviation cells
Paste the formula, e.g. =SUM(H3-H$632)/H$633
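The same normalization can be written outside Excel. This sketch assumes the sample standard deviation (the behaviour of Excel's STDEV function); the slides do not say which standard deviation function was used, and the concentrations are made up.

```python
from statistics import mean, stdev

def z_scores(column):
    """Normalize one variable: subtract the average, divide by the stdev."""
    avg = mean(column)
    sd = stdev(column)  # sample standard deviation (assumed)
    return [(x - avg) / sd for x in column]

# Made-up concentrations for one variable.
pm = [10.0, 12.0, 14.0, 16.0, 18.0]
normalized = z_scores(pm)
print([round(z, 3) for z in normalized])
```

The result is unitless, has a mean of zero and a standard deviation of one, so variables measured on very different scales contribute comparably to the PCA.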

17 Upload data into SPSS

18 Step 3: Suitability of the Data
KMO and Bartlett's test

19 KMO and Bartlett’s test
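For readers without SPSS, both statistics can be computed from the correlation matrix. The sketch below uses the standard definitions (overall KMO from correlations and partial correlations; Bartlett's chi-square from the log-determinant of the correlation matrix). NumPy, the function name, and the synthetic data are assumptions. The p-value, which SPSS also reports, would come from a chi-square distribution with the returned degrees of freedom and is left out here.

```python
import numpy as np

def kmo_and_bartlett(data):
    """Overall KMO measure and Bartlett's sphericity statistic."""
    n, p = data.shape
    corr = np.corrcoef(data, rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlations follow from the inverse correlation matrix.
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)
    off_diagonal = ~np.eye(p, dtype=bool)
    r2 = (corr[off_diagonal] ** 2).sum()
    a2 = (partial[off_diagonal] ** 2).sum()
    kmo = r2 / (r2 + a2)
    # Bartlett: chi2 = -(n - 1 - (2p + 5)/6) * ln|R|, df = p(p - 1)/2.
    _, logdet = np.linalg.slogdet(corr)
    chi_square = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return kmo, chi_square, dof

# Made-up data: four variables driven by one common factor plus noise.
rng = np.random.default_rng(1)
factor = rng.normal(size=(300, 1))
data = factor + 0.5 * rng.normal(size=(300, 4))
kmo, chi_square, dof = kmo_and_bartlett(data)
print(round(kmo, 3), round(chi_square, 1), dof)
```

A KMO close to 1 and a large chi-square relative to its degrees of freedom both indicate that the variables are correlated enough for PCA to be worthwhile.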

20 Step 4: Run PCA

21 Run PCA

22 PCA Results

23 PCA Results

24 Step 5: Copy the Factor Scores from Step 4 and Paste Them into an Excel Sheet

25 Step 6: Prepare a New Raw Data Set by Adding a Zero Sample as the Last Row

26 Step 7: Run PCA for the Second Time

27 Step 8: Copy the Factor Scores from Step 7 and Paste Them into an Excel Sheet

28 Step 9: Subtract the Factor Score of the Zero Sample from That of Each Sample in Step 8
The revised factor scores are referred to here as APCS.
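The zero-sample procedure (append an all-zero sample, rerun PCA, subtract the zero sample's score from every sample's score) can be sketched as below. This is an approximation of the SPSS workflow under stated assumptions: unrotated components are used, whereas any rotation chosen in SPSS is not visible in the transcript; the scores are scaled to unit variance; and the function name and data are hypothetical.

```python
import numpy as np

def apcs_from_zero_sample(data, n_factors):
    """Absolute principal component scores via an appended zero sample."""
    n_vars = data.shape[1]
    # Append one artificial sample with zero concentration everywhere.
    augmented = np.vstack([data, np.zeros(n_vars)])
    # Normalize the augmented data set and rerun the PCA on it.
    z = (augmented - augmented.mean(axis=0)) / augmented.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_factors]
    # Scoring coefficients that yield unit-variance component scores.
    coeff = eigvecs[:, order] / np.sqrt(eigvals[order])
    scores = z @ coeff
    # Subtract the zero sample's score (last row) from every real sample.
    return scores[:-1] - scores[-1]

# Made-up positive concentrations: 100 samples, 5 species.
rng = np.random.default_rng(3)
data = rng.lognormal(size=(100, 5))
absolute = apcs_from_zero_sample(data, n_factors=2)
print(absolute.shape)
```

Eigenvector signs are arbitrary, so a component may need its sign flipped before its APCS are read as contributions.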

29 Step 10: Run MLR Using PM2.5 Mass as the Dependent Variable and Each of the APCS as an Independent Variable

30 Step 11: Convert the APCS into Factor Mass by Multiplying by the Respective Regression Coefficient
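The regression of PM2.5 mass on the APCS and the conversion to factor mass can be sketched with ordinary least squares. NumPy's `lstsq`, the function name `factor_mass`, and the synthetic numbers are assumptions for illustration, not the SPSS procedure itself.

```python
import numpy as np

def factor_mass(apcs, pm_mass):
    """Regress PM mass on APCS and convert each APCS column to mass."""
    n_samples = apcs.shape[0]
    # Design matrix with an intercept column; the intercept is the mass
    # not explained by any factor.
    design = np.column_stack([np.ones(n_samples), apcs])
    coef, *_ = np.linalg.lstsq(design, pm_mass, rcond=None)
    intercept, slopes = coef[0], coef[1:]
    # Factor mass = APCS multiplied by its regression coefficient.
    contributions = apcs * slopes
    return intercept, slopes, contributions

# Made-up APCS for two factors and a PM2.5 series built from them exactly.
rng = np.random.default_rng(2)
apcs = rng.uniform(0.5, 3.0, size=(50, 2))
pm25 = 3.0 + 2.0 * apcs[:, 0] + 5.0 * apcs[:, 1]
intercept, slopes, contributions = factor_mass(apcs, pm25)
print(round(intercept, 3), np.round(slopes, 3))
```

Each column of `contributions` is the mass attributed to one factor per sample, in the same units as the PM2.5 input.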

32 Demonstration of US EPA PMF 5.0

33 Upload input files

34 Execution of the PMF model

35 Responses of PMF5.0
Species groups shown on the factor panels: {Mg, Zn, Cu, Ni, Ca2+}; {Pb, NH4+, K+, As, Cd, Zn, Ni, V}; {NO3-}; {As, Ba, Sr, Se}; {Na+, Cl-, SO42-}
Scatter plot of PMF against HVS with fit line: slope = 0.91, R2 = 0.88, P < 0.01

36 Acknowledgement
School of Environmental and Natural Resource Sciences, Universiti Kebangsaan Malaysia, Bangi, Malaysia
Research Centre for Tropical Climate Change System, Institute of Climate Change, Universiti Kebangsaan Malaysia, Bangi, Malaysia
Atmospheric Chemistry and Air Pollution Research Group

