Jia-Jyun Dong, Yu-Hsiang Tung, Chien-Chih Chen, Jyh-Jong Liao, Yii-Wen Pan. Student: 陳奕愷. Logistic regression model for predicting the failure probability of a landslide dam

Presentation transcript:

Jia-Jyun Dong, Yu-Hsiang Tung, Chien-Chih Chen, Jyh-Jong Liao, Yii-Wen Pan. Student: 陳奕愷. Logistic regression model for predicting the failure probability of a landslide dam

OUTLINE Introduction Methodology Results Conclusions

3 Introduction After a landslide dam forms, the natural lake may be quickly breached, producing an outburst flood and debris flow that result in a catastrophic disaster. Rapid assessment of landslide-dam stability is one of the crucial steps in decision-making to reduce the related disasters. Geomorphic approaches are widely used to correlate dam, river, and impoundment characteristics with landslide-dam stability.

4 Introduction DBI (Dimensionless Blockage Index). 43 well-documented landslide dams were used to find the dominant variables affecting the stability of a landslide dam and to construct a series of multivariate regression models.

5 Introduction
Discriminant model PHWL: Ds = -2.94 log(P) - 4.58 log(H) + 4.17 log(W) + 2.39 log(L)
Discriminant model AHWL: Ds = -2.62 log(A) - 4.67 log(H) + 4.57 log(W) + 2.67 log(L) + 8.26

6 Introduction
Discriminant model AHV: Ds = -2.13 log(A) - 4.08 log(H) + 2.94 log(V) + 4.09
Ds is the discriminant score; P, H, W, L, and A are the peak flow, dam height, dam width, dam length, and catchment area, respectively; V is the dam volume.
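The discriminant scores above are simple weighted sums of log-transformed variables. A minimal sketch of that arithmetic follows, using the AHWL and AHV coefficients as printed on the slides. It assumes that "log" means the base-10 logarithm and that inputs use the same units as the original calibration data; the slides state neither. The function and dictionary names are illustrative only.

```python
import math

# Coefficients copied from the slides (AHWL and AHV discriminant models).
# Assumption: "log" is log10; the required input units are not given in the
# slides and must match those used when the models were calibrated.
AHWL_COEFFS = {"A": -2.62, "H": -4.67, "W": 4.57, "L": 2.67}
AHWL_CONST = 8.26
AHV_COEFFS = {"A": -2.13, "H": -4.08, "V": 2.94}
AHV_CONST = 4.09


def discriminant_score(values, coeffs, const):
    """Return Ds = sum(coeff * log10(value)) + const for the named variables."""
    return sum(c * math.log10(values[name]) for name, c in coeffs.items()) + const


if __name__ == "__main__":
    # Hypothetical dam, for illustration only (not taken from the dataset).
    dam = {"A": 50.0, "H": 30.0, "W": 400.0, "L": 200.0}
    print("AHWL Ds:", discriminant_score(dam, AHWL_COEFFS, AHWL_CONST))
```

Keeping the coefficients in dictionaries keyed by variable name makes it harder to pass values in the wrong order, which matters because the coefficients differ in sign.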

7 Introduction Risk assessment and management may involve (1) estimating the risk level, (2) judging whether the risk level is acceptable, and (3) exercising appropriate countermeasures to reduce the risk. This work also compared the performance of the logistic regression models with that of the previous DBI index-based graphic model and the discriminant models.

8 Methodology Logistic regression is useful when the dependent variable is categorical (e.g., presence or absence) and the explanatory (independent) variables are categorical, numerical, or both.

9 Methodology (Categorization of dataset by logistic regression analysis) Probability of a landslide dam remaining stable: Certain linear combination of the influencing variables:
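The two equations on this slide did not survive in the transcript. For orientation only, the standard form of a logistic regression model is sketched below. The symbols P_s, P_f, L_s, beta_i, and x_i are assumed notation, and the authors' exact formulation may differ.

```latex
% Standard logistic model (assumed notation, not copied from the slide):
% L_s is a linear combination of the n influencing variables x_i,
% P_s is the probability that the dam remains stable,
% P_f is the corresponding failure probability.
L_s = \beta_0 + \sum_{i=1}^{n} \beta_i x_i ,
\qquad
P_s = \frac{1}{1 + e^{-L_s}} ,
\qquad
P_f = 1 - P_s
```

In this form, L_s = 0 corresponds to P_s = 0.5, so the sign of the linear combination separates the two categories.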

10 Methodology (Performance evaluation of the logistic regression model) (1) The training set (17 unstable and 5 stable) (2) The target set (17 unstable and 4 stable)

11 Methodology (Performance evaluation of the logistic regression model) Using the target set, we then evaluated the predictive success of the model built on the training set. The performance of a predictive model can be evaluated via the ROC (relative operating characteristic) diagram. The ROC diagram method has been widely used to measure the prediction potential of landslide susceptibility models. A larger area under the ROC curve (AUC) indicates better model prediction; the AUC ranges from 0.5 (for models with no predictive capability) to 1.0 (for models with perfect predictive power).
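As an illustration of the AUC index described above, the sketch below computes it directly from its pairwise interpretation: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. This is a generic formulation, not necessarily the procedure the authors used, and `roc_auc` is a hypothetical name.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class, 0 for the negative class.
    scores: predicted probability (or any score) of the positive class.
    Returns the fraction of positive/negative pairs ranked correctly,
    counting tied scores as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The double loop is quadratic in the number of cases, which is acceptable for datasets of a few dozen dams such as the training and target sets described here.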

12 Methodology (Jack-knife technique) The jack-knife technique can be used to rank the importance of the variables in the logistic regression model. We eliminate the variables from the logistic regression model one at a time and refit the model each time.
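The variable-elimination idea can be sketched as follows: drop one variable, refit, and record how well the reduced model classifies the data. This is a minimal sketch under several assumptions. The slides do not say which performance measure was used to compare sub-models, so plain classification accuracy is used here. The logistic fit is a simple gradient-ascent routine rather than the authors' software, and all function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights [b0, b1, ...] by batch gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, target in zip(X, y):
            err = target - sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], row)))
            grad[0] += err
            for j, xi in enumerate(row):
                grad[j + 1] += err * xi
        w = [wi + lr * g / len(X) for wi, g in zip(w, grad)]
    return w


def accuracy(w, X, y):
    """Share of cases classified correctly with a 0.5 probability threshold."""
    hits = 0
    for row, target in zip(X, y):
        p = sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], row)))
        hits += int((p >= 0.5) == bool(target))
    return hits / len(X)


def jackknife_variables(X, y, names):
    """Refit with each variable removed in turn; map dropped name -> accuracy."""
    results = {}
    for k, name in enumerate(names):
        X_sub = [[v for j, v in enumerate(row) if j != k] for row in X]
        results[name] = accuracy(fit_logistic(X_sub, y), X_sub, y)
    return results


if __name__ == "__main__":
    # Synthetic data for illustration only: x1 determines the class, x2 is noise.
    rng = random.Random(0)
    X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(100)]
    y = [1 if row[0] > 0 else 0 for row in X]
    print(jackknife_variables(X, y, ["x1", "x2"]))
```

A variable whose removal causes the largest drop in performance is taken to be the most important. In the synthetic demo, removing `x1` should hurt far more than removing `x2`.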

13 Results (Logistic regression models PHWL_Log and AHWL_Log) Logistic regression models PHWL_Log

14 Results (Logistic regression models PHWL_Log and AHWL_Log) Logistic regression models AHWL_Log

15 Results (Logistic regression models PHWL_Log and AHWL_Log)

16 Results (Logistic regression models PHWL_Log and AHWL_Log)

17 Results (Logistic regression models PHWL_Log and AHWL_Log) The overall prediction power (success rate) of AHV_Log was 89.3%. The cross-validation accuracy was 85.7%. The AUC of model AHV_Log was [value missing]. Compared with the overall prediction powers of 70.1% for the discriminant model AHV_Dis and 64.9% for the index-based graphic model (DBI > 3.08: unstable), the logistic regression model seems better able to categorize landslide dams as stable or unstable.

18 Results (Importance of the factors affecting landslide-dam stability) The jack-knife technique was used to examine the relative importance of each variable in the predictive models. We eliminated each of the four variables in turn, which gave four logistic regression sub-models.

19 Results (Importance of the factors affecting landslide-dam stability)

20 Results (Failure probability of landslide dams in Tobata inventory) About 87% of unstable landslide dams had a failure probability greater than 80%

21 Results (Failure probability of landslide dams in Tobata inventory) About 85% of unstable landslide dams had a failure probability greater than 80%

22 Results (Failure probability of landslide dams in Tobata inventory)

23 Results ( Application to the landslide dams induced by recent catastrophic events )

24 Conclusions The proposed models PHWL_Log and AHWL_Log were able to categorize the landslide dams into stable and unstable groups with high success rates. Model PHWL_Log (AUC=94.8%) was slightly superior to model AHWL_Log (AUC=92.5%). Yet, model AHWL_Log may be more useful in practice because peak flow information is not always available in the early stage after dam formation.

25 Conclusions The log-transformed peak flows are identified as the most important geomorphic variables influencing the stability of a landslide dam. The log-transformed dam height, with a negative contribution to the stability of a landslide dam, is the second most significant variable. In addition to classifying landslide dams into stable and unstable groups, the proposed logistic regression models can also be used to evaluate the failure probability.

26 Conclusions The failure probability of an earthquake-induced landslide dam decreased if the prediction model AHWL_Log was replaced by model PHWL_Log. The failure probability of a heavy rainfall-induced case predicted by model PHWL_Log is higher than that by AHWL_Log because a high flow rate is incorporated into the prediction model (PHWL_Log).

27 Conclusions It would be more suitable if both the failure probability and the impact due to the outburst flood from the landslide-dammed lake could be considered separately for classifying the risk level of a landslide dam. A simple model describing the failure probability of landslide dams, such as the proposed logistic regression model, is necessary for classifying the risk level related to a landslide dam breach.

28 Conclusions The proposed models can be used for evaluating the risk associated with outburst floods from landslide-dammed lakes. These models can be used as an evaluation tool for decision-making concerning hazard mitigation actions, especially when the allowable time is limited.

29 THANK YOU FOR LISTENING