EEC 688/788 Secure and Dependable Computing

Presentation transcript:

EEC 688/788 Secure and Dependable Computing Lecture 8 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University wenbing@ieee.org Maulik no show 10/5/2009

EEC688/788: Secure & Dependable Computing (2/22/2019). Outline: recovery-oriented computing overview; application-level fault detection; structural behavior monitoring; path shape analysis.

Recovery-Oriented Computing: Focus: availability of soft real-time systems. Availability = MTTF / (MTTF + MTTR), where MTTF is the mean time to failure and MTTR is the mean time to recover. Availability can be improved by increasing MTTF as well as by reducing MTTR. Recovery-oriented computing focuses on reducing MTTR: making fault detection faster and more accurate, and making recovery faster.
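The availability formula above can be tried out with a few numbers. The following is a minimal sketch; the function name `availability` and the MTTF/MTTR figures are assumptions made up for illustration, not values from the lecture.

```python
def availability(mttf, mttr):
    """Availability = MTTF / (MTTF + MTTR); both arguments use the same time unit."""
    if mttf < 0 or mttr < 0 or mttf + mttr == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf / (mttf + mttr)


# Hypothetical figures in hours, chosen only for illustration.
baseline = availability(1000, 10)
doubled_mttf = availability(2000, 10)   # failures happen half as often
halved_mttr = availability(1000, 5)     # recovery is twice as fast

print(f"baseline:     {baseline:.6f}")
print(f"doubled MTTF: {doubled_mttf:.6f}")
print(f"halved MTTR:  {halved_mttr:.6f}")
```

Because the formula depends only on the ratio MTTF/MTTR, halving MTTR gives the same availability as doubling MTTF, which is the motivation for working on recovery time.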

Fault Detection and Localization: Fault detection determines whether some component in the system has failed. Fault localization pinpoints the particular component that failed. Low-level fault detection mechanisms are based on timeouts, probing each component periodically with a heartbeat message; they cannot detect many application-level faults. Recovery-oriented computing focuses on application-level fault detection and localization. 75% of the recovery time is spent on application-level fault detection.

Microreboot and System-Level Undo/Redo: Microreboot: many problems can be fixed by simply restarting the faulty component. This works best with component-based systems. For problems that cannot be fixed by a microreboot, perform a system-level undo, fix the problem, then carry out a system-level redo. Undo/redo is based on checkpointing and logging.

System Model for Recovery-Oriented Computing: Three-tier architecture that separates application logic from data management. The middle tier is stateless or maintains only session state. Component-based middleware: Java Platform, Enterprise Edition (Java EE, often referred to as J2EE). Key component: Enterprise JavaBean (EJB).

Application-Level Fault Detection: Fail-stop faults can be detected using timeouts. Application-level faults can only be detected at the application level. One plausible fault detection method is the acceptance test, but developers would have to write effective and efficient acceptance test routines. This is not practical for Internet applications because of their scale, complexity, and rapid rate of change. ROC-based approach: measure and monitor the structural behavior of an application. This may detect application-level faults without a priori knowledge of the application's details.

Structural Behavior Monitoring: Interaction patterns between components reflect application-level functionality. Each component implements a specific application function, e.g., a stateful session bean that manages a user's shopping cart, or a set of singleton session beans that keep track of inventory. The internal structural behavior can be monitored to infer whether or not the application is functioning normally. To monitor it, log the runtime path of each end-user request, including all incoming messages, outgoing messages, method invocations, etc.

Structural Behavior: Runtime Path Example: The runtime path for a single end-user request spans 5 components and consists of 10 events.
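One way to picture runtime path logging is a per-request list of events. The sketch below is only illustrative: the class names `PathEvent` and `RuntimePathLog`, the event kinds, and the sample path are all assumptions, not the logging format used by any real middleware or by the lecture's figure. The sample path is built to span 5 components with 10 events, matching the counts stated above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PathEvent:
    request_id: str
    source: str  # component that sent the message or made the call
    target: str  # component that received it
    kind: str    # hypothetical event kinds: "invoke" or "reply"


class RuntimePathLog:
    """Collects the events of each end-user request in arrival order."""

    def __init__(self):
        self._paths = defaultdict(list)

    def record(self, event):
        self._paths[event.request_id].append(event)

    def path(self, request_id):
        return list(self._paths.get(request_id, []))

    def components(self, request_id):
        """Distinct components touched by the request, in first-seen order."""
        seen = []
        for event in self.path(request_id):
            for name in (event.source, event.target):
                if name not in seen:
                    seen.append(name)
        return seen


# Hypothetical path: A calls B, B calls C and then D, each of which calls E.
log = RuntimePathLog()
hops = [("A", "B", "invoke"), ("B", "C", "invoke"), ("C", "E", "invoke"),
        ("E", "C", "reply"), ("C", "B", "reply"), ("B", "D", "invoke"),
        ("D", "E", "invoke"), ("E", "D", "reply"), ("D", "B", "reply"),
        ("B", "A", "reply")]
for source, target, kind in hops:
    log.record(PathEvent("req-1", source, target, kind))

print(len(log.path("req-1")), "events across", log.components("req-1"))
```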

Structural Behavior: Machine Learning: Reference models are trained using machine learning. Historical reference model: trained with aggregated runtime path data; the objective is anomaly detection based on historical behavior; training may use real workload as well as synthetic workload that resembles real workload. Peer reference model: trained with the most recent runtime path data; the objective is anomaly detection with respect to peer components; training must use real workload. Fault (anomaly) detection compares observed patterns with those in the reference models.

Component Interactions Modeling: Focus on interactions between a component instance and all other component classes. This is more scalable, because it copes with cases where there are many instances of each class. It is also suitable for using the chi-square test for anomaly detection.

Component Interactions Modeling: Given a system with n component classes, the interaction model for a component instance consists of a set of n-1 weighted links between the instance and the other n-1 component classes. We assume that instances of the same class do not interact with each other, and that interactions are symmetric (i.e., request and reply). The weight assigned to each link is the probability of the component instance interacting with the linked component class. The weights on all links sum to 1, i.e., the component instance interacts with some other component class with probability 1.

Component Interaction Model: Example: Class A: web component, handles end-user requests. Class B: application logic, handles conversations with end users, 3 instances. Class C and Class D: also application logic, representing shared state. Class E: database server, persistent state.

Component Interaction Model: Example (continued): Machine learning determines the link weights from training data. Training data: A issued 400 remote invocations on b1; b1 issued 300 local method invocations on C and 300 invocations on D. What happened between C and E, or between D and E, is not important for the model of b1. Link weight calculation: the total number of interactions that occurred at instance b1 is 400 + 300 + 300 = 1000, so P(b1-A) = 400/1000 = 0.4, P(b1-C) = 300/1000 = 0.3, and P(b1-D) = 300/1000 = 0.3.
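The link weight calculation amounts to normalizing the interaction counts seen at one instance. A minimal sketch follows, using the training counts from the example; the function name `link_weights` and the dictionary layout are assumptions for illustration.

```python
def link_weights(interaction_counts):
    """Turn per-class interaction counts at one instance into link probabilities."""
    total = sum(interaction_counts.values())
    if total == 0:
        raise ValueError("no interactions recorded for this instance")
    return {peer: count / total for peer, count in interaction_counts.items()}


# Training data for instance b1, taken from the example above.
b1_counts = {"A": 400, "C": 300, "D": 300}
b1_weights = link_weights(b1_counts)

for peer, weight in b1_weights.items():
    print(f"P(b1-{peer}) = {weight}")
```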

Anomaly Detection: Current behavior is compared with the trained behavior using the chi-square test. Prepare the observed data as a histogram, then compare the distributions using the statistic $D = \sum_{i=1}^{n} (o_i - e_i)^2 / e_i$, where n is the number of cells in the histogram, e_i is the expected frequency in cell i, and o_i is the observed frequency in cell i. If e_i is 0, the cell should be pruned. Each link is regarded as a cell. For an observation period of m requests, the expected frequency for link i is e_i = m * p_i, where p_i is the trained weight of link i. With no anomaly, D = 0 ideally. In practice, D is not 0 because of randomness; it follows a chi-square distribution.

Anomaly Detection: Chi-Square Test: An anomaly is detected when D exceeds the (1-a) quantile of the chi-square distribution with k = n-1 degrees of freedom, at a level of significance a. A higher a makes the test more sensitive and produces more false positives. Level of significance: the probability of rejecting the null hypothesis in a statistical test when it is true (http://www.merriam-webster.com/dictionary/level%20of%20significance).

Anomaly Detection: Chi-Square Test: Example: Observation period: 100 requests. A issued 45 requests on b1; b1 issued 35 invocations on C and 20 invocations on D. Link(A-b1): expected 100*0.4 = 40, observed 45. Link(C-b1): expected 100*0.3 = 30, observed 35. Link(D-b1): expected 100*0.3 = 30, observed 20. D = (45-40)^2/40 + (35-30)^2/30 + (20-30)^2/30 = 4.79. Chi-square test: there are 2 degrees of freedom (only 3 cells); for a = 0.1, the 90% quantile is 4.6. Since D > 4.6, an anomaly is detected.
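The worked example can be reproduced in a few lines. This is a sketch under stated assumptions: the function names are made up for illustration, and the critical value uses the closed form -2 ln(a), which holds only for 2 degrees of freedom (the case in this example). For other degrees of freedom a statistics library would be needed to obtain the quantile.

```python
import math


def chi_square_statistic(observed, weights, m):
    """D = sum over links of (o_i - e_i)^2 / e_i, with e_i = m * p_i."""
    d = 0.0
    for link, p in weights.items():
        expected = m * p
        if expected == 0:
            continue  # cells with zero expected frequency are pruned
        d += (observed.get(link, 0) - expected) ** 2 / expected
    return d


def critical_value_df2(a):
    """(1 - a) quantile of the chi-square distribution with exactly 2 degrees of freedom."""
    return -2.0 * math.log(a)


weights = {"A": 0.4, "C": 0.3, "D": 0.3}   # trained link weights for b1
observed = {"A": 45, "C": 35, "D": 20}     # counts over 100 requests

d = chi_square_statistic(observed, weights, m=100)
threshold = critical_value_df2(0.1)
anomaly = d > threshold

print(f"D = {d:.2f}, threshold = {threshold:.2f}, anomaly = {anomaly}")
```

With these inputs D is about 4.79 and the threshold about 4.61, so the sketch flags an anomaly, in agreement with the example.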

Path Shapes Modeling: The shape of a runtime path is defined to be the ordered set of component classes. A path shape is represented as a tree in which each node represents a component class and each directed edge represents the causal relationship between two adjacent nodes.

Path Shapes Modeling: A probabilistic context-free grammar (PCFG), in Chomsky Normal Form (CNF), is used for path shape modeling. It consists of: a list of terminal symbols T_k (the component classes in a path shape form the T_k); a list of nonterminal symbols N_i, which denote the stages of the production rules (N_1 is the start symbol, often denoted S; $ marks the end of a rule; all other nonterminal symbols are to be replaced by production rules); a list of production rules N_i -> z_j, where z_j is a list of terminals and nonterminals; and a list of probabilities R_ij = P(N_i -> z_j).

Path Shape Modeling: Example: Path shapes for 4 end-user requests. The call transits from A to B with 100% probability. R1j: S -> A, p = 1.0. R2j: A -> B, p = 1.0.

Path Shape Modeling: Example (continued): For B, there are 3 possible transitions: to C with 25% probability, to D with 25% probability, and to both C and D with 50% probability. R3j: B -> C, p = 0.25 | B -> D, p = 0.25 | B -> CD, p = 0.5. Once a call reaches C or D, it must transit to E, hence R4j: C -> E, p = 1.0 and R5j: D -> E, p = 1.0. E is the last stop for all paths. R6j: E -> $, p = 1.0.

Path Shape Modeling: Anomaly Detection: The path shape of each new request is checked to see whether it conforms to the grammar. An anomaly is detected if a path shape does not conform to the grammar. A PCFG by itself only detects faults; it does not pinpoint the root cause (fault localization). Another method, such as a decision tree, is needed for that.
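The conformance check can be sketched with the production rules from the example. The representation below (a path shape as a nested `(label, children)` tuple, rules as a dictionary keyed by nonterminal) and the function names are assumptions chosen for illustration, not the lecture's implementation.

```python
# Production rules from the example: nonterminal -> {right-hand side: probability}.
RULES = {
    "S": {("A",): 1.0},
    "A": {("B",): 1.0},
    "B": {("C",): 0.25, ("D",): 0.25, ("C", "D"): 0.5},
    "C": {("E",): 1.0},
    "D": {("E",): 1.0},
    "E": {("$",): 1.0},
}


def shape_probability(node, rules=RULES):
    """Probability of a path-shape tree under the grammar; 0.0 if it does not conform."""
    label, children = node
    # A leaf node must match the end-of-rule production, written "$".
    rhs = tuple(child[0] for child in children) or ("$",)
    prob = rules.get(label, {}).get(rhs, 0.0)
    for child in children:
        if prob == 0.0:
            break
        prob *= shape_probability(child, rules)
    return prob


def is_anomalous(path_root, rules=RULES):
    """Flag a path shape that the grammar cannot derive from the start symbol S."""
    return shape_probability(("S", [path_root]), rules) == 0.0


# A calls B, which calls both C and D, each of which calls E.
normal = ("A", [("B", [("C", [("E", [])]), ("D", [("E", [])])])])
# Hypothetical faulty request: B calls E directly, which no rule allows.
faulty = ("A", [("B", [("E", [])])])

print(shape_probability(("S", [normal])), is_anomalous(normal), is_anomalous(faulty))
```

As the slide notes, a check like this only reports that a path shape is unexpected; it says nothing about which component caused the deviation.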

Exercise 1: The following interactions occurred at instance b1 during a period in which the total number of invocations at b1 was 1200: 300 remote invocations on b1 by A, and 200, 300, 200, and 200 local method invocations involving C, D, E, and F, respectively. If the observed counts for the links to A, C, D, E, and F are 35, 25, 20, 15, and 25, determine whether an anomaly is present in the system.