Sonal Mahajan, Bailan Li, Pooyan Behnamghader, William G. J. Halfond

1 Using Visual Symptoms for Debugging Presentation Failures in Web Applications
Sonal Mahajan, Bailan Li, Pooyan Behnamghader, William G. J. Halfond
University of Southern California, Los Angeles, California, USA
Work supported by NSF Grant CCF

2 Background Information
What do we mean by presentation?
  The “look and feel” of the website in a browser
What is a presentation failure?
  Web page rendering ≠ expected appearance
Importance of presentation
  Aesthetics impact users’ evaluation [Tractinsky et al. 2006]
  Presentation impacts trustworthiness and usability [Lindgaard et al. 2011]

3 Usage Scenario
Regression debugging: modify the current version of the web page to correct a bug or refactor the HTML structure.
[Figure: a web page (Menu | Contact, News, Username, Password, Sign in, About us | Feedback | FAQ) refactored from a table-based layout (<table>, <tr>, <td>) to a div-based layout (<div>)]

4 Usage Scenario – Difficulties
[Figure: the developer compares the oracle (previous version) with the test web page; two differences are marked Problem 1 and Problem 2]

5 Usage Scenario – Difficulties
Analyze the observed differences
Explore the UI to find the fault
  Example fault: background color of the “Sign in” button
[Figure: the developer compares the oracle (previous version) with the test web page]

6 Usage Scenario – Difficulties
Analyze the observed differences
Explore the UI to find the fault
  Example fault: background color of the “Sign in” button
Manual debugging is difficult
  Complex interaction between HTML, CSS, and JS
  Hundreds of HTML elements + CSS properties
  This makes debugging labor intensive and error prone
Prior user study [Mahajan et al. ICST 2015]
  Correct fault identified in only 36% of test cases!
[Figure: the developer compares the oracle (previous version) with the test web page]

7 Limitations of Existing Techniques
DOM comparison techniques (e.g., XBT)
  Not effective if the DOM has changed significantly
Invariant specification techniques (e.g., Selenium)
  Not practical, since all correctness properties need to be provided
Fighting Layout Bugs
  Checks application-independent problems only
Our approach: automate debugging of presentation failures

8 Three Key Insights
Visual differences can help diagnosis
  Visual symptoms
[Figure: the “Sign in” button in the oracle and in the test page, showing a color-related presentation failure]

9 Definition
Visual symptom: a boolean predicate describing the visual difference (clues to the fault)
Visual Symptoms | CSS Properties
1. Almost matched element | margin-top, padding-top, etc.
2. Shift bottom element | margin-top, margin-left, etc.
3. Page size changed | height, width, padding, etc.
4. Added color | background-color, color, etc.
. . .
23. All diff. pixels in top of the element | padding-top, border-top-width, etc.
Almost matched element: only position changed; e.g., margin-top; detected by sub-image searching
Shift bottom element: moved downwards; e.g., margin-top; detected by analyzing difference pixels
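
To make "boolean predicate" concrete, here is a minimal Python sketch of two symptoms from the table. The deck does not say how each symptom is computed, so both function bodies are assumptions chosen for illustration: images are plain nested lists of RGB tuples, and the names `page_size_changed` and `added_color` are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B)
Image = List[List[Pixel]]      # rows of pixels; a stand-in for a screenshot


def page_size_changed(oracle: Image, test: Image) -> bool:
    """True when the two screenshots differ in height or width (assumed definition)."""
    oracle_size = (len(oracle), len(oracle[0]) if oracle else 0)
    test_size = (len(test), len(test[0]) if test else 0)
    return oracle_size != test_size


def added_color(oracle: Image, test: Image) -> bool:
    """True when the test page shows a color that never appears in the oracle (assumed definition)."""
    oracle_colors = {px for row in oracle for px in row}
    test_colors = {px for row in test for px in row}
    return bool(test_colors - oracle_colors)


# Tiny example: one white pixel turned blue in the test page.
WHITE, BLUE = (255, 255, 255), (0, 0, 255)
oracle_img = [[WHITE, WHITE]]
test_img = [[WHITE, BLUE]]
symptoms = {
    "page size changed": page_size_changed(oracle_img, test_img),
    "added color": added_color(oracle_img, test_img),
}
```

Each predicate returns only T or F, which is what lets the later phases record symptoms in a truth table.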

10 Three Key Insights
Probabilistic correlations can help identify faults
Example root cause: <button, background-color>
Probabilities of the Color Symptom and the Size Symptom being T or F (values as extracted from the slide's table): 0.0 1.0 0.2 0.8 0.95 0.05 0.7 0.3

11 Three Key Insights
Building the probabilistic model: approaches
✗ Pool of known presentation failures
    Differences depend on page layout. Not generalizable.
✗ Historical data for the page
    Available only for mature pages
    Manual extraction from bug-tracking system
✔ Probabilistic models can be automatically generated from the faulty test page

12 Our Approach
The goal is to automatically identify the fault behind a presentation failure observed in a test page.
Input
  Test page
  Oracle image (previous version screenshot, mockup, etc.)
Phases
  P1. Detect presentation failures
  P2. Build the probabilistic model
  P3. Identify the most likely faults
Output
  Ranked list of likely faults

13 P1. Detect Presentation Failures
Use WebSee [Mahajan et al. ICST 2015]
Visual comparison of the oracle image and the test web page yields the presentation failures
Computer vision technique: Perceptual Image Differencing (PID)

14 P2. Build the Probabilistic Model
Model based on conditional probability
P(r | S): probability that a potential root cause, r, is faulty given the observed set of visual symptoms, S
  r = <e, p>: fault (root cause), where e = HTML element and p = CSS property
  S = set of visual symptoms

15 P2. Build the Probabilistic Model
Generate data samples
  Inject faults into the test page
    Assign different values to potential root causes
  Observe visual symptoms
  Build truth table
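
The sample-generation loop can be sketched as follows. This is a minimal illustration, not FieryEye's implementation: `build_truth_table` and `observe_example` are hypothetical names, and the observation step (render the mutated page, compare it with the oracle) is replaced by a lookup hard-coded from the deck's running example.

```python
from typing import Callable, Dict, List, Set, Tuple

RootCause = Tuple[str, str]        # (HTML element, CSS property)
Sample = Tuple[RootCause, str]     # root cause plus the injected value

SYMPTOMS = [
    "added color",
    "almost matched element",
    "shift top element",
    "shift bottom element",
    "page size changed",
]


def build_truth_table(
    candidates: Dict[RootCause, List[str]],
    observe: Callable[[RootCause, str], Set[str]],
) -> Dict[Sample, Dict[str, bool]]:
    """Inject every candidate value and record which symptoms are T or F."""
    table: Dict[Sample, Dict[str, bool]] = {}
    for root_cause, values in candidates.items():
        for value in values:
            seen = observe(root_cause, value)   # symptoms that evaluate to T
            table[(root_cause, value)] = {s: s in seen for s in SYMPTOMS}
    return table


# Stand-in for "render the mutated page and compare it with the oracle";
# the observations are copied from the deck's running example.
EXAMPLE_OBSERVATIONS: Dict[Sample, Set[str]] = {
    (("p", "color"), "blue"): {"added color"},
    (("div", "margin-top"), "0px"): {"almost matched element", "shift top element"},
    (("div", "margin-top"), "50px"): {
        "almost matched element",
        "shift bottom element",
        "page size changed",
    },
}


def observe_example(root_cause: RootCause, value: str) -> Set[str]:
    return EXAMPLE_OBSERVATIONS[(root_cause, value)]


candidates = {("p", "color"): ["blue"], ("div", "margin-top"): ["0px", "50px"]}
truth_table = build_truth_table(candidates, observe_example)
```

Because each injected value is evaluated independently, the loop body is the natural unit to distribute across machines.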

16 Truth Table – Example
Root Causes       | Injected values | Added color | Almost matched element | Shift top element | Shift bottom element | Page size changed
<p, color>        | blue            | T           | F                      | F                 | F                    | F
<div, margin-top> | 0px, 50px       | F, F        | T, T                   | T, F              | F, T                 | F, T
Data samples: <p, color = blue>, <div, margin-top = 0px>, <div, margin-top = 50px>

17 P2. Build the Probabilistic Model
Generate data samples
  Inject faults into the test page
    Assign different values to potential root causes
  Observe true visual symptoms
  Build truth table
Calculate probabilities
  Individual symptoms and conditional probability
  Learn correlation between the root cause and visual symptoms

18 Probabilities Calculation
Conditional probability, expanded with Bayes’ theorem
  r = root cause
  S = set of visual symptoms
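
Written out with the symbols defined on this slide, Bayes' theorem takes the standard form below. The slide's own equation is not preserved in the transcript, so this is a reconstruction from the three terms, P(S|r), P(r), and P(S), that the following slides discuss one by one.

```latex
P(r \mid S) = \frac{P(S \mid r)\, P(r)}{P(S)}
```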

19 Probabilities Calculation
P(S|r) = probability of the status of visual symptoms S given that r is the faulty root cause
  r = root cause
  S = set of visual symptoms
Assumes visual symptoms are conditionally independent given the root cause
Advantages
  Easier to calculate
  Parallelizable
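
Under the conditional-independence assumption, the likelihood factors into one term per symptom. The form below is a sketch in the slide's notation; it is consistent with the deck's running example, which multiplies P(s1|r) by P(s2|r).

```latex
P(S \mid r) = \prod_{s \in S} P(s \mid r)
```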

20 Probabilities Calculation
P(S|r) = probability of the status of visual symptoms S given that r is the faulty root cause
  r = root cause
  S = set of visual symptoms
Measure P(s|r) in data samples
  Observe visual symptoms for a seeded root cause
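
One plausible way to measure P(s|r) is to take, for each root cause, the fraction of its data samples in which symptom s was observed. The sketch below assumes that definition (the deck does not state the estimator explicitly) and uses the truth table of the running example; `conditional_probabilities` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

RootCause = Tuple[str, str]      # (HTML element, CSS property)
Sample = Tuple[RootCause, str]   # root cause plus the injected value


def conditional_probabilities(
    truth_table: Dict[Sample, Dict[str, bool]],
) -> Dict[RootCause, Dict[str, float]]:
    """Estimate P(s = T | r) as the share of r's data samples where s is T."""
    rows_by_cause: Dict[RootCause, List[Dict[str, bool]]] = defaultdict(list)
    for (root_cause, _value), row in truth_table.items():
        rows_by_cause[root_cause].append(row)
    return {
        root_cause: {s: sum(row[s] for row in rows) / len(rows) for s in rows[0]}
        for root_cause, rows in rows_by_cause.items()
    }


T, F = True, False
COLUMNS = ["added color", "almost matched element", "shift top element",
           "shift bottom element", "page size changed"]
truth_table = {
    (("p", "color"), "blue"): dict(zip(COLUMNS, [T, F, F, F, F])),
    (("div", "margin-top"), "0px"): dict(zip(COLUMNS, [F, T, T, F, F])),
    (("div", "margin-top"), "50px"): dict(zip(COLUMNS, [F, T, F, T, T])),
}
cpt = conditional_probabilities(truth_table)
```

With the three samples above, `<div, margin-top>` gets 1.0 for "almost matched element" and 0.5 for "shift bottom element", the values used later in the P3 example.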

21 Conditional Probability Table – Example
Starting point: the truth table
Root Causes       | Injected values | Added color | Almost matched element | Shift top element | Shift bottom element | Page size changed
<p, color>        | blue            | T           | F                      | F                 | F                    | F
<div, margin-top> | 0px, 50px       | F, F        | T, T                   | T, F              | F, T                 | F, T

23 Conditional Probability Table – Example
Root Causes       | Injected values | Added color | Almost matched element | Shift top element | Shift bottom element | Page size changed
<p, color>        | blue            | 1.0         | 0.0                    | 0.0               | 0.0                  | 0.0
<div, margin-top> | 0px, 50px       | 0.0         | 1.0                    | 0.5               | 0.5                  | 0.5

24 Probabilities Calculation
P(r) = relative probability of r being the faulty root cause
  r = root cause = <e, p>, where e = HTML element and p = CSS property
  S = set of visual symptoms
Assume developers cause faults with uniform probability

25 Probabilities Calculation
P(r) = relative probability of r being the faulty root cause
  r = root cause = <e, p>, where e = HTML element and p = CSS property
  S = set of visual symptoms

26 P(p) Computation – Example
A total of 2 properties in the page: color, margin-top

27 Probabilities Calculation
P(S) = probability of the symptoms in S being T/F for a given page
  r = root cause
  S = set of visual symptoms
P(S) is independent of r
  Values of s ∈ S are given

28 Probabilities Calculation
P(e), P(S) = constants
  r = root cause
  S = set of visual symptoms
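
With P(e) and P(S) constant, only two factors distinguish one root cause from another. The derivation below is a sketch that assumes P(r) factors as P(e)·P(p) for r = <e, p>; the deck treats P(e) as a constant and computes only P(p), but does not spell out the factorization.

```latex
P(r \mid S) = \frac{P(S \mid r)\, P(e)\, P(p)}{P(S)} \;\propto\; P(p) \prod_{s \in S} P(s \mid r)
```

This agrees with the running example in the backup slides, where the score for <div, margin-top> is 1.0 × 0.5 × 0.5 = 0.25.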

29 P3. Identify Most Likely Root Causes
for r ∈ R = {<e1, p1>, …, <en, pn>}
  1. calculate P(p) for r = <e, p>
  2. determine visual symptoms, S
  3. for s ∈ S, look up P(s|r) in the model
  4. calculate P(r|S)
Rank root causes by their probabilities
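
The four steps can be sketched in Python as below. This is an illustration under stated assumptions rather than FieryEye's code: `rank_root_causes` is a hypothetical name, the score is P(p) multiplied by the product of P(s|r) over the observed symptoms (constants P(e) and P(S) are dropped), and the input numbers are taken from the deck's running example.

```python
from math import prod
from typing import Dict, List, Tuple

RootCause = Tuple[str, str]   # (HTML element, CSS property)


def rank_root_causes(
    cpt: Dict[RootCause, Dict[str, float]],
    prior_p: Dict[str, float],
    observed: List[str],
) -> List[Tuple[RootCause, float]]:
    """Score every root cause and return them sorted from most to least likely."""
    scores = {}
    for (element, prop), symptom_probs in cpt.items():
        # Step 3: look up P(s|r) for each observed symptom and multiply.
        likelihood = prod(symptom_probs[s] for s in observed)
        # Step 4: combine with P(p); P(e) and P(S) are constants and omitted.
        scores[(element, prop)] = likelihood * prior_p[prop]
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Values from the running example.
cpt = {
    ("p", "color"): {"almost matched element": 0.0, "shift bottom element": 0.0},
    ("div", "margin-top"): {"almost matched element": 1.0, "shift bottom element": 0.5},
}
prior_p = {"color": 0.5, "margin-top": 0.5}                     # step 1
observed = ["almost matched element", "shift bottom element"]   # step 2

ranking = rank_root_causes(cpt, prior_p, observed)
# ranking[0] is (("div", "margin-top"), 0.25); ("p", "color") scores 0.0
```

Dropping the constants changes the absolute values but not the order, and the order is all the ranked list needs.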

30 Empirical Evaluation
RQ1: How accurate is our approach in identifying root causes of presentation failures?
RQ2: What are the computational resources needed to run our approach?

31 Implementation
Approach implemented in FieryEye (火眼)
Building the probabilistic model
  Parallelized over 200 Amazon EC2 c4.large instances
Identifying visual symptoms
  Used OpenCV to compare screenshots, extract color information, perform sub-image searching, etc.

32 Experiment Protocol
Regression debugging activity: refactoring of web pages
  Migrate HTML 4 to 5 (<div id=‘head’> to <header>)
  Convert table-based layout to div-based layout
  Replace deprecated tags (<font> to CSS font)
Generate test cases: for each subject
  Download page (H), take screenshot = oracle
  Refactor H to get H’
  Seed presentation failure in H’ to create a variant
Performance comparison
  Run FieryEye on oracle and variant
  WebSee, XPERT, Text Diff Tool (TDT) – diff

33 Subjects
Random URL generator (http://www.uroulette.com)
Subject   | Size (Total RC) | Generated # test cases
Perl      | 1,592           | 36
GTK       | 1,121           | 30
Konqueror | 6,779           | 39
Amulet    | 88              | 22
UCF       | 2,415           | 47

34 RQ1: Accuracy
Ranking of the correct root cause in the result set (effort required to find the correct root cause)
Other techniques do not rank root causes
  Adapted other techniques to report rank
Quantify a range in the way developers may use the results
  Ranking U = upper bound on effort
  Ranking L = lower bound on effort

35 RQ1: Accuracy Results
FieryEye rank = 7.9; WebSee rank-L = 10.2
FieryEye recall = 100%; WebSee recall = 65.6%

36 RQ1: Accuracy Results
In Y% of cases, the correct root cause is ranked in the top X
Correct root cause in the top 5
  FieryEye: 45% of cases
  WebSee: 5% (U), 10% (L) of cases
  XPERT, TDT: 1% (U and L)

37 RQ2: Computational Resources
FieryEye
  Fast but imprecise
  200 Amazon EC2 instances
  1 c4.large = $0.11 per hour
  Cost = 200 * $0.0018/min * 3
  Model building cost = $1
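
The cost line can be checked by hand. Assuming the trailing factor 3 is the model-building time in minutes (the slide does not label it), the per-minute rate and the total work out as:

```latex
\frac{\$0.11}{60\ \text{min}} \approx \$0.0018/\text{min},
\qquad
200 \times \$0.0018/\text{min} \times 3\ \text{min} = \$1.08 \approx \$1
```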

38 Summary
Technique for finding the root cause of a presentation failure
  Image processing to find visual symptoms
  Probabilistic models to predict root causes
Empirical evaluation shows positive results
  Avg. median correct root cause rank = 7.9
  Prediction time = 17 sec
  Model building cost = $1

39 Thank you
Using Visual Symptoms for Debugging Presentation Failures in Web Applications
Sonal Mahajan, Bailan Li, Pooyan Behnamghader, William G. J. Halfond
Work supported by NSF Grant CCF

40 Ranking U and L
Techniques report HTML elements
  Add the defined CSS properties of each reported element to obtain the set of root causes
WebSee: ranked list of HTML elements
  for e in reported faulty HTML elements {
    if (e == incorrect faulty element) {
      rankingU = rankingU + e.getProps()
      rankingL = rankingL + 1
    } else {
      rankingU = rankingU + e.getProps() / 2
      rankingL = rankingL + e.getProps() / 2
    }
  }
XPERT, TDT: unsorted
  rankingU = rankingU / 2
  rankingL = rankingL / 2

41 Definitions
Root cause: a tuple <e, p>, where e = HTML element and p = CSS property
  Example: <div, margin-top>
[Figure: oracle image and test web page]

42 Definitions
Visual symptom: a boolean predicate describing the visual difference (clues to the root cause)
  Almost matched element: only position changed; e.g., margin-top; detected by sub-image searching
  Shift bottom element: moved downwards; e.g., margin-top; detected by analyzing difference pixels
[Figure: oracle image and test web page]

43 Full Running Example

44 Running example
[Figure: oracle (previous version) and test web page, each showing Menu | Contact, News, Username, Password, Sign in, and About us | Feedback | FAQ]

45 P2. Generate Data Samples – Example
2 HTML elements in the test web page
  <p>: color
  <div>: margin-top
2 potential root causes
  <p, color>
  <div, margin-top>

46 P2. Generate Data Samples – Example
2 potential root causes: <p, color>, <div, margin-top>
3 data samples (injected values)
  <p, color>: blue
  <div, margin-top>: 0px, 50px
Inject data sample 1: <p, color = blue>
1 visual symptom
  Added color

47 P2. Generate Data Samples – Example
2 potential root causes: <p, color>, <div, margin-top>
3 data samples (injected values)
  <p, color>: blue
  <div, margin-top>: 0px, 50px
Inject data sample 2: <div, margin-top = 0px>
2 visual symptoms
  Almost matched element
  Shift top element

48 P2. Generate Data Samples – Example
2 potential root causes: <p, color>, <div, margin-top>
3 data samples (injected values)
  <p, color>: blue
  <div, margin-top>: 0px, 50px
Inject data sample 3: <div, margin-top = 50px>
3 visual symptoms
  Almost matched element
  Shift bottom element
  Page size changed

49 Truth Table – Example
Root Causes       | Injected values | Added color | Almost matched element | Shift top element | Shift bottom element | Page size changed
<p, color>        | blue            | T           | F                      | F                 | F                    | F
<div, margin-top> | 0px, 50px       | F, F        | T, T                   | T, F              | F, T                 | F, T

50 Conditional Probability Table – Example
Starting point: the truth table
Root Causes       | Injected values | Added color | Almost matched element | Shift top element | Shift bottom element | Page size changed
<p, color>        | blue            | T           | F                      | F                 | F                    | F
<div, margin-top> | 0px, 50px       | F, F        | T, T                   | T, F              | F, T                 | F, T

52 Conditional Probability Table – Example
Root Causes       | Injected values | Added color | Almost matched element | Shift top element | Shift bottom element | Page size changed
<p, color>        | blue            | 1.0         | 0.0                    | 0.0               | 0.0                  | 0.0
<div, margin-top> | 0px, 50px       | 0.0         | 1.0                    | 0.5               | 0.5                  | 0.5

53 P(p) Computation – Example
A total of 2 properties in the page: color, margin-top

54 P3. – Example
All root causes, R = {<p, color>, <div, margin-top>}
r = <p, color>
  1. P(p) = 0.5
  2. S = {almost matched element, shift bottom element}
  3. s1 = almost matched element, P(s1|r) = 0.0
     s2 = shift bottom element, P(s2|r) = 0.0
     P(S|r) = 0.0 * 0.0 = 0.0
  4. calculate P(r|S)

55 Visual Symptoms (S) – Example
Almost matched element: only position changed; e.g., margin-top; detected by sub-image searching
Shift bottom element: moved downwards; e.g., margin-top; detected by analyzing difference pixels
[Figure: oracle image and test web page]

56 P3. – Example
All root causes, R = {<p, color>, <div, margin-top>}
r = <p, color>
  1. P(p) = 0.5
  2. S = {almost matched element, shift bottom element}
  3. s1 = almost matched element, P(s1|r) = 0.0
     s2 = shift bottom element, P(s2|r) = 0.0
     P(S|r) = 0.0 * 0.0 = 0.0
  4. P(<p, color> | S) = 0.0 * 0.5 = 0.0

57 P3. – Example
All root causes, R = {<p, color>, <div, margin-top>}
r = <div, margin-top>
  1. P(p) = 0.5
  2. S = {almost matched element, shift bottom element}
  3. s1 = almost matched element, P(s1|r) = 1.0
     s2 = shift bottom element, P(s2|r) = 0.5
     P(S|r) = 1.0 * 0.5 = 0.5
  4. P(<div, margin-top> | S) = 0.5 * 0.5 = 0.25

58 P3. – Example
Rank root causes by their probabilities
✔ P(<div, margin-top>) = 0.25
  P(<p, color>) = 0.0

