Download presentation
Presentation is loading. Please wait.
Published byTrevon Daniel Modified over 10 years ago
1
Xiang Fu Hofstra University Chung-Chih Li Illinois State University 04/13/20101NFM 2010
2
Background Hacker Server malicious scripts Cool page! 04/13/2010NFM 20102 Problem? SufficientText Inputs Lack of Sufficient Sanitation of Text Inputs
3
One Typical Error 1 <?php 2 $msg = $_POST[msg]; 3 $sanitized = pregreplace( 4/\.*?\ / i, 5, 6$msg ) ; 7 savetodb($sanitized ) 8 ?> 04/13/20103NFM 2010 script>alert(a) Attackers Input alert(a) Reluctant Kleene Star
4
Bigger Picture Objective: Automatic Discovery of Vulnerabilities 04/13/20104NFM 2010 Symbolic Execution Test Replayer Bytecode Attack Pattern String Constraint Solver SUSHI
5
Our Contribution Atomic Replacement Constraints Consider Two Semantics Greedy Reluctant Modeling Using Finite State Transducer (FST) Compact Representation of FST Security Analysis 04/13/2010NFM 20105
6
Finite State Transducer Accepts Regular Relation Union, Concat, Composition Intersection, Complement Used for Modeling Rewriting Rules [Kaplan94, Karttunen96] 04/13/2010NFM 20106 ε:1 1 2 3 4 a:2 b:3 A (ab,123) L(A)
7
Hierarchical FST & Modeling Declarative Semantics 04/13/2010NFM 20107 Id(* - * r *)r : ω ε:εε:ε Id(* - * r *) 1 2 3 4 Identical Relation Any String not Containing patter r Goal: Regular Search Pattern Replacement
8
Modeling Reluctant Semantics 2 Steps Mark the beginning of pattern Do the replacement 04/13/2010NFM 20108 Goal: Key: Left-Most Matching
9
04/13/2010NFM 2010 9 a a b b c d a b c a b d Input Word a + b + c x Search Pattern #: ε reluc(r) # : ω ε: ε Id() f1f1 s1s1 s2s2 Begin Marker # a # a b b c d # a b c a b d x d x a b d
10
The Challenge: Begin Marker 04/13/2010NFM 201010 a a b b c d a b c a b d Input Word ### a + b + c x Search Pattern # Look-ahead Capability? Non-determinism 3 Steps: (1)End marker (2)Generic end marker (3)Begin marker
11
Preliminary End Marker 04/13/2010NFM 201011 1 c: c 5 234 b: b a: a ε:$ b : b a: a A1A1 a + b + c x Search Pattern Idea: Start with End Marker for Reverse of Search Pattern Problem: Input tape accepts cb + a + only! Reversed Pattern cb + a +
12
Generic End Marker 04/13/2010NFM 201012 1 1 2 2,1 3 3,1 4 4,1 5 5,1 c:cb:ba:aε:$ b:b a:a c:c a:a b:b c:cb:b A2A2 cb + a + Pattern c c b a a Input Word c c b a $ a $ Output Word Deterministic! a:a
13
Finally, the Begin Marker 04/13/2010NFM 201013 a + b + c x Search Pattern 1 1 2 2,1 3 3,1 4 4,1 5 5,1 c:c b:ba:aε:# b:b a:a c:c a:a b:b c:cb:b A3A3 0 ε:εε:ε ε:εε:ε ε:εε:ε
14
04/13/2010NFM 2010 14 a a b b c d a b c a b d Input Word a + b + c x Search Pattern #: ε reluc(r) # : ω ε: ε Id() f1f1 s1s1 s2s2 Begin Marker # a # a b b c d # a b c a b d x d x a b d
15
Greedy Semantics 04/13/2010NFM 201015 Goal: greedy Challenge: Look-ahead longest match
16
04/13/2010NFM 201016 Step 1: Begin Marker Step 2: ND End Marker Step 3: Pairing Markers Step 4: Checking Match Step 5: Check Longest Step 6: Replacement a + x Search Pattern aabab #a#ab#ab #a#a$b#ab #a$#a$b#a$b #a#a$b#a$b #aa$b#a$b xbxb #a#ab#a$b #aaba$b
17
Applications Solve String Constraints 04/13/2010NFM 201017 Login Servlet Input: user name After filtering single quote and length restriction
18
Solving Atomic Constraint 04/13/2010NFM 201018 Goal: A1Id(P) Project to Input Tape Solution
19
SUSHI Constraint Solver Solves Simple Linear String Constraints (SISE) Relies on dk.brics.automaton for FSA operations Self-made Java package for FST operations Supports 16-bit Unicode Compact Transition Representation 04/13/2010NFM 201019
20
Efficiency of Solver 04/13/2010NFM 201020 Benchmark Equations 1 2 3 4 Login Servlet 1.4 Seconds on 2Ghz PC Flex SDK XSS Attack Equation Size: 565 74 Seconds Shorter than Security Track #1022748
21
Related Work Forward String Analysis Christensen & Møller [SAS03] Wasserman & Su [PLDI07, ICSE08] Bjørner & Tillmann [TACAS09] Backward String Analysis Kiezun & Ganesh [ISSTA09] Yu & Bultan [SPIN08, ASE09] Fu [COMPSAC07, TAVWEB08] Natural Language Processing * Kaplan and Kay [CL1994] 04/13/2010NFM 201021 Our Contribution: Precise Modeling of Various Regular Substitution Semantics
22
Limitations SISE String Constraints All Variables Appear on LHS (Once) No Easy Solution for Equation System Yet No string length Future Directions Encoding string length in automata Finite model on bit-vector 04/13/2010NFM 201022
23
Questions? 04/13/2010NFM 201023
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.