Published by Britney Gilmore. Modified over 8 years ago.
1
CSE 522
WCET Analysis
Computer Science & Engineering Department
Arizona State University
Tempe, AZ 85287
Dr. Yann-Hang Lee
yhlee@asu.edu
(480) 727-7507
Some of the slides were based on the lecture by G. Fainekos (ASU)
2
Execution Time – WCET & BCET
(Figure from R. Wilhelm et al., ACM Trans. Embed. Comput. Syst., 2007.)
3
The WCET Problem
Given
- the code for a software task
- the platform (OS + hardware) that it will run on
determine the WCET of the task.
Why is this problem important?
- The WCET is central to the design of real-time computing systems
Can the WCET always be found?
- For arbitrary programs, no: bounding execution time subsumes the halting problem
- For real-time code with bounded loops and recursion, it is not a decidability problem but a complexity problem
- Approach: compute bounds for the execution times of instructions and basic blocks, then determine a longest path in the basic-block graph of the program
4
Components of Execution Time Analysis
Program path (control flow) analysis
- Find the longest path through the program
- Identify feasible paths through the program
- Find loop bounds
- Identify dependencies among different code fragments
Processor behavior analysis
- For small code fragments (basic blocks), generate bounds on run times on the platform
- Model details of the architecture, including cache behavior, pipeline stalls, branch prediction, etc.
The outputs of both analyses feed into each other.
5
Program Path Analysis: Overall Approach (1)
Construct the control-flow graph (CFG) for the task
- Nodes represent basic blocks of the task
- Basic block: a sequence of consecutive program statements in which there is no possibility of branching
- There is a single entry node and a single exit node
- Edges represent flow of control (jumps, branches, calls, …)
The problem is to identify the longest path in the CFG
- Note: the CFG can have loops, so loop bounds must be inferred and the loops unrolled
- This gives a directed acyclic graph (DAG). How do we find the longest path in this DAG?
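One common way to answer the closing question is dynamic programming over a topological order of the DAG. The sketch below assumes that approach; the function name, the diamond-shaped graph, and the cycle costs are made up for illustration and do not come from the slides.

```python
from collections import deque

def longest_path(cost, edges, entry, exit_):
    """Return (total cost, path) of the costliest entry-to-exit path in a DAG.

    cost  : dict mapping block name -> worst-case cost of that block
    edges : dict mapping block name -> list of successor blocks
    """
    preds = {b: [] for b in cost}
    indegree = {b: 0 for b in cost}
    for src, dsts in edges.items():
        for dst in dsts:
            preds[dst].append(src)
            indegree[dst] += 1

    # Kahn's algorithm: emit a block once all of its predecessors are emitted.
    order, ready = [], deque(b for b in cost if indegree[b] == 0)
    while ready:
        block = ready.popleft()
        order.append(block)
        for nxt in edges.get(block, []):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # best[b] = costliest way to reach b from the entry, including b itself.
    best = {entry: (cost[entry], [entry])}
    for block in order:
        candidates = [best[p] for p in preds[block] if p in best]
        if candidates:
            total, path = max(candidates, key=lambda t: t[0])
            best[block] = (total + cost[block], path + [block])
    return best[exit_]

# Hypothetical diamond-shaped CFG with per-block costs in cycles.
example_cost = {"B1": 5, "B2": 10, "B3": 3, "B4": 2}
example_edges = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"]}
result = longest_path(example_cost, example_edges, "B1", "B4")
```

Unlike the longest-path problem on general graphs, this runs in time linear in the number of blocks and edges, which is why unrolling loops into a DAG first is attractive.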
6
Program Path Analysis: Overall Approach (2)
In a CFG
- $B_i$ = basic block $i$
- $x_i$ = number of times block $B_i$ is executed
- $d_j$ = number of times edge $j$ is executed
- $c_i$ = worst-case running time of block $B_i$
Objective: find $\max \sum_i c_i x_i$
How to get $x_i$?
- Structural constraints
- Functionality constraints
- Loop bounds (need to be known)
7
CFG Example
(Example due to Y.T. Li and S. Malik)

    N = 10;
    q = 0;
    while (q < N)
        q++;
    q = r;

Basic blocks
- B1: N = 10; q = 0;
- B2: while (q < N)
- B3: q++;
- B4: q = r;

[Figure: CFG over B1 to B4 with block counts $x_1$ to $x_4$, edge counts $d_1$ to $d_6$, and branch labels 1 and 0]

Want to maximize $\sum_i c_i x_i$ subject to the constraints

    $x_1 = d_1 = d_2$
    $d_1 = 1$
    $x_2 = d_2 + d_4 = d_3 + d_5$
    $x_3 = d_3 = d_4 = 10$
    $x_4 = d_5 = d_6$
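For this small example the constraints pin down every count, so they can be solved by substitution instead of handing them to an ILP solver. The sketch below does that. The per-block costs are hypothetical, since the slide gives no values for $c_i$.

```python
def block_counts(loop_bound=10):
    """Solve the structural and loop-bound constraints of the example CFG."""
    d1 = 1                  # d1 = 1: the task is entered once
    x1 = d1                 # x1 = d1 = d2
    d2 = x1
    d3 = d4 = loop_bound    # x3 = d3 = d4 = 10
    x3 = d3
    x2 = d2 + d4            # x2 = d2 + d4 ...
    d5 = x2 - d3            # ... = d3 + d5
    x4 = d5                 # x4 = d5 = d6
    return {"B1": x1, "B2": x2, "B3": x3, "B4": x4}

# Hypothetical worst-case costs per block, in cycles (not from the slides).
costs = {"B1": 4, "B2": 3, "B3": 2, "B4": 2}
counts = block_counts()
wcet_bound = sum(costs[b] * counts[b] for b in costs)
```

In general the constraints leave some counts free (for example, which branch of an `if` is taken), and the maximization over those free counts is what requires an integer linear programming solver.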
8
CFG – Another example

    /* k >= 0 */
    s = k;
    while (k < 10) {
        if (ok)
            j++;
        else {
            j = 0;
            ok = true;
        }
        k++;
    }
    r = j;

Basic blocks
- B1: s = k;
- B2: while (k < 10)
- B3: if (ok)
- B4: j++;
- B5: j = 0; ok = true;
- B6: k++;
- B7: r = j;

[Figure: CFG over B1 to B7 with block counts $x_1$ to $x_7$ and edge counts $d_1$ to $d_{10}$]
9
Functionality Constraints

    check_data()
    {
        int i, morecheck, wrongone;              /* x1 */
        morecheck = 1; i = 0; wrongone = -1;     /* x2 */
        while (morecheck) {                      /* x3 */
            if (data[i] < 0) {                   /* x4 */
                wrongone = i; morecheck = 0;     /* x5 */
            }
            else
                if (++i >= 10)                   /* x6 */
                    morecheck = 0;               /* x7 */
        }
        if (wrongone >= 0)                       /* x8 */
            return 0;                            /* x9 */
        else
            return 1;                            /* x10 */
    }

Constraints
- $x_2 \le x_4$
- $x_4 \le 10\,x_2$
- $(x_5 = 0 \;\&\; x_7 = 1) \mid (x_5 = 1 \;\&\; x_7 = 0)$
- $x_5 = x_9$
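A quick way to gain confidence in constraints like these is to instrument the code and check them on concrete runs. The sketch below is a Python port of `check_data` written for this purpose (the port, the counter dictionary, and `satisfies_constraints` are illustrative, not part of the slides). It reads the two garbled inequalities as $x_2 \le x_4 \le 10\,x_2$.

```python
def check_data(data):
    """Python port of check_data that also returns execution counts x1..x10."""
    x = dict.fromkeys(range(1, 11), 0)
    x[1] += 1                          # declarations
    x[2] += 1                          # initialization
    morecheck, i, wrongone = 1, 0, -1
    while True:
        x[3] += 1                      # while (morecheck)
        if not morecheck:
            break
        x[4] += 1                      # if (data[i] < 0)
        if data[i] < 0:
            x[5] += 1                  # wrongone = i; morecheck = 0;
            wrongone, morecheck = i, 0
        else:
            x[6] += 1                  # if (++i >= 10)
            i += 1
            if i >= 10:
                x[7] += 1              # morecheck = 0;
                morecheck = 0
    x[8] += 1                          # if (wrongone >= 0)
    if wrongone >= 0:
        x[9] += 1                      # return 0;
        return 0, x
    x[10] += 1                         # return 1;
    return 1, x

def satisfies_constraints(x):
    """Check the loop-bound and functionality constraints from the slide."""
    loop_bounds = x[2] <= x[4] <= 10 * x[2]
    exclusive_exit = (x[5], x[7]) in {(0, 1), (1, 0)}
    return loop_bounds and exclusive_exit and x[5] == x[9]

clean_result, clean_counts = check_data([1] * 10)
bad_result, bad_counts = check_data([1, 1, 1, -1] + [1] * 6)
```

Passing runs do not prove the constraints, but a failing run would show immediately that a constraint is too tight and would make the resulting bound unsafe.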
10
Micro-architectural Modeling -- Cache
- Modify the cost function (cache hits and misses have different costs)
- Add linear constraints to describe the relationship between cache hits and misses
Basic idea
- Basic blocks are assumed to be smaller than the entire cache
- Subdivide instruction counts ($x_i$) into counts of cache hits ($x_i^{hit}$) and misses ($x_i^{miss}$)
- A line-block (or l-block) is a contiguous sequence of code within the same basic block that is mapped to the same cache line in the instruction cache
- Within an l-block, instructions either all hit or all miss
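Written out, the modified objective might look like the sketch below. The per-l-block indices $i.j$, the costs $c_{i.j}^{hit}$ and $c_{i.j}^{miss}$, and the l-block count $n_i$ are notation assumed here for illustration; the slide only states the idea in words.

```latex
\begin{align*}
\text{maximize}\quad
  & \sum_{i} \sum_{j=1}^{n_i}
    \left( c_{i.j}^{\text{hit}}\, x_{i.j}^{\text{hit}}
         + c_{i.j}^{\text{miss}}\, x_{i.j}^{\text{miss}} \right) \\
\text{subject to}\quad
  & x_i = x_{i.j}^{\text{hit}} + x_{i.j}^{\text{miss}},
    \qquad j = 1, \dots, n_i
\end{align*}
```

Every l-block of $B_i$ executes exactly as often as $B_i$ itself, so the second line ties the new hit and miss counts back to the original block counts.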
11
Basic Blocks to Line Blocks (Direct-mapped cache)
[Figure: basic blocks $B_1$, $B_2$, $B_3$ split into l-blocks $B_{1.1}$, $B_{1.2}$, $B_{1.3}$, $B_{2.1}$, $B_{2.2}$, $B_{3.1}$, $B_{3.2}$, colored by the cache set (0 to 3) each maps to]
Cache constraints:
- No conflicting l-blocks: only the first execution has a miss
- Two nonconflicting l-blocks are mapped to the same cache line
- Conflicting l-blocks: hits and misses are affected by the execution sequence
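The split into l-blocks follows mechanically from the code addresses and the cache geometry. The sketch below shows one way to compute it for a direct-mapped cache. The addresses, sizes, line size, and number of sets are hypothetical values chosen so that the result has the same shape as the l-block names on the slide.

```python
from collections import defaultdict

def split_into_lblocks(blocks, line_size, num_sets):
    """Split basic blocks at cache-line boundaries.

    blocks: dict name -> (start_address, size_in_bytes)
    Returns a list of (l-block name, memory line, cache set).
    """
    lblocks = []
    for name, (start, size) in blocks.items():
        first_line = start // line_size
        last_line = (start + size - 1) // line_size
        for k, mem_line in enumerate(range(first_line, last_line + 1), start=1):
            lblocks.append((f"{name}.{k}", mem_line, mem_line % num_sets))
    return lblocks

def classify(lblocks):
    """Pair up l-blocks that share a cache set.

    Same set, different memory line -> conflicting (they evict each other).
    Same set, same memory line      -> nonconflicting (they share the line).
    """
    by_set = defaultdict(list)
    for name, mem_line, cache_set in lblocks:
        by_set[cache_set].append((name, mem_line))
    conflicting, nonconflicting = [], []
    for members in by_set.values():
        for a in range(len(members)):
            for b in range(a + 1, len(members)):
                (n1, l1), (n2, l2) = members[a], members[b]
                (nonconflicting if l1 == l2 else conflicting).append((n1, n2))
    return conflicting, nonconflicting

# Hypothetical layout: 16-byte lines, 4 sets, three contiguous basic blocks.
layout = {"B1": (0, 40), "B2": (40, 24), "B3": (64, 24)}
lblocks = split_into_lblocks(layout, line_size=16, num_sets=4)
conflicting, nonconflicting = classify(lblocks)
```

With this layout, `B1.3` and `B2.1` share a memory line, `B1.1`/`B3.1` and `B1.2`/`B3.2` compete for a cache line, and `B2.2` has its set to itself.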
12
Cache Conflict Graph
- Built for every cache set containing two or more conflicting l-blocks
- Nodes: a start node, an end node, and a node $B_{k.l}$ for every l-block in the cache set
- Edge from $B_{k.l}$ to $B_{m.n}$: control can pass between them without passing through any other l-block of the same cache set
- $p_{(i.j,u.v)}$: the number of times that control passes through that edge
[Figure: cache conflict graph with nodes start, $B_{k.l}$, $B_{m.n}$, end and edge counts $p_{(s,k.l)}$, $p_{(s,m.n)}$, $p_{(s,e)}$, $p_{(k.l,k.l)}$, $p_{(m.n,m.n)}$, $p_{(k.l,m.n)}$, $p_{(m.n,k.l)}$, $p_{(k.l,e)}$, $p_{(m.n,e)}$]
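A sketch of the linear constraints such a graph induces, written in the slide's notation. They follow from flow conservation on the conflict graph and are not spelled out on the slide. The last line additionally assumes a direct-mapped cache and a cache set whose l-blocks all conflict with one another.

```latex
\begin{align*}
x_{k.l} &= \sum_{u.v} p_{(u.v,\,k.l)} = \sum_{u.v} p_{(k.l,\,u.v)}
  && \text{(flow in = flow out; } u.v \text{ ranges over neighbors, including } s \text{ and } e\text{)} \\
\sum_{u.v} p_{(s,\,u.v)} &= 1
  && \text{(the start node is left exactly once)} \\
x_{k.l}^{\text{hit}} &= p_{(k.l,\,k.l)}
  && \text{(a hit needs the same l-block to have run last in this set)}
\end{align*}
```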
13
Cache Constraints Example (1)
[Figure: CFG of the loop example with one l-block per basic block, block counts $x_1$ to $x_7$, and edge counts $d_1$ to $d_{10}$]
- $B_{1.1}$: s = k;
- $B_{2.1}$: while (k < 10)
- $B_{3.1}$: if (ok)
- $B_{4.1}$: j++;
- $B_{5.1}$: j = 0; ok = true;
- $B_{6.1}$: k++;
- $B_{7.1}$: r = j;
14
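Cache Constraints Example (2)
[Figure: two cache conflict graphs, each with start node S and end node E]
- Graph for $B_{4.1}$ and $B_{5.1}$: edge counts $p_{(s,4.1)}$, $p_{(s,5.1)}$, $p_{(s,e)}$, $p_{(4.1,4.1)}$, $p_{(5.1,4.1)}$, $p_{(4.1,5.1)}$, $p_{(5.1,5.1)}$, $p_{(4.1,e)}$, $p_{(5.1,e)}$
- Graph for $B_{1.1}$ and $B_{6.1}$: edge counts $p_{(s,1.1)}$, $p_{(1.1,6.1)}$, $p_{(1.1,e)}$, $p_{(6.1,e)}$, $p_{(6.1,6.1)}$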
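To see what the edge counts mean, it can help to derive them from one concrete execution. The sketch below counts conflict-graph edge traversals for a single cache set from a trace of l-block executions, then reads off hits and misses. It assumes a direct-mapped cache in which the l-blocks of the set all conflict. The trace is a hypothetical run, and the function names are illustrative.

```python
from collections import Counter

def edge_counts(trace):
    """Count conflict-graph edge traversals for one cache set.

    trace: the l-blocks of this cache set, in execution order.
    Returns a Counter keyed by (from, to), with 's' and 'e' as start and end.
    """
    nodes = ["s", *trace, "e"]
    return Counter(zip(nodes, nodes[1:]))

def hits_and_misses(trace):
    """An execution hits only if the same l-block ran last in this set,
    so the hit count of an l-block equals its self-loop count."""
    p = edge_counts(trace)
    executions = Counter(trace)
    hits = {b: p[(b, b)] for b in executions}
    misses = {b: executions[b] - hits[b] for b in executions}
    return hits, misses

# Hypothetical run of the loop example: `ok` is false in the first of ten
# iterations, so B5.1 executes once and B4.1 executes nine times.
trace = ["B5.1"] + ["B4.1"] * 9
p = edge_counts(trace)
hits, misses = hits_and_misses(trace)
```

The analysis itself never sees a trace. It treats the edge counts as ILP variables and lets the solver pick the worst combination that the structural and functionality constraints allow.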
15
Progress During the Past 10 Years
[Figure: chart of over-estimation and cache-miss penalty for Lim et al., Thesing et al., and Souyris et al. over the years 1995, 2002, 2005; over-estimation values shown: 20-30%, 15%, 30-50%, 10%, 25%; cache-miss penalty values shown: 4, 25, 60, 200]
The explosion of penalties has been compensated by a reduction of uncertainties!
16
Open Problems
- Architectures are getting much more complex
  - Can we create processor behavior models without the pain?
  - Can we change the architecture to make timing analysis easier?
- Small changes to code and/or architecture require completely redoing the WCET computation
  - Use robust techniques that learn about processor/platform behavior
- Need more reliable ways to measure execution time
References:
- Li, Malik, and Wolfe, “Cache Modeling for Real-Time Software: Beyond Direct Mapped Instruction Caches”
- Wilhelm, “Determining bounds on execution times,” Handbook on Embedded Systems, CRC Press, 2005