2019/2/25 chapter25.



5 Spacing of C ≤ d(p, q) ≤ d*.

6 An Example: K=3.

7 4.4 Single-Source Shortest Paths
Problem definition
Shortest paths and relaxation
Dijkstra’s algorithm (can be viewed as a greedy algorithm)

8 Problem Definition:
Real problem: A motorist wishes to find the shortest possible route from Chicago to Boston. Given a road map of the United States on which the distance between each pair of adjacent intersections is marked, how can we determine this shortest route?
Formal definition: Given a graph G=(V, E, W), where each edge has a weight, find a shortest path from s to v for given vertices s and v, where s is the source and v is the destination.

9 Find a shortest path from station A to station B.
- Needs serious thinking to get a correct algorithm.

10 The cost of the shortest path from s to v is denoted as δ(s, v).
The weight of path p=<v0,v1,…,vk> is the sum of the weights of its constituent edges: $w(p) = \sum_{i=1}^{k} w(v_{i-1}, v_i)$.

11 Negative-weight edges:
Edge weights may be negative. A negative-weight cycle is a cycle (circuit) whose total weight is negative.
If no negative-weight cycle is reachable from the source s, then for all v ∈ V the shortest-path weight δ(s, v) remains well defined, even if it has a negative value.
If there is a negative-weight cycle on some path from s to v, we define δ(s, v) = −∞.

12 Figure 1: Negative edge weights in a directed graph. Shown within each vertex is its shortest-path weight from source s. Because vertices e and f form a negative-weight cycle reachable from s, they have shortest-path weights of −∞. Because vertex g is reachable from a vertex whose shortest-path weight is −∞, it, too, has a shortest-path weight of −∞. Vertices such as h, i, and j are not reachable from s, and so their shortest-path weights are ∞, even though they lie on a negative-weight cycle.

13 Representing shortest paths:
We maintain for each vertex v ∈ V a predecessor π[v], the vertex right before v on the shortest path. With the values of π, a backtracking process can give the shortest path. (We will discuss that after the algorithm is given.)

14 Observation (basic): Suppose that a shortest path p from a source s to a vertex v can be decomposed into s ⇝ u → v for some vertex u, where p’ is the subpath from s to u. Then the weight of a shortest path from s to v is δ(s, v) = δ(s, u) + w(u, v).
We do not know which vertex u is for v, but we know u is in V and we can try all nodes in V in O(n) time.
Also, if u does not exist, the edge (s, v) is the shortest path.
Question: how do we find δ(s, u), the first shortest path from s to some node?

15 Relaxation: The process of relaxing an edge (u,v) consists of testing whether we can improve the shortest path to v found so far by going through u and, if so, updating d[v] and π[v].
RELAX(u,v,w)
  if d[v] > d[u]+w(u,v)
    then d[v] ← d[u]+w(u,v)   (based on the observation)
         π[v] ← u

16 Figure 2: Relaxation of an edge (u,v) with w(u,v)=2. The shortest-path estimate of each vertex is shown within the vertex.
(a) Before RELAX(u,v): d[u]=5, d[v]=9. Because d[v] > d[u]+w(u,v) prior to relaxation, the value of d[v] decreases to 7.
(b) Before RELAX(u,v): d[u]=5, d[v]=6. Here d[v] ≤ d[u]+w(u,v) before the relaxation step, so d[v] is unchanged by relaxation.
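
The two cases of Figure 2 can be replayed in a few lines of Python. This is only a sketch: the dictionaries `d` and `pi` and the returned flag are illustrative choices, not part of the slides, and the numbers (d[u]=5, w(u,v)=2, d[v]=9 or 6) are read off the figure.

```python
def relax(u, v, w, d, pi):
    """Relax edge (u, v) of weight w; return True if d[v] improved."""
    if d[v] > d[u] + w:
        d[v] = d[u] + w   # a shorter route to v goes through u
        pi[v] = u         # remember u as the predecessor of v
        return True
    return False

# Case (a): d[v] = 9 is larger than 5 + 2, so the estimate is lowered.
d_a, pi_a = {"u": 5, "v": 9}, {"u": None, "v": None}
changed_a = relax("u", "v", 2, d_a, pi_a)

# Case (b): d[v] = 6 is already at most 5 + 2, so nothing changes.
d_b, pi_b = {"u": 5, "v": 6}, {"u": None, "v": None}
changed_b = relax("u", "v", 2, d_b, pi_b)
```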

17 Initialization: For each vertex v ∈ V, d[v] denotes an upper bound on the weight of a shortest path from source s to v. d[v] will be δ(s, v) after the execution of the algorithm. Initialize d[v] and π[v] as follows:
INITIALIZE-SINGLE-SOURCE(G,s)
  for each vertex v ∈ V[G]
    do d[v] ← ∞
       π[v] ← NIL
  d[s] ← 0

18 Dijkstra’s Algorithm:
Dijkstra’s algorithm assumes that w(e) ≥ 0 for each edge e in the graph.
Maintain a set S of vertices such that for every vertex v ∈ S, d[v]=δ(s, v), i.e., the shortest path from s to v has been found. (Initial values: S=∅, d[s]=0 and d[v]=∞ for every other vertex v.)
(a) Select the vertex u ∈ V−S such that d[u]=min{d[x] | x ∈ V−S}. Set S=S ∪ {u}.
(b) For each node v adjacent to u do RELAX(u, v, w).
Repeat steps (a) and (b) until S=V.

19 Continue:
DIJKSTRA(G,w,s)
  INITIALIZE-SINGLE-SOURCE(G,s)
  S ← ∅
  Q ← V[G]
  while Q ≠ ∅
    do u ← EXTRACT-MIN(Q)
       S ← S ∪ {u}
       for each vertex v ∈ Adj[u]
         do RELAX(u,v,w)

20 Implementation: A priority queue Q stores the vertices in V−S, keyed by their d[] values. The graph G is represented by adjacency lists.
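
A minimal Python sketch of this implementation, using the standard `heapq` module as the priority queue and a dictionary of adjacency lists. Two things here are assumptions rather than material from the slides. First, `heapq` has no decrease-key operation, so the sketch pushes a new entry on every successful relaxation and skips stale entries when they are popped, instead of loading all of V into Q up front. Second, the edge weights of `graph` are hypothetical, chosen so that the results agree with the d/π labels of the worked example.

```python
import heapq
import math

def dijkstra(adj, s):
    """adj maps each vertex to a list of (neighbor, weight) pairs, weights >= 0."""
    # INITIALIZE-SINGLE-SOURCE
    d = {v: math.inf for v in adj}
    pi = {v: None for v in adj}
    d[s] = 0
    S = set()                      # vertices whose distance is final
    Q = [(0, s)]                   # heap of (distance estimate, vertex)
    while Q:
        du, u = heapq.heappop(Q)   # EXTRACT-MIN
        if u in S:
            continue               # stale entry from an earlier relaxation
        S.add(u)
        for v, w in adj[u]:
            if d[v] > du + w:      # RELAX(u, v, w)
                d[v] = du + w
                pi[v] = u
                heapq.heappush(Q, (d[v], v))
    return d, pi

# Hypothetical edge weights (an assumption, see the note above).
graph = {
    "s": [("u", 10), ("x", 5)],
    "u": [("v", 1), ("x", 2)],
    "v": [("y", 4)],
    "x": [("u", 3), ("v", 9), ("y", 2)],
    "y": [("s", 7), ("v", 6)],
}
dist, pred = dijkstra(graph, "s")
```

With a binary heap each push and pop costs O(log n), and the heap may hold up to |E| entries because of the lazy deletion.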

21 Single Source Shortest Path Problem
[Figure (a): the example graph on vertices s, u, v, x, y with source s. Each vertex is labelled d/π; initially every vertex other than s has d = ∞.]

22 [Figure (b): after processing s. Labels: u 10/s, x 5/s; v and y are still ∞.]
(s,x) is the shortest path using one edge. It is also the shortest path from s to x.

23 [Figure (c): after processing x. Labels: x 5/s, y 7/x, u 8/x, v 14/x.]

24 [Figure (d): after processing y. Labels: x 5/s, y 7/x, u 8/x, v 13/y.]

25 [Figure (e): after processing u. Labels: x 5/s, y 7/x, u 8/x, v 9/u.]

26 Backtracking: v-u-x-s
[Figure (f): final labels: x 5/s, y 7/x, u 8/x, v 9/u.]
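
The backtracking step can be sketched as follows. The function name `backtrack` and the dictionary `pi` are illustrative; the predecessor values are the ones shown after the "/" in the final labels of the example (u from x, v from u, x from s, y from x).

```python
def backtrack(pi, s, v):
    """Follow predecessor links from v back to s; return the path from s to v."""
    path = [v]
    while path[-1] != s:
        prev = pi[path[-1]]
        if prev is None:
            return None        # no predecessor chain back to s: v is unreachable
        path.append(prev)
    path.reverse()             # links are followed backwards, so flip at the end
    return path

pi = {"s": None, "u": "x", "v": "u", "x": "s", "y": "x"}
route = backtrack(pi, "s", "v")
```

The links are visited in the order v, u, x, s, so the list is reversed before it is returned.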

27 Theorem: Consider the set S at any time in the algorithm’s execution. For each v ∈ S, the path Pv is a shortest s-v path.
Proof: We prove it by induction on |S|.
If |S|=1, then the theorem holds. (Because d[s]=0 and S={s}.)
Suppose that the theorem is true for |S|=k for some k>0.
Now, we grow S to size k+1 by adding the node v.

28 Proof (continued): Now, we grow S to size k+1 by adding the node v.
Let (u, v) be the last edge on our s-v path Pv.
Consider any other s-v path P: s, …, x, y, …, v (red in the figure), where y is the first node on P that is not in S and x ∈ S is the node just before y.
Since we always select the node with the smallest value d[] in the algorithm, we have d[v] ≤ d[y]. Moreover, the length of each edge is ≥ 0, and the part of P from s to y has length at least d[y] because the edge (x, y) was relaxed when x joined S. Thus, the length of P ≥ d[y] ≥ d[v]. That is, the length of any s-v path is ≥ d[v]. Therefore, our path Pv is the shortest.
If y does not exist, d[v] is the smallest length for paths from s to v using red nodes (nodes in S) only, since we relaxed the edges from every red node to v.
[Figure: the set S containing s, x and u, with y and v outside S.]

29 The algorithm does not work if there are negative-weight edges in the graph.
[Figure: edges s→u of weight 2, s→v of weight 1, and u→v of weight −10.]
s->v is shorter than s->u, but it is longer than s->u->v.
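
A small Python sketch of this failure. It assumes the figure is read as w(s,u)=2, w(s,v)=1 and w(u,v)=−10, and it uses a variant of the algorithm that never touches a vertex again once it has joined S, which is the invariant the correctness proof relies on. The name `dijkstra_frozen` is made up for this illustration.

```python
import heapq
import math

def dijkstra_frozen(adj, s):
    """Dijkstra variant that treats d[v] as final once v has joined S."""
    d = {v: math.inf for v in adj}
    d[s] = 0
    S = set()
    Q = [(0, s)]
    while Q:
        du, u = heapq.heappop(Q)
        if u in S:
            continue
        S.add(u)
        for v, w in adj[u]:
            if v not in S and d[v] > du + w:
                d[v] = du + w
                heapq.heappush(Q, (d[v], v))
    return d

# Assumed reading of the figure: s->u = 2, s->v = 1, u->v = -10.
graph = {"s": [("u", 2), ("v", 1)], "u": [("v", -10)], "v": []}
d = dijkstra_frozen(graph, "s")
true_cost = 2 + (-10)   # weight of the path s->u->v
```

Here v is selected first with d[v]=1 and frozen, so the cheaper route through u (weight −8) is never recorded.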

30 Time complexity of Dijkstra’s Algorithm:
Time complexity depends on the implementation of the queue.
Method 1: Use an array to store the queue.
EXTRACT-MIN(Q) takes O(|V|) time. In total there are |V| EXTRACT-MIN(Q) operations, so the time for them is O(|V|^2).
RELAX(u,v,w) takes O(1) time. In total |E| RELAX(u,v,w) operations are required, so the time for them is O(|E|).
Total time required is O(|V|^2+|E|)=O(|V|^2).
Backtracking with π[] gives the shortest path in reverse order.
Method 2: The priority queue is implemented as a heap. With a binary heap, EXTRACT-MIN(Q) and each decrease of a key take O(log n) time, so the total running time is O((|V|+|E|) log n), which is O(|E| log n) when every vertex is reachable from s. With a Fibonacci heap, decreasing a key takes O(1) amortized time and the total running time is O(|V| log n+|E|).

31 Huffman codes and Data Compression
Binary character code: each character is represented by a unique binary string.
A data file can be coded in two ways:

                       a    b    c    d    e     f
frequency (%)          45   13   12   16   9     5
fixed-length code      000  001  010  011  100   101
variable-length code   0    101  100  111  1101  1100

The first way needs 100×3=300 bits (for a file of 100 characters).
The second way needs 45×1+13×3+12×3+16×3+9×4+5×4=224 bits.

32 Variable-length code
Need some care to read the code.
(codeword: a=0, b=00, c=01, d=11.)
Where to cut? 00 can be read as either aa or b.
Prefixes of 0011: 0, 00, 001, and 0011.
Prefix codes: no codeword is a prefix of some other codeword (prefix-free).
Prefix codes are simple to encode and decode.
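
The prefix-free condition is easy to test mechanically. The following Python sketch (the function name `is_prefix_free` is an illustrative choice) sorts the codewords and compares neighbours, which is enough because in sorted order a codeword is immediately followed by a word that starts with it whenever it is a prefix of any other codeword.

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(codewords)
    # Words sharing a prefix are contiguous in sorted order, so checking
    # adjacent pairs is sufficient.
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))

ambiguous = ["0", "00", "01", "11"]                   # a, b, c, d from this slide
table_code = ["0", "101", "100", "111", "1101", "1100"]  # variable-length code
ambiguous_ok = is_prefix_free(ambiguous)
table_ok = is_prefix_free(table_code)
```

The first code fails the check because 0 is a prefix of 00 and 01; the variable-length code from the frequency table passes it.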

33 Using the codewords in the table to encode and decode
Encode: abc = 0·101·100 = 0101100 (just concatenate the codewords.)
Decode: 001011101 = 0·0·101·1101 = aabe

                       a    b    c    d    e     f
frequency (%)          45   13   12   16   9     5
fixed-length code      000  001  010  011  100   101
variable-length code   0    101  100  111  1101  1100

34 Encode: abc = 0·101·100 = 0101100 (just concatenate the codewords.)
Decode: 001011101 = 0·0·101·1101 = aabe (use the binary tree on the right.)
[Figure: two binary trees with leaves a:45, b:13, c:12, d:16, e:9, f:5; each internal node is labelled with the sum of the frequencies below it. Left: tree for the fixed-length codewords. Right: tree for the variable-length codewords.]
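
Encoding and decoding with the variable-length code can be sketched in Python as below. The decoder scans the bits and emits a character as soon as the bits read so far match a codeword, which plays the role of walking the tree from the root to a leaf. The names `CODE`, `encode` and `decode` are illustrative.

```python
CODE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}

def encode(text, code=CODE):
    """Concatenate the codewords of the characters in text."""
    return "".join(code[ch] for ch in text)

def decode(bits, code=CODE):
    """Decode a bit string produced with a prefix-free code."""
    inverse = {word: ch for ch, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:      # prefix-free: the first match is the only match
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)

encoded = encode("abc")
decoded = decode("001011101")
```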

35 Binary tree: every non-leaf node has at most two children.
The fixed-length code in our example is not optimal.
The total number of bits required to encode a file is $B(T) = \sum_{c \in C} f(c)\, d_T(c)$, where
f(c) is the frequency (number of occurrences) of c in the file, and
dT(c) is the depth of c’s leaf in the tree T (the length of c’s codeword).
a(45, 0), b(13, 101), c(12, 100), d(16, 111), e(9, 1101), f(5, 1100)
Total length: 45×1+13×3+12×3+16×3+9×4+5×4=224 bits.
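
The cost of a code is the sum, over all characters, of frequency times codeword length. A short Python check of the two totals for the example, with `cost` as an illustrative helper name:

```python
freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
fixed = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100", "f": "101"}
variable = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}

def cost(freq, code):
    """Total bits: sum of frequency times codeword length (leaf depth)."""
    return sum(freq[c] * len(code[c]) for c in freq)

fixed_bits = cost(freq, fixed)        # 100 characters at 3 bits each
variable_bits = cost(freq, variable)  # 45*1 + 13*3 + 12*3 + 16*3 + 9*4 + 5*4
```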

36 Constructing an optimal code
Formal definition of the problem:
Input: a set of characters C={c1, c2, …, cn}, where each c ∈ C has frequency f[c].
Output: a binary tree representing codewords so that the total number of bits required for the file is minimized.
Huffman proposed a greedy algorithm to solve the problem.

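37 [Figure: steps (a) and (b) of Huffman’s algorithm on the example.]
(a) Initial queue, sorted by frequency: f:5, e:9, c:12, b:13, d:16, a:45.
(b) f:5 and e:9 are merged into a node of frequency 14. Queue: c:12, b:13, 14, d:16, a:45.

38 [Figure: steps (c) and (d).]
(c) c:12 and b:13 are merged into a node of frequency 25. Queue: 14, d:16, 25, a:45.
(d) 14 and d:16 are merged into a node of frequency 30. Queue: 25, 30, a:45.

39 [Figure: steps (e) and (f).]
(e) 25 and 30 are merged into a node of frequency 55. Queue: a:45, 55.
(f) a:45 and 55 are merged into the root of frequency 100. This is the final tree.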

40 HUFFMAN(C)
1 n:=|C|
2 Q:=C
3 for i:=1 to n-1 do
4   z:=ALLOCATE_NODE()
5   x:=left[z]:=EXTRACT_MIN(Q)
6   y:=right[z]:=EXTRACT_MIN(Q)
7   f[z]:=f[x]+f[y]
8   INSERT(Q,z)
9 return EXTRACT_MIN(Q)

41 The Huffman Algorithm
This algorithm builds the tree T corresponding to the optimal code in a bottom-up manner.
C is a set of n characters, and each character c in C has a defined frequency f[c].
Q is a priority queue, keyed on f, used to identify the two least-frequent objects to merge together.
The result of the merger is a new object (internal node) whose frequency is the sum of the frequencies of the two objects.
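
The procedure can be sketched in Python with `heapq` as the priority queue. The tuple layout of a node, the tie-breaking counter, and the convention that a left edge is 0 and a right edge is 1 are assumptions made for this illustration and are not specified in the slides.

```python
import heapq
from itertools import count

def huffman(freq):
    """Return (root, code), where code maps each character to its codeword."""
    tie = count()  # tie-breaker so the heap never has to compare tree nodes
    # A node is (frequency, tie, character or None, left child, right child).
    Q = [(f, next(tie), ch, None, None) for ch, f in freq.items()]
    heapq.heapify(Q)
    for _ in range(len(freq) - 1):
        x = heapq.heappop(Q)              # least frequent object
        y = heapq.heappop(Q)              # second least frequent object
        heapq.heappush(Q, (x[0] + y[0], next(tie), None, x, y))
    root = heapq.heappop(Q)

    code = {}
    def walk(node, prefix):
        _, _, ch, left, right = node
        if ch is not None:
            code[ch] = prefix or "0"      # a one-character alphabet gets "0"
        else:
            walk(left, prefix + "0")
            walk(right, prefix + "1")
    walk(root, "")
    return root, code

freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
root, code = huffman(freq)
total_bits = sum(freq[c] * len(code[c]) for c in freq)
```

Other tie-breaking or left/right conventions can produce different codewords with the same total length.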

42 Time complexity
Lines 4-8 are executed n-1 times.
Each heap operation in Lines 4-8 takes O(lg n) time.
Total time required is O(n lg n).
Note: The details of heap operations will not be tested. The time complexity O(n lg n) should be remembered.

43 Another example: e:4, a:6, c:6, b:9, d:11
[Figure: first step of the construction.]
e:4 and a:6 are merged into a node of frequency 10. Queue: c:6, b:9, 10, d:11.

44 [Figure: next two steps.]
c:6 and b:9 are merged into a node of frequency 15. Queue: 10, d:11, 15.
10 and d:11 are merged into a node of frequency 21. Queue: 15, 21.

45 [Figure: final tree.]
15 and 21 are merged into the root of frequency 36.

46 Correctness of Huffman’s Greedy Algorithm (Fun Part, not required)
Again, we use our general strategy.
Let x and y be the two characters in C having the lowest frequencies (the first two characters selected in the greedy algorithm).
We will show the two properties:
1. There exists an optimal solution Topt (binary tree representing codewords) such that x and y are siblings in Topt.
2. Let z be a new character with frequency f[z]=f[x]+f[y] and C’=C−{x, y}∪{z}. Let T’ be an optimal tree for C’. Then we can get Topt from T’ by replacing the leaf z with an internal node z whose children are x and y.

47 Proof of Property 1
[Figure: Topt, in which b and c are the lowest siblings, and Tnew, obtained by swapping x with b and y with c.]
Look at the lowest siblings in Topt, say, b and c. Exchange x with b and y with c.
B(Topt)−B(Tnew) = f[x]d[x]+f[y]d[y]+f[b]d[b]+f[c]d[c] − f[x]d[b]−f[y]d[c]−f[b]d[x]−f[c]d[y]
= (d[b]−d[x])(f[b]−f[x]) + (d[c]−d[y])(f[c]−f[y]) ≥ 0,
where d[·] denotes depth in Topt.
Since f[x] and f[y] are the smallest, (f[b]−f[x]) ≥ 0 and (f[c]−f[y]) ≥ 0. Moreover, b and c are at the bottom of Topt, so d[b]−d[x] ≥ 0 and d[c]−d[y] ≥ 0. (Draw an example.)
Hence B(Tnew) ≤ B(Topt), so Tnew is also optimal and has x and y as siblings.

48 Proof of Property 1 (example)
[Figure: Tnew and Topt for an example with f[x]=f[y]=6, f[b]=8, f[c]=9, d[x]=1, d[y]=2, d[b]=d[c]=3.]
B(Topt)−B(Tnew) = f[x]d[x]+f[y]d[y]+f[b]d[b]+f[c]d[c] − f[x]d[b]−f[y]d[c]−f[b]d[x]−f[c]d[y]
= (6×1+6×2+8×3+9×3) − (6×3+6×3+8×1+9×2)
= 69 − 62 = 7 > 0.

49 Proof of Property 2
Let z be a new character with frequency f[z]=f[x]+f[y] and C’=C−{x, y}∪{z}. Let T(C’) be an optimal tree for C’. Then we can get Topt(C) for C from T(C’) by replacing the leaf z with an internal node z whose children are x and y.
Proof: Let T(C) be the tree obtained from T(C’) by replacing z with the three nodes z, x and y. Then
B(T(C))=B(T(C’))+f[x]+f[y] … (1)
(the codewords for x and y are 1 bit longer than that of z.)
Similarly, let Topt(C) be an optimal tree with x and y as siblings. T’(C’) is obtained from Topt(C) by deleting x and y and labeling the parent of x and y as z. Then
B(Topt(C))=B(T’(C’))+f[x]+f[y] … (2)
From (1) and (2), since B(T(C’)) ≤ B(T’(C’)), we have B(T(C)) ≤ B(Topt(C)), so T(C) is also optimal.

50 B(T’) ≤ B(T’’) by the definition of T’
[Figure: the four trees T(C’), T’(C’), Topt(C) and T(C); x and y are sibling leaves in Topt(C) and in T(C).]

51 Challenge Problem 1
Let us design a keyboard for a mechanical hand. The keyboard has 26 letters A, B, …, Z arranged in one row. The hand is always at the left end of the row and it comes back to the left end after pressing a key. Assume that we know the frequency of every letter. Design the order of the 26 letters in the row such that the average length of movement of the mechanical hand is minimized. Prove that your solution is correct.
No mark will be given. I will remember who did it. If you can solve this problem, then the chance that you can get A+ is high.
If you can do it, send your solution by email with the subject: CS 3335 Challenge Problem 1 before Oct. 20. We can discuss the solution after Oct. 20.

