2019/2/25 chapter25.

Spacing of C ≤ d(p, q) ≤ d*.

An Example K=3.

4.4 Single-Source Shortest Paths
Problem Definition
Shortest paths and Relaxation
Dijkstra’s algorithm (can be viewed as a greedy algorithm)

Problem Definition:
Real problem: A motorist wishes to find the shortest possible route from Chicago to Boston. Given a road map of the United States on which the distance between each pair of adjacent intersections is marked, how can we determine this shortest route?
Formal definition: Given a graph G=(V, E, W), where each edge has a weight, find a shortest path from s to v for given vertices s and v of interest.
s: source. v: destination.

Find a shortest path from station A to station B.
This needs serious thinking to get a correct algorithm.

The weight of path p = <v0, v1, …, vk> is the sum of the weights of its constituent edges:
$w(p) = \sum_{i=1}^{k} w(v_{i-1}, v_i)$
The cost of the shortest path from s to v is denoted as δ(s, v).

Negative-Weight edges:
Edge weights may be negative. A negative-weight cycle is a cycle (circuit) whose total weight is negative. If no negative-weight cycle is reachable from the source s, then for all v ∈ V, the shortest-path weight δ(s, v) remains well defined, even if it has a negative value. If there is a negative-weight cycle on some path from s to v, we define δ(s, v) = -∞.

Figure 1: Negative edge weights in a directed graph. Shown within each vertex is its shortest-path weight from source s. Because vertices e and f form a negative-weight cycle reachable from s, they have shortest-path weights of -∞. Because vertex g is reachable from a vertex whose shortest-path weight is -∞, it, too, has a shortest-path weight of -∞. Vertices such as h, i, and j are not reachable from s, and so their shortest-path weights are ∞, even though they lie on a negative-weight cycle.

Representing shortest paths:
We maintain for each vertex v ∈ V a predecessor π[v], that is, the vertex right before v on the shortest path. With the values of π, a backtracking process can give the shortest path. (We will discuss that after the algorithm is given.)
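The backtracking process can be sketched in Python as follows. This is a minimal sketch, not the slides' own code: it assumes the predecessors are kept in a dictionary called `pred` (a hypothetical name standing in for π), with `None` playing the role of NIL. The sample values are the predecessor labels from the worked example later in these slides.

```python
def backtrack_path(pred, s, v):
    """Follow predecessor links from v back to s and return the path s..v.

    pred maps each vertex to the vertex right before it on a shortest path
    (None for the source or for an unreached vertex). Returns None when v
    cannot be traced back to s.
    """
    path = []
    current = v
    while current is not None:
        path.append(current)
        if current == s:
            path.reverse()  # links were followed backwards, so flip the order
            return path
        current = pred.get(current)
    return None  # ran out of predecessors before reaching s


# Predecessor labels from the worked example: pi[x]=s, pi[u]=x, pi[v]=u, pi[y]=x.
pred = {"s": None, "x": "s", "u": "x", "v": "u", "y": "x"}
print(backtrack_path(pred, "s", "v"))  # ['s', 'x', 'u', 'v']
```

The links are followed in the order v, u, x, s, so the list is reversed once at the end to report the path from the source.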

Observation: (basic) Suppose that a shortest path p from a source s to a vertex v can be decomposed into s ⇝ u → v for some vertex u, where p’ is the part of p from s to u. Then, the weight of a shortest path from s to v is δ(s, v) = δ(s, u) + w(u, v).
We do not know which vertex u is for v, but we know u is in V, and we can try all nodes in V in O(n) time. Also, if no such intermediate u exists, the edge (s, v) itself is the shortest path.
Question: how do we find δ(s, u), the first shortest path from s to some node?

Relaxation: The process of relaxing an edge (u,v) consists of testing whether we can improve the shortest path to v found so far by going through u and, if so, updating d[v] and π[v].
RELAX(u,v,w)
  if d[v] > d[u]+w(u,v)
    then d[v] ← d[u]+w(u,v)   (based on the observation)
         π[v] ← u

Figure 2: Relaxation of an edge (u,v). The shortest-path estimate of each vertex is shown within the vertex.
(a) d[u]=5, d[v]=9, w(u,v)=2. Because d[v] > d[u]+w(u,v) prior to relaxation, the value of d[v] decreases to 7.
(b) d[u]=5, d[v]=6, w(u,v)=2. Here, d[v] ≤ d[u]+w(u,v) before the relaxation step, so d[v] is unchanged by relaxation.
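A relaxation step might look like this in Python. This is only a sketch under stated assumptions: `d` and `pred` are dictionaries standing in for d[] and π[], and the edge weight is passed as a plain number rather than as a weight function. The two calls use the numbers of cases (a) and (b) of Figure 2.

```python
def relax(u, v, weight, d, pred):
    """Relax edge (u, v); return True if the estimate d[v] was improved."""
    if d[v] > d[u] + weight:
        d[v] = d[u] + weight   # going through u is shorter
        pred[v] = u            # remember u as the vertex right before v
        return True
    return False


# Case (a): d[u] = 5, d[v] = 9, w(u, v) = 2, so d[v] drops to 7.
d_a = {"u": 5, "v": 9}
pred_a = {"u": None, "v": None}
changed_a = relax("u", "v", 2, d_a, pred_a)
print(changed_a, d_a["v"], pred_a["v"])  # True 7 u

# Case (b): d[u] = 5, d[v] = 6, w(u, v) = 2, so nothing changes.
d_b = {"u": 5, "v": 6}
pred_b = {"u": None, "v": None}
changed_b = relax("u", "v", 2, d_b, pred_b)
print(changed_b, d_b["v"], pred_b["v"])  # False 6 None
```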

Initialization: For each vertex v ∈ V, d[v] denotes an upper bound on the weight of a shortest path from source s to v. d[v] will be δ(s, v) after the execution of the algorithm. Initialize d[v] and π[v] as follows:
INITIALIZE-SINGLE-SOURCE(G,s)
  for each vertex v ∈ V[G]
    do d[v] ← ∞
       π[v] ← NIL
  d[s] ← 0

Dijkstra’s Algorithm:
Dijkstra’s algorithm assumes that w(e) ≥ 0 for each edge e in the graph.
Maintain a set S of vertices such that for every vertex v ∈ S, d[v] = δ(s, v), i.e., the shortest path from s to v has been found. (Initial values: S = ∅, d[s] = 0 and d[v] = ∞ for every v ≠ s.)
(a) Select the vertex u ∈ V-S such that d[u] = min {d[x] | x ∈ V-S}. Set S = S ∪ {u}.
(b) For each node v adjacent to u, do RELAX(u, v, w).
Repeat steps (a) and (b) until S = V.

Continue:
DIJKSTRA(G,w,s)
  INITIALIZE-SINGLE-SOURCE(G,s)
  S ← ∅
  Q ← V[G]
  while Q ≠ ∅
    do u ← EXTRACT-MIN(Q)
       S ← S ∪ {u}
       for each vertex v ∈ Adj[u]
         do RELAX(u,v,w)

Implementation:
A priority queue Q stores the vertices in V-S, keyed by their d[] values.
The graph G is represented by adjacency lists.
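One way to realize this implementation in Python is sketched below with the standard `heapq` module. Two points are assumptions rather than part of the slides. First, `heapq` has no decrease-key operation, so the sketch pushes a fresh entry on every successful relaxation and skips stale entries when they are popped. Second, the edge list is assumed: it is the five-vertex textbook example that appears to match the labels surviving in the worked example (5/s, 8/x, 7/x, 9/u), since the edge weights themselves are not fully legible in the transcript.

```python
import heapq
import math


def dijkstra(adj, s):
    """Return (d, pred) for source s; adj[u] is a list of (v, weight) pairs.

    Every vertex must appear as a key of adj, and all weights are assumed
    to be non-negative.
    """
    d = {v: math.inf for v in adj}       # INITIALIZE-SINGLE-SOURCE
    pred = {v: None for v in adj}
    d[s] = 0
    done = set()                         # the set S of finished vertices
    heap = [(0, s)]                      # entries are (d-value, vertex)
    while heap:
        du, u = heapq.heappop(heap)      # EXTRACT-MIN
        if u in done:
            continue                     # stale entry from an earlier relaxation
        done.add(u)
        for v, weight in adj[u]:
            if d[v] > du + weight:       # RELAX(u, v, w)
                d[v] = du + weight
                pred[v] = u
                heapq.heappush(heap, (d[v], v))
    return d, pred


# Assumed edge list (see the note above); adjacency-list representation.
graph = {
    "s": [("u", 10), ("x", 5)],
    "u": [("v", 1), ("x", 2)],
    "x": [("u", 3), ("v", 9), ("y", 2)],
    "v": [("y", 4)],
    "y": [("v", 6), ("s", 7)],
}
d, pred = dijkstra(graph, "s")
print(d)
print(pred)
```

On this assumed graph the final predecessors are π[x]=s, π[u]=x, π[y]=x and π[v]=u, which agrees with the backtracking v-u-x-s shown in the slides.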

Figure: Dijkstra’s algorithm on an example graph with source s and vertices u, v, x, y. Each label is d[vertex]/π[vertex].
(a) The initial graph (Single Source Shortest Path Problem).
(b) After s is processed: u = 10/s, x = 5/s. (s,x) is the shortest path using one edge. It is also the shortest path from s to x.
(d) u = 8/x, v = 13/y, x = 5/s, y = 7/x.
(e) u = 8/x, v = 9/u, x = 5/s, y = 7/x.
(f) Final labels, the same as in (e).
Backtracking: v-u-x-s

Theorem: Consider the set S at any time in the algorithm’s execution. For each v ∈ S, the path Pv is a shortest s-v path.
Proof: We prove it by induction on |S|. If |S| = 1, then the theorem holds. (Because d[s] = 0 and S = {s}.) Suppose that the theorem is true for |S| = k for some k > 0.

Proof: (continue) Now, we grow S to size k+1 by adding the node v.
Let (u, v) be the last edge on our s-v path Pv. Consider any other s-v path P: s, …, x, y, …, v (red in the figure), where y is the first node on P that is not in S and x ∈ S is the node just before y. By the induction hypothesis d[x] = δ(s, x), and the edge (x, y) was relaxed when x joined S, so the part of P from s to y has length ≥ d[y]. Since we always select the node with the smallest value d[] in the algorithm, we have d[v] ≤ d[y]. Moreover, the length of each edge is ≥ 0. Thus, the length of P ≥ d[y] ≥ d[v]. That is, the length of any s-v path is ≥ d[v]. Therefore, our path Pv is the shortest.
If y does not exist, i.e., every node on P before v is in S, then d[v] is the smallest length for paths from s to v using red nodes (nodes in S) only, since we relaxed the edge from every red node to v.
Figure: the set S, with s, x and u inside S and y and v outside.

The algorithm does not work if there are negative-weight edges in the graph.
Example: edges s→u with weight 2, s→v with weight 1, and u→v with weight -10. s→v is shorter than s→u, but it is longer than s→u→v.
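The failure can be observed with a small Python sketch. It records the value d[u] at the moment each vertex u is added to S, which is exactly the value the algorithm treats as the shortest-path weight. The function name `settle_values` is hypothetical, and the edge weights are read off the example as s→u = 2, s→v = 1 and u→v = -10.

```python
import heapq
import math


def settle_values(adj, s):
    """Run Dijkstra's selection rule; return d[u] as of the moment u joins S."""
    d = {v: math.inf for v in adj}
    d[s] = 0
    settled = {}
    heap = [(0, s)]
    while heap:
        du, u = heapq.heappop(heap)
        if u in settled:
            continue                 # u already joined S earlier
        settled[u] = du              # the algorithm now treats d[u] as final
        for v, weight in adj[u]:
            if d[v] > du + weight:
                d[v] = du + weight
                heapq.heappush(heap, (d[v], v))
    return settled


graph = {"s": [("u", 2), ("v", 1)], "u": [("v", -10)], "v": []}
settled = settle_values(graph, "s")
true_weight_to_v = 2 + (-10)         # the path s -> u -> v
print(settled["v"], true_weight_to_v)  # 1 -8
```

Vertex v joins S with d[v] = 1 because it looks closest at that point, yet the path through u has weight -8. The greedy choice is only safe when no later edge can make a path shorter, which is what w(e) ≥ 0 guarantees.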

Time complexity of Dijkstra’s Algorithm:
Time complexity depends on implementation of the Queue. Method 1: Use an array to story the Queue EXTRACT -MIN(Q) --takes O(|V|) time. Totally, there are |V| EXTRACT -MIN(Q)’s. time for |V| EXTRACT -MIN(Q)’s is O(|V|2). RELAX(u,v,w) --takes O(1) time. Totally |E| RELAX(u, v, w)’s are required. time for |E| RELAX(u,v,w)’s is O(|E|). Total time required is O(|V|2+|E|)=O(|V|2) Backtracking with [] gives the shortest path in inverse order. Method 2: The priority queue is implemented as a Fibonacci heap. It takes O(log n) time to do EXTRACT-MIN(Q). The total running time is O(|E|log n ). 2019/2/25 chapter25
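Method 1 can be sketched in Python as below. This is an illustrative sketch only: the queue is a plain set that is scanned linearly for the minimum, and the three-vertex graph is a made-up example, not one from the slides.

```python
import math


def dijkstra_array(adj, s):
    """Dijkstra with a linear scan for EXTRACT-MIN: O(|V|^2 + |E|) time."""
    d = {v: math.inf for v in adj}
    pred = {v: None for v in adj}
    d[s] = 0
    remaining = set(adj)                   # Q = V - S
    while remaining:
        u = min(remaining, key=d.get)      # EXTRACT-MIN by scanning: O(|V|)
        remaining.remove(u)
        for v, weight in adj[u]:           # each edge is relaxed once overall
            if d[v] > d[u] + weight:
                d[v] = d[u] + weight
                pred[v] = u
    return d, pred


# Hypothetical graph: the direct edge s->a (4) loses to s->b->a (1 + 2).
small = {"s": [("a", 4), ("b", 1)], "b": [("a", 2)], "a": []}
d_small, pred_small = dijkstra_array(small, "s")
print(d_small)      # {'s': 0, 'a': 3, 'b': 1}
print(pred_small)   # {'s': None, 'a': 'b', 'b': 's'}
```

The loop body runs |V| times and each scan costs O(|V|), which is where the O(|V|^2) term comes from. For dense graphs this simple version is competitive with heap-based ones.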

Huffman codes and Data Compression
Binary character code: each character is represented by a unique binary string. A data file of 100 characters can be coded in two ways:

                        a    b    c    d    e     f
frequency (%)           45   13   12   16   9     5
fixed-length code       000  001  010  011  100   101
variable-length code    0    101  100  111  1101  1100

The first way needs 100×3 = 300 bits.
The second way needs 45×1+13×3+12×3+16×3+9×4+5×4 = 224 bits.
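The two totals can be checked with a few lines of Python. The dictionaries below simply transcribe the frequencies and the two codes of the example. The helper name `total_bits` is hypothetical.

```python
freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
fixed = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100", "f": "101"}
variable = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}


def total_bits(freq, code):
    """Sum over all characters of frequency times codeword length."""
    return sum(freq[ch] * len(code[ch]) for ch in freq)


print(total_bits(freq, fixed))     # 300
print(total_bits(freq, variable))  # 224
```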

Variable-length code: we need some care to read the code.
(Codewords: a=0, b=00, c=01, d=11.) Where to cut? 00 can be interpreted as either aa or b.
Prefixes of 0011: 0, 00, 001, and 0011.
Prefix codes: no codeword is a prefix of some other codeword (prefix free).
Prefix codes are simple to encode and decode.
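The ambiguity can be made concrete with a short Python sketch. The helper names `is_prefix_free` and `decodings` are hypothetical. The first tests the prefix-free property directly, and the second lists every way a bit string can be cut into codewords.

```python
def is_prefix_free(code):
    """True if no codeword is a prefix of a different codeword."""
    words = list(code.values())
    return not any(
        i != j and words[j].startswith(words[i])
        for i in range(len(words))
        for j in range(len(words))
    )


def decodings(bits, code):
    """Return every way to split bits into codewords, as decoded strings."""
    if not bits:
        return [""]
    results = []
    for ch, word in code.items():
        if bits.startswith(word):
            results += [ch + rest for rest in decodings(bits[len(word):], code)]
    return results


ambiguous = {"a": "0", "b": "00", "c": "01", "d": "11"}
prefix_code = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}

print(is_prefix_free(ambiguous))      # False: 0 is a prefix of 00 and of 01
print(decodings("00", ambiguous))     # ['aa', 'b']
print(is_prefix_free(prefix_code))    # True
```

With a prefix-free code, `decodings` never returns more than one result, which is why no separators are needed between codewords.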

Using the codewords in the table to encode and decode
Encode: abc = 0·101·100 = 0101100 (just concatenate the codewords).
Decode: 001011101 = 0·0·101·1101 = aabe (use the right-hand binary tree, the one for the variable-length code).
Figure: Tree for the fixed-length codeword (left) and tree for the variable-length codeword (right). Leaves: a:45, b:13, c:12, d:16, e:9, f:5. Internal nodes of the variable-length tree: 100, 55, 30, 25, 14.
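Encoding and decoding with the variable-length code can be sketched in Python as follows. This is a sketch, not the slides' method: instead of walking the binary tree, the decoder looks codewords up in a dictionary, which gives the same result for a prefix-free code. The names `encode` and `decode` are hypothetical.

```python
CODE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}


def encode(text, code):
    """Concatenate the codewords of the characters in text."""
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    """Cut bits into codewords greedily; valid because the code is prefix-free."""
    inverse = {word: ch for ch, word in code.items()}
    out = []
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:       # the first match is the only possible match
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


print(encode("abc", CODE))          # 0101100
print(decode("001011101", CODE))    # aabe
```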

Total length: 45 1+13 3+12 3+16 3+9 4+5 4=224 bits.
Binary tree Every non-leaf node has at most two children. The fixed-length code in our example is not optimal. The total number of bits required to encode a file is f ( c ) : the frequency (number of occurrences) of c in the file dT(c): denote the depth (length of codeword) of c’s leaf in the tree a(45, 0), b(13, 101), c(12, 100), d(16, 111), e(9, 1101), f(5, 1100) Total length: 45 1+13 3+12 3+16 3+9 4+5 4=224 bits. 2019/2/25 chapter25

Constructing an optimal code
Formal definition of the problem:
Input: a set of characters C = {c1, c2, …, cn}, where each c ∈ C has frequency f[c].
Output: a binary tree representing codewords so that the total number of bits required for the file is minimized.
Huffman proposed a greedy algorithm to solve the problem.

Figure: The steps of Huffman’s algorithm on the example frequencies.
(a) Initial queue: f:5, e:9, c:12, b:13, d:16, a:45.
(b) f:5 and e:9 are merged into a node of weight 14.
(c) c:12 and b:13 are merged into a node of weight 25.
(d) The node 14 and d:16 are merged into a node of weight 30.
(e) The nodes 25 and 30 are merged into a node of weight 55.
(f) a:45 and the node 55 are merged into the root of weight 100.

HUFFMAN(C)
1 n := |C|
2 Q := C
3 for i := 1 to n-1 do
4     z := ALLOCATE_NODE()
5     x := left[z] := EXTRACT_MIN(Q)
6     y := right[z] := EXTRACT_MIN(Q)
7     f[z] := f[x]+f[y]
8     INSERT(Q,z)
9 return EXTRACT_MIN(Q)   (the only node left in Q is the root of the tree)

The Huffman Algorithm
This algorithm builds the tree T corresponding to the optimal code in a bottom-up manner. C is a set of n characters, and each character c in C has a defined frequency f[c]. Q is a priority queue, keyed on f, used to identify the two least-frequent objects to merge together. The result of the merger is a new object (internal node) whose frequency is the sum of the frequencies of the two objects.
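The bottom-up construction might be written in Python as below. This is a sketch under assumptions, not a definitive implementation: leaves are the characters themselves, an internal node is a `(left, right)` tuple, and a counter is added to each heap entry purely so that the heap never has to compare two tree nodes. The function name `huffman_codes` is hypothetical.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Return {character: codeword}, merging the two least-frequent nodes each round."""
    tie = count()   # tie-breaker so heap entries never compare tree nodes
    heap = [(f, next(tie), ch) for ch, f in freq.items()]
    heapq.heapify(heap)                              # Q := C
    for _round in range(len(freq) - 1):              # n-1 merges
        fx, _tx, x = heapq.heappop(heap)             # x := EXTRACT_MIN(Q)
        fy, _ty, y = heapq.heappop(heap)             # y := EXTRACT_MIN(Q)
        heapq.heappush(heap, (fx + fy, next(tie), (x, y)))   # INSERT(Q, z)
    root = heap[0][2]

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):                  # internal node: 0 left, 1 right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                        # leaf: a character
            codes[node] = prefix or "0"              # lone-character edge case

    walk(root, "")
    return codes


freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
codes = huffman_codes(freq)
print(codes)
print(sum(freq[ch] * len(codes[ch]) for ch in freq))  # 224
```

Which child gets 0 and which gets 1 is arbitrary, so other implementations may produce different codewords with the same lengths and the same total of 224 bits.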

Time complexity
Lines 4-8 are executed n-1 times. Each heap operation in Lines 4-8 takes O(lg n) time. Total time required is O(n lg n).
Note: The details of the heap operations will not be tested. The time complexity O(n lg n) should be remembered.

Another example: e:4, a:6, c:6, b:9, d:11.
Step 1: e:4 and a:6 are merged into a node of weight 10.
Step 2: c:6 and b:9 are merged into a node of weight 15.
Step 3: the node 10 and d:11 are merged into a node of weight 21.
Step 4: the nodes 15 and 21 are merged into the root of weight 36.
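The merge sequence of this example can be replayed in Python. The sketch also uses a fact that the slides do not state: the total encoded length B(T) equals the sum of the weights of all merged (internal) nodes, because each merge pushes every leaf below the new node one level deeper. The function name `huffman_cost` is hypothetical.

```python
import heapq


def huffman_cost(freqs):
    """Return (merge weights in order, total encoded length) for Huffman's algorithm."""
    heap = list(freqs)
    heapq.heapify(heap)
    merges = []
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)  # two smallest weights
        merges.append(merged)
        heapq.heappush(heap, merged)
    return merges, sum(merges)


merges, cost = huffman_cost([4, 6, 6, 9, 11])
print(merges)  # [10, 15, 21, 36]
print(cost)    # 82
```

As a cross-check, in the final tree e and a are at depth 3 and c, b and d are at depth 2, so B(T) = 4×3+6×3+6×2+9×2+11×2 = 82.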

Correctness of Huffman’s Greedy Algorithm (Fun Part, not required)
Again, we use our general strategy. Let x and y be the two characters in C having the lowest frequencies (the first two characters selected in the greedy algorithm). We will show the two properties:
Property 1: There exists an optimal solution Topt (binary tree representing codewords) such that x and y are siblings in Topt.
Property 2: Let z be a new character with frequency f[z] = f[x]+f[y] and C’ = (C-{x, y}) ∪ {z}. Let T’ be an optimal tree for C’. Then we can get Topt from T’ by replacing the leaf z with an internal node z whose two children are x and y.

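Proof of Property 1
Look at the lowest siblings in Topt, say, b and c. Exchange x with b and y with c to obtain Tnew. (Assume without loss of generality that f[x] ≤ f[y] and f[b] ≤ f[c].) With d[·] denoting depth in Topt:
B(Topt)-B(Tnew) = f[x]d[x]+f[y]d[y]+f[b]d[b]+f[c]d[c] - f[x]d[b]-f[y]d[c]-f[b]d[x]-f[c]d[y]
= (d[b]-d[x])(f[b]-f[x]) + (d[c]-d[y])(f[c]-f[y]) ≥ 0.
Since f[x] and f[y] are the smallest, f[b]-f[x] ≥ 0 and f[c]-f[y] ≥ 0. Moreover, b and c are at the bottom of Topt, so d[b]-d[x] ≥ 0 and d[c]-d[y] ≥ 0. (Draw an example.) Hence B(Tnew) ≤ B(Topt), so Tnew is also optimal, and x and y are siblings in Tnew.
Figure: Topt (b and c are the lowest siblings) and Tnew (x and y have been exchanged with b and c).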

Proof of Property 1 (numeric example)
B(Topt)-B(Tnew) = f[x]d[x]+f[y]d[y]+f[b]d[b]+f[c]d[c] - f[x]d[b]-f[y]d[c]-f[b]d[x]-f[c]d[y]
= (6×1+6×2+8×3+9×3) - (6×3+6×3+8×1+9×2)
= 69 - 62 = 7 > 0.
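The arithmetic of the exchange can be checked mechanically. The sketch below reads the frequencies and depths off the numeric example (f[x]=6, f[y]=6, f[b]=8, f[c]=9 and d[x]=1, d[y]=2, d[b]=3, d[c]=3) and evaluates the difference both term by term and in factored form. The variable names are illustrative only.

```python
f = {"x": 6, "y": 6, "b": 8, "c": 9}
depth = {"x": 1, "y": 2, "b": 3, "c": 3}      # depths in Topt

# Contribution of the four characters before the exchange.
before = sum(f[k] * depth[k] for k in f)

# After the exchange x takes b's depth, y takes c's, b takes x's, c takes y's.
new_depth = {"x": depth["b"], "y": depth["c"], "b": depth["x"], "c": depth["y"]}
after = sum(f[k] * new_depth[k] for k in f)

# Factored form: (d[b]-d[x])(f[b]-f[x]) + (d[c]-d[y])(f[c]-f[y]).
factored = ((depth["b"] - depth["x"]) * (f["b"] - f["x"])
            + (depth["c"] - depth["y"]) * (f["c"] - f["y"]))

print(before, after, before - after, factored)  # 69 62 7 7
```

Both forms give 7, so the exchange lowers the cost of these four characters from 69 to 62.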

Proof of Property 2
Let z be a new character with frequency f[z] = f[x]+f[y] and C’ = (C-{x, y}) ∪ {z}. Let T(C’) be an optimal tree for C’. Then we can get Topt(C) for C from T(C’) by replacing the leaf z with an internal node z whose two children are x and y.
Proof: Let T(C) be the tree obtained from T(C’) by replacing z with the three nodes z, x and y. Then
B(T(C)) = B(T(C’))+f[x]+f[y] … (1)
(the codewords for x and y are 1 bit longer than that of z.)
Similarly, let Topt(C) be an optimal tree with x and y as siblings. T’(C’) is obtained from Topt(C) by deleting x and y and labeling the parent of x and y as z. Then
B(Topt(C)) = B(T’(C’))+f[x]+f[y] … (2)
From (1) and (2), since B(T(C’)) ≤ B(T’(C’)), we get B(T(C)) ≤ B(Topt(C)). Hence T(C) is optimal for C.
(Use this notation for next year)

Figure: the trees T(C’) and T’(C’) for C’, and the trees Topt(C) and T(C) for C, in which x and y are siblings.
B(T’) ≤ B(T’’) by the definition of T’ (that is, B(T(C’)) ≤ B(T’(C’)), because T(C’) is optimal for C’).

Challenge Problem 1
Let us design a keyboard for a mechanical hand. The keyboard has 26 letters A, B, …, Z arranged in one row. The hand is always at the left end of the row and it comes back to the left end after pressing a key. Assume that we know the frequency of every letter. Design the order of the 26 letters in the row such that the average length of movement of the mechanical hand is minimized. Prove that your solution is correct.
No mark will be given. I will remember who did it. If you can solve this problem, then the chance that you can get A+ is high.
If you can do it, send your solution by to with the subject: CS 3335 Challenge Problem 1 before Oct. 20. We can discuss the solution after Oct. 20.
