Union-Find A data structure for maintaining a collection of disjoint sets Course: Data Structures Lecturer: Hanoch Levy January 2010.

- Collections of sets.
- We unite disjoint sets and want to know which set each object is in.
- Operations: MERGE and FIND.
- Example: an equivalence relation (≡) is
  reflexive: a ≡ a
  symmetric: a ≡ b ⇒ b ≡ a
  transitive: a ≡ b, b ≡ c ⇒ a ≡ c
Given: a sequence of equivalences 1 ≡ 2, 3 ≡ 4, 5 ≡ 6, 2 ≡ 3.
Goal: produce the equivalence classes.
We use MERGE to unite classes and FIND to look up the class an element belongs to.
Data Structures, CS, TAU - 5.21 Sets with MERGE and FIND

MERGE(A, B) - unite A and B and store the result in A or B
FIND(x) - find which set x belongs to
INITIAL(A, x) - put x into A
Simple implementation:
- An array in which each entry holds the name of the set that element belongs to.
  A = {1, 3, 5}, B = {2, 4}, C = {6, 7, 8}
  element: 1 2 3 4 5 6 7 8
  set:     A B A B A C C C
Efficiency:
  O(1): FIND, INITIAL
  O(N): MERGE (must scan every entry of the array)
Efficiency measure: the total cost of a sequence of N MERGE and FIND operations.
Data Structures, CS, TAU - 5.22 Operations
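The array scheme on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `QuickFind` and the method names are invented here, and elements are taken to be the integers 1..n.

```python
class QuickFind:
    """Array scheme: set_of[x] holds the name of the set containing x."""

    def __init__(self, n):
        # Index 0 is unused so that elements can be numbered 1..n.
        self.set_of = [None] * (n + 1)

    def initial(self, name, x):
        """INITIAL(A, x): put x into the set called `name`. O(1)."""
        self.set_of[x] = name

    def find(self, x):
        """FIND(x): return the name of the set containing x. O(1)."""
        return self.set_of[x]

    def merge(self, a, b):
        """MERGE(A, B): relabel every member of b as a. O(N), scans the array."""
        for x in range(1, len(self.set_of)):
            if self.set_of[x] == b:
                self.set_of[x] = a


# The example from the slide: A={1,3,5}, B={2,4}, C={6,7,8}.
qf = QuickFind(8)
for name, members in {"A": [1, 3, 5], "B": [2, 4], "C": [6, 7, 8]}.items():
    for x in members:
        qf.initial(name, x)
qf.merge("A", "B")
```

After the merge, elements 2 and 4 report set A, while 6, 7, 8 still report C.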

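- Link the elements of A together, and separately those of B (linked lists).
- A merge then scans only the elements of one set, not the whole domain.
- n merges can still cost $O(n^2)$: in a sequence of n-1 merges in which the set built so far is always merged into a single element, the i-th merge relabels i elements, for a total of $1 + 2 + \dots + (n-1) = n(n-1)/2$.
Solution:
- Keep the size of each set.
- Merge the smaller set into the larger one.
Data Structures, CS, TAU - 5.23 A faster implementation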

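Complexity:
1) Charge the cost to each element separately (not to the set).
2) Whenever an element changes sets, the size of the set containing it at least doubles.
3) The element's first set has size 1, its second at least 2, its third at least 4, its fourth at least 8; its i-th set has size at least $2^{i-1}$.
But the last set has size at most N, so $2^{\#\text{steps}} \le$ (size of last set) $\le N$.
Each element moves at most $\log_2 N$ times.
Total complexity: $O(N \log N)$.
Data Structures, CS, TAU - 5.24 Complexity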

1) For each set we need: a) its size, b) its first element.
2) For each element we need: a) the set it belongs to, b) the next element in that set.
Implementation (assumption: all elements are integers):
  type
    nametype = 1..n;
    elementtype = 1..n;
    MFSET = record
      setheaders: array[1..n] of record
        count: 0..n;
        firstelement: 0..n
      end;
      names: array[1..n] of record
        setname: nametype;
        nextelement: 0..n
      end
    end;
setheaders: for each set, its size and its first element.
names: for each element, the name of its set and the next element.
Data Structures, CS, TAU - 5.25 Data structure

- Check which set is smaller (say A).
- Walk along that set and change each element's set name to B.
- At the last element, chain the list of A onto the list of B.
- In the headers, update the first element and the size of the set.
Complexity:
- Each time an element moves to a new owner, the owner's size grows by a factor of two (at least).
- Therefore each element moves at most log n times.
Complexity: O(n log n)
Data Structures, CS, TAU - 5.26 Performing MERGE
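The MERGE steps above, together with the per-set and per-element fields (size, first element, set name, next element), might look as follows in Python. This is a sketch, not the course's code: the class `MFSet` and its attribute names are chosen here, elements are assumed to be 1..n, and each element is assumed to start alone in a set named after itself.

```python
class MFSet:
    """Linked-list sets over 1..n with 'merge smaller into larger'."""

    def __init__(self, n):
        self.count = [0] + [1] * n          # count[s]: size of set s
        self.first = list(range(n + 1))     # first[s]: first element of s (0 = none)
        self.setname = list(range(n + 1))   # setname[x]: set containing x
        self.next = [0] * (n + 1)           # next[x]: next element in x's set (0 = end)

    def find(self, x):
        return self.setname[x]

    def merge(self, a, b):
        """Merge sets a and b; return the name that survives."""
        if a == b:
            return a
        if self.count[a] > self.count[b]:
            a, b = b, a                     # a is now the smaller set
        if self.count[a] == 0:
            return b                        # nothing to move
        x = self.first[a]
        while True:                         # rename every element of a
            self.setname[x] = b
            if self.next[x] == 0:
                break
            x = self.next[x]
        self.next[x] = self.first[b]        # chain a's list in front of b's
        self.first[b] = self.first[a]
        self.count[b] += self.count[a]
        self.count[a], self.first[a] = 0, 0
        return b


# The equivalences 1≡2, 3≡4, 5≡6, 2≡3 from the earlier example.
sets = MFSet(6)
for x, y in [(1, 2), (3, 4), (5, 6), (2, 3)]:
    sets.merge(sets.find(x), sets.find(y))
```

Only the smaller set is traversed, which is what gives the O(n log n) total bound.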

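- An attempt to avoid walking over all the elements of A when moving them to B.
- The elements are represented as nodes of a tree.
- Each node points to its parent.
- The root holds the name of the set.
[Figure: example trees whose roots carry the set names A, B, C]
Performing the operations:
MERGE(A, B) - hang the root of A on the root of B.
FIND(x) - walk upward from x to the root.
Data Structures, CS, TAU - 5.27 Tree-based implementation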

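MERGE = O(1)
FIND = O(n) (possible in the worst case)
N merges and finds: $O(n^2)$ (if we hang the large tree on the small one, a list is formed)
Improvement: hang the small tree on the large one.
- Each hanging increases the depth of the nodes in the hung tree by 1.
- Each hanging at least doubles the number of nodes in the tree containing them.
- A node takes part in at most log n hangings ⇒ the depth of every node is at most log n.
Complexity: O(log n) (find)
Data Structures, CS, TAU - 5.28 Complexity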
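A minimal sketch of the tree representation with "hang small on large" follows. The names (`Forest`, `merge`, `depth`) are assumptions made for illustration, elements are numbered 0..n-1, and `merge` here takes two elements and looks up their roots itself rather than taking set names.

```python
class Forest:
    """Parent-pointer trees; the smaller tree is hung on the larger one."""

    def __init__(self, n):
        self.parent = list(range(n))   # a root is its own parent
        self.size = [1] * n            # size[r] is meaningful only for roots

    def find(self, x):
        while self.parent[x] != x:     # walk upward to the root
            x = self.parent[x]
        return x

    def merge(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx
        if self.size[rx] > self.size[ry]:
            rx, ry = ry, rx            # rx is now the smaller tree
        self.parent[rx] = ry
        self.size[ry] += self.size[rx]
        return ry

    def depth(self, x):
        d = 0
        while self.parent[x] != x:
            x, d = self.parent[x], d + 1
        return d


# Merge 16 singletons pairwise, then in fours, eights, and finally all 16.
forest = Forest(16)
for step in (1, 2, 4, 8):
    for i in range(0, 16, 2 * step):
        forest.merge(i, i + step)
max_depth = max(forest.depth(x) for x in range(16))
```

Merging equal-sized trees every time is the worst case for depth, and even then `max_depth` does not exceed log2(16).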

When performing FIND, compress the path to the root (all nodes on the path become children of the root).
Easy to implement in two passes (the first identifies the root, the second compresses and re-hangs).
[Figure: tree A with nodes 1, 7, 3, 2, 8 before and after FIND(7)]
Complexity analysis:
- A single operation can still cost O(n).
- The average cost is complicated to analyze.
- If we do not hang small on large, performing N FINDs takes O(N log N) (hard to analyze).
Data Structures, CS, TAU - 5.29 Path compression
Finished 13/12/04
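The two-pass FIND described on this slide can be sketched as below. The function name `find_compress` and the convention that a root is its own parent are assumptions for the example.

```python
def find_compress(parent, x):
    """Two-pass FIND with path compression on a parent-pointer array."""
    # Pass 1: walk up to identify the root.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Pass 2: re-hang every node on the path directly under the root.
    while x != root:
        nxt = parent[x]
        parent[x] = root
        x = nxt
    return root


# A chain 4 -> 3 -> 2 -> 1 -> 0, where 0 is the root.
parent = [0, 0, 1, 2, 3]
root = find_compress(parent, 4)
```

After the call every node on the path points straight at the root, so a repeated FIND on any of them takes one step.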

If we do hang small on large, the complexity of N operations is O(N α(N)).
α(N) is close to a constant: it is not constant, but it grows very slowly with N.
Ackermann's function A(x, y):
  A(0, y) = 1
  A(1, 0) = 2
  A(x, 0) = x + 2 for x ≥ 2
  A(x, y) = A(A(x-1, y), y-1) for x, y ≥ 1
Consequences:
  A(x, 0) = x + 2
  A(x, 1) = A(A(x-1, 1), 0) = A(x-1, 1) + 2 = 2x
  A(x, 2) = A(A(x-1, 2), 1) = 2 A(x-1, 2) = $2^x$
  A(x, 3) = A(A(x-1, 3), 2) = $2^{A(x-1, 3)}$ = $2^{2^{\cdot^{\cdot^{2}}}}$ (a tower of x twos)
  A(x, 4) = has no closed-form mathematical expression
Data Structures, CS, TAU - 5.30
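The recurrence can be checked for small arguments with a direct transcription into Python. The function name `ackermann` is chosen here, and the base cases follow the variant of Ackermann's function given on this slide. Plain recursion is only practical for small inputs, since both the values and the recursion depth grow extremely fast.

```python
def ackermann(x, y):
    """The slide's variant: A(0,y)=1, A(1,0)=2, A(x,0)=x+2 for x>=2."""
    if x == 0:
        return 1
    if y == 0:
        return 2 if x == 1 else x + 2
    return ackermann(ackermann(x - 1, y), y - 1)


row1 = [ackermann(x, 1) for x in range(1, 6)]   # expected to follow 2x
row2 = [ackermann(x, 2) for x in range(1, 6)]   # expected to follow 2^x
row3 = [ackermann(x, 3) for x in range(1, 4)]   # towers of twos
```

The three lists reproduce the closed forms 2x, 2^x, and the tower of x twos for the first few values of x.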

Union-Find
Make(x): Create a set containing x
Union(x, y): Unite the sets containing x and y
Find(x): Return a representative of the set containing x

Union Find
[Figure: example sets over the elements a, b, c, d, e]
make: O(1); union, find: O(α(n)) (amortized)

Fun applications: Generating mazes
   1  2  3  4
   5  6  7  8
   9 10 11 12
  13 14 15 16
make(1) make(2) … make(16)
Choose edges in random order and remove them if they connect two different regions:
find(6)=find(7)? union(6,7)
find(7)=find(11)? union(7,11)
…
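The maze procedure can be sketched as follows. Several details are assumptions made for the example: cells are numbered from 0 rather than 1, the function name `generate_maze` and the `seed` parameter are invented, and the inner `find` uses path halving (a simple variant of path compression) without union by rank to keep the code short.

```python
import random


def generate_maze(rows, cols, seed=0):
    """Return the walls removed to turn a rows x cols grid into a maze."""
    parent = list(range(rows * cols))      # make(cell) for every cell

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Every wall separates two horizontally or vertically adjacent cells.
    walls = []
    for r in range(rows):
        for c in range(cols):
            cell = r * cols + c
            if c + 1 < cols:
                walls.append((cell, cell + 1))
            if r + 1 < rows:
                walls.append((cell, cell + cols))

    random.Random(seed).shuffle(walls)     # consider walls in random order
    removed = []
    for a, b in walls:
        ra, rb = find(a), find(b)
        if ra != rb:                       # the wall separates two regions
            parent[ra] = rb                # union
            removed.append((a, b))
    return removed


removed = generate_maze(4, 4)
```

Because a wall is removed only when it joins two different regions, the removed walls form a spanning tree of the grid: every cell is reachable from every other by exactly one path.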

Fun applications: Generating mazes
[Figure: the 4 × 4 grid of cells 1 to 16]

Generating mazes: a larger example
[Figure: a maze on an n × n grid]
Construction time: $O(n^2 \alpha(n^2))$

More serious applications:
Maintaining an equivalence relation
Incremental connectivity in graphs
Computing minimum spanning trees
…

Union Find
Represent each set as a rooted tree
Union by rank
Path compression
The parent of a vertex x is denoted by p[x]
Find(x) traces the path from x to the root

Path Compression

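Union by rank
[Figure: a new node has rank 0; when $r_1 < r_2$ the root of rank $r_1$ is linked under the root of rank $r_2$; linking two roots of equal rank r gives a root of rank r+1]
Union by rank on its own gives O(log n) find time
A tree of rank r contains at least $2^r$ elements
If x is not a root, then rank(x) < rank(p[x])
Rank = height (disregarding compressions)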

Union Find - pseudocode
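As an illustration of how make, link, find, and union fit together with union by rank and path compression, here is a Python sketch in the usual textbook style. It is an assumption-laden example rather than the lecturer's code: the class name and the dictionary-based storage are choices made here, while `p` and `rank` follow the notation of the surrounding slides.

```python
class UnionFind:
    def __init__(self):
        self.p = {}       # p[x]: parent of x (a root is its own parent)
        self.rank = {}    # rank[x]: upper bound on the height of x

    def make(self, x):
        self.p[x] = x
        self.rank[x] = 0

    def find(self, x):
        # Path compression: every node on the path ends up under the root.
        if self.p[x] != x:
            self.p[x] = self.find(self.p[x])
        return self.p[x]

    def link(self, x, y):
        """Link two distinct roots; the root of larger rank stays the root."""
        if self.rank[x] > self.rank[y]:
            x, y = y, x
        if self.rank[x] == self.rank[y]:
            self.rank[y] += 1
        self.p[x] = y
        return y

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.link(rx, ry)


uf = UnionFind()
for v in range(1, 7):
    uf.make(v)
for a, b in [(1, 2), (3, 4), (5, 6), (2, 3)]:
    uf.union(a, b)
```

The recursive `find` is safe here because union by rank keeps tree heights at most log n.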

Union-Find
             make   link     find
Worst case   O(1)   O(1)     O(log n)
Amortized    O(1)   O(α(n))  O(α(n))

Nesting / Repeated application

Ackermann’s function

Ackermann’s function (modified)

Inverse functions

Union by Size
[Figure: the tree with fewer nodes is linked under the root of the tree with more nodes]
Hang the smaller tree (by number of nodes) on the larger tree.

Continue from Notes www.cse.yorku.ca/~andy/courses/4101/lecture-notes/LN6.pdf
Assume: UNION is done by size.
Lemma 1: After any number of UNION operations, a node of height h has at least $2^h$ nodes in its subtree.
Proof: induction. When the height grows, the size at least doubles.
Cor 0: With UNION + FIND (path compression): a tree of height h has at least $2^h$ nodes.
Cor 1: height ≤ log n.

Cont 2
Cor 2: Worst case of UNION = O(1); worst case of FIND = O(log n).
We will show: amortized cost = O(log*(n)).
Fact 1: lg*(r) = g iff exp*(g-1) < r ≤ exp*(g)
Reminder: rank(x) = height of node x in the uncompressed forest
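Fact 1 can be made concrete with a small Python sketch. It assumes the usual definitions, which the slide does not spell out: exp*(0) = 1 and exp*(g) = 2^exp*(g-1) (a tower of g twos), with lg*(r) the smallest g such that exp*(g) ≥ r. The function names are chosen for the example.

```python
def exp_star(g):
    """Tower of g twos: exp*(0) = 1, exp*(g) = 2 ** exp*(g - 1)."""
    t = 1
    for _ in range(g):
        t = 2 ** t
    return t


def lg_star(r):
    """Smallest g with exp*(g) >= r, i.e. exp*(g-1) < r <= exp*(g)."""
    g = 0
    while exp_star(g) < r:
        g += 1
    return g


towers = [exp_star(g) for g in range(5)]
groups = [lg_star(r) for r in (2, 4, 16, 17, 65536)]
```

Working with integer towers instead of repeated floating-point logarithms avoids rounding issues at the boundaries r = exp*(g).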

Cont 3
Lemma 2: For any sequence s of operations (Union + Find), the number of nodes of rank r is at most $n / 2^r$.
Proof: By Lemma 1, each such node has at least $2^r$ descendants. Nodes of the same rank r have disjoint sets of descendants. Summed together, these sets contain at most n nodes.

Cont 4
Lemma 3: If at some point during the execution of sequence s node x is a proper descendant of node y, then rank(x) < rank(y).
Proof: a) If x became a descendant of y through a union, then after the union rank(x) < rank(y). b) If it happened through a compression (find), then x was already a descendant of y through an earlier union, so the claim holds as before.
Recall: rank = height in the uncompressed forest.

Cont 5
Put nodes in groups: for node x, group(x) = lg*(rank(x)).
Analyze FIND (+ compression): look at $x_1, x_2, \dots, x_k$ on the path being compressed.
If group($x_i$) = group($x_{i+1}$), charge the step to $x_i$; else charge it to the FIND.
  rank      group
  2         1
  4         2
  16        3
  65536     4
  2^65536   5
Cost attributed to a single FIND = O(log* n)

Cont 6
Cost attributed to x: at every compression x gets a new parent (it moves up), and the new parent has higher rank than the old parent.
As long as parent(x) remains in group(x), the charge goes to x.
After that, x is a child of a parent in another group (x and its parent are not in the same group), so all later compressions are charged to the FIND.
A node in group g is charged at most exp*(g) - exp*(g-1) times.
Recall: the group is determined by the rank (measured in the uncompressed forest).

Cont 7
A node in group g is charged at most exp*(g) - exp*(g-1) times.
Number of nodes in group g: $N(g) \le \sum_{r=\exp^*(g-1)+1}^{\exp^*(g)} n/2^r \le n / 2^{\exp^*(g-1)} = n / \exp^*(g)$

Cont 8
Total number of moves in group g: $N(g) \cdot (\exp^*(g) - \exp^*(g-1)) \le (n / \exp^*(g)) \cdot \exp^*(g) = n$
There are at most log*(n) groups ⇒ overall work = n log*(n) ⇒ amortized work = log*(n)

Inverse Ackermann function is the inverse of the function The first “column” A “diagonal”

Level and Index Back to union-find…

Potentials

Bounds on level Definition Claim Proof

Bounds on index

Amortized cost of make
Actual cost: O(1)
ΔΦ: 0
Amortized cost: O(1)

Amortized cost of link
[Figure: nodes x, y and $z_1, \dots, z_k$]
Actual cost: O(1)
The potentials of y and $z_1, \dots, z_k$ can only decrease
The potential of x is increased by at most α(n) ⇒ ΔΦ ≤ α(n)
Amortized cost: O(α(n))

Amortized cost of find
[Figure: node x with its old parent p[x] and its new parent y = p'[x]]
rank[x] is unchanged
rank[p[x]] is increased
level(x) is either unchanged or is increased
If level(x) is unchanged, then index(x) is either unchanged or is increased
If level(x) is increased, then index(x) is decreased by at most rank[x]–1
φ(x) is either unchanged or is decreased

Amortized cost of find
[Figure: the find path $x = x_0, \dots, x_i, \dots, x_j, \dots, x_l$]
Suppose that: … Then φ(x) is decreased!

Amortized cost of find
[Figure: the find path $x = x_0, \dots, x_i, \dots, x_j, \dots, x_l$]
The only nodes that can retain their potential are: the first, the last and the last node of each level
Actual cost: l + 1
ΔΦ ≤ (α(n)+1) – (l+1)
Amortized cost: α(n)+1
