Disjoint Sets

Disjoint sets are data structures that keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets. That means given a set of numbers, the challenge is to efficiently union groups together. One popular way this is implemented is using union by rank and path compression.

Applications:

  • Kruskel's algorithm

  • Finding a cycle in an undirected graph

The API:

/*
 * makes a new set by creating a new element with a unique id, a rank of 0,
 * and a parent pointer to itself
 */ 
 MakeSet(x)


/*
 * follows the chain of parent pointers from x upwards through the tree until 
 * an element is reached whose parent is itself
 */
 Find(x)


/*
 * uses Find to determine the roots of the trees x and y belong to
 */
 Union(x, y)

Complexity: O(N) space for N elements, and O(M*alpha(N)) where alpha(N) is the Inverse Ackermann function which grows very slowly. We can argue that alpha(N) <= 4, and hence the time complexity becomes O(4M) or O(M)

The Algorithm:

Disjoint_set(7)

Union(1,2)

Union(2,3)

Union(6,7)

At this point, all the nodes are unioned into a single set. However, for the purposes of demonstration, finding 2 will perform path compression to it's parent 4.

Discussion

How is path compression beneficial?

It optimizes the performance for future findand unionoperations as it shorted the number of chained find operations in order to reach the root. Additionally, since this computation is done regardless, there is no harm is modifying these pointers since the entire purpose of findis to reach the root anyway.

What effect does prioritizing the higher rank have of two nodes have?

Minimizing the depth of the unioned tree, and hence few chained findoperations. Rank effectively represents the depth of the tree, and so if a higher ranked tree pointed to a lower ranked tree. The merged tree would have a smaller width. However, if we maximize the width of the tree (by merging a lower rank to a higher rank), the depth becomes smaller, and so findwill perform fewer calls.

Why does the merging on two trees with the same rank increment the depth of the new tree?

Because they have the same depth, and one pointer must be modified to point to the new root, hence increasing the depth of the new tree by one.

Last updated