Union-Find (DSU)
Disjoint-set structure with path compression + union by rank for near-O(1) component queries.
Connected components offline, Kruskal's MST, redundant connection, accounts merge.
Path compression and union by rank give an amortized cost of inverse Ackermann α(n), which is at most 4 for any practical input. The parent and rank arrays hold one entry per element.
Union-Find maintains a forest of disjoint sets where each set has a representative root. Two operations: find(x) returns x's root, union(x, y) merges the sets containing x and y. With path compression (find flattens the tree as it walks) and union by rank/size (always attach the smaller tree under the larger), both operations run in O(alpha(n)) — inverse Ackermann, effectively constant for any realistic n.
The pattern is the right tool for offline connectivity: when you process a batch of merges and ask 'are x and y connected?' after each, union-find beats BFS/DFS by orders of magnitude. It's also the engine behind Kruskal's MST (sort edges, union as you add).
What union-find can't do efficiently is split — once merged, sets stay merged. For problems that involve removal (like 'minimum cost to disconnect'), you typically have to reverse-process: start from the end state and add edges in reverse, treating deletions as unions.
Path compression plus union by rank gives O(alpha(n)) amortized per operation — effectively constant — making union-find the right choice for any batch of merge/connectivity queries on a static or grow-only graph.
- Implementing find without path compression, degrading to O(log n) or worse per operation and timing out on large inputs.
- Implementing union without rank/size and always attaching one fixed direction, building a degenerate linked list.
- Comparing parent[x] == y instead of find(x) == find(y) to test connectivity, missing transitive connections.
- Trying to use union-find for problems that require split or removal — wrong tool, switch to offline reverse processing or different DS.
- In Kruskal's, forgetting to sort edges by weight before iterating — produces a spanning tree, but not minimum.
- Standard DSU with path compression + union by rank.
- Kruskal's MST: sort edges, union if endpoints disconnected.
- Weighted/relational union-find: track relative offsets between nodes (e.g. accounts merge, equations).
- Offline reverse processing: handle edge deletions by reversing the timeline into edge additions.
- Number of islands II / dynamic islands: start empty, add cells one by one with unions.
class DSU {
int[] parent, rank;
DSU(int n) {
parent = new int[n];
rank = new int[n];
for (int i = 0; i < n; i++) parent[i] = i;
}
int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]);
return parent[x];
}
boolean union(int a, int b) {
int ra = find(a), rb = find(b);
if (ra == rb) return false;
if (rank[ra] < rank[rb]) { int t = ra; ra = rb; rb = t; }
parent[rb] = ra;
if (rank[ra] == rank[rb]) rank[ra]++;
return true;
}
}Problems using this pattern
2Easiest first · Blind 75 above NeetCode 150