Stage 5: Interview Simulation·Rust-Specific Practice
Practice9 min readUnion-FindDSUDesign

Rewrite: Union-Find with path compression

Disjoint-set union with rank + path compression

Implement as a struct with rank and size. Make it reusable.

Published Apr 17, 2026

Exercise

Implement a reusable Union-Find (DSU) structure with:

  • new(n) — initialize n singleton components.
  • find(x) — return the root of x's component, with path compression.
  • union(x, y) — merge by rank; return true if the call changed anything.
  • connected(x, y) — are x and y in the same component?
  • size(x) — size of x's component.
  • count() — current number of components.

With both path compression and union-by-rank, the amortized cost per op is O(α(n)) — inverse Ackermann, effectively constant.

Intuition

DSU tracks equivalence classes under a growing set of union operations. Each class is represented by a tree whose root identifies it. Two optimizations make the amortized bound work:

  1. Union by rank (or size): when merging two trees, attach the shorter one under the taller one. Keeps the trees shallow.
  2. Path compression: during find, point every visited node directly at the root. Makes future finds along that path O(1).

Either alone is good; both together deliver α(n). In Rust we'll use recursive find with full compression; in TypeScript we'll use the iterative two-pass (path halving) variant that's easier to read without recursion.

The structure is the backbone of Kruskal's MST, connected-components queries on streaming edges, and offline graph connectivity.

Approach

Plain DSU (no rank, no compression)

TimeO(n) per op worst case
SpaceO(n)

Unions link one root under the other without regard to size. Finds walk the parent chain without rewriting. Degenerate inputs (always merging the taller tree under the shorter) produce O(n) paths — every query becomes linear.

Useful as a reference implementation when writing the first version or when n is tiny.

pub struct DSU {
    parent: Vec<usize>,
    sizes: Vec<usize>,
    components: usize,
}
impl DSU {
    pub fn new(n: usize) -> Self {
        Self { parent: (0..n).collect(), sizes: vec![1; n], components: n }
    }
    pub fn find(&mut self, mut x: usize) -> usize {
        while self.parent[x] != x { x = self.parent[x]; }
        x
    }
    pub fn union(&mut self, x: usize, y: usize) -> bool {
        let (rx, ry) = (self.find(x), self.find(y));
        if rx == ry { return false; }
        self.parent[rx] = ry;
        self.sizes[ry] += self.sizes[rx];
        self.components -= 1;
        true
    }
    pub fn connected(&mut self, x: usize, y: usize) -> bool {
        self.find(x) == self.find(y)
    }
    pub fn size(&mut self, x: usize) -> usize {
        let r = self.find(x);
        self.sizes[r]
    }
    pub fn count(&self) -> usize { self.components }
}
Rust · runnable

DSU with rank + path compression

TimeO(α(n)) per op
SpaceO(n)
Recommended

Three parallel Vecs: parent, rank, size — all index-based, no pointers, zero allocation after construction. Maintain components as a running counter.

Path compression in find makes every visited ancestor point directly at the root. Union-by-rank avoids creating tall trees by always attaching the lower-rank root under the higher.

pub struct DSU {
    parent: Vec<usize>,
    rank: Vec<u32>,
    size: Vec<usize>,
    components: usize,
}

impl DSU {
    pub fn new(n: usize) -> Self {
        Self {
            parent: (0..n).collect(),
            rank: vec![0; n],
            size: vec![1; n],
            components: n,
        }
    }

    pub fn find(&mut self, x: usize) -> usize {
        if self.parent[x] != x {
            let root = self.find(self.parent[x]);
            self.parent[x] = root;
        }
        self.parent[x]
    }

    pub fn union(&mut self, x: usize, y: usize) -> bool {
        let (rx, ry) = (self.find(x), self.find(y));
        if rx == ry { return false; }
        let (big, small) = if self.rank[rx] >= self.rank[ry] { (rx, ry) } else { (ry, rx) };
        self.parent[small] = big;
        self.size[big] += self.size[small];
        if self.rank[big] == self.rank[small] { self.rank[big] += 1; }
        self.components -= 1;
        true
    }

    pub fn connected(&mut self, x: usize, y: usize) -> bool {
        self.find(x) == self.find(y)
    }

    pub fn size(&mut self, x: usize) -> usize {
        let r = self.find(x);
        self.size[r]
    }

    pub fn count(&self) -> usize { self.components }
}
Rust · runnable

Trace

DSU(5); union(0, 1); union(2, 3); union(1, 3); connected(0, 3); size(0); count():

opparentranksizecomponents
init[0,1,2,3,4][0,0,0,0,0][1,1,1,1,1]5
union(0, 1)[1,1,2,3,4][0,1,0,0,0][1,2,1,1,1]4
union(2, 3)[1,1,3,3,4][0,1,0,1,0][1,2,1,2,1]3
union(1, 3)[1,3,3,3,4][0,1,0,2,0][1,2,1,4,1]2
connected(0, 3) → trueunchangedunchangedunchanged2
size(0) → 4compressedunchangedunchanged2
count() → 2unchangedunchangedunchanged2

Edge cases

  • Self unionunion(x, x) finds the same root twice and returns false; components is unchanged.
  • Idempotent re-union — calling union(0, 1) twice only merges once; subsequent calls are O(α(n)) no-ops.
  • Out-of-bounds — not handled; Rust would panic on vector access, TS would get undefined. Add assert!(x < n) if the API needs to be defensive.
  • Integer overflow on rankrank only grows to log₂(n) in the worst case, so u32 is absurd overkill — u8 suffices for n 2^255.

Takeaway

DSU is the fastest way to answer "are these two nodes connected under a growing set of edges?" The index-based representation is Rust-idiomatic: it sidesteps Box/Rc ownership tangles by using the vector itself as an arena.

Keep this template in your personal crate. It reappears in Kruskal's MST, Tarjan's offline LCA, image segmentation, percolation simulations, and contest problems at every difficulty level.

More in Rust-Specific Practice