Hard · #315 · 9 min read · BIT · Merge Sort · Coordinate Compression · LeetCode

Count of Smaller Numbers After Self

Binary Indexed Tree with coordinate compression

BIT or merge sort. Process right to left, count elements smaller than current.

Published Apr 16, 2026

Problem

Given an integer array nums, return an array counts such that counts[i] is the number of elements to the right of nums[i] that are strictly smaller than nums[i].

Input
nums = [5, 2, 6, 1]
Output
[2, 1, 1, 0]
Explanation
5 dominates [2, 1]; 2 dominates [1]; 6 dominates [1]; 1 has no right neighbors.
Input
nums = [-1]
Output
[0]
Explanation
A single element has no elements to its right.
Input
nums = [-1, -1]
Output
[0, 0]
Explanation
Equality does not count — we want strictly smaller.

Intuition

A double loop works but costs O(n²). To get O(n log n) we have to reframe "how many smaller elements come after me" as a problem over an ordered structure.

Two angles work equally well:

  • Right-to-left sweep + ordered count. Walk from the end of the array toward the front. At each step, ask "how many values strictly less than nums[i] have I already seen?", then insert nums[i]. A Fenwick tree over compressed values answers both in O(log n).
  • Merge sort with index tracking. During the merge step, whenever we emit an element from the left half, every right-half element already emitted is strictly smaller than it and came after it in the original order. Recording that running count at the moment each left element is placed accumulates exactly counts[i] for that element.

Both views reach the same O(n log n) cost — the BIT is the one to know cold because it generalizes to "count values in a range" in every inversion-flavored problem.
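Before reaching for a Fenwick tree, the "ordered count" framing can be sketched with a plain sorted Vec: binary search answers the strictly-smaller query, though the insertion makes this O(n²) in the worst case. A minimal sketch (hypothetical helper name, not from the article):

```rust
// Right-to-left sweep with a sorted Vec: the same framing as the BIT,
// but Vec::insert is O(n), so the whole sweep is O(n^2) worst case.
fn count_smaller_sorted_vec(nums: &[i32]) -> Vec<i32> {
    let mut seen: Vec<i32> = Vec::new(); // sorted values already swept
    let mut out = vec![0i32; nums.len()];
    for i in (0..nums.len()).rev() {
        // partition_point returns how many seen values are strictly < nums[i]
        let pos = seen.partition_point(|&v| v < nums[i]);
        out[i] = pos as i32;
        seen.insert(pos, nums[i]); // keep `seen` sorted for the next query
    }
    out
}
```

The BIT below replaces the O(n) insert with an O(log n) point update, which is the entire speedup.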

Approach

Brute Force

Time O(n²)
Space O(n)

For each i, scan j > i and count hits. Correct, small code, pinned as the oracle for the other two.

impl Solution {
    pub fn count_smaller(nums: Vec<i32>) -> Vec<i32> {
        let n = nums.len();
        let mut out = vec![0i32; n];
        for i in 0..n {
            let mut c = 0;
            for j in (i + 1)..n {
                if nums[j] < nums[i] {
                    c += 1;
                }
            }
            out[i] = c;
        }
        out
    }
}
Rust · runnable

Merge Sort with Index Tracking

Time O(n log n)
Space O(n)

Run merge sort on indices, not values. During each merge, when we take the next element from the left half, we also know how many elements from the right half have already been placed — those are exactly the elements that came after this left item in the original array and have smaller values. Add that count to the answer for the left item's original index.

Carrying indices (not raw values) is the key trick: the values get reordered, but every original position keeps a stable identity we can write answers into.

impl Solution {
    pub fn count_smaller(nums: Vec<i32>) -> Vec<i32> {
        let n = nums.len();
        let mut out = vec![0i32; n];
        let mut idx: Vec<usize> = (0..n).collect();
        let mut buf: Vec<usize> = vec![0; n];
        Self::sort(&nums, &mut idx, &mut buf, 0, n, &mut out);
        out
    }

    fn sort(
        nums: &[i32],
        idx: &mut [usize],
        buf: &mut [usize],
        lo: usize,
        hi: usize,
        out: &mut [i32],
    ) {
        if hi - lo <= 1 {
            return;
        }
        let mid = (lo + hi) / 2;
        Self::sort(nums, idx, buf, lo, mid, out);
        Self::sort(nums, idx, buf, mid, hi, out);

        let mut i = lo;
        let mut j = mid;
        let mut k = lo;
        let mut right_count: i32 = 0;
        while i < mid && j < hi {
            if nums[idx[j]] < nums[idx[i]] {
                // Right-half element is strictly smaller: it counts against
                // every left-half element emitted from here on.
                buf[k] = idx[j];
                j += 1;
                k += 1;
                right_count += 1;
            } else {
                // Emitting from the left half: all right-half elements emitted
                // so far are strictly smaller and originally came after idx[i].
                out[idx[i]] += right_count;
                buf[k] = idx[i];
                i += 1;
                k += 1;
            }
        }
        while i < mid {
            out[idx[i]] += right_count;
            buf[k] = idx[i];
            i += 1;
            k += 1;
        }
        while j < hi {
            buf[k] = idx[j];
            j += 1;
            k += 1;
        }
        idx[lo..hi].copy_from_slice(&buf[lo..hi]);
    }
}
Rust · runnable

BIT with Coordinate Compression

Time O(n log n)
Space O(n)
Recommended

Compress the distinct values of nums to a contiguous 1..=m range. A Fenwick tree indexed by compressed rank tracks "how many values of each rank have we seen while sweeping right-to-left." For each nums[i] from right to left:

  • counts[i] = prefix(rank[nums[i]] - 1) — the number of already-seen values strictly smaller.
  • Then add(rank[nums[i]], 1) — record this value.

Both ops are O(log n), and compression keeps the tree size at O(n) regardless of the original value range. This is the template for every "count inversions / count smaller-after" problem in production — order-book depth, live leaderboards, trailing-statistics dashboards.

impl Solution {
    pub fn count_smaller(nums: Vec<i32>) -> Vec<i32> {
        let n = nums.len();
        // Coordinate compression: map each distinct value to a 1-based rank.
        let mut sorted: Vec<i32> = nums.clone();
        sorted.sort_unstable();
        sorted.dedup();
        let rank = |v: i32| -> usize {
            // sorted[pos] == v; +1 to switch to 1-based for the BIT.
            sorted.binary_search(&v).unwrap() + 1
        };
        let m = sorted.len();

        let mut bit = vec![0i32; m + 1];
        let add = |bit: &mut Vec<i32>, mut i: usize| {
            while i <= m {
                bit[i] += 1;
                i += i & i.wrapping_neg();
            }
        };
        let prefix = |bit: &Vec<i32>, mut i: usize| -> i32 {
            let mut s = 0;
            while i > 0 {
                s += bit[i];
                i -= i & i.wrapping_neg();
            }
            s
        };

        let mut out = vec![0i32; n];
        for i in (0..n).rev() {
            let r = rank(nums[i]);
            out[i] = prefix(&bit, r - 1);
            add(&mut bit, r);
        }
        out
    }
}
Rust · runnable

Trace

BIT sweep over [5, 2, 6, 1]. Compressed ranks: 1 → 1, 2 → 2, 5 → 3, 6 → 4. We process right to left.

i | nums[i] | rank | prefix(rank-1) | bit after add
--|---------|------|----------------|---------------
3 | 1       | 1    | 0              | [_, 1, 1, 0, 1]
2 | 6       | 4    | 1              | [_, 1, 1, 0, 2]
1 | 2       | 2    | 1              | [_, 1, 2, 0, 3]
0 | 5       | 3    | 2              | [_, 1, 2, 1, 4]

Reading down the prefix(rank-1) column gives the answer for positions 3, 2, 1, 0 respectively: 0, 1, 1, 2. Written in original order: [2, 1, 1, 0].
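The trace can be replayed mechanically. A small sketch (hypothetical helper, with the ranks for this input hard-coded) that reruns the sweep over [5, 2, 6, 1] and collects each prefix query:

```rust
// Replays the BIT sweep from the trace over [5, 2, 6, 1].
// Ranks for this input: 1->1, 2->2, 5->3, 6->4; the tree is 1-indexed, m = 4.
fn trace_prefix_counts() -> Vec<i32> {
    let nums = [5, 2, 6, 1];
    let rank = |v: i32| -> usize { match v { 1 => 1, 2 => 2, 5 => 3, _ => 4 } };
    let m = 4usize;
    let mut bit = vec![0i32; m + 1];
    let mut queries = Vec::new(); // prefix(rank - 1) per step, right to left
    for i in (0..nums.len()).rev() {
        let r = rank(nums[i]);
        // prefix query: count already-seen values with strictly smaller rank
        let mut s = 0;
        let mut j = r - 1;
        while j > 0 {
            s += bit[j];
            j -= j & j.wrapping_neg();
        }
        queries.push(s);
        // point update: record this value's rank
        let mut j = r;
        while j <= m {
            bit[j] += 1;
            j += j & j.wrapping_neg();
        }
    }
    queries
}
```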

Edge cases

  • Single element — no right neighbors, answer is [0].
  • All equal — strict inequality, so every count is 0.
  • Negative values — compression handles the sign transparently.
  • Large range, small n — compression keeps the BIT size at O(n); the raw value range never matters.
  • Duplicates — they share a compressed rank, and the strict prefix(rank - 1) query excludes them correctly.
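These edge cases are cheap to pin with the brute-force oracle. A self-contained O(n²) counter (a free function for illustration, not the article's Solution struct):

```rust
// O(n^2) oracle: for each position, scan everything to its right
// and count the strictly smaller values.
fn brute_count_smaller(nums: &[i32]) -> Vec<i32> {
    nums.iter()
        .enumerate()
        .map(|(i, &x)| nums[i + 1..].iter().filter(|&&y| y < x).count() as i32)
        .collect()
}
```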
💡 Tip

Any problem that asks "count elements with some monotone relationship to the current one" (inversions, dominating points, skyline depth) is a BIT over ranks. Memorize the compression + sweep template.
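As one instance of that template, the total inversion count of an array is just the sum of the per-position answers, computed with the same compress-and-sweep loop (hypothetical sketch, names not from the article):

```rust
// Total inversions = sum over i of (elements to the right strictly smaller
// than nums[i]), via the same coordinate compression + right-to-left BIT sweep.
fn count_inversions(nums: &[i32]) -> u64 {
    let mut sorted = nums.to_vec();
    sorted.sort_unstable();
    sorted.dedup();
    let m = sorted.len();
    let mut bit = vec![0u64; m + 1];
    let mut total = 0u64;
    for i in (0..nums.len()).rev() {
        let r = sorted.binary_search(&nums[i]).unwrap() + 1; // 1-based rank
        let mut j = r - 1; // prefix query over strictly smaller ranks
        while j > 0 {
            total += bit[j];
            j -= j & j.wrapping_neg();
        }
        let mut j = r; // point update: record this rank
        while j <= m {
            bit[j] += 1;
            j += j & j.wrapping_neg();
        }
    }
    total
}
```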

Takeaway

Right-to-left sweep + Fenwick over compressed ranks is the default tool for online rank queries. Merge sort gives the same asymptotics but demands more bookkeeping; reach for it when you need the sorted order as a byproduct.

Both techniques reappear in #2407 (segment tree over value windows) and in the guts of trading systems that maintain rolling statistics over sliding windows of mutable data.

More in Segment Tree & BIT