Count of Smaller Numbers After Self
Binary Indexed Tree with coordinate compression
BIT or merge sort. Process right to left, count elements smaller than current.
Problem
Given an integer array nums, return an array counts such that counts[i] is the number of elements to the right of nums[i] that are strictly smaller than nums[i].
Intuition
A double loop works but costs O(n²). To get O(n log n) we have to reframe "how many smaller elements come after me" as a problem over an ordered structure.
Two angles work equally well:
- Right-to-left sweep + ordered count. Walk from the end of the array toward the front. At each step, ask "how many values strictly less than `nums[i]` have I already seen?", then insert `nums[i]`. A Fenwick tree over compressed values answers both in O(log n).
- Merge sort with index tracking. During the merge step, whenever we emit an element from the left half, every right-half element emitted before it is strictly smaller and originally sat to its right. Adding that running count to the left element's answer tallies exactly the "smaller elements after self".
Both views reach the same O(n log n) cost — the BIT is the one to know cold because it generalizes to "count values in a range" in every inversion-flavored problem.
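The right-to-left sweep can be sketched without a Fenwick tree at all, using a plain sorted array (a hypothetical `countSmallerSweep` helper, not part of the solutions below). Insertion makes it O(n²) overall, but the query/insert shape is exactly what the BIT later speeds up to O(log n):

```typescript
// Right-to-left sweep over a plain sorted array. `seen` holds the values to
// the right of position i; the insertion point of nums[i] is the count of
// strictly smaller values already seen.
function countSmallerSweep(nums: number[]): number[] {
  const seen: number[] = []; // values to the right, kept sorted ascending
  const out = new Array<number>(nums.length).fill(0);
  // lowerBound: index of the first element >= target
  const lowerBound = (a: number[], target: number): number => {
    let lo = 0;
    let hi = a.length;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (a[mid] < target) lo = mid + 1;
      else hi = mid;
    }
    return lo;
  };
  for (let i = nums.length - 1; i >= 0; i--) {
    const pos = lowerBound(seen, nums[i]);
    out[i] = pos; // strictly smaller values seen so far
    seen.splice(pos, 0, nums[i]); // O(n) insert — the part the BIT replaces
  }
  return out;
}

// countSmallerSweep([5, 2, 6, 1]) → [2, 1, 1, 0]
```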
Approach
Brute Force
Time O(n²), space O(n). For each i, scan every j > i and count hits. Correct, small code, pinned as the oracle for the other two.
```rust
impl Solution {
    pub fn count_smaller(nums: Vec<i32>) -> Vec<i32> {
        let n = nums.len();
        let mut out = vec![0i32; n];
        for i in 0..n {
            let mut c = 0;
            for j in (i + 1)..n {
                if nums[j] < nums[i] {
                    c += 1;
                }
            }
            out[i] = c;
        }
        out
    }
}
```

```typescript
function countSmaller(nums: number[]): number[] {
  const n = nums.length;
  const out = new Array<number>(n).fill(0);
  for (let i = 0; i < n; i++) {
    let c = 0;
    for (let j = i + 1; j < n; j++) {
      if (nums[j] < nums[i]) c++;
    }
    out[i] = c;
  }
  return out;
}
```

Merge Sort with Index Tracking
Time O(n log n), space O(n). Run merge sort on indices, not values. During each merge, when we take the next element from the left half, we also know how many elements from the right half have already been placed — those are exactly the elements that came after this left item in the original array and have smaller values. Add that count to the answer for the left item's original index.
Carrying indices (not raw values) is the key trick: the values get reordered, but every original position keeps a stable identity we can write answers into.
```rust
impl Solution {
    pub fn count_smaller(nums: Vec<i32>) -> Vec<i32> {
        let n = nums.len();
        let mut out = vec![0i32; n];
        let mut idx: Vec<usize> = (0..n).collect();
        let mut buf: Vec<usize> = vec![0; n];
        Self::sort(&nums, &mut idx, &mut buf, 0, n, &mut out);
        out
    }

    fn sort(
        nums: &[i32],
        idx: &mut [usize],
        buf: &mut [usize],
        lo: usize,
        hi: usize,
        out: &mut [i32],
    ) {
        if hi - lo <= 1 {
            return;
        }
        let mid = (lo + hi) / 2;
        Self::sort(nums, idx, buf, lo, mid, out);
        Self::sort(nums, idx, buf, mid, hi, out);
        let mut i = lo;
        let mut j = mid;
        let mut k = lo;
        let mut right_count: i32 = 0;
        while i < mid && j < hi {
            if nums[idx[j]] < nums[idx[i]] {
                buf[k] = idx[j];
                j += 1;
                k += 1;
                right_count += 1;
            } else {
                // Counts accumulate across merge levels, hence +=.
                out[idx[i]] += right_count;
                buf[k] = idx[i];
                i += 1;
                k += 1;
            }
        }
        while i < mid {
            out[idx[i]] += right_count;
            buf[k] = idx[i];
            i += 1;
            k += 1;
        }
        while j < hi {
            buf[k] = idx[j];
            j += 1;
            k += 1;
        }
        idx[lo..hi].copy_from_slice(&buf[lo..hi]);
    }
}
```

```typescript
function countSmaller(nums: number[]): number[] {
  const n = nums.length;
  const out = new Array<number>(n).fill(0);
  const idx = Array.from({ length: n }, (_, i) => i);
  const buf = new Array<number>(n).fill(0);
  const sort = (lo: number, hi: number): void => {
    if (hi - lo <= 1) return;
    const mid = (lo + hi) >> 1;
    sort(lo, mid);
    sort(mid, hi);
    let i = lo;
    let j = mid;
    let k = lo;
    let rightCount = 0;
    while (i < mid && j < hi) {
      if (nums[idx[j]] < nums[idx[i]]) {
        buf[k++] = idx[j++];
        rightCount++;
      } else {
        // Counts accumulate across merge levels, hence +=.
        out[idx[i]] += rightCount;
        buf[k++] = idx[i++];
      }
    }
    while (i < mid) {
      out[idx[i]] += rightCount;
      buf[k++] = idx[i++];
    }
    while (j < hi) {
      buf[k++] = idx[j++];
    }
    for (let t = lo; t < hi; t++) idx[t] = buf[t];
  };
  sort(0, n);
  return out;
}
```

BIT with Coordinate Compression
Time O(n log n), space O(n). Compress the distinct values of nums to a contiguous 1..=m range. A Fenwick tree indexed by compressed rank tracks "how many values of each rank have we seen while sweeping right-to-left." For each nums[i] from right to left:

- `counts[i] = prefix(rank[nums[i]] - 1)` — the number of already-seen values strictly smaller.
- Then `add(rank[nums[i]], 1)` — record this value.

Both ops are O(log n), and compression keeps the tree size at O(n) regardless of the original value range. This is the template for every "count inversions / count smaller-after" problem in production — order-book depth, live leaderboards, trailing-statistics dashboards.
```rust
impl Solution {
    pub fn count_smaller(nums: Vec<i32>) -> Vec<i32> {
        let n = nums.len();
        // Coordinate compression: map each distinct value to a 1-based rank.
        let mut sorted: Vec<i32> = nums.clone();
        sorted.sort_unstable();
        sorted.dedup();
        let rank = |v: i32| -> usize {
            // sorted[pos] == v; +1 to switch to 1-based for the BIT.
            sorted.binary_search(&v).unwrap() + 1
        };
        let m = sorted.len();
        let mut bit = vec![0i32; m + 1];
        let add = |bit: &mut Vec<i32>, mut i: usize| {
            while i <= m {
                bit[i] += 1;
                i += i & i.wrapping_neg();
            }
        };
        let prefix = |bit: &Vec<i32>, mut i: usize| -> i32 {
            let mut s = 0;
            while i > 0 {
                s += bit[i];
                i -= i & i.wrapping_neg();
            }
            s
        };
        let mut out = vec![0i32; n];
        for i in (0..n).rev() {
            let r = rank(nums[i]);
            out[i] = prefix(&bit, r - 1);
            add(&mut bit, r);
        }
        out
    }
}
```
```typescript
function countSmaller(nums: number[]): number[] {
  const n = nums.length;
  // Coordinate compression: distinct sorted values, 1-based rank.
  const sorted = [...new Set(nums)].sort((a, b) => a - b);
  const rankOf = new Map<number, number>();
  for (let i = 0; i < sorted.length; i++) rankOf.set(sorted[i], i + 1);
  const m = sorted.length;
  const bit = new Array<number>(m + 1).fill(0);
  const add = (i: number): void => {
    for (; i <= m; i += i & -i) bit[i] += 1;
  };
  const prefix = (i: number): number => {
    let s = 0;
    for (; i > 0; i -= i & -i) s += bit[i];
    return s;
  };
  const out = new Array<number>(n).fill(0);
  for (let i = n - 1; i >= 0; i--) {
    const r = rankOf.get(nums[i])!;
    out[i] = prefix(r - 1);
    add(r);
  }
  return out;
}
```
Trace
BIT sweep over [5, 2, 6, 1]. Compressed ranks: 1 → 1, 2 → 2, 5 → 3, 6 → 4. We process right to left.
| i | nums[i] | rank | prefix(rank-1) | bit after add |
|---|---|---|---|---|
| 3 | 1 | 1 | 0 | [_, 1, 1, 0, 1] |
| 2 | 6 | 4 | 1 | [_, 1, 1, 0, 2] |
| 1 | 2 | 2 | 1 | [_, 1, 2, 0, 3] |
| 0 | 5 | 3 | 2 | [_, 1, 2, 1, 4] |
Reading down the prefix(rank-1) column gives the answer for positions 3, 2, 1, 0 respectively: 0, 1, 1, 2. Written in original order: [2, 1, 1, 0].
Edge cases
- Single element — no right neighbors, answer is `[0]`.
- All equal — strict inequality, so every count is `0`.
- Negative values — compression handles the sign transparently.
- Large range, small `n` — compression keeps the BIT size at O(n); the raw value range never matters.
- Duplicates — they share a compressed rank, and the strict `prefix(rank - 1)` query excludes them correctly.
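These cases are quick to check against the O(n²) brute force, which is short enough to restate inline as the reference (the `bruteCountSmaller` name is just for this sketch):

```typescript
// One-line brute force: for each position, count strictly smaller values to its right.
const bruteCountSmaller = (nums: number[]): number[] =>
  nums.map((v, i) => nums.slice(i + 1).filter((w) => w < v).length);

console.log(bruteCountSmaller([7]));          // expected: [0]       single element
console.log(bruteCountSmaller([4, 4, 4]));    // expected: [0, 0, 0] all equal, strict <
console.log(bruteCountSmaller([-1, -3, -2])); // expected: [2, 0, 0] negatives
console.log(bruteCountSmaller([2, 2, 1]));    // expected: [1, 1, 0] duplicates
```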
Any problem that asks "count elements with some monotone relationship to the current one" (inversions, dominating points, skyline depth) is a BIT over ranks. Memorize the compression + sweep template.
Takeaway
Right-to-left sweep + Fenwick over compressed ranks is the default tool for online rank queries. Merge sort gives the same asymptotics but demands more bookkeeping; reach for it when you need the sorted order as a byproduct.
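In the "brute force as oracle" spirit, a differential check pins the fast version against the O(n²) scan on random inputs. `bitCountSmaller` here is a condensed restatement of the BIT solution above, not new logic:

```typescript
// Oracle: O(n^2) scan.
const bruteOracle = (nums: number[]): number[] =>
  nums.map((v, i) => nums.slice(i + 1).filter((w) => w < v).length);

// Condensed BIT + coordinate compression sweep.
function bitCountSmaller(nums: number[]): number[] {
  const sorted = [...new Set(nums)].sort((a, b) => a - b);
  const rank = new Map<number, number>();
  sorted.forEach((v, i) => rank.set(v, i + 1)); // 1-based ranks
  const m = sorted.length;
  const bit = new Array<number>(m + 1).fill(0);
  const add = (i: number) => {
    for (; i <= m; i += i & -i) bit[i] += 1;
  };
  const prefix = (i: number) => {
    let s = 0;
    for (; i > 0; i -= i & -i) s += bit[i];
    return s;
  };
  const out = new Array<number>(nums.length).fill(0);
  for (let i = nums.length - 1; i >= 0; i--) {
    const r = rank.get(nums[i])!;
    out[i] = prefix(r - 1); // strictly smaller values already seen
    add(r);                 // record this value
  }
  return out;
}

// Random arrays with negatives and duplicates; throw on any disagreement.
for (let trial = 0; trial < 100; trial++) {
  const nums = Array.from({ length: 40 }, () => Math.floor(Math.random() * 41) - 20);
  if (JSON.stringify(bruteOracle(nums)) !== JSON.stringify(bitCountSmaller(nums))) {
    throw new Error(`mismatch on ${JSON.stringify(nums)}`);
  }
}
console.log("100 random trials passed");
```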
Both techniques reappear in #2407 (segment tree over value windows) and in the guts of trading systems that maintain rolling statistics over sliding windows of mutable data.
More in Segment Tree & BIT
Classic segment tree. Build, update, query. Implement from scratch in Rust.
Coordinate compression + lazy propagation. A complete segment tree exercise.
Segment tree optimization on DP transitions. The ‘DP + data structure’ pattern.