Find the Shortest Superstring
Bitmask DP (TSP with overlap cost)
Bitmask DP (TSP variant). Can you identify it within 3 minutes?
Problem
Given an array of strings words, return the shortest string that contains every word as a substring. If several answers have equal length, return any of them. You may assume no word is a substring of another (after a standard pre-pass).
Constraints that shape the algorithm: 1 ≤ n ≤ 12 and |word| ≤ 20. An n under 16 is the tell-tale sign of a bitmask DP.
Intuition
If you just concatenate the words in any order, the total length is the sum of all lengths. Every time the suffix of one word matches the prefix of the next, you can overlap them and save characters. So the question becomes: find an ordering that maximizes total overlap — equivalently, minimizes total length.
Formally: pick a permutation p of the n words; cost is Σ len(words[p[i]]) − Σ overlap(words[p[i−1]], words[p[i]]). That's TSP on a complete digraph where nodes are words and edges carry overlap amounts. TSP is NP-hard in general, but n ≤ 12 means 2^n · n² ≈ 590k — trivial for a bitmask DP.
dp[mask][i] = minimum total length of a chain that uses exactly the words in mask and ends at word i. Transition: extend by any word j not yet in the mask, paying len(words[j]) − overlap(i, j).
Reconstruction needs parent[mask][i] — the word we were at before arriving at i.
Approach
Brute-Force Permutation
O(n! · n · L)O(n)Enumerate all n! permutations, build each candidate string, keep the shortest.
n = 12 gives 479M permutations, far too slow in practice but useful as a correctness oracle during development.
impl Solution {
pub fn shortest_superstring(words: Vec<String>) -> String {
let n = words.len();
let mut perm: Vec<usize> = (0..n).collect();
let mut best: Option<String> = None;
Self::permute(&mut perm, 0, &words, &mut best);
best.unwrap_or_default()
}
fn permute(perm: &mut Vec<usize>, k: usize, words: &[String], best: &mut Option<String>) {
if k == perm.len() {
let mut s = words[perm[0]].clone();
for w in 1..perm.len() {
s = Self::merge(&s, &words[perm[w]]);
}
if best.as_ref().map_or(true, |b| s.len() < b.len()) {
*best= Some(s);
}
return;
}
for i in k..perm.len() {
perm.swap(k, i);
Self::permute(perm, k + 1, words, best);
perm.swap(k, i);
}
}
fn merge(a: &str, b: &str) -> String {
let max_k = a.len().min(b.len());
for k in (0..=max_k).rev() {
if a.as_bytes().ends_with(&b.as_bytes()[..k]) {
let mut out = a.to_string();
out.push_str(&b[k..]);
return out;
}
}
unreachable!()
}
}
function shortestSuperstring(words: string[]): string {
let best = '';
const perm = [...words.keys()];
const permute = (k: number): void => {
if (k === perm.length) {
let s = words[perm[0]];
for (let w = 1; w < perm.length; w++) s = merge(s, words[perm[w]]);
if (best === '' || s.length < best.length) best = s;
return;
}
for (let i = k; i < perm.length; i++) {
[perm[k], perm[i]] = [perm[i], perm[k]];
permute(k + 1);
[perm[k], perm[i]] = [perm[i], perm[k]];
}
};
const merge = (a: string, b: string): string => {
const max = Math.min(a.length, b.length);
for (let k = max; k >= 0; k--) {
if (a.endsWith(b.slice(0, k))) return a + b.slice(k);
}
return a + b;
};
permute(0);
return best;
}Bitmask DP (Held–Karp)
O(2^n · n^2 + n^2 · L)O(2^n · n)- Pre-pass: remove any word that is a substring of another (otherwise the "no duplicates / no containment" assumption breaks).
- Precompute
overlap[i][j]inO(n^2 · L): maxkwhere the lastkchars ofwords[i]match the firstkchars ofwords[j]. - DP.
dp[mask][i]= min length of any chain using words inmask, ending ati. Base:dp[{i}][i] = len(words[i]). Transition:dp[mask | (1<<j)][j]= min(..., dp[mask][i] + len(words[j]) − overlap[i][j])for everyj ∉ mask. - Reconstruct via
parent[mask][i]: walk backwards from the best end state, popping bits, to recover the permutation.
impl Solution {
pub fn shortest_superstring(words: Vec<String>) -> String {
let mut words: Vec<Vec<u8>> = {
let mut seen = std::collections::HashSet::new();
words
.into_iter()
.map(|s| s.into_bytes())
.filter(|w| seen.insert(w.clone()))
.collect()
};
let mut keep = vec![true; words.len()];
for i in 0..words.len() {
for j in 0..words.len() {
if i == j || !keep[i] || words[i].len() >= words[j].len() {
continue;
}
if words[j].windows(words[i].len()).any(|w| w == words[i].as_slice()) {
keep[i] = false;
break;
}
}
}
words = words.into_iter().zip(keep).filter(|(_, k)| *k).map(|(w, _)| w).collect();
let n = words.len();
if n == 0 { return String::new(); }
if n == 1 { return String::from_utf8(words[0].clone()).unwrap(); }
let mut overlap = vec![vec![0usize; n]; n];
for i in 0..n {
for j in 0..n {
if i == j { continue; }
let max_k = words[i].len().min(words[j].len());
for k in (1..=max_k).rev() {
if words[i].ends_with(&words[j][..k]) {
overlap[i][j] = k;
break;
}
}
}
}
const INF: usize = usize::MAX / 4;
let full = 1usize << n;
let mut dp= vec![vec![INF; n]; full];
let mut parent= vec![vec![usize::MAX; n]; full];
for i in 0..n {
dp[1 << i][i]= words[i].len();
}
for mask in 1..full {
for i in 0..n {
if (mask >> i) & 1 == 0 || dp[mask][i] == INF { continue; }
for j in 0..n {
if (mask >> j) & 1 == 1 { continue; }
let new_mask = mask | (1 << j);
let cand= dp[mask][i] + words[j].len() - overlap[i][j];
if cand < dp[new_mask][j] {
dp[new_mask][j]= cand;
parent[new_mask][j]= i;
}
}
}
}
let all= full - 1;
let mut best_i= 0;
for i in 1..n {
if dp[all][i] < dp[all][best_i] { best_i= i; }
}
let mut order: Vec<usize> = Vec::with_capacity(n);
let (mut mask, mut i) = (all, best_i);
while i != usize::MAX {
order.push(i);
let prev = parent[mask][i];
mask ^= 1 << i;
i= prev;
}
order.reverse();
let mut out= words[order[0]].clone();
for w in 1..order.len() {
let ov= overlap[order[w - 1]][order[w]];
out.extend_from_slice(&words[order[w]][ov..]);
}
String::from_utf8(out).unwrap()
}
}function shortestSuperstring(words: string[]): string {
const unique = Array.from(new Set(words));
const keep = unique.map(() => true);
for (let i = 0; i < unique.length; i++) {
for (let j = 0; j < unique.length; j++) {
if (i === j || !keep[i] || unique[i].length >= unique[j].length) continue;
if (unique[j].includes(unique[i])) { keep[i] = false; break; }
}
}
const W = unique.filter((_, i) => keep[i]);
const n = W.length;
if (n === 0) return '';
if (n === 1) return W[0];
const overlap: number[][] = Array.from({ length: n }, () => new Array(n).fill(0));
for (let i = 0; i < n; i++) {
for (let j = 0; j < n; j++) {
if (i === j) continue;
const maxK = Math.min(W[i].length, W[j].length);
for (let k = maxK; k >= 1; k--) {
if (W[i].endsWith(W[j].slice(0, k))) { overlap[i][j] = k; break; }
}
}
}
const INF = Number.MAX_SAFE_INTEGER / 4;
const full = 1 << n;
const dp: number[][] = Array.from({ length: full }, () => new Array(n).fill(INF));
const parent: number[][] = Array.from({ length: full }, () => new Array(n).fill(-1));
for (let i = 0; i < n; i++) dp[1 << i][i] = W[i].length;
for (let mask = 1; mask < full; mask++) {
for (let i = 0; i < n; i++) {
if (!((mask >> i) & 1) || dp[mask][i] === INF) continue;
for (let j = 0; j < n; j++) {
if ((mask >> j) & 1) continue;
const nm = mask | (1 << j);
const cand = dp[mask][i] + W[j].length - overlap[i][j];
if (cand < dp[nm][j]) { dp[nm][j] = cand; parent[nm][j] = i; }
}
}
}
const all = full - 1;
let best = 0;
for (let i = 1; i < n; i++) if (dp[all][i] < dp[all][best]) best = i;
const order: number[] = [];
let mask = all, i = best;
while (i !== -1) {
order.push(i);
const prev = parent[mask][i];
mask ^= 1 << i;
i = prev;
}
order.reverse();
let out = W[order[0]];
for (let w = 1; w < order.length; w++) {
out += W[order[w]].slice(overlap[order[w - 1]][order[w]]);
}
return out;
}Trace
On ["alex","loves","leetcode"], all pairwise overlaps are 0 (no suffix/prefix matches), so dp[full][i] = 4 + 5 + 8 = 17 for every ending i.
| mask | ending i | dp | parent |
|---|---|---|---|
| {alex} | alex | 4 | - |
| {loves} | loves | 5 | - |
| {leetcode} | leetcode | 8 | - |
| {alex,loves} | loves | 9 | alex |
| {alex,loves,leetcode} | leetcode | 17 | loves |
Edge cases
- Duplicates / containment — the pre-pass deduplicates and strips substring-of-another words; forgetting this produces wrong overlaps.
n = 1— return the sole word directly; the DP loop won't enter any transition.- Disjoint words (no overlap) — every
overlap[i][j] = 0;dp[full][i] = Σ len. Output is still a valid superstring, just not shorter than the raw concat. - Multiple optimal answers — different tie-breaks produce different strings of the same length. Judges (and our harness) accept any valid shortest.
Takeaway
When n ≤ 16 and you need to pick a best permutation, reach for bitmask DP. The state (mask, last) is the universal Held–Karp shape for TSP variants — shortest superstring, traveling purchaser, sequential decision problems.
Under interview pressure, the 3-minute pattern-match is: small n, permutation search, optimization objective → bitmask DP. Explaining the dp[mask][i] state out loud is often enough to earn partial credit even if you don't finish coding.
More in Timed Random Hards
Sweep line + ordered multiset. Compose techniques under pressure.
Set cover via bitmask DP. n ≤ 16 is the signal.
2D sweep line + coordinate compression. Multi-technique composition.
Ordered map interval tracking. Relevant to order book modeling.