Merge adjacent segments by spectral similarity — merge_by

Given a set of adjacent segment ID pairs and per-segment mean feature vectors, compute a mapping that merges (clusters) segments whose mean vectors are sufficiently similar. Similarity is evaluated only between segments that are adjacent (share an edge/corner), and merges are propagated transitively via a union-find (disjoint-set) structure.

Usage

merge_by_adjacency(adj_pairs, means, thr)

Arguments

adj_pairs: Integer matrix with two columns. Each row is an undirected adjacency between segment IDs (e.g., output of `extract_seam_pairs()`). Pairs may contain duplicates; they do not need to be sorted.
means: Numeric matrix of per-segment mean features (e.g., band means). Row names must be segment IDs as character strings (e.g., `"123"`), and columns are feature dimensions (bands or derived features).
thr: Numeric scalar. Merge threshold applied to the Euclidean distance between mean feature vectors. Adjacent segments with distance `< thr` are merged into the same component.

Value

An integer vector mapping segment IDs to representative IDs. The vector is named by the original segment IDs (as character) and its values are the representative IDs (as integer). Only IDs present in `adj_pairs` are included. If `adj_pairs` is empty, returns an empty named integer vector.
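A mapping of this shape can be applied to a vector of segment labels by name lookup. The following is a minimal sketch, assuming a hypothetical mapping `map` and a hypothetical label vector `labels`; neither comes from the package, and labels absent from the mapping are left unchanged here by choice.

```r
# Hypothetical mapping in the shape described above:
# names are original segment IDs, values are representative IDs.
map <- c("1" = 1L, "2" = 1L, "3" = 1L)

# Hypothetical per-pixel (or per-polygon) segment labels.
labels <- c(1L, 2L, 3L, 5L, 2L)

# Look up each label by name; labels not in the mapping give NA.
idx <- match(as.character(labels), names(map))

# Replace only the labels that have an entry in the mapping.
relabeled <- labels
hit <- !is.na(idx)
relabeled[hit] <- map[idx[hit]]
```

Label `5` never appears in the mapping, so it keeps its original value.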

Details

This function is typically used after extracting seam or boundary adjacencies (e.g., from tiled segmentations) to reconcile segment IDs across tiles and reduce seam artifacts by merging regions with near-identical spectral means.

**Algorithm.** For each adjacency `(a, b)`, the function extracts the feature vectors `means[a, ]` and `means[b, ]` and computes their Euclidean distance `d = sqrt(sum((means[a, ] - means[b, ])^2))`. If neither vector contains `NA` and `d < thr`, the two segments are unioned in a disjoint-set (union-find) structure. After all pairs are processed, each segment is assigned the representative (root) of its connected component; path compression keeps repeated finds fast.
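The steps above can be sketched in a few lines of R. This is an illustrative reimplementation under stated assumptions, not the package's source: the name `merge_sketch` is hypothetical, every ID in `adj_pairs` is assumed to have a row in `means`, and the choice of which root survives a union is arbitrary.

```r
# Hypothetical sketch of the described algorithm (not the package source).
merge_sketch <- function(adj_pairs, means, thr) {
  if (nrow(adj_pairs) == 0L) {
    return(setNames(integer(0), character(0)))
  }
  # Coerce to integer so IDs convert to plain strings such as "100000".
  storage.mode(adj_pairs) <- "integer"
  ids <- unique(as.character(adj_pairs))
  parent <- setNames(ids, ids)  # every ID starts as its own root

  find <- function(x) {
    # Walk up to the root of x's component.
    root <- x
    while (parent[[root]] != root) root <- parent[[root]]
    # Path compression: point every node on the path at the root.
    while (parent[[x]] != root) {
      nxt <- parent[[x]]
      parent[[x]] <<- root
      x <- nxt
    }
    root
  }

  for (i in seq_len(nrow(adj_pairs))) {
    a <- as.character(adj_pairs[i, 1])
    b <- as.character(adj_pairs[i, 2])
    va <- means[a, ]
    vb <- means[b, ]
    if (anyNA(va) || anyNA(vb)) next      # skip pairs with missing features
    d <- sqrt(sum((va - vb)^2))           # Euclidean distance
    if (d < thr) {
      ra <- find(a)
      rb <- find(b)
      if (ra != rb) parent[[rb]] <- ra    # union: attach one root to the other
    }
  }

  # Map every ID to the root of its component.
  reps <- vapply(ids, find, character(1))
  setNames(as.integer(reps), ids)
}

adj <- matrix(c(1L, 2L, 2L, 3L, 10L, 11L), ncol = 2, byrow = TRUE)
means <- rbind(
  "1"  = c(0.10, 0.20),
  "2"  = c(0.11, 0.19),
  "3"  = c(0.12, 0.18),
  "10" = c(0.80, 0.75),
  "11" = c(0.81, 0.74)
)
m <- merge_sketch(adj, means, thr = 0.05)
```

A production implementation would likely add union by rank or size and validate that all IDs exist in `means`; both are omitted here for brevity.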

**Transitivity.** Because union-find forms connected components, merges are transitive: if `a` merges with `b` and `b` merges with `c`, then `a`, `b`, and `c` will share the same representative even if `(a, c)` was never an explicit adjacency.
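One consequence worth checking numerically is chaining: two segments can end up in the same component even when their own distance is not below `thr`. The values below are hypothetical and chosen only to illustrate this.

```r
# Hypothetical one-band-plus-zero means forming a chain 1 - 2 - 3.
means <- rbind(
  "1" = c(0.00, 0),
  "2" = c(0.04, 0),
  "3" = c(0.08, 0)
)
thr <- 0.05

# Euclidean distance between two rows of `means`, looked up by ID.
euclid <- function(a, b) sqrt(sum((means[a, ] - means[b, ])^2))

d12 <- euclid("1", "2")  # below thr: 1 and 2 merge
d23 <- euclid("2", "3")  # below thr: 2 and 3 merge
d13 <- euclid("1", "3")  # above thr, yet 1 and 3 share a component via 2
```

If this chaining is undesirable for a given dataset, a smaller `thr` limits how far a chain of merges can drift.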

**Representatives.** Representatives are chosen by union order (the root in the union-find structure), not necessarily the smallest ID. If you require canonical representatives (e.g., minimum ID per component), post-process the mapping accordingly.
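One possible post-processing step, sketched here with a hypothetical mapping `map` in the shape described under Value, replaces each representative with the minimum original ID in its component:

```r
# Hypothetical mapping: representatives are whatever roots union-find produced.
map <- c("1" = 2L, "2" = 2L, "3" = 2L, "10" = 11L, "11" = 11L)

# Original IDs as integers, taken from the names.
ids <- as.integer(names(map))

# For each component (grouped by current representative), take the minimum ID.
canon <- setNames(as.integer(ave(ids, map, FUN = min)), names(map))
```

The result keeps the same names as `map`, so it can be used as a drop-in replacement wherever the original mapping was used.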

**Robustness.** Pairs whose feature vectors contain `NA` are skipped. Only IDs appearing in `adj_pairs` are considered; if `means` contains additional segments not present in `adj_pairs`, they will not appear in the output map.
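If a mapping covering every segment in `means` is needed, the output can be extended so that segments without adjacencies map to themselves. The sketch below uses hypothetical inputs and assumes the row names of `means` are integer-like strings.

```r
# Hypothetical means: segment "7" has no adjacency and is absent from the map.
means <- rbind(
  "1" = c(0.10, 0.20),
  "2" = c(0.11, 0.19),
  "7" = c(0.50, 0.60)
)
map <- c("1" = 1L, "2" = 1L)  # hypothetical output for the adjacent pair (1, 2)

# Start from the identity mapping over all segments in `means` ...
full <- setNames(as.integer(rownames(means)), rownames(means))

# ... then overwrite the entries that the merge step actually produced.
full[names(map)] <- map
```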

Examples

if (FALSE) { # \dontrun{

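# Example adjacency pairs (undirected), stored as an integer matrix
adj <- matrix(c(1L, 2L, 2L, 3L, 10L, 11L), ncol = 2, byrow = TRUE)
# Per-segment mean features; row names are the segment IDs as strings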
means <- rbind(
  "1"  = c(0.10, 0.20),
  "2"  = c(0.11, 0.19),
  "3"  = c(0.12, 0.18),
  "10" = c(0.80, 0.75),
  "11" = c(0.81, 0.74)
)
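# Each listed pair is about 0.014 apart (< 0.05), so segments 1, 2, 3 should
# share one representative and segments 10, 11 another
merge_by_adjacency(adj, means, thr = 0.05)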
} # }