PILLAR IV — THE PIPELINE
From field recording to AI caption.
Eight stages, each transparent. Every point on the atlas passed through this exact pipeline. No black boxes, no hand-waving.
EIGHT STAGES
The end-to-end methodology.
Record
Field capture. AudioMoth passive recorders and directional shotgun mics. CC-licensed archives.
795 recordings
Preprocess
Noise reduction, silence trimming, segmentation. Bandpass 200 Hz–8 kHz.
~3.2 s median
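A minimal sketch of the Preprocess stage, assuming SciPy is available. Only the 200 Hz to 8 kHz passband comes from the text above; the filter order, the 20 ms frame size, the -40 dB silence threshold, and the function names `bandpass` and `trim_silence` are illustrative assumptions, not the atlas's actual settings.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def bandpass(audio, sample_rate, low_hz=200.0, high_hz=8000.0, order=4):
    """Zero-phase Butterworth bandpass. The order is an assumed value."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass",
                 fs=sample_rate, output="sos")
    return sosfiltfilt(sos, audio)


def trim_silence(audio, sample_rate, frame_s=0.02, threshold_db=-40.0):
    """Drop leading and trailing frames quieter than threshold_db,
    measured relative to the loudest frame (assumed heuristic)."""
    frame = max(1, int(round(frame_s * sample_rate)))
    n_frames = len(audio) // frame
    if n_frames == 0:
        return audio
    frames = audio[: n_frames * frame].reshape(n_frames, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    peak = rms.max()
    if peak == 0:
        return audio[:0]  # the whole clip is digital silence
    level_db = 20 * np.log10(rms / peak + 1e-12)
    active = np.flatnonzero(level_db > threshold_db)
    return audio[active[0] * frame : (active[-1] + 1) * frame]


# Synthetic stand-in clip: a 1 kHz tone plus 50 Hz hum, padded with silence.
sr = 32000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)
hum = np.sin(2 * np.pi * 50 * t)
silence = np.zeros(sr // 2)
clip = np.concatenate([silence, tone + hum, silence])

cleaned = trim_silence(bandpass(clip, sr), sr)
```

Filtering before trimming means low-frequency hum cannot keep a frame above the silence threshold. The upper band edge must sit below half the sample rate, so recordings sampled below 16 kHz would need a lower `high_hz`.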
Embed
Pass each segment through Perch. Produces a 1024-dimensional vector. No labels.
Perch / BEATs
Project
Reduce 1024 dims to 2 via UMAP. Preserves local neighborhood structure.
UMAP
Cluster
HDBSCAN density-based clustering. Finds groups without being told how many.
9 clusters
Label
Match clusters to the ethological literature. Labels already established there become region names on the atlas.
primary citations
Caption
NatureLM-audio generates zero-shot descriptions. Used as a hypothesis generator.
zero-shot
Interpret
Editorial synthesis. Read primary literature, compare to model outputs.
83 essays