CrowLingo

PILLAR IV — THE PIPELINE

From field recording to AI caption.

Eight stages, each transparent. Every point on the atlas passed through this exact pipeline. No black boxes, no hand-waving.

EIGHT STAGES

The end-to-end methodology.

01

Record

Field capture. AudioMoth passive recorders and directional shotgun mics. CC-licensed archives.

795 recordings

02

Preprocess

Noise reduction, silence trimming, segmentation. Bandpass 200 Hz–8 kHz.
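The 200 Hz–8 kHz bandpass can be sketched as below. This is a minimal illustration, not the pipeline's actual code: the Butterworth design, the filter order, the zero-phase filtering, the 32 kHz sample rate, and the `bandpass` helper name are all assumptions; only the two cutoff frequencies come from the stage description.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def bandpass(audio, sample_rate, low_hz=200.0, high_hz=8000.0, order=4):
    """Zero-phase Butterworth bandpass (filter family and order are assumed)."""
    nyquist = sample_rate / 2.0
    if not 0 < low_hz < high_hz < nyquist:
        raise ValueError("cutoffs must satisfy 0 < low < high < sample_rate / 2")
    # Second-order sections are numerically safer than (b, a) coefficients.
    sos = butter(order, [low_hz, high_hz], btype="bandpass",
                 fs=sample_rate, output="sos")
    # Forward-backward filtering avoids shifting call onsets in time.
    return sosfiltfilt(sos, audio)


# Synthetic one-second test signal: 50 Hz rumble plus a 1.5 kHz tone.
sample_rate = 32000  # hypothetical sample rate
t = np.arange(sample_rate) / sample_rate
rumble = np.sin(2 * np.pi * 50 * t)
call = np.sin(2 * np.pi * 1500 * t)

filtered = bandpass(rumble + call, sample_rate)
# RMS difference between the filtered mix and the in-band tone alone.
residual_rms = float(np.sqrt(np.mean((filtered - call) ** 2)))
```

On this synthetic input the out-of-band rumble is suppressed while the in-band tone passes nearly unchanged, so `residual_rms` stays small.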

~3.2s median

03

Embed

Pass each segment through Perch. Produces a 1024-dimensional vector. No labels.

Perch / BEATs

04

Project

Reduce 1024 dims to 2 via UMAP. Preserves local neighborhood structure.

UMAP

05

Cluster

HDBSCAN density-based clustering. Finds groups without being told how many.

9 clusters

06

Label

Match clusters to the ethological literature. Call labels from earlier studies become region names on the map.

primary citations

07

Caption

NatureLM-audio generates zero-shot descriptions. Used as a hypothesis generator.

zero-shot

08

Interpret

Editorial synthesis. Read primary literature, compare to model outputs.

83 essays

See the pipeline in action.

Open the atlas →