How the cluster boundaries got drawn

The procedure is: embed every clip with a foundation model (we use NatureLM-audio[6] in the v1 corpus; production work would use ), get a 1,024- or 1,536-dimensional vector per clip, then run HDBSCAN (hierarchical density-based spatial clustering) on the full embeddings. HDBSCAN finds dense regions of arbitrary shape without requiring a target cluster count. On American crow audio, it settles on nine to twelve regions depending on the minimum-samples parameter. The names (territorial, mobbing, assembly, rattle, begging, companion, quiet grunts, loud grunts, exceptional) are then assigned post hoc by listening to clips from each region and matching them against the prior descriptive literature: Marzluff[4] and Angell, Mates[1] et al., and the Verbeek[2] 2024 Birds of the World update. Human biologists name; the model draws.

Territorial caw

Long-duration caws emitted from a perch, typically by a single bird, faced outward across a territory boundary. The acoustic profile is one of declaration: long durations, regular spacing, low spectral roughness. The call carries caller identity, sex, and approximate intent. This is the cluster that maps most directly onto the canonical caw the broader public hears in their neighborhoods. The territorial cluster is geometrically distant from mobbing in the embedding space, distinguished primarily by inter-call interval and spectral packing rather than by the individual call's acoustic shape.

Mobbing alarm

Compressed, urgent caws delivered in rapid sequences, often by recruited pairs or groups targeting an aerial predator. Spectrally rough, tightly packed in time. Kevin McGowan's Cornell field studies catalogued the recruitment chains: one bird sees, calls; three arrive; ten more follow. Some flocks remember and re-mob the same individual predator for years. The mobbing cluster sits high in the space, separated from territorial by call rate alone — the individual calls within a mobbing sequence are not radically different from territorial calls, but the sequence structure is.
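To make the sequence-level distinction concrete, here is a small sketch of how inter-call interval and call rate could be computed from call onset times. The onset values are invented for illustration and are not measurements from the corpus; the function names are hypothetical, and how onsets would be detected in real audio is outside this sketch.

```python
from statistics import median


def inter_call_intervals(onsets_s):
    """Gaps in seconds between consecutive call onsets (input need not be sorted)."""
    ordered = sorted(onsets_s)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def call_rate_hz(onsets_s):
    """Calls per second, measured over the span from first to last onset."""
    if len(onsets_s) < 2:
        return 0.0
    span = max(onsets_s) - min(onsets_s)
    return (len(onsets_s) - 1) / span if span > 0 else 0.0


# Hypothetical onset times in seconds, invented for illustration only.
territorial_onsets = [0.0, 1.5, 3.0, 4.5, 6.0]          # slow, regular spacing
mobbing_onsets = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25]      # tightly packed

for name, onsets in [("territorial", territorial_onsets), ("mobbing", mobbing_onsets)]:
    gaps = inter_call_intervals(onsets)
    print(name, "median gap:", median(gaps), "s, rate:", round(call_rate_hz(onsets), 2), "calls/s")
```

Two sequences built from acoustically similar individual calls can still differ sharply on these two numbers, which is the kind of separation described above.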

Assembly calls

Loud, far-carrying calls designed to summon group members to a foraging discovery or settling roost. Where territorial calls keep neighbors at arm's length, assembly calls draw the crowd in. The acoustic distinction shows up in inter-call interval, not in spectral shape, and the model separates the two cleanly. Two of the assembly recordings in CrowLingo's v1 corpus come from urban roosts of approximately twenty thousand birds in residential Minneapolis, an acoustic density that static spectrograms barely capture.

Rattle

The rattle is the strangest sound in the American crow repertoire — mechanical, almost prehistoric, weakly harmonic. Rattles appear in affiliative, recruitment, and occasionally territorial contexts, with high individual variation. Pairs and family groups seem to use them to signal something closer to mood than position. The acoustic geometry of the rattle cluster is unlike anything else in the crow's catalog, and unlike the rattle equivalents in other corvid species' embeddings. If a single cluster defines what makes American crows acoustically distinct, this is it.

Juvenile begging

Higher-frequency, narrower-band calls from juveniles soliciting feeding. Diagnostic spectral signature: tightly clustered, easy to identify. The Marzluff[4] lab's parent-offspring work showed begging calls are surprisingly individual; chicks recognize their own parents' return calls within weeks of fledging. The begging cluster sits in its own corner of the space, geometrically distant from everything except — interestingly — the rattle cluster, suggesting a deep acoustic relationship that traditional category labels missed.
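One way to check a claim like "distant from everything except the rattle cluster" is to compare cluster centroids. The sketch below assumes cosine distance between mean embeddings as the measure, which is an assumption on our part rather than a documented choice, and uses toy three-dimensional vectors invented for illustration in place of real embeddings. The helper names are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(a, b):
    """1 minus cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_cluster(name, clusters):
    """Return the other cluster whose centroid is closest to that of `name`."""
    centroids = {key: centroid(vecs) for key, vecs in clusters.items()}
    others = [key for key in centroids if key != name]
    return min(others, key=lambda key: cosine_distance(centroids[name], centroids[key]))


# Toy vectors, invented for illustration; these are not corpus data.
toy_clusters = {
    "begging": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0]],
    "rattle": [[0.8, 0.6, 0.0], [0.7, 0.7, 0.1]],
    "territorial": [[0.0, 0.1, 1.0], [0.1, 0.0, 0.9]],
}

print(nearest_cluster("begging", toy_clusters))
```

On real embeddings the same comparison would run over all nine clusters, and a density-aware measure might be preferable to plain centroids when clusters have irregular shapes.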

Companion calls

Soft contact calls between paired adults, heavily individual and pair-specific. Every long-term mated couple has a recognizable acoustic signature. The Wright laboratory's analyses of dyadic corvid vocalizations through the 2010s and 2020s consistently found pair-specific patterns that survive seasonal changes. The companion cluster in our v1 corpus contains several of the canonical reference recordings (the USGS sample, the Wikipedia article introduction) because companion-call exemplars are what gets recorded when a researcher simply points a microphone at a perched adult crow doing nothing dramatic.

Quiet grunts and loud grunts

These are the two clusters where the v1 CC-licensed corpus has the thinnest coverage. Both denote close-range vocalizations — quiet grunts in affiliative and parent-offspring contexts, loud grunts during foraging and recruitment. Both are acoustically subtle. The 2026 Demartsev[3] paper on carrion crows finally surfaced quiet grunts at scale using wearable bioacoustic loggers; American crow studies are catching up. The clusters exist on the map as placeholders for what we know is there but haven't yet sourced in license-compatible form.

The exceptional cluster

Atypical calls, rare vocalizations, anything the model places but the named categories can't account for. It's where future repertoire expansion lives — the dialect variants we haven't named yet, the bridge sounds between known clusters, the calls that don't fit because the categories themselves are still too coarse. Whether this cluster ultimately splits into named sub-clusters, or stays as the formal home for outliers, depends on how the corpus grows.

The new vs the old

Six of the nine clusters map cleanly onto the canonical categories the hand-labeling tradition recognized: territorial, mobbing, assembly, rattle, begging, companion. Three are genuinely new contributions of the AI approach: the explicit quiet-grunts / loud-grunts split (which classical methods often conflated or missed), and the exceptional category as a first-class home for atypical calls. The hand-labels weren't wrong. They were less granular than the geometry the model recovered. That's the honest summary of what AI added to crow bioacoustics in 2024-2026: not new sounds, but a finer-grained framework for the sounds we already knew were there.