FIG 0.1 — American crow · 1024-dimensional audio space
We can finally see what we couldn't hear.
A focused exploration of one species and one revolution: the American crow, and the new generation of AI audio models that turn its voice into a map. Seven hundred ninety-five recordings. Nine emergent clusters. One geometry.
- 795 Recordings
- 9 Clusters
- 83 Essays
- 1024 Dimensions
FIG 0.2 — Three coordinates
Listen. Then find each call on the map.
Perched flock — territorial caws
Rattle complex — affiliative
Juvenile begging + adult exchange
FIG 0.3 — The shift
Discrete categories gave way to a continuous map.
A fifty-year methodology change, compressed into one frame.
The change is not that we found new sounds. The change is that we stopped treating each call as a label, and started treating the whole repertoire as a geometry. Graded variation, dialect, individual signature: all of it visible at once, on the same map, in milliseconds per call.
The old labels survive as labels of regions in the map, not as boundaries on the world.
— CrowLingo Editorial
The 2026 carrion-crow bioRxiv preprint by Demartsev et al. used wearable loggers and this mapping discipline to recover both discrete and graded structure in grunts and caws. Territory by territory, individual by individual. The method is the message: stop sorting, start mapping.
FIG 0.4 — Honest framing
We mapped the language. We have not learned it.
Demonstrated today
What the models can do
- Automatic detection and segmentation of crow vocalizations from field audio
- Unsupervised category discovery — clusters emerge from geometry, not labels
- Caller-identity inference at individual-bird resolution
- Behavioral-context mapping across nine distinct call types
- Zero-shot captioning via the NatureLM-audio foundation model
Not yet here
What's still ahead
- Compositional decoding — understanding calls as combined units, not single tokens
- Real-time bidirectional dialogue between human and crow
- A verified "crow dictionary" with human-readable glosses
- Cross-species transfer showing which structures generalize
- Field-deployable playback systems with ethical guardrails
FIG 0.5 — Where to next
Four ways in, scaled by commitment.
01 · 90 seconds
Listen first
Three real crow recordings with spectrograms and AI interpretation.
02 · 10 minutes
Open the atlas
Interactive 2D map. Nine clusters, real spectrograms, behavioral context.
03 · 30 minutes
Read the journal
Eighty-three long-form essays on AI bioacoustics and corvid cognition.
04 · An evening
Study the methods
Self-supervised audio, latent spaces, NatureLM-audio. Source by source.
Frequently asked
What people ask about this.
What is CrowLingo?
An independent editorial publication analyzing how AI audio models are changing what we know about American crow vocalization. Built on primary corvid research, real field recordings, and open-source bioacoustic models.
Is CrowLingo translating crow language?
No. The models can map, cluster, and characterize vocalizations. They cannot translate. We are explicit about this distinction throughout the site.
Who built this?
CrowLingo is a Kymata Labs publication. Not affiliated with Earth Species Project, Cornell Lab of Ornithology, Project CETI, or any specific research group.
Can I use the recordings?
Field recordings are CC-licensed by their original contributors. The site's editorial content is CC BY-NC 4.0. Source code is MIT.