CrowLingo

FIG 0.1 — American crow · 1024-dimensional audio space

We can finally see what we couldn't hear.

A focused exploration of one species and one revolution: the American crow, and the new generation of AI audio models that turn its voice into a map. Seven hundred ninety-five recordings. Nine emergent clusters. One geometry.

795 Recordings · 9 Clusters · 83 Essays · 1024 Dimensions
VOCAL ATLAS · UMAP 2D · HDBSCAN · 9 CLUSTERS
Scroll the full atlas to explore all nine clusters.

FIG 0.2 — Three coordinates

Listen. Then find each call on the map.

All nine clusters →
Cluster 01 · Territorial

Perched flock — territorial caws

00:25 · Suburban pine stand, MN
Cluster 04 · Rattle

Rattle complex — affiliative

00:33 · Urban park, MN
Cluster 05 · Begging

Juvenile begging + adult exchange

00:40 · Powderhorn Park, MN

FIG 0.3 — The shift

Discrete categories gave way to a continuous map.

A fifty-year methodology change, compressed into one frame.

The change is not that we found new sounds. The change is that we stopped treating each call as a label, and started treating the whole repertoire as a geometry. Graded variation, dialect, individual signature: all of it visible at once, on the same map, in milliseconds per call.
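
To make "label as region, not boundary" concrete, here is a minimal sketch of soft assignment: a call gets a graded weight for every region instead of a single hard label. It is an illustration only, not the site's pipeline. The region names are borrowed from the clusters featured on this page, but the 2D coordinates, the `soft_membership` helper, and the `temperature` parameter are all hypothetical.

```python
import math

def soft_membership(embedding, centroids, temperature=1.0):
    """Return region name -> weight in [0, 1]; weights sum to 1.

    Uses a softmax over negative Euclidean distances, so nearer
    regions receive larger weights. Hypothetical helper for illustration.
    """
    dists = {name: math.dist(embedding, c) for name, c in centroids.items()}
    nearest = min(dists.values())  # subtract for numerical stability
    exps = {name: math.exp(-(d - nearest) / temperature)
            for name, d in dists.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}

# Hypothetical 2D region centers (real embeddings here are 1024-dimensional).
centroids = {
    "territorial": (0.0, 0.0),
    "rattle": (4.0, 0.0),
    "begging": (0.0, 4.0),
}

# A call that sits midway between the territorial and rattle regions.
call = (2.0, 0.0)
weights = soft_membership(call, centroids)
for name, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {w:.3f}")
```

A call halfway between two regions receives equal weight for both, which is the kind of graded variation a hard label would hide.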

The old labels survive as labels of regions in the map, not as boundaries on the world.

— CrowLingo Editorial

The 2026 carrion-crow bioRxiv preprint by Demartsev et al. used wearable loggers and this mapping discipline to recover both discrete and graded structure in grunts and caws. Territory by territory, individual by individual. The method is the message: stop sorting, start mapping.

FIG 0.4 — Honest framing

We mapped the language. We have not learned it.

Demonstrated today

What the models can do

  • Automatic detection and segmentation of crow vocalizations from field audio
  • Unsupervised category discovery — clusters emerge from geometry, not labels
  • Caller-identity inference at individual-bird resolution
  • Behavioral-context mapping across nine distinct call types
  • Zero-shot captioning via NatureLM-audio foundation models
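
As a rough illustration of clusters emerging "from geometry, not labels", the sketch below runs a small density-based clustering over unlabeled points. It uses plain DBSCAN as a simpler stand-in for the HDBSCAN step named on the atlas, and synthetic 2D points as a stand-in for a UMAP projection. The data, the `eps` and `min_pts` values, and the function name are assumptions made for this example, not the site's actual code or settings.

```python
import math
import random

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point joins the cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # core point: keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels

# Synthetic, unlabeled data: three tight blobs (hypothetical stand-in
# for projected call embeddings). No labels are given to the algorithm.
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3))
          for cx, cy in centers for _ in range(40)]

labels = dbscan(points, eps=1.5, min_pts=4)
n_clusters = len(set(labels) - {-1})
print("clusters found:", n_clusters)
```

The algorithm is never told how many groups exist; the count falls out of point density alone. HDBSCAN extends this idea by working across density levels rather than relying on a single fixed `eps`.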

Not yet here

What's still ahead

  • Compositional decoding — understanding calls as combined units, not single tokens
  • Real-time bidirectional dialogue between human and crow
  • A verified "crow dictionary" with human-readable glosses
  • Cross-species transfer showing which structures generalize
  • Field-deployable playback systems with ethical guardrails

Frequently asked

What people ask about this.

What is CrowLingo?

An independent editorial publication analyzing how AI audio models are changing what we know about American crow vocalization. Built on primary corvid research, real field recordings, and open-source bioacoustic models.

Is CrowLingo translating crow language?

No. The models can map, cluster, and characterize vocalizations. They cannot translate. We are explicit about this distinction throughout the site.

Who built this?

CrowLingo is a Kymata Labs publication. It is not affiliated with Earth Species Project, Cornell Lab of Ornithology, Project CETI, or any other research group.

Can I use the recordings?

Field recordings are CC-licensed by their original contributors. The site's editorial content is CC BY-NC 4.0. Source code is MIT.

What is CrowLingo?
CrowLingo is a public-facing exploration of what AI audio foundation models reveal about American crow vocalizations. It pairs an interactive vocal atlas, real CC-licensed audio recordings, real spectrograms, and AI-narrated cluster explanations grounded in primary corvid literature.
Is CrowLingo translating crow language?
No. The site catalogs and characterizes crow vocalizations using audio embeddings and behavioral context — it does not claim to translate. The ethics floor explicitly rules out translation claims; interpretation is acknowledged as interpretation.
Who built CrowLingo?
CrowLingo is a Kymata Labs publication. Audio sources are CC-licensed from Wikimedia Commons (primarily Jonathon Jongsma, CC BY-SA 3.0). AI narration is grounded in the published literature listed in the library.