What corvid researchers wish you understood about AI bioacoustics

Map versus dictionary

The single most-repeated wish is that the public would internalize the distinction between mapping a vocal repertoire and translating a language. Mapping the geometry of bird vocalizations is what the AI methods do well, and the field has gotten dramatically better at it since 2022. Translating those vocalizations into human language is what the field cannot do, will not do soon, and probably will not do with current methods at all. Both can be true. Both ARE true. The popular framing conflates them roughly every six months.
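To make "mapping" concrete, here is a minimal sketch of clustering calls by acoustic features. The two features, their values, and the choice of two clusters are made up for illustration and do not come from any study cited in this piece. The point is what the output looks like: cluster labels and centroids, with no meanings attached.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's k-means on a small list of feature vectors."""
    centroids = list(centroids)  # copy so the caller's list is not mutated
    labels = []
    for _ in range(iterations):
        # Assign each call to its nearest centroid (Euclidean distance).
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Move each centroid to the mean of the calls assigned to it.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical features per call: (duration in seconds, peak frequency in kHz).
calls = [(0.20, 1.1), (0.22, 1.0), (0.19, 1.2), (0.60, 1.8), (0.58, 1.9), (0.62, 1.7)]

# Seed one centroid in each apparent group so the toy run is deterministic.
labels, centroids = kmeans(calls, centroids=[calls[0], calls[3]])
print(labels)  # prints [0, 0, 0, 1, 1, 1]
```

The labels say which calls sit near each other in feature space. Nothing in this computation says what cluster 0 or cluster 1 means to a bird; that is the dictionary half, and it needs evidence the map does not contain.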

Receiver-side problem isn't a detail

The second-most-repeated wish: that popular coverage would stop treating the receiver-side decoding problem as a technicality that future research will resolve. The receiver-side problem — whether animals, hearing a call, demonstrably use the call's information to do something — is the actual rate limiter on translation claims. It's not a detail. It requires behavioral experiments that are ethically expensive and methodologically demanding. It cannot be solved by training larger models. Most AI bioacoustics speculation in popular outlets implicitly assumes the receiver-side problem is solvable by scale; the working scientists generally don't share that assumption.
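For a sense of what receiver-side evidence looks like once a behavioral experiment has been run, the sketch below applies a simple permutation test to response measurements from a playback trial. Everything here is an assumption for illustration: the numbers are invented, the response measure (seconds of vigilance after playback) is hypothetical, and real studies use designs and statistics considerably more careful than this.

```python
import random

def permutation_p_value(treatment, control, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = sum(treatment) / len(treatment) - sum(control) / len(control)
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle group membership to simulate "the call type makes no difference".
        rng.shuffle(pooled)
        diff = sum(pooled[:n_t]) / n_t - sum(pooled[n_t:]) / (len(pooled) - n_t)
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)

# Invented data: seconds of vigilance after playback of a candidate call type
# versus a control sound, one value per trial.
after_candidate_call = [12.0, 15.5, 11.0, 14.2, 13.8, 16.1]
after_control_sound = [4.1, 6.0, 5.2, 3.8, 7.1, 5.5]

p = permutation_p_value(after_candidate_call, after_control_sound)
print(f"p = {p:.4f}")
```

The input to this analysis is a column of measurements that only exists if someone ran the trials in the field. A larger model can change how calls are grouped before the experiment; it cannot produce the response data.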

Field bioacoustics is slow on purpose

Patient field observation — Heinrich's tradition, the McGowan program, the Marzluff Seattle work, the Demartsev^[3] wearable-logger studies — produces findings that the AI methods need to anchor against. The slowness isn't lazy; it's calibrated to the resolution at which the questions can be answered honestly. Every time popular coverage characterizes traditional bioacoustics as 'outdated' or 'replaced by AI,' a working scientist somewhere updates their priors about how seriously to take that publication going forward. AI has accelerated some questions; it has not replaced the slow questions that the slow methods are best at.

BirdNET on a phone is a bigger deal than NatureLM-audio in a lab

Most working bioacousticians would, if pressed, name the Merlin phone app and BirdNET^[1] as the most consequential AI bioacoustics deployments of the past five years, ahead of whichever research model currently tops the benchmarks. The reason: deployment scale changes the public's relationship with bird audio, which changes citizen-science data flows, which changes the data infrastructure the research field can rely on for the next decade. A research model that wins on a benchmark and reaches twelve people doesn't move the needle as much as a deployed model that reaches twelve million. The hierarchy of importance from inside the field is not the hierarchy popular coverage often implies.

Open data is the bottleneck, not algorithms

Algorithm-side progress has been rapid. Data-side progress has been slow because the underlying audio corpora are licensed restrictively (Macaulay Library), encoded inconsistently (decades of audio in different formats), or simply not collected (most species' close-range and quiet vocalizations). Working bioacousticians who pay attention to where progress will come from generally bet on data more than algorithms. The Demartsev^[3] wearable-logger work is interesting not for new methods but for the data it generated. The same observation applies to American crow work: when comparable wearable-logger studies happen for American crows, the field will jump forward in ways no algorithm release would deliver.

Ethics is a real constraint, not a footnote

The ethics floor in bioacoustic research (no playback within ten meters of nests, IACUC review for any vertebrate-wildlife playback, and similar rules) is not regulatory paperwork. It's how the field maintains the relationship with wild populations that makes the science possible. Popular coverage that implicitly assumes 'we'll just run more playback experiments' as the path to translation underestimates how constraining the ethics floor is and how seriously working scientists take it. CrowLingo doesn't deploy playback features for the same reason: the ethical floor for a public-facing site is stricter than the research-lab floor because the user base is uncontrolled.

What working scientists are actually excited about

Wearable bioacoustic loggers (the methodology that produced the Demartsev^[3] paper) generalizing to more species. Open-tooling infrastructure (Voxaboxen, open model weights, the BEANS benchmark) accumulating into a shared substrate. Cross-disciplinary collaboration between behavioral ecologists and ML researchers becoming the default rather than the exception. Slow, careful work on receiver-side validation finally getting the funding it deserves. Notably absent from the working-scientist excitement list: imminent translation, AI-mediated human-animal conversation, dictionary-style decoders. These are popular-coverage staples that don't show up in the research community's own enthusiasm.

Quick answers from this piece.

What's the biggest misconception about AI animal communication research?

That mapping a vocal repertoire is the same as translating it. AI methods can map vocalizations geometrically and characterize cluster-level behavioral associations; they cannot translate without receiver-side behavioral evidence that the contemporary methods alone cannot produce.

Are working scientists optimistic about translating animal language?

Not in the strong sense the popular framing implies. Most are optimistic about better repertoire mapping, more rigorous behavioral validation, and meaningful insights into communication structure. They are not generally optimistic about translation in the next decade with current methods.

What deployments matter most for AI bioacoustics?

Counter-intuitively, Merlin and BirdNET on phones (tens of millions of users) and AudioMoth-based passive acoustic monitoring (tens of thousands of devices), both operating at consumer deployment scale, have moved the field more than any single research-grade foundation model release. Reaching the public reshapes the data infrastructure the research depends on.
