Caller sex — high confidence
Pitch contour is the most reliable sex tag in American crow vocalizations. Female caws sit, on average, slightly higher than male caws, with a steeper terminal frequency fall. The signal isn't perfect: the distributions overlap, and some individuals differ from others of their own sex by more than the mean difference between the sexes. At the population level, though, AI classifiers recover caller sex from a single caw with accuracy substantially above chance. This finding was established with hand-engineered features (Mates et al. 2014, and earlier) and has been replicated with every modern pipeline that has tested it. High confidence.
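As a rough illustration of the two hand-engineered features named above (average pitch and terminal frequency fall), here is a minimal sketch. It assumes a fundamental-frequency contour has already been extracted at a fixed time step; the function name, the input format, and the 25% "terminal" window are hypothetical choices for illustration, not values from the cited work.

```python
from statistics import mean

def contour_features(f0_hz, step_s, tail_fraction=0.25):
    """Return (mean F0 in Hz, terminal slope in Hz/s) for one caw.

    f0_hz: F0 estimates at a fixed time step (hypothetical input format).
    tail_fraction: share of the contour treated as the "terminal" part
    (an illustrative choice, not a published value).
    """
    n_tail = max(2, int(round(len(f0_hz) * tail_fraction)))
    tail = f0_hz[-n_tail:]
    times = [i * step_s for i in range(n_tail)]
    t_bar, f_bar = mean(times), mean(tail)
    # Ordinary least-squares slope of frequency against time.
    num = sum((t - t_bar) * (f - f_bar) for t, f in zip(times, tail))
    den = sum((t - t_bar) ** 2 for t in times)
    return mean(f0_hz), num / den

# Synthetic contour (not crow data): flat at 600 Hz, then falling
# 20 Hz per 10 ms frame over the last four frames.
contour = [600.0] * 12 + [580.0, 560.0, 540.0, 520.0]
mean_f0, slope = contour_features(contour, step_s=0.01)
```

A classifier would consume features like these across many caws; a single pair of numbers says little about any one bird, which is the overlap caveat in practice.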
Individual identity — high confidence
Harmonic emphasis — the relative loudness of second and third harmonics versus the fundamental — fingerprints individual crows. Different birds emphasize different harmonics consistently across recordings, sometimes across years. Modern pipelines recover individual identity from a single caw with accuracy that approaches the limit set by recording quality, not by the audio content itself. The signal is robust to method changes; it has held across hand-engineered features and learned embeddings. High confidence at the layer-one (acoustic-recoverability) level. Whether crows themselves use the signal in their day-to-day interactions is layer-three, addressed separately below.
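Harmonic emphasis as defined above can be sketched as the level of the second and third harmonics relative to the fundamental, in decibels. The example below assumes the fundamental frequency is already known and measures each component with a single-bin DFT projection; the function names and the synthetic tone are hypothetical, and real pipelines would work on windowed spectra of field recordings.

```python
import math

def amplitude_at(signal, sample_rate, freq_hz):
    """Amplitude of one frequency component via a single-bin DFT projection."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

def harmonic_emphasis_db(signal, sample_rate, f0_hz):
    """Level of harmonics 2 and 3 relative to the fundamental, in dB."""
    a1 = amplitude_at(signal, sample_rate, f0_hz)
    return tuple(
        20.0 * math.log10(amplitude_at(signal, sample_rate, k * f0_hz) / a1)
        for k in (2, 3)
    )

# Synthetic tone (not crow data): fundamental 1.0, second harmonic 0.5,
# third harmonic 0.25, spanning a whole number of periods.
sr, f0 = 8000, 500.0
tone = [
    1.0 * math.sin(2 * math.pi * f0 * i / sr)
    + 0.5 * math.sin(2 * math.pi * 2 * f0 * i / sr)
    + 0.25 * math.sin(2 * math.pi * 3 * f0 * i / sr)
    for i in range(800)
]
h2_db, h3_db = harmonic_emphasis_db(tone, sr, f0)
```

For this tone the second harmonic comes out about 6 dB below the fundamental and the third about 12 dB below. A per-bird fingerprint would be the pattern of such values that stays stable across that bird's recordings.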
Behavioral context — medium-high confidence at cluster level
Calls produced in different behavioral contexts (territorial caws given alone, mobbing sequences against an aerial predator, juvenile begging from a fledgling, companion calls from a paired adult) cluster geometrically in embedding space. The cluster boundaries are reproducible across methods, and the cluster identities match the descriptive vocabulary the pre-AI literature established (Marzluff & Angell; Verbeek et al. 2024, Birds of the World monograph). Medium-high confidence: cluster-level associations are robust; per-clip behavioral interpretation is statistical, not deterministic. CrowLingo's atlas surfaces cluster-wide probabilities precisely because per-clip ground truth isn't what the methods deliver.
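One way to picture "cluster-wide probabilities" is as the share of annotated clips in each cluster that fall under each behavioral context. The sketch below is an assumption about how such a summary could be computed, not a description of how CrowLingo's atlas is built; the function name, input format, and annotations are made up for illustration.

```python
from collections import Counter, defaultdict

def cluster_context_probabilities(assignments):
    """Map each cluster id to the share of its annotated clips per context.

    assignments: iterable of (cluster_id, context_label) pairs, one per
    annotated clip (hypothetical input format).
    """
    counts = defaultdict(Counter)
    for cluster_id, context in assignments:
        counts[cluster_id][context] += 1
    return {
        cluster_id: {ctx: n / sum(ctr.values()) for ctx, n in ctr.items()}
        for cluster_id, ctr in counts.items()
    }

# Made-up annotations for illustration only.
clips = [
    (0, "territorial"), (0, "territorial"), (0, "territorial"), (0, "mobbing"),
    (1, "begging"), (1, "begging"), (1, "companion"),
]
probs = cluster_context_probabilities(clips)
```

Here a new clip landing in cluster 0 would be reported as 75% territorial and 25% mobbing: a statement about the cluster, not a verdict on that clip.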
Family-group dialect — medium confidence
Family groups of American crows carry measurable acoustic centroids that differ between geographically separated populations by amounts exceeding within-group variation. The finding is reproducible, consistent across years, and not fully accounted for by population genetic structure or local acoustic environment. Cultural transmission is the leading hypothesis. Medium confidence at the descriptive layer (the differences exist), lower confidence at the explanatory layer (they probably reflect culture), and not yet science at the functional layer (whether crows themselves use the differences). Anyone claiming a definitive answer to 'do crows have dialects' is collapsing the layers.
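The descriptive claim, that centroids differ by more than within-group variation, can be sketched as a simple ratio. The example assumes each call is already a numeric feature vector; the spread measure (mean distance to the group centroid), the function names, and the 2-D points are hypothetical choices, and published analyses may use different statistics.

```python
import math
from statistics import mean

def centroid(points):
    """Component-wise mean of equal-length feature vectors."""
    return [mean(dim) for dim in zip(*points)]

def within_spread(points):
    """Mean Euclidean distance from each point to its group centroid."""
    c = centroid(points)
    return mean(math.dist(p, c) for p in points)

def separation_ratio(group_a, group_b):
    """Between-centroid distance divided by the average within-group spread."""
    between = math.dist(centroid(group_a), centroid(group_b))
    within = mean([within_spread(group_a), within_spread(group_b)])
    return between / within

# Made-up 2-D feature vectors for two family groups.
group_a = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
group_b = [(10.0, 0.0), (12.0, 0.0), (10.0, 2.0), (12.0, 2.0)]
ratio = separation_ratio(group_a, group_b)
```

A ratio above 1 means the groups sit further apart than their members scatter. That supports only the descriptive layer; it says nothing about why the difference exists or whether crows use it.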
Compositional structure — low confidence, suggestive only
Statistical models hint at structured composition in crow vocal sequences. Caw-rattle combinations are non-random; certain sequences appear in certain contexts more than chance would predict. This is the substrate from which compositionality could exist. Whether crows treat sequence order as carrying meaning beyond the sum of the parts — the actual compositional claim — is undertested. Low confidence: the descriptive evidence is real but the behavioral evidence (playback experiments showing receivers respond differently to sequence-permuted versions of the same calls) hasn't been collected at the scale required for the claim. This is where the next decade of meaningful work will live.
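The "non-random" part of the claim can be illustrated with a shuffle test: count how often one call type directly follows another, then compare that count with what turns up when the order within each bout is randomized. This is a minimal sketch with made-up bouts; the function names, the shuffle count, and the choice of test are assumptions, not the method of any particular study.

```python
import random
from collections import Counter

def bigram_count(sequences, first, second):
    """Count adjacent (first, second) pairs across all sequences."""
    return sum(
        1
        for seq in sequences
        for a, b in zip(seq, seq[1:])
        if (a, b) == (first, second)
    )

def shuffle_p_value(sequences, first, second, n_shuffles=2000, seed=0):
    """Share of within-sequence shuffles with at least the observed count."""
    rng = random.Random(seed)
    observed = bigram_count(sequences, first, second)
    hits = 0
    for _ in range(n_shuffles):
        shuffled = [rng.sample(seq, len(seq)) for seq in sequences]
        if bigram_count(shuffled, first, second) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an exact zero.
    return observed, (hits + 1) / (n_shuffles + 1)

# Made-up bouts in which every "caw" is directly followed by a "rattle".
bouts = [["caw", "rattle", "caw", "rattle", "caw", "rattle"]] * 8
observed, p = shuffle_p_value(bouts, "caw", "rattle")
```

A small p-value shows only that the ordering is unlikely under random shuffling. It does not show that receivers treat the order as meaningful, which is the part that needs playback experiments.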
Lexical meaning — no evidence
Despite the popular framing, there is no evidence that any specific crow vocalization carries a specific lexical meaning the receiver decodes. The closest published analog is the honeybee waggle dance, which decodes to spatial coordinates rather than language. For crow alarm calls specifically, the referential-alarm-call literature (foundational work by Cheney and Seyfarth on vervet monkeys) has not been extended to American crows at the rigor required to make a referential-alarm claim. Some popular sources extrapolate from vervet to crow; the extrapolation is not justified by the corvid data. No confidence in lexical claims for American crows.
Translation — no confidence, structural impossibility today
Translation in the strong sense (a target-language representation that preserves the meaning of a source signal) requires sender-side encoding, receiver-side decoding, and a target representation. None of the three has been adequately established for any crow vocalization. The structural-impossibility framing isn't a slogan; it's the conclusion of the contemporary methodological audit. AI methods can map repertoires, identify individuals, and cluster calls by behavioral context. They cannot translate, today or in any foreseeable extension of contemporary methods, because the receiver-side and target-representation problems are not algorithmic problems. They are biological problems that algorithms can support but not replace.
How to read popular coverage
When a popular article says AI is decoding crow language, ask which layer the underlying claim sits at. Caller sex, individual identity, behavioral cluster — high confidence, well-established. Family-group dialect — medium descriptive confidence, low functional confidence. Compositional structure — suggestive only. Lexical meaning, translation — no current evidence. The article's credibility tracks how cleanly it distinguishes these layers. The articles that conflate them are the ones that produce the credibility-damaging walkbacks the field has had to absorb in adjacent areas (animal-language popular coverage from 2018-2023 is full of these). Be skeptical, but skeptical with calibration.