PILLAR III — WHAT WE CAN DECODE
We are not translating. We are mapping.
What can today's audio AI actually tell you about a crow call? Honest capability cards: what's demonstrated, what's partial, what's still aspirational. No overclaiming.
CAPABILITY CARDS
What we can do, honestly.
Demonstrated
Automatic detection
Detect and segment crow vocalizations in hours of field audio, with accuracy comparable to BirdNET.
Demonstrated
Unsupervised clustering
Discover call categories from the geometry of call embeddings, no labels needed.
Partial
Caller identification
Infer which individual is calling from its voice. Works on known individuals.
Partial
Behavioral context
Map calls to behavioral states across nine clusters.
Emerging
Zero-shot captioning
NatureLM-audio generates text descriptions of calls. Surprisingly accurate.
Not yet
Compositional decoding
Understanding calls as combined meaningful units. The frontier.
What's still ahead.
The honest map of the frontier.