What 'open source' actually means in bioacoustic AI

The four-layer definition

An AI model is made up of four layers, and each layer can be open or closed independently. The model weights: the trained parameters that constitute the model itself. The model architecture: the structural design (transformer, CNN, etc.) that defines how the weights operate. The training code: the software pipeline that produced the weights from the training data. The training data: the specific recordings, labels, and curation choices that informed the model. A fully open model has all four layers publicly accessible under permissive licensing. A model that is open on weights and architecture but closed on training data is only partially open. The distinction matters because different downstream uses depend on different combinations of layers.
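The layer-by-layer view can be written down as a small data structure. The following is a minimal sketch, not a standard schema: the names `Openness`, `ModelOpenness`, and `summary` are hypothetical and exist only for illustration, and the three-level scale (open, partial, closed) is an assumption made to keep the example small.

```python
from dataclasses import dataclass
from enum import Enum


class Openness(Enum):
    """Assumed three-level scale for a single layer."""
    OPEN = "open"
    PARTIAL = "partial"
    CLOSED = "closed"


@dataclass(frozen=True)
class ModelOpenness:
    """Openness of each of the four layers (hypothetical structure)."""
    weights: Openness
    architecture: Openness
    training_code: Openness
    training_data: Openness

    def layers(self) -> dict:
        # Map layer name to its openness level.
        return {
            "weights": self.weights,
            "architecture": self.architecture,
            "training_code": self.training_code,
            "training_data": self.training_data,
        }

    def summary(self) -> str:
        # Fully open only when every layer is open; otherwise partial,
        # unless nothing at all is open.
        values = list(self.layers().values())
        if all(v is Openness.OPEN for v in values):
            return "fully open"
        if all(v is Openness.CLOSED for v in values):
            return "closed"
        return "partially open"


# Illustrative values only, not a statement about any real model.
example = ModelOpenness(
    weights=Openness.OPEN,
    architecture=Openness.OPEN,
    training_code=Openness.OPEN,
    training_data=Openness.CLOSED,
)
print(example.summary())
```

Keeping the four layers as separate fields, rather than a single "open source" flag, is the point of the exercise: the one-word label hides exactly the differences that matter downstream.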

What BirdNET is

BirdNET^[1]'s model weights are publicly available under permissive licensing (Apache 2.0). The architecture is documented in the published Kahl et al. 2021 paper. Training code is largely available through the Cornell Lab's BirdNET-Lite and BirdNET-Analyzer repositories on GitHub. Training data is the more complicated layer: it includes Macaulay Library recordings that carry Cornell-administered licensing, plus other audio sources with varying access terms. BirdNET is genuinely 'open' on weights, architecture, and most of the code. It is only partially open on training data, because the underlying recordings are not all redistributable under unrestricted Creative Commons terms. This is workable for most uses but creates licensing complexity for downstream projects that want to fully reproduce or extend the work.

What Perch 2.0 is

Perch 2.0, released in early 2025 by Google Research, is open source on weights and architecture. The training code is partially documented; some components are open and others aren't fully released. The training data is the larger Google-curated Bird Vocalization Dataset, which includes substantial Macaulay-derived material with the associated licensing constraints. Describing the release as 'open source' is accurate in the layer-by-layer sense (weights and architecture are open), but readers should know that reproducing the model from scratch isn't straightforward because the training data infrastructure isn't fully open. This is the same pattern as BirdNET^[1], with slightly different specifics.

What NatureLM-audio is

NatureLM-audio^[2], the 2025 Earth Species Project foundation model, is described as the most open of the major contemporary bioacoustic models. Weights, architecture, and training code are publicly available. Earth Species Project has been more aggressive about open-data principles than the Cornell or Google-affiliated work, partly because its mission framing centers on open science. Even so, the training data is harder to open fully, because some of the contributing data carries upstream licensing constraints that Earth Species Project doesn't fully control. The general framing of NatureLM-audio as 'open' is well-supported, with caveats about the training-data layer that apply to all foundation models trained partly on Macaulay-archive material.

The downstream practical implications

What you can do with a model depends on which layers are open. Run the model on your own audio: requires open weights, which all three models have. Fine-tune the model for your specific use case: requires open weights plus reasonable architecture documentation, which all three have. Reproduce the model from scratch: requires all four layers open, which none of the three models fully has. Build a commercial product on top: requires checking the specific weight license; Apache 2.0 (BirdNET^[1]) is generally permissive, others vary. Audit the model's training data for biases or coverage gaps: requires open training data, which is the most-constrained layer across the models. Use the model in a fully open-source downstream project that needs to redistribute training data: difficult or impossible for most current models.
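The mapping from uses to required layers can be expressed as a simple lookup. This is a rough sketch under stated assumptions: the `REQUIRED_LAYERS` table and the `supported_uses` helper are hypothetical names, and the table encodes only the layer requirements described above. Commercial use is left out on purpose, since it depends on the specific weight license rather than on which layers are open.

```python
# Layers each use depends on, as described in the text (hypothetical table).
REQUIRED_LAYERS = {
    "run_inference": {"weights"},
    "fine_tune": {"weights", "architecture"},
    "reproduce_from_scratch": {
        "weights", "architecture", "training_code", "training_data",
    },
    "audit_training_data": {"training_data"},
}


def supported_uses(open_layers):
    """Return the uses whose required layers are all in open_layers."""
    open_set = set(open_layers)
    return sorted(
        use for use, needed in REQUIRED_LAYERS.items() if needed <= open_set
    )


# A model open on weights and architecture only (illustrative input).
print(supported_uses({"weights", "architecture"}))
# prints ['fine_tune', 'run_inference']
```

Passing all four layer names returns every use in the table, which is the situation none of the three models fully reaches today.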

Why this matters for CrowLingo

CrowLingo's atlas operates downstream of these models in interpretive use rather than primary use — the atlas references findings from research using these models rather than running them in-product. This means the licensing constraints don't directly affect CrowLingo's operation. Where the constraints would matter is in any future expansion of the atlas that involved primary AI bioacoustic work — fine-tuning a model on a CrowLingo-specific corpus, training a custom classifier on Wikimedia Commons crow recordings, building a real-time identification feature. Any of those would require navigating the layer-by-layer openness of available models. For now, the atlas can describe the available models accurately, including their actual open-source status, without depending on them in ways that license constraints would block.

Quick answers from this piece.

Are BirdNET, Perch, and NatureLM-audio really open source?

Partially, with variation. All three are open on model weights and architecture. Training code availability varies: it is largely available for BirdNET, partially released for Perch 2.0, and publicly available for NatureLM-audio. Training data is the most-constrained layer across all three because it includes Macaulay Library and similar archives with licensing constraints. The 'open source' framing is accurate for the weights-and-architecture layers and only partially accurate for the code and data layers.

Can I use these models commercially?

Generally yes, depending on the specific license of the model weights. BirdNET uses Apache 2.0 which is commercial-friendly. Perch 2.0 and NatureLM-audio have their own licenses that should be checked individually for specific use cases. The model weights' license is the critical layer for commercial use; downstream model use generally doesn't require interaction with the more-constrained training-data layer.

Why does training-data openness matter?

Several reasons. Reproducibility: fully reproducing a model from scratch requires the training data. Auditability: checking the model for biases or coverage gaps requires inspecting what it learned from. Downstream open-source projects: building fully-open downstream work depends on training data being available for redistribution. Most current bioacoustic foundation models have constraints at the training-data layer that complicate these uses.
