Open Weight vs Proprietary (vs Open Source)

“Open weight,” “open source,” and “proprietary” get blurred in marketing copy, but they describe three different release postures. The gap between them is not pedantic. It determines whether you can reproduce a model, what you are legally allowed to do with it, and who absorbs the risk once it ships.

Three postures, one axis: what gets released

The cleanest way to separate them is by what the developer hands over.

Proprietary (closed): weights, training data, and code all stay private. The model exists only as an API. GPT, Claude, Gemini. You rent capability and never hold the artifact.

Open weight: the trained parameters are downloadable. You can self-host, fine-tune, and run offline. Training data and training code are usually not released. Llama, Qwen, Mistral, DeepSeek.

Open source (the strict sense): weights plus enough to recreate the model under a license granting the four freedoms (use, study, modify, share for any purpose).

The load-bearing word is recreate. Weights are the output of training, not the recipe. Meta does not publish Llama’s training datasets, so the model is impossible to rebuild from scratch even though anyone can download and run it. Releasing weights is closer to shipping a compiled binary than shipping source. Orange’s typology frames it as a spectrum of disclosure, not a binary.

This is why the Open Source Initiative accuses Meta of “openwashing” Llama. See Openwashing.

The license is a separate axis from the weights

“Open weight” says nothing about your legal rights. Two models can both be downloadable and grant wildly different freedoms.

Llama Community License: free to use unless your products had more than 700M monthly active users when the model version was released, in which case you must request a separate license from Meta. Derivative model names must include “Llama.” Those clauses discriminate by field of endeavor and by group, which is exactly what the Open Source Definition forbids. Open weight, not open source.
Mistral and Qwen (Apache 2.0 for many of their releases) and DeepSeek (MIT for R1): permissive. Commercial use, fine-tuning, redistribution, zero royalties.

So the practical taxonomy is two questions, not one: can I download the weights, and what does the license let me do. A model can be open-weight-and-restrictively-licensed (Llama) or open-weight-and-permissively-licensed (DeepSeek).
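One way to keep the two questions from collapsing back into a single "open" flag is to store them as separate fields. The sketch below is illustrative only: the class, field names, and labels are made up for this note, and the license status of each model is simplified to a yes/no.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """Illustrative record: one field per question, never a single 'open' flag."""
    name: str
    weights_downloadable: bool  # question 1: can I download the weights?
    permissive_license: bool    # question 2: does the license grant broad rights?


def posture(release: Release) -> str:
    """Combine the two independent answers into a descriptive label."""
    if not release.weights_downloadable:
        return "proprietary"
    if release.permissive_license:
        return "open-weight, permissively licensed"
    return "open-weight, restrictively licensed"


# Examples taken from the text above; license status is simplified.
RELEASES = [
    Release("Llama", weights_downloadable=True, permissive_license=False),
    Release("DeepSeek", weights_downloadable=True, permissive_license=True),
    Release("Claude", weights_downloadable=False, permissive_license=False),
]

for r in RELEASES:
    print(f"{r.name}: {posture(r)}")
```

Keeping the fields separate means a change on one axis (say, a relicensing) never silently changes the answer on the other.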

The OSI’s Open Source AI Definition (OSAID 1.0, October 2024) tried to nail this down. It requires weights, complete training code, and “data information” detailed enough to recreate the system. “Data information” was a deliberate compromise: the OSI conceded that full training datasets often cannot be released for copyright or privacy reasons, so it demands a thorough description instead. Critics on both sides dislike it. Some say it is too weak to call open source without the actual data; Meta refused to endorse it and kept calling Llama open anyway.
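The component requirement reads naturally as a checklist. The sketch below is a rough illustration, not the OSI's own test: the component labels are shorthand invented for this note, and it checks only which artifacts are released, not whether the license terms qualify.

```python
# Components named in the paragraph above; the keys are illustrative labels,
# not terminology taken from the OSAID text itself.
REQUIRED_COMPONENTS = ("weights", "training_code", "data_information")


def missing_components(released: set[str]) -> list[str]:
    """Return the required components that a release does not include."""
    return [c for c in REQUIRED_COMPONENTS if c not in released]


# A weights-only release, the typical open-weight case described earlier.
weights_only = {"weights"}
print(missing_components(weights_only))
```

An empty list means every component is present. It still says nothing about the license, which stays a separate check.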

Irreversibility is the property that changes the risk math

The technical difference that matters most for safety is mundane: you cannot un-release weights. A proprietary provider can patch, rate-limit, or pull an API. Once a checkpoint is on BitTorrent it is permanent, and safety fine-tuning can be stripped by anyone with a few GPUs.

Kapoor, Narayanan, Bommasani and 22 co-authors mapped this in On the Societal Impact of Open Foundation Models (2024). They name five distinctive properties of open models: broader access, greater customizability, local inference, the inability to rescind weights, and the inability to monitor or moderate use.

Their key contribution is the marginal risk frame. The question is not “could this model help someone build a bioweapon” but “how much does it help beyond what a web search already provides.” Across cyberattacks, bioweapons, and disinformation, they found the existing research insufficient to show open weights create large marginal risk over pre-existing tools. See Marginal Risk Framework. The counterweight: minimal marginal risk today is not proof of safety tomorrow, since both model capability and societal defenses keep moving. The UK AI Safety Institute treats the threshold as a live, capability-dependent question rather than a settled one.

The benefits side of the ledger is concrete too: distributing decision-making power, reducing market concentration, enabling external transparency and auditing. Open weights are also one of the few viable routes to sovereign AI capacity outside the handful of countries that control frontier compute.

What open weights did to pricing

The economic shock landed in January 2025. DeepSeek released R1, an open-weight reasoning model, under MIT. It matched OpenAI’s o1 on key benchmarks at roughly 27x lower API cost: $0.55 per million input tokens against o1’s $15. Sam Altman acknowledged R1 ran 20-50x cheaper than the comparable OpenAI model.
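The 27x figure is a plain ratio of list prices. A quick check, taking o1's input price as $15 per million tokens, its list price at the time (treat it as an assumption if pricing has since changed):

```python
# List prices in USD per million input tokens.
R1_INPUT_PRICE = 0.55   # DeepSeek R1, from the text above
O1_INPUT_PRICE = 15.00  # OpenAI o1; assumed list price at R1's launch

ratio = O1_INPUT_PRICE / R1_INPUT_PRICE
print(f"o1 input tokens cost about {ratio:.1f}x as much as R1's")
```

The ratio covers input tokens only. Output-token prices, caching discounts, and differences in how many tokens each model emits would all shift a real bill.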

The mechanism: once a frontier-class model is open weight and permissively licensed, anyone can host it, so inference becomes a commodity priced near hardware cost. The model layer stops being premium software. R1’s efficiency leaned partly on a sparse Mixture of Experts design (671B parameters, ~37B active per token), which keeps serving cost low. A closed provider charging a premium for comparable quality now has to justify the markup on something other than the weights. This is the same margin question explored in The Cost Subsidization of LLM Use, approached from the supply side: open weights set a price floor that subsidized closed APIs eventually have to answer to.
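The sparse-activation point can be made concrete with the two parameter counts quoted above. This is back-of-the-envelope arithmetic, not a serving-cost model:

```python
TOTAL_PARAMS_B = 671   # total parameters, in billions (from the text)
ACTIVE_PARAMS_B = 37   # approximate parameters active per token, in billions

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Active per token: {active_fraction:.1%} of all parameters")
```

Per-token compute scales roughly with the active parameters, which is where the savings come from. All 671B parameters still have to sit in memory, so the hardware footprint does not shrink by the same factor.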

Try it

Run an open-weight model offline, then compare (1-2 hours, Ollama + any proprietary API).

Start with ollama run qwen3 or ollama run llama3. The first run downloads the weights, so wait for it to finish, then pull the network cable. The model keeps answering. That offline behavior is the irreversibility property made tangible: there is no server to revoke access, no rate limit, no kill switch.

Now send the same prompt to a proprietary API and note what you traded for the convenience: no artifact and no offline use, but also no GPU and no ops.

For a sharper read on the license axis, open the Llama license file shipped with the weights and find the 700M-MAU clause and the naming requirement, then compare them against Qwen’s bare Apache 2.0. What you’re looking for: the realization that “open” collapsed two independent decisions (release the weights, grant rights) into one ambiguous word.
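To script the offline half of the exercise, you can talk to Ollama's local HTTP endpoint instead of the interactive prompt. A minimal sketch, assuming Ollama is running on its default port (11434) and the model has already been pulled. The helper names are made up for this note, and the model tag is whichever one you ran above.

```python
import json
import urllib.request

# Default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local model and return its text reply."""
    request = build_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example (needs a running Ollama server with the model already pulled):
# print(ask_local("qwen3", "Explain open weights in one sentence."))
```

The only host contacted is localhost, so the call works with the network cable unplugged. An equivalent script against a proprietary API would fail at the first request.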

Sources

On the Societal Impact of Open Foundation Models — Kapoor, Narayanan, Bommasani et al. The marginal-risk framework and the five distinctive properties.
Meta’s LLaMa license is still not Open Source — OSI’s case for why downloadable weights under a restrictive license are not open source.
We finally have an official definition for open source AI — OSAID 1.0 and the “data information” compromise.
DeepSeek R1 price and performance analysis — the commoditization numbers.
Managing risks from increasingly capable open-weight AI systems — UK AISI on the moving capability threshold.