HAC (Human + AI Collaboration)
Framework 04

Policy vs enforcement — how to tell the difference

Most AI ethics commitments are policy documents. Policy without enforcement is preference, not constraint.

Every major AI company has an ethics page. Most have a responsible use policy, a safety framework, and a set of stated red lines. Almost none of them have meaningful consequences when those commitments are ignored — by their own staff, by their customers, or by state actors with procurement leverage.

This framework is a diagnostic tool. It gives you a set of questions to ask about any AI ethics commitment, so you can tell whether you are looking at a real constraint or a documented preference. The difference matters — not philosophically, but practically. One produces different behaviour. The other produces better press releases.

Policy

A stated commitment to behave in a certain way

Policy describes intent. It tells you what a company says it will or will not do. It is typically written by a communications or legal team, published on a website, and updated when reputational pressure demands it.

"We will not develop AI for autonomous weapons."

"User data is never sold to third parties."

"Our models are subject to independent safety evaluation."

Enforcement

A mechanism that makes the commitment true regardless of incentives

Enforcement is what happens when the policy is violated — or what makes violation structurally difficult. It is technical, contractual, or institutional. It produces consequences. It is verifiable by someone outside the company.

Contractual prohibition with defined terms and penalties.

Technical control that prevents the action at the system level.

Independent audit with published findings and right of access.

The key word is regardless. A real enforcement mechanism works when the company is under pressure, when the contract is large, when the government is asking, and when compliance is inconvenient. Policy does not. Policy bends.


When you encounter an AI ethics commitment — in a terms of service, a safety framework, a government contract, or a press release — ask these five questions. The answers tell you whether you are looking at policy or enforcement.

Q1

What happens if this commitment is violated?

If the answer is "nothing automatic" — no contractual penalty, no technical lockout, no regulatory consequence — you are looking at policy. Enforcement requires a consequence that does not depend on the company choosing to apply it. If the company decides whether to enforce its own rules, the rules are preferences.

Q2

Can someone outside the company verify compliance?

Self-reported compliance is not enforcement. Real enforcement requires an external party with access rights — an independent auditor, a regulator, a technical reviewer — who can check that the stated constraint is actually operating. If the only people who can verify are the people with an interest in a positive result, verification is not happening.

Q3

Is the commitment specific enough to be falsifiable?

"We are committed to responsible AI" cannot be violated, because it can always be reframed. A real commitment is specific enough that a violation can be identified — not reinterpreted. "No autonomous weapons" sounds specific until you ask: does that include targeting recommendation systems? Decision support for lethal operations? ISR analysis? If the definition is controlled by the company, the commitment is not enforceable.

Q4

Does the commitment hold under commercial or political pressure?

This is the stress test. Policy that holds only when it is convenient is not a constraint — it is a default setting. The question to ask is: has this commitment ever cost the company something? Has it been maintained when a large customer demanded otherwise? If the commitment has never been tested, you do not know whether it is real. If it has been tested and survived, that is evidence. If it bent, that tells you everything.

Q5

Is enforcement technical, contractual, or only reputational?

There is a hierarchy of enforcement strength. Technical controls — the system cannot perform the prohibited action — are the strongest. Contractual obligations with defined penalties are second. Reputational consequences — the company will face criticism — are the weakest, because they depend on public awareness, media attention, and whether anyone is paying attention that week. Most AI ethics commitments rely on reputational enforcement only.
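The five questions can be run as a simple pass/fail checklist. The sketch below is illustrative only: the field names, the scoring thresholds, and the mapping onto the Policy only / Partial / Enforced ratings used in the next section are assumptions for demonstration, not a published methodology.

```python
from dataclasses import dataclass

@dataclass
class Assessment:
    # One boolean per diagnostic question; all field names are invented.
    automatic_consequence: bool   # Q1: violation triggers a consequence the company doesn't control
    externally_verifiable: bool   # Q2: an outside party with access rights can check compliance
    falsifiable: bool             # Q3: specific enough that a violation can be identified
    survived_pressure: bool       # Q4: has held when compliance cost the company something
    beyond_reputational: bool     # Q5: enforcement is technical or contractual, not reputation-only

def rate(a: Assessment) -> str:
    """Map the five answers onto the Policy only / Partial / Enforced spectrum."""
    score = sum(vars(a).values())
    if score == 5:
        return "Enforced"
    if score >= 3:
        return "Partial"
    return "Policy only"

# A typical ethics-page pledge fails all five questions.
print(rate(Assessment(False, False, False, False, False)))  # -> Policy only
```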


Most AI governance commitments sit somewhere on a spectrum between pure policy and genuine enforcement. Knowing where a commitment falls tells you how much weight to give it.

Each commitment below is rated on the spectrum: Policy only · Partial · Enforced.

Commitment: "We will not enable autonomous weapons"
What exists: Policy page statement. No contractual definition of "autonomous." No technical restriction. No audit mechanism.
Rating: Policy only

Commitment: "User data is never sold"
What exists: Contractual clause in ToS. Difficult to verify externally. No independent audit published. Enforcement is legal, not technical.
Rating: Partial

Commitment: "This model cannot generate CSAM"
What exists: Technical classifier operating at inference time. Independently red-teamed. Hash-matching against known illegal content databases. Failure produces automatic refusal.
Rating: Enforced

Commitment: "No surveillance use without consent"
What exists: Acceptable use policy. No technical gating. No customer audit. Enforcement depends on user self-reporting or media exposure.
Rating: Policy only

Commitment: EU AI Act high-risk classification
What exists: Legally binding. Mandatory conformity assessment. Regulator access rights. Financial penalties for non-compliance. Independent notified body oversight.
Rating: Enforced

If enforcement is more credible than policy, why do AI companies overwhelmingly choose policy? The answer is not cynicism — it is incentives.

Policy is cheap to produce and expensive to enforce. A policy page costs a communications team a week and a legal review. Real enforcement — technical controls, independent audits, contractual penalties — costs money, limits product flexibility, and creates legal exposure when something goes wrong. In a competitive market where speed matters, enforcement is a cost centre. Policy is not.

Policy is reputationally sufficient in most conditions. The majority of the time, customers, investors, and regulators do not probe whether stated commitments are technically or contractually enforced. They read the policy page and move on. This means policy produces most of the reputational benefit of enforcement at a fraction of the cost — until something breaks publicly.

Enforcement limits optionality. A technical control that prevents the model from doing X means the model cannot do X for any customer, including large ones willing to pay for it. A policy that prohibits X can be reinterpreted under pressure, applied inconsistently, or quietly modified when the contract is large enough. Policy preserves flexibility. Enforcement does not.

The procurement problem. The Anthropic / Pentagon case demonstrates this mechanism at scale. Anthropic's safety commitments were genuine enough to cost it a $200M contract. OpenAI's agreement — reached faster, with stated guardrails — raised questions about whether those guardrails were contractually defined, technically enforced, or merely described in a press release. Without independent verification, the public cannot tell the difference. The market, however, already decided: OpenAI got the contract.

When the procurement market rewards faster compliance and penalises principled refusal, the incentive for real enforcement collapses. Without shared minimum standards — binding on all vendors — the race is to the bottom on enforcement, not the top.


Enforcement is not one thing. It operates at three levels. Meaningful governance requires all three — any single level alone can be circumvented.

Technical enforcement is the strongest form. The system is architecturally incapable of the prohibited action, or produces an automatic, irreversible refusal. It does not rely on human judgment at the point of use. It cannot be overridden by a customer request or a contract amendment. Examples: inference-time classifiers for illegal content; use-case gating that restricts deployment contexts based on verified API credentials; kill-switch architecture with externally held trigger authority.
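To make the structural point concrete, here is a minimal sketch of what an inference-time gate of this kind might look like. Everything in it is hypothetical: the classifier is a stub, the hash set is a placeholder, and it describes no vendor's actual pipeline. What matters is the shape: the refusal path runs unconditionally, with no flag that a customer request or contract amendment can flip.

```python
import hashlib

# Placeholder for a vetted external database of known illegal content hashes.
BLOCKED_HASHES = {"d1e8a70b5ccab1dc2f56bbf7e99f064a660c08e361a35751b9c483c88943d082"}

def classifier_score(content: bytes) -> float:
    """Stub for a trained inference-time classifier.
    Returns the estimated probability that `content` is prohibited."""
    return 0.0  # placeholder

def gate(content: bytes, threshold: float = 0.5) -> bool:
    """Return True if the output may be released, False for an automatic refusal.

    There is no bypass parameter. That absence is the point: the control
    holds regardless of who is asking, which is what distinguishes technical
    enforcement from policy.
    """
    digest = hashlib.sha256(content).hexdigest()
    if digest in BLOCKED_HASHES:                 # hash match against known content
        return False
    if classifier_score(content) >= threshold:   # classifier-based refusal
        return False
    return True
```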

Contractual enforcement is legally binding but operationally dependent. It requires defined terms — specific enough that violation can be identified, not reinterpreted. It requires a monitoring mechanism — someone checking whether the terms are being followed. It requires consequences — financial penalties, contract termination, public disclosure — that are automatic or close to it. A contract clause without a monitoring mechanism and a consequence schedule is policy with a signature on it.
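The three requirements (defined terms, monitoring, consequences) can be read as a completeness check. A hypothetical schema, with invented field names, makes the failure mode visible: a clause object with empty monitoring and consequence fields is structurally identical to a policy statement.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ContractClause:
    prohibition: str                                    # the specific prohibited action
    defined_terms: dict = field(default_factory=dict)   # e.g. {"autonomous": "..."}
    monitoring: Optional[str] = None                    # who checks compliance, and how often
    consequences: list = field(default_factory=list)    # penalties that trigger on violation

def is_enforceable(clause: ContractClause) -> bool:
    # A clause missing any of the three elements is policy with a signature on it.
    return bool(clause.defined_terms) and clause.monitoring is not None and bool(clause.consequences)

pledge = ContractClause(prohibition="No autonomous weapons use")
print(is_enforceable(pledge))  # -> False: no defined terms, no monitoring, no penalties
```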

Institutional enforcement is what makes the other two credible over time. This means independent auditors with genuine access rights and public reporting obligations. Regulators with investigative powers and enforcement authority. Civil society organisations with standing to challenge. Whistleblower protections for people inside companies who observe violations. Without institutional infrastructure, technical and contractual enforcement are only as strong as the company's willingness to self-apply them.

The EU AI Act is the closest existing example of institutional enforcement infrastructure applied to AI. It mandates conformity assessments, notified body oversight, regulator access, and financial penalties scaled to global turnover. It is imperfect and partially untested at scale — but it represents the architecture that makes commitments real. Most AI governance outside the EU has none of these elements.


This framework is a reading tool. When you encounter an AI ethics statement — in a government contract, a terms of service, a vendor pitch, or a regulatory filing — the five questions above give you a structured way to assess what it is actually worth.

For compliance and procurement practitioners: the framework maps directly to due diligence. Before integrating an AI system into a regulated workflow, ask whether the vendor's safety commitments are technical, contractual, or reputational. If the answer is reputational, that risk belongs in your risk register — not in a vendor attestation form.
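One way to operationalise that mapping in a due-diligence workflow, sketched with invented names and illustrative treatment wording (not drawn from any standard):

```python
def register_entry(commitment: str, enforcement: str) -> dict:
    """Map a vendor safety commitment to a risk-register treatment.

    `enforcement` is the strongest mechanism the vendor can evidence:
    "technical", "contractual", or "reputational".
    """
    treatments = {
        "technical": "Low residual risk. Verify the control independently, then accept.",
        "contractual": "Medium. Confirm defined terms, monitoring, and penalty clauses.",
        "reputational": "Open risk. Record in the register; do not rely on attestation alone.",
    }
    return {"commitment": commitment, "enforcement": enforcement,
            "treatment": treatments[enforcement]}

print(register_entry('"No surveillance use without consent"', "reputational")["treatment"])
```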

For journalists and policy researchers: the framework explains why AI ethics scandals follow a predictable pattern. The company had a policy. The policy was violated. The company updated the policy. Nothing structurally changed. The mechanism for that cycle is the absence of enforcement. Naming that mechanism — not just the violation — is what moves the story forward.

For anyone evaluating an AI tool: look for independently verifiable evidence, not vendor statements. If the only source confirming a safety claim is the company making the claim, treat it as policy. Label it accordingly.


Framework 04 is the analytical lens for any case where a company's stated commitment was not matched by its observed behaviour. Current cases on BrokenCtrl where this framework applies:

The Anthropic / Pentagon case is the clearest recent illustration. Anthropic's "no autonomous weapons" commitment existed as policy. It had no contractual definition of autonomy, no technical restriction preventing deployment in military targeting workflows, and no independent audit mechanism. When the DoD applied procurement pressure, the policy held — but only because Anthropic chose to absorb the cost. The choice was genuine. The infrastructure that would make the choice unnecessary did not exist.

OpenAI's subsequent DoD agreement illustrates the other side: stated guardrails, post-hoc clarifications, and no public mechanism for external verification. Whether those guardrails are enforced or merely described is currently unknowable. That unknowability is itself the finding.



QUESTIONS

What is the difference between an AI policy and AI governance?

An AI policy is a statement of intent — what a company says it will or will not do with its AI systems. AI governance is the set of mechanisms that make those intentions enforceable: technical controls, contractual obligations, independent audit rights, and regulatory consequences. Most organisations have AI policies. Very few have AI governance with genuine enforcement infrastructure. The gap between them is where most AI harms occur.

How can you tell if an AI company's ethics commitment is real?

Ask five questions: What happens if the commitment is violated? Can someone outside the company verify compliance? Is the commitment specific enough to be falsifiable? Has it held under commercial or political pressure? Is enforcement technical, contractual, or only reputational? A commitment that passes all five has real enforcement infrastructure. Most AI ethics commitments fail at least three of them — which means they are preferences, not constraints.

What is AI ethics enforcement?

AI ethics enforcement means that stated ethical commitments produce consequences when violated — consequences that do not depend on the company choosing to apply them. It operates at three levels: technical (the system cannot perform the prohibited action), contractual (legally binding terms with defined penalties and monitoring), and institutional (independent auditors, regulators, and civil society with genuine access and authority). Meaningful enforcement requires all three levels. Any single level alone can be circumvented.

Does the EU AI Act enforce AI ethics commitments?

The EU AI Act is the closest existing framework to genuine institutional enforcement for AI. It mandates conformity assessments for high-risk AI systems, requires independent notified body oversight, gives regulators investigative powers, and sets financial penalties scaled to global annual turnover. It is imperfect — implementation is uneven and some provisions remain untested — but it represents the architecture that makes commitments real. Most AI governance outside the EU lacks comparable enforcement infrastructure.

Why do AI companies rely on policy rather than enforcement?

Because policy is cheap, flexible, and reputationally sufficient in most conditions. Real enforcement — technical controls, independent audits, contractual penalties — costs money, limits product flexibility, and creates legal exposure. In a competitive market, enforcement is a cost centre and policy is not. Policy also preserves commercial optionality: a stated commitment can be reinterpreted under pressure, while a technical control cannot. Until regulators mandate enforcement infrastructure or procurement markets reward it, the incentive structure favours policy over enforcement.

Last updated: April 2026 · Framework 04