Collection: Definition
Type: Definition
Version: 1.0
Stabilization: 2026-02-19
Published: 2026-02-19
Updated: 2026-03-13

Compliance drift

Compliance drift designates the phenomenon in which an AI system produces, over time, responses that are increasingly incompatible with declared rules, policies, or constraints, without any explicit change to the canon. The rules remain the same, but the outputs diverge.

This drift is particularly dangerous because it is not always visible. A response can remain plausible and well formulated, yet fall outside the interpretability perimeter. Compliance degrades silently.


Definition

Compliance drift is the situation where:

  • a canon (rules, policies, limits, negations) is stable;
  • but system outputs become progressively less compatible with that canon;
  • and the canon-output gap increases despite the absence of change in the source.

Drift can stem from execution context changes (routing, activated sources, models), progressive neighborhood contamination, or external changes that reframe interpretation.
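The widening canon-output gap can be made measurable. Below is a minimal, illustrative Python sketch: the canon is expressed as named predicate rules, and a gap is computed per snapshot while the canon itself stays frozen. The rule names, keyword checks, and helper names are assumptions for illustration, not part of any defined standard.

```python
# Illustrative canon: each rule is a predicate over an output.
# Rule names and keyword checks are hypothetical examples.
CANON = {
    "no_medical_advice": lambda text: "you should take" not in text.lower(),
    "scope_disclaimer": lambda text: "within scope" in text.lower()
                                     or "outside my scope" in text.lower(),
}

def compliance_rate(outputs):
    """Fraction of outputs compatible with every canon rule."""
    ok = sum(all(rule(o) for rule in CANON.values()) for o in outputs)
    return ok / len(outputs)

def gap_trend(snapshots):
    """Canon-output gap (1 - compliance rate) per snapshot.
    Drift shows as a rising gap while CANON itself is unchanged."""
    return [round(1 - compliance_rate(batch), 2) for batch in snapshots]

month_1 = ["That is within scope.", "Outside my scope, I must decline."]
month_2 = ["That is within scope.", "You should take two tablets daily."]
print(gap_trend([month_1, month_2]))  # gap grows: [0.0, 0.5]
```

The point of the sketch is that drift is a property of the trajectory, not of any single output: the same frozen `CANON` is applied to each snapshot, and only the gap moves.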


Why this is critical in AI systems

  • It gives a false sense of control: “the rules exist, therefore it is compliant”.
  • It degrades reliability: audit becomes retrospective, not preventive.
  • It increases risk: decisions, compliance, reputation, and implicit liability.

Frequent causes

  • Model or behavior change: system update, fine-tuning, parameters.
  • Activated source change: new dominant external sources, disappearance of old ones.
  • Remanence / inertia: progressive return of old interpretations.
  • Insufficient response conditions: absence of non-response triggers and evidence.

Practical indicators (symptoms)

  • Responses become more “confident”, but less bounded (perimeter smoothing).
  • Exceptions and negations appear less and less.
  • The same question gives compatible responses one month, then incompatible the next.
  • Cited sources evolve toward secondary sources rather than the canon.
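Several of these symptoms can be watched mechanically. The sketch below tracks how often negation and exception markers appear per batch of responses, and raises an alert when they vanish (perimeter smoothing). The marker list, threshold, and function names are illustrative assumptions, not a standard detector.

```python
import re

# Hypothetical markers of bounded answers: negations, exceptions, scope cues.
MARKERS = ["cannot", "not permitted", "except", "unless", "out of scope"]

def marker_density(responses):
    """Average number of negation/exception markers per response."""
    hits = sum(
        len(re.findall(m, r.lower())) for r in responses for m in MARKERS
    )
    return hits / len(responses)

def smoothing_alert(history, floor=0.5):
    """Flag perimeter smoothing: density was above the floor, then fell
    below it. The floor value is an illustrative assumption."""
    return history[0] >= floor and history[-1] < floor

batch_jan = ["I cannot help with that, except in audit mode.",
             "This request is out of scope."]
batch_mar = ["Sure, here is the full procedure.",
             "Happy to help with anything."]
history = [marker_density(batch_jan), marker_density(batch_mar)]
print(smoothing_alert(history))  # True: negations have vanished
```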

What compliance drift is not

  • It is not a canon update. The canon is stable.
  • It is not a one-time incident. It is a trajectory.
  • It is not only a data problem. It is often a conditions and evidence problem.

Minimum rule (enforceable formulation)

Rule CD-1: any compliance drift must be detected by regular checks (interpretive observability) and reduced by imposing response conditions, fidelity proofs, and interpretation traces. A system without an evidence mechanism cannot claim stable compliance.
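One way to read Rule CD-1 operationally: a system may only report stable compliance when every answer carries an interpretation trace and the last check is recent. The sketch below is a hedged illustration of that reading; the record fields (`trace`, `checked_at`) and the freshness window are assumptions, not part of the rule's text.

```python
from datetime import date

def cd1_status(records, today, max_age_days=30):
    """Return "stable" only if every record carries an interpretation
    trace and was checked within max_age_days; otherwise the system
    cannot claim stable compliance (Rule CD-1)."""
    if not records:
        return "cannot claim stable compliance"
    for rec in records:
        if not rec.get("trace"):
            return "cannot claim stable compliance"
        if (today - rec["checked_at"]).days > max_age_days:
            return "cannot claim stable compliance"
    return "stable"

records = [{"trace": "canon rule applied verbatim",
            "checked_at": date(2026, 3, 1)}]
print(cd1_status(records, date(2026, 3, 13)))  # "stable"
print(cd1_status([], date(2026, 3, 13)))       # no evidence, no claim
```

Note the default in the empty case: absence of evidence yields "cannot claim stable compliance", never "stable", which is the enforceable core of the rule.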


Example

Case: an internal policy is stable, but AI systems begin formulating undeclared "reasonable" exceptions or generalizing beyond its perimeter.

Diagnosis: compliance drift (smoothing + extrapolation) despite stable canon.

Expected correction: recurring checks, evidence, reinforcement of governed negations and non-response triggers.