Training data poisoning: source governance and provenance
This page defines training data poisoning as a corruption of provenance that alters learned authority, and clarifies why source governance is an interpretive challenge, not merely a technical one.
When the training corpus is contaminated, the problem is not merely “an error in a dataset”. The problem is an alteration of what the system learns as regularities, hierarchies, associations, and truth signals.
On gautierdorval.com, training poisoning is treated as a case of AI poisoning with high inertia: once learned, the bias becomes difficult to isolate, because it manifests as “natural” model behavior.
Status of this page
This page is an interpretive clarification.
It stabilizes the term’s usage in this ecosystem and distinguishes it from ordinary data noise, variable corpus quality, or merely controversial content on the web.
Operational definition
Training data poisoning: deliberate (or subsequently instrumentalized) alteration of a corpus used to train or fine-tune a model, in order to provoke a bias, deviation, instability, or conditional behavior that subsequently manifests as a system property.
The central signature is a provenance corruption: the system learns from sources that should not be authoritative, or learns relations that have been artificially made dominant.
Why provenance is the real perimeter
The risk is not merely “what is in the text”, but the status of sources and the mechanisms by which they enter the corpus:
- source selection and ingestion perimeters
- licenses, rights, and usage constraints
- traceability, timestamping, versions, and lineage
- deduplication, canonicalization, normalization
- implicit weighting (repetition, overrepresentation, imbalance).
Weak provenance governance allows low-quality, deceptively authoritative, or hostile-intent sources to become “learned truth”.
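As an illustration only, the ingestion controls listed above (lineage fields, canonicalization, deduplication, and a check on implicit weighting) can be sketched in a few lines of Python. Everything named here is an assumption made for the example: the `Document` fields, the `a.example` style source identifiers, and the `max_share` threshold are hypothetical and are not drawn from any specific pipeline.

```python
import hashlib
import unicodedata
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical ingestion record: text plus minimal provenance fields."""
    source: str        # assumed source identifier, e.g. a domain name
    license: str       # usage constraint recorded at ingestion
    retrieved_at: str  # ISO 8601 timestamp, kept for lineage
    text: str


def canonical_hash(text: str) -> str:
    """Normalize Unicode form, case, and whitespace, then hash the result."""
    normalized = unicodedata.normalize("NFKC", text).casefold()
    normalized = " ".join(normalized.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(docs):
    """Keep the first document seen for each canonical hash."""
    seen = set()
    unique = []
    for doc in docs:
        digest = canonical_hash(doc.text)
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def overrepresented_sources(docs, max_share=0.5):
    """Return each source whose share of documents exceeds max_share."""
    counts = Counter(doc.source for doc in docs)
    total = len(docs)
    return {
        source: count / total
        for source, count in counts.items()
        if count / total > max_share
    }
```

Exact-hash deduplication only catches copies that differ in case, spacing, or Unicode form. Paraphrased repetition, which is the more plausible vector for artificial dominance, would need near-duplicate detection, and the share threshold is a policy decision rather than a technical constant.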
Minimum typology (effect mechanisms)
- Directional bias: favoring an interpretation, attribution, or narrative.
- Degradation: introducing noise, contradictions, or conceptual confusion.
- Reference drift: making the system learn an erroneous source hierarchy (inverted authority).
- Instability: making outputs sensitive to minor changes in wording, for lack of stabilization.
- Conditional triggering: provoking behavior only under certain conditions (without detailing procedures here).
Necessary distinctions
- An imperfect corpus is not automatically poisoned: the key is intention (or instrumentalization) and systemic effect.
- Public disinformation is not poisoning as long as it is not integrated into the training corpus with sufficient weight.
- RAG drift concerns an indexed and recalled corpus. Training poisoning concerns learned authority.
Source governance (interpretive reading)
In an interpreted web, source governance is a component of interpretive governance:
- defining what has the right to be authoritative
- documenting exclusions (what must not be learned)
- stabilizing canonical definitions and their boundaries
- reducing inference space through explicit bounds.
Without these bounds, learning tends to reconstruct “probable” concepts, not authorized concepts.
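The first two bounds listed above (what has the right to be authoritative, and what is documented as excluded) can be made concrete with a small policy object. This is a minimal sketch under stated assumptions: `SourcePolicy`, `Decision`, and `admit` are hypothetical names invented for this example, and the rule that undecided sources are rejected by default is one possible reading of "reducing inference space", not a prescribed method.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    AUTHORITATIVE = "authoritative"
    EXCLUDED = "excluded"
    UNDECIDED = "undecided"


@dataclass
class SourcePolicy:
    """Hypothetical governance record: explicit allowlist plus documented exclusions."""
    authoritative: set = field(default_factory=set)
    exclusions: dict = field(default_factory=dict)  # source -> documented reason

    def decide(self, source: str) -> Decision:
        # A documented exclusion takes precedence over the allowlist.
        if source in self.exclusions:
            return Decision.EXCLUDED
        if source in self.authoritative:
            return Decision.AUTHORITATIVE
        return Decision.UNDECIDED


def admit(sources, policy):
    """Admit only explicitly authoritative sources; return the rest with a reason."""
    admitted = []
    rejected = {}
    for source in sources:
        if policy.decide(source) is Decision.AUTHORITATIVE:
            admitted.append(source)
        else:
            rejected[source] = policy.exclusions.get(
                source, "not explicitly authorized"
            )
    return admitted, rejected
```

Rejecting undecided sources by default keeps the corpus inside explicit bounds: a source that is merely "probable" never becomes learnable without a recorded decision.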
Relation to other clarifications in the series
- AI poisoning: definition, taxonomy, and interpretation risks
- RAG poisoning: corpus contamination and interpretive drift
- Prompt injection: authority threat and instruction/data confusion
Scope of this clarification
This page applies to human readings, automated syntheses, zero-click citations, and interconnected agent chains. It must be interpreted as a clarification of principle: if provenance is not governed, learned authority becomes a potential source of drift.
Anchoring
- Clarifications
- Doctrine
- Artificial intelligence use policy: /en/ai-use-policy/
Practical clarification
This clarification sets the boundary of the term “training data poisoning” as a matter of source governance and provenance. Its purpose is not to expand the corpus with another abstract term, but to prevent a common confusion from spreading across search engines, LLM outputs, knowledge panels, internal summaries, or agentic responses.
A clarification is useful when two notions are close enough to be conflated but different enough that the conflation creates risk. The reader should therefore ask what is being separated: a person from an organization, a definition from a service, visibility from legitimacy, retrieval from authority, observation from proof, or a tool surface from the canon. Once that separation is explicit, later pages can route toward the correct definition, framework, audit method or evidence layer.
What this prevents
The main risk is not that a reader misunderstands a word once. The risk is that an ambiguous relation becomes a reusable assumption. In AI-mediated environments, a weak relation can be repeated as if it were current, generalized as if it were structural, or elevated as if it were authorized. A clarification is effective only when it changes what later systems are allowed to assume. After reading this page, a person or model should know which association is permitted, which association is forbidden, and which association remains unproven.