Neighborhood contamination
Neighborhood contamination is the phenomenon in which the interpretation of an entity or concept is altered by the semantic proximity of neighboring content (dominant categories, co-occurrences, adjacent entities), to the point where an AI system attributes to the subject properties that primarily belong to its environment rather than to its canon.
In an interpreted web, meaning is not determined solely by what you declare, but by what surrounds you. Neighborhood contamination is therefore a major mechanism of interpretive invisibilization and capture.
Definition
Neighborhood contamination is the situation where:
- a subject A has a clear canon;
- but its semantic neighborhood (B, C, D) is denser, more repeated, or more dominant;
- and the AI system projects onto A attributes, intentions, categories, or explanations drawn from the neighborhood.
The result is an interpretation that is “statistically coherent” but canonically false.
Why this is critical in AI systems
- The model learns by proximity: co-occurrences and associations outweigh fine-grained distinctions.
- The model standardizes: it reduces the specific to the most frequent generic (smoothing).
- The model aligns on clusters: a dominant cluster can reframe your concept.
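The proximity mechanism above can be illustrated with a toy sketch. The corpus, the subject name `acme-framework`, and the neighbor vocabulary below are all hypothetical; the point is only that a co-occurrence profile built for a sparsely documented subject absorbs the vocabulary of a denser adjacent cluster:

```python
from collections import Counter

# Hypothetical toy corpus: the subject "acme-framework" appears far less
# often than the dominant neighbor category "certification".
corpus = [
    "acme-framework governance canon",
    "certification audit compliance",
    "certification audit standard",
    "certification compliance standard",
    "acme-framework certification audit",  # one co-occurrence with the neighbor
]

def cooccurrence_profile(term, docs):
    """Count the words that co-occur with `term` across documents."""
    counts = Counter()
    for doc in docs:
        words = doc.split()
        if term in words:
            counts.update(w for w in words if w != term)
    return counts

profile = cooccurrence_profile("acme-framework", corpus)
# The neighbor vocabulary ("certification", "audit") now sits in the
# subject's profile with the same weight as its own canonical terms.
print(profile.most_common(4))
```

A model trained on such statistics has no internal reason to rank `governance` above `certification` for this subject; only the canon does, which is why the frequency imbalance alone is enough to reframe the concept.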
Common contamination forms
- Categorical contamination: your concept is reframed into a standard category (e.g. “framework” assimilated to “certification”).
- Homonymy contamination: interference from a better-known entity that shares the same name.
- Dominant discourse contamination: a current or school imposes its vocabulary around your subject.
- Secondary source contamination: wikis, aggregators, summaries that become more visible than your canon.
Practical indicators (symptoms)
- AI systems describe your subject with the attributes of another adjacent subject.
- Your vocabulary is “corrected” toward generic terms.
- Responses cite sources that mostly concern the neighborhood, not you.
- The confusion persists even after publishing a canon, indicating inertia.
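The third symptom (vocabulary drawn from the neighborhood rather than from you) can be checked with a crude lexical heuristic. This is a minimal sketch, not a validated metric: the term sets, the sample description, and the threshold interpretation are all assumptions made for illustration.

```python
def contamination_score(description, canon_terms, neighbor_terms):
    """Ratio of neighbor-vocabulary hits to canon-vocabulary hits in an
    AI-generated description; values above 1.0 suggest contamination."""
    tokens = set(description.lower().split())
    canon_hits = len(tokens & canon_terms)
    neighbor_hits = len(tokens & neighbor_terms)
    return neighbor_hits / max(canon_hits, 1)  # avoid division by zero

# Hypothetical vocabularies for a subject and its dominant neighbor.
canon = {"interpretive", "canon", "governance"}
neighbors = {"certification", "audit", "compliance"}

desc = "a certification and audit framework for compliance"
print(contamination_score(desc, canon, neighbors))  # 3.0: fully reframed
```

In practice one would run such a check over many sampled responses and track the ratio over time, since the last symptom in the list (inertia after publishing a canon) only shows up as a trend.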
What neighborhood contamination is not
- It is not a simple factual error. It is a referential shift.
- It is not only SEO. It is a property of interpretation by proximity.
- It is not necessarily intentional. It can emerge without explicit attack.
Minimum rule (enforceable formulation)
Rule NC-1: when a subject is exposed to a dominant neighborhood, the canon must provide disambiguation markers and explicit governed negations against probable reframings. Any attribution originating from the neighborhood must be considered at-risk inference and, if ungoverned, trigger a legitimate non-response.
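Rule NC-1 can be expressed as a small decision procedure. The sketch below assumes hypothetical data structures (a set of canonical attributes and a set of governed negations); the function name and return strings are illustrative, not a prescribed API:

```python
def resolve_attribution(attribution, canon_attrs, governed_negations):
    """Sketch of Rule NC-1: an attribution is either canonical, explicitly
    negated by governance, or an ungoverned at-risk inference."""
    if attribution in canon_attrs:
        return f"confirmed: {attribution}"
    if attribution in governed_negations:
        return f"explicitly negated: {attribution}"
    # Ungoverned neighborhood attribution: trigger a legitimate non-response.
    return "no canonical basis for this attribution"

canon_attrs = {"interpretive framework"}
governed_negations = {"certification scheme"}  # probable reframing, pre-negated

print(resolve_attribution("certification scheme", canon_attrs, governed_negations))
print(resolve_attribution("audit standard", canon_attrs, governed_negations))
```

The design choice worth noting is the third branch: anything the canon neither confirms nor negates is refused rather than guessed, which is exactly what the rule means by treating neighborhood-originating attributions as at-risk inference.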
Example
Case: an original concept is explained as a variant of a more widespread concept, because surrounding pages use that dominant vocabulary.
Diagnosis: neighborhood contamination, interpretive smoothing, then interpretive capture.
Expected correction: canonical reinforcement, governed negations, satellite pages, external graph, fidelity proofs.