AI Content Vetting—Partner-Feed Policy

Scope. This policy covers any content ingested into the McClatchy publishing pipeline from a partner feed (Stacker, Field Level Media, Minute Media, Reuters, Campus Insights, and future partners). CSA-pipeline-generated content is out of scope; it is governed by SCALED_CONTENT_POLICY.md and the editorial style guides. This page is the inbound vetting layer.

Why this page exists. Generative-AI tooling has commoditized “competent-enough” article production, and partner feeds carry an unknown mix of human-written, AI-assisted, and fully AI-generated content. Ingesting AI-generated partner content at scale exposes McClatchy to the same scaled-content-abuse risks as producing it ourselves: Google’s enforcement looks at the publisher, not at who produced the content. Per the scaled-content-abuse policy, the publisher (us) bears the consequences regardless of who wrote it.

This policy defines the gate: what counts as AI-generated partner content, how we vet it before ingestion, and what we do with content that fails the gate.


Definitions

The line between “AI-assisted” and “AI-generated” is the human-editorial-floor test—see §Gate criteria.


Risk model

Ingesting AI-generated partner content creates three failure surfaces:

Risk 1: scaled-content-abuse exposure

Google’s March 2024 and March 2026 enforcement waves penalized publishers hosting AI-generated content at scale, even when partners produced it. The publisher bears the search-rank consequence; “we didn’t write it” doesn’t matter to the algorithm. Per SCALED_CONTENT_POLICY §6, the 9-check audit list (publication cadence inconsistent with team size, no named credentialed author, only city/name changed, no HITL review, etc.) applies to inbound partner content whenever McClatchy publishes it under our brand.

Risk 2: substance-floor + anti-hyperbole risk

Per the engineering-leadership 4/29 directive (“substance + anti-hyperbole vigilance”), the core failure mode of AI-generated content is missing substance combined with exaggeration. Partner-fed AI content often skews to listicle, aggregation, or “X reasons why” framing: high in word count, low in substantive insight. Ingesting this at scale degrades the masthead.

Risk 3: provenance + fact-check void

CSA-pipeline content rests on built-in fact-check assumptions: the user-supplied source is the corpus, the plagiarism check runs against it (per the p24 Copyscape integration), and substance comes from named source materials. AI-generated partner content typically has none of these. The model’s parametric memory is the source, so post-hoc fact-checking is nontrivial. Per the Mode 2 guardrails framework, AI-asserted facts without a citable source are a structural risk surface.


Gate criteria—the 5-point test

Every batch of partner content runs against five gates before ingestion. Failing any one quarantines the batch (does not auto-publish; routes to human review).

G1—Named credentialed author

The byline must name a specific human with verifiable credentials in the topic area. “Stacker Editorial” or “Field Level Media Staff” fails; a named reporter with a public profile passes. The author’s name is reproduced unchanged on the McClatchy publication (no scrubbing).

Rationale: per scaled-content-abuse policy §6 check #2; per the substance-floor directive. A named author is the first line of accountability.
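
An automated pre-screen for G1 could look like the sketch below. It is an illustration, not the pipeline's implementation: the function name `byline_prescreen` and most of the token list are assumptions (only “Editorial” and “Staff” come from the examples above). Passing the pre-screen means only that the byline is not obviously institutional; credential verification still requires a human check.

```python
# Hypothetical tokens marking an institutional (non-person) byline.
# "editorial" and "staff" come from the policy's own examples;
# the remaining tokens are assumptions added for illustration.
INSTITUTIONAL_TOKENS = {"staff", "editorial", "editors", "team", "desk", "newsroom"}


def byline_prescreen(byline: str) -> bool:
    """Return True if the byline could plausibly name a specific person.

    False means the byline fails G1 outright. True only means the piece
    moves on to credential verification, which this sketch does not attempt.
    """
    words = [w.strip(".,").lower() for w in byline.split()]
    words = [w for w in words if w]
    if len(words) < 2:  # empty or single-word bylines are not a full name
        return False
    return not any(w in INSTITUTIONAL_TOKENS for w in words)
```

A token list like this will misfire on real surnames that match a token, so a False result is better treated as "route to review" than as a final verdict.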

G2—Substance-floor

The content carries at least three of: (a) a primary source quote, (b) a named original fact not commonly knowable from a model’s training corpus, (c) a specific recent date or event reference, (d) a named expert opinion attributed by name + credential, (e) a non-trivial statistical claim with citation. Aggregation-of-aggregation content (compiling top-10 from other top-10s without primary work) fails.

Rationale: counters the AI-content failure mode of high-word-count low-substance output. A model-authored piece can pass this gate if the human editor added primary research; that’s the AI-assisted pattern we want.
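
The three-of-five rule can be expressed as a small check over reviewer-recorded signals. This is a minimal sketch under stated assumptions: the `SubstanceSignals` class and its field names are hypothetical, the signals are assumed to be filled in by a reviewer or an upstream tool, and treating aggregation-of-aggregation as an automatic fail is one reading of the criterion above.

```python
from dataclasses import dataclass


@dataclass
class SubstanceSignals:
    """G2 signals recorded for one piece (all field names are hypothetical)."""
    primary_source_quote: bool = False        # criterion (a)
    original_named_fact: bool = False         # criterion (b)
    recent_date_or_event: bool = False        # criterion (c)
    named_expert_opinion: bool = False        # criterion (d)
    cited_statistical_claim: bool = False     # criterion (e)
    aggregation_of_aggregation: bool = False  # assumed to be an automatic fail


G2_MIN_SIGNALS = 3  # "at least three of" (a) through (e)


def passes_g2(s: SubstanceSignals) -> bool:
    """Return True when the piece clears the substance floor."""
    if s.aggregation_of_aggregation:
        return False
    count = sum([
        s.primary_source_quote,
        s.original_named_fact,
        s.recent_date_or_event,
        s.named_expert_opinion,
        s.cited_statistical_claim,
    ])
    return count >= G2_MIN_SIGNALS
```

For example, a piece with a primary source quote, a recent event reference, and a cited statistic clears the floor; the same piece flagged as aggregation-of-aggregation does not.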

G3—Plagiarism + AI-content detection

Each piece runs through Copyscape (per the p24-plagiarism-validation integration) and an AI-content classifier (open-source rules per the engineering-leadership 4/29 framework: “open-source rules likely outperform commercial; not yet ticketed”). A Copyscape failure quarantines the piece pending source-attribution review. An AI-content classifier failure (high model-authored signal) quarantines it pending §G1+G2 manual re-verification.

Rationale: plagiarism check catches lifted-from-elsewhere content (the Mode-2 risk in inbound form); AI-content classifier flags candidates for stricter §G1+G2 review.
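
The two G3 outcomes map to different review queues, and a piece can land in both. The sketch below shows that routing only; it does not call Copyscape or any classifier. The function name, the queue labels, and the `AI_SIGNAL_THRESHOLD` value are assumptions, since the policy does not define what counts as a “high model-authored signal”.

```python
from typing import List

# Hypothetical cutoff for a "high model-authored signal".
# The policy does not set a number; tune this once a classifier is selected.
AI_SIGNAL_THRESHOLD = 0.8


def g3_review_queues(copyscape_match: bool, ai_score: float) -> List[str]:
    """Return the review queues a piece is quarantined into under G3.

    copyscape_match: True when the plagiarism check flagged the piece.
    ai_score: classifier output in [0, 1]; higher means more model-authored.
    An empty list means the piece passes G3.
    """
    queues: List[str] = []
    if copyscape_match:
        queues.append("source-attribution review")
    if ai_score >= AI_SIGNAL_THRESHOLD:
        queues.append("G1+G2 manual re-verification")
    return queues
```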

G4—Editorial fit

The content fits the destination publication’s editorial standards—per the relevant per-publication style guide (USW §10.3, WW §10.5, AP-Compatible §10.6.x). Off-brand voice, off-tone framing, or off-format structure fails.

Rationale: even high-substance AI-assisted content can be off-brand. The substance-floor directive’s “make sure it sounds like us” check applies.

G5—Cadence / scaled-content audit

Partner-feed ingestion rate is monitored monthly. If a partner’s feed produces output materially above the human-writable rate for the named bylines (per the 9-check audit list, scaled-content-policy §6 check #1), the partner enters the scaled-content suspicion bucket. Continued ingestion then requires elevated review (every piece §G1+G2-manual-verified) until cadence normalizes, the partner adds named-author capacity, or we exit the partner.

Rationale: “you didn’t write it” doesn’t protect us if the partner is itself running an AI-content mill whose output we launder through our masthead.
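
The monthly cadence check can be approximated by counting pieces per byline and comparing each count against a ceiling. This is a sketch only: the function name is hypothetical, and `MAX_PIECES_PER_BYLINE_PER_MONTH` is a placeholder, because the policy does not define a number for the “human-writable rate”.

```python
from collections import Counter
from typing import Dict, Iterable

# Placeholder ceiling; the policy does not fix a human-writable rate.
MAX_PIECES_PER_BYLINE_PER_MONTH = 30


def bylines_over_cadence(
    bylines: Iterable[str],
    limit: int = MAX_PIECES_PER_BYLINE_PER_MONTH,
) -> Dict[str, int]:
    """Return bylines whose monthly piece count exceeds the limit.

    bylines: one entry per piece ingested from a partner in the month.
    A non-empty result is a signal for the scaled-content suspicion bucket.
    """
    counts = Counter(bylines)
    return {name: n for name, n in counts.items() if n > limit}
```

With the placeholder limit, a month containing 31 pieces under one byline and 5 under another flags only the first. Whether one flagged byline is enough to move the whole partner into the suspicion bucket is a judgment the policy leaves to review.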


What happens to content that fails the gate

Quarantine path (single-gate failure):

  1. Content held in staging; not auto-published.
  2. Routed to content team lead’s inbox for review within 24h.
  3. Three outcomes: (a) approve with editorial pass (substance/source verification + tone fixes); (b) approve as-is with provenance flag (transparency note: “this is partner-feed AI-assisted content”); (c) reject and remove from feed.

Hard-block path (multi-gate failure OR cadence suspicion):

  1. Content held in staging permanently.
  2. Partner relationship review flagged for exec/leadership.
  3. Feed-level decision: continue ingest with enhanced review, suspend ingest, or terminate partner.

The defaults are conservative because the false-negative cost (publishing AI slop under our masthead) is much higher than the false-positive cost (an extra human review pass).
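
The choice between the two paths can be sketched as a function of which gates failed. The function name and the “ingest” label for the no-failure case are assumptions made for illustration; the quarantine and hard-block conditions follow the two paths described above.

```python
from typing import Set


def route(failed_gates: Set[str], cadence_suspicion: bool = False) -> str:
    """Pick the handling path for content based on its gate results.

    failed_gates: names of the gates that failed, e.g. {"G1", "G3"}.
    cadence_suspicion: True when the partner is in the suspicion bucket.
    """
    # Multi-gate failure or cadence suspicion takes the hard-block path.
    if cadence_suspicion or len(failed_gates) > 1:
        return "hard-block"
    # Exactly one failed gate takes the quarantine path.
    if len(failed_gates) == 1:
        return "quarantine"
    # No failures: content proceeds (label is hypothetical).
    return "ingest"
```

Checking the hard-block condition first keeps the conservative default: a single-gate failure from a partner already under cadence suspicion is hard-blocked, not quarantined.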


Operational mechanism

Who:

Cadence:

Logging:


Per-partner stance (current state)

| Partner | Stance | Notes |
| --- | --- | --- |
| Reuters wire | Trusted (G3 + G5 only) | Long history; named credentialed reporters; established fact-check process. Light-touch gate. |
| Stacker | Quarantine-by-default until G1+G2 verified per-piece | Per the 2026-04-29 engineering-leadership conversation, Stacker output reads as AI-assisted at minimum; substance floor not always cleared. |
| Field Level Media | Active vetting needed | High volume from limited byline pool; pre-G5 cadence risk. |
| Minute Media | Active vetting needed | Same pattern as Field Level. |
| Campus Insights | TBD (pre-launch) | The 20-university pipeline (p31) is video-source-to-article; AI-vetting load is in the article-generation step, not ingest. Different operational gate. |

This table updates as partner-vetting findings accumulate. Pierce maintains; exec/leadership approves partner-state changes.


What this is NOT


Changelog

What this retires

p15-partner next-action “Draft AI vetting policy proposal for exec/leadership + team member review”—DRAFTED. Pierce + exec/leadership + content team lead review next. Implementation gated on (a) which engineering team owns the ingest-time G1+G3 checks, (b) when AI-content-classifier OSS tooling is selected per the 4/29 framework, (c) partner_content_audit D1 table provisioning.