Restricted Individuals—Governance Rule (Scoping)

Status: SCOPING DRAFT, NOT YET ACTIVE. The source list has not yet been handed over; this doc defines the structure and the enforcement boundary. Pierce-owned; engagement with the B2C content lead and the content team lead is pending.

Visibility: NON-PUBLIC. This file lives in csa-content-standards (private repo, CF Access gated). Do not publish it, link to it from public surfaces, or surface it in stakeholder presentations. Names ≠ accusations; the list exists because Legal said so and is treated as a legal-compliance artifact.


Why this rule exists

There is a known list (~7 pages, surfaced 2026-05-12 independently by both the National and B2C content leads) of individuals that McClatchy outlets are not allowed to write about. The list is legal/risk-driven (defamation history, settlement terms, litigation exposure). Two examples were cited in the surfacing conversation: Oprah Winfrey and Tom Cruise. The rest of the list is unknown to me at the time of writing.

Today the list is held only by a small group. It was described as “literally like seven pages long” and “no one has that list except for like five people on purpose.” Whether it exists as a document or only from memory is still an open question (see §6). That means:

  1. Anyone writing for any McClatchy outlet who is NOT one of those five people has no way to know if their subject is on the list until after publication (when Legal flags it).
  2. CSA-generated content has zero awareness of the list and will happily generate variants about flagged individuals.
  3. The Insights Agent push-to-CSA endpoint (per docs/insights-agent-endpoint-audit-2026-05-14.md) is a new ingestion vector that will need to honor the list.

The B2C content lead and the National content team lead both said, on 2026-05-12: “we should probably put that in the CSA.”

This doc establishes the home + the enforcement mechanism. The actual list-content lands in a separate non-committed surface (see §3).


§1. Scope

Applies to:

Does NOT apply to:


§2. Enforcement mechanism

Two-stage gate. The intake stage must pass before CSA generates any variant for a piece, and the pre-publish stage must pass before any generated variant is published.

§2.1 Intake-time scrub

Run at the boundary where a content unit enters CSA, before any variant generation.
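
A minimal sketch of what the intake-time check could look like, assuming the list has already been loaded into memory. The names RestrictedEntry, entry_matches, and intake_scrub are hypothetical, and whole-word, case-insensitive matching is an assumption; the actual match policy is not decided in this doc. The example uses a synthetic name, not a real entry.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RestrictedEntry:
    """Hypothetical in-memory form of one list entry (id, match_strings, aliases)."""
    id: str
    match_strings: tuple[str, ...]
    aliases: tuple[str, ...] = ()


def entry_matches(entry: RestrictedEntry, text: str) -> bool:
    """True if any match string or alias appears in `text` as a whole word or phrase."""
    terms = sorted(set(entry.match_strings + entry.aliases), key=len, reverse=True)
    if not terms:
        return False  # an entry with no strings can never match
    alternation = "|".join(re.escape(term) for term in terms)
    # (?<!\w) and (?!\w) stop a term from matching inside a longer word.
    # Case-insensitive matching is an assumption, not a decided policy.
    pattern = re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)
    return pattern.search(text) is not None


def intake_scrub(text: str, entries: list[RestrictedEntry]) -> list[str]:
    """Return ids of all matched entries; an empty list means the intake gate passes."""
    return [entry.id for entry in entries if entry_matches(entry, text)]
```

The function returns entry ids rather than matched names, so downstream logging can stay name-free.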

§2.2 Pre-publish scrub

Run again at the post-generation / pre-publish boundary, against the GENERATED variants. This catches the edge case where the source text doesn’t mention a restricted name but the underlying model’s training data introduces one (e.g., the LLM mentions Oprah in an unrelated comparison).
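
A sketch of the second pass, under the assumption that generated variants arrive as a mapping of variant key to text. The names compile_terms and pre_publish_scrub and the variant keys are hypothetical. The sketch only reports which variants hit; whether a hit blocks just that variant or the whole piece is not specified here.

```python
from __future__ import annotations

import re


def compile_terms(terms_by_id: dict[str, list[str]]) -> dict[str, re.Pattern[str]]:
    """Build one whole-word, case-insensitive pattern per entry id (assumed match policy)."""
    compiled: dict[str, re.Pattern[str]] = {}
    for entry_id, terms in terms_by_id.items():
        if not terms:
            continue  # nothing to match for this entry
        ordered = sorted(terms, key=len, reverse=True)
        alternation = "|".join(re.escape(term) for term in ordered)
        compiled[entry_id] = re.compile(rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE)
    return compiled


def pre_publish_scrub(
    variants: dict[str, str], patterns: dict[str, re.Pattern[str]]
) -> dict[str, list[str]]:
    """Return {variant_key: [matched entry ids]} for every variant that must not publish."""
    blocked: dict[str, list[str]] = {}
    for key, text in variants.items():
        hits = [entry_id for entry_id, pattern in patterns.items() if pattern.search(text)]
        if hits:
            blocked[key] = hits
    return blocked
```

An empty result means every variant passed the pre-publish gate.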

§2.3 Bypass-prohibited

No operator override path. No “publish anyway” button. This is the only governance rule in csa-content-standards that does not allow editorial override. Reason: the list exists because Legal said so, and editorial discretion is not the relevant authority for whether legal-risk content publishes. If an operator believes a hard-block is in error (false-positive match), the path is to escalate to Legal + the list-owner group, not to override at the tool level.
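
One way to make the no-override property structural rather than procedural, sketched under assumptions: the exception name, the enforce helper, and the message wording are all hypothetical. The point is that the enforcing function has no force or override parameter to pass.

```python
class RestrictedIndividualBlock(Exception):
    """Raised when a scrub stage matches; carries entry ids only, never names."""

    def __init__(self, stage: str, entry_ids: list[str]) -> None:
        self.stage = stage
        self.entry_ids = list(entry_ids)
        super().__init__(
            f"Hard block at {stage} (entries: {', '.join(self.entry_ids)}). "
            "No override is available; escalate suspected false positives "
            "to Legal and the list-owner group."
        )


def enforce(stage: str, matched_ids: list[str]) -> None:
    """Raise on any match. Deliberately takes no force/override argument."""
    if matched_ids:
        raise RestrictedIndividualBlock(stage, matched_ids)
```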


§3. Source list—schema + storage

The list itself is NOT committed in plaintext to this repo or any other. It lives in a separate location with separate access controls (candidates in §3.2).

§3.1 Schema (proposed)

- id: ri-001                            # opaque primary key; reference target for all audit logs + rule lookups
  added_date: 2024-03-15                # for retention / review-cycle audits
  added_by: legal                       # who put it on the list (role bucket only—no names)
  match_strings:
    - "Oprah Winfrey"
    - "Oprah"                           # short-form match
  aliases: []                            # additional spellings, nicknames, common misspellings
  category: legal-defamation-history    # or: legal-settlement-terms, legal-litigation-active, etc.
  review_date: 2026-03-15               # when this entry is re-evaluated by Legal
  notes_field_pointer: null              # if context notes exist, point to off-repo doc location—do NOT inline
- id: ri-002
  ...
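
A sketch of how a parsed record in this shape might be validated before use, assuming the record arrives as a plain dict (for example from a YAML or JSON parser). The RestrictedEntry class, parse_entry, the required-field set, and the "ri-" plus digits id format are assumptions drawn from the example above, not settled rules. The sample values are synthetic.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from datetime import date

# Assumed required fields; aliases and notes_field_pointer are treated as optional.
REQUIRED = {"id", "added_date", "added_by", "match_strings", "category", "review_date"}


@dataclass(frozen=True)
class RestrictedEntry:
    id: str
    added_date: date
    added_by: str
    match_strings: tuple[str, ...]
    aliases: tuple[str, ...]
    category: str
    review_date: date
    notes_field_pointer: str | None


def parse_entry(raw: dict) -> RestrictedEntry:
    """Validate one parsed record and return a typed entry; raises ValueError on bad data."""
    missing = REQUIRED - raw.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not re.fullmatch(r"ri-\d+", str(raw["id"])):
        raise ValueError("id must look like 'ri-001'")  # assumed id format
    match_strings = tuple(raw["match_strings"] or ())
    if not match_strings or not all(isinstance(s, str) and s.strip() for s in match_strings):
        raise ValueError("match_strings must contain at least one non-empty string")
    # str() accepts both ISO strings and date objects produced by a YAML parser.
    added = date.fromisoformat(str(raw["added_date"]))
    review = date.fromisoformat(str(raw["review_date"]))
    if review < added:
        raise ValueError("review_date precedes added_date")
    return RestrictedEntry(
        id=str(raw["id"]),
        added_date=added,
        added_by=str(raw["added_by"]),
        match_strings=match_strings,
        aliases=tuple(raw.get("aliases") or ()),
        category=str(raw["category"]),
        review_date=review,
        notes_field_pointer=raw.get("notes_field_pointer"),
    )
```

Rejecting an entry with no match strings at load time avoids a silent entry that can never block anything.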

§3.2 Storage location (TBD)

Three candidates:

  1. Database table in MCC_PRESENTATION.CONTENT_SCALING_AGENT (or a stricter access-controlled schema) with SELECT restricted to the CSA backend service account + a small admin group. Pros: queryable from CSA at intake, no file to leak. Cons: data-team admin grants required; provisioning timeline.
  2. Encrypted YAML/JSON file in a separate non-public repo (e.g., csa-restricted-list) with stricter access controls than csa-content-standards itself. Encrypted-at-rest via age or sops; decrypted by CSA backend at boot. Pros: file-based, version-controlled, smaller blast radius if access controls fail. Cons: file leak risk if any operator clones the repo.
  3. External secrets manager (HashiCorp Vault, GCP Secret Manager) with structured-secret support. Pros: no file at all, audit trail built in. Cons: heavier integration; CSA may not have this plumbed today.

Recommendation: Option 1 (DB table) once data-team provisioning lands. Option 2 as interim. Option 3 is over-engineered for ~7 pages of strings.
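
Since the recommendation is Option 2 as interim and Option 1 later, the scrub code could depend on a small loading interface so the backing store can change without touching the gates. The following is a sketch under assumptions: the RestrictedListSource protocol, both class names, and the restricted_match_strings table and its columns are hypothetical. sqlite3 stands in for the real warehouse, and the file variant assumes decryption (age or sops) has already happened elsewhere.

```python
from __future__ import annotations

import json
import sqlite3
from typing import Protocol


class RestrictedListSource(Protocol):
    def load(self) -> dict[str, list[str]]:
        """Return {entry_id: [match strings]}."""
        ...


class TableSource:
    """Option 1 stand-in: reads rows from a SQL table (sqlite3 substitutes for the warehouse)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def load(self) -> dict[str, list[str]]:
        rows = self._conn.execute(
            "SELECT entry_id, match_string FROM restricted_match_strings "
            "ORDER BY entry_id, match_string"
        )
        out: dict[str, list[str]] = {}
        for entry_id, match_string in rows:
            out.setdefault(entry_id, []).append(match_string)
        return out


class DecryptedFileSource:
    """Option 2 stand-in: parses JSON text that the backend already decrypted at boot."""

    def __init__(self, decrypted_json: str) -> None:
        self._payload = decrypted_json

    def load(self) -> dict[str, list[str]]:
        return {e["id"]: sorted(e["match_strings"]) for e in json.loads(self._payload)}
```

Both sources return the same shape, so moving from the interim file to the table would be a change at construction time only.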

§3.3 Access control


§4. Coordination with existing CSA governance

This rule integrates with the existing csa-content-standards stack:

| Existing rule | Interaction |
| --- | --- |
| §11 Layered Enforcement contract | Restricted-individuals scrub runs at the constitution layer (Layer 0, system-wide, inviolable), before any voice/format/style layer is consulted. Hard-block precedes all other validation. |
| docs/ai-content-vetting.md v1.9.11 | Gate 3 (plagiarism + AI-content detection) runs alongside this scrub at intake. The two are independent gates; either failing blocks the same piece. |
| docs/bridge-content-rules.md | Bridge variants from non-restricted source pieces remain in scope; bridge-variant generation gets the same intake + pre-publish scrub. |
| docs/insights-agent-endpoint-audit-2026-05-14.md | Pre-activation requirement #8 in that audit (“Restricted-individuals scrub”) points here. This doc establishes the home; the audit doc points the endpoint at it. |
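
The "independent gates, either failing blocks" relationship can be sketched as follows. This is illustrative only: GateResult, run_layer0, generate_if_clear, and the gate names are hypothetical, and real gates would be the scrub and the vetting checks rather than the stubs a caller passes in. Running every gate without short-circuiting is an assumption, chosen so each gate's outcome is recorded separately.

```python
from __future__ import annotations

from typing import Callable, NamedTuple


class GateResult(NamedTuple):
    gate: str
    passed: bool


Gate = Callable[[str], GateResult]


def run_layer0(text: str, gates: list[Gate]) -> list[GateResult]:
    """Run every intake gate (no short-circuit) so each outcome is recorded independently."""
    return [gate(text) for gate in gates]


def generate_if_clear(
    text: str, gates: list[Gate], generate: Callable[[str], str]
) -> str | None:
    """Call `generate` only if all gates pass; None means the piece is blocked."""
    results = run_layer0(text, gates)
    if not all(result.passed for result in results):
        return None  # voice/format/style layers are never consulted
    return generate(text)
```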

§5. Out of scope


§6. Open questions for the next conversation with content team

Before this rule activates, the following need answers from the source-of-truth group (Legal + the ~5 people who hold the current list):

  1. Where is the list today? (Document? Spreadsheet? Heads of 5 people?) This determines the migration path to a queryable source.
  2. What’s the change-cadence? Once per year? Quarterly? Ad-hoc as litigation lands? This determines whether a YAML file + manual edits suffices vs. needing a UI.
  3. Who owns the list going forward? Legal? VP-level editorial? This determines who has write access in §3.3.
  4. Are there context notes per entry? (“Don’t write about Person X in the context of Topic Y.”) If yes, the schema needs a structured notes field; if no, hard-block-on-name is sufficient.
  5. What about people who were ON the list but are now OFF it? (Settlement expires, etc.) The schema’s review_date field anticipates this—but we need to know if entries get hard-removed or soft-tombstoned for audit history.
  6. What’s the false-positive recourse path? If the entry for “Tom Cruise” fires on a piece that’s actually about a cruise-line vacation (for example, via a short-form match on “Cruise”), what’s the escalation flow?

§7. Implementation path

When the list-owner group hands over the source content:

  1. Stand up storage per §3.2 Option 1 or 2.
  2. Plumb intake-time scrub (§2.1) into CSA backend at the constitution layer.
  3. Plumb pre-publish scrub (§2.2) into the audience-variants stage.
  4. Wire audit table (§2.1) writes; verify name-redaction in the audit log.
  5. Smoke-test with a synthetic list (e.g., a fake name added temporarily) to confirm intake + pre-publish + audit fire correctly.
  6. Replace synthetic list with real list under access controls.
  7. Notify operators of the new hard-block error message + escalation path.
  8. Update docs/insights-agent-endpoint-audit-2026-05-14.md pre-activation requirement #8 to point at this doc + mark it satisfied.
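
Steps 4 and 5 could be exercised with something like the sketch below. It assumes a synthetic one-entry list and an audit row that stores the entry id but never the matched string; the row fields (piece_id, stage, entry_id, blocked_at) and all function names are hypothetical, since the audit table layout is not defined in this doc. Matching here is a plain case-insensitive substring check to keep the sketch short. In the real flow an intake block would stop the piece before pre-publish; the smoke test runs both stages on purpose so each is seen to fire.

```python
from __future__ import annotations

import json
from datetime import datetime, timezone

# Synthetic list for the smoke test only; no real entries.
SYNTHETIC_LIST = {"ri-test-001": ["Jane Placeholder"]}


def audit_row(piece_id: str, stage: str, entry_id: str) -> dict[str, str]:
    """Build a name-free audit record: the opaque entry id is the only list reference."""
    return {
        "piece_id": piece_id,
        "stage": stage,
        "entry_id": entry_id,
        "blocked_at": datetime.now(timezone.utc).isoformat(),
    }


def leaks_name(row: dict[str, str], restricted: dict[str, list[str]]) -> bool:
    """True if any match string appears anywhere in the serialized row."""
    serialized = json.dumps(row).casefold()
    return any(s.casefold() in serialized for strings in restricted.values() for s in strings)


def run_smoke_test(source_text: str, variant_text: str) -> list[dict[str, str]]:
    """Run both stages against the synthetic list and return the audit rows they produce."""
    rows: list[dict[str, str]] = []
    for stage, text in (("intake", source_text), ("pre_publish", variant_text)):
        for entry_id, strings in SYNTHETIC_LIST.items():
            if any(s.casefold() in text.casefold() for s in strings):
                rows.append(audit_row("piece-smoke-1", stage, entry_id))
    return rows
```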

Surfaced: 2026-05-12 (Insights Agent demo, captured 2026-05-14 intake).
Owner: Pierce (scoping) + content team leads + Legal (source).
Status: Scoping complete. Awaiting the Pierce + content team conversation on who currently holds the source list and what the handover path is.