

What Is Content Shield?

Content Shield detects sensitive content in player messages — such as self-harm or suicidal intent — and silently escalates the conversation to a human agent. When triggered, zero automated messages reach the player. No greeting, no AI response, no transfer message. Only a human communicates.
Content Shield runs on every inbound message, not just the first. A player might start with a normal question and later express distress — Content Shield catches it regardless of when it appears.

How It Works

  1. Player sends a message (first or subsequent)
  2. Before any AI processing, Content Shield checks the message
  3. If sensitive content is detected → conversation is silently escalated to a human agent
  4. If not detected → normal flow continues (greeting, AI response, etc.)
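The four steps above can be sketched as a single gate that runs before any AI processing. This is an illustrative model only; the function, parameter, and field names (`handle_inbound`, `suppress_outbound`, the classifier result shape) are assumptions, not the actual API:

```python
def handle_inbound(message, ticket, classifier, escalate, run_ai_flow):
    """Illustrative sketch of the Content Shield gate (all names are assumptions)."""
    # Step 2: check the message before any AI processing happens
    result = classifier(message.text)
    if result["detected"]:
        # Step 3: silent escalation; suppress every automated outbound message
        ticket["suppress_outbound"] = True
        escalate(ticket, metadata=result)
        return None  # nothing automated reaches the player
    # Step 4: normal flow continues (greeting, AI response, etc.)
    return run_ai_flow(message, ticket)
```

The key property is the early return: when detection fires, the normal flow is never entered, so no greeting, AI response, or transfer message can be emitted.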

What the Player Experiences

  • If Content Shield triggers: Nothing automated. A human agent picks up the conversation.
  • If Content Shield doesn’t trigger: Normal experience — greeting, AI response, etc.

What the Operator Sees

  • Ticket is flagged in the escalation queue
  • Detection metadata shows the type, confidence score, and matched reference
  • The suppressOutbound flag prevents any automated outbound for the ticket’s lifetime
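Putting those three pieces together, a flagged ticket might carry data shaped roughly like the sketch below. The key names other than `suppressOutbound` are illustrative assumptions; check the actual ticket payload for exact fields:

```python
# Hypothetical shape of a flagged ticket; only suppressOutbound is named in the docs.
flagged_ticket = {
    "id": "tkt_1234",                # illustrative ticket id
    "suppressOutbound": True,        # blocks all automated outbound for the ticket's lifetime
    "detection": {
        "type": "self-harm",         # detection type that matched
        "confidence": 0.97,          # classifier confidence score
        "matchedReference": "...",   # reference the classifier matched against
    },
}
```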

Setting Up Content Shield

Content Shield is configured through Automation Rules using a special trigger type.

1. Create an Automation Rule

Go to Settings → Automation Rules and create a new rule:
  • Trigger: Content Detected
  • Detection Type: Self-harm
  • Action: Escalate to Human (with silent mode enabled)
Silent mode ensures no automated messages are sent to the player.
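Expressed as data, the rule from this step might look like the sketch below. The rule is configured through the Settings UI, not a config file, and these key names are assumptions for illustration:

```python
# Illustrative representation of the automation rule; not an actual config format.
content_shield_rule = {
    "trigger": "content_detected",     # the special trigger type
    "detectionType": "self-harm",
    "action": {
        "type": "escalate_to_human",
        "silentMode": True,            # no automated messages are sent to the player
    },
}
```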

2. Assign an Agent Team

Make sure a team of human agents is configured to receive escalated conversations; these agents handle every flagged ticket.

3. Test

Send a test message through your chat widget to verify detection is working correctly. Content Shield will evaluate the message and trigger escalation if sensitive content is detected.

Detection

Content Shield uses a purpose-built safety classifier that works across languages. Detection is fast — messages are evaluated in real time with no noticeable delay to the player.

Key Behaviors

  • Every inbound message is evaluated, not just the first. A player might start with a withdrawal question and later express distress.
  • One flag per ticket. Once a ticket is flagged, Content Shield won't re-trigger on subsequent messages in the same conversation; the flag is permanent for that ticket's lifetime.
  • Zero overhead when unconfigured. Workspaces without a Content Detected automation rule skip detection entirely: no API calls, no latency.
  • Transcript preserved. The player's message is saved to the transcript for completeness, even when intercepted; only the AI processing is skipped.
  • All supported channels. Content Shield works on LiveChat, Zendesk, Zoho, Respond.io, Intercom, Web Messenger, and Web, with full first-message coverage.
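The flag-once and skip-when-unconfigured behaviors can be modeled together as a simple predicate. This is a simplified sketch with assumed names (`should_run_detection`, the `rules` and `flagged` fields), not the actual implementation:

```python
def should_run_detection(workspace, ticket):
    """Simplified model of when Content Shield evaluates a message (names assumed)."""
    # Workspaces without a Content Detected rule skip detection entirely.
    if not any(r["trigger"] == "content_detected" for r in workspace["rules"]):
        return False
    # Once flagged, a ticket never re-triggers for its lifetime.
    if ticket.get("flagged"):
        return False
    return True
```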

Limitations

  • No per-brand thresholds — a single threshold applies across the workspace.
  • Very terse messages may not match — short phrases like “end it” with no context may fall below the detection threshold.

Content Shield is a safety layer that works alongside the AI agent's built-in responsible gaming detection. The AI agent classifies RG concerns (problem gambling, financial hardship, self-exclusion, emotional distress) during normal conversation flow. Content Shield adds a fast pre-AI safety net for the most critical content.