Feature Request: Response Attribution Monitoring for Rovo Agents


      Summary

      Provide visibility into whether agent responses are grounded in knowledge source content. Flag responses containing claims that cannot be traced back to any retrieved source, indicating potential fabrication or hallucination.

      Problem Statement

      Rovo agents retrieve content from knowledge sources and synthesise responses. When the agent provides a response, there is no indication of whether the response content is drawn from actual knowledge source material or whether the agent has fabricated information that sounds plausible but has no basis in the configured sources.

      This is a known characteristic of large language models - they can generate confident, authoritative responses that are entirely fabricated. For enterprise support agents, this presents a direct brand and reputational risk: users trust the agent's responses and may act on incorrect information.

      The risk is amplified for agents with org-wide knowledge scope, where the knowledge base is too large to comprehensively evaluate with a 50-prompt dataset. Fabrication detection provides a production safety net that catches hallucinations the evaluation datasets did not anticipate.

      Proposed Solution

      Response attribution analysis

      For a sample of responses, drawn at a customer-defined rate, the system analyses whether the claims in each response can be traced back to content retrieved from the agent's configured knowledge sources. Each response receives an attribution score representing the percentage of claims that are grounded in source content.

      Attribution level      Score     Meaning
      Fully attributed       90-100%   All or nearly all claims traceable to knowledge sources
      Partially attributed   50-89%    Some claims grounded, some unattributed
      Low attribution        0-49%     Majority of claims cannot be traced to knowledge sources

      Responses with low attribution are flagged for platform team review.
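The scoring and banding above can be sketched as follows. This is a minimal illustration, assuming the thresholds from the table; the level names and function are hypothetical, not an actual Rovo API.

```python
# Hypothetical thresholds mirroring the table above; in practice these
# would be configurable per agent (see Considerations).
FULL_THRESHOLD = 90
PARTIAL_THRESHOLD = 50

def attribution_level(grounded_claims: int, total_claims: int) -> str:
    """Classify a response by the share of claims traceable to sources."""
    if total_claims == 0:
        return "fully_attributed"  # nothing to attribute, nothing to flag
    score = 100 * grounded_claims / total_claims
    if score >= FULL_THRESHOLD:
        return "fully_attributed"
    if score >= PARTIAL_THRESHOLD:
        return "partially_attributed"
    return "low_attribution"  # flagged for platform team review
```

A response with 9 of 10 claims grounded scores 90% and lands in the fully attributed band; 2 of 10 lands in low attribution and is flagged.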

      Response attribution page

      A new "Response attribution" item in the Agent Studio sidebar.

      Summary metrics (30 days):

      • Total responses
      • Percentage fully attributed
      • Percentage partially attributed
      • Percentage with unattributed claims
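The summary metrics above could be aggregated from per-response attribution levels roughly like this. A sketch only; the level strings and return shape are assumptions carried over from the banding table, not a defined schema.

```python
from collections import Counter

def summary_metrics(levels):
    """Aggregate per-response attribution levels into dashboard percentages.

    `levels` is an iterable of level strings for responses in the
    30-day window, e.g. "fully_attributed" / "partially_attributed" /
    "low_attribution".
    """
    counts = Counter(levels)
    total = sum(counts.values())
    if total == 0:
        return {"total": 0}

    def pct(key):
        return round(100 * counts[key] / total, 1)

    return {
        "total": total,
        "fully_attributed_pct": pct("fully_attributed"),
        "partially_attributed_pct": pct("partially_attributed"),
        "unattributed_pct": pct("low_attribution"),
    }
```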

      Flagged responses table: A filterable table of responses with low or partial attribution, showing:

      • Date and time
      • User prompt (truncated preview)
      • Attribution score with visual meter
      • Number of unattributed claims
      • Actions: Review (drill into detail)

      Response detail view

      When the platform team reviews a flagged response, they see a claim-by-claim breakdown:

      • Attributed claims (green) - The specific statement from the response, with a link to the knowledge source page that contains the supporting content
      • Unattributed claims (red) - The specific statement from the response that could not be matched to any retrieved source

      This allows the platform team to quickly identify exactly which parts of the response were fabricated and take appropriate action:

      • Add to accuracy dataset - Create a prompt/expected-response pair to ensure the agent handles this query correctly in future evaluations
      • View full conversation - Review the full interaction in context
      • Dismiss - Mark as acceptable (the claim may be general knowledge that doesn't require a specific source)
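The claim-by-claim breakdown could be backed by record types along these lines. Field and class names are illustrative assumptions, not the Rovo data model; a claim counts as attributed when it links to a supporting knowledge source page.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Claim:
    """A single statement extracted from an agent response."""
    text: str
    source_url: Optional[str] = None  # link to the supporting page, if any

    @property
    def attributed(self) -> bool:
        return self.source_url is not None

@dataclass
class ResponseReview:
    """A flagged response as seen in the detail view."""
    prompt: str
    claims: list

    def unattributed(self):
        # The red claims in the detail view
        return [c for c in self.claims if not c.attributed]

    def attribution_score(self) -> float:
        if not self.claims:
            return 100.0
        grounded = sum(c.attributed for c in self.claims)
        return 100 * grounded / len(self.claims)
```

From such a record the detail view can render green claims with their source links, red claims without, and the overall score shown in the flagged responses table.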

      Design Reference

      See attached mockup (mockup-fabrication-detection.html) showing:

      1. Mockup 1 - Response attribution dashboard with summary metrics, and a flagged responses table showing four responses with low attribution scores and unattributed claim counts.
      2. Mockup 2 - Response detail view showing a claim-by-claim analysis. Attributed claims are shown in green with links to their source pages. Unattributed claims are shown in red. Actions allow adding the prompt to the accuracy dataset or dismissing.

      Use Cases

      Catching hallucinated limits: A user asks about Jira custom field limits. The agent correctly references the field governance process (attributed) but fabricates a specific number ("the default limit is 500") that doesn't appear in any knowledge source. The platform team reviews the flagged response, identifies the fabricated claim, and adds the prompt with the correct expected response to the accuracy dataset.

      Org-wide scope quality assurance: An agent with org-wide knowledge scope answers a question about a topic outside its primary domain. The response has 0% attribution because the agent synthesised an answer from general model knowledge rather than Telstra-specific content. The platform team reviews and determines whether to add domain-specific content to the knowledge sources or add deflection instructions for that topic.

      Knowledge source gap identification: Multiple flagged responses cluster around the same topic - all with low attribution. This indicates the agent is being asked about something not covered in its knowledge sources. Rather than fabricating answers, the agent should be deflecting. The platform team either adds relevant content to the knowledge sources or adds a deflection rule to the agent's instructions.

      Considerations

      • Performance and cost. Attribution analysis requires comparing agent output against retrieved source content at query time. This has a processing cost. Consider whether attribution analysis runs on every response or on a configurable sample rate (e.g. 10% of responses) to manage resource consumption.
      • Attribution threshold. The threshold for flagging should be configurable per agent. A general-purpose agent may reasonably include general knowledge statements that won't be fully attributed. A compliance-focused agent should have a higher attribution threshold.
      • General knowledge vs fabrication. Not all unattributed claims are fabrications. Statements like "Jira is a project management tool" are general knowledge. The dismiss action allows the platform team to clear these without cluttering the review queue. Consider allowing pattern-based auto-dismissal for common general knowledge claims.
      • Latency. Attribution analysis should not impact the response time experienced by the end user. Analysis should run asynchronously after the response is delivered, with results available in the dashboard within minutes.
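One way to implement the configurable sample rate is deterministic hash-based sampling, so a given response always gets the same in-or-out decision regardless of which worker handles it. A sketch under that assumption; the function name and scheme are hypothetical.

```python
import hashlib

def should_analyse(response_id: str, sample_rate: float) -> bool:
    """Decide whether a delivered response enters the async attribution queue.

    Hashing the response id makes the decision deterministic and
    reproducible; sample_rate=0.1 admits roughly 10% of responses.
    """
    digest = int(hashlib.sha256(response_id.encode()).hexdigest(), 16)
    return (digest % 10_000) < sample_rate * 10_000
```

Because the decision happens after the response is delivered, sampling adds no user-visible latency; only sampled responses incur the analysis cost.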

              Assignee: Neha Bora
              Reporter: Vindika D