User Feedback Loop with Governance Integration

    • Type: Suggestion
    • Resolution: Unresolved
    • Component/s: Agents

      Summary

      Extend the existing thumbs up/down feedback mechanism to include categorised reasons for negative feedback, and surface this feedback in the Agent Studio for platform team review. Provide a workflow for converting user-reported issues into evaluation dataset candidates.

      Problem Statement

      Rovo agents currently support thumbs up/down feedback on responses, but this feedback does not flow into the agent governance pipeline. The platform team has no visibility into which responses users are flagging as problematic, what categories of problems are occurring, or how feedback trends over time.

      This creates two gaps:

      1. Blind spots in evaluation datasets. Evaluation datasets are built from anticipated prompts, not observed failures. Real users ask questions the platform team didn't anticipate, and encounter problems that aren't covered by existing datasets.
      2. No early warning system. A spike in negative feedback on a specific topic could indicate a knowledge source has become outdated, an instruction gap has emerged, or a recent change has introduced a regression. Without visibility into feedback, the platform team only discovers these issues through ad-hoc reports or scheduled evaluations.

      Proposed Solution

      Enhanced user feedback (user-facing)

      When a user clicks the thumbs down button on an agent response, a popover appears asking them to select a reason:

      • Incorrect information - The response contained factual errors
      • Out of date - The information was correct previously but is no longer current
      • Didn't answer my question - The agent deflected or couldn't find relevant information
      • Inappropriate or harmful - The response contained content that shouldn't have been generated
      • Other - None of the above categories apply

      The reason selection is a single click - no free-text required. The user submits and continues their conversation. Thumbs up feedback remains a single click with no additional input.
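
      To keep the categories machine-readable end to end, the feedback event could carry the reason as an enumerated value. The sketch below is illustrative only: the FeedbackEvent shape, field names, and buildFeedbackEvent helper are assumptions for this request, not an existing Rovo API.

      ```typescript
      // Illustrative payload for categorised feedback; shape and names are
      // assumptions, not an existing Rovo API.
      type FeedbackReason =
        | 'incorrect_information'
        | 'out_of_date'
        | 'did_not_answer'
        | 'inappropriate'
        | 'other';

      interface FeedbackEvent {
        agentId: string;
        conversationId: string;
        messageId: string;                  // the agent response being rated
        sentiment: 'positive' | 'negative';
        reason?: FeedbackReason;            // required when sentiment is 'negative'
        submittedAt: string;                // ISO 8601 timestamp
      }

      function buildFeedbackEvent(
        base: Omit<FeedbackEvent, 'submittedAt'>,
      ): FeedbackEvent {
        // Thumbs up stays a single click: no reason attached.
        if (base.sentiment === 'negative' && base.reason === undefined) {
          throw new Error('Negative feedback requires a reason category');
        }
        return { ...base, submittedAt: new Date().toISOString() };
      }
      ```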

      Feedback review interface (platform team)

      A new "User feedback" item in the Agent Studio sidebar, with a badge showing the count of pending (unreviewed) negative feedback items.

      The page displays:

      Summary metrics (30 days):

      • Total interactions
      • Positive feedback count
      • Negative feedback count
      • Pending review count

      Reason breakdown: A horizontal bar chart showing the distribution of negative feedback by reason category over the last 30 days. This gives the platform team an at-a-glance view of whether the agent's primary issue is accuracy, staleness, coverage, or safety.
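
      A minimal sketch of how the summary metrics and reason breakdown could be computed from stored feedback events, reusing the illustrative FeedbackEvent shape above. The StoredFeedback type is an assumption, and "total interactions" is approximated here as rated interactions; in practice it would come from agent usage analytics.

      ```typescript
      // Minimal sketch of the 30-day summary metrics and reason breakdown.
      type ReviewStatus = 'pending' | 'reviewed' | 'added_to_dataset' | 'dismissed';

      interface StoredFeedback extends FeedbackEvent {
        status: ReviewStatus;
      }

      function summarise(events: StoredFeedback[], now: number = Date.now()) {
        const cutoff = now - 30 * 24 * 60 * 60 * 1000; // 30 days in milliseconds
        const recent = events.filter(e => Date.parse(e.submittedAt) >= cutoff);
        const negative = recent.filter(e => e.sentiment === 'negative');

        // Counts per reason category, feeding the horizontal bar chart.
        const byReason: Partial<Record<FeedbackReason, number>> = {};
        for (const e of negative) {
          if (e.reason) byReason[e.reason] = (byReason[e.reason] ?? 0) + 1;
        }

        return {
          // Approximated as rated interactions for this sketch; total volume
          // would come from usage analytics in practice.
          totalInteractions: recent.length,
          positive: recent.length - negative.length,
          negative: negative.length,
          pendingReview: negative.filter(e => e.status === 'pending').length,
          byReason,
        };
      }
      ```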

      Pending review table: A filterable table of negative feedback items awaiting review, showing:

      • Date and time
      • Reason category (colour-coded tag)
      • The user's original prompt (truncated preview)
      • The agent's response (truncated preview)
      • Actions: View (full conversation), + Dataset (add to evaluation dataset), Dismiss

      Filters:

      • By review status: pending, reviewed, added to dataset, dismissed
      • By reason category
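
      The two filters compose straightforwardly over the stored feedback items; a sketch using the illustrative StoredFeedback shape from the previous snippet:

      ```typescript
      // Sketch of the pending review table filters: review status and
      // reason category, both optional.
      interface FeedbackFilter {
        status?: ReviewStatus;
        reason?: FeedbackReason;
      }

      function filterFeedback(items: StoredFeedback[], f: FeedbackFilter): StoredFeedback[] {
        return items.filter(
          i =>
            i.sentiment === 'negative' &&
            (f.status === undefined || i.status === f.status) &&
            (f.reason === undefined || i.reason === f.reason),
        );
      }
      ```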

      Feedback-to-dataset workflow

      When a platform team member reviews a feedback item and clicks "+ Dataset," they are prompted to:

      1. Select which dataset to add the prompt to (accuracy, boundary, or a custom dataset)
      2. Write or edit the expected response (what the agent should have said)
      3. Confirm

      The prompt and expected response are added to the selected dataset. The next scheduled or manual evaluation will include this prompt, closing the loop from user report to automated testing.
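
      The conversion step could look roughly like the sketch below. The DatasetEntry shape and its source/feedbackId fields are assumptions about how evaluation datasets might store entries; they are not a defined schema.

      ```typescript
      // Illustrative "+ Dataset" conversion from a reviewed feedback item.
      interface DatasetEntry {
        prompt: string;           // the user's original prompt, copied verbatim
        expectedResponse: string; // written or edited by the reviewer
        source: 'user_feedback';  // distinguishes these from lab-crafted entries
        feedbackId: string;       // back-reference to the feedback item for audit
      }

      function toDatasetEntry(
        feedbackId: string,
        prompt: string,
        expectedResponse: string,
      ): DatasetEntry {
        if (expectedResponse.trim() === '') {
          throw new Error('Write an expected response before adding to a dataset');
        }
        return { prompt, expectedResponse, source: 'user_feedback', feedbackId };
      }
      ```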

      Feedback statuses

      • Pending: Feedback received, not yet reviewed by the platform team
      • Reviewed: Platform team has viewed the feedback
      • Added to dataset: The prompt has been added to an evaluation dataset
      • Dismissed: The platform team reviewed and determined no action is needed (e.g. user error, unreasonable expectation)
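
      One way to keep the lifecycle consistent is a forward-only transition map, sketched below; the assumption that "Added to dataset" and "Dismissed" are terminal states is ours, not stated in this request.

      ```typescript
      // Sketch of the status lifecycle as a forward-only transition map.
      const transitions: Record<ReviewStatus, ReviewStatus[]> = {
        pending: ['reviewed', 'added_to_dataset', 'dismissed'],
        reviewed: ['added_to_dataset', 'dismissed'],
        added_to_dataset: [], // terminal (assumption)
        dismissed: [],        // terminal (assumption)
      };

      function canTransition(from: ReviewStatus, to: ReviewStatus): boolean {
        return transitions[from].includes(to);
      }
      ```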

      Design Reference

      See attached mockup (mockup-user-feedback.html) showing:

      1. Mockup 1 - User view: chat interface with thumbs down selected and the categorised reason popover displayed. Shows the five reason categories with radio selection and submit button.
      2. Mockup 2 - Platform team view: User feedback page in the Agent Studio sidebar. Shows summary metrics, reason breakdown bar chart, and the pending review table with action buttons (View, + Dataset, Dismiss).

      Use Cases

      Knowledge source drift detection: The platform team notices a spike in "Out of date" feedback on a specific topic. Investigation reveals that the underlying Confluence page was updated two weeks ago with new process steps, but the agent is still giving the old instructions. The team updates the accuracy dataset with the corrected expected response and triggers a re-evaluation.

      Evaluation dataset enrichment: A user asks a question the platform team didn't anticipate ("How do I configure ConfiForms conditional field visibility?"). The agent fails to answer. The platform team reviews the feedback, writes the correct expected response, and adds it to the accuracy dataset. Future evaluations now cover this query.

      AUP bypass detection in production: A user reports an "Inappropriate" response where the agent inferred a colleague's emotional state from their Jira activity. The platform team reviews the feedback, confirms it's an AUP bypass, and adds the prompt to the AUP evaluation dataset. This expands the AUP dataset with a real-world prompt rather than only lab-crafted test cases.

      False positive identification: A user marks a correct response as "Incorrect" because they disagree with the process described (e.g. they want self-service space creation but the agent correctly directs them to the request process). The platform team reviews and dismisses the feedback - the agent responded correctly, the user's expectation was wrong.

      Interaction with Other Features

      • Evaluation datasets: The "+ Dataset" action directly adds prompts to existing evaluation datasets, which are then included in scheduled and re-verification evaluations.
      • Compliance dashboard: Feedback volume and sentiment could be surfaced as additional columns on the compliance dashboard, giving portfolio-level visibility into which agents have the most negative feedback.
      • Scheduled evaluations: New dataset entries from feedback are included in the next scheduled evaluation run automatically.

      Considerations

      • Privacy. User prompts captured via feedback may contain sensitive information. The feedback review interface should be restricted to agent and organisation administrators. Consider whether user identity should be visible to reviewers or anonymised.
      • Feedback volume. High-traffic agents may generate significant feedback volume. The pending review queue should support bulk actions (e.g. dismiss all "Other" feedback older than 30 days), and the reason breakdown chart helps the platform team triage by category rather than reviewing every item individually.
      • Positive feedback value. While this feature request focuses on negative feedback, positive feedback data is also valuable. Responses with high positive feedback rates could be used to validate that evaluation dataset expected responses are aligned with what users consider good answers.
      • Notification. Platform team administrators should receive a notification when negative feedback is received, configurable by threshold (e.g. notify immediately for "Inappropriate," daily digest for other categories); a possible rule shape is sketched below.
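
      A sketch of the per-category notification rules mentioned in the last bullet: immediate for "Inappropriate," daily digest for everything else. The NotificationMode type and rule table are assumptions for illustration.

      ```typescript
      // Illustrative per-category notification rules (assumed names).
      type NotificationMode = 'immediate' | 'daily_digest';

      const notificationRules: Record<FeedbackReason, NotificationMode> = {
        inappropriate: 'immediate',
        incorrect_information: 'daily_digest',
        out_of_date: 'daily_digest',
        did_not_answer: 'daily_digest',
        other: 'daily_digest',
      };

      function notificationModeFor(reason: FeedbackReason): NotificationMode {
        return notificationRules[reason];
      }
      ```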

              Assignee: Unassigned
              Reporter: Rachel Kim
              Votes: 0
              Watchers: 1