Type: Suggestion
Resolution: Unresolved
Component/s: Agents
Labels: None

Issue Summary

Some LLM providers frequently update their models automatically (e.g. minor version bumps or silent model changes). Even when these changes are small, they can alter how prompts are interpreted and lead to unexpected differences in responses.
As a result, the following problems can occur:

Prompt behavior can suddenly change for production users.
Teams may see regressions in the quality or consistency of responses.
It becomes hard to confidently adopt newer LLM versions, since any change is effectively “testing in production.”

This is particularly frustrating for teams who have carefully tuned prompts or workflows and want stability and predictability, but also need to keep up with newer LLM versions.

Suggestion

Introduce a Sandbox environment or mode in Rovo where admins can do the following:

Configure or select a new LLM version/model that is not yet applied to production.
Test prompts, workflows, and typical user scenarios against this new version in the Sandbox.
Compare behavior between the current production LLM version and the candidate version (for example, via side‑by‑side responses or an A/B style comparison for key prompts).
Once the behavior is validated and accepted, promote the tested LLM version from Sandbox to Production in a controlled way.
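
To illustrate the requested compare-then-promote flow, here is a minimal sketch in Python. Everything in it is hypothetical: Rovo does not expose these functions, the two "models" are local stubs standing in for the production and candidate LLM versions, and the names `compare_models` and `safe_to_promote` are invented for this example. Text similarity is only a crude stand-in for whatever acceptance criteria an admin would actually apply.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, List

# Hypothetical type: any function that takes a prompt and returns a response.
ModelFn = Callable[[str], str]


@dataclass
class Comparison:
    prompt: str
    production: str
    candidate: str
    similarity: float  # 0.0 (no overlap) to 1.0 (identical text)


def compare_models(prompts: List[str], production: ModelFn,
                   candidate: ModelFn) -> List[Comparison]:
    """Run each key prompt against both versions and pair the responses."""
    results = []
    for prompt in prompts:
        prod_out = production(prompt)
        cand_out = candidate(prompt)
        score = SequenceMatcher(None, prod_out, cand_out).ratio()
        results.append(Comparison(prompt, prod_out, cand_out, score))
    return results


def safe_to_promote(results: List[Comparison], threshold: float = 0.8) -> bool:
    """Gate promotion: every key prompt must stay at or above the threshold."""
    return all(r.similarity >= threshold for r in results)


# Stub models (assumptions for illustration, not real LLM calls).
def production_model(prompt: str) -> str:
    return f"Summary: {prompt.lower()}"


def candidate_model(prompt: str) -> str:
    # Simulates a silent behavior change for some prompts only.
    if "ticket" in prompt:
        return f"Summary: {prompt.lower()}"
    return f"Here is an overview of {prompt.lower()}"


KEY_PROMPTS = ["Summarize this ticket", "List open risks"]
results = compare_models(KEY_PROMPTS, production_model, candidate_model)

for r in results:
    print(f"{r.prompt!r}: similarity={r.similarity:.2f}")
    print(f"  production: {r.production}")
    print(f"  candidate:  {r.candidate}")

print("Promote candidate:", safe_to_promote(results, threshold=1.0))
```

In this sketch the second prompt drifts under the candidate version, so a strict threshold blocks promotion. A real Sandbox would more likely rely on human review of the side-by-side output or task-specific checks than on a single similarity score.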

Assignee: Unassigned
Reporter: Takeshi Muramatsu
Votes: 1
Watchers: 2

Created: 21/Jan/2026 1:30 AM
Updated: 21/Jan/2026 1:31 AM
