The decisions that shaped what shipped
What we saw
Agents weren't asking for "nicer text." They were trying to answer five questions fast: what happened, what was tried, what changed, what's in attachments, what's next.
What we chose
A structured, scannable summary that still carries enough context to reduce manual searching. Depth calibrated to the questions agents actually ask at handoff.
If the summary forces verification, it adds work. Depth is not a preference; it's a trust threshold.
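To make that concrete, here is a minimal sketch of a summary shaped around those five questions. It's TypeScript for illustration only; every field name is an assumption, not the shipped schema.

```typescript
// Illustrative only: field names are assumptions, not the shipped schema.
// The shape mirrors the five questions agents ask at handoff.
interface HandoffSummary {
  whatHappened: string;       // the issue, in one or two sentences
  whatWasTried: string[];     // steps already attempted, so nobody repeats them
  whatChanged: string[];      // state changes since the last touch
  attachments: Array<{        // what's in attachments, without opening each one
    name: string;
    gist: string;
  }>;
  nextSteps: string[];        // recommended actions for the receiving agent
  generatedAt: string;        // ISO timestamp, so freshness is checkable
}
```

Each field answers exactly one of the five questions. If an agent still has to open the ticket history to trust a field, the depth is wrong.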
What we saw
When agents couldn't immediately find the output, they assumed the feature failed and re-clicked. Not because the model was slow. Because the UI gave no signal.
What we chose
Treat discoverability as a requirement. Place the summary close to where decisions happen and make its presence obvious when new data arrives.
In real support work, "out of sight" becomes "not reliable," even if the model output is fine.
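One way to make "obvious when new data arrives" testable is to treat the signal as derived state rather than a styling choice. A hypothetical sketch; both field names are assumptions:

```typescript
// Illustrative sketch: names are assumptions. The "new" signal is derived
// state: the summary is new until the agent has actually seen it since the
// last data arrival on the case.
interface SummaryPresence {
  lastDataEventAt: number;     // most recent activity on the case (ms epoch)
  lastSummaryViewedAt: number; // when the agent last opened the summary
}

function showNewSummarySignal(s: SummaryPresence): boolean {
  return s.lastDataEventAt > s.lastSummaryViewedAt;
}
```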
What we saw
Agents naturally tried again when something felt missing, especially during handoff and near closure. That behavior is feedback, not misuse.
What we chose
Support refresh, but remove uncertainty with strong "new data" signals and visible "what changed" so agents know exactly what the regeneration captured.
A summary that quietly updates creates doubt. A refresh model without clarity creates thrash.
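A sketch of what a visible "what changed" could mean mechanically: the refresh returns an explicit field-level changelog alongside the new summary. The function and shapes here are assumptions for illustration, not the shipped code.

```typescript
// Illustrative sketch: not the shipped implementation. The point is that a
// refresh yields an explicit changelog the UI can render next to the summary.
type FieldChange = { field: string; before: string; after: string };

function diffSummaries(
  prev: Record<string, unknown>,
  next: Record<string, unknown>
): FieldChange[] {
  const changes: FieldChange[] = [];
  const fields = new Set([...Object.keys(prev), ...Object.keys(next)]);
  for (const field of fields) {
    const before = JSON.stringify(prev[field] ?? null);
    const after = JSON.stringify(next[field] ?? null);
    if (before !== after) {
      changes.push({ field, before, after });
    }
  }
  return changes; // an empty list means "refreshed, nothing new"
}
```

The empty case matters as much as the diff: "regenerated, no new information" answers the agent's doubt directly instead of leaving it to guesswork.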
What we saw
Agents want to correct wording, fill gaps, and reuse content in work notes. But teams need clarity on which content is AI-generated and which is human-written once it enters the record.
What we chose
Allow human correction with explicit attribution when reused, so the system stays trustworthy without blocking efficiency.
This is the decision that determines whether AI stays a "reference card" or becomes workflow-native documentation support.
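A minimal sketch of what explicit attribution could look like at the data level. The provenance values, field names, and helper are hypothetical, chosen to show the shape of the decision rather than an actual API:

```typescript
// Illustrative sketch: provenance values and field names are assumptions.
// Every block that enters the work notes carries its origin, so the record
// always distinguishes AI drafts, human text, and human-corrected AI text.
type Provenance = "ai" | "human" | "ai_edited_by_human";

interface NoteBlock {
  text: string;
  provenance: Provenance;
  editedBy?: string;        // agent id, set when a human corrects AI output
  sourceSummaryId?: string; // links reused content back to the generating summary
}

// Reusing an AI block in work notes: edits flip provenance instead of erasing it.
function reuseInWorkNotes(block: NoteBlock, agentId: string, editedText?: string): NoteBlock {
  if (editedText !== undefined && editedText !== block.text) {
    return { ...block, text: editedText, provenance: "ai_edited_by_human", editedBy: agentId };
  }
  return block; // unedited reuse keeps its original attribution
}
```

The design choice is that correction changes provenance rather than hiding it: efficiency stays, and the record stays honest.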