Scorecards
A scorecard is the AI grader’s verdict on one conversation. Each one has per-criterion scores, an overall score, a one-line rationale per criterion, and timestamped evidence quotes from the transcript.
What a scorecard looks like
| Criterion | Score | Rationale |
|---|---|---|
| Empathy | 4/5 | Acknowledged feelings at 0:14 (“I hear how frustrating this is”). Missed a second opportunity at 1:42. |
| Resolution | 5/5 | Full refund issued, customer confirmed satisfaction. |
| Compliance | 3/5 | Disclosure paraphrased rather than read verbatim. |
| Efficiency | 4/5 | Solved in 3:12, slight detour into upsell. |
| Overall | 4.0/5 | — |
Click any criterion to expand the evidence: the transcript turns the grader cited, each linked to the matching timestamp in the audio.
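For anyone working with scorecards outside the UI, the structure described above could be modelled roughly as follows. This is a minimal sketch, not the product's actual schema: the class names, field names, and the `call-001` ID are hypothetical, and it assumes the overall score is the unweighted mean of the criterion scores, which happens to match the example table but is not stated in this page.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    timestamp: str  # position in the call audio, e.g. "0:14"
    quote: str      # transcript text the grader cited


@dataclass
class CriterionScore:
    name: str
    score: int      # out of 5, as in the example table
    rationale: str
    evidence: list[Evidence] = field(default_factory=list)


@dataclass
class Scorecard:
    conversation_id: str
    criteria: list[CriterionScore]

    @property
    def overall(self) -> float:
        # Assumption: unweighted mean of the criterion scores, one decimal.
        return round(sum(c.score for c in self.criteria) / len(self.criteria), 1)


card = Scorecard(
    conversation_id="call-001",  # hypothetical ID
    criteria=[
        CriterionScore(
            "Empathy", 4, "Acknowledged feelings at 0:14.",
            [Evidence("0:14", "I hear how frustrating this is")],
        ),
        CriterionScore("Resolution", 5, "Full refund issued."),
        CriterionScore("Compliance", 3, "Disclosure paraphrased."),
        CriterionScore("Efficiency", 4, "Solved in 3:12."),
    ],
)
print(card.overall)  # 4.0
```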
Filtering and triage
The scorecard list supports the filters you’d expect plus a few that matter for QA work:
- Below threshold — show only calls below a score floor.
- Outliers — calls 2+ standard deviations off the team mean.
- Unreviewed — no human eyes on it yet.
- Disputed — agent disagreed with the AI score.
- By criterion — find every call that scored low on `compliance` this week.
The default landing view is Below threshold + Unreviewed — i.e. the calls that need attention.
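The filter logic can be sketched in a few lines if you want to reproduce it on exported data. This is an illustration under assumptions: the `CallSummary` type and its fields are hypothetical, and "standard deviation" is taken to be the population standard deviation of the team's overall scores, which the page does not specify.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class CallSummary:
    call_id: str
    overall: float
    reviewed: bool = False


def below_threshold(calls, floor):
    """Calls whose overall score is under the score floor."""
    return [c for c in calls if c.overall < floor]


def outliers(calls, k=2.0):
    """Calls at least k standard deviations away from the team mean."""
    scores = [c.overall for c in calls]
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return []  # every call scored the same, so nothing stands out
    return [c for c in calls if abs(c.overall - mu) >= k * sigma]


def default_view(calls, floor):
    """Below threshold + Unreviewed: the calls that need attention."""
    return [c for c in below_threshold(calls, floor) if not c.reviewed]


# Eight typical calls, one low but already reviewed, one very low and unreviewed.
calls = [CallSummary(f"c{i}", 4.0, reviewed=True) for i in range(1, 9)]
calls.append(CallSummary("c9", 2.8, reviewed=True))
calls.append(CallSummary("c10", 1.0))

print([c.call_id for c in outliers(calls)])           # ['c10']
print([c.call_id for c in default_view(calls, 3.0)])  # ['c10']
```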
Override an AI score
Coaches can override any score. The AI’s number stays visible (greyed out) so the disagreement is auditable.
1. Open the scorecard.
2. Click the criterion you want to adjust.
3. Set your score and add a one-line reason.
4. Save.
The agent gets a notification. The override flows into the dashboards immediately.
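One way to picture why the disagreement stays auditable: the override is stored next to the AI's score rather than replacing it. The sketch below is illustrative only; `CriterionResult`, `Override`, and `apply_override` are hypothetical names, not part of any documented API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Override:
    coach: str
    score: int
    reason: str


@dataclass
class CriterionResult:
    name: str
    ai_score: int                        # never overwritten
    override: Optional[Override] = None

    @property
    def effective_score(self) -> int:
        # The coach's score wins when present; the AI score stays on record.
        return self.override.score if self.override else self.ai_score


def apply_override(result, coach, score, reason):
    """Attach a coach override; a one-line reason is required."""
    if not reason.strip():
        raise ValueError("an override needs a one-line reason")
    result.override = Override(coach, score, reason.strip())
    return result


compliance = CriterionResult("Compliance", ai_score=3)
apply_override(compliance, "coach-1", 4, "Paraphrase covered every required point.")
print(compliance.effective_score, compliance.ai_score)  # 4 3
```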
Agent disputes
Agents can dispute their own scores from their dashboard. Disputes show up in the coach’s queue with the original score, the agent’s note, and a one-click approve/reject.
| Outcome | What happens |
|---|---|
| Coach agrees | Score is updated, dashboards recompute. |
| Coach disagrees | Original score stands; coach’s note is saved with the dispute. |
| Coach escalates | Routed to a senior reviewer. |
Take disputes seriously. A pattern of disputes on the same criterion usually means the rubric description is unclear, not that the agent is wrong.
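The three outcomes, plus a quick count of disputes per criterion to spot an unclear rubric description, can be sketched like this. The types, status strings, and function names are hypothetical and only mirror the table above; how the product stores disputes internally is not described here.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    AGREE = "coach agrees"
    DISAGREE = "coach disagrees"
    ESCALATE = "coach escalates"


@dataclass
class Dispute:
    criterion: str
    original_score: int
    agent_note: str
    status: str = "open"
    score: Optional[int] = None  # score in force once the dispute is resolved
    coach_note: str = ""


def resolve(dispute, outcome, new_score=None, coach_note=""):
    if outcome is Outcome.AGREE:
        if new_score is None:
            raise ValueError("agreeing with a dispute requires the updated score")
        dispute.score, dispute.status = new_score, "approved"
    elif outcome is Outcome.DISAGREE:
        # Original score stands; the coach's note is saved with the dispute.
        dispute.score, dispute.status = dispute.original_score, "rejected"
        dispute.coach_note = coach_note
    else:
        dispute.status = "escalated"  # a senior reviewer decides later
    return dispute


def disputes_per_criterion(disputes):
    """Count disputes by criterion to surface unclear rubric descriptions."""
    return Counter(d.criterion for d in disputes)


disputes = [
    Dispute("Compliance", 3, "I read the disclosure in full."),
    Dispute("Compliance", 2, "Disclosure was given earlier in the call."),
    Dispute("Empathy", 4, "I acknowledged the issue twice."),
]
resolve(disputes[0], Outcome.AGREE, new_score=4)
resolve(disputes[2], Outcome.DISAGREE, coach_note="Second acknowledgement not found.")
print(disputes_per_criterion(disputes).most_common(1))  # [('Compliance', 2)]
```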
Calibration
Once a quarter, take a sample of 30-50 graded calls and have a senior reviewer score them blind. Compare the reviewer's scores with the AI's, criterion by criterion. If the AI is consistently 0.5+ points off on a criterion, refine that criterion's description in the rubric and re-score the period.
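A calibration comparison can be as simple as averaging the gap per criterion. The sketch below assumes "consistently 0.5+ points off" means the mean signed difference (AI minus reviewer) across the sample is at least 0.5 in either direction; that reading, the function name, and the sample data are illustrative assumptions.

```python
from statistics import mean


def calibration_gaps(pairs, tolerance=0.5):
    """pairs: iterable of (criterion, ai_score, reviewer_score).

    Returns the criteria whose mean AI-minus-reviewer gap reaches the tolerance.
    """
    diffs = {}
    for criterion, ai, reviewer in pairs:
        diffs.setdefault(criterion, []).append(ai - reviewer)
    gaps = {c: mean(d) for c, d in diffs.items()}
    return {c: g for c, g in gaps.items() if abs(g) >= tolerance}


# Made-up sample: the AI runs high on Compliance, and is unbiased on Empathy.
sample = [
    ("Compliance", 4, 3), ("Compliance", 4, 3), ("Compliance", 3, 3), ("Compliance", 5, 4),
    ("Empathy", 4, 4), ("Empathy", 5, 4), ("Empathy", 3, 4), ("Empathy", 4, 4),
]
print(calibration_gaps(sample))  # {'Compliance': 0.75}
```

A signed mean lets over- and under-scoring cancel out, so it flags systematic bias rather than noise. Use the mean absolute difference instead if any kind of disagreement should count.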
Exporting scorecards
| Destination | How |
|---|---|
| CSV | Bulk export from the scorecard list. |
| Google Sheets | One-click via the Sheets integration. |
| Data warehouse | Stream via the API or scheduled CSV. |
| Slack | Daily/weekly digests via notifications. |
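If you post-process exports yourself, a flat one-row-per-criterion layout is easy to load into a sheet or warehouse. The column names below are hypothetical and may not match the product's real CSV export; the sketch only shows how to write such a file with Python's standard `csv` module.

```python
import csv
import io

# Hypothetical column layout: one row per call and criterion.
FIELDNAMES = ["call_id", "criterion", "score", "rationale"]


def scorecards_to_csv(rows):
    """Serialize scorecard rows (dicts keyed by FIELDNAMES) to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = [
    {"call_id": "call-001", "criterion": "Resolution", "score": 5,
     "rationale": "Full refund issued, customer confirmed satisfaction."},
    {"call_id": "call-001", "criterion": "Efficiency", "score": 4,
     "rationale": "Solved in 3:12, slight detour into upsell."},
]
csv_text = scorecards_to_csv(rows)
print(csv_text)
```

Rationales often contain commas, so let the `csv` module handle quoting rather than joining fields by hand.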
Related
| If you want to… | Go to |
|---|---|
| Define a new rubric | Scoring Rubric |
| Add a yes/no rule | Custom KPIs |
| Read team-level trends | Reports & Trends |