Evaluation Dashboard
Monitor prompt performance and detect AI regressions in real time.
- 🧪 Total Experiments: 4
- 📊 Global Avg Score: 2.0
- ⚠️ Regressions Detected: 3
Experiment Performance Over Time
Recent Experiments
| ID | Status | Avg Score | Regression? | Created |
|---|---|---|---|---|
| cca26466... | COMPLETED | 5.0 | No | 2026-07-05 21:08 |
| 4c6c5bcf... | COMPLETED | 1.0 | ⚠️ YES | 2026-07-05 21:09 |
| 1f5b0195... | COMPLETED | 1.0 | ⚠️ YES | 2026-07-05 21:11 |
| a84a4baf... | COMPLETED | 1.0 | ⚠️ YES | 2026-07-05 21:17 |
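
The summary figures at the top of the dashboard appear to follow from the table rows: four experiments, an unweighted mean of (5.0 + 1.0 + 1.0 + 1.0) / 4 = 2.0, and three rows flagged as regressions. The sketch below reproduces that arithmetic under stated assumptions. The `Experiment` class and the `summarize` function are hypothetical names, not part of the dashboard's actual code. The regression flags are copied from the table as shown, since the rule the dashboard uses to detect a regression is not stated here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Experiment:
    """One table row (hypothetical structure, for illustration only)."""
    id_prefix: str      # truncated ID exactly as displayed; the full ID is unknown
    status: str
    avg_score: float
    regression: bool    # copied from the "Regression?" column, not recomputed


# Rows transcribed from the "Recent Experiments" table.
EXPERIMENTS = [
    Experiment("cca26466", "COMPLETED", 5.0, False),
    Experiment("4c6c5bcf", "COMPLETED", 1.0, True),
    Experiment("1f5b0195", "COMPLETED", 1.0, True),
    Experiment("a84a4baf", "COMPLETED", 1.0, True),
]


def summarize(experiments):
    """Compute the three headline figures from a list of experiments.

    Assumption: the global average is an unweighted mean of the
    per-experiment average scores, which matches the numbers shown.
    """
    scores = [e.avg_score for e in experiments]
    return {
        "total_experiments": len(experiments),
        # Guard against an empty list, where mean() would raise.
        "global_avg_score": round(mean(scores), 1) if scores else 0.0,
        "regressions_detected": sum(1 for e in experiments if e.regression),
    }


if __name__ == "__main__":
    print(summarize(EXPERIMENTS))
```

Note that one high-scoring run followed by three low-scoring runs yields a global average of 2.0, so the average alone hides the drop from 5.0 to 1.0. The per-experiment regression flag is what exposes it.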