ZAI Monitor
Short version: the Coding Plan workflow runs monitored models sequentially under the same settings and tracks directional performance over matching time windows.
Sampling cadence: data is collected hourly.
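As a rough illustration of what "sequentially under the same settings" could look like, here is a minimal sketch of one sampling round. The names `run_round`, `call_model`, and `SHARED_SETTINGS`, along with the setting values, are assumptions made for this example and are not taken from the monitor itself. An external scheduler is assumed to trigger one round per hour.

```python
import time
from typing import Callable, Dict, List

# Hypothetical shared settings; the real values are not listed on this page.
SHARED_SETTINGS = {"temperature": 0.0, "max_tokens": 512}


def run_round(
    models: List[str],
    prompt: str,
    call_model: Callable[[str, str, dict], str],
    clock: Callable[[], float] = time.perf_counter,
) -> List[Dict[str, object]]:
    """Call each model in turn with identical settings and time each call."""
    results = []
    for model in models:  # one model at a time, never concurrently
        start = clock()
        try:
            call_model(model, prompt, SHARED_SETTINGS)
            ok = True
        except Exception:
            ok = False  # a failed call is recorded rather than raised
        results.append({"model": model, "ok": ok, "seconds": clock() - start})
    return results
```

Passing `call_model` and `clock` in as arguments keeps the sketch independent of any particular provider client and makes it easy to exercise with stand-ins.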
The monitor uses two prompt types to avoid overfitting to one response style.
Prompt 1
Code Generation + Tests
Python function + exactly 2 pytest tests, with strict formatting constraints.
Prompt 2
Structured metrics from sample request logs, including error handling and brief calculations.
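The exact wording of Prompt 2 is not shown here, so the following is only a hypothetical instance of the kind of task it describes: computing a few metrics from sample request logs while tolerating malformed records. The log fields (`status`, `latency_ms`) and the function name `summarize` are assumptions for illustration.

```python
from typing import Dict, List

# Hypothetical log records; field names are illustrative only.
SAMPLE_LOGS = [
    {"status": 200, "latency_ms": 120},
    {"status": 500, "latency_ms": 340},
    {"status": 200, "latency_ms": "n/a"},  # malformed latency value
    {"status": 200, "latency_ms": 80},
]


def summarize(logs: List[dict]) -> Dict[str, float]:
    """Return request count, error rate, and mean latency of valid records."""
    latencies = []
    errors = 0
    for record in logs:
        if record.get("status", 0) >= 500:
            errors += 1
        try:
            latencies.append(float(record["latency_ms"]))
        except (KeyError, TypeError, ValueError):
            continue  # skip records without a usable latency
    total = len(logs)
    return {
        "requests": total,
        "error_rate": errors / total if total else 0.0,
        "mean_latency_ms": sum(latencies) / len(latencies) if latencies else 0.0,
    }
```

With `SAMPLE_LOGS`, the malformed record still counts as a request but is left out of the latency mean, which is one reasonable reading of "error handling" for this task.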
This dashboard is directional, not a controlled lab benchmark. Network conditions and provider load can influence any individual run.
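One way to read results directionally, sketched under assumptions, is to compare per-model medians over only those hourly windows in which every model produced a sample, so that a single noisy run carries less weight. The function name `median_by_model` and the `(model, hour, seconds)` sample layout are hypothetical and not part of the dashboard.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Tuple


def median_by_model(samples: Iterable[Tuple[str, int, float]]) -> Dict[str, float]:
    """Median seconds per model over hours in which every model has a sample.

    Each sample is (model, hour_index, seconds); this layout is assumed.
    """
    by_hour = defaultdict(dict)
    for model, hour, seconds in samples:
        by_hour[hour][model] = seconds
    models = {m for row in by_hour.values() for m in row}
    # Keep only matching windows: hours covered by all models.
    shared = [row for row in by_hour.values() if set(row) == models]
    if not shared:
        return {}
    return {m: median(row[m] for row in shared) for m in models}
```

The median is used here instead of the mean because it is less sensitive to an occasional slow run caused by network conditions or provider load.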