ZAI Monitor

Methodology

Short version: the Coding Plan workflow runs the monitored models one after another under identical settings and compares their performance trends over matching time windows.

How Measurements Are Taken

Sampling cadence: data is collected hourly.
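
As a rough illustration of one hourly collection round, the sketch below calls each model in sequence with the same prompt and settings and records a latency for each. It is not the monitor's actual code: `run_round`, the `call_model` callable, and the record fields are hypothetical names chosen for this example, and the clock is injectable only so the loop can be exercised without real network calls.

```python
import time
from typing import Callable, Dict, List


def run_round(
    models: List[str],
    prompt: str,
    settings: Dict[str, object],
    call_model: Callable[[str, str, Dict[str, object]], str],
    clock: Callable[[], float] = time.perf_counter,
) -> List[Dict[str, object]]:
    """Run one sampling round: every model gets the same prompt and settings."""
    results = []
    for name in models:  # sequential on purpose, so runs do not compete for bandwidth
        start = clock()
        try:
            reply = call_model(name, prompt, settings)  # hypothetical provider call
            ok = True
        except Exception:
            # A failed call is recorded as a data point rather than aborting the round.
            reply = ""
            ok = False
        elapsed = clock() - start
        results.append(
            {"model": name, "ok": ok, "latency_s": elapsed, "chars": len(reply)}
        )
    return results
```

A scheduler (cron, for example) would invoke something like this once per hour and append the records to storage, so that each model's results line up in the same time window.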

Prompt Suite

The monitor uses two prompt types to avoid overfitting to one response style.

Prompt 1

Code Generation + Tests

A Python function plus exactly two pytest tests, with strict formatting constraints.

Prompt 2

JSON Analysis

Structured metrics from sample request logs, including error handling and brief calculations.
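
To make the task concrete, here is a minimal sketch of the kind of analysis this prompt asks a model to perform. The log schema is an assumption made for illustration: a JSON list of objects with an integer `status` and an optional numeric `latency_ms`. Neither the field names nor `summarize_logs` come from the monitor itself.

```python
import json


def summarize_logs(raw: str) -> dict:
    """Compute simple metrics from a JSON list of request-log entries."""
    try:
        entries = json.loads(raw)
    except json.JSONDecodeError:
        return {"error": "invalid_json"}
    if not isinstance(entries, list):
        return {"error": "expected_list"}

    total = errors = skipped = 0
    latencies = []
    for entry in entries:
        # Error handling: skip malformed entries instead of failing the whole run.
        if not isinstance(entry, dict) or not isinstance(entry.get("status"), int):
            skipped += 1
            continue
        total += 1
        if entry["status"] >= 400:  # assumed convention: 4xx/5xx count as errors
            errors += 1
        latency = entry.get("latency_ms")
        if isinstance(latency, (int, float)):
            latencies.append(latency)

    return {
        "total": total,
        "errors": errors,
        "skipped": skipped,
        "error_rate": errors / total if total else 0.0,
        "avg_latency_ms": sum(latencies) / len(latencies) if latencies else None,
    }
```

A model's JSON answer could then be compared field by field against a reference summary like this one.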

This dashboard is directional, not a controlled lab benchmark. Network conditions and provider load can influence any individual run.
