Datasets
The Datasets page at /datasets is the per-org registry of benchmark data your evals reference. Datasets are cached per-workspace — pulled once, reused across every eval that pins them.
What’s in the registry
Each row shows:
- Name + version — e.g. openml-credit-g@1.0.0, sachs@1.0.0
- Description — one-line summary of what the dataset is
- Size — chip with rough on-disk size
- License — clickable link to the dataset’s license page
- Used by — labels for the templates/evals that reference this dataset
- Splits + row counts — e.g. train: 800 rows, test: 200 rows
- Status pill — Available / Downloading / Downloaded / Failed / Coming soon
Actions
Download
For Available (not yet downloaded) rows, click Download. The page:
- Kicks off a fetch job
- Shows a progress bar with bytes / total
- Polls GET /datasets/<id>/job every 1.5s
- Flips status to Downloaded when complete
You can navigate away during the download — it keeps running. Coming back to /datasets shows the live progress again.
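If you want to wait on a download from a script rather than the page, the same polling pattern can be sketched as below. This is a minimal sketch, not the page's actual implementation: the response fields (status, bytes, total) and the helper name wait_for_download are assumptions, and fetch_job stands in for whatever HTTP call you use to hit GET /datasets/<id>/job.

```python
import time
from typing import Callable, Dict


def wait_for_download(
    fetch_job: Callable[[], Dict],
    interval_s: float = 1.5,
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 1000,
) -> Dict:
    """Poll a job until it reports a terminal status; return the last payload.

    fetch_job is a caller-supplied function that performs the request to
    GET /datasets/<id>/job and returns the decoded JSON body. The field
    names used here ("status", "bytes", "total") are hypothetical.
    """
    for _ in range(max_polls):
        job = fetch_job()
        if job["status"] in ("Downloaded", "Failed"):
            return job
        # Mirror the page's "bytes / total" progress display.
        print(f'{job.get("bytes", 0)} / {job.get("total", "?")} bytes')
        sleep(interval_s)
    raise TimeoutError("download did not reach a terminal status")
```

Passing fetch_job and sleep in as parameters keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned responses.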
Remove
For Downloaded rows, the Remove button drops the cached copy. Useful for freeing space; the dataset stays in the registry and can be re-downloaded any time.
Pinning a dataset in an eval
Datasets are referenced by registry:<id>@<version> strings inside eval specs:
    dataset:
      name: openml-credit-g
      source: registry:openml-credit-g@1.0.0
      version: 1.0.0

The version pin ensures eval reproducibility across re-runs and across workspaces. Pinning to latest is intentionally not supported.
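To catch an unpinned or malformed reference before submitting a spec, a small local check along these lines may help. It is a sketch under assumptions: the exact grammar the registry accepts is not documented here, so the pattern below assumes ids made of letters, digits, dots, underscores, and hyphens, and versions of the form major.minor.patch. The names DatasetRef and parse_dataset_ref are hypothetical.

```python
import re
from typing import NamedTuple


class DatasetRef(NamedTuple):
    dataset_id: str
    version: str


# Assumed grammar: registry:<id>@<major>.<minor>.<patch>
_REF = re.compile(r"registry:(?P<id>[A-Za-z0-9._-]+)@(?P<version>\d+\.\d+\.\d+)")


def parse_dataset_ref(source: str) -> DatasetRef:
    """Split a registry reference into id and version, rejecting unpinned refs."""
    match = _REF.fullmatch(source)
    if match is None:
        # Covers "registry:sachs@latest", a missing version, or a missing prefix.
        raise ValueError(f"not a pinned registry reference: {source!r}")
    return DatasetRef(match.group("id"), match.group("version"))
```

For example, parse_dataset_ref("registry:openml-credit-g@1.0.0") yields the id openml-credit-g and the version 1.0.0, while "registry:sachs@latest" raises ValueError.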
Cache locality
The cache lives on the API server’s volume (not in your browser). Every member of your org shares the same downloaded copies — if your teammate already pulled sachs@1.0.0, you don’t pay the download again.
Adding a new dataset
The self-serve “Register dataset” flow is on the roadmap. Until it lands, contact us with the dataset’s source URL, license, splits, and expected row counts — we’ll add it to the registry for your workspace.
Coming-soon datasets
Rows with status Coming soon are registry entries with no download source wired up yet — usually because the source is gated behind an auth flow that requires manual setup. The chip names the blocker; once unblocked, the row flips to Available.
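The status changes described across this page can be summarized as a small transition table. This is only an illustrative sketch: the event names are hypothetical, the Remove transition back to Available is inferred from the statement that a removed dataset can be re-downloaded, and transitions into or out of Failed are left out because the page does not describe them.

```python
from enum import Enum


class Status(Enum):
    AVAILABLE = "Available"
    DOWNLOADING = "Downloading"
    DOWNLOADED = "Downloaded"
    FAILED = "Failed"
    COMING_SOON = "Coming soon"


# (current status, event) -> next status. Event names are hypothetical.
TRANSITIONS = {
    (Status.AVAILABLE, "download"): Status.DOWNLOADING,
    (Status.DOWNLOADING, "complete"): Status.DOWNLOADED,
    (Status.DOWNLOADED, "remove"): Status.AVAILABLE,  # inferred, see lead-in
    (Status.COMING_SOON, "unblock"): Status.AVAILABLE,
}


def next_status(current: Status, event: str) -> Status:
    """Return the status a row moves to, or raise if the move is not listed."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(f"no transition for {event!r} from {current.value}") from None
```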