Datasets

The Datasets page at /datasets is the per-org registry of benchmark data your evals reference. Datasets are cached per workspace: each one is pulled once and reused by every eval that pins it.

What’s in the registry

Each row shows:

  • Name + version — e.g. openml-credit-g@1.0.0, sachs@1.0.0
  • Description — one-line summary of what the dataset is
  • Size — chip with rough on-disk size
  • License — clickable link to the dataset’s license page
  • Used by — labels for the templates/evals that reference this dataset
  • Splits + row counts — e.g. train: 800 rows, test: 200 rows
  • Status pill: Available / Downloading / Downloaded / Failed / Coming soon
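
As a rough mental model, one registry row could be represented like this. It is a sketch only: the class and field names (DatasetRow, license_url, used_by, and so on) are hypothetical and not the product's actual schema. Only the listed columns and the five status values come from this page.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    """The five status pill values listed above."""
    AVAILABLE = "Available"
    DOWNLOADING = "Downloading"
    DOWNLOADED = "Downloaded"
    FAILED = "Failed"
    COMING_SOON = "Coming soon"


@dataclass
class DatasetRow:
    """Hypothetical in-memory shape of one registry row."""
    name: str
    version: str
    description: str
    size: str                      # rough on-disk size, as shown on the chip
    license_url: str               # target of the clickable license link
    used_by: list[str] = field(default_factory=list)     # template/eval labels
    splits: dict[str, int] = field(default_factory=dict)  # split name -> row count
    status: Status = Status.AVAILABLE

    @property
    def ref(self) -> str:
        """Name + version in the form shown in the first column."""
        return f"{self.name}@{self.version}"

    @property
    def total_rows(self) -> int:
        """Sum of the row counts across all splits."""
        return sum(self.splits.values())
```

For example, a row with splits {"train": 800, "test": 200} reports total_rows of 1000, and its ref renders as name@version.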

Actions

Download

For Available (not yet downloaded) rows, click Download. The page:

  1. Kicks off a fetch job
  2. Shows a progress bar with bytes / total
  3. Polls GET /datasets/<id>/job every 1.5s
  4. Flips status to Downloaded when complete
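
The polling steps above can be sketched as a small loop. This is a minimal illustration, not the page's actual client code: fetch_job stands in for a call to GET /datasets/<id>/job, and the response fields (status, bytes_done, bytes_total) are assumed for the example. Only the endpoint and the 1.5s interval come from this page.

```python
import time
from typing import Callable

POLL_INTERVAL_S = 1.5  # polling interval stated in the steps above


def wait_for_download(
    fetch_job: Callable[[], dict],
    interval: float = POLL_INTERVAL_S,
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 10_000,
) -> str:
    """Poll a fetch job until it finishes and return its final status.

    `fetch_job` is a placeholder for GET /datasets/<id>/job. The dict keys
    used here (`status`, `bytes_done`, `bytes_total`) are hypothetical.
    """
    for _ in range(max_polls):
        job = fetch_job()
        status = job["status"]
        if status in ("Downloaded", "Failed"):
            return status  # terminal state: stop polling
        done = job.get("bytes_done", 0)
        total = job.get("bytes_total", 0)
        if total:
            # Equivalent of the progress bar: bytes / total
            print(f"{done} / {total} bytes ({100 * done // total}%)")
        sleep(interval)
    raise TimeoutError("job did not finish within max_polls")
```

Passing the fetcher and the sleep function as parameters keeps the loop testable without a network connection; in real use, fetch_job would wrap an authenticated HTTP request.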

You can navigate away during the download — it keeps running. Coming back to /datasets shows the live progress again.

Remove

For Downloaded rows, the Remove button drops the cached copy. Useful for freeing space; the dataset stays in the registry and can be re-downloaded any time.

Pinning a dataset in an eval

Datasets are referenced by registry:<id>@<version> strings inside eval specs:

dataset:
  name: openml-credit-g
  source: registry:openml-credit-g@1.0.0
  version: 1.0.0

The version pin ensures eval reproducibility across re-runs and across workspaces. Pinning to latest is intentionally not supported.
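
To make the reference format concrete, here is a hedged sketch of how a registry:<id>@<version> string might be parsed and validated. The function name parse_ref, the allowed id characters, and the three-part numeric version pattern are assumptions made for illustration; the real validator may accept a different grammar.

```python
import re
from typing import NamedTuple

# Assumed grammar: ids use letters, digits, '.', '_' and '-';
# versions are three dot-separated numbers such as 1.0.0.
_REF = re.compile(r"^registry:(?P<id>[A-Za-z0-9._-]+)@(?P<version>\d+\.\d+\.\d+)$")


class DatasetRef(NamedTuple):
    id: str
    version: str


def parse_ref(source: str) -> DatasetRef:
    """Split a registry:<id>@<version> string into its id and version.

    Raises ValueError for anything that is not an explicit version pin,
    which includes registry:<id>@latest.
    """
    match = _REF.match(source)
    if match is None:
        raise ValueError(f"not a pinned registry reference: {source!r}")
    return DatasetRef(match["id"], match["version"])
```

With this sketch, parse_ref("registry:openml-credit-g@1.0.0") yields the id openml-credit-g and the version 1.0.0, while a reference ending in @latest raises ValueError.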

Cache locality

The cache lives on the API server’s volume, not in your browser. Every member of your org shares the same downloaded copies: if a teammate has already pulled sachs@1.0.0, you don’t have to download it again.

Adding a new dataset

The self-serve “Register dataset” flow is on the roadmap. Until it lands, contact us with the dataset’s source URL, license, splits, and expected row counts — we’ll add it to the registry for your workspace.

Coming-soon datasets

Rows with status Coming soon are registry entries with no download source wired up yet — usually because the source is gated behind an auth flow that requires manual setup. The chip names the blocker; once unblocked, the row flips to Available.