Predict API

The Predict API lets any HTTP caller hand a trained brain its in-context data and get predictions back. It’s the same endpoint the basic-mode Try-it form calls.

Endpoint

POST /projects/<projectId>/runs/<runIdOrSlug>/predict

projectId — the UUID of the project containing the run
runIdOrSlug — either the run’s UUID or its slug (e.g. v0_1)

For share-link callers (no Authorization header needed):

POST /shares/<token>/predict

Auth

Authorization: Bearer <PFNSTUDIO_TOKEN>

Get a token from /api-tokens — see API tokens.

Share-link callers omit the Authorization header; the token in the URL is the credential.
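The two access modes can be sketched as one small helper that returns the URL and headers for a call. This is only an illustration: predict_target and its parameter names are hypothetical, and the host is the one used in the examples on this page.

```python
from typing import Dict, Optional, Tuple

BASE_URL = "https://cloud.pfnstudio.com"  # host taken from the examples on this page


def predict_target(
    project_id: Optional[str] = None,
    run: Optional[str] = None,
    api_token: Optional[str] = None,
    share_token: Optional[str] = None,
) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for a predict call (hypothetical helper)."""
    if share_token is not None:
        # Share links carry the credential in the URL, so no header is sent.
        return f"{BASE_URL}/shares/{share_token}/predict", {}
    if not (project_id and run and api_token):
        raise ValueError("project_id, run and api_token are required without a share token")
    # run may be the run's UUID or its slug, e.g. "v0_1".
    url = f"{BASE_URL}/projects/{project_id}/runs/{run}/predict"
    return url, {"Authorization": f"Bearer {api_token}"}
```

The returned pair can be passed straight to any HTTP client, for example requests.post(url, headers=headers, json=body).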

Request body

The shape varies by task type. The Try-it form picks the right shape automatically; HTTP callers must choose it themselves.

Regression

{
  "context": {
    "x": [[1.0, 2.5], [3.0, 1.8], [4.5, 0.9]],
    "y": [5.1, 4.3, 2.0]
  },
  "query": {
    "x": [[2.2, 1.4], [3.8, 0.7]]
  }
}

context.x is a 2-D array (rows × features). context.y is a 1-D array (one scalar per context row). query.x is also a 2-D array; each of its rows must have the same width as the rows of context.x.
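Checking these shape rules locally before sending can save a round trip. The sketch below is a hypothetical client-side helper, not part of any SDK. It mirrors the rules stated above and additionally assumes that both context.x and query.x need at least one row.

```python
from typing import Any, Dict, Sequence


def build_regression_payload(
    context_x: Sequence[Sequence[float]],
    context_y: Sequence[float],
    query_x: Sequence[Sequence[float]],
) -> Dict[str, Any]:
    """Validate shapes locally and return the regression request body."""
    # Assumption: empty context or query is not a useful request.
    if not context_x or not query_x:
        raise ValueError("context.x and query.x must each contain at least one row")
    # One target per context row.
    if len(context_x) != len(context_y):
        raise ValueError("context.x and context.y have different lengths")
    # Every row, in context and query alike, must share one width.
    width = len(context_x[0])
    for name, rows in (("context.x", context_x), ("query.x", query_x)):
        for i, row in enumerate(rows):
            if len(row) != width:
                raise ValueError(f"{name} row {i} has {len(row)} features, expected {width}")
    return {
        "context": {"x": [list(r) for r in context_x], "y": list(context_y)},
        "query": {"x": [list(r) for r in query_x]},
    }
```

This only checks internal consistency; whether the row width matches what the model expects is still decided by the server.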

Classification

{
  "context": {
    "x": [[0.1, 0.4, ...], [0.9, 0.2, ...]],
    "labels": [0, 1]
  },
  "query": {
    "x": [[0.2, 0.5, ...]]
  }
}

labels is a 1-D array of category indices, one per context row.
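If your categories are strings, they need to be turned into indices first. The sketch below uses hypothetical helpers (encode_labels, build_classification_payload) and assigns indices in first-seen order. That ordering is an assumption: this page does not say which index order a given brain expects, so confirm it against your own brain before relying on it.

```python
from typing import Any, Dict, Hashable, List, Sequence, Tuple


def encode_labels(raw: Sequence[Hashable]) -> Tuple[List[int], List[Hashable]]:
    """Map raw category values to integer indices in first-seen order."""
    classes: List[Hashable] = []
    index: Dict[Hashable, int] = {}
    encoded: List[int] = []
    for value in raw:
        if value not in index:
            index[value] = len(classes)
            classes.append(value)
        encoded.append(index[value])
    return encoded, classes


def build_classification_payload(
    context_x: Sequence[Sequence[float]],
    raw_labels: Sequence[Hashable],
    query_x: Sequence[Sequence[float]],
) -> Tuple[Dict[str, Any], List[Hashable]]:
    """Return (request body, classes) so predictions can be decoded later."""
    # Assumption: one label per context row, as with context.y in regression.
    if len(context_x) != len(raw_labels):
        raise ValueError("context.x and labels have different lengths")
    labels, classes = encode_labels(raw_labels)
    body = {
        "context": {"x": [list(r) for r in context_x], "labels": labels},
        "query": {"x": [list(r) for r in query_x]},
    }
    return body, classes
```

Keeping the returned classes list lets you map a predicted index back to its original category with classes[index].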

Time-series (forecast)

For prior types like AR(2) or TabPFN-TS, the basic-mode form sends a single column of past values; the studio’s predict packer then transforms it into the feature shape the model expects (for TabPFN-TS-style brains, lags plus calendar features).

If you’re calling the API directly for a time-series brain, send the already-engineered features:

{
  "context": {
    "x": [[lag_1, lag_2, ..., sin_phase, cos_phase, step_pct]],
    "y": [target_value]
  },
  "query": {
    "x": [[...]]
  }
}

Inspect the brain’s model spec via GET /projects/<id>/models/<modelId> to see the exact d_in (the number of features the model expects per row).
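To make the idea of "already-engineered features" concrete, here is a rough sketch that turns a plain series into rows of lags plus phase and progress features. Everything about the layout is an assumption made for illustration: the function name lag_features, the column order, the period, and the definitions of sin_phase, cos_phase and step_pct. The studio's own packer may compute these differently, so treat the brain's model spec as the authority.

```python
import math
from typing import List, Sequence, Tuple


def lag_features(
    series: Sequence[float], n_lags: int = 2, period: int = 12
) -> Tuple[List[List[float]], List[float]]:
    """Return (x, y) with one row per step t >= n_lags.

    Assumed row layout: [lag_1, ..., lag_n, sin_phase, cos_phase, step_pct].
    """
    if n_lags < 1 or period < 1:
        raise ValueError("n_lags and period must be positive")
    if len(series) <= n_lags:
        raise ValueError("series is too short for the requested number of lags")
    last = len(series) - 1
    x: List[List[float]] = []
    y: List[float] = []
    for t in range(n_lags, len(series)):
        # lag_k is the value k steps before t.
        lags = [series[t - k] for k in range(1, n_lags + 1)]
        # Position within an assumed seasonal cycle, encoded on the unit circle.
        angle = 2.0 * math.pi * (t % period) / period
        # step_pct: assumed to be the fraction of the way through the series.
        x.append(lags + [math.sin(angle), math.cos(angle), t / last])
        y.append(series[t])
    return x, y
```

Before sending, compare len(x[0]) with the brain's d_in; with these defaults each row has n_lags + 3 columns.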

Discovery (causal)

Not supported via the Predict API. Use the advanced view at /projects/<id>/runs/<runId> directly.

Response

{
  "predictions": [5.1, 4.3, 2.0, ...]
}

predictions is a 1-D array, one entry per query row. Type depends on the task:

Regression → number
Classification → number (the predicted class index, e.g. 0 or 1 for a binary task)

Confidence bands — not yet exposed

Brains trained with the Show confidence capability internally produce a full posterior (the defining feature of PFNs), but the predict endpoint currently returns only the point estimate in predictions. Surfacing the posterior bounds (lower / upper fields, or full quantile arrays) is on the roadmap. When it lands, the response shape will extend additively, so callers that read only predictions won’t break.

If you need the posterior today, use the advanced run-detail Try-it (/projects/<id>/runs/<runId>), which can render the model’s raw output, or call POST /priors/<id>/sample to inspect the prior’s full output schema.
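A client stays compatible with an additive response change by reading only the fields it knows. The sketch below shows one way to do that; read_predictions is a hypothetical helper, and the lower field in the usage note is just one of the possible future fields mentioned above, not something the endpoint returns today.

```python
from typing import Any, List, Mapping


def read_predictions(body: Mapping[str, Any], n_queries: int) -> List[Any]:
    """Extract predictions from a decoded response, ignoring unknown fields."""
    if "error" in body:
        # Error responses carry a single "error" message.
        raise RuntimeError(body["error"])
    preds = body.get("predictions")
    if not isinstance(preds, list):
        raise ValueError("response has no 'predictions' array")
    # One entry per query row.
    if len(preds) != n_queries:
        raise ValueError(f"expected {n_queries} predictions, got {len(preds)}")
    return preds
```

Called as read_predictions({"predictions": [5.1, 4.3], "lower": [4.8, 4.0]}, 2), it returns the two point estimates and never touches the extra key.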

Errors

{
  "error": "Model expects 30 features per row but got 28."
}

Common errors:

Error	Cause	Fix
`Model expects N features per row but got M`	`context.x` row width doesn’t match the model’s `d_in`	Add/remove columns to match N
`context.x and context.y have different lengths`	Mismatched row counts	Make sure `len(x) == len(y)`
`Run is not in completed state`	The brain hasn’t finished training	Wait for status → `completed`
`Unauthorized`	Invalid or missing token	Generate a fresh one at `/api-tokens`

The basic-mode Try-it form parses the Model expects N features error and uses N to re-template the form with the right number of columns.
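An HTTP caller can apply the same trick. The sketch below is not the form's actual code; parse_feature_mismatch is a hypothetical helper, and its pattern assumes the message wording stays exactly as in the example on this page.

```python
import re
from typing import Optional, Tuple

# Wording copied from the error example above; it may change server-side.
_FEATURE_ERROR = re.compile(r"Model expects (\d+) features per row but got (\d+)")


def parse_feature_mismatch(message: str) -> Optional[Tuple[int, int]]:
    """Return (expected, got) for a feature-width error, else None."""
    match = _FEATURE_ERROR.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

A None result means the error is of another kind (for example Unauthorized) and should be handled separately.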

SDK / CLI

A Python SDK is in development; for now, use any HTTP client:

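import requests

# Replace <id> and <token> with your project UUID and API token.
resp = requests.post(
    "https://cloud.pfnstudio.com/projects/<id>/runs/v0_1/predict",
    headers={"Authorization": "Bearer <token>"},
    json={
        # Two context rows with two features each, plus one target per row.
        "context": {"x": [[1.0, 2.5], [3.0, 1.8]], "y": [5.1, 4.3]},
        # One query row, same width as the context rows.
        "query": {"x": [[2.2, 1.4]]},
    },
    timeout=30,  # avoid hanging indefinitely on a stalled connection
)
resp.raise_for_status()  # raise on HTTP error statuses instead of failing later
print(resp.json()["predictions"])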

curl -X POST https://cloud.pfnstudio.com/projects/<id>/runs/v0_1/predict \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"context": {"x": [[1.0, 2.5]], "y": [5.1]}, "query": {"x": [[2.2, 1.4]]}}'

Latency expectations

Single-row predict (small context, single query): ~50-200 ms
100-row context, 10 queries: ~100-300 ms
Larger payloads scale roughly with context_size + query_size

Predictions run on the brain’s hosted worker pool, not on Vast.ai. Marathon-trained brains predict at the same speed as Standard-trained brains of the same architecture.