Predict API

The Predict API lets any HTTP caller hand a trained brain its in-context data and get predictions back. It’s the same endpoint the basic-mode Try-it form calls.

Endpoint

POST /projects/<projectId>/runs/<runIdOrSlug>/predict
  • projectId — the UUID of the project containing the run
  • runIdOrSlug — either the run’s UUID or its slug (e.g. v0_1)

For share-link callers (no auth needed):

POST /shares/<token>/predict

Auth

Authorization: Bearer <PFNSTUDIO_TOKEN>

Get a token from /api-tokens — see API tokens.

Share-link callers omit the Authorization header; the token in the URL is the credential.
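The two credential modes above can be sketched as one small helper. This is a minimal illustration, not an official client: the function name `predict_target` is hypothetical, and `BASE_URL` assumes that both endpoints live on the host used in the HTTP examples on this page.

```python
from typing import Optional

# Assumed host, taken from the HTTP examples on this page.
BASE_URL = "https://cloud.pfnstudio.com"


def predict_target(project_id: Optional[str] = None,
                   run: Optional[str] = None,
                   api_token: Optional[str] = None,
                   share_token: Optional[str] = None):
    """Return (url, headers) for an authenticated call or a share-link call."""
    if share_token is not None:
        # Share-link callers: the token in the URL is the credential,
        # so no Authorization header is sent.
        return f"{BASE_URL}/shares/{share_token}/predict", {}
    if not (project_id and run and api_token):
        raise ValueError("need project_id, run and api_token, or a share_token")
    url = f"{BASE_URL}/projects/{project_id}/runs/{run}/predict"
    return url, {"Authorization": f"Bearer {api_token}"}
```

The returned pair can be passed straight to any HTTP client, for example `requests.post(url, headers=headers, json=body)`.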

Request body

The shape varies by task type. The Try-it form picks the right shape automatically; HTTP callers must choose it themselves.

Regression

{
  "context": {
    "x": [[1.0, 2.5], [3.0, 1.8], [4.5, 0.9]],
    "y": [5.1, 4.3, 2.0]
  },
  "query": {
    "x": [[2.2, 1.4], [3.8, 0.7]]
  }
}

context.x is a 2-D array (rows × features). context.y is a 1-D array (one scalar per context row). query.x is a 2-D array whose rows have the same width as context.x.
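The shape rules for a regression body can be checked on the client before sending. The following is a minimal sketch; the helper name `regression_payload` is hypothetical, and the checks only mirror the constraints described here, not the server's actual validation.

```python
def regression_payload(context_x, context_y, query_x):
    """Build a regression request body after checking the documented shapes."""
    # One target value per context row.
    if len(context_x) != len(context_y):
        raise ValueError("context.x and context.y must have the same number of rows")
    # Every context and query row must have the same number of features.
    widths = {len(row) for row in context_x} | {len(row) for row in query_x}
    if len(widths) > 1:
        raise ValueError(f"all rows must have the same width, got {sorted(widths)}")
    return {
        "context": {"x": context_x, "y": context_y},
        "query": {"x": query_x},
    }
```

Catching a width or length mismatch locally avoids a round trip that would end in one of the errors listed under Errors.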

Classification

{
  "context": {
    "x": [[0.1, 0.4, ...], [0.9, 0.2, ...]],
    "labels": [0, 1]
  },
  "query": {
    "x": [[0.2, 0.5, ...]]
  }
}

labels is a 1-D array of category indices, one per context row.
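If your data carries class names rather than indices, they need to be encoded before sending. This sketch assumes the brain accepts any consistent 0-based indexing and that there is one label per context row; neither is confirmed on this page, and the helper names are hypothetical.

```python
def encode_labels(raw_labels):
    """Map class names to 0-based indices (sorted order, an assumption)."""
    classes = sorted(set(raw_labels))
    index_of = {name: i for i, name in enumerate(classes)}
    return [index_of[name] for name in raw_labels], classes


def classification_payload(context_x, raw_labels, query_x):
    """Build a classification body and return it with the index-to-name list."""
    if len(context_x) != len(raw_labels):
        raise ValueError("need exactly one label per context row")
    labels, classes = encode_labels(raw_labels)
    body = {
        "context": {"x": context_x, "labels": labels},
        "query": {"x": query_x},
    }
    return body, classes
```

Keep the returned `classes` list: a predicted index `i` maps back to `classes[i]`.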

Time-series (forecast)

For prior types like AR(2) or TabPFN-TS, the basic-mode form sends a single column of past values; the studio’s predict packer transforms it into the model’s expected feature shape (lags + calendar features for TabPFN-TS style).

If you’re calling the API directly for a time-series brain, send the already-engineered features:

{
  "context": {
    "x": [[lag_1, lag_2, ..., sin_phase, cos_phase, step_pct]],
    "y": [target_value]
  },
  "query": {
    "x": [[...]]
  }
}

Inspect the brain’s model spec via GET /projects/<id>/models/<modelId> to see the exact d_in (the number of features the model expects per row).
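To make the idea of already-engineered features concrete, here is a rough sketch that turns a single column of past values into lag-plus-calendar rows. Everything specific in it is an assumption: the function name, the seasonal `period`, and the definitions of `sin_phase`, `cos_phase` and `step_pct` are illustrative guesses at the placeholder names above, not the studio packer's actual logic. Always confirm the column count against the brain's d_in.

```python
import math


def lag_feature_rows(series, n_lags=2, period=12):
    """Turn a 1-D series into (x, y) rows of lags plus three calendar-style features.

    Assumed column layout: [lag_1..lag_n, sin_phase, cos_phase, step_pct].
    """
    x, y = [], []
    last = max(len(series) - 1, 1)  # avoid dividing by zero on tiny inputs
    for t in range(n_lags, len(series)):
        # lag_1 is the most recent past value, lag_2 the one before it, and so on.
        lags = [series[t - k] for k in range(1, n_lags + 1)]
        phase = 2 * math.pi * (t % period) / period  # position within the cycle
        step_pct = t / last  # position within the series, from 0 to 1
        x.append(lags + [math.sin(phase), math.cos(phase), step_pct])
        y.append(series[t])
    return x, y
```

With `n_lags=2` each row has five columns, so this layout only fits a brain whose d_in is 5; adjust the lag count or features to match your model.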

Discovery (causal)

Not supported via the Predict API. Use the advanced view at /projects/<id>/runs/<runId> directly.

Response

{
  "predictions": [5.1, 4.3, 2.0, ...]
}

predictions is a 1-D array, one entry per query row. Type depends on the task:

  • Regression → number
  • Classification → number (the predicted class index, e.g. 0 or 1 for a binary task)
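Because predictions are returned one per query row and in order, a caller can pair them back up with the inputs. A minimal sketch, with a hypothetical helper name, that reads only the `predictions` key and ignores anything else in the body:

```python
def pair_predictions(query_x, response_body):
    """Zip each query row with its prediction, checking the one-per-row contract."""
    predictions = response_body["predictions"]  # other keys, if any, are ignored
    if len(predictions) != len(query_x):
        raise ValueError(
            f"expected {len(query_x)} predictions, got {len(predictions)}"
        )
    return list(zip(query_x, predictions))
```

For example, `pair_predictions(body["query"]["x"], resp.json())` yields `(row, prediction)` tuples.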

Confidence bands — not yet exposed

Brains trained with the Show confidence capability internally produce a full posterior (the defining feature of PFNs), but the predict endpoint currently returns only the point estimate in predictions. Surfacing the posterior bounds (lower / upper fields, or full quantile arrays) is on the roadmap — when it lands, the response shape will extend additively, so callers reading only predictions won’t break.

If you need the posterior today, use the advanced run-detail Try-it (/projects/<id>/runs/<runId>), which can render the model’s raw output, or call POST /priors/<id>/sample to inspect the prior’s full output schema.

Errors

{
  "error": "Model expects 30 features per row but got 28."
}

Common errors:

Error | Cause | Fix
--- | --- | ---
Model expects N features per row but got M | context.x row width doesn’t match the model’s d_in | Add/remove columns to match N
context.x and context.y have different lengths | Mismatched row counts | Make sure len(x) == len(y)
Run is not in completed state | The brain hasn’t finished training | Wait for status → completed
Unauthorized | Invalid or missing token | Generate a fresh one at /api-tokens

The basic-mode Try-it form parses Model expects N features and uses N to re-template the form with the right number of columns.
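An HTTP caller can use the same trick as the form. The sketch below extracts N and M from the feature-count message documented above; the helper name is hypothetical, and the pattern assumes the message wording stays exactly as shown.

```python
import re

# Matches the feature-count error message documented above.
_FEATURE_ERROR = re.compile(r"Model expects (\d+) features per row but got (\d+)")


def expected_feature_count(error_body):
    """Return (expected, got) for a feature-width mismatch error, else None."""
    match = _FEATURE_ERROR.search(error_body.get("error", ""))
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

A caller could use the first value to pad or trim its rows before retrying, while treating a `None` result as an unrelated error.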

SDK / CLI

A Python SDK is in development; for now, use any HTTP client:

import requests

# Replace <id> and <token> with your project UUID and API token.
resp = requests.post(
    "https://cloud.pfnstudio.com/projects/<id>/runs/v0_1/predict",
    headers={"Authorization": "Bearer <token>"},
    json={
        "context": {"x": [[1.0, 2.5], [3.0, 1.8]], "y": [5.1, 4.3]},
        "query": {"x": [[2.2, 1.4]]},
    },
)
resp.raise_for_status()  # raises on 4xx/5xx responses
print(resp.json()["predictions"])

Or with curl:

curl -X POST https://cloud.pfnstudio.com/projects/<id>/runs/v0_1/predict \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"context": {"x": [[1.0, 2.5]], "y": [5.1]}, "query": {"x": [[2.2, 1.4]]}}'

Latency expectations

  • Single-row predict (small context, single query): ~50-200 ms
  • 100-row context, 10 queries: ~100-300 ms
  • Larger payloads scale roughly with context_size + query_size

Predictions run on the brain’s hosted worker pool, not on Vast.ai. Marathon-trained brains predict at the same speed as Standard-trained brains of the same architecture.