Setup

Get ready to run

One path: account, runner, first benchmark.

Overview

Find the setup to run next, then inspect the evidence behind it.

Start with Recommend for an answer-first setup choice. Explore public evidence or queue a benchmark when you already know what you want to inspect.

How evidence works

Evidence and setup status

Checking evidence and runner readiness.

Active runs

Verified results

Usable in decisions.

Open blockers

Account attached

Pair a runner

Local execution ready

Choose evidence

Recommendation ready

Run or compare

Next action

Recommend

Which setup should I run?

Recent runs

Tracked execution

More tools

Exports and community

Open exports and contributor activity

Download evidence snapshots or inspect community activity.

Top contributors

Community evidence stays cumulative and exportable.

Recommendations

Find the setup to run

More evidence and candidates

Candidate table, caveats, source notes, and the next benchmark.

Evidence details ready

Open for the complete candidate and proof trail.

Question and filters

Known-good questions first, with light scope edits.

Hardware, Task, Memory

Advanced filters

Use case, Hardware class, Runtime, Trust

Execution, Hardware model, Run memory cap, X-axis, Y-axis, Target TTFT, Max cost, Family, Parameters, Quant, Capability, Sort

Explore

Inspect families, setup matches, and evidence

Historical Results

Recent benchmark evidence

Model	Backend	Use Case	TTFT	Tok/s	Hardware	Capability	Verification
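The TTFT and Tok/s columns above are timing measurements. As a rough illustration of how such figures are commonly derived from per-request timestamps, here is a minimal sketch. The `RunTiming` type, its field names, and the choice to measure throughput over the decode window only are assumptions made for illustration; they are not taken from InferGrade, which may define these metrics differently.

```python
from dataclasses import dataclass


@dataclass
class RunTiming:
    """Hypothetical per-request timestamps, in seconds on one monotonic clock."""
    request_sent_s: float
    first_token_s: float
    last_token_s: float
    output_tokens: int


def ttft_seconds(timing: RunTiming) -> float:
    """Time to first token: delay between sending the request and the first output token."""
    return timing.first_token_s - timing.request_sent_s


def decode_tokens_per_second(timing: RunTiming) -> float:
    """Throughput over the decode window (first token to last token).

    The first token is excluded from the count because its latency is
    already captured by TTFT. This is one common convention, assumed here.
    """
    decode_window = timing.last_token_s - timing.first_token_s
    if timing.output_tokens < 2 or decode_window <= 0:
        return 0.0
    return (timing.output_tokens - 1) / decode_window


timing = RunTiming(request_sent_s=0.0, first_token_s=0.5, last_token_s=2.5, output_tokens=41)
print(ttft_seconds(timing), decode_tokens_per_second(timing))
```

Separating the two numbers keeps prompt-processing latency from being averaged into generation speed, which is why the two columns are usually reported side by side.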

Compare

Choose between families, variants, and quants

Preset views

Start from a useful model-choice stance, then refine the exact variants or inspect individual runs.

Individual run comparison

Result

Family Explorer

Branches, quants, and nearby matches

Build

Build the next evidence run

Why run this benchmark

Run the benchmark that would change the answer.

Start from Recommend when possible; otherwise choose a model and evidence lane below.

1. Model
2. Benchmarks
3. Queue

Model

Choose the model first. The goal filter only narrows suggested starters and benchmark hints.

Model Name, Backend, Hugging Face reference, Artifact choice

Use public artifacts without connecting Hugging Face.

Benchmark scope

Choose the evidence this run should produce.

Benchmark groups

Adjust related checks together.

Individual checks

Exact checks for this run.

Run details

Optional context for history.

Description, Created by, Tags

Advanced overrides

Artifact and runtime

Only adjust these if you need an override.

Artifact URI, Artifact filename, Artifact SHA256, Backend image, Artifact cache dir, Hourly rate (USD), Notes

Ontology hints

Most users should keep the inferred values.

Family name, Checkpoint name, Training stage, Quantization family, Quantization scheme, Weight bits

Execute this run

Start a tracked local or cloud run directly from the Hub.

Run locally

Pair a machine once, keep it listening, and queue tracked runs.

No local run has been created for this plan yet.

Start a runner

InferGrade will highlight the one next action that matters for this plan.

Runner recovery commands

Start listener

Use this if the paired app or listener is not running.

Run immediately

Run this once on the current machine.

Run in cloud

Create a managed cloud run when this Hub has a provider configured.

Make this run count as evidence

These steps help the result join the comparable evidence pool instead of staying sample-only.

Use a real run, not a dry run, so timing measurements are recorded.
Keep the artifact pinned so others can reproduce the same bundle.
Let the run finish and upload so its evidence label is applied automatically.
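Pinning an artifact usually means recording a content hash, such as the value that would go in an Artifact SHA256 field, so that anyone reproducing the run can confirm they have the same file. The sketch below shows one way to compute and check such a digest with Python's standard library. The function names are hypothetical, and how InferGrade itself verifies artifacts is not described here.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the lowercase hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Reading in fixed-size chunks keeps memory use flat for large model files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path: Path, pinned_hex: str) -> bool:
    """Compare a local artifact against a pinned digest, ignoring case and whitespace."""
    return sha256_of_file(path) == pinned_hex.strip().lower()
```

For example, `matches_pinned_digest(Path("model.gguf"), pinned)` returns `True` only when the local file is byte-for-byte identical to the one whose digest was pinned; the filename here is a placeholder.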

Advanced recovery commands

Preflight only

Check local readiness before starting a run.

Execute only

Run directly if Hub queueing is unavailable.

Upload only

Publish a completed result if automatic upload did not run.

Run plan JSON

Inspect or export the prepared plan.

No run plan prepared yet.

Active and recent runs

Recent runs

Live timeline

Reusable runs

Contributor activity