Skip to content

Installing ScarfBench

ScarfBench uses the scarf command line tool to pull benchmark assets, run evaluations, and work with submissions.

Choose one installation method.

Terminal window
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/scarfbench/scarf/releases/latest/download/scarfbench-cli-installer.sh | sh
Terminal window
scarf --help

You should see CLI help output. For most users, the core commands needed post install are:

GroupCommandPurposeLink
Benchmark setupscarf bench pullDownload benchmark files for local evaluationCommand help
Agent evaluationscarf evalEvaluate agents on benchmarkGo to section
scarf eval runRun agent evaluation jobsCommand help

Use scarf bench pull to download the benchmark to a local directory.

Terminal window
scarf bench pull --dest /path/to/scarfbench-data

This creates or updates a local benchmark checkout in /path/to/scarfbench-data.

Terminal window
scarf bench pull --dest ./scarfbench-data --version <VERSION>

Replace <VERSION> with the release or version identifier you want to evaluate against.

  • --dest <DIR>: required destination directory where benchmark files will be pulled
  • --version <VERSION>: optional version selector; if omitted, the latest available version is pulled

After pulling, use scarf eval ... commands to run your agent evaluations. Continue with the Quickstart Guide to run your first evaluation.