Submit Conversions
This page walks you through submitting evaluation artifacts to Scarfbench by opening a pull request against scarfbench/scarfbench-submit. A CI workflow validates your submission and records the result on the leaderboard.
Prerequisites
Before you submit, you should have generated evaluation artifacts locally with the scarf CLI by following the Quickstart. Each submission is made up of one or more conversion directories. A conversion directory corresponds to a single benchmark 3-tuple (layer, source framework, target framework) and holds one or more run_N subdirectories, each with a metadata.json file.
Expected folder structure
The validation workflow treats any directory that contains run_*/metadata.json files as a conversion root. The key requirement is simple: each agent gets its own folder at the top level of the fork, and conversion directories live directly inside it.
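To check this rule before pushing, a minimal sketch along the following lines lists conversion roots and flags any that do not sit exactly one level below an agent folder. The function names find_conversion_roots and check_layout are hypothetical and not part of the scarf CLI; the script only mirrors the rule as described on this page, not the workflow's actual implementation.

```shell
# Hypothetical helper: print every directory (relative to the fork root)
# that directly contains at least one run_*/metadata.json file.
find_conversion_roots() {
  ( cd "${1:-.}" &&
    find . -type f -name metadata.json |
      sed -n 's|/run_[^/]*/metadata\.json$||p' |  # keep only run_*/metadata.json hits
      sed 's|^\./||' |
      sort -u )
}

# Hypothetical check: every conversion root must look like <agent>/<conversion>.
# Returns non-zero if any root is misplaced.
check_layout() {
  layout_status=0
  layout_roots=$(find_conversion_roots "${1:-.}")
  while IFS= read -r dir; do
    [ -n "$dir" ] || continue
    case "$dir" in
      .)     echo "run directory outside a conversion directory"; layout_status=1 ;;
      */*/*) echo "nested too deep: $dir"; layout_status=1 ;;
      */*)   echo "ok: $dir" ;;
      *)     echo "missing agent folder: $dir"; layout_status=1 ;;
    esac
  done <<EOF
$layout_roots
EOF
  return "$layout_status"
}
```

Running check_layout . from the root of your fork prints one line per conversion root; a non-zero exit status means at least one root is misplaced.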
Do — group runs under a folder named after your agent, inside your fork:
    your-fork/
    └── your-agent/
        ├── your-agent__business_domain__cart__spring__quarkus/
        │   ├── run_1/{input, output, validation, metadata.json}
        │   └── run_2/{input, output, validation, metadata.json}
        └── your-agent__data__orders__spring__quarkus/
            └── run_1/{input, output, validation, metadata.json}

Don’t put conversion directories at the fork root (no agent folder):
    your-fork/
    └── your-agent__business_domain__cart__spring__quarkus/...

Don’t put a single run_N directory at the top. Runs must sit inside a conversion directory:
    your-fork/
    └── run_1/...

Don’t bury the agent folder under arbitrary nesting:
    your-fork/some_folder/.../your-agent/your-agent__layer__.../...

1. Fork the repository
Create a personal fork of scarfbench/scarfbench-submit. Keep the fork named <your-username>/scarfbench-submit; don’t rename it. Your fork inherits the default main branch, which holds the workflow definitions and documentation.
2. Add conversion artifacts
Clone your fork locally and copy the conversion directories into it, following the layout above. You can commit directly to main or use a feature branch. Either works, as long as each conversion root contains the expected run_*/metadata.json and run_*/validation/run.log files.
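Before committing, a quick local check can confirm that each run carries both required files. The sketch below is an illustration only: check_runs is a hypothetical helper, not a scarf command, and it tests only for the presence of the two files named above, not for their contents.

```shell
# Hypothetical pre-commit check: pass the path to one conversion directory.
# Reports every run_N that lacks metadata.json or validation/run.log.
check_runs() {
  runs_status=0
  for run in "$1"/run_*/; do
    # If the glob matched nothing, "$run" is the literal pattern and not a directory.
    [ -d "$run" ] || { echo "no run_N directories in $1"; return 1; }
    for f in metadata.json validation/run.log; do
      if [ ! -f "$run$f" ]; then
        echo "missing: $run$f"
        runs_status=1
      fi
    done
  done
  return "$runs_status"
}
```

For example, check_runs your-agent/your-agent__business_domain__cart__spring__quarkus exits with status 0 only when every run_N directory in that conversion directory has both files.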
    git clone https://github.com/<your-username>/scarfbench-submit.git
    cd scarfbench-submit
    # Copy conversion artifacts into the working tree, then:
    git add .
    git commit -m "Add submission: <brief descriptor>"
    git push origin main

3. Open a pull request against the submission branch
Your pull request must target the upstream submission branch, not main. GitHub’s PR interface defaults the base branch to main, so you need to switch the base branch selector to submission manually before opening the PR. If you miss this step, the validation workflow won’t run.
You can use a pre-filled comparison URL like this to land on the PR page with the right base branch already selected:
    https://github.com/scarfbench/scarfbench-submit/compare/submission...<your-username>:scarfbench-submit:main?expand=1

In the PR description, please mention the agent, model, and any variant or configuration details that aren’t already captured in the submission’s metadata.
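The comparison URL can also be assembled in the shell, which helps avoid typos in the base and head parts. In this sketch compare_url is a hypothetical helper; the optional second argument for a feature branch is an assumption based on the same user:repo:branch pattern as the URL above.

```shell
# Hypothetical helper: print the pre-filled compare URL for a fork.
#   $1 = GitHub username that owns the fork
#   $2 = head branch in the fork (defaults to main)
compare_url() {
  printf 'https://github.com/scarfbench/scarfbench-submit/compare/submission...%s:scarfbench-submit:%s?expand=1\n' \
    "$1" "${2:-main}"
}
```

Calling compare_url <your-username> prints the URL with submission already selected as the base branch.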
4. Await validation
The CI workflow starts as soon as you open the PR. First-time contributors will hit GitHub’s standard approval gate, so the first run waits for a maintainer’s approval; later submissions from the same contributor run without manual approval. Here’s what the workflow does:
- Finds all conversion roots in your submission.
- Splits them across parallel validation shards, each running scarf validate on its subset.
- Combines the per-shard results into one or more leaderboard JSON files matching the Scarfbench leaderboard schema.
- Posts a confirmation comment on the PR and closes it once everything succeeds.
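To make the sharding step concrete, here is one way a list of conversion roots could be divided across shards. This is purely illustrative: the page does not say how the workflow actually assigns roots to shards, and assign_shards is a hypothetical name. The sketch uses simple round-robin assignment.

```shell
# Illustrative only: read conversion roots (one per line) on stdin and
# prefix each with a shard index, assigned round-robin across $1 shards.
# $1 must be a positive integer; it defaults to 4 here as an arbitrary choice.
assign_shards() {
  awk -v n="${1:-4}" '{ printf "%d\t%s\n", (NR - 1) % n, $0 }'
}
```

Each output line holds a shard index, a tab, and the conversion root, so every shard can pick out the lines that carry its own index.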
Validation outputs — per-run logs, updated metadata, and the generated leaderboard JSON — are kept as workflow artifacts attached to the PR’s check run. You can download them from the GitHub Actions interface to take a closer look.
If validation fails, the PR stays open and you’ll get a diagnostic comment. Just push corrective commits to your source branch and each push will re-trigger the workflow.
What happens to submissions
PRs are closed rather than merged. Your submission’s content stays preserved in the closed PR’s diff, which remains accessible indefinitely on GitHub. The submission branch itself isn’t modified; it only acts as a routing target for the workflow’s branch filter.
The leaderboard JSON produced by a successful validation is the canonical record of your submission’s outcome. Publishing those results to scarfbench.info is handled separately by the maintainers.
Contact
If you have questions about the submission process or the benchmark methodology, feel free to open an issue in the submission repository or visit the project homepage at scarfbench.info.