Benchmark

ScarfBench evaluates agentic transformation of Java applications across Jakarta EE, Quarkus, and Spring.

The suite combines focused examples and whole applications to measure migration quality, framework idiomaticity, and behavioral parity. Every conversion is manually implemented and developer-verified, with tests to confirm post-migration behavior.

At a glance

Apps: 102

Layers: 6

Frameworks: 3

Tests: 1,331

Dive Deeper

Focused Examples: Narrow, layer-level apps that isolate one enterprise concern at a time (CDI, persistence, infrastructure, presentation, or business domain logic).

Whole Applications: End-to-end applications where multiple layers interact, useful for evaluating broad migration correctness and architectural fidelity.