
SWE-bench CLI

A command-line interface for interacting with the SWE-bench API. Use this tool to submit predictions, manage runs, and retrieve evaluation reports.

Read the full documentation here.

Installation

pip install sb-cli

Authentication

Before using the CLI, you'll need to get an API key:

  1. Generate an API key:
sb-cli gen-api-key [email protected]
  2. Set your API key as an environment variable - and store it somewhere safe!
export SWEBENCH_API_KEY=your_api_key
# or add export SWEBENCH_API_KEY=your_api_key to your .*rc file
  3. You'll receive an email with a verification code. Verify your API key:
sb-cli verify-api-key YOUR_VERIFICATION_CODE
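Scripts that wrap the CLI typically read the key from the environment variable exported in step 2. A minimal sketch of that check (the helper name is hypothetical, not part of sb-cli):

```python
import os

def require_api_key() -> str:
    """Return the SWE-bench API key from the environment, or fail loudly.

    SWEBENCH_API_KEY is the variable exported in step 2 above; this
    helper is a hypothetical convenience, not part of sb-cli itself.
    """
    key = os.environ.get("SWEBENCH_API_KEY")
    if not key:
        raise SystemExit(
            "SWEBENCH_API_KEY is not set - run `sb-cli gen-api-key` "
            "and export the key before calling the CLI."
        )
    return key
```

Failing early like this gives a clearer error than letting the CLI reject an unauthenticated request later.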

Subsets and Splits

SWE-bench has different subsets and splits available:

Subsets

  • swe-bench-m: The main dataset
  • swe-bench_lite: A smaller subset for testing and development

Splits

  • dev: Development/validation split
  • test: Test split (currently only available for swe-bench_lite)

You'll need to specify both a subset and split for most commands.

Usage

Submit Predictions

Submit your model's predictions to SWE-bench:

sb-cli submit swe-bench-m test \
    --predictions_path predictions.json \
    --run_id my_run_id

Options:

  • --run_id: ID of the run to submit predictions for (optional, defaults to the name of the parent directory of the predictions file)
  • --instance_ids: Comma-separated list of specific instance IDs to submit (optional)
  • --output_dir: Directory to save report files (default: sb-cli-reports)
  • --overwrite: Overwrite existing report (default: 0)
  • --gen_report: Generate a report after evaluation is complete (default: 1)
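When driving submissions from a larger pipeline, the command above can be assembled programmatically. A sketch, assuming sb-cli is installed and on PATH; the actual subprocess call is left commented out so the example stays offline:

```python
import subprocess  # used once the call at the bottom is uncommented

def build_submit_command(subset, split, predictions_path, run_id=None):
    """Assemble an `sb-cli submit` invocation from the options above."""
    cmd = ["sb-cli", "submit", subset, split,
           "--predictions_path", predictions_path]
    if run_id is not None:  # optional; defaults to the predictions file's parent directory name
        cmd += ["--run_id", run_id]
    return cmd

cmd = build_submit_command("swe-bench-m", "test",
                           "predictions.json", run_id="my_run_id")
# subprocess.run(cmd, check=True)  # uncomment to actually submit
```

Building the argument list explicitly (rather than string interpolation into a shell) avoids quoting bugs when paths contain spaces.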

Get Report

Retrieve evaluation results for a specific run:

sb-cli get-report swe-bench-m dev my_run_id -o ./reports

List Runs

View all your existing run IDs for a specific subset and split:

sb-cli list-runs swe-bench-m dev

Predictions File Format

Your predictions file should be a JSON file in one of two formats. The first is a mapping keyed by instance ID:

{
    "instance_id_1": {
        "model_patch": "...",
        "model_name_or_path": "..."
    },
    "instance_id_2": {
        "model_patch": "...",
        "model_name_or_path": "..."
    }
}

Or as a list:

[
    {
        "instance_id": "instance_id_1",
        "model_patch": "...",
        "model_name_or_path": "..."
    },
    {
        "instance_id": "instance_id_2",
        "model_patch": "...",
        "model_name_or_path": "..."
    }
]
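Either shape can be produced with a few lines of Python. The snippet below writes the mapping form and shows that the list form carries the same data (the file name and model name are placeholders):

```python
import json

# Mapping form: keyed by instance ID, as in the first example above.
predictions = {
    "instance_id_1": {"model_patch": "...", "model_name_or_path": "my-model"},
    "instance_id_2": {"model_patch": "...", "model_name_or_path": "my-model"},
}
with open("predictions.json", "w") as f:
    json.dump(predictions, f, indent=4)

# List form: the same records with the instance ID inlined.
as_list = [{"instance_id": k, **v} for k, v in predictions.items()]

# Normalizing the list form back to the mapping form is one comprehension.
as_mapping = {
    d["instance_id"]: {k: v for k, v in d.items() if k != "instance_id"}
    for d in as_list
}
assert as_mapping == predictions
```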

About

Run SWE-bench evaluations remotely
