Competition Track

Security Vulnerabilities

This track focuses on fixing security vulnerabilities reported in real-world software programs. Repair tools will fix vulnerabilities that each have a single failing test-case extracted from a bug report.

The competition will run on the Cerberus platform for program repair. All repair tools will be executed in Docker containers on machines at the National University of Singapore. The server specifications are listed in the Specification section.

When a repair task starts, the competition platform provides the repair tool with parameters and metadata. To participate in APR-Comp, the repair tool must be integrated into the Cerberus platform under the directory app/drivers/tools/repair via a pull request.

Specification

Each repair task will be executed in a separate container created from the author-provided Docker image. For each track, the following resource limits will be enforced on the running container.


    Processor: Intel(R) Xeon(R) Platinum 8468V CPU @ 2.40GHz
    Cache: 97.5 MB
    Number of Cores: 8
    Memory: 64 GB
    GPU: 2 x NVIDIA A40 GPU (48 GB GDDR6 RAM)
    Time Duration: 1 hour
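
For a local dry run, the limits listed above can be approximated with standard Docker CLI options. The sketch below only builds the command line; the image name and entry point are placeholders, and it is an assumption that `--cpus`, `--memory` and `--gpus` mirror how the competition platform enforces the limits.

```python
import shlex

# Limits copied from the specification list above.
CPU_CORES = 8
MEMORY_GB = 64
GPU_COUNT = 2


def docker_run_command(image, tool_command):
    """Build a `docker run` invocation that approximates the listed limits.

    `image` and `tool_command` are placeholders for your own image name and
    entry point; they are not defined by the competition.
    """
    return [
        "docker", "run", "--rm",
        f"--cpus={CPU_CORES}",      # cap CPU usage at 8 cores
        f"--memory={MEMORY_GB}g",   # cap memory at 64 GB
        f"--gpus={GPU_COUNT}",      # expose 2 GPUs to the container
        image,
        *tool_command,
    ]


if __name__ == "__main__":
    cmd = docker_run_command("my-repair-tool:latest", ["./run.sh"])
    print(shlex.join(cmd))
```

The 1 hour time limit is not part of this command; it would have to be applied separately by whatever launches the container.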
                  

Setup

Each track contains N repair tasks. The tool is invoked once per repair task, with the specified time and resource constraints applied. The exact value of N differs between tracks but is the same across all sub-tracks of a track.

Expected Output

The repair tool should generate at most 5 patch files in unified diff format for the provided source code. Each patch will be applied to the original source code; patches that fail to apply will be considered invalid. Please ensure your repair driver places the generated patches in a subdirectory named patches within the self.output directory.
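
The output rules above can be sketched with Python's standard difflib module. This is a minimal illustration, not the Cerberus driver API: the function name, its arguments and the numbered file names are hypothetical, and output_dir stands in for the driver's self.output directory.

```python
import difflib
from pathlib import Path

MAX_PATCHES = 5  # limit stated in the rules above


def write_patches(output_dir, rel_path, original_text, candidate_texts):
    """Write up to MAX_PATCHES unified diffs into <output_dir>/patches.

    Assumes every text ends with a newline; difflib does not emit the
    "\\ No newline at end of file" marker for texts that do not.
    """
    patch_dir = Path(output_dir) / "patches"
    patch_dir.mkdir(parents=True, exist_ok=True)
    original_lines = original_text.splitlines(keepends=True)
    written = []
    for candidate in candidate_texts:
        if len(written) == MAX_PATCHES:
            break  # never emit more than the allowed number of patches
        diff = "".join(difflib.unified_diff(
            original_lines,
            candidate.splitlines(keepends=True),
            fromfile=f"a/{rel_path}",
            tofile=f"b/{rel_path}",
        ))
        if not diff:
            continue  # candidate is identical to the original
        patch_file = patch_dir / f"{len(written) + 1}.patch"
        patch_file.write_text(diff)
        written.append(patch_file)
    return written
```

Diffing against the unmodified source matters here, since each patch is applied to the original code and not on top of another patch.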

Sub Tracks

The first edition of APR-Comp will target the two most widely supported programming languages in vulnerability repair.

Benchmark programs will be selected from VulnLoc (a benchmark of C/C++ vulnerabilities) and recently disclosed zero-day subjects.
Demo Benchmark: GitHub Repo
Time Duration: 1 hour
Input to Repair: source code, single failing test-case exposing the vulnerability

Benchmark programs will be selected from Vul4J (a benchmark of Java vulnerabilities) and recently disclosed zero-day subjects.
Demo Benchmark: GitHub Repo
Time Duration: 1 hour
Input to Repair: source code, single failing test-case exposing the vulnerability

Timeout

If a tool is still running when the specified timeout expires (i.e., it hangs), it is killed after a further 5 minutes of wall time, and its recorded runtime is set to the specified timeout plus 5 minutes.
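
The runtime accounting can be illustrated with a short sketch. The function name is hypothetical, and charging a tool that finishes on its own with its actual wall time is an assumption; the rule above only specifies the hung case.

```python
GRACE_MINUTES = 5  # extra wall time before an unresponsive tool is killed


def recorded_runtime(elapsed_minutes, timeout_minutes=60):
    """Return the runtime recorded for one repair task, in minutes.

    A tool still running at timeout + grace is killed and charged
    timeout + grace; otherwise its elapsed wall time is used (assumption).
    """
    kill_at = timeout_minutes + GRACE_MINUTES
    return min(elapsed_minutes, kill_at)
```

With the 1 hour limit, a tool that hangs is therefore recorded at 65 minutes regardless of how long it would otherwise have run.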

Disclaimer

The organizers reserve the right to make software, hardware and configuration changes before the competition. Please exercise your tools on a few benchmarks to ensure that they run successfully!