ICFP 2025
Sun 12 - Sat 18 October 2025 Singapore
co-located with ICFP/SPLASH 2025

Authors with a paper conditionally accepted at ICFP 2025 are invited to submit an artifact that supports the conclusions of the paper. The Artifact Evaluation Committee will read the paper, explore the artifact, and provide feedback on how easy it would be for future researchers to build on. The ultimate goal of artifact evaluation is to support future researchers in their ability to reproduce and build on today’s work.

If you have a conditionally accepted paper at ICFP 2025, please see the Call for Artifacts for instructions on submitting an artifact for evaluation.

Call for Artifacts

The Artifact Evaluation Committee (AEC) invites authors of conditionally accepted papers to submit an artifact that supports the conclusions of the paper. The committee will read the paper, explore the artifact, and provide feedback on how easy it would be for future researchers to build on that artifact. The ultimate goal of artifact evaluation is to support future researchers in their ability to reproduce and build on today’s work.

The submission of artifacts for review is voluntary and will not influence the final decision on whether the paper itself is accepted. Papers with successfully reviewed artifacts will receive a seal of approval printed on the first page of the paper in the ICFP proceedings. Authors of papers with successfully reviewed artifacts are encouraged to make these artifacts publicly available upon publication of the proceedings by including them as “source materials” in the ACM Digital Library.

The artifact evaluation process starts when a paper is conditionally accepted and an artifact is submitted. If a paper is rejected in the second round of reviews, the corresponding artifact will not be evaluated further.

Types of Artifacts

An artifact that supports the paper’s conclusions can take many forms, including:

  • a working copy of the software and its dependencies, including benchmarks, examples and/or case studies
  • experimental data sets
  • a mechanized proof

Paper proofs are not accepted as artifacts for evaluation.

Selection Criteria

Artifacts have two broad purposes: facilitating reproduction and reuse of the work by future scientists. Reuse goes beyond reproduction by allowing future scientists to, for example, extend a tool with new features or to inspect the exact definitions used in a formal proof.

To facilitate reproduction and reuse, an artifact should be:

  • consistent with the claims and results presented in the paper;
  • as complete as possible, supporting all claims of the paper;
  • well-documented;
  • future-proof;
  • easy to extend and modify.

Artifacts that satisfy the reproduction criteria will be awarded the “Functional” badge. In addition, artifacts deemed to be reusable will be awarded the “Reusable” badge and artifacts that are made durably available will be awarded the “Available” badge. For more details on the badges and the evaluation criteria, see the Evaluation Guidelines.

We expect that most artifacts submitted for review at ICFP will have one of a few common forms: compilers, interpreters, proof scripts, and so on. We have codified common forms of artifacts on a separate page. If you are considering submitting an artifact that does not have one of these forms, we encourage you to contact the Artifact Evaluation chairs before the submission deadline to discuss what is expected.

Submission Process

The evaluation process uses a single-blind system: authors will not know the identities of reviewers.

Authors may iteratively improve their artifacts during the process to overcome small technical problems, but may not submit new material that would require substantially more reviewing effort.

We expect that most artifact submissions include BOTH:

  1. Software installed into a virtual machine (VM) image provided by the committee. See this page for details.

  2. A separate archive file that includes just the source files.

The intention is that reviewers who are familiar with certain tools (e.g. Agda or OCaml) can inspect the artifact sources directly, while reviewers who are less familiar can still execute the artifact without needing to install new software on their own machines. The VM image also ensures that future researchers, say in 5 years’ time, do not need to worry about version incompatibilities between old tool versions and new operating systems.

The detailed submission process is as follows:

  1. Read the Submission Guidelines page for details on artifact preparation.
  2. Register your intent to submit an artifact on the separate artifact-only HotCRP site.
  3. Prepare your artifact, building upon the base VM image.
  4. Upload your artifact to Zenodo (recommended), or otherwise make it available via a stable URL (i.e. the URL should not change if you later make updates to the artifact; and ideally, the URL has a good chance of continuing to exist well into the future).
  5. Finalize your submission on HotCRP with a link to your artifact. You should also upload a preprint of your paper, and any additional materials the reviewers may find helpful (e.g., appendices).

Timeline

The schedule for artifact evaluation is fairly compressed because it needs to fit between the notification of conditional acceptance and the camera-ready deadline. These are the key dates (all dates are in the Anywhere on Earth (AOE / UTC-12) timezone):

Event Date
Notification of paper acceptance (R1) Fri May 23
Artifact registration Thu May 29
Artifact submission Fri Jun 13
Start of review phase Mon Jun 16
Notification of paper acceptance (R2) Fri Jun 27
Further clarifications until Tue Jul 17
Final decisions sent to authors Wed Jul 18

Ideally, artifact evaluation is a collaborative process during which authors and evaluators work together to improve the artifact. We therefore recommend that at least one author of each artifact is available to respond to evaluators’ comments throughout the review period.

More Information

For additional information, clarification, or answers to questions, please contact the ICFP Artifact Evaluation co-chairs:

Most artifacts that are submitted for review at ICFP have one of a few common forms, so we provide some advice for authors about how to prepare them. This material should be taken as highly suggestive, but not prescriptive. If you have questions about what is expected, or if your artifact does not fit into any of the categories below, please contact the AE co-chairs as early as possible.

Instructions for All Artifacts

Artifact Registration

Artifacts must be registered via a separate artifact-only HotCRP instance by May 29th.

For registration, please write a short abstract of the artifact. This allows us to detect in advance artifacts that may have special requirements (e.g., specific hardware).

The submission of the full artifact is due on June 13th.

Sources and VM Image

Artifacts should generally consist of two components:

  1. A source archive file. Please use a widely available compressed archive format such as ZIP (.zip), tar and gzip (.tgz), or tar and bzip2 (.tbz2).
  2. A virtual machine image containing the same sources, with the artifact’s dependencies already installed.

The VM image primarily facilitates reproduction: it allows future scientists to reproduce the artifact’s results without having to deal with incompatible dependencies, changes to operating system interfaces, etc. The source source archive primarily facilitates reuse.

Both components must contain a README.md file that gives the name of the paper and step-by-step instructions for how to execute the artifact. For the source archive, these instructions should include how to install the artifact’s dependencies.

If the artifact requires particular hardware or software, e.g., a specific platform (x86, ARM, RISCV, a specific GPU…), a large amount of memory, or a specific operating system (Windows, BSD, …), these requirements should be written explicitly in the README.md file and the authors should also indicate these requirements in their submission on HotCRP.

The VM image may be produced by taking our base image, unpacking the source archive into the VM, and executing a prefix of the source archive’s instructions.

The instructions for building your artifact from within the VM image should not involve the network: in particular, they should not download packages or source files, because the URLs and the contents the URLs point to might not be stable. Therefore, all the needed dependencies of your artifact must have been fetched in the VM image already. This will ensure that your artifact can be built again in several years without problems.

Try to avoid installing graphical environments in the VM unless this is truly necessary. Graphical environments in VMs are sometimes slow and unstable.

Readme

In most cases, the step-by-step instructions in your README.md should be a list of commands to build and test the artifact on the examples described in the paper and to reproduce any graphs and benchmarking results. The instructions should call out particular features of the results, such as “this produces the graph in Fig. 5 that shows our algorithm runs in linear time”. Try to keep the instructions clear enough, so that reviewers can work through them in under 30 minutes. Consider providing a top-level Makefile so that the commands to be executed are just make targets that automatically build their prerequisites.

If the build process emits warning messages, perhaps when building libraries that are not under the author’s control, include a note in the instructions that this is the case. Without such a note, the reviewers may assume something is wrong with the artifact itself.

Separately from the step-by-step instructions, provide other details about what a reviewer should look at. For example, “our artifact extends existing system X and our extension is the code located in file Y”.

Upload to Zenodo

Once you have prepared your artifact, upload it to Zenodo to ensure that it will remain publicly accessible in perpetuity. Similar publicly accessible long-term archives are also acceptable (but not GitHub, Docker Hub or your personal website).

No Need for Anonymization

We use a single-blind review process. This means that while authors will not know the identity of reviewers, reviewers on the author hand will know who the authors of a paper are.

Revised Papers

Artifacts should, if possible, be evaluated against the revised version of the paper. To facilitate this, you can upload the revised version of the paper, or a partially revised draft, when you submit the artifact. Please also add a note to the artifact’s README alerting reviewers to the revisions.

Instructions for Common Types of Artifacts

Command-line Tools

Unix command-line tools should have standard --help-style command-line help pages. It is not acceptable for an executable to throw uninformative exceptions when executed with no flags or with the wrong flags.

Compilers and Interpreters

It should be obvious how to run the tool on new examples that the reviewers write themselves. Do not just hard-code the examples described in the paper.

If your tool consumes expressions in a custom DSL, then we recommend supplying a grammar for the concrete syntax, so that reviewers can try the tool on new examples. Papers that describe such languages often give just an abstract syntax, and it is often not clear what the full concrete syntax is from the paper alone.

Proof Scripts

In most cases, the artifact VM should contain an installation of the proof checker and specify a single command (preferably make) to re-check the proof. It is fine to leave the VM itself command-line only and require reviewers to browse the proof script locally on their own machines. It should not be necessary to have an IDE (e.g. CoqIDE, Emacs, or VSCode) installed into the VM, unless the paper is specifically about IDE functionality.

Include comments in the proof scripts that highlight the main theorems described in the paper. Use comments like “This is Theorem 1.2: Soundness described on page 6 of the paper”. Proof scripts written in “apply style” are typically unreadable without loading them into an IDE, but reviewers will still want to find the main lemmas and understand how they relate.

Reviewers often complain about a lack of comments in proof scripts. To authors, the logical statements of the lemmas themselves may be quite readable, but reviewers typically want English prose that repeats the same information.

Your proof artifact should also provide a command that gathers and prints the axioms that your proof relies on. For example, with the Coq/Rocq proof assistant, the Makefile produced by coq_makefile contains the target validate that does this. An instruction in your Readme.md file such as “look in the development for the admit keyword” is not acceptable.

Common Problems

This section discusses common problems with artifacts. If your artifact has any special requirements, please contact the AE co-chairs well in advance. We will then discuss how the artifact can be best reviewed. The advice below has been distilled from past experience at a variety of events and does not describe specific papers, artifacts, or authors.

Proprietary Software

It is reasonable for artifacts to depend on commercially licensed tools (e.g., MATLAB or some commercial SMT solver) if the paper’s audience would generally have access to these tools. In such cases, we will try to match the artifact with reviewers who also have access to these tools. If this is not possible, we will ask authors to provide an accessible environment in which the required tools are installed.

If parts of the artifact depend on a proprietary tool or proprietary data which cannot be made available to the reviewers at all, please contact the AE co-chairs to discuss whether and how the artifact can still be reviewed.

Special Hardware Requirements

Some artifacts require special hardware, e.g., GPUs or a computer cluster. If the hardware is relatively common, e.g., an NVIDIA GPU, we will try to find reviewers who have access to the hardware. If not, it is usually still possible to (partially) evaluate the artifact:

  • Authors may provide a scaled-down version of the artifact (e.g., a benchmark) which can be executed on a commodity computer.
  • Authors may provide anonymous access to an environment which includes the needed hardware (e.g. a computer cluster).
  • Authors may provide access to a simulator, e.g., for a specific FPGA.

Long-Running Computations

Some artifacts require extensive computations (on the order of days, not hours). In such cases, please provide a scaled-down version of the artifact which can be evaluated within a reasonable amount of time. If possible, make sure that the computation can be paused and resumed.

Unstable or Dangerous Software

Reviewers may be reluctant to install software that may destabilize their systems (e.g., kernel drivers) or that intentionally performs dangerous actions (e.g., proof-of-concept exploits). In such cases, document very clearly any possible effects on the host system and how to reverse them. If possible, prepare the artifact in such a way that the dangerous software is isolated from the reviewer’s system.

Web Interfaces

If your artifact has a web interface, try to get the server running locally inside the VM and allow the reviewer to connect to it via a web browser running natively on their host machine. Graphical environments installed into VMs are sometimes laggy and unstable, and standard web protocols are stable enough that such artifacts should be usable with any recent browser.

Programs that Generate Images

If the artifact produces an image file (e.g., a graph), then expect the reviewers to use scp or some such to copy it out to the host machine and view it. Authors should test that the connection to the VM works, as explained in the VM image’s README.md, so that this is possible.

Since 2022 at ICFP, artifact evaluation does not only make a binary decision between ‘artifact accepted’ and ‘artifact rejected’, but also distinguishes between two levels of artifact quality. To facilitate this, we believe it is important to lay out in some detail what these levels mean, to guide reviewers, to ensure fairness, and to help authors prepare their submissions.

However, this document cannot possibly capture all the nuances of what makes a good artifact. Any prescriptive statement should be read as a guideline, not a fixed rule. The ultimate aim is to produce artifacts that are useful to the research community and if that aim is better served by deviating from these guidelines, authors and reviewers should do so. (If the deviation is substantial, though, please contact the AE co-chairs to discuss it; the earlier, the better). Our guidelines are also not necessarily complete: if an artifact satisfies all the criteria below but has other major issues, it may still be denied the corresponding badges.

Badges

Per the ACM guidelines, artifacts can earn three badges:

  • Functional badge Artifact functional: for artifacts which are found to be documented, consistent, complete, exercisable, and include appropriate evidence of verification and validation.
  • Reusable badge Artifact reusable: for artifacts that are Functional and facilitate reuse through careful documentation and clear organization.
  • Available badge Artifact available: This badge is for artifacts that are published in a permanent location (with a DOI). Artifacts do not need to be evaluated to receive this badge.

Functional and Reusable Badges

We award two badges for quality: functional and reusable. To be deemed reusable, an artifact must also fulfill all the criteria for functional artifacts. The difference between the two badges is, roughly, the difference between reproducibility and reusability:

  • Functional artifacts allow future scientists to confirm that any claims from the paper which are supposed to be supported by the artifact are in fact supported by the artifact. For example, if the artifact is a program, reviewers must be able to build it, to run it and to confirm that it yields the expected results.

  • Reusable artifacts go beyond reproduction by enabling future scientists to build upon the artifact. For example, programs should run on modern operating systems and have up-to-date dependencies.

In the following, we spell out our expectations for the two badges in more detail.

Functional Badge

Consistency and Completeness

The artifact should directly implement or support the technical content of the paper (consistency). It should validate any claims made in the paper about the artifact or, if there are no explicit claims in the paper, any claims that one would expect to be validated (completeness).

  • For programs, the program should work as described in the paper. The program may be an extended version of the one described in the paper, but all examples discussed in the paper should run with at most minimal and clearly documented changes.
  • For benchmarks, the results obtained by running the benchmarks should be consistent (within the expected variance) with the results reported in the paper. All graphs, tables, etc. should be reproducible. Exceptions are possible, e.g., when a benchmark takes a very long time to run. We expect such exceptions to be clearly documented.
  • For formal proofs, the proved statements should match those from the papers. Axioms or incomplete proofs are acceptable if they are documented in the paper.

Exercisability

It should be possible to reproduce the artifact’s contribution in any commonly used environment. For executable artifacts, this requirement is satisfied if the artifact is packaged as a VM image containing all relevant software and data sets (as described in the submission guidelines). If external data are required, it should be clear how to access them.

This requirement does not apply to artifacts which necessarily require a non-standard environment, e.g., special hardware or large amounts of computing power. But if at all possible, the artifact should still allow reviewers to partially verify the artifact, for example by

  • providing simulators for special hardware;
  • providing anonymous remote access to special hardware or compute clusters;
  • providing downscaled versions of the artifact which can be run on standard hardware.

Authors of such artifacts should contact us already before the submission to discuss these issues.

Documentation

The artifact should contain sufficient documentation for reviewers to perform the activities mentioned above.

  • For programs, it should be clear how to build the program and how to run it on the examples provided in the paper.
  • For benchmarks, it should be clear how to run the benchmark and how to interpret the resulting data.
  • For formal proofs, it should be clear:
  1. how to check that the proofs are axiom-free;
  2. which parts of the formal proof correspond to which theorem in the paper;
  3. how the notation and definitions used for the formal proof correspond to those used in the paper.

Reusable Badge

Reusable artifacts should satisfy the following requirements in addition to the functional ones. The reusable requirements are necessarily fuzzier than the functional ones. We rely on reviewers’ good judgment to determine whether an artifact can, in fact, be reused by future scientists. However, we ask reviewers to be lenient with regard to the more work-intensive requirements. It is generally easy to find fault with the documentation and code quality of research projects, but artifacts should be evaluated against the state of the art, not against a theoretical ideal of perfect software.

Exercisability

The artifact should work not only inside the VM image, but also in other standard environments. This means:

  • The artifact’s dependencies should be reasonably up-to-date. The artifact should not unnecessarily depend on specific drivers, instructions sets, etc.
  • The artifact should not depend on undocumented changes to, for example, the operating system or dependencies.
  • The artifact’s packaging should facilitate reuse (e.g., a Coq library could be packaged as an opam package).
  • The artifact should give reasonable error messages if it cannot be executed in some particular environment.

Documentation

The artifact should be documented in a way that facilitates reuse. This means:

  • There should be installation instructions for all supported operating systems. Dependencies should be clearly documented.
  • For programs, it should be clear how to run the program on inputs other than those from the paper. For example, for a compiler, the concrete syntax of the input language should be documented. Any options to the program should be documented. The main parts of the implementation should be documented to a reasonable degree. It should be clear how to run the test suite (if any).
  • For benchmarks, it should be clear how to run the benchmark on inputs other than those from the paper and how to prepare such inputs.
  • For formal proofs, the main parts of the proof (key lemmas and definitions) should be documented, especially if the notation differs from that used in the paper.

Quality

The artifact should be of sufficiently high quality that future scientists could reuse it without major changes. For example:

  • Code should be reasonably clear and consistently formatted.
  • The build process should be reasonably simple.
  • Error messages should be clear enough to facilitate debugging.
  • Web interfaces should work with all modern browsers.

Available Badge

The Artifact available badge is awarded to artifacts that meet at least the quality criteria of the functional badge and that are additionally stored in a long-term, publicly accessible archive. The archive must have a plan for permanent accessibility along the lines of Zenodo’s or FigShare’s policies. Other archives are also accepted, but they must fulfill the two above criteria:

  • Long-term archival: the archive must ensure that artifacts are available indefinitely. This excludes commercial offerings such as GitHub which make no long-term commitments as well as personal websites.
  • Public accessibility: the archive must be freely accessible to the general public.

We provide a VM image that you may use as a base for your own VM image. If you wish to use this image, follow the instructions in the next section. If you wish to create your own VM image from scratch, follow the instructions further down below.

Creating an Artifact Using the Base Image

Download an archive containing the base image and some supporting files:

base-image.tar.xz (548M)

To verify the download, create a sha file next to base-image.tar.xz with content

05838cabb4333f92c1b5bf7cff005bd04cf909286882fe85c1901ee34d07e185854a34e843ebcdb94ce0d24666b2627541013870a6cb69ea5e64d2914252dc73  base-image.tar.xz

Then run the command

$ sha512sum -c sha

Unpack base-image.tar.xz. This requires about 2.2GB of disk space. The unpacked directory contains a file README.md with further instructions.

The base VM image uses a virtual hard disk with a maximum capacity of 16GB. (Its actual size expands dynamically up to this maximum.) If you need more disk space, please create a custom VM image.

Creating an Artifact Using a Custom Image

Download an archive containing the supporting files. (These are the same files contained in base-image.tar.xz, minus the actual VM image.)

supporting-files.tar.xz (5.6K)

To verify the download, create a file sha next to supporting-files.tar.xz with content

d2f8cbaa8ea3939f848061035f8b67f9b7648da00a52ff2ee8da2d63caf2ff87ea6afbf55dcfb89372bbf768694090434419f02ad307bb9d22a9815131e414a1  supporting-files.tar.xz

Then run the command

$ sha512sum -c sha

Unpack the archive. The unpacked directory contains a file ImageCreation.md detailing the steps we took to create the base VM image. When creating your own image, please follow these steps as closely as possible (while making any modifications you need). This makes sure that the reviewers can run your VM image without too many surprises.

Adjust the supporting files as necessary for your custom VM image. Check that the start.sh and start.bat scripts still work. Add a prominent notice for the reviewers to your README.md.

M1 Macs

We adapted the script start.sh to work on Apple Silicon M1 macs. However, you must use macOS >= 12.4 to run this script. Earlier versions of macOS contain a bug which will lead to a kernel panic on certain M1 chips (at least M1 Pro and M1 Max).

You can also use the UTM app as a graphical QEMU frontend. Create a VM with the following settings:

  • 4GB of RAM
  • Use disk.qcow as the virtual hard drive.
  • Forward port 22 of the VM to port 5555 on the host (if you want to SSH into the VM).

Specifying number of cores

In case you need to rely on more than one core in your VM, you can specify the number of cores as extra parameters when launching the start.sh or start.bat script, e.g., start.sh -smp 2 will launch QEMU with two available cores.