Publish a new benchmark

Once registered, you can also define and upload the materials needed to include a new benchmark in EvALL and share it with the research community. You do not need to upload the test collection itself (distributing it could raise legal issues), only the gold standard and the basic evaluation specifications (type of task, official metrics, etc.). When uploading a new benchmark, at least one baseline or system output associated with the gold standard must also be published.
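
The requirements above can be read as a short checklist. The following Python sketch is only an illustration of that checklist: the class name, field names, and file names are hypothetical and do not reflect EvALL's actual upload format, which is defined by the publishing form itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BenchmarkSubmission:
    """Hypothetical container for the materials listed above (not EvALL's schema)."""
    name: str
    task_type: str                  # type of task, e.g. "classification" (illustrative)
    official_metrics: List[str]     # official metrics chosen for the benchmark
    gold_standard_path: str         # file holding the gold standard
    # Baselines or system outputs associated with the gold standard.
    system_output_paths: List[str] = field(default_factory=list)


def missing_materials(sub: BenchmarkSubmission) -> List[str]:
    """Return the problems found; an empty list means the checklist passes."""
    problems = []
    if not sub.gold_standard_path:
        problems.append("gold standard is missing")
    if not sub.task_type:
        problems.append("task type is missing")
    if not sub.official_metrics:
        problems.append("no official metric specified")
    if not sub.system_output_paths:
        problems.append("at least one baseline or system output is required")
    return problems
```

Note that the test collection does not appear among the fields, in line with the point that it does not need to be uploaded.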

Once the benchmark is in the repository, the research community can:

  • Evaluate their results in a way compliant with measurement theory and with state-of-the-art evaluation practices in the field.

  • Quantitatively and qualitatively compare their results on the benchmark with the state of the art.

  • Provide their results as reusable data to the scientific community.

You can explore the benchmarks already stored in EvALL in the browsing window.

If you want to publish a new benchmark, please select the option "Publish a new benchmark" in the main menu of the Home page.