
Releases: irgroup/repro_eval

v0.5.0

02 Dec 17:44
63b5d8c

Notes:
The main contribution of this release is the replacement of the evaluation backend. Specifically, ir‑measures replaces pytrec_eval. Future releases of repro_eval could include different evaluation backends; their comparison would be an interesting reproducibility study in its own right. In general, the project’s code is now more agnostic about the evaluation backend. Internally, runs, qrels, and topics use a nested‑dictionary data structure. To integrate another evaluation backend, the code must be added to repro_eval/util.py.
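For illustration, a minimal sketch of that nested-dictionary layout; the query and document identifiers below are made up and not taken from repro_eval/util.py:

```python
# Hedged sketch of the nested-dictionary layout for runs and qrels.
# All identifiers ("q1", "d1", ...) are illustrative placeholders.

# A run: query id -> document id -> retrieval score
run = {
    "q1": {"d1": 10.5, "d2": 9.7, "d3": 8.1},
    "q2": {"d4": 12.0, "d1": 11.3},
}

# Qrels: query id -> document id -> relevance label
qrels = {
    "q1": {"d1": 1, "d2": 0, "d3": 2},
    "q2": {"d4": 1, "d1": 0},
}

# A backend that consumes these plain dictionaries keeps the rest of the
# code agnostic about which evaluation library sits behind it.
for qid, ranking in run.items():
    ranked = sorted(ranking, key=ranking.get, reverse=True)
    print(qid, ranked)
```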

New features:

  • ir-measures replaces pytrec_eval as the evaluation backend
  • RBO and KTU are now computed as aggregated scores by default; setting per_topic=True returns topic-wise rank correlation scores instead
  • RBO is parameterizable from the repro_eval interface
  • The old RBO implementation was removed
  • Kendall's tau Union is now called via ktu() instead of ktau_union()
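To illustrate the aggregated-by-default behavior, here is a hedged sketch of the interface pattern; corr() below is a toy stand-in and does not reproduce repro_eval's actual RBO/KTU computation:

```python
def corr(ranking_a, ranking_b):
    """Toy rank-correlation stand-in (overlap of the two document sets).
    Not repro_eval's RBO or Kendall's tau Union."""
    a, b = set(ranking_a), set(ranking_b)
    return len(a & b) / len(a | b)

def ktu(run_orig, run_rep, per_topic=False):
    """Sketch of the 'aggregated by default, per-topic on request' pattern."""
    scores = {q: corr(run_orig[q], run_rep[q]) for q in run_orig}
    if per_topic:
        return scores                               # topic-wise scores
    return sum(scores.values()) / len(scores)       # aggregated score (default)
```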

Bugfixes:

  • Some methods were improved to support evaluating additional/external runs that are not provided at initialization.
  • Renamed the variable qrel to qrels.

v0.4.0

15 Feb 14:58

New feature: metadata support

v0.3.3

30 Aug 13:34

New feature: Break score ties (as implemented in trec_eval) for runs that are not loaded via pytrec_eval.
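For reference, trec_eval orders a run by descending score and breaks ties between equal scores by document id in reverse lexicographic order; a minimal sketch of that rule (not repro_eval's actual implementation):

```python
def rank_with_ties_broken(ranking):
    """Sort document ids by descending score; break ties by reverse-lexicographic
    docid, mirroring trec_eval's tie-breaking (sketch, not repro_eval's code)."""
    return sorted(ranking, key=lambda doc: (ranking[doc], doc), reverse=True)

ranking = {"docA": 2.0, "docB": 2.0, "docC": 3.0}
print(rank_with_ties_broken(ranking))  # -> ['docC', 'docB', 'docA']
```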

v0.3.2

25 Aug 13:42

Bugfix: Avoid rounding errors related to Kendall's tau

v0.3.1

22 Jul 12:03

v0.3

09 Jul 07:00

  • New feature: The normalized RMSE (nRMSE) can now be determined for reproduced runs. Use nrmse() with the RpdEvaluator.
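As an illustration of what a normalized RMSE computes over per-topic scores, here is a minimal sketch; normalizing by the range of the original scores is an assumption made here, and the library's nrmse() may normalize differently:

```python
import math

def nrmse_sketch(orig_scores, rep_scores):
    """Hedged sketch of a normalized RMSE over per-topic effectiveness scores.
    Range normalization is an assumption; repro_eval's nrmse() may differ."""
    diffs = [(orig_scores[q] - rep_scores[q]) ** 2 for q in orig_scores]
    rmse = math.sqrt(sum(diffs) / len(diffs))
    score_range = max(orig_scores.values()) - min(orig_scores.values())
    return rmse / score_range if score_range else rmse
```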

v0.2

22 Jun 09:23

  • New feature: It is now possible to pass the paths of the run files directly to each measure's function.
  • More unit tests
  • Minor bug fixes

v0.1

07 Apr 17:04

Update pip installation