Releases: irgroup/repro_eval
v0.5.0
Notes:
The main contribution of this release is the replacement of the evaluation backend: ir-measures replaces pytrec_eval. Future releases of repro_eval could include other evaluation backends; comparing them would be an interesting reproducibility study in its own right. The project's code is now largely agnostic about the evaluation backend. Internally, runs, qrels, and topics use a nested-dictionary data structure. To integrate another evaluation backend, the corresponding code must be added to `repro_eval/util.py`.
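As an illustration of the nested-dictionary layout mentioned above, here is a minimal, self-contained sketch. The structure follows the convention popularized by pytrec_eval (topic IDs mapping to inner dicts of document IDs), which ir-measures can also consume; the data and the `precision` helper below are hypothetical and only for illustration:

```python
# A run: topic id -> {doc id -> retrieval score}  (hypothetical data)
run = {
    "q1": {"doc1": 12.5, "doc2": 10.1, "doc3": 7.8},
    "q2": {"doc4": 9.9, "doc1": 4.2},
}

# Qrels: topic id -> {doc id -> relevance judgment}  (hypothetical data)
qrels = {
    "q1": {"doc1": 1, "doc2": 0, "doc3": 1},
    "q2": {"doc4": 1, "doc1": 0},
}

def precision(run, qrels):
    """Toy per-topic precision over all retrieved documents,
    just to show how an evaluation backend walks the structure."""
    scores = {}
    for topic, ranking in run.items():
        judged = qrels.get(topic, {})
        relevant = sum(1 for doc in ranking if judged.get(doc, 0) > 0)
        scores[topic] = relevant / len(ranking)
    return scores

print(precision(run, qrels))  # {'q1': 0.666..., 'q2': 0.5}
```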
New features:
- ir-measures replaces pytrec_eval as the evaluation backend
- RBO and KTU are now computed as an aggregated score by default; setting `per_topic=True` returns topic-wise rank correlation scores
- RBO is parameterizable from the repro_eval interface
- The old RBO implementation was removed.
- Kendall's tau Union is now called via `ktu()` instead of `ktau_union()`
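To illustrate the aggregated-versus-per-topic distinction above, here is a minimal, self-contained sketch of a Kendall's tau comparison between two toy rankings. This is not repro_eval's actual implementation or API; the `ktau` function, its tie-free tau computation, and the data are hypothetical:

```python
from itertools import combinations

def kendalls_tau(ranking_a, ranking_b):
    """Plain Kendall's tau between two rankings (lists of doc ids),
    restricted to the documents they share; no tie handling."""
    shared = [d for d in ranking_a if d in ranking_b]
    if len(shared) < 2:
        return 1.0
    pos_b = {d: ranking_b.index(d) for d in shared}
    concordant = discordant = 0
    for x, y in combinations(shared, 2):
        # x precedes y in ranking_a by construction of `shared`
        if pos_b[x] < pos_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

def ktau(run_a, run_b, per_topic=False):
    """Aggregated score by default; topic-wise dict with per_topic=True."""
    scores = {t: kendalls_tau(run_a[t], run_b[t]) for t in run_a}
    if per_topic:
        return scores
    return sum(scores.values()) / len(scores)

# Hypothetical original and reproduced runs (ranked doc-id lists per topic)
run_orig = {"q1": ["d1", "d2", "d3", "d4"], "q2": ["d5", "d6", "d7"]}
run_repr = {"q1": ["d1", "d3", "d2", "d4"], "q2": ["d5", "d6", "d7"]}

print(ktau(run_orig, run_repr))                  # single aggregated score
print(ktau(run_orig, run_repr, per_topic=True))  # one score per topic
```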
Bugfixes:
- Some methods were improved to evaluate additional/external runs that are not provided at initialization.
- Renamed the variable `qrel` to `qrels`.
v0.4.0
v0.3.3
v0.3.2
v0.3.1
- New feature: Faster RBO implementation (from the TREC Health Misinformation Track; see also: https://github.com/claclark/Compatibility).
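For reference, a minimal sketch of truncated rank-biased overlap following the formula from Webber et al. (2010). This is not the Compatibility implementation linked above, just an illustration of the measure; the persistence parameter `p` weights agreement at top ranks more heavily:

```python
def rbo(list_a, list_b, p=0.9):
    """Truncated rank-biased overlap:
    RBO = (1 - p) * sum_{d=1..k} p^(d-1) * |A_d ∩ B_d| / d,
    where A_d, B_d are the top-d prefixes of the two rankings.
    Ignoring the extrapolated tail lower-bounds the full RBO score."""
    k = min(len(list_a), len(list_b))
    seen_a, seen_b = set(), set()
    score = 0.0
    for d in range(1, k + 1):
        seen_a.add(list_a[d - 1])
        seen_b.add(list_b[d - 1])
        score += p ** (d - 1) * len(seen_a & seen_b) / d
    return (1 - p) * score

# Identical rankings of depth k score 1 - p**k under truncation
print(rbo(["d1", "d2", "d3"], ["d1", "d2", "d3"]))  # 0.271 for p=0.9
print(rbo(["d1", "d2"], ["d3", "d4"]))              # 0.0 (disjoint)
```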