v0.5.0

@breuert breuert released this 02 Dec 17:44
63b5d8c

Notes:
The main contribution of this release is the replacement of the evaluation backend. Specifically, ir‑measures replaces pytrec_eval. Future releases of repro_eval could include different evaluation backends; their comparison would be an interesting reproducibility study in its own right. In general, the project’s code is now more agnostic about the evaluation backend. Internally, runs, qrels, and topics use a nested‑dictionary data structure. To integrate another evaluation backend, the code must be added to repro_eval/util.py.

New features:

  • ir-measures replaces pytrec_eval as the evaluation backend
  • RBO and KTU are now computed as aggregated scores by default; setting per_topic=True returns topic-wise rank correlation scores
  • RBO is parameterizable from the repro_eval interface
  • The old RBO implementation was removed
  • Kendall's tau Union is now called via ktu() instead of ktau_union()
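To illustrate the difference between the aggregated default and per_topic=True, the sketch below averages a per-topic rank correlation over topics. The kendall_tau helper is a minimal pure-Python stand-in written for this example; it is not repro_eval's implementation, and the exact interface of ktu()/rbo() may differ.

```python
# Sketch of aggregated vs. per-topic rank correlation (hypothetical helper,
# not repro_eval's code). Rankings are ordered lists of document ids.
from itertools import combinations

def kendall_tau(a, b):
    """Minimal Kendall's tau over the documents shared by both rankings."""
    items = set(a) & set(b)
    rank_a = {d: i for i, d in enumerate(a)}
    rank_b = {d: i for i, d in enumerate(b)}
    pairs = list(combinations(sorted(items), 2))
    if not pairs:
        return 1.0
    concordance = sum(
        1 if (rank_a[x] - rank_a[y]) * (rank_b[x] - rank_b[y]) > 0 else -1
        for x, y in pairs
    )
    return concordance / len(pairs)

original     = {"q1": ["d1", "d2", "d3"], "q2": ["d4", "d5", "d6"]}
reproduction = {"q1": ["d1", "d2", "d3"], "q2": ["d6", "d5", "d4"]}

# per_topic=True: one correlation score per topic
per_topic = {t: kendall_tau(original[t], reproduction[t]) for t in original}

# default: a single aggregated (mean) score
aggregated = sum(per_topic.values()) / len(per_topic)

print(per_topic)    # {'q1': 1.0, 'q2': -1.0}
print(aggregated)   # 0.0
```

The identical ranking for q1 yields a correlation of 1.0, the reversed ranking for q2 yields -1.0, and the aggregated default reports their mean.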

Bugfixes:

  • Some methods were improved to evaluate additional/external runs that are not provided at initialization
  • Renamed the variable qrel to qrels