Release v2.0.0
What's changed
Added features
- We welcome CAPO to the family of our optimizers! CAPO is an optimizer capable of utilizing few-shot examples to improve prompt performance. Additionally, it implements multiple AutoML approaches. Check out the paper by Zehle et al. (2025) for more details (yep, it's us :))
- The Eval-Cache is now part of the ClassificationTask! This saves a lot of LLM calls, as already evaluated data points are not rerun (see the sketch after this list)
- Similar to the Eval-Cache, we added a Sequence-Cache, which allows reasoning chains for few-shot examples to be extracted without repeated LLM calls
- Introduced evaluation strategies for the ClassificationTask, allowing random subsampling, sequential blocking of the dataset, or simply retrieving scores of data points that were already evaluated on a prompt
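
The Eval-Cache behaves conceptually like the minimal sketch below. This is an illustration only: the class name `EvalCache`, the method `evaluate`, and `score_fn` are hypothetical placeholders, not the library's actual API.

```python
# Conceptual sketch of an evaluation cache (names are illustrative, not the real API).
from typing import Callable, Dict, List, Tuple


class EvalCache:
    """Caches per-(prompt, datapoint) scores so repeated evaluations skip LLM calls."""

    def __init__(self) -> None:
        self._scores: Dict[Tuple[str, str], float] = {}

    def evaluate(
        self,
        prompt: str,
        datapoints: List[str],
        score_fn: Callable[[str, str], float],
    ) -> List[float]:
        scores = []
        for dp in datapoints:
            key = (prompt, dp)
            if key not in self._scores:  # only call the LLM for unseen (prompt, datapoint) pairs
                self._scores[key] = score_fn(prompt, dp)
            scores.append(self._scores[key])
        return scores
```

Because results are keyed by prompt and data point, re-evaluating a prompt on an overlapping subset (e.g. under sequential blocking) only pays for the unseen data points.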
Further changes
- Rearranged imports and module memberships
- Classificators are now called Classifiers
- Fixed multiple docstrings and variable names
- Simplified testing and extended the test cases to cover the new implementations
- The ClassificationTask can now also output per-datapoint scores
- Introduced statistical tests (specifically the paired t-test) for CAPO (see the sketch after this list)
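
The paired t-test compares two prompts on per-datapoint scores from the same data points. A minimal illustration, assuming SciPy is available (the library may implement the test differently) and using made-up scores:

```python
# Illustrative only: paired t-test over per-datapoint scores of two prompts.
# The score values below are made up; scipy.stats.ttest_rel is a standard implementation.
from scipy.stats import ttest_rel

scores_prompt_a = [0.9, 0.8, 1.0, 0.7, 0.95]  # per-datapoint scores for prompt A
scores_prompt_b = [0.6, 0.7, 0.9, 0.5, 0.80]  # same datapoints, scored with prompt B

t_stat, p_value = ttest_rel(scores_prompt_a, scores_prompt_b)
if p_value < 0.05:
    print(f"Prompt A and B differ significantly (t={t_stat:.2f}, p={p_value:.3f})")
```

Pairing the scores per data point removes the variance that comes from the data points themselves, so prompts can be compared on smaller evaluation subsets.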
Full Changelog: here