MRCpy: A Library for Minimax Risk Classifiers

The MRCpy library implements minimax risk classifiers (MRCs), which are based on robust risk minimization and can utilize the 0-1 loss. Such techniques give rise to a manifold of classification methods that can provide tight bounds on the expected loss. MRCpy provides a unified interface for the different variants of MRCs and follows the standards of popular Python libraries. The library also provides implementations of popular techniques that can be seen as MRCs, such as L1-regularized logistic regression, zero-one adversarial, and maximum entropy machines. In addition, MRCpy implements recent feature mappings such as Fourier, ReLU, and threshold features.
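To give a rough idea of what the feature mappings named above compute, here is a minimal standard-library sketch. It is not MRCpy's implementation: the function names, the Gaussian sampling of directions, the scaling, and the exact form of each mapping are assumptions chosen for illustration only.

```python
import math
import random


def draw_gaussian_weights(n_components, n_features, sigma=1.0, seed=0):
    """Draw random directions w ~ N(0, 1/sigma^2) (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0 / sigma) for _ in range(n_features)]
            for _ in range(n_components)]


def _dot(w, x):
    return sum(w_j * x_j for w_j, x_j in zip(w, x))


def random_fourier_features(x, weights):
    """Map x to [cos(w.x)..., sin(w.x)...], scaled by 1/sqrt(D)."""
    scale = 1.0 / math.sqrt(len(weights))
    projections = [_dot(w, x) for w in weights]
    return ([scale * math.cos(p) for p in projections]
            + [scale * math.sin(p) for p in projections])


def relu_features(x, weights):
    """Map x to max(0, w.x) for each random direction w."""
    return [max(0.0, _dot(w, x)) for w in weights]


def threshold_features(x, thresholds):
    """Map x to indicators 1{x[j] <= t} for each (j, t) pair."""
    return [1.0 if x[j] <= t else 0.0 for j, t in thresholds]


weights = draw_gaussian_weights(n_components=4, n_features=2)
x = [0.3, -1.2]
fourier = random_fourier_features(x, weights)   # 2 * 4 = 8 values
relu = relu_features(x, weights)                # 4 non-negative values
stumps = threshold_features(x, [(0, 0.5), (1, -2.0)])
```

With this scaling, the Fourier feature vector always has unit Euclidean norm, since each cosine and sine pair contributes 1/D to the squared norm.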

The MRCpy library incorporates a variety of datasets, along with descriptions and convenient loader functions for each dataset. More information about the loaders is available in Dataset Loaders. The available datasets from the UCI Repository are: credit, diabetes, ecoli, glass, haberman, indian liver patient, iris, letter recognition, mammographic, optdigits, redwine, satellite and segment.

There are also several datasets related to computer vision, which are actually “feature datasets”. We obtained these features by running a pretrained neural network over the images and taking the features from the second-to-last layer. You can read more about this in our examples MRCs with Deep Neural Networks: Part I and MRCs with Deep Neural Networks: Part II. The Yearbook image dataset is available both in its original version, consisting of portrait images, and in extracted-features form in a CSV file. There are also feature datasets for the MNIST and Cats vs Dogs datasets, whose image versions are directly available through TensorFlow Datasets. For all these feature datasets we used a ResNet18 pretrained on ImageNet.
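The idea of "taking the features from the second-to-last layer" can be sketched with a toy network. The example below is a stand-in written with the standard library only: the two-layer network, its weights, and the function names are hypothetical, and it does not reproduce the ResNet18 pipeline used to build the feature datasets.

```python
def dense(inputs, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


# Hypothetical tiny "pretrained" network: 3 inputs -> 2 hidden units -> 2 scores.
HIDDEN_W = [[0.5, -1.0, 0.25], [1.0, 0.5, -0.5]]
HIDDEN_B = [0.0, 0.1]
OUTPUT_W = [[1.0, -1.0], [-1.0, 1.0]]
OUTPUT_B = [0.0, 0.0]


def forward(x):
    """Return (penultimate activations, class scores) for input x."""
    hidden = relu(dense(x, HIDDEN_W, HIDDEN_B))
    scores = dense(hidden, OUTPUT_W, OUTPUT_B)
    return hidden, scores


def extract_features(x):
    """Discard the classification head and keep the penultimate activations."""
    hidden, _ = forward(x)
    return hidden


image = [1.0, 0.5, 2.0]             # stand-in for a flattened image
features = extract_features(image)  # this vector is what a classifier would consume
```

In this setting the network acts only as a fixed feature extractor, and the resulting vectors can be stored (for instance as rows of a CSV file) and passed to any classifier.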

Documentation outline

Getting started

MRCpy Package Contents

Gallery of examples
- Basic Examples
- Further applications

References

For more information about the MRC method and the MRCpy library, one can refer to the following resources:

[1] Mazuelas, S., Zanoni, A., & Pérez, A. (2020). Minimax Classification with 0-1 Loss and Performance Guarantees. Advances in Neural Information Processing Systems, 33, 302-312.

@article{mazuelas2020minimax,
   title={Minimax Classification with 0-1 Loss and Performance Guarantees},
   author={Mazuelas, Santiago and Zanoni, Andrea and P{\'e}rez, Aritz},
   journal={Advances in Neural Information Processing Systems},
   volume={33},
   pages={302--312},
   year={2020}
}

[2] Mazuelas, S., Shen, Y., & Pérez, A. (2022). Generalized Maximum Entropy for Supervised Classification. IEEE Transactions on Information Theory, 68(4), 2530-2550.

@article{MazShePer:22,
   title={Generalized Maximum Entropy for Supervised Classification},
   author={Santiago Mazuelas and Yuan Shen and Aritz P\'{e}rez},
   journal={IEEE Transactions on Information Theory},
   volume={68},
   number={4},
   pages={2530--2550},
   year={2022}
}

[3] Bondugula, K., Mazuelas, S., & Pérez, A. (2021). MRCpy: A Library for Minimax Risk Classifiers. arXiv preprint arXiv:2108.01952.

@article{bondugula2021mrcpy,
   title={MRCpy: A Library for Minimax Risk Classifiers},
   author={Bondugula, Kartheek and Mazuelas, Santiago and P{\'e}rez, Aritz},
   journal={arXiv preprint arXiv:2108.01952},
   year={2021}
}

[4] Álvarez, V., Mazuelas, S., & Lozano, J.A. (2022). Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees. Proceedings of the 39th International Conference on Machine Learning, 486-499.

@inproceedings{AlvMazLoz:22,
   title={Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees},
   author={Ver{\'o}nica {\'A}lvarez and Santiago Mazuelas and Jose A. Lozano},
   booktitle={Proceedings of the 39th International Conference on Machine Learning},
   pages={486--499},
   month={17--23 Jul},
   year={2022}
}

[5] Álvarez, V., Mazuelas, S., & Lozano, J.A. (2023). Minimax Forward and Backward Learning of Evolving Tasks with Performance Guarantees. Advances in Neural Information Processing Systems, 36, 65678-65702.

@inproceedings{AlvMazLoz:23,
   title={Minimax Forward and Backward Learning of Evolving Tasks with Performance Guarantees},
   author={Ver{\'o}nica {\'A}lvarez and Santiago Mazuelas and Jose A. Lozano},
   booktitle={Advances in Neural Information Processing Systems},
   volume={36},
   pages={65678--65702},
   year={2023}
}

[6] Bondugula, K., Mazuelas, S., & Pérez, A. (2023). Efficient Learning of Minimax Risk Classifiers in High Dimensions. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, 206-215.

@inproceedings{RedMazPer:23,
   title={Efficient Learning of Minimax Risk Classifiers in High Dimensions},
   author={Kartheek Bondugula and Santiago Mazuelas and Aritz P\'{e}rez},
   booktitle={Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence},
   pages={206--215},
   year={2023}
}

[7] Segovia-Martin, J.I., Mazuelas, S., & Liu, A. (2023). Double-Weighting for Covariate Shift Adaptation. Proceedings of the International Conference on Machine Learning, 40, 30439-30457.

@inproceedings{SegMazLiu:23,
   title={Double-Weighting for Covariate Shift Adaptation},
   author={Jose I. Segovia-Martin and Santiago Mazuelas and Anqi Liu},
   booktitle={Proceedings of the International Conference on Machine Learning},
   pages={30439--30457},
   volume={40},
   year={2023}
}

Funding

Funding in direct support of this work has been provided through different research projects by the following institutions:

Spanish Ministry of Science and Innovation through the projects PID2019-105058GA-I00, PID2022-137063NB-I00 and CNS2022-135203 funded by MCIN/AEI/10.13039/501100011033.

AXA Research Fund through the project “Early Prognosis of COVID-19 Infections via Machine Learning” funded in the Exceptional Flash Call “Mitigating risk in the wake of the COVID-19 pandemic”.

Basque Government through the project “Mathematical Modeling Applied to Health”, and through the “ELKARTEK Program”.