MRCpy: A Library for Minimax Risk Classifiers

The MRCpy library implements minimax risk classifiers (MRCs), which are based on robust risk minimization and can use the 0-1 loss. These techniques give rise to a range of classification methods that can provide tight bounds on the expected loss. MRCpy provides a unified interface for the different variants of MRCs and follows the standards of popular Python libraries. The library also implements popular techniques that can be seen as MRCs, such as L1-regularized logistic regression, zero-one adversarial, and maximum entropy machines. In addition, MRCpy implements recent feature mappings such as Fourier, ReLU, and threshold features.
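
As an illustration of what a Fourier feature mapping computes, the following is a minimal sketch of random Fourier features in plain Python. It is not MRCpy's implementation: the function name `make_fourier_map`, its parameters (`n_components`, `gamma`, `seed`), and the scaling are assumptions chosen for this example, and the library's own mapping may differ in these details.

```python
import math
import random


def make_fourier_map(n_inputs, n_components, gamma=1.0, seed=0):
    """Return a function mapping a vector x to random Fourier features.

    Hypothetical helper for illustration only. Frequencies are drawn from
    a Gaussian with variance 2 * gamma, so that the inner product of two
    mapped vectors approximates the RBF kernel exp(-gamma * ||x - y||^2).
    """
    rng = random.Random(seed)
    std = math.sqrt(2.0 * gamma)
    # One random frequency vector per component.
    weights = [
        [rng.gauss(0.0, std) for _ in range(n_inputs)]
        for _ in range(n_components)
    ]
    scale = 1.0 / math.sqrt(n_components)

    def phi(x):
        features = []
        for w in weights:
            projection = sum(wi * xi for wi, xi in zip(w, x))
            # Each frequency contributes a cosine and a sine feature.
            features.append(scale * math.cos(projection))
            features.append(scale * math.sin(projection))
        return features

    return phi


# Map two 2-dimensional points and compare with the exact RBF kernel value.
phi = make_fourier_map(n_inputs=2, n_components=2000, gamma=0.5, seed=1)
x, y = [0.1, -0.3], [0.4, 0.2]
approx_kernel = sum(u * v for u, v in zip(phi(x), phi(y)))
exact_kernel = math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))
print(approx_kernel, exact_kernel)
```

Pairing a cosine with a sine for every frequency gives each mapped vector unit norm, and increasing `n_components` tightens the kernel approximation at the cost of a longer feature vector.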

The MRCpy library incorporates a variety of datasets, along with descriptions and convenient loader functions for each dataset. More information about the loaders is available in Dataset Loaders. The available datasets from the UCI Repository are: credit, diabetes, ecoli, glass, haberman, indian liver patient, iris, letter recognition, mammographic, optdigits, redwine, satellite and segment.

There are also several datasets related to computer vision, which are actually “feature datasets”. We obtained these features by passing the images through a pretrained neural network and taking the outputs of the second-to-last layer. You can read more about this in our examples MRCs with Deep Neural Networks: Part I and MRCs with Deep Neural Networks: Part II. The Yearbook image dataset is available both in its original version, consisting of portrait images, and as extracted features in a CSV file. There are also feature datasets for MNIST and Cats vs Dogs, whose image versions are directly available through TensorFlow Datasets. For all these feature datasets we used a ResNet18 pretrained on ImageNet.

Documentation outline

References

For more information about the MRC method and the MRCpy library, one can refer to the following resources:

Funding

Funding in direct support of this work has been provided through different research projects by the following institutions.

Spanish Ministry of Science and Innovation through the projects PID2019-105058GA-I00, PID2022-137063NB-I00 and CNS2022-135203 funded by MCIN/AEI/10.13039/501100011033.

AXA Research Fund through the project “Early Prognosis of COVID-19 Infections via Machine Learning” funded in the Exceptional Flash Call “Mitigating risk in the wake of the COVID-19 pandemic”.

Basque Government through the project “Mathematical Modeling Applied to Health”, and through the “ELKARTEK Program”.