MRCpy.MRC
- class MRCpy.MRC(loss='0-1', s=0.3, deterministic=True, random_state=None, fit_intercept=True, solver='subgrad', max_iters=10000, n_max=100, k_max=20, eps=0.0001, phi='linear', **phi_kwargs)[source]
Minimax Risk Classifier
The class MRC implements the Minimax Risk Classifier (MRC) method proposed in [1] using the default constraints. It implements two kinds of loss functions, namely 0-1 and log loss.
The method MRC approximates the optimal classification rule by an optimization problem of the form
\[\mathcal{P}_{\text{MRC}}: \min_{h\in T(\mathcal{X},\mathcal{Y})} \max_{p\in\mathcal{U}} \ell(h,p)\]
where we consider an uncertainty set \(\mathcal{U}\) of potential probabilities. These uncertainty sets of distributions are given by constraints on the expectations of a vector-valued function \(\phi : \mathcal{X} \times \mathcal{Y} \rightarrow \mathbb{R}^m\) referred to as the feature mapping.
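The min-max structure of this problem can be illustrated with a toy sketch. It is not how MRCpy solves the problem: as simplifying assumptions, the uncertainty set is replaced by a short hand-written list of joint distributions, only deterministic rules are considered, and the loss is the 0-1 loss. All names below are hypothetical.

```python
from itertools import product

# Toy setting (an assumption for illustration): two instances, two labels.
X_VALUES = (0, 1)
Y_VALUES = (0, 1)

# A finite stand-in for the uncertainty set U: each candidate is a joint
# distribution p(x, y) stored as {(x, y): probability}.
UNCERTAINTY_SET = [
    {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4},
    {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.3},
    {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.25, (1, 1): 0.25},
]


def expected_01_loss(rule, p):
    """Probability that the deterministic rule x -> rule[x] errs under p."""
    return sum(prob for (x, y), prob in p.items() if rule[x] != y)


def worst_case_risk(rule):
    """Inner maximization: largest expected loss over the uncertainty set."""
    return max(expected_01_loss(rule, p) for p in UNCERTAINTY_SET)


# Outer minimization over all deterministic rules h: X -> Y.
rules = [
    dict(zip(X_VALUES, labels))
    for labels in product(Y_VALUES, repeat=len(X_VALUES))
]
best_rule = min(rules, key=worst_case_risk)
print(best_rule, worst_case_risk(best_rule))  # {0: 0, 1: 1} 0.4
```

The selected rule is the one whose worst-case expected loss is smallest, and that worst-case value plays the role of an upper bound on the error for any distribution in the set.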
This is a subclass of the superclass BaseMRC. See Examples of use for further applications of this class and its methods.
See also
For more information about MRC, one can refer to the following resources:
- Parameters:
- loss
str {‘0-1’, ‘log’}, default = ‘0-1’ Type of loss function to use for the risk minimization. The 0-1 loss quantifies the probability of classification error at a certain example for a certain rule. The log loss quantifies the negative log-likelihood at a certain example for a certain rule.
- s
float, default = 0.3 Parameter that tunes the estimation of expected values of the feature mapping function. It is used to calculate \(\lambda\) (variance in the mean estimates for the expectations of the feature mappings) in the following way:
\[\lambda = s * \text{std}(\phi(X,Y)) / \sqrt{\left| X \right|}\]
where (X,Y) is the dataset of training samples and their labels respectively, and \(\text{std}(\phi(X,Y))\) stands for the standard deviation of \(\phi(X,Y)\) in the supervised dataset (X,Y).
- deterministic
bool, default = True Whether the prediction of the labels should be done in a deterministic way (given a fixed random_state in the case of using random Fourier or random ReLU features).
- random_state
int, RandomState instance, default =None Random seed used when ‘fourier’ and ‘relu’ options for feature mappings are used to produce the random weights.
- fit_intercept
bool, default = True Whether to calculate the intercept for MRCs. If set to false, no intercept will be used in calculations (i.e. data is expected to be already centered).
- solver
str {‘cvx’, ‘subgrad’, ‘cg’}, default = ‘subgrad’ Method to use in solving the optimization problem. To choose a solver, you might want to consider the following aspects:
- ‘cvx’
Solves the optimization problem using the CVXPY library. Obtains an accurate solution while requiring more time than the other methods. Note that the library uses the GUROBI solver in CVXPY, for which one might need to request a license. A free license can be requested.
- ‘subgrad’
Solves the optimization using a subgradient approach. The parameter max_iters determines the number of iterations for this approach. More iterations lead to a more accurate solution while requiring more time.
- ‘cg’
Solves the optimization using an algorithm based on constraint generation. This algorithm provides efficient learning, especially for scenarios with a large number of features.
See also
For more information about the constraint generation algorithm for 0-1 MRC, one can refer to the following resource:
- max_iters
int, default =10000 Maximum number of iterations to use for finding the solution of optimization when using the subgradient approach.
- n_max
int, default =100 Maximum number of features selected in each iteration in case of ’cg’ solver.
- k_max
int, default =20 Maximum number of iterations in case of ’cg’ solver.
- eps
float, default =1e-4 Dual constraints’ violation threshold for ’cg’ solver.
- phi
str or BasePhi instance, default = ‘linear’ Type of feature mapping function to use for mapping the input data. The currently available feature mapping methods are ‘fourier’, ‘relu’, ‘threshold’ and ‘linear’. Users can also implement their own feature mapping object (it should be a BasePhi instance) and pass it to this argument. Note that when using ‘fourier’ or ‘relu’ feature mappings, training and testing instances are expected to be normalized. To implement a feature mapping, please go through the Feature Mappings section.
- ‘linear’
It uses the identity feature map, referred to as the linear feature map. See class BasePhi.
- ‘fourier’
It uses the random Fourier feature map. See class RandomFourierPhi.
- ‘relu’
It uses Rectified Linear Unit (ReLU) features. See class RandomReLUPhi.
- ‘threshold’
It uses feature mappings obtained using a threshold. See class ThresholdPhi.
- **phi_kwargs
Additional parameters for feature mappings. Groups the multiple optional parameters for the corresponding feature mappings (phi). For example, in the case of Fourier features, the number of features is given by the n_components parameter, which can be passed as an argument: MRC(loss='log', phi='fourier', n_components=500). The list of arguments for each feature mapping class can be found in the corresponding documentation.
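To give an idea of what a random Fourier feature mapping does and why a random seed matters for it, here is a minimal sketch of the generic technique. It is not the RandomFourierPhi implementation: the function name, the `n_frequencies` and `sigma` arguments, and the cosine/sine layout are assumptions made for illustration only.

```python
import math
import random


def make_random_fourier_map(n_dimensions, n_frequencies, sigma=1.0, seed=0):
    """Return a function x -> [cos(w.x)..., sin(w.x)...] / sqrt(n_frequencies)."""
    rng = random.Random(seed)
    # One Gaussian weight vector per frequency, with standard deviation 1 / sigma.
    weights = [
        [rng.gauss(0.0, 1.0 / sigma) for _ in range(n_dimensions)]
        for _ in range(n_frequencies)
    ]
    scale = 1.0 / math.sqrt(n_frequencies)

    def transform(x):
        # Project the instance onto every random weight vector.
        projections = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]
        cosines = [scale * math.cos(p) for p in projections]
        sines = [scale * math.sin(p) for p in projections]
        return cosines + sines

    return transform


phi = make_random_fourier_map(n_dimensions=3, n_frequencies=250)
z = phi([0.5, -1.0, 2.0])
print(len(z))                 # 500
print(sum(v * v for v in z))  # close to 1, since cos^2 + sin^2 = 1
```

Fixing the seed makes the random weights, and therefore the features, reproducible across runs, which is the role the documentation assigns to random_state.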
Examples
Simple example of using MRC with default settings: 0-1 loss and linear feature mapping. We first load the data and split it into train and test sets. We fit the model with the training samples using the fit function. Then, we predict the class of some test samples with predict. We can also obtain the probabilities of each class with predict_proba. Finally, we calculate the score of the model over the test set using score.
>>> from MRCpy import MRC
>>> from MRCpy.datasets import load_mammographic
>>> from sklearn import preprocessing
>>> from sklearn.model_selection import train_test_split
>>> # Loading the dataset
>>> X, Y = load_mammographic(return_X_y=True)
>>> # Split the dataset into training and test instances
>>> X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=0)
>>> # Standardize the data
>>> std_scale = preprocessing.StandardScaler().fit(X_train, Y_train)
>>> X_train = std_scale.transform(X_train)
>>> X_test = std_scale.transform(X_test)
>>> # Fit the MRC model
>>> clf = MRC().fit(X_train, Y_train)
>>> # Prediction. The predicted values for the first 10 test instances are:
>>> clf.predict(X_test[:10, :])
[1 0 0 0 0 1 0 1 0 0]
>>> # Predicted probabilities.
>>> # The predicted probabilities for the first 10 test instances are:
>>> clf.predict_proba(X_test[:10, :])
[[2.80350905e-01 7.19649095e-01]
 [9.99996406e-01 3.59370941e-06]
 [8.78592959e-01 1.21407041e-01]
 [8.78593719e-01 1.21406281e-01]
 [8.78595619e-01 1.21404381e-01]
 [1.58950511e-01 8.41049489e-01]
 [9.99997060e-01 2.94047920e-06]
 [4.01753510e-01 5.98246490e-01]
 [8.78595322e-01 1.21404678e-01]
 [6.35793570e-01 3.64206430e-01]]
>>> # Calculate the score of the predictor
>>> # (mean accuracy on the given test data and labels)
>>> clf.score(X_test, Y_test)
0.7731958762886598
- Attributes:
- is_fitted_
bool Whether the classifier is fitted i.e., the parameters are learnt or not.
- tau_
array-like of shape (n_features) or float Mean estimates for the expectations of feature mappings.
- lambda_
array-like of shape (n_features) or float Variance in the mean estimates for the expectations of the feature mappings.
- mu_
array-like of shape (n_features) or float Parameters learnt by the optimization.
- nu_
float Parameter learnt by the optimization.
- mu_l_
array-like of shape (n_features) or float Parameters learnt by solving the lower bound optimization of MRC.
- upper_
float Optimized upper bound of the MRC classifier.
- lower_
float Optimized lower bound of the MRC classifier.
- upper_params_
dict Dictionary that stores the optimal points and best value for the upper bound of the function.
- params_
dict Dictionary that stores the optimal points and best value for the lower bound of the function.
Methods
compute_lambda(X, Y) Compute deviation in the mean estimate tau using the given training instances.
compute_phi(X) Compute the feature mapping corresponding to instances given for learning the classifiers (in case of training) and prediction (in case of testing).
compute_tau(X, Y) Compute mean estimate tau using the given training instances.
error(X, Y) Return the mean error obtained for the given test data and labels.
fit(X, Y[, X_]) Fit the MRC model.
get_lower_bound() Obtains the lower bound on the expected loss for the fitted classifier.
get_metadata_routing() Get metadata routing of this object.
get_params([deep]) Get parameters for this estimator.
get_upper_bound() Returns the upper bound on the expected loss for the fitted classifier.
minimax_risk(X, tau_, lambda_, n_classes) Solves the minimax risk problem for different types of loss (0-1 and log loss).
predict(X) Predicts classes for new instances using a fitted model.
predict_proba(X) Conditional probabilities corresponding to each class for each unlabeled input instance.
score(X, y[, sample_weight]) Return the mean accuracy on the given test data and labels.
set_fit_request(*[, X_]) Request metadata passed to the fit method.
set_params(**params) Set the parameters of this estimator.
set_score_request(*[, sample_weight]) Request metadata passed to the score method.
- __init__(loss='0-1', s=0.3, deterministic=True, random_state=None, fit_intercept=True, solver='subgrad', max_iters=10000, n_max=100, k_max=20, eps=0.0001, phi='linear', **phi_kwargs)[source]
- compute_lambda(X, Y)
Compute deviation in the mean estimate tau using the given training instances.
- compute_phi(X)
Compute the feature mapping corresponding to instances given for learning the classifiers (in case of training) and prediction (in case of testing).
- compute_tau(X, Y)
Compute mean estimate tau using the given training instances.
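The two quantities computed by compute_tau and compute_lambda can be sketched in plain Python from the formula given for the parameter s, \(\lambda = s * \text{std}(\phi(X,Y)) / \sqrt{|X|}\). This is an illustration under stated assumptions rather than the library code: the helper names are hypothetical, the feature map is assumed to copy x into the block of the class y (which is consistent with tau_ having n_features*n_classes components), and the population standard deviation is assumed because the text only says "std".

```python
import math
import statistics


def one_hot_features(x, y, n_classes):
    """Assumed feature map: copy x into the block for class y, zeros elsewhere."""
    phi = [0.0] * (len(x) * n_classes)
    start = y * len(x)
    phi[start:start + len(x)] = x
    return phi


def estimate_tau_lambda(X, Y, n_classes, s=0.3):
    """Return (tau, lambda), computed component-wise over the training set."""
    features = [one_hot_features(x, y, n_classes) for x, y in zip(X, Y)]
    columns = list(zip(*features))
    n_samples = len(features)
    # tau: mean of each feature component over the training samples.
    tau = [statistics.fmean(col) for col in columns]
    # lambda: s * std / sqrt(n). Population std is an assumption.
    lam = [s * statistics.pstdev(col) / math.sqrt(n_samples) for col in columns]
    return tau, lam


X = [[1.0, 2.0], [3.0, 0.0], [1.0, 4.0], [3.0, 2.0]]
Y = [0, 1, 0, 1]
tau, lam = estimate_tau_lambda(X, Y, n_classes=2)
print(tau)  # [0.5, 1.5, 1.5, 0.5]
print(lam)
```

A larger s widens the interval around each mean estimate, which enlarges the uncertainty set; more training samples shrink it through the square-root term.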
- error(X, Y)
Return the mean error obtained for the given test data and labels.
- Parameters:
- Returns:
- error
float Mean error of the learned MRC classifier.
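As a plain-Python sketch of what a mean error over test data amounts to, the fraction of misclassified instances can be computed as below. The function name is hypothetical and the code only illustrates the metric, assuming labels are compared for exact equality; it does not call the classifier.

```python
def mean_error(y_true, y_pred):
    """Fraction of instances whose predicted label differs from the true one."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    mistakes = sum(t != p for t, p in zip(y_true, y_pred))
    return mistakes / len(y_true)


y_true = [1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1]
err = mean_error(y_true, y_pred)
print(err)        # 0.4
print(1.0 - err)  # the corresponding mean accuracy
```

Under this definition the mean error and the mean accuracy returned by score always add up to one.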
- fit(X, Y, X_=None)
Fit the MRC model.
Computes the parameters required for the minimax risk optimization and then calls the minimax_risk function to solve the optimization.
- Parameters:
- X
array-like of shape (n_samples, n_dimensions) Training instances used in (1) calculating the expectation estimates that constrain the uncertainty set for the minimax risk classification and (2) solving the minimax risk optimization problem. n_samples is the number of training samples and n_dimensions is the number of features.
- Y
array-like of shape (n_samples, 1), default = None Labels corresponding to the training instances, used only to compute the expectation estimates.
- X_
array-like of shape (n_samples2, n_dimensions), default = None These instances are optional and, when given, will be used in the minimax risk optimization. These extra instances are generally a smaller set and give an advantage in training time.
- Returns:
- self
Fitted estimator
- get_lower_bound()[source]
Obtains the lower bound on the expected loss for the fitted classifier.
- Returns:
- lower_bound
float Lower bound of the error for the fitted classifier.
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routing
MetadataRequest A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep
bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params
dict Parameter names mapped to their values.
- get_upper_bound()[source]
Returns the upper bound on the expected loss for the fitted classifier.
- Returns:
- upper_bound
float Upper bound of the expected loss for the fitted classifier.
- minimax_risk(X, tau_, lambda_, n_classes)[source]
Solves the minimax risk problem for different types of loss (0-1 and log loss). The solution of the default MRC optimization gives the upper bound of the error.
- Parameters:
- X
array-like of shape (n_samples,n_dimensions) Training instances used for solving the minimax risk optimization problem.
- tau_
array-like of shape (n_features*n_classes) Mean estimates for the expectations of feature mappings.
- lambda_
array-like of shape (n_features*n_classes) Variance in the mean estimates for the expectations of the feature mappings.
- n_classes
int Number of labels in the dataset.
- Returns:
- self
Fitted estimator
- predict(X)
Predicts classes for new instances using a fitted model.
Returns the predicted classes for the given instances in X using the probabilities given by the function predict_proba.
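One plausible way to turn class probabilities into labels, in line with the deterministic parameter, is sketched below. It is an assumption about the mechanism and not the library's code: the function name and the `seed` argument are hypothetical. The probability rows are the first three rows of the example output shown earlier.

```python
import random


def predict_from_proba(proba, deterministic=True, seed=None):
    """Turn per-instance class probabilities into predicted labels."""
    if deterministic:
        # Pick the most probable class for each instance.
        return [max(range(len(row)), key=row.__getitem__) for row in proba]
    rng = random.Random(seed)
    classes = range(len(proba[0]))
    # Draw one label per instance according to its probability row.
    return [rng.choices(classes, weights=row, k=1)[0] for row in proba]


proba = [
    [2.80350905e-01, 7.19649095e-01],
    [9.99996406e-01, 3.59370941e-06],
    [8.78592959e-01, 1.21407041e-01],
]
print(predict_from_proba(proba))  # [1, 0, 0]
sampled = predict_from_proba(proba, deterministic=False, seed=0)
print(sampled)
```

The deterministic branch reproduces the first three predicted labels of the example, while the sampled branch can differ between seeds for instances whose probabilities are not close to 0 or 1.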
- predict_proba(X)[source]
Conditional probabilities corresponding to each class for each unlabeled input instance.
- score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
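To see why subset accuracy is described as a harsh metric, consider the small sketch below. The function name and the toy label sets are hypothetical; the code only illustrates the definition, in which a sample counts as correct only if its whole label set is predicted exactly.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose whole label set is predicted exactly."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    matches = sum(set(t) == set(p) for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


y_true = [{"a", "b"}, {"b"}, {"a", "c"}]
y_pred = [{"a", "b"}, {"b", "c"}, {"a"}]
acc = subset_accuracy(y_true, y_pred)
print(acc)  # one exact match out of three samples
```

The second and third predictions each get one label right, yet they contribute nothing to the score. With a single label per sample, as in the example above with Y_test, the metric reduces to ordinary accuracy.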
- set_fit_request(*, X_: bool | None | str = '$UNCHANGED$') -> MRC
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- X_
str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED Metadata routing for X_ parameter in fit.
- Returns:
- self
object The updated object.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params
dict Estimator parameters.
- Returns:
- self
estimator instance Estimator instance.
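The <component>__<parameter> naming convention can be illustrated with a small stand-alone sketch that splits parameter names on the first double underscore. The helper name and the component name `clf` are hypothetical, and the code only mimics the naming rule; it does not call scikit-learn.

```python
def split_nested_params(params):
    """Separate top-level parameters from 'component__parameter' entries."""
    own, nested = {}, {}
    for key, value in params.items():
        # Split on the first double underscore only.
        name, delimiter, sub_key = key.partition("__")
        if delimiter:
            nested.setdefault(name, {})[sub_key] = value
        else:
            own[name] = value
    return own, nested


own, nested = split_nested_params(
    {"verbose": True, "clf__s": 0.5, "clf__loss": "log"}
)
print(own)     # {'verbose': True}
print(nested)  # {'clf': {'s': 0.5, 'loss': 'log'}}
```

In this sketch, a pipeline step named `clf` holding an MRC would receive s=0.5 and loss='log', while `verbose` stays with the enclosing object.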
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> MRC
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- sample_weight
str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED Metadata routing for sample_weight parameter in score.
- Returns:
- self
object The updated object.
Examples using MRCpy.MRC
Example: Predicting COVID-19 patients outcome using MRCs