MRCpy.AMRC

class MRCpy.AMRC(n_classes, loss='0-1', deterministic=True, random_state=None, phi='linear', unidimensional=False, delta=0.05, order=1, W=200, N=100, fit_intercept=False, max_iters=2000, **phi_kwargs)

Adaptative Minimax Risk Classifier

The class AMRC implements Adaptative Minimax Risk Classifiers (AMRCs), the method proposed in [1]. It is designed for online learning with streaming data: training samples are fed sequentially, and the classification rule is updated every time a new sample is provided.

AMRC adapts to concept drift (change in the underlying distribution of the data). Concept drift is common in applications such as electricity price prediction, spam mail filtering, and credit card fraud detection. AMRC accounts for multidimensional time changes by means of a multivariate and high-order tracking of the time-varying underlying distribution. In addition, unlike conventional techniques, AMRCs can provide computable tight performance guarantees at learning.

It implements the 0-1 loss function and can be used with linear and random Fourier features.
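Sequential learning of this kind is commonly evaluated with a predict-then-update loop. The sketch below illustrates that loop only. `MajorityStub` and `prequential_mistakes` are hypothetical names, and the stub merely mimics the fit(x, y) and predict(x) call shapes documented on this page; it is not AMRC and does not adapt to drift.

```python
from collections import Counter


class MajorityStub:
    """Hypothetical stand-in exposing fit(x, y) and predict(x) like AMRC."""

    def __init__(self, n_classes):
        self.n_classes = n_classes
        self.counts = Counter()

    def fit(self, x, y):
        # Update the rule with a single labelled sample.
        self.counts[y] += 1
        return self

    def predict(self, x):
        # Predict the most frequent label seen so far (0 before any update).
        if not self.counts:
            return 0
        return max(range(self.n_classes), key=lambda c: self.counts[c])


def prequential_mistakes(clf, stream):
    """Predict each incoming sample first, then update with its label."""
    mistakes = 0
    for x, y in stream:
        mistakes += int(clf.predict(x) != y)
        clf.fit(x, y)
    return mistakes


stream = [([0.1, 0.2], 1), ([0.3, 0.1], 1), ([0.9, 0.8], 0), ([0.2, 0.2], 1)]
total = prequential_mistakes(MajorityStub(n_classes=2), stream)
```

Any object with the same two methods could be passed to `prequential_mistakes`, which is why the loop is written against the interface rather than a concrete class.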

See also

For more information about AMRC, one can refer to the following paper:

[1] Álvarez, V., Mazuelas, S., & Lozano, J. A. (2022). Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees. International Conference on Machine Learning (ICML) 2022.

    @InProceedings{AlvMazLoz22,
      title     = {Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees},
      author    = {{\'A}lvarez, Ver{\'o}nica and Mazuelas, Santiago and Lozano, Jose A},
      booktitle = {Proceedings of the 39th International Conference on Machine Learning},
      pages     = {486--499},
      year      = {2022},
      volume    = {162},
      series    = {Proceedings of Machine Learning Research},
      month     = {Jul},
      publisher = {PMLR},
    }

Parameters:
n_classes : int

Number of different possible labels for an instance.

deterministic : bool, default = True

Whether the prediction of the labels should be done in a deterministic way (given a fixed random_state in the case of using random Fourier or random ReLU features).

loss : str {‘0-1’}, default = ‘0-1’

Type of loss function to use for the risk minimization. AMRC supports 0-1 loss. 0-1 loss quantifies the probability of classification error at a certain example for a certain rule.

unidimensional : bool, default = False

Whether to model change in the variables unidimensionally or not. Available for comparison purposes.

delta : float, default = 0.05

Significance of the upper bound on the accumulated mistakes. Lower values of delta produce higher bounds.

order : int, default = 1

Order of the subgradients used in optimization.

W : int, default = 200

Window size. The model uses the last W samples for fitting the model.

N : int, default = 100

Number of subgradients used for optimization.

max_iters : int, default = 2000

Maximum number of iterations to use for finding the solution of optimization in the subgradient approach.

phi : str or BasePhi instance, default = ‘linear’

Type of feature mapping function to use for mapping the input data. The currently available feature mapping methods are ‘fourier’, ‘relu’, ‘threshold’ and ‘linear’. Users can also implement their own feature mapping object (it should be a BasePhi instance) and pass it to this argument. Note that when using the ‘fourier’ feature mapping, training and testing instances are expected to be normalized. To implement a feature mapping, please go through the Feature Mappings section.

‘linear’

It uses the identity feature map referred to as Linear feature map. See class BasePhi.

‘fourier’

It uses Random Fourier Feature map. See class RandomFourierPhi.

random_state : int, RandomState instance, default = None

Random seed used when using ‘fourier’ for feature mappings to produce the random weights.

fit_intercept : bool, default = False

Whether to calculate the intercept for MRCs. If set to False, no intercept will be used in calculations (i.e. data is expected to be already centered).

**phi_kwargs : Additional parameters for feature mappings.

Groups the optional parameters for the corresponding feature mapping (phi).

For example, with Fourier features the number of features is set by the n_components parameter, which can be passed as an argument: AMRC(n_classes=2, phi='fourier', n_components=500)

The list of arguments for each feature mapping class can be found in the corresponding documentation.
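The way extra keyword arguments are grouped and forwarded to a feature mapping can be sketched in plain Python. `FourierMapStub` and `ClassifierStub` are hypothetical classes written for this illustration; they show the general **kwargs pattern and are not the library's constructors.

```python
class FourierMapStub:
    """Hypothetical feature mapping that only records its option."""

    def __init__(self, n_components=100):
        self.n_components = n_components


class ClassifierStub:
    """Hypothetical classifier that forwards unknown keywords to phi."""

    def __init__(self, n_classes, phi="linear", **phi_kwargs):
        self.n_classes = n_classes
        # Every keyword not named in the signature lands in this dict.
        self.phi_kwargs = phi_kwargs
        # The dict is unpacked again when the feature mapping is built.
        self.phi = FourierMapStub(**phi_kwargs) if phi == "fourier" else None


clf = ClassifierStub(n_classes=2, phi="fourier", n_components=500)
```

With this pattern the classifier signature does not need to list every option of every feature mapping, at the cost that a misspelled keyword is only detected when the mapping itself is constructed.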

Methods

compute_lambda(X, Y)

Compute deviation in the mean estimate tau using the given training instances.

compute_phi(X)

Compute the feature mapping corresponding to instances given for learning the classifiers (in case of training) and prediction (in case of testing).

compute_tau(X, Y)

Compute mean estimate tau using the given training instances.

error(X, Y)

Return the mean error obtained for the given test data and labels.

fit(x, y[, X_])

Fit the AMRC model.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

get_upper_bound()

Returns the upper bound on the expected loss for the fitted classifier.

get_upper_bound_accumulated()

Returns the upper bound on the accumulated mistakes of the fitted classifier.

minimax_risk(x, tau_, lambda_, n_classes)

Learning

predict(X)

Predicts classes for new instances using a fitted model.

predict_proba(x)

Conditional probabilities corresponding to each class for each unlabeled input instance.

score(X, y[, sample_weight])

Return the mean accuracy on the given test data and labels.

set_fit_request(*[, X_, x])

Request metadata passed to the fit method.

set_params(**params)

Set the parameters of this estimator.

set_predict_proba_request(*[, x])

Request metadata passed to the predict_proba method.

set_score_request(*[, sample_weight])

Request metadata passed to the score method.

__init__(n_classes, loss='0-1', deterministic=True, random_state=None, phi='linear', unidimensional=False, delta=0.05, order=1, W=200, N=100, fit_intercept=False, max_iters=2000, **phi_kwargs)
compute_lambda(X, Y)

Compute deviation in the mean estimate tau using the given training instances.

Parameters:
X : array-like of shape (n_samples, n_dimensions)

Training instances used for solving the minimax risk optimization problem.

Y : array-like of shape (n_samples, 1), default = None

Labels corresponding to the training instances used only to compute the expectation estimates.

compute_phi(X)

Compute the feature mapping corresponding to instances given for learning the classifiers (in case of training) and prediction (in case of testing).

Parameters:
X : array-like of shape (n_samples, n_dimensions)

Instances to be converted to features.
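To make the idea of a random Fourier feature mapping concrete, here is a minimal standard-library sketch of the generic cosine/sine construction for a Gaussian kernel. It is an assumption-laden illustration: the function name, the `gamma` bandwidth, and the output layout and scaling are chosen here and may differ from what RandomFourierPhi actually computes.

```python
import math
import random


def random_fourier_features(x, weights):
    """Map x to [cos(w.x), ..., sin(w.x), ...] / sqrt(D) for D directions w."""
    n_directions = len(weights)
    projections = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]
    scale = 1.0 / math.sqrt(n_directions)
    cos_part = [scale * math.cos(p) for p in projections]
    sin_part = [scale * math.sin(p) for p in projections]
    return cos_part + sin_part


rng = random.Random(0)  # fixed seed so the random directions are reproducible
n_dimensions, n_directions = 3, 4
gamma = 1.0  # assumed kernel bandwidth for exp(-gamma * ||x - x'||^2)
weights = [
    [rng.gauss(0.0, math.sqrt(2.0 * gamma)) for _ in range(n_dimensions)]
    for _ in range(n_directions)
]
features = random_fourier_features([0.1, -0.2, 0.3], weights)
```

In this layout each random direction contributes one cosine and one sine coordinate, so the mapped vector has twice as many entries as there are directions and unit Euclidean norm.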

compute_tau(X, Y)

Compute mean estimate tau using the given training instances.

Parameters:
X : array-like of shape (n_samples, n_dimensions)

Training instances used for solving the minimax risk optimization problem.

Y : array-like of shape (n_samples, 1), default = None

Labels corresponding to the training instances used only to compute the expectation estimates.
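As a rough illustration of a mean estimate over labelled samples, the sketch below averages a class-indexed feature vector phi(x, y). It assumes a one-hot block layout (x is copied into the block of class y) and a plain sample average; `one_hot_phi` and `mean_estimate` are hypothetical helpers, and the estimate AMRC maintains over time may be computed differently.

```python
def one_hot_phi(x, y, n_classes):
    """Place x in the block of class y; all other blocks stay zero."""
    d = len(x)
    out = [0.0] * (n_classes * d)
    out[y * d:(y + 1) * d] = x
    return out


def mean_estimate(X, Y, n_classes):
    """Average phi(x, y) coordinate-wise over the labelled samples."""
    mapped = [one_hot_phi(x, y, n_classes) for x, y in zip(X, Y)]
    n_samples = len(mapped)
    return [sum(column) / n_samples for column in zip(*mapped)]


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
Y = [0, 1, 0]
tau = mean_estimate(X, Y, n_classes=2)
```

The resulting vector has n_classes times n_dimensions entries, one block per class, which is why the labels are needed to compute it.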

error(X, Y)

Return the mean error obtained for the given test data and labels.

Parameters:
X : array-like of shape (n_samples, n_dimensions)

Test instances for which the labels are to be predicted by the MRC model.

Y : array-like of shape (n_samples, 1), default = None

Labels corresponding to the testing instances used to compute the error in the prediction.

Returns:
error : float

Mean error of the learned MRC classifier.
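Under the 0-1 loss, the mean error is the fraction of misclassified instances, so it is the complement of the mean accuracy. A small sketch of that relationship, using a hypothetical helper `mean_error` rather than the library method:

```python
def mean_error(y_true, y_pred):
    """Fraction of positions where the predicted label differs."""
    mismatches = sum(int(t != p) for t, p in zip(y_true, y_pred))
    return mismatches / len(y_true)


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
err = mean_error(y_true, y_pred)  # one mismatch out of four
acc = 1.0 - err                   # mean accuracy on the same data
```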

fit(x, y, X_=None)

Fit the AMRC model.

Computes the parameters required for the minimax risk optimization and then calls the minimax_risk function to solve the optimization.

Parameters:
x : array-like of shape (n_features)

Training instances used in

  • Calculating the expectation estimates that constrain the uncertainty set for the minimax risk classification

  • Solving the minimax risk optimization problem.

y : int, default = None

Label corresponding to the training instance used only to compute the expectation estimates.

X_ : None

Unused in AMRC

Returns:
self

Fitted estimator

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

get_upper_bound()

Returns the upper bound on the expected loss for the fitted classifier.

Returns:
upper_bound : float

Upper bound of the expected loss for the fitted classifier.

get_upper_bound_accumulated()

Returns the upper bound on the accumulated mistakes of the fitted classifier.

Returns:
upper_bound_accumulated : float

Upper bound on the accumulated mistakes of the fitted classifier.
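To see why a smaller delta yields a larger bound, consider a generic concentration-style slack term of the form sqrt(2 T log(1/delta)) added to a sum of per-step bounds. This form is an assumption made for illustration, and `confidence_slack` is a hypothetical helper; the exact expression used by the library is not stated on this page and may differ.

```python
import math


def confidence_slack(n_steps, delta):
    """Assumed slack term: grows as delta shrinks and as n_steps grows."""
    return math.sqrt(2.0 * n_steps * math.log(1.0 / delta))


slack_05 = confidence_slack(1000, 0.05)  # documented default significance
slack_01 = confidence_slack(1000, 0.01)  # stricter significance, larger slack
```

The dependence on delta is only logarithmic, so tightening the significance level increases such a term slowly.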

minimax_risk(x, tau_, lambda_, n_classes)

Learning

This function efficiently learns the classifier parameters.

predict(X)

Predicts classes for new instances using a fitted model.

Returns the predicted classes for the given instances in X using the probabilities given by the function predict_proba.

Parameters:
X : array-like of shape (n_features)

Test instance to be classified by the AMRC model.

Returns:
y_pred : int

Predicted label corresponding to the given instance.
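The step from class probabilities to a label, and the role of the deterministic option, can be sketched as follows. This is an illustration under an assumption: a deterministic rule is taken to pick the most probable class, and a non-deterministic rule to sample a class according to the probabilities. `predict_label` is a hypothetical helper, not the library's predict.

```python
import random


def predict_label(proba, deterministic=True, rng=None):
    """Turn a probability vector into a single class index."""
    classes = range(len(proba))
    if deterministic:
        # Most probable class; ties resolve to the lowest index.
        return max(classes, key=proba.__getitem__)
    # Randomized rule: draw one class with the given probabilities.
    rng = rng or random.Random()
    return rng.choices(classes, weights=proba, k=1)[0]


proba = [0.2, 0.7, 0.1]
det_label = predict_label(proba)
rand_label = predict_label(proba, deterministic=False, rng=random.Random(0))
```

Passing a seeded generator makes the randomized draw reproducible, which mirrors the purpose of a fixed random_state.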

predict_proba(x)

Conditional probabilities corresponding to each class for each unlabeled input instance.

Parameters:
x : array-like of shape (n_dimensions)

Testing instance for which the prediction probabilities are calculated for each class.

Returns:
h : ndarray of shape (n_classes)

Probabilities \((p(y|x))\) corresponding to the predictions for each class.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:
X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:
score : float

Mean accuracy of self.predict(X) w.r.t. y.

set_fit_request(*, X_: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$') → AMRC

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
X_ : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for X_ parameter in fit.

x : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for x parameter in fit.

Returns:
self : object

The updated object.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.

set_predict_proba_request(*, x: bool | None | str = '$UNCHANGED$') → AMRC

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict_proba.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
x : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for x parameter in predict_proba.

Returns:
self : object

The updated object.

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → AMRC

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns:
self : object

The updated object.

Examples using MRCpy.AMRC

Example: Use of AMRC (Adaptative MRC) for Online Learning