MRCpy.AMRC

class MRCpy.AMRC(n_classes, loss='0-1', deterministic=True, random_state=None, phi='linear', unidimensional=False, delta=0.05, order=1, W=200, N=100, fit_intercept=False, max_iters=2000, **phi_kwargs)

Adaptive Minimax Risk Classifier

This class implements Adaptive Minimax Risk Classifiers (AMRCs) proposed in [1] for online learning with streaming data. Training samples are fed sequentially and the classification rule is updated every time a new sample is provided.

At each time step \(t\), the classifier solves the minimax risk problem:

\[\mathrm{h}_t^{\mathcal{U}_t} \in \arg\min_{\mathrm{h}} \max_{\mathrm{p} \in \mathcal{U}_t} \ell(\mathrm{h}, \mathrm{p})\]

which finds the classifier \(\mathrm{h}_t\) that minimizes the worst-case expected 0-1 loss over a time-varying uncertainty set \(\mathcal{U}_t\) of distributions.

The uncertainty set \(\mathcal{U}_t\) has the same form as the set \(\mathcal{U}_1\) used by MRCpy.MRC (no marginal constraints), but with time-varying parameters:

\[\mathcal{U}_t = \left\{ \mathrm{p} : \left| \mathbb{E}_{\mathrm{p}}[\Phi(x,y)] - \boldsymbol{\tau}_t \right| \leq \boldsymbol{\lambda}_t \right\}\]

where \(\boldsymbol{\tau}_t\) and \(\boldsymbol{\lambda}_t\) are updated at each time step via Kalman filtering to track the time-varying underlying distribution.
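As a rough illustration of how a Kalman recursion can produce a mean estimate and a confidence width for one component of \(\boldsymbol{\tau}_t\) and \(\boldsymbol{\lambda}_t\), the sketch below runs a scalar random-walk filter and then checks the componentwise constraint that defines \(\mathcal{U}_t\). The function names, the noise variances, and the choice of one standard deviation as the confidence width are assumptions made for this example; they are not taken from the library.

```python
import math


def kalman_step(mean, var, obs, process_var, obs_var):
    """One predict/update step of a scalar random-walk Kalman filter."""
    # Predict: the state is assumed to stay put while its uncertainty grows.
    pred_mean = mean
    pred_var = var + process_var
    # Update: blend prediction and observation using the Kalman gain.
    gain = pred_var / (pred_var + obs_var)
    new_mean = pred_mean + gain * (obs - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


def in_uncertainty_set(expectation, tau, lam):
    """Check |E_p[Phi] - tau| <= lambda componentwise."""
    return all(abs(e - t) <= l for e, t, l in zip(expectation, tau, lam))


# Track a single feature-expectation component through a slow drift.
tau, var = 0.0, 1.0
for obs in [0.1, 0.2, 0.3, 0.4]:
    tau, var = kalman_step(tau, var, obs, process_var=0.01, obs_var=0.25)

# Assumption: the confidence width is taken as one standard deviation.
lam = math.sqrt(var)
print(tau, lam, in_uncertainty_set([tau + 0.5 * lam], [tau], [lam]))
```

A distribution whose feature expectation lies within `lam` of `tau` in every component belongs to the set; anything further away is excluded.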

AMRC provides adaptation to concept drift (change in the underlying distribution of the data) by means of a multivariate and high-order tracking of the distribution. The mean vector estimates \(\boldsymbol{\tau}_t\) and confidence vectors \(\boldsymbol{\lambda}_t\) are obtained from a linear dynamical system model with Kalman filter recursions. The kinematic model order (controlled by the order parameter) determines the complexity of the tracking: order 0 corresponds to a zero-order model, order 1 to white noise acceleration, and order 2 to Wiener process acceleration.
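To make the role of the model order more concrete, the sketch below builds the textbook state-transition matrix of a kinematic model, in which the state holds a value and its first `order` derivatives. Whether the library constructs its matrices in exactly this way is an assumption; `transition_matrix`, `propagate`, and the time step `dt` are names introduced only for this example.

```python
from math import factorial


def transition_matrix(order, dt=1.0):
    """Upper-triangular Taylor-expansion matrix of size (order + 1)."""
    n = order + 1
    return [
        [dt ** (j - i) / factorial(j - i) if j >= i else 0.0 for j in range(n)]
        for i in range(n)
    ]


def propagate(matrix, state):
    """Advance the state one time step (noise-free prediction)."""
    return [sum(a * s for a, s in zip(row, state)) for row in matrix]


# Order 0 keeps the value constant; order 1 adds a velocity term;
# order 2 adds an acceleration term.
for order in (0, 1, 2):
    print(order, transition_matrix(order))
```

With order 1, a state of value 1.0 and velocity 0.5 is predicted to move to value 1.5 after one step, which is how a higher-order model can anticipate a steady drift instead of lagging behind it.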

AMRC provides computable and tight performance guarantees during learning. The instantaneous error probability satisfies \(R(\mathrm{h}_t) \leq R(\mathcal{U}_t) + \alpha_t\), and the accumulated mistakes are bounded using the sum of minimax risks and a confidence term controlled by the delta parameter.

It implements the 0-1 loss function and can be used with linear and Random Fourier features.

See [1] for further details.
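For readers unfamiliar with the Random Fourier features mentioned above, here is a minimal pure-Python sketch of the general technique: project the input onto random Gaussian directions and take cosines and sines. It is not the library's RandomFourierPhi implementation; `make_rff`, `gamma`, and `seed` are hypothetical names chosen for this example, and the weight scale assumes the kernel \(\exp(-\gamma \|x - x'\|^2)\).

```python
import math
import random


def make_rff(n_dimensions, n_components, gamma=1.0, seed=0):
    """Return a function mapping x to 2 * n_components random features."""
    rng = random.Random(seed)
    # Gaussian weights with standard deviation sqrt(2 * gamma).
    std = math.sqrt(2.0 * gamma)
    weights = [
        [rng.gauss(0.0, std) for _ in range(n_dimensions)]
        for _ in range(n_components)
    ]

    def feature_map(x):
        projections = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
        scale = 1.0 / math.sqrt(n_components)
        return ([scale * math.cos(p) for p in projections]
                + [scale * math.sin(p) for p in projections])

    return feature_map


phi = make_rff(n_dimensions=3, n_components=5, seed=1)
features = phi([0.1, 0.2, 0.3])
print(len(features))  # prints 10
```

Fixing the seed makes the random weights reproducible, so the same input always maps to the same features.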

Parameters:
n_classes : int

Number of different possible labels for an instance.

deterministic : bool, default = True

Whether the prediction of the labels should be done in a deterministic way (given a fixed random_state in the case of using random Fourier features).

loss : str {‘0-1’}, default = ‘0-1’

Type of loss function to use for the risk minimization. AMRC supports 0-1 loss only. 0-1 loss quantifies the probability of classification error at a certain example for a certain rule.

unidimensional : bool, default = False

Whether to model change in the variables unidimensionally or not. When True, the kinematic model order is forced to 0 and tracking is performed independently per dimension. Available for comparison purposes.

delta : float, default = 0.05

Significance level for the upper bound on accumulated mistakes. Lower values produce higher (more conservative) bounds.

order : int, default = 1

Order of the kinematic model used for tracking the time-varying distribution. Controls the complexity of the Kalman filter:

  • 0: Zero-order model (constant state).

  • 1: White noise acceleration model.

  • 2: Wiener process acceleration model.

Ignored when unidimensional=True (forced to 0).

W : int, default = 200

Window size for estimating label probabilities. The model uses the last W samples for sliding-window probability estimation.

N : int, default = 100

Maximum number of subgradients retained for the local approximation of \(\varphi(\cdot)\) in the optimization.

max_iters : int, default = 2000

Maximum number of iterations for the accelerated subgradient method used to solve the minimax risk optimization.

phi : str or BasePhi instance, default = ‘linear’

Type of feature mapping function to use for mapping the input data. The currently available feature mapping methods are ‘fourier’ and ‘linear’. The users can also implement their own feature mapping object (should be a BasePhi instance) and pass it to this argument. Note that when using ‘fourier’ feature mapping, training and testing instances are expected to be normalized. To implement a feature mapping, please go through the Feature Mappings section.

‘linear’

Uses the identity feature map, referred to as the linear feature map. See class BasePhi.

‘fourier’

Uses the random Fourier feature map. See class RandomFourierPhi.

random_state : int, RandomState instance, default = None

Random seed used to produce the random weights when phi is ‘fourier’.

fit_intercept : bool, default = False

Whether to calculate the intercept for MRCs. If set to false, no intercept will be used in calculations (i.e. data is expected to be already centered).

**phi_kwargs : Additional parameters for feature mappings.

Groups the optional parameters for the corresponding feature mapping (phi).

For example, with Fourier features the number of features is set by the n_components parameter, which can be passed as AMRC(n_classes=2, phi='fourier', n_components=500).

The list of arguments for each feature mapping class can be found in the corresponding documentation.
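The sliding-window estimation of label probabilities controlled by the W parameter can be pictured with the following sketch. It only illustrates the idea of keeping the last W labels and counting them; `SlidingLabelEstimator` is a hypothetical helper written for this example and is not part of MRCpy.

```python
from collections import Counter, deque


class SlidingLabelEstimator:
    """Estimate label probabilities from the most recent labels only."""

    def __init__(self, n_classes, window):
        self.n_classes = n_classes
        # A deque with maxlen drops the oldest label automatically.
        self.labels = deque(maxlen=window)

    def update(self, y):
        self.labels.append(y)

    def probabilities(self):
        if not self.labels:
            # Assumption: fall back to a uniform estimate before any data.
            return [1.0 / self.n_classes] * self.n_classes
        counts = Counter(self.labels)
        return [counts[c] / len(self.labels) for c in range(self.n_classes)]


estimator = SlidingLabelEstimator(n_classes=2, window=3)
for label in [0, 0, 1, 1]:
    estimator.update(label)
print(estimator.probabilities())
```

A smaller window reacts faster to a change in label frequencies but gives noisier estimates; a larger window is smoother but slower to adapt.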

See also

MRCpy.MRC

MRC using uncertainty set \(\mathcal{U}_1\) without marginal constraints [2].

MRCpy.CMRC

CMRC using uncertainty set \(\mathcal{U}_2\) with marginal constraints [3].

References

[1]

Álvarez, V., Mazuelas, S., & Lozano, J.A. (2022). Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees. In Proceedings of the 39th International Conference on Machine Learning, pp. 486-499.

[2]

Mazuelas, S., Zanoni, A., & Pérez, A. (2020). Minimax Classification with 0-1 Loss and Performance Guarantees. Advances in Neural Information Processing Systems, 33, 302-312.

[3]

Mazuelas, S., Shen, Y., & Pérez, A. (2022). Generalized Maximum Entropy for Supervised Classification. IEEE Transactions on Information Theory, 68(4), 2530-2550.

Attributes:
is_fitted_ : bool

Whether the classifier is fitted i.e., the parameters are learnt.

mu : array-like of shape (m, 1)

Classifier parameters learnt by the minimax risk optimization.

varphi : float

Value of the \(\varphi\) function at the current solution.

sample_counter : int

Number of samples processed so far.

params_ : dict

Dictionary storing optimization state including Kalman filter parameters, subgradient approximation matrices, and upper bounds.

Methods

error(X, Y)

Return the mean error obtained for the given test data and labels.

fit(x, y[, X_])

Fit the AMRC model.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

get_upper_bound()

Returns the upper bound on the expected loss for the fitted classifier.

get_upper_bound_accumulated()

Returns the upper bound on the accumulated mistakes of the fitted classifier.

minimax_risk(x, tau_, lambda_, n_classes)

Learn classifier parameters via minimax risk optimization.

predict(X)

Predict the class for a new instance using the fitted model.

predict_proba(x)

Compute conditional probabilities for each class.

score(X, y[, sample_weight])

Return the mean accuracy on the given test data and labels.

set_fit_request(*[, X_, x])

Request metadata passed to the fit method.

set_params(**params)

Set the parameters of this estimator.

set_predict_proba_request(*[, x])

Request metadata passed to the predict_proba method.

set_score_request(*[, sample_weight])

Request metadata passed to the score method.

__init__(n_classes, loss='0-1', deterministic=True, random_state=None, phi='linear', unidimensional=False, delta=0.05, order=1, W=200, N=100, fit_intercept=False, max_iters=2000, **phi_kwargs)

Initialize self. See help(type(self)) for accurate signature.

error(X, Y)

Return the mean error obtained for the given test data and labels.

Parameters:
X : array-like of shape (n_samples, n_dimensions)

Test instances for which the labels are to be predicted by the MRC model.

Y : array-like of shape (n_samples, 1), default=None

Labels corresponding to the testing instances used to compute the error in the prediction.

Returns:
error : float

Mean error of the learned MRC classifier.
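The quantity returned here is the average 0-1 loss over the test set. The sketch below computes it by hand for a list of predictions, assuming the predictions are already available; `mean_error` is a hypothetical helper for this example, not a library function.

```python
def mean_error(y_pred, y_true):
    """Fraction of instances whose predicted label differs from the truth."""
    if len(y_pred) != len(y_true):
        raise ValueError("y_pred and y_true must have the same length")
    mistakes = sum(int(p != t) for p, t in zip(y_pred, y_true))
    return mistakes / len(y_true)


y_true = [0, 1, 0, 0]
y_pred = [0, 1, 1, 0]
error = mean_error(y_pred, y_true)
accuracy = 1.0 - error  # mean accuracy is the complement of mean error
print(error, accuracy)  # prints 0.25 0.75
```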

fit(x, y, X_=None)

Fit the AMRC model.

Computes the parameters required for the minimax risk optimization and then calls the minimax_risk function to solve the optimization. Designed for online learning where samples are fed sequentially.

Parameters:
x : array-like of shape (n_dimensions,)

Training instance used in

  • Calculating the expectation estimates that constrain the uncertainty set for the minimax risk classification

  • Solving the minimax risk optimization problem.

y : int

Label corresponding to the training instance used only to compute the expectation estimates.

X_ : None

Unused in AMRC. Kept for API compatibility with BaseMRC.

Returns:
self :

Fitted estimator
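Because fit takes one instance at a time, a typical online evaluation predicts each incoming sample with the current rule and only then updates the rule with the true label. The sketch below shows that loop for any object exposing fit(x, y) and predict(x). `LastLabelClassifier` is a hypothetical stand-in used so the example runs without MRCpy installed; it is not an AMRC and has none of its guarantees.

```python
class LastLabelClassifier:
    """Hypothetical stand-in with the same fit/predict call shape."""

    def __init__(self):
        self.last_label = 0

    def fit(self, x, y):
        self.last_label = y
        return self

    def predict(self, x):
        return self.last_label


def prequential_mistakes(clf, X, Y):
    """Predict each sample before learning from it; record 0-1 mistakes."""
    mistakes = []
    for t, (x, y) in enumerate(zip(X, Y)):
        if t > 0:  # nothing has been learnt before the first sample
            mistakes.append(int(clf.predict(x) != y))
        clf.fit(x, y)
    return mistakes


X = [[0.0], [0.1], [0.9], [1.0]]
Y = [0, 0, 1, 1]
print(prequential_mistakes(LastLabelClassifier(), X, Y))  # prints [0, 1, 0]
```

With MRCpy installed, an AMRC instance such as AMRC(n_classes=2) could be passed in place of the stand-in, since the documented fit and predict methods also operate on a single instance.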

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

get_upper_bound()

Returns the upper bound on the expected loss for the fitted classifier.

Returns:
upper_bound : float

Upper bound of the expected loss for the fitted classifier.

get_upper_bound_accumulated()

Returns the upper bound on the accumulated mistakes of the fitted classifier.

Returns:
upper_bound_accumulated : float

Upper bound on the accumulated mistakes of the fitted classifier.
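To give a feel for how delta influences a bound of this kind, the sketch below adds a Hoeffding-style confidence term to the sum of per-step minimax risks. The exact expression used by the library is not stated on this page, so the formula \(\sqrt{2 t \log(1/\delta)}\) and the helper `accumulated_bound` are assumptions made for illustration only.

```python
import math


def accumulated_bound(minimax_risks, delta=0.05):
    """Sum of per-step risks plus an assumed Hoeffding-style confidence term."""
    t = len(minimax_risks)
    # Assumed form of the confidence term; it grows as delta shrinks.
    confidence = math.sqrt(2.0 * t * math.log(1.0 / delta))
    return sum(minimax_risks) + confidence


risks = [0.1] * 100
print(accumulated_bound(risks, delta=0.05))
print(accumulated_bound(risks, delta=0.5))
```

Under this assumed form, a smaller delta yields a larger (more conservative) bound, which matches the description of the delta parameter.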

minimax_risk(x, tau_, lambda_, n_classes)

Learn classifier parameters via minimax risk optimization.

This function efficiently learns classifier parameters by solving the minimax risk optimization problem using a subgradient approach.

Parameters:
x : array-like of shape (n_dimensions,)

Training instance used for solving the minimax risk optimization problem.

tau_ : array-like of shape (m, 1)

Mean estimates for the expectations of feature mappings.

lambda_ : array-like of shape (m, 1)

Variance in the mean estimates for the expectations of the feature mappings.

n_classes : int

Number of labels in the dataset.

Returns:
self :

Fitted estimator with updated mu, is_fitted_, varphi, and params_ attributes.

predict(X)

Predict the class for a new instance using the fitted model.

Returns the predicted class for the given instance in X using the probabilities given by the function predict_proba.

Parameters:
X : array-like of shape (n_dimensions,)

Test instance to predict by the AMRC model.

Returns:
y_pred : int

Predicted label corresponding to the given instance.
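The step from class probabilities to a label can be sketched as follows. The sketch assumes that deterministic prediction picks the most probable class and that non-deterministic prediction samples a class according to the probabilities; this is an interpretation of the deterministic parameter, and `predict_label` is a hypothetical helper, not the library's code.

```python
import random


def predict_label(proba, deterministic=True, rng=None):
    """Turn a vector of class probabilities into a single label."""
    classes = range(len(proba))
    if deterministic:
        # Pick the class with the highest probability.
        return max(classes, key=lambda c: proba[c])
    # Otherwise sample a class with the given probabilities as weights.
    rng = rng or random.Random()
    return rng.choices(classes, weights=proba, k=1)[0]


proba = [0.2, 0.7, 0.1]
print(predict_label(proba))  # prints 1
print(predict_label(proba, deterministic=False, rng=random.Random(0)))
```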

predict_proba(x)

Compute conditional probabilities for each class.

Parameters:
x : array-like of shape (n_dimensions,)

Testing instance for which the prediction probabilities are calculated for each class.

Returns:
h : ndarray of shape (n_classes,)

Conditional probabilities \(p(y|x)\) corresponding to the predictions for each class.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters:
X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:
score : float

Mean accuracy of self.predict(X) w.r.t. y.

set_fit_request(*, X_: Union[bool, None, str] = '$UNCHANGED$', x: Union[bool, None, str] = '$UNCHANGED$') → MRCpy.amrc.AMRC

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to fit.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
X_ : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for X_ parameter in fit.

x : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for x parameter in fit.

Returns:
self : object

The updated object.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.

set_predict_proba_request(*, x: Union[bool, None, str] = '$UNCHANGED$') → MRCpy.amrc.AMRC

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict_proba.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
x : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for x parameter in predict_proba.

Returns:
self : object

The updated object.

set_score_request(*, sample_weight: Union[bool, None, str] = '$UNCHANGED$') → MRCpy.amrc.AMRC

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

New in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:
sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score.

Returns:
self : object

The updated object.

Examples using MRCpy.AMRC

Example: Use of AMRC (Adaptative MRC) for Online Learning
