MRCpy.AMRC
- class MRCpy.AMRC(n_classes, loss='0-1', deterministic=True, random_state=None, phi='linear', unidimensional=False, delta=0.05, order=1, W=200, N=100, fit_intercept=False, max_iters=2000, **phi_kwargs)
Adaptative Minimax Risk Classifier
The class AMRC implements the Adaptative Minimax Risk Classifiers (AMRCs) method proposed in [1]. It is designed for online learning with streaming data: training samples are fed sequentially, and the classification rule is updated every time a new sample is provided.
AMRC adapts to concept drift (changes in the underlying distribution of the data). Such drift is common in applications including electricity price prediction, spam mail filtering, and credit card fraud detection. AMRC accounts for multidimensional time changes by means of a multivariate and high-order tracking of the time-varying underlying distribution. In addition, unlike conventional techniques, AMRCs can provide computable tight performance guarantees at learning.
It implements the 0-1 loss function and can be used with linear and random Fourier features.
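The predict-then-update protocol described above can be sketched as follows. This is a minimal illustration and not the AMRC algorithm: `WindowedMeanClassifier` and `prequential_error` are hypothetical names, and the stand-in model only mimics the documented `fit(x, y)` and `predict(x)` calls while keeping the last `W` samples. In real use an `AMRC` instance would take its place.

```python
from collections import deque


class WindowedMeanClassifier:
    """Hypothetical stand-in for a streaming classifier (NOT the AMRC algorithm).

    Keeps the last W labelled samples and predicts the class whose
    windowed mean is closest to the query point.
    """

    def __init__(self, n_classes, W=200):
        self.n_classes = n_classes
        self.window = deque(maxlen=W)  # old samples fall out automatically

    def fit(self, x, y):
        # One labelled sample per call, as in online learning.
        self.window.append((list(x), y))
        return self

    def predict(self, x):
        best_label, best_dist = 0, float("inf")
        for label in range(self.n_classes):
            points = [p for p, lab in self.window if lab == label]
            if not points:
                continue  # class not seen inside the current window
            mean = [sum(col) / len(points) for col in zip(*points)]
            dist = sum((a - b) ** 2 for a, b in zip(x, mean))
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label


def prequential_error(model, stream):
    """Predict each sample first, then reveal its label and update the model."""
    mistakes = 0
    n = 0
    for x, y in stream:
        mistakes += int(model.predict(x) != y)
        model.fit(x, y)
        n += 1
    return mistakes / n


# Toy stream with an abrupt concept drift: the class positions swap halfway.
stream = [([0.0], 0), ([1.0], 1)] * 20 + [([0.0], 1), ([1.0], 0)] * 20
error_rate = prequential_error(WindowedMeanClassifier(n_classes=2, W=10), stream)
```

Because the window is short, the toy model makes a few mistakes right after the drift and then recovers once the old samples have left the window.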
See also
For more information about AMRC, one can refer to the following paper:
[1] Álvarez, V., Mazuelas, S., & Lozano, J. A. (2022). Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees. International Conference on Machine Learning (ICML) 2022.
@InProceedings{AlvMazLoz22,
  title = {Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees},
  author = {{\'A}lvarez, Ver{\'o}nica and Mazuelas, Santiago and Lozano, Jose A},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages = {486--499},
  year = {2022},
  volume = {162},
  series = {Proceedings of Machine Learning Research},
  month = {Jul},
  publisher = {PMLR},
}
- Parameters:
- n_classes : int
  Number of different possible labels for an instance.
- deterministic : bool, default = True
  Whether the labels should be predicted in a deterministic way (given a fixed random_state in the case of using random Fourier or random ReLU features).
- loss : str {‘0-1’}, default = ‘0-1’
  Type of loss function to use for the risk minimization. AMRC supports the 0-1 loss, which quantifies the probability of classification error at a certain example for a certain rule.
- unidimensional : bool, default = False
  Whether to model change in the variables unidimensionally or not. Available for comparison purposes.
- delta : float, default = 0.05
  Significance of the upper bound on the accumulated mistakes. Lower values produce higher bounds.
- order : int, default = 1
  Order of the subgradients used in optimization.
- W : int, default = 200
  Window size. The model uses the last W samples for fitting.
- N : int, default = 100
  Number of subgradients used for optimization.
- max_iters : int, default = 2000
  Maximum number of iterations to use for finding the solution of the optimization in the subgradient approach.
- phi : str or BasePhi instance, default = ‘linear’
  Type of feature mapping function to use for mapping the input data. The currently available feature mapping methods are ‘fourier’, ‘relu’, ‘threshold’ and ‘linear’. Users can also implement their own feature mapping object (which should be a BasePhi instance) and pass it to this argument. Note that when using the ‘fourier’ feature mapping, training and testing instances are expected to be normalized. To implement a feature mapping, please go through the Feature Mappings section.
  - ‘linear’
    It uses the identity feature map, referred to as the linear feature map. See class BasePhi.
  - ‘fourier’
    It uses the random Fourier feature map. See class RandomFourierPhi.
- random_state : int, RandomState instance, default = None
  Random seed used to produce the random weights when using ‘fourier’ feature mappings.
- fit_intercept : bool, default = False
  Whether to calculate the intercept for MRCs. If set to False, no intercept will be used in calculations (i.e. data is expected to be already centered).
- **phi_kwargs : Additional parameters for feature mappings.
  Groups the multiple optional parameters for the corresponding feature mapping (phi). For example, in the case of Fourier features, the number of features is given by the n_components parameter, which can be passed as an argument: AMRC(n_classes=2, phi='fourier', n_components=500). The list of arguments for each feature mapping class can be found in the corresponding documentation.
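As a rough illustration of what a ‘fourier’ feature mapping with `n_components` features does, the sketch below builds a generic random Fourier feature map in plain Python. It is a sketch under stated assumptions and not the code of RandomFourierPhi: the Gaussian weights, the `gamma` scale, the cosine/sine layout, and the helper name `random_fourier_features` are all illustrative choices.

```python
import math
import random


def random_fourier_features(x, n_components=500, gamma=1.0, seed=0):
    """Generic random Fourier feature map (illustrative, not MRCpy's code).

    Draws n_components // 2 Gaussian weight vectors and returns the scaled
    cosine and sine of each projection, so the output has n_components
    entries when n_components is even.
    """
    rng = random.Random(seed)  # fixed seed -> same random weights every call
    half = n_components // 2
    scale = math.sqrt(1.0 / half)
    cos_part, sin_part = [], []
    for _ in range(half):
        # Weights ~ N(0, 2 * gamma), the choice that targets an RBF kernel.
        w = [rng.gauss(0.0, math.sqrt(2.0 * gamma)) for _ in x]
        proj = sum(wi * xi for wi, xi in zip(w, x))
        cos_part.append(scale * math.cos(proj))
        sin_part.append(scale * math.sin(proj))
    return cos_part + sin_part


features = random_fourier_features([0.2, -0.1, 0.4], n_components=8, seed=42)
```

Fixing the seed plays the role that `random_state` plays in the parameter list above: the same seed yields the same random weights, so training and test instances are mapped consistently.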
Methods
compute_lambda(X, Y): Compute deviation in the mean estimate tau using the given training instances.
compute_phi(X): Compute the feature mapping corresponding to instances given for learning the classifiers (in case of training) and prediction (in case of testing).
compute_tau(X, Y): Compute mean estimate tau using the given training instances.
error(X, Y): Return the mean error obtained for the given test data and labels.
fit(x, y[, X_]): Fit the AMRC model.
get_metadata_routing(): Get metadata routing of this object.
get_params([deep]): Get parameters for this estimator.
get_upper_bound(): Returns the upper bound on the expected loss for the fitted classifier.
get_upper_bound_accumulated(): Returns the upper bound on the accumulated mistakes of the fitted classifier.
minimax_risk(x, tau_, lambda_, n_classes): Learning
predict(X): Predicts classes for new instances using a fitted model.
predict_proba(x): Conditional probabilities corresponding to each class for each unlabeled input instance.
score(X, y[, sample_weight]): Return the mean accuracy on the given test data and labels.
set_fit_request(*[, X_, x]): Request metadata passed to the fit method.
set_params(**params): Set the parameters of this estimator.
set_predict_proba_request(*[, x]): Request metadata passed to the predict_proba method.
set_score_request(*[, sample_weight]): Request metadata passed to the score method.
- __init__(n_classes, loss='0-1', deterministic=True, random_state=None, phi='linear', unidimensional=False, delta=0.05, order=1, W=200, N=100, fit_intercept=False, max_iters=2000, **phi_kwargs)
- compute_lambda(X, Y)
Compute deviation in the mean estimate tau using the given training instances.
- compute_phi(X)
Compute the feature mapping corresponding to instances given for learning the classifiers (in case of training) and prediction (in case of testing).
- compute_tau(X, Y)
Compute mean estimate tau using the given training instances.
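To make "mean estimate tau" and its "deviation" more concrete, the following sketch computes a sample mean and a standard-error-style deviation of a label-dependent feature vector over a batch of labelled instances. It assumes a one-hot block construction (the instance is placed in the block of its label); AMRC tracks these quantities over time, so the helper names and the exact formulas here are illustrative assumptions rather than the library's implementation.

```python
import math


def joint_feature(x, y, n_classes):
    """Place x in the block indexed by label y, zeros elsewhere (assumed map)."""
    out = [0.0] * (len(x) * n_classes)
    out[y * len(x):(y + 1) * len(x)] = x
    return out


def mean_and_deviation(X, Y, n_classes):
    """Sample mean (tau-like) and standard error (lambda-like) per component."""
    feats = [joint_feature(x, y, n_classes) for x, y in zip(X, Y)]
    n = len(feats)
    tau = [sum(col) / n for col in zip(*feats)]
    lam = [
        math.sqrt(sum((v - t) ** 2 for v in col) / n) / math.sqrt(n)
        for col, t in zip(zip(*feats), tau)
    ]
    return tau, lam


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
Y = [0, 1, 0, 1]
tau, lam = mean_and_deviation(X, Y, n_classes=2)
```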
- error(X, Y)
Return the mean error obtained for the given test data and labels.
- Parameters:
- Returns:
- error : float
  Mean error of the learned MRC classifier.
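A minimal sketch of the quantity described here, assuming the mean error is the fraction of misclassified instances (the average 0-1 loss). `mean_zero_one_error` is a hypothetical helper and not part of MRCpy.

```python
def mean_zero_one_error(y_true, y_pred):
    """Fraction of positions where the predicted label differs from the true one."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    mistakes = sum(int(t != p) for t, p in zip(y_true, y_pred))
    return mistakes / len(y_true)


err = mean_zero_one_error([0, 1, 1, 2], [0, 1, 2, 2])  # one mistake in four
accuracy = 1.0 - err  # mean accuracy is the complement of the mean error
```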
- fit(x, y, X_=None)
Fit the AMRC model.
Computes the parameters required for the minimax risk optimization and then calls the minimax_risk function to solve the optimization.
- Parameters:
- x : array-like of shape (n_features)
  Training instance used in:
  - calculating the expectation estimates that constrain the uncertainty set for the minimax risk classification;
  - solving the minimax risk optimization problem.
- y : int
  Label corresponding to the training instance, used only to compute the expectation estimates.
- X_ : None
  Unused in AMRC.
- Returns:
- self
  Fitted estimator.
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routing : MetadataRequest
  A MetadataRequest encapsulating routing information.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
  If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
  Parameter names mapped to their values.
- get_upper_bound()
Returns the upper bound on the expected loss for the fitted classifier.
- Returns:
- upper_bound : float
  Upper bound of the expected loss for the fitted classifier.
- get_upper_bound_accumulated()
Returns the upper bound on the accumulated mistakes of the fitted classifier.
- Returns:
- upper_bound_accumulated : float
  Upper bound on the accumulated mistakes of the fitted classifier.
- minimax_risk(x, tau_, lambda_, n_classes)
Learning
This function efficiently learns the classifier parameters.
- predict(X)
Predicts classes for new instances using a fitted model.
Returns the predicted classes for the given instances in X using the probabilities given by the function predict_proba.
- Parameters:
- X : array-like of shape (n_features)
  Test instance to be predicted by the AMRC model.
- Returns:
- y_pred : int
  Predicted labels corresponding to the given instances.
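The relation between predict and predict_proba can be sketched as below. This assumes that a deterministic prediction picks the most probable class and that a non-deterministic one samples a label from the class probabilities. `label_from_proba` is a hypothetical helper, not MRCpy's code.

```python
import random


def label_from_proba(proba, deterministic=True, rng=None):
    """Turn a vector of class probabilities into a single predicted label."""
    if deterministic:
        # Index of the largest probability (the first one wins on ties).
        return max(range(len(proba)), key=lambda k: proba[k])
    rng = rng or random.Random()
    # Draw a label with probability proportional to its entry.
    return rng.choices(range(len(proba)), weights=proba, k=1)[0]


proba = [0.1, 0.7, 0.2]
hard_label = label_from_proba(proba)  # 1, the most probable class
sampled = label_from_proba(proba, deterministic=False, rng=random.Random(0))
```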
- predict_proba(x)
Conditional probabilities corresponding to each class for each unlabeled input instance.
- score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
- set_fit_request(*, X_: bool | None | str = '$UNCHANGED$', x: bool | None | str = '$UNCHANGED$') -> AMRC
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- X_ : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
  Metadata routing for X_ parameter in fit.
- x : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
  Metadata routing for x parameter in fit.
- Returns:
- self : object
  The updated object.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
  Estimator parameters.
- Returns:
- self : estimator instance
  Estimator instance.
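The `<component>__<parameter>` naming rule can be illustrated without scikit-learn. The sketch below uses hypothetical toy classes (`ToyEstimator`, `ToyPipeline`) with a simplified `set_params`; it only mimics the naming convention and is not scikit-learn's implementation.

```python
class ToyEstimator:
    """Hypothetical estimator holding plain attributes as parameters."""

    def __init__(self, W=200, delta=0.05):
        self.W = W
        self.delta = delta

    def set_params(self, **params):
        for name, value in params.items():
            if not hasattr(self, name):
                raise ValueError(f"invalid parameter {name!r}")
            setattr(self, name, value)
        return self


class ToyPipeline:
    """Hypothetical container that routes 'step__param' keys to its steps."""

    def __init__(self, **steps):
        self.steps = steps

    def set_params(self, **params):
        for key, value in params.items():
            # Split on the first double underscore: component, then parameter.
            step_name, _, param_name = key.partition("__")
            self.steps[step_name].set_params(**{param_name: value})
        return self


pipe = ToyPipeline(clf=ToyEstimator())
pipe.set_params(clf__W=50, clf__delta=0.1)
```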
- set_predict_proba_request(*, x: bool | None | str = '$UNCHANGED$') -> AMRC
Request metadata passed to the predict_proba method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- x : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
  Metadata routing for x parameter in predict_proba.
- Returns:
- self : object
  The updated object.
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> AMRC
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
  Metadata routing for sample_weight parameter in score.
- Returns:
- self : object
  The updated object.
Examples using MRCpy.AMRC
Example: Use of AMRC (Adaptative MRC) for Online Learning