MRCpy.AMRC
- class MRCpy.AMRC(n_classes, loss='0-1', deterministic=True, random_state=None, phi='linear', unidimensional=False, delta=0.05, order=1, W=200, N=100, fit_intercept=False, max_iters=2000, **phi_kwargs)[source]
Adaptative Minimax Risk Classifier
The class AMRC implements the Adaptative Minimax Risk Classifiers (AMRCs) proposed in [1]. It is designed for online learning with streaming data: training samples are fed sequentially, and the classification rule is updated every time a new sample is provided.
AMRC adapts to concept drift (change in the underlying distribution of the data). Such drift is common in applications such as electricity price prediction, spam mail filtering, and credit card fraud detection. AMRC accounts for multidimensional time changes by means of a multivariate and high-order tracking of the time-varying underlying distribution. In addition, unlike conventional techniques, AMRCs can provide computable tight performance guarantees at learning.
It implements the 0-1 loss function and can be used with linear and random Fourier features.
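The sequential protocol described above can be sketched as a predict-then-update loop. The sketch below is a minimal illustration, not MRCpy code: `RunningCentroidClassifier` is a hypothetical stand-in that only mimics the per-sample `fit(x, y)` and `predict(x)` calls documented on this page, and the data stream is synthetic. With MRCpy installed, an `AMRC` instance would presumably take its place in the loop.

```python
import random


class RunningCentroidClassifier:
    """Hypothetical stand-in exposing per-sample fit(x, y) and predict(x)."""

    def __init__(self, n_classes, n_features):
        self.sums = [[0.0] * n_features for _ in range(n_classes)]
        self.counts = [0] * n_classes

    def fit(self, x, y):
        # Update the running sum of the class the new sample belongs to.
        self.counts[y] += 1
        for j, value in enumerate(x):
            self.sums[y][j] += value
        return self

    def predict(self, x):
        # Predict the class with the closest centroid; unseen classes are skipped.
        best_label, best_dist = 0, float("inf")
        for label, count in enumerate(self.counts):
            if count == 0:
                continue
            dist = sum((s / count - v) ** 2 for s, v in zip(self.sums[label], x))
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label


def run_stream(clf, stream):
    """Predict each sample before its label is revealed, then update the model."""
    mistakes = 0
    for x, y in stream:
        mistakes += int(clf.predict(x) != y)
        clf.fit(x, y)
    return mistakes


rng = random.Random(0)
stream = []
for _ in range(200):
    y = rng.randrange(2)
    # Class 0 is centred at -2 and class 1 at +2 in the single feature.
    stream.append(([rng.gauss(4.0 * y - 2.0, 0.5)], y))

mistakes = run_stream(RunningCentroidClassifier(n_classes=2, n_features=1), stream)
print(f"mistakes on {len(stream)} streamed samples: {mistakes}")
```

Predicting before updating means every sample is scored by a rule that has not yet seen its label, which is the usual way to evaluate a classifier on a stream.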
See also
For more information about AMRC, one can refer to the following paper:
[1] Álvarez, V., Mazuelas, S., & Lozano, J. A. (2022). Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees. International Conference on Machine Learning (ICML) 2022.
@InProceedings{AlvMazLoz22,
  title = {Minimax Classification under Concept Drift with Multidimensional Adaptation and Performance Guarantees},
  author = {{\'A}lvarez, Ver{\'o}nica and Mazuelas, Santiago and Lozano, Jose A},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages = {486--499},
  year = {2022},
  volume = {162},
  series = {Proceedings of Machine Learning Research},
  month = {Jul},
  publisher = {PMLR},
}
- Parameters
- n_classes : int
Number of different possible labels for an instance.
- deterministic : bool, default = True
Whether the prediction of the labels should be done in a deterministic way (given a fixed random_state in the case of using random Fourier or random ReLU features).
- loss : str {‘0-1’}, default = ‘0-1’
Type of loss function to use for the risk minimization. AMRC supports 0-1 loss. 0-1 loss quantifies the probability of classification error at a certain example for a certain rule.
- unidimensional : bool, default = False
Whether to model change in the variables unidimensionally or not. Available for comparison purposes.
- delta : float, default = 0.05
Significance of the upper bound on the accumulated mistakes. Lower values produce higher bounds.
- order : int, default = 1
Order of the subgradients used in optimization.
- W : int, default = 200
Window size. The model uses the last W samples for fitting the model.
- N : int, default = 100
Number of subgradients used for optimization.
- max_iters : int, default = 2000
Maximum number of iterations to use for finding the solution of optimization in the subgradient approach.
- phi : str or BasePhi instance, default = ‘linear’
Type of feature mapping function to use for mapping the input data. The currently available feature mapping methods are ‘fourier’, ‘relu’, ‘threshold’ and ‘linear’. Users can also implement their own feature mapping object (it should be a BasePhi instance) and pass it to this argument. Note that when using the ‘fourier’ feature mapping, training and testing instances are expected to be normalized. To implement a feature mapping, please go through the Feature Mappings section.
  - ‘linear’
  It uses the identity feature map, referred to as the linear feature map. See class BasePhi.
  - ‘fourier’
  It uses the random Fourier feature map. See class RandomFourierPhi.
- random_state : int, RandomState instance, default = None
Random seed used when using ‘fourier’ for feature mappings to produce the random weights.
- fit_intercept : bool, default = False
Whether to calculate the intercept for MRCs. If set to False, no intercept will be used in calculations (i.e. data is expected to be already centered).
- **phi_kwargs : Additional parameters for feature mappings.
Groups the multiple optional parameters for the corresponding feature mappings (phi). For example, in the case of Fourier features, the number of features is given by the n_components parameter, which can be passed as an argument: AMRC(phi='fourier', n_components=500).
The list of arguments for each feature mapping class can be found in the corresponding documentation.
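The role of the window size W, where only the last W samples are used for fitting, can be illustrated with a bounded buffer. The sketch below is a generic illustration based on `collections.deque`, not the buffer MRCpy uses internally; `window_mean` is a hypothetical helper and the sample values are made up.

```python
from collections import deque

W = 3  # window size, kept tiny here for readability
window = deque(maxlen=W)  # the oldest sample is dropped once W is exceeded


def window_mean(buffer):
    """Average of the samples currently held in the window."""
    return sum(buffer) / len(buffer)


means = []
for sample in [1.0, 2.0, 3.0, 10.0, 11.0]:
    window.append(sample)
    means.append(window_mean(window))

print(list(window))  # [3.0, 10.0, 11.0]
print(means[-1])     # 8.0
```

A smaller window forgets old samples faster, so estimates follow a drifting distribution more quickly, but they are computed from fewer samples.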
Methods
- compute_lambda(X, Y): Compute deviation in the mean estimate tau using the given training instances.
- compute_phi(X): Compute the feature mapping corresponding to instances given for learning the classifiers (in case of training) and prediction (in case of testing).
- compute_tau(X, Y): Compute mean estimate tau using the given training instances.
- error(X, Y): Return the mean error obtained for the given test data and labels.
- fit(x, y[, X_]): Fit the AMRC model.
- get_params([deep]): Get parameters for this estimator.
- get_upper_bound(): Returns the upper bound on the expected loss for the fitted classifier.
- get_upper_bound_accumulated(): Returns the upper bound on the accumulated mistakes of the fitted classifier.
- minimax_risk(x, tau_, lambda_, n_classes): Learning.
- predict(X): Predicts classes for new instances using a fitted model.
- predict_proba(x): Conditional probabilities corresponding to each class for each unlabeled input instance.
- score(X, y[, sample_weight]): Return the mean accuracy on the given test data and labels.
- set_params(**params): Set the parameters of this estimator.
- __init__(n_classes, loss='0-1', deterministic=True, random_state=None, phi='linear', unidimensional=False, delta=0.05, order=1, W=200, N=100, fit_intercept=False, max_iters=2000, **phi_kwargs)[source]
- compute_lambda(X, Y)
Compute deviation in the mean estimate tau using the given training instances.
- compute_phi(X)
Compute the feature mapping corresponding to instances given for learning the classifiers (in case of training) and prediction (in case of testing).
- compute_tau(X, Y)
Compute mean estimate tau using the given training instances.
- error(X, Y)
Return the mean error obtained for the given test data and labels.
- Parameters
- Returns
- error : float
Mean error of the learned MRC classifier.
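Under the 0-1 loss, the mean error on a test set is the fraction of misclassified instances. A minimal sketch of that computation on plain Python lists follows; `mean_zero_one_error` is a hypothetical helper written for illustration and is not part of MRCpy.

```python
def mean_zero_one_error(y_true, y_pred):
    """Fraction of instances whose predicted label differs from the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one instance is required")
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


err = mean_zero_one_error([0, 1, 1, 0], [0, 1, 0, 0])
print(err)      # 0.25
print(1 - err)  # 0.75, the corresponding mean accuracy
```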
- fit(x, y, X_=None)[source]
Fit the AMRC model.
Computes the parameters required for the minimax risk optimization and then calls the minimax_risk function to solve the optimization.
- Parameters
- x : array-like of shape (n_features)
Training instance used in:
  - calculating the expectation estimates that constrain the uncertainty set for the minimax risk classification;
  - solving the minimax risk optimization problem.
- y : int
Label corresponding to the training instance, used only to compute the expectation estimates.
- X_ : None
Unused in AMRC.
- Returns
- self
Fitted estimator
- get_params(deep=True)
Get parameters for this estimator.
- Parameters
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
- params : dict
Parameter names mapped to their values.
- get_upper_bound()[source]
Returns the upper bound on the expected loss for the fitted classifier.
- Returns
- upper_bound : float
Upper bound of the expected loss for the fitted classifier.
- get_upper_bound_accumulated()[source]
Returns the upper bound on the accumulated mistakes of the fitted classifier.
- Returns
- upper_bound_accumulated : float
Upper bound on the accumulated mistakes of the fitted classifier.
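How the delta parameter influences a bound of this kind can be illustrated with a generic Hoeffding-style confidence term. The sketch below is an assumption made purely for illustration: the expression `sqrt(2 * T * log(1 / delta))` and the helper `confidence_term` are not taken from MRCpy, and the exact bound returned by `get_upper_bound_accumulated` is the one derived in [1]. It only shows the qualitative behaviour that a smaller delta gives a larger bound.

```python
import math


def confidence_term(n_samples, delta):
    """Generic Hoeffding-style slack term (illustrative assumption, not MRCpy's formula)."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(2.0 * n_samples * math.log(1.0 / delta))


loose = confidence_term(n_samples=1000, delta=0.05)
strict = confidence_term(n_samples=1000, delta=0.01)
print(loose < strict)  # True: a smaller delta gives a larger slack term
```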
- minimax_risk(x, tau_, lambda_, n_classes)[source]
Learning.
This function efficiently learns the classifier parameters.
- predict(X)[source]
Predicts classes for new instances using a fitted model.
Returns the predicted classes for the given instances in X using the probabilities given by the function predict_proba.
- Parameters
- X : array-like of shape (n_features)
Test instance to be classified by the AMRC model.
- Returns
- y_pred : int
Predicted label corresponding to the given instance.
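The relation between predict, predict_proba and the deterministic parameter can be sketched as follows. This is a minimal illustration under the assumption that a deterministic rule picks the most probable class while a non-deterministic rule samples a class according to the probabilities; `predict_from_proba` is a hypothetical helper, not the MRCpy implementation.

```python
import random


def predict_from_proba(proba, deterministic=True, rng=None):
    """Turn a vector of class probabilities into a single predicted label."""
    labels = range(len(proba))
    if deterministic:
        # Pick the most probable class (the first one on ties).
        return max(labels, key=lambda label: proba[label])
    rng = rng or random.Random()
    # Sample a class with probability proportional to its entry in proba.
    return rng.choices(list(labels), weights=proba, k=1)[0]


proba = [0.2, 0.7, 0.1]
print(predict_from_proba(proba))  # 1
sampled = predict_from_proba(proba, deterministic=False, rng=random.Random(0))
print(sampled in (0, 1, 2))  # True
```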
- predict_proba(x)[source]
Conditional probabilities corresponding to each class for each unlabeled input instance
- score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters
- **params : dict
Estimator parameters.
- Returns
- self : estimator instance
Estimator instance.
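The `<component>__<parameter>` naming convention can be illustrated with a small sketch that splits such keys into top-level and nested parameters. It mirrors the convention only: `split_nested_params` is a hypothetical helper, not the scikit-learn or MRCpy implementation, and the component name `clf` is made up for the example.

```python
def split_nested_params(params):
    """Separate top-level parameters from '<component>__<parameter>' ones."""
    top_level, nested = {}, {}
    for key, value in params.items():
        component, delimiter, sub_key = key.partition("__")
        if delimiter:
            # The part before the first '__' names the component to update.
            nested.setdefault(component, {})[sub_key] = value
        else:
            top_level[key] = value
    return top_level, nested


top, nested = split_nested_params({"delta": 0.1, "clf__W": 50, "clf__phi": "fourier"})
print(top)     # {'delta': 0.1}
print(nested)  # {'clf': {'W': 50, 'phi': 'fourier'}}
```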
Examples using MRCpy.AMRC
Example: Use of AMRC (Adaptative MRC) for Online Learning