`MRCpy.phi`.BasePhi

class MRCpy.phi.BasePhi(n_classes, fit_intercept=True, one_hot=False)

Base class for feature mappings

This class provides a base for the different feature mapping functions that can be used with MRCs. It defines utility functions used by the MRCs in the library and corresponds to the usual identity feature map, referred to as the linear feature map.

To see an example of how to extend the class BasePhi to implement your own feature mapping, see this example.

Note

This is a base class for all the feature mappings. To create a new feature mapping that can be used with MRC objects, extend this class and implement the following functions:

fit - learns the parameters required for the feature transformation

transform - transforms the input instances to the features

These functions are the core of any feature transformation. Users can also redefine other functions in this class according to their needs.

The definitions of fit and transform in this class correspond to the usual identity feature map, referred to as the linear feature map.

The transform function is only used by the eval_xy function of this class to get the one-hot encoded features. If a user defines their own eval_xy function that returns the features directly, without needing transform, then the transform function can be omitted.

See also

For more information about MRC, one can refer to the following resources:

[1] Mazuelas, S., Zanoni, A., & Pérez, A. (2020). Minimax Classification with 0-1 Loss and Performance Guarantees. Advances in Neural Information Processing Systems, 33, 302-312.

[2] Mazuelas, S., Shen, Y., & Pérez, A. (2020). Generalized Maximum Entropy for Supervised Classification. arXiv preprint arXiv:2007.05447.

[3] Bondugula, K., Mazuelas, S., & Pérez, A. (2021). MRCpy: A Library for Minimax Risk Classifiers. arXiv preprint arXiv:2108.01952.

Parameters:

n_classes (int): Number of classes in the dataset.
fit_intercept (bool, default=True): Whether to calculate the intercept. If set to false, no intercept will be used in calculations (i.e. data is expected to be already centered).
one_hot (bool, default=False): Controls how the features of the given instances are evaluated. Only applies in the binary case, that is, when there are two classes. When set to true, one-hot encoding is used. When set to false, a more efficient shortcut is used.

Attributes:

is_fitted_ (bool): Whether the feature mapping has learned its hyperparameters (if any) and the length of the feature mapping is set.
len_ (int): Length of the feature mapping vector.

Methods

`est_exp`(X, Y)	Average value of \(\phi(x,y)\) in the supervised dataset (X,Y).
`est_std`(X, Y)	Standard deviation of \(\phi(x,y)\) in the supervised dataset (X,Y).
`eval_x`(X)	Evaluates the one-hot encoded features \(\phi(x,y)\) for every instance x \(\in\) X and every label.
`eval_xy`(X, Y)	Evaluates the one-hot encoded features \(\phi(x,y)\) for the given instances x \(\in\) X and their labels y \(\in\) Y.
`fit`(X[, Y])	Performs training stage.
`transform`(X)	Transform the given instances to the features.

__init__(n_classes, fit_intercept=True, one_hot=False)

est_exp(X, Y)

Average value of \(\phi(x,y)\) in the supervised dataset (X,Y). Used in the learning stage to estimate the expectation of \(\phi(x,y)\), denoted by \(\tau\).

Parameters:

X (array-like of shape (n_samples, n_dimensions)): Unlabeled training instances.
Y (array-like of shape (n_samples,)): Labels corresponding to the unlabeled training instances.

Returns:

tau_ (array-like of shape (n_features * n_classes,)): Average value of phi.
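
As a rough illustration of what "average value of phi" means, the sketch below averages already one-hot encoded rows column by column. It uses plain Python lists so that it runs without MRCpy installed; `est_exp_sketch` and `phi_rows` are hypothetical names and this is not the library's implementation.

```python
def est_exp_sketch(phi_rows):
    """Column-wise mean of the rows phi(x_i, y_i)."""
    n_samples = len(phi_rows)
    # zip(*phi_rows) iterates over the columns of the row-major matrix
    return [sum(column) / n_samples for column in zip(*phi_rows)]


# Two instances with two features each, encoded for two classes
# (assumed layout: one block of n_features entries per class).
phi_rows = [
    [1.0, 2.0, 0.0, 0.0],  # phi(x_0, y=0)
    [0.0, 0.0, 3.0, 4.0],  # phi(x_1, y=1)
]
tau = est_exp_sketch(phi_rows)
```

The result has one entry per column of phi, that is, n_features * n_classes entries.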

est_std(X, Y)

Standard deviation of \(\phi(x,y)\) in the supervised dataset (X,Y). Used in the learning stage to estimate the variance in the expectation of \(\phi(x,y)\), denoted by \(\lambda\).

Parameters:

X (array-like of shape (n_samples, n_dimensions)): Unlabeled training instances.
Y (array-like of shape (n_samples,)): Labels corresponding to the unlabeled training instances.

Returns:

lambda_ (array-like of shape (n_features * n_classes,)): Standard deviation of phi.
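
A minimal sketch of a column-wise standard deviation over one-hot encoded rows, using only the standard library. It assumes the population standard deviation (dividing by n); whether the library uses this or the sample version is not stated here, and `est_std_sketch` is a hypothetical name.

```python
import statistics


def est_std_sketch(phi_rows):
    """Column-wise population standard deviation of the rows phi(x_i, y_i)."""
    # Assumption: population standard deviation (divide by n, not n - 1).
    return [statistics.pstdev(column) for column in zip(*phi_rows)]


phi_rows = [
    [1.0, 2.0, 0.0, 0.0],  # phi(x_0, y=0)
    [0.0, 0.0, 3.0, 4.0],  # phi(x_1, y=1)
]
std = est_std_sketch(phi_rows)
```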

eval_x(X)

Evaluates the one-hot encoded features \(\phi(x,y)\) for every instance x \(\in\) X and every label. The output is a 3D matrix composed of one 2D matrix per instance. Each 2D matrix holds the one-hot encodings of that instance’s features for all the possible labels in the data.

Parameters:

X (array-like of shape (n_samples, n_dimensions)): Unlabeled training instances for developing the feature matrix.

Returns:

phi (array-like of shape (n_samples, n_classes, n_features * n_classes)): Matrix containing the one-hot encoding for all the classes for each of the given instances.
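
The 3D output shape can be easier to picture with a small example. The sketch below builds, for each instance, one row per candidate label, placing the feature vector in the block that belongs to that label. The block ordering is an assumption made for illustration, `eval_x_sketch` is a hypothetical name, and plain lists are used so that the code runs without MRCpy.

```python
def eval_x_sketch(X_feat, n_classes):
    """Nested list of shape (n_samples, n_classes, n_features * n_classes)."""
    n_features = len(X_feat[0])
    phi = []
    for x in X_feat:
        per_label = []
        for y in range(n_classes):
            row = [0.0] * (n_features * n_classes)
            # Assumed layout: the block for label y starts at y * n_features.
            row[y * n_features:(y + 1) * n_features] = x
            per_label.append(row)
        phi.append(per_label)
    return phi


X_feat = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
phi = eval_x_sketch(X_feat, n_classes=2)
shape = (len(phi), len(phi[0]), len(phi[0][0]))
```

Here `phi[0]` holds the encodings of the first instance for labels 0 and 1.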

eval_xy(X, Y)

Evaluates the one-hot encoded features \(\phi(x,y)\) for the given instances x \(\in\) X and their labels y \(\in\) Y. The encodings are computed for the given labels and are used in the learning stage to estimate the expectation of \(\phi(x,y)\).

Parameters:

X (array-like of shape (n_samples, n_dimensions)): Unlabeled training instances for developing the feature matrix.
Y (array-like of shape (n_samples,)): Labels corresponding to the unlabeled training instances.

Returns:

phi (array-like of shape (n_samples, n_features * n_classes)): Matrix containing the one-hot encoding with respect to the labels given for all the instances.
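
A minimal sketch of the label-dependent encoding, assuming integer labels in the range 0 to n_classes - 1 and one block of n_features entries per class. The names `one_hot_phi` and `eval_xy_sketch` are hypothetical, and this does not reproduce the library's code or its binary-case shortcut.

```python
def one_hot_phi(x_feat, y, n_classes):
    """Place the feature vector in the block that belongs to label y."""
    n_features = len(x_feat)
    phi = [0.0] * (n_features * n_classes)
    # Assumed layout: the block for label y starts at y * n_features.
    phi[y * n_features:(y + 1) * n_features] = x_feat
    return phi


def eval_xy_sketch(X_feat, Y, n_classes):
    """One row phi(x_i, y_i) per (instance, label) pair."""
    return [one_hot_phi(x, y, n_classes) for x, y in zip(X_feat, Y)]


X_feat = [[1.0, 2.0], [3.0, 4.0]]
Y = [0, 1]
phi = eval_xy_sketch(X_feat, Y, n_classes=2)
```

Each output row has n_features * n_classes entries, and all entries outside the block of the given label are zero.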

fit(X, Y=None)

Performs training stage.

Learns the required hyperparameters for the feature mapping transformation from the training instances and sets the length of the feature mapping (one-hot encoded) obtained from the eval_xy function.

Note

If a user implements the fit function in their own feature mapping, it is recommended to call this fit function at the end of that function to automatically set the length of the feature mapping. This fit function can be called in a subclass as follows:

super().fit(X,Y)

Feature mappings implemented in this library follow this same approach.
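
The pattern described in this note can be sketched as follows. To keep the example runnable without MRCpy, it uses `StandInBasePhi`, a hypothetical stand-in that only mimics the documented behaviour (setting `len_` and `is_fitted_`); it is not the library class, and `SquaredPhi` is likewise an invented toy mapping. In real code the subclass would extend MRCpy.phi.BasePhi instead.

```python
class StandInBasePhi:
    """Hypothetical stand-in for MRCpy.phi.BasePhi (NOT the library class)."""

    def __init__(self, n_classes):
        self.n_classes = n_classes
        self.is_fitted_ = False

    def transform(self, X):
        # Identity (linear) feature map.
        return [list(x) for x in X]

    def fit(self, X, Y=None):
        # Set the length of the one-hot encoded feature mapping.
        n_features = len(self.transform(X[:1])[0])
        self.len_ = n_features * self.n_classes
        self.is_fitted_ = True
        return self


class SquaredPhi(StandInBasePhi):
    """Toy mapping that appends the square of every input dimension."""

    def fit(self, X, Y=None):
        # Learn mapping-specific parameters here (none for this toy case),
        # then call the base fit last so that the length is set.
        return super().fit(X, Y)

    def transform(self, X):
        return [list(x) + [v * v for v in x] for x in X]


phi = SquaredPhi(n_classes=3).fit([[1.0, 2.0], [3.0, 4.0]])
```

Because the base fit runs last, the length reflects the subclass's own transform: four features per instance times three classes.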

Parameters:

X (array-like of shape (n_samples, n_dimensions)): Unlabeled training instances used to learn the feature configurations.
Y (array-like of shape (n_samples,)): Labels corresponding to the unlabeled instances.

Returns:

self: Fitted estimator

transform(X)

Transform the given instances to the features.

Parameters:

X (array-like of shape (n_samples, n_dimensions)): Unlabeled training instances.

Returns:

X_feat (array-like of shape (n_samples, n_features)): Transformed features from the given instances.
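
For the identity (linear) feature map, a transform can be sketched as below. The way the intercept is represented, as a constant 1 prepended to each instance, is an assumption made for illustration and is not confirmed by this page; `transform_sketch` is a hypothetical name.

```python
def transform_sketch(X, fit_intercept=True):
    """Identity feature map, optionally with a constant intercept feature."""
    if fit_intercept:
        # Assumption: the intercept is modelled as a leading constant 1.0.
        return [[1.0] + list(x) for x in X]
    return [list(x) for x in X]


X = [[2.0, 3.0], [4.0, 5.0]]
with_intercept = transform_sketch(X)
without_intercept = transform_sketch(X, fit_intercept=False)
```

Under this assumption n_features equals n_dimensions + 1 when the intercept is used and n_dimensions otherwise.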
