MRCpy.phi.RandomReLUPhi

class MRCpy.phi.RandomReLUPhi(n_classes, fit_intercept=True, sigma='scale', n_components=600, random_state=None, one_hot=False)[source]

ReLU features

Rectified Linear Unit (ReLU) features are given by:

\[z(x) = \max(w^t * (2\sigma^2,x), 0)\]

where w is a vector (of dimension d) of random weights uniformly distributed over the unit sphere, and \(\sigma\) is the scaling parameter, similar to the one in random Fourier features.

The ReLU function is defined as:

\[f(x) = \max(0, x)\]

Note that when using the ReLU feature mapping, training and testing instances are expected to be normalized.
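The formula above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not MRCpy's implementation: the function name `relu_random_features` is hypothetical, the pair \((2\sigma^2, x)\) is read as the scalar \(2\sigma^2\) prepended to x, and the weights are drawn uniformly on the unit sphere by normalizing Gaussian draws.

```python
import numpy as np


def relu_random_features(X, sigma, n_components=600, seed=None):
    """Hypothetical sketch of z(x) = max(w^t (2*sigma^2, x), 0)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape

    # Assumption: (2*sigma^2, x) means the scalar 2*sigma^2 prepended to x.
    X_aug = np.hstack([np.full((n_samples, 1), 2.0 * sigma ** 2), X])

    # Normalized Gaussian vectors are uniformly distributed on the unit sphere.
    W = rng.standard_normal((n_features + 1, n_components))
    W /= np.linalg.norm(W, axis=0, keepdims=True)

    # Apply the ReLU elementwise to the random projections.
    return np.maximum(X_aug @ W, 0.0)


X_demo = np.random.default_rng(0).standard_normal((5, 3))
Z_demo = relu_random_features(X_demo, sigma=1.0, n_components=8, seed=0)
```

Each column of `Z_demo` corresponds to one random weight vector, so the output has one row per instance and `n_components` columns.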


Parameters:
n_classes : int

Number of classes in the dataset.

fit_intercept : bool, default = True

Whether to calculate the intercept. If set to false, no intercept will be used in calculations (i.e. data is expected to be already centered).

one_hot : bool, default = False

Controls the method used for evaluating the features of the given instances. Only applies in the binary case, that is, when there are exactly two classes. If set to true, one-hot encoding will be used. If set to false, a more efficient shortcut will be used.

sigma : str or float, default = ‘scale’

When given a string, it selects the heuristic used to compute the scaling parameter sigma from the data. For comparison, its relation to the parameter gamma used in other methods is \(\gamma=1/(2\sigma^2)\). When given a float, it is used directly as the value of the scaling parameter.

‘scale’

Approximates sigma by \(\sqrt{\frac{\textrm{n_features} * \textrm{var}(X)}{2}}\) so that gamma is \(\frac{1}{\textrm{n_features} * \textrm{var}(X)}\) where var is the variance function.

‘avg_ann_50’

Approximates sigma by the average distance to the \(50^{\textrm{th}}\) nearest neighbour estimated from 1000 samples of the dataset using the function rff_sigma.
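The ‘scale’ heuristic and its relation to gamma can be checked numerically. The sketch below is illustrative only: the helper name `sigma_scale` is hypothetical, and var(X) is assumed to be the variance over all entries of X.

```python
import numpy as np


def sigma_scale(X):
    """Hypothetical helper: sigma = sqrt(n_features * var(X) / 2)."""
    n_features = X.shape[1]
    # Assumption: var(X) is the variance taken over all entries of X.
    return np.sqrt(n_features * X.var() / 2.0)


X_demo = np.random.default_rng(0).standard_normal((100, 3))
sigma_demo = sigma_scale(X_demo)

# gamma = 1 / (2 * sigma^2) should equal 1 / (n_features * var(X)).
gamma_demo = 1.0 / (2.0 * sigma_demo ** 2)
gamma_expected = 1.0 / (X_demo.shape[1] * X_demo.var())
```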

n_components : int, default = 600

Number of random features into which the input is transformed.

random_state : int, RandomState instance, default = None

Seed or random state used to generate the random_weights_ from which the ReLU random features are computed.

Attributes:
random_weights_ : array-like of shape (n_features, n_components)

Random weights applied to the training samples as a step for computing the ReLU random features.

is_fitted_ : bool

Whether the feature mapping has learned its hyperparameters (if any) and the length of the feature mapping is set.

len_ : int

Length of the feature mapping vector.

Methods

est_exp(X_transform, Y)

Computes the average value of \(\Phi(x,y)\) to estimate \(\boldsymbol{\tau}\), which defines the constraint of the uncertainty set of distributions.

est_std(X_transform, Y, tau_mat)

Standard deviation of \(\Phi(x,y)\) that accounts for inaccuracies in the mean estimate \(\boldsymbol{\tau}\).

eval_x(X)

Evaluates the one-hot encoded features \(\Phi(x,y)\) of the given instances X, for every x \(\in\) X and all the labels.

eval_xy(X, Y)

Evaluates the one-hot encoded features \(\Phi(x,y)\) of the given instances X and labels Y, for x \(\in\) X and y \(\in\) Y.

fit(X[, Y])

Learns the set of random weights for computing the feature space.

rff_sigma(X)

Computes the scaling parameter for the ReLU features using the heuristic given in the paper “Compact Nonlinear Maps and Circulant Extensions” [1].

transform(X)

Compute the ReLU random features (\(z(x)\)).

__init__(n_classes, fit_intercept=True, sigma='scale', n_components=600, random_state=None, one_hot=False)[source]

Initialize self. See help(type(self)) for accurate signature.

est_exp(X_transform, Y)

Computes the average value of \(\Phi(x,y)\) to estimate \(\boldsymbol{\tau}\), which defines the constraint of the uncertainty set of distributions.

Parameters:
X_transform : array-like of shape (n_samples, n_features)

Features corresponding to the training instances \(\psi(x)\).

Y : array-like of shape (n_samples,)

Labels corresponding to the training instances.

Returns:
tau_ : array-like of shape (n_classes, n_features) or (1, n_features)

Empirical mean of \(\Phi(x,y)\).
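As a rough illustration of what an empirical mean of \(\Phi(x,y)\) looks like, the sketch below averages the one-hot encoded features over the samples. It is an assumption-laden sketch, not MRCpy's code: the function name `estimate_tau` is hypothetical, labels are assumed to be integers in 0..n_classes-1, and only the (n_classes, n_features) output layout is shown.

```python
import numpy as np


def estimate_tau(X_feat, Y, n_classes):
    """Hypothetical sketch: mean over samples of the one-hot encoded features."""
    n_samples, n_features = X_feat.shape
    tau = np.zeros((n_classes, n_features))
    # Row y accumulates the features of the instances labelled y.
    np.add.at(tau, Y, X_feat)
    # Dividing by the total sample count gives the mean of Phi(x, y).
    return tau / n_samples


X_feat_demo = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
Y_demo = np.array([0, 1, 0])
tau_demo = estimate_tau(X_feat_demo, Y_demo, n_classes=2)
```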

est_std(X_transform, Y, tau_mat)

Standard deviation of \(\Phi(x,y)\) that accounts for inaccuracies in the mean estimate \(\boldsymbol{\tau}\). It is used to estimate \(\boldsymbol{\lambda}\) defining the uncertainty set constraints.

Parameters:
X_transform : array-like of shape (n_samples, n_features)

Features corresponding to the training instances \(\psi(x)\).

Y : array-like of shape (n_samples,)

Labels corresponding to the training instances.

Returns:
lambda_ : array-like of shape (n_classes, n_features) or (1, n_features)

Standard deviation of \(\Phi(x,y)\).

eval_x(X)

Evaluates the one-hot encoded features \(\Phi(x,y)\) of the given instances X, for every x \(\in\) X and all the labels. The output is a 3D array composed of 2D matrices, one per instance. Each 2D matrix holds the one-hot encodings of that instance’s features for all the possible labels in the data.

Parameters:
X : array-like of shape (n_samples, n_dimensions)

Unlabeled training instances for developing the feature matrix.

Returns:
phi : array-like of shape (n_samples, n_classes, n_features * n_classes)

Matrix containing the one-hot encoding for all the classes for each of the instances given.
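The shape of this output can be illustrated with a small NumPy sketch. It assumes that the one-hot encoding for label y places the instance's features in the y-th block of length n_features and zeros elsewhere; the function name `one_hot_all_labels` and the block ordering are assumptions, not taken from the library.

```python
import numpy as np


def one_hot_all_labels(X_feat, n_classes):
    """Hypothetical sketch: one-hot encode features for every possible label."""
    n_samples, n_features = X_feat.shape
    phi = np.zeros((n_samples, n_classes, n_classes * n_features))
    for y in range(n_classes):
        # For label y, copy the features into the y-th block of the last axis.
        phi[:, y, y * n_features:(y + 1) * n_features] = X_feat
    return phi


X_feat_demo = np.arange(6, dtype=float).reshape(2, 3)
phi_demo = one_hot_all_labels(X_feat_demo, n_classes=2)
```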

eval_xy(X, Y)

Evaluates the one-hot encoded features \(\Phi(x,y)\) of the given instances X and labels Y, for x \(\in\) X and y \(\in\) Y. The encodings are computed for the given labels and are used in the learning stage to estimate the expectation of \(\Phi(x,y)\).

Parameters:
X : array-like of shape (n_samples, n_dimensions)

Unlabeled training instances for developing the feature matrix.

Y : array-like of shape (n_samples,)

Labels corresponding to the training instances.

Returns:
phi : array-like of shape (n_samples, n_features * n_classes)

Matrix containing the one-hot encoding with respect to the labels given for all the instances.
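A minimal sketch of this encoding for given labels follows. It assumes integer labels in 0..n_classes-1 and that the features of an instance are placed in the block indexed by its label; the function name `one_hot_for_labels` and the block ordering are illustrative assumptions.

```python
import numpy as np


def one_hot_for_labels(X_feat, Y, n_classes):
    """Hypothetical sketch: one-hot encode features using the given labels."""
    n_samples, n_features = X_feat.shape
    phi = np.zeros((n_samples, n_classes, n_features))
    # Place each instance's features in the block selected by its label.
    phi[np.arange(n_samples), Y] = X_feat
    # Flatten the (class, feature) axes into one axis of length n_classes * n_features.
    return phi.reshape(n_samples, n_classes * n_features)


X_feat_demo = np.array([[1.0, 2.0], [3.0, 4.0]])
Y_demo = np.array([1, 0])
phi_xy_demo = one_hot_for_labels(X_feat_demo, Y_demo, n_classes=2)
```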

fit(X, Y=None)[source]

Learns the set of random weights for computing the feature space. Also computes the scaling parameter if its value is not given.

Parameters:
X : array-like of shape (n_samples, n_dimensions)

Unlabeled training instances used to learn the feature configurations.

Y : array-like of shape (n_samples,), default = None

This argument is never used. It is present only to keep the function signature consistent across the different feature mappings.

Returns:
self :

Fitted estimator

rff_sigma(X)[source]

Computes the scaling parameter for the ReLU features using the heuristic given in the paper “Compact Nonlinear Maps and Circulant Extensions” [1].

The heuristic states that the scaling parameter is obtained as the average distance to the 50th nearest neighbour estimated from 1000 samples of the dataset.

See also

[1] Yu, F. X., Kumar, S., Rowley, H., & Chang, S. F. (2015). Compact nonlinear maps and circulant extensions.

Parameters:
X : array-like of shape (n_samples, n_dimensions)

Unlabeled instances.

Returns:
sigma : float

Scaling parameter computed using the heuristic.
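The heuristic described above can be sketched with plain NumPy. This is one possible reading, not the library's code: the function name `avg_kth_neighbour_distance` is hypothetical, distances are assumed to be Euclidean, and neighbours are searched within the random subsample itself.

```python
import numpy as np


def avg_kth_neighbour_distance(X, k=50, n_sample=1000, seed=None):
    """Hypothetical sketch: mean distance to the k-th nearest neighbour."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Work on at most n_sample randomly chosen instances.
    idx = rng.choice(n, size=min(n_sample, n), replace=False)
    S = X[idx]

    # Pairwise Euclidean distances within the subsample.
    sq = (S ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (S @ S.T)
    d = np.sqrt(np.maximum(d2, 0.0))

    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k holds the distance to the k-th nearest neighbour.
    k_eff = min(k, S.shape[0] - 1)
    kth = np.sort(d, axis=1)[:, k_eff]
    return kth.mean()


X_line = np.array([[0.0], [1.0], [2.0], [3.0]])
sigma_line = avg_kth_neighbour_distance(X_line, k=1, seed=0)
```

For the four equally spaced points in `X_line`, every point's nearest neighbour is at distance 1, so the average is 1.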

transform(X)[source]

Compute the ReLU random features (\(z(x)\)).

Parameters:
X : array-like of shape (n_samples, n_dimensions)

Unlabeled training instances.

Returns:
X_feat : array-like of shape (n_samples, n_features)

Transformed features from the given instances.