MRCpy.phi.RandomFourierPhi
- class MRCpy.phi.RandomFourierPhi(n_classes, fit_intercept=True, sigma='scale', n_components=600, random_state=None, one_hot=False)
Fourier features
Features obtained by approximating the RBF kernel with a random Fourier feature map:
\[z(x) = \sqrt{2/D} \, [\cos(w_1^T x), ..., \cos(w_{D/2}^T x), \sin(w_1^T x), ..., \sin(w_{D/2}^T x)]\]
where each \(w_i\) is a vector (of dimension d) of random weights drawn from a Gaussian distribution with mean 0 and variance \(1/\sigma^2\), and D is the number of components in the resulting feature map. The parameter \(\sigma\) in the variance plays the role of the scaling parameter of the radial basis function kernel:
\[K(x, x') = \exp\left(-\frac{\| x-x'\|^2}{2\sigma^2}\right)\]
Note that when using the random Fourier feature mapping, training and testing instances are expected to be normalized.
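The following is a minimal NumPy sketch of such a feature map, not MRCpy's own implementation. It assumes D/2 random vectors drawn with standard deviation 1/σ, so that the inner product of two feature vectors approximates K(x, x'). The function name `random_fourier_features` and all numeric values are illustrative.

```python
import numpy as np


def random_fourier_features(X, W):
    """Map X of shape (n_samples, d) to features of shape (n_samples, D).

    W has shape (d, D/2); each column is one random vector w_i.
    """
    D = 2 * W.shape[1]
    proj = X @ W  # shape (n_samples, D/2)
    return np.sqrt(2.0 / D) * np.hstack([np.cos(proj), np.sin(proj)])


rng = np.random.default_rng(0)
d, D, sigma = 5, 2000, 1.5

# Assumption: w_i ~ N(0, I / sigma^2), i.e. standard deviation 1 / sigma.
W = rng.normal(0.0, 1.0 / sigma, size=(d, D // 2))

X = rng.normal(size=(20, d))
Z = random_fourier_features(X, W)

# Exact RBF kernel matrix, for comparison with the approximation Z Z^T.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_exact = np.exp(-sq_dists / (2.0 * sigma ** 2))
K_approx = Z @ Z.T
max_err = np.abs(K_exact - K_approx).max()
```

The approximation error shrinks as D grows, which is the trade-off behind the `n_components` parameter.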
- Parameters:
- n_classes : int
Number of classes in the dataset.
- fit_intercept : bool, default = True
Whether to calculate the intercept. If set to false, no intercept will be used in calculations (i.e. data is expected to be already centered).
- one_hot : bool, default = False
Controls the method used for evaluating the features of the given instances. Only applies in the binary case, that is, when there are exactly two classes. If set to true, one-hot encoding will be used. If set to false, a more efficient shortcut will be used.
- sigma : str or float, default = ‘scale’
When given a string, it defines the heuristic used to calculate the scaling parameter sigma from the data. For comparison, its relation to the parameter gamma used in other methods is \(\gamma=1/(2\sigma^2)\). When given a float, it is the value of the scaling parameter.
  - ‘scale’
  Approximates sigma by \(\sqrt{\frac{\textrm{n_features} * \textrm{var}(X)}{2}}\) so that gamma is \(\frac{1}{\textrm{n_features} * \textrm{var}(X)}\), where var is the variance function.
  - ‘scale2’
  Approximates sigma by \(\sqrt{\frac{\textrm{n_features}}{2}}\) so that gamma is \(\frac{1}{\textrm{n_features}}\).
  - ‘avg_ann_50’
  Approximates sigma by the average distance to the \(50^{\textrm{th}}\) nearest neighbour estimated from 1000 samples of the dataset using the function rff_sigma.
- n_components : int, default = 600
Number of features into which the transformer maps the input.
- random_state : int, RandomState instance, default = None
Random seed used to produce the random_weights_ used for the approximation of the Gaussian kernel.
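As a rough illustration of the ‘scale’ and ‘scale2’ heuristics described for sigma, the sketch below computes sigma from the data and converts it to gamma through \(\gamma=1/(2\sigma^2)\). The helper names are hypothetical, and taking var(X) as the variance over all entries of X is an assumption; the library may compute it differently.

```python
import numpy as np


def sigma_from_heuristic(X, heuristic="scale"):
    """Hypothetical helper: scaling parameter from a named heuristic."""
    n_features = X.shape[1]
    if heuristic == "scale":
        # Assumption: var(X) is the variance over all entries of X.
        return np.sqrt(n_features * X.var() / 2.0)
    if heuristic == "scale2":
        return np.sqrt(n_features / 2.0)
    raise ValueError(f"Unknown heuristic: {heuristic!r}")


def gamma_from_sigma(sigma):
    """Relation between sigma and the gamma parameter used elsewhere."""
    return 1.0 / (2.0 * sigma ** 2)


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))

sigma_scale = sigma_from_heuristic(X, "scale")
sigma_scale2 = sigma_from_heuristic(X, "scale2")
gamma_scale = gamma_from_sigma(sigma_scale)    # 1 / (n_features * var(X))
gamma_scale2 = gamma_from_sigma(sigma_scale2)  # 1 / n_features
```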
- Attributes:
- random_weights_ : array-like of shape (n_features, n_components/2)
Random weights applied to the training samples as a step for computing the random Fourier features.
- is_fitted_ : bool
Whether the feature mapping has learned its hyperparameters (if any) and the length of the feature mapping is set.
- len_ : int
Length of the feature mapping vector.
Methods
- est_exp(X, Y): Average value of \(\phi(x,y)\) in the supervised dataset (X,Y).
- est_std(X, Y): Standard deviation of \(\phi(x,y)\) in the supervised dataset (X,Y).
- eval_x(X): Evaluates the one-hot encoded features \(\phi(x,y)\) of the given instances, for every x \(\in\) X and all the labels.
- eval_xy(X, Y): Evaluates the one-hot encoded features \(\phi(x,y)\) of the given instances, for x \(\in\) X and y \(\in\) Y.
- fit(X[, Y]): Learns the set of random weights for computing the features.
- rff_sigma(X): Computes the scaling parameter for the Fourier features using the heuristic given in the paper "Compact Nonlinear Maps and Circulant Extensions" [1].
- transform(X): Computes the random Fourier features (\(z(x)\)).
- __init__(n_classes, fit_intercept=True, sigma='scale', n_components=600, random_state=None, one_hot=False)
- est_exp(X, Y)
Average value of \(\phi(x,y)\) in the supervised dataset (X,Y). Used in the learning stage to estimate the expectation of \(\phi(x,y)\), denoted by \(\tau\).
- est_std(X, Y)
Standard deviation of \(\phi(x,y)\) in the supervised dataset (X,Y). Used in the learning stage to estimate the variance in the expectation of \(\phi(x,y)\), denoted by \(\lambda\).
- eval_x(X)
Evaluates the one-hot encoded features \(\phi(x,y)\) of the given instances, for every x \(\in\) X and all the labels. The output is a 3D array composed of one 2D matrix per instance. Each 2D matrix contains the one-hot encodings of that instance's features for all the possible labels in the data.
- Parameters:
- Returns:
- phi : array-like of shape (n_samples, n_classes, n_features * n_classes)
Matrix containing the one-hot encoding for all the classes for each of the instances given.
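The shape of the eval_x output can be pictured with a small NumPy sketch. It assumes a block layout in which the features of an instance occupy the slot of label y and all other slots are zero; the function name `one_hot_features_all_labels` is hypothetical and the layout used by the library may differ.

```python
import numpy as np


def one_hot_features_all_labels(Z, n_classes):
    """Hypothetical helper: phi(x, y) for every instance and every label.

    Z has shape (n_samples, n_features); the result has shape
    (n_samples, n_classes, n_features * n_classes).
    """
    n_samples, n_features = Z.shape
    phi = np.zeros((n_samples, n_classes, n_features * n_classes))
    for y in range(n_classes):
        # Assumed layout: the features fill the y-th block of the vector.
        phi[:, y, y * n_features:(y + 1) * n_features] = Z
    return phi


Z = np.arange(6.0).reshape(2, 3)  # 2 instances, 3 features each
phi = one_hot_features_all_labels(Z, n_classes=2)
```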
- eval_xy(X, Y)
Evaluates the one-hot encoded features \(\phi(x,y)\) of the given instances, for x \(\in\) X and y \(\in\) Y. The encodings are computed for the given labels and are used in the learning stage to estimate the expectation of \(\phi(x,y)\).
- Parameters:
- Returns:
- phi : array-like of shape (n_samples, n_features * n_classes)
Matrix containing the one-hot encoding with respect to the labels given for all the instances.
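A minimal sketch of the labelled case, under the same assumed block layout (features placed in the slot of the given label). The helper name is hypothetical. The column-wise mean and standard deviation at the end illustrate the kind of quantities that est_exp and est_std are described as estimating; the exact estimators used by the library are not shown here.

```python
import numpy as np


def one_hot_features_for_labels(Z, Y, n_classes):
    """Hypothetical helper: phi(x_i, y_i) for each labelled instance.

    Z has shape (n_samples, n_features) and Y holds integer labels;
    the result has shape (n_samples, n_features * n_classes).
    """
    n_samples = Z.shape[0]
    one_hot = np.eye(n_classes)[Y]  # shape (n_samples, n_classes)
    # Outer product per instance, flattened so block y holds the features.
    return (one_hot[:, :, None] * Z[:, None, :]).reshape(n_samples, -1)


Z = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
Y = np.array([0, 1, 0])
phi_xy = one_hot_features_for_labels(Z, Y, n_classes=2)

tau_estimate = phi_xy.mean(axis=0)  # sample mean of phi(x, y)
std_estimate = phi_xy.std(axis=0)   # sample standard deviation of phi(x, y)
```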
- fit(X, Y=None)
Learns the set of random weights for computing the features. Also computes the scaling parameter if its value is not given.
- Parameters:
- X : array-like of shape (n_samples, n_dimensions)
Unlabeled training instances used to learn the feature configurations.
- Y : array-like of shape (n_samples,), default = None
This argument is never used by this feature mapping. It is present only to keep the signature consistent across the different feature mappings.
- Returns:
- self
Fitted estimator
- rff_sigma(X)
Computes the scaling parameter for the Fourier features using the heuristic given in the paper “Compact Nonlinear Maps and Circulant Extensions” [1].
The heuristic states that the scaling parameter is obtained as the average distance to the 50th nearest neighbour estimated from 1000 samples of the dataset.
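The heuristic can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the library's code: the function name, the `seed` argument, and the choice to measure neighbour distances within the subsample (instead of against the full dataset) are all assumptions.

```python
import numpy as np


def avg_kth_neighbour_distance(X, k=50, n_sample=1000, seed=0):
    """Average distance to the k-th nearest neighbour on a random subsample."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=min(n_sample, n), replace=False)
    S = X[idx]
    if S.shape[0] <= k:
        raise ValueError("Need more than k sampled points.")

    # Pairwise Euclidean distances within the subsample.
    sq_norms = (S ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (S @ S.T)
    dists = np.sqrt(np.maximum(sq_dists, 0.0))

    # After sorting each row, index 0 is the point itself (distance 0),
    # so index k is the distance to the k-th nearest neighbour.
    kth = np.sort(dists, axis=1)[:, k]
    return kth.mean()


# Toy check: 200 points on a unit-spaced line, nearest neighbour (k=1).
X_line = np.arange(200.0).reshape(-1, 1)
sigma_line = avg_kth_neighbour_distance(X_line, k=1)
```

With the defaults `k=50` and `n_sample=1000`, the function follows the description above; the toy call uses `k=1` only so the result is easy to verify by hand.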