.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/further_examples/plot_2_grid.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_further_examples_plot_2_grid.py:

.. _grid:

Hyperparameter Tuning: Upper Bound vs Cross-Validation
==============================================================================

Example of how to use the upper bounds provided by the `MRC` method in the
`MRCpy` library for hyperparameter tuning, and a comparison to
cross-validation. We will see that tuning with the upper bound achieves
performance similar to cross-validation while being about four times faster.

We are using the '0-1' loss and the `RandomFourierPhi` map (`phi='fourier'`).
We are going to tune the regularization parameter `s` of the feature mapping
using a random grid. As the cross-validation baseline we will use the usual
method :ref:`RandomizedSearchCV` from `scikit-learn`.

In the following example, we will use the Nesterov subgradient solver for the
MRC classifier by setting the parameter `solver='subgrad'`.

.. GENERATED FROM PYTHON SOURCE LINES 25-41

.. code-block:: default

    # Import needed modules
    import random
    import time

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from sklearn import preprocessing
    from sklearn.model_selection import RandomizedSearchCV, train_test_split

    from MRCpy import MRC
    from MRCpy.datasets import *

.. GENERATED FROM PYTHON SOURCE LINES 42-53

Random Grid using Upper Bound parameter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We draw `n_iter` random values of the parameter to tune from a given range
and select the one that minimizes the upper bound provided by the MRC method.
On each repetition we calculate and store the upper bound for each candidate
value of `s`.
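Independently of MRCpy, the selection rule just described can be sketched as
follows; `upper_bound` here is a hypothetical stand-in for
`clf.get_upper_bound()`, and `random_grid_min` is a helper name introduced
only for this sketch:

```python
import random

def random_grid_min(upper_bound, s_ini, s_fin, n_iter=10, seed=0):
    """Sample n_iter values of s uniformly in [s_ini, s_fin] and
    return the candidate minimizing the given upper-bound function."""
    rng = random.Random(seed)
    cands = [s_ini + (s_fin - s_ini) * rng.random() for _ in range(n_iter)]
    return min(cands, key=upper_bound)

# Toy bound with its minimum at s = 0.3, purely illustrative.
best = random_grid_min(lambda s: (s - 0.3) ** 2, 0.3, 0.6)
```

With a real classifier, `upper_bound(s)` would fit the model once for that
value of `s` and return its bound, which is what the code below does with
`MRC`.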
We are selecting `n_iter = 10` in the following code because it is the
default value of `RandomizedSearchCV`.

.. GENERATED FROM PYTHON SOURCE LINES 53-84

.. code-block:: default

    def run_RandomGridUpper(X_train, Y_train, X_test, Y_test,
                            s_ini, s_fin, index):
        n_iter = 10
        startTime = time.time()

        # Sample n_iter values of s uniformly at random in [s_ini, s_fin]
        s_id = [(s_fin - s_ini) * random.random() + s_ini
                for i in range(n_iter)]
        upps = np.zeros(n_iter)
        for i in range(n_iter):
            clf = MRC(phi='fourier', s=s_id[i], random_state=0,
                      deterministic=False, solver='subgrad')
            clf.fit(X_train, Y_train)
            upps[i] = clf.get_upper_bound()

        # Refit using the value of s that minimizes the upper bound
        min_upp = np.min(upps)
        best_s = s_id[np.argmin(upps)]
        clf = MRC(phi='fourier', s=best_s, random_state=0,
                  deterministic=False, solver='subgrad')
        clf.fit(X_train, Y_train)
        Y_pred = clf.predict(X_test)
        best_err = np.average(Y_pred != Y_test)

        totalTime = time.time() - startTime
        return pd.DataFrame({'upper': [min_upp], 's': best_s,
                             'time': totalTime, 'error': best_err})

.. GENERATED FROM PYTHON SOURCE LINES 85-87

RandomGridCV
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 87-118

.. code-block:: default

    def run_RandomGridCV(X_train, Y_train, X_test, Y_test,
                         s_ini, s_fin, index):
        n_iter = 10
        startTime = time.time()

        # The data is already split and normalized by the caller; the
        # original version re-split it here through the global variables
        # X, Y and rep, shadowing the arguments. With random_state=rep the
        # resulting split was identical, but the function depended on
        # globals, so that step is removed here.
        s_values = np.linspace(s_ini, s_fin, num=5000)
        param = {'s': s_values}
        mrc = MRC(phi='fourier', random_state=0, deterministic=False,
                  solver='subgrad')
        clf = RandomizedSearchCV(mrc, param, random_state=0, n_iter=n_iter)
        clf.fit(X_train, Y_train)
        Y_pred = clf.predict(X_test)
        error = np.average(Y_pred != Y_test)

        totalTime = time.time() - startTime
        return pd.DataFrame({'upper': [clf.best_estimator_.get_upper_bound()],
                             's': clf.best_estimator_.s,
                             'time': totalTime, 'error': error})
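The speedup reported in the results below has a simple fit-count explanation:
with its default 5-fold cross-validation (recent `scikit-learn` versions),
`RandomizedSearchCV` trains five models per candidate value of `s` plus one
final refit, while the upper-bound criterion needs only a single fit per
candidate. A back-of-the-envelope count under those assumptions:

```python
n_iter = 10    # candidate values of s tried, as in this example
cv_folds = 5   # RandomizedSearchCV default number of folds

fits_cv = n_iter * cv_folds + 1  # one fit per fold and candidate + refit
fits_upper = n_iter + 1          # one fit per candidate + final refit

ratio = fits_cv / fits_upper     # roughly matches the observed speedup
```

This gives a ratio of about 4.6, in line with the roughly fourfold difference
in running times measured below.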
.. GENERATED FROM PYTHON SOURCE LINES 119-131

Comparison
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We run both of the previous hyperparameter-tuning methods over a set of
different datasets and compare their performances. Before calling them, we
set a range of values for the hyperparameter; empirical knowledge tells us
that the best values for `s` lie between 0.3 and 0.6. We repeat the process
several times to make sure the performances do not rely heavily on the
particular train-test split selected.

.. GENERATED FROM PYTHON SOURCE LINES 131-222

.. code-block:: default

    def plot_table(df, title, color):
        fig, ax = plt.subplots()

        # hide axes
        fig.patch.set_visible(False)
        ax.axis('off')
        ax.axis('tight')

        t = ax.table(cellText=df.values, colLabels=df.columns, loc='center',
                     colColours=color, cellColours=[color] * len(df))
        t.auto_set_font_size(False)
        t.set_fontsize(8)
        t.auto_set_column_width(col=list(range(len(df.columns))))
        fig.tight_layout()
        plt.title(title)
        plt.show()


    loaders = [load_mammographic, load_haberman, load_indian_liver,
               load_diabetes, load_credit]
    dataNameList = ["mammographic", "haberman", "indian_liver",
                    "diabetes", "credit"]

    dfCV = pd.DataFrame()
    dfUpper = pd.DataFrame()
    f = '%1.3g'  # format

    for j, load in enumerate(loaders):

        # Loading the dataset
        X, Y = load()
        dataName = dataNameList[j]

        # In order to avoid possible bias from the choice of the train-test
        # split, we repeat this process several (10) times and average the
        # obtained results
        dfCV_aux = pd.DataFrame()
        dfUpper_aux = pd.DataFrame()
        for rep in range(10):
            X_train, X_test, Y_train, Y_test = \
                train_test_split(X, Y, test_size=0.25, random_state=rep)

            # Normalizing the data
            std_scale = preprocessing.StandardScaler().fit(X_train, Y_train)
            X_train = std_scale.transform(X_train)
            X_test = std_scale.transform(X_test)

            s_ini = 0.3
            s_fin = 0.6

            # We tune the parameter using both methods and store the results
            dfCV_aux = pd.concat([dfCV_aux,
                                  run_RandomGridCV(X_train, Y_train,
                                                   X_test, Y_test,
                                                   s_ini, s_fin, rep)],
                                 ignore_index=True)
            dfUpper_aux = pd.concat([dfUpper_aux,
                                     run_RandomGridUpper(X_train, Y_train,
                                                         X_test, Y_test,
                                                         s_ini, s_fin, rep)],
                                    ignore_index=True)

        # We save the mean results of the 10 repetitions
        mean_err = f % np.mean(dfCV_aux['error']) + ' ± ' + \
            f % np.std(dfCV_aux['error'])
        mean_s = f % np.mean(dfCV_aux['s']) + ' ± ' + f % np.std(dfCV_aux['s'])
        mean_time = f % np.mean(dfCV_aux['time']) + ' ± ' + \
            f % np.std(dfCV_aux['time'])
        mean_upper = f % np.mean(dfCV_aux['upper']) + ' ± ' + \
            f % np.std(dfCV_aux['upper'])
        dfCV = pd.concat([dfCV,
                          pd.DataFrame({'dataset': [dataName],
                                        'error': mean_err, 's': mean_s,
                                        'upper': mean_upper,
                                        'time': mean_time})],
                         ignore_index=True)

        mean_err = f % np.mean(dfUpper_aux['error']) + ' ± ' + \
            f % np.std(dfUpper_aux['error'])
        mean_s = f % np.mean(dfUpper_aux['s']) + ' ± ' + \
            f % np.std(dfUpper_aux['s'])
        mean_time = f % np.mean(dfUpper_aux['time']) + ' ± ' + \
            f % np.std(dfUpper_aux['time'])
        mean_upper = f % np.mean(dfUpper_aux['upper']) + ' ± ' + \
            f % np.std(dfUpper_aux['upper'])
        dfUpper = pd.concat([dfUpper,
                             pd.DataFrame({'dataset': [dataName],
                                           'error': mean_err, 's': mean_s,
                                           'upper': mean_upper,
                                           'time': mean_time})],
                            ignore_index=True)

.. GENERATED FROM PYTHON SOURCE LINES 223-227

.. code-block:: default

    dfCV.style.set_caption('RandomGridCV Results').set_properties(
        **{'background-color': 'lightskyblue'}, subset=['error', 'time'])
.. list-table:: RandomGridCV Results
   :header-rows: 1

   * - dataset
     - error
     - s
     - upper
     - time (s)
   * - mammographic
     - 0.212 ± 0.024
     - 0.433 ± 0.0689
     - 0.227 ± 0.0126
     - 36.7 ± 0.365
   * - haberman
     - 0.274 ± 0.0481
     - 0.531 ± 0.0696
     - 0.271 ± 0.0162
     - 30.4 ± 0.277
   * - indian_liver
     - 0.288 ± 0.0179
     - 0.44 ± 0.0487
     - 0.296 ± 0.00526
     - 37.6 ± 0.422
   * - diabetes
     - 0.277 ± 0.0302
     - 0.48 ± 0.0942
     - 0.288 ± 0.007
     - 37.2 ± 0.703
   * - credit
     - 0.205 ± 0.0247
     - 0.512 ± 0.0759
     - 0.2 ± 0.00776
     - 34.5 ± 0.157


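The `mean ± std` entries in these tables come from the `'%1.3g'` formatting
pattern repeated in the script above; it can be isolated into a small helper
(a standard-library sketch; `mean_pm_std` is a name introduced here, and
`pstdev` matches NumPy's default population standard deviation):

```python
from statistics import mean, pstdev

def mean_pm_std(values, fmt='%1.3g'):
    """Format a sequence of results as 'mean ± std'."""
    return fmt % mean(values) + ' ± ' + fmt % pstdev(values)
```

For example, `mean_pm_std([0.2, 0.25, 0.3])` gives `'0.25 ± 0.0408'`.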
.. GENERATED FROM PYTHON SOURCE LINES 228-232

.. code-block:: default

    dfUpper.style.set_caption('RandomGridUpper Results').set_properties(
        **{'background-color': 'lightskyblue'}, subset=['error', 'time'])
.. list-table:: RandomGridUpper Results
   :header-rows: 1

   * - dataset
     - error
     - s
     - upper
     - time (s)
   * - mammographic
     - 0.216 ± 0.0222
     - 0.329 ± 0.025
     - 0.224 ± 0.0125
     - 8.28 ± 0.103
   * - haberman
     - 0.283 ± 0.0502
     - 0.338 ± 0.0262
     - 0.261 ± 0.0153
     - 6.87 ± 0.105
   * - indian_liver
     - 0.288 ± 0.0185
     - 0.339 ± 0.028
     - 0.293 ± 0.00625
     - 8.71 ± 0.175
   * - diabetes
     - 0.294 ± 0.0401
     - 0.337 ± 0.035
     - 0.281 ± 0.00758
     - 8.53 ± 0.0622
   * - credit
     - 0.199 ± 0.0287
     - 0.331 ± 0.0286
     - 0.188 ± 0.00827
     - 8.04 ± 0.0954


.. GENERATED FROM PYTHON SOURCE LINES 233-247

Results
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Comparing the tables above, we see that the two methods, RandomizedSearchCV
and the random grid using upper bounds, perform very similarly: either one
can do slightly better depending on the dataset, but both stay in the same
error range. Furthermore, tuning with the upper bounds greatly improves the
running time, being around four times faster than the usual random grid
with cross-validation.

We also note that on every dataset the optimal value of the parameter `s`
is always around 0.3, which is why this value was chosen as the default in
the library.

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 41 minutes 40.407 seconds)

.. _sphx_glr_download_auto_examples_further_examples_plot_2_grid.py:

.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example

  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_2_grid.py `

  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_2_grid.ipynb `

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_