.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/further_examples/plot_2_grid.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here ` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_further_examples_plot_2_grid.py:

.. _grid:

Hyperparameter Tuning: Upper Bound vs Cross-Validation
==============================================================================

Example of how to use the upper bounds provided by the `MRC` method in the
`MRCpy` library for hyperparameter tuning, and a comparison to
cross-validation. We will see that tuning with the upper bound achieves
performance similar to cross-validation while being about four times faster.

We are using the '0-1' loss and the `RandomFourierPhi` map (`phi='fourier'`).
We are going to tune the regularization parameter `s` of the feature mapping
using a random grid. As the cross-validation baseline we will use the usual
method :ref:`RandomizedSearchCV` from `scikit-learn`.

In the following example, we will use the Nesterov subgradient solver for the
MRC classifier by setting the parameter `solver='subgrad'`.

.. GENERATED FROM PYTHON SOURCE LINES 25-41

.. code-block:: default

    # Import needed modules
    import random
    import time

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from sklearn import preprocessing
    from sklearn.model_selection import RandomizedSearchCV, train_test_split

    from MRCpy import MRC
    from MRCpy.datasets import *

.. GENERATED FROM PYTHON SOURCE LINES 42-53

Random Grid using Upper Bound parameter
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We draw `n_iter` random values of the parameter to tune from a given range
and select the one that minimizes the upper bound provided by the MRC method.
On each repetition we calculate and store the upper bound for each candidate
value of `s`.
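Independently of MRCpy, the selection rule just described can be sketched as
follows; `upper_bound` here is a hypothetical stand-in for
`clf.get_upper_bound()`, and `random_grid_min` is a helper name introduced
only for this sketch:

```python
import random

def random_grid_min(upper_bound, s_ini, s_fin, n_iter=10, seed=0):
    """Sample n_iter values of s uniformly in [s_ini, s_fin] and
    return the candidate minimizing the given upper-bound function."""
    rng = random.Random(seed)
    cands = [s_ini + (s_fin - s_ini) * rng.random() for _ in range(n_iter)]
    return min(cands, key=upper_bound)

# Toy bound with its minimum at s = 0.3, purely illustrative.
best = random_grid_min(lambda s: (s - 0.3) ** 2, 0.3, 0.6)
```

With a real classifier, `upper_bound(s)` would fit the model once for that
value of `s` and return its bound, which is what the code below does with
`MRC`.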
We are selecting `n_iter = 10` in the following code because it is the
default value of `RandomizedSearchCV`.

.. GENERATED FROM PYTHON SOURCE LINES 53-84

.. code-block:: default

    def run_RandomGridUpper(X_train, Y_train, X_test, Y_test,
                            s_ini, s_fin, index):
        n_iter = 10
        startTime = time.time()

        # Sample n_iter values of s uniformly at random in [s_ini, s_fin]
        s_id = [(s_fin - s_ini) * random.random() + s_ini
                for i in range(n_iter)]
        upps = np.zeros(n_iter)
        for i in range(n_iter):
            clf = MRC(phi='fourier', s=s_id[i], random_state=0,
                      deterministic=False, solver='subgrad')
            clf.fit(X_train, Y_train)
            upps[i] = clf.get_upper_bound()

        # Refit using the value of s that minimizes the upper bound
        min_upp = np.min(upps)
        best_s = s_id[np.argmin(upps)]
        clf = MRC(phi='fourier', s=best_s, random_state=0,
                  deterministic=False, solver='subgrad')
        clf.fit(X_train, Y_train)
        Y_pred = clf.predict(X_test)
        best_err = np.average(Y_pred != Y_test)

        totalTime = time.time() - startTime
        return pd.DataFrame({'upper': [min_upp], 's': best_s,
                             'time': totalTime, 'error': best_err})

.. GENERATED FROM PYTHON SOURCE LINES 85-87

RandomGridCV
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. GENERATED FROM PYTHON SOURCE LINES 87-118

.. code-block:: default

    def run_RandomGridCV(X_train, Y_train, X_test, Y_test,
                         s_ini, s_fin, index):
        n_iter = 10
        startTime = time.time()

        # The data is already split and normalized by the caller; the
        # original version re-split it here through the global variables
        # X, Y and rep, shadowing the arguments. With random_state=rep the
        # resulting split was identical, but the function depended on
        # globals, so that step is removed here.
        s_values = np.linspace(s_ini, s_fin, num=5000)
        param = {'s': s_values}
        mrc = MRC(phi='fourier', random_state=0, deterministic=False,
                  solver='subgrad')
        clf = RandomizedSearchCV(mrc, param, random_state=0, n_iter=n_iter)
        clf.fit(X_train, Y_train)
        Y_pred = clf.predict(X_test)
        error = np.average(Y_pred != Y_test)

        totalTime = time.time() - startTime
        return pd.DataFrame({'upper': [clf.best_estimator_.get_upper_bound()],
                             's': clf.best_estimator_.s,
                             'time': totalTime, 'error': error})
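The speedup reported in the results below has a simple fit-count explanation:
with its default 5-fold cross-validation (recent `scikit-learn` versions),
`RandomizedSearchCV` trains five models per candidate value of `s` plus one
final refit, while the upper-bound criterion needs only a single fit per
candidate. A back-of-the-envelope count under those assumptions:

```python
n_iter = 10    # candidate values of s tried, as in this example
cv_folds = 5   # RandomizedSearchCV default number of folds

fits_cv = n_iter * cv_folds + 1  # one fit per fold and candidate + refit
fits_upper = n_iter + 1          # one fit per candidate + final refit

ratio = fits_cv / fits_upper     # roughly matches the observed speedup
```

This gives a ratio of about 4.6, in line with the roughly fourfold difference
in running times measured below.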
.. GENERATED FROM PYTHON SOURCE LINES 119-131

Comparison
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We run both of the previous hyperparameter-tuning methods over a set of
different datasets and compare their performances. Before calling them, we
set a range of values for the hyperparameter; empirical knowledge tells us
that the best values for `s` lie between 0.3 and 0.6. We repeat the process
several times to make sure the performances do not rely heavily on the
particular train-test split selected.

.. GENERATED FROM PYTHON SOURCE LINES 131-222

.. code-block:: default

    def plot_table(df, title, color):
        fig, ax = plt.subplots()

        # hide axes
        fig.patch.set_visible(False)
        ax.axis('off')
        ax.axis('tight')

        t = ax.table(cellText=df.values, colLabels=df.columns, loc='center',
                     colColours=color, cellColours=[color] * len(df))
        t.auto_set_font_size(False)
        t.set_fontsize(8)
        t.auto_set_column_width(col=list(range(len(df.columns))))
        fig.tight_layout()
        plt.title(title)
        plt.show()


    loaders = [load_mammographic, load_haberman, load_indian_liver,
               load_diabetes, load_credit]
    dataNameList = ["mammographic", "haberman", "indian_liver",
                    "diabetes", "credit"]

    dfCV = pd.DataFrame()
    dfUpper = pd.DataFrame()
    f = '%1.3g'  # format

    for j, load in enumerate(loaders):

        # Loading the dataset
        X, Y = load()
        dataName = dataNameList[j]

        # In order to avoid possible bias from the choice of the train-test
        # split, we repeat this process several (10) times and average the
        # obtained results
        dfCV_aux = pd.DataFrame()
        dfUpper_aux = pd.DataFrame()
        for rep in range(10):
            X_train, X_test, Y_train, Y_test = \
                train_test_split(X, Y, test_size=0.25, random_state=rep)

            # Normalizing the data
            std_scale = preprocessing.StandardScaler().fit(X_train, Y_train)
            X_train = std_scale.transform(X_train)
            X_test = std_scale.transform(X_test)

            s_ini = 0.3
            s_fin = 0.6

            # We tune the parameter using both methods and store the results
            dfCV_aux = pd.concat([dfCV_aux,
                                  run_RandomGridCV(X_train, Y_train,
                                                   X_test, Y_test,
                                                   s_ini, s_fin, rep)],
                                 ignore_index=True)
            dfUpper_aux = pd.concat([dfUpper_aux,
                                     run_RandomGridUpper(X_train, Y_train,
                                                         X_test, Y_test,
                                                         s_ini, s_fin, rep)],
                                    ignore_index=True)

        # We save the mean results of the 10 repetitions
        mean_err = f % np.mean(dfCV_aux['error']) + ' ± ' + \
            f % np.std(dfCV_aux['error'])
        mean_s = f % np.mean(dfCV_aux['s']) + ' ± ' + f % np.std(dfCV_aux['s'])
        mean_time = f % np.mean(dfCV_aux['time']) + ' ± ' + \
            f % np.std(dfCV_aux['time'])
        mean_upper = f % np.mean(dfCV_aux['upper']) + ' ± ' + \
            f % np.std(dfCV_aux['upper'])
        dfCV = pd.concat([dfCV,
                          pd.DataFrame({'dataset': [dataName],
                                        'error': mean_err, 's': mean_s,
                                        'upper': mean_upper,
                                        'time': mean_time})],
                         ignore_index=True)

        mean_err = f % np.mean(dfUpper_aux['error']) + ' ± ' + \
            f % np.std(dfUpper_aux['error'])
        mean_s = f % np.mean(dfUpper_aux['s']) + ' ± ' + \
            f % np.std(dfUpper_aux['s'])
        mean_time = f % np.mean(dfUpper_aux['time']) + ' ± ' + \
            f % np.std(dfUpper_aux['time'])
        mean_upper = f % np.mean(dfUpper_aux['upper']) + ' ± ' + \
            f % np.std(dfUpper_aux['upper'])
        dfUpper = pd.concat([dfUpper,
                             pd.DataFrame({'dataset': [dataName],
                                           'error': mean_err, 's': mean_s,
                                           'upper': mean_upper,
                                           'time': mean_time})],
                            ignore_index=True)

.. GENERATED FROM PYTHON SOURCE LINES 223-227

.. code-block:: default

    dfCV.style.set_caption('RandomGridCV Results').set_properties(
        **{'background-color': 'lightskyblue'}, subset=['error', 'time'])
.. list-table:: RandomGridCV Results
   :header-rows: 1

   * - dataset
     - error
     - s
     - upper
     - time (s)
   * - mammographic
     - 0.212 ± 0.024
     - 0.433 ± 0.0689
     - 0.227 ± 0.0126
     - 36.7 ± 0.365
   * - haberman
     - 0.274 ± 0.0481
     - 0.531 ± 0.0696
     - 0.271 ± 0.0162
     - 30.4 ± 0.277
   * - indian_liver
     - 0.288 ± 0.0179
     - 0.44 ± 0.0487
     - 0.296 ± 0.00526
     - 37.6 ± 0.422
   * - diabetes
     - 0.277 ± 0.0302
     - 0.48 ± 0.0942
     - 0.288 ± 0.007
     - 37.2 ± 0.703
   * - credit
     - 0.205 ± 0.0247
     - 0.512 ± 0.0759
     - 0.2 ± 0.00776
     - 34.5 ± 0.157


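The `mean ± std` entries in these tables come from the `'%1.3g'` formatting
pattern repeated in the script above; it can be isolated into a small helper
(a standard-library sketch; `mean_pm_std` is a name introduced here, and
`pstdev` matches NumPy's default population standard deviation):

```python
from statistics import mean, pstdev

def mean_pm_std(values, fmt='%1.3g'):
    """Format a sequence of results as 'mean ± std'."""
    return fmt % mean(values) + ' ± ' + fmt % pstdev(values)
```

For example, `mean_pm_std([0.2, 0.25, 0.3])` gives `'0.25 ± 0.0408'`.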
.. GENERATED FROM PYTHON SOURCE LINES 228-232

.. code-block:: default

    dfUpper.style.set_caption('RandomGridUpper Results').set_properties(
        **{'background-color': 'lightskyblue'}, subset=['error', 'time'])
.. list-table:: RandomGridUpper Results
   :header-rows: 1

   * - dataset
     - error
     - s
     - upper
     - time (s)
   * - mammographic
     - 0.216 ± 0.0222
     - 0.329 ± 0.025
     - 0.224 ± 0.0125
     - 8.28 ± 0.103
   * - haberman
     - 0.283 ± 0.0502
     - 0.338 ± 0.0262
     - 0.261 ± 0.0153
     - 6.87 ± 0.105
   * - indian_liver
     - 0.288 ± 0.0185
     - 0.339 ± 0.028
     - 0.293 ± 0.00625
     - 8.71 ± 0.175
   * - diabetes
     - 0.294 ± 0.0401
     - 0.337 ± 0.035
     - 0.281 ± 0.00758
     - 8.53 ± 0.0622
   * - credit
     - 0.199 ± 0.0287
     - 0.331 ± 0.0286
     - 0.188 ± 0.00827
     - 8.04 ± 0.0954


.. GENERATED FROM PYTHON SOURCE LINES 233-247

Results
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Comparing the tables above, we see that the two methods, RandomizedSearchCV
and the random grid using upper bounds, perform very similarly: either one
can do slightly better depending on the dataset, but both stay in the same
error range. Furthermore, tuning with the upper bounds greatly improves the
running time, being around four times faster than the usual random grid
with cross-validation.

We also note that on every dataset the optimal value of the parameter `s`
is always around 0.3, which is why this value was chosen as the default in
the library.

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 41 minutes 40.407 seconds)

.. _sphx_glr_download_auto_examples_further_examples_plot_2_grid.py:

.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example

  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_2_grid.py `

  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_2_grid.ipynb `

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_