.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/plot_1_example_mrc.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_plot_1_example_mrc.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_plot_1_example_mrc.py:

.. _ex1:

Example: Use of MRC with different settings
===========================================

Example of using MRC on some of the common classification datasets with
different loss and feature mapping settings.

We load each dataset and use stratified 5-fold cross-validation to generate
the train and test partitions: on each iteration one fold is held out for
testing and the remaining folds are used for training. For every fold we
compute the classification error together with the upper and lower bounds
on the error, and we also record the mean training time.

Note that we set the parameter solver='subgrad', which means that the MRC
classifiers use a Nesterov subgradient approach to perform the
optimization. You can check a more elaborate example in :ref:`ex_comp`.

.. GENERATED FROM PYTHON SOURCE LINES 23-130

.. code-block:: default


    import time

    import numpy as np
    import pandas as pd
    from sklearn import preprocessing
    from sklearn.model_selection import StratifiedKFold

    from MRCpy import MRC
    # Import the datasets
    from MRCpy.datasets import *

    # Data sets
    loaders = [
        load_mammographic,
        load_haberman,
        load_indian_liver,
        load_diabetes,
        load_credit,
    ]
    dataName = ["mammographic", "haberman", "indian_liver", "diabetes", "credit"]


    def runMRC(phi, loss):
        results = pd.DataFrame()
        # We fix the random seed so that the stratified k-fold partitions
        # are the same across the different executions
        random_seed = 0

        # Iterate through each dataset and fit the MRC classifier.
        for j, load in enumerate(loaders):

            # Loading the dataset
            X, Y = load()
            r = len(np.unique(Y))
            n, d = X.shape

            clf = MRC(phi=phi, loss=loss, random_state=random_seed,
                      max_iters=5000, solver='subgrad')

            # Generate the partitions of the stratified cross-validation
            n_splits = 5
            cv = StratifiedKFold(
                n_splits=n_splits, random_state=random_seed, shuffle=True
            )

            cvError = list()
            auxTime = 0
            upper = 0
            lower = 0

            # Paired and stratified cross-validation
            for train_index, test_index in cv.split(X, Y):

                X_train, X_test = X[train_index], X[test_index]
                y_train, y_test = Y[train_index], Y[test_index]

                # Normalizing the data: fit the scaler on the training
                # split only and apply it to both splits
                std_scale = preprocessing.StandardScaler().fit(X_train)
                X_train = std_scale.transform(X_train)
                X_test = std_scale.transform(X_test)

                # Save start time for computing training time
                startTime = time.time()

                # Train the model and save the upper and lower bounds
                clf.fit(X_train, y_train)
                upper += clf.get_upper_bound()
                lower += clf.get_lower_bound()

                # Save the training time
                auxTime += time.time() - startTime

                # Predict the class for test instances
                y_pred = clf.predict(X_test)

                # Calculate the error made by the MRC classifier
                cvError.append(np.average(y_pred != y_test))

            res_mean = np.average(cvError)
            res_std = np.std(cvError)

            # Calculating the mean upper and lower bound and training time
            upper = upper / n_splits
            lower = lower / n_splits
            auxTime = auxTime / n_splits

            results = pd.concat(
                [
                    results,
                    pd.DataFrame([{
                        "dataset": dataName[j],
                        "n_samples": "%d" % n,
                        "n_attributes": "%d" % d,
                        "n_classes": "%d" % r,
                        "error": "%1.2g" % res_mean
                        + " +/- "
                        + "%1.2g" % res_std,
                        "upper": "%1.2g" % upper,
                        "lower": "%1.2g" % lower,
                        "avg_train_time (s)": "%1.2g" % auxTime,
                    }]),
                ],
                ignore_index=True,
            )

        return results

.. GENERATED FROM PYTHON SOURCE LINES 131-135

.. code-block:: default


    r1 = runMRC(phi="fourier", loss="0-1")
    r1.style.set_caption("Using 0-1 loss and fourier feature mapping")

.. raw:: html

    <table>
      <caption>Using 0-1 loss and fourier feature mapping</caption>
      <thead>
        <tr>
          <th></th>
          <th>dataset</th>
          <th>n_samples</th>
          <th>n_attributes</th>
          <th>n_classes</th>
          <th>error</th>
          <th>upper</th>
          <th>lower</th>
          <th>avg_train_time (s)</th>
        </tr>
      </thead>
      <tbody>
        <tr><td>0</td><td>mammographic</td><td>961</td><td>5</td><td>2</td><td>0.18 +/- 0.013</td><td>0.23</td><td>0.21</td><td>0.69</td></tr>
        <tr><td>1</td><td>haberman</td><td>306</td><td>3</td><td>2</td><td>0.27 +/- 0.016</td><td>0.26</td><td>0.24</td><td>0.45</td></tr>
        <tr><td>2</td><td>indian_liver</td><td>583</td><td>10</td><td>2</td><td>0.29 +/- 0.0035</td><td>0.29</td><td>0.28</td><td>0.64</td></tr>
        <tr><td>3</td><td>diabetes</td><td>768</td><td>8</td><td>2</td><td>0.25 +/- 0.03</td><td>0.29</td><td>0.25</td><td>0.76</td></tr>
        <tr><td>4</td><td>credit</td><td>690</td><td>15</td><td>2</td><td>0.14 +/- 0.034</td><td>0.2</td><td>0.15</td><td>0.95</td></tr>
      </tbody>
    </table>


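The "error" column above is the mean 0-1 error over the five folds,
plus/minus its standard deviation, formatted with ``"%1.2g"`` exactly as in
``runMRC``. A minimal sketch of that aggregation, using made-up per-fold
predictions rather than the real datasets:

```python
import numpy as np

# Hypothetical (y_test, y_pred) pairs for two folds; these stand in for
# the per-fold outputs of clf.predict in the example above.
folds = [
    (np.array([0, 1, 1, 0, 1]), np.array([0, 1, 0, 0, 1])),
    (np.array([1, 0, 1, 1, 0]), np.array([1, 0, 1, 0, 0])),
]

cvError = []
for y_test, y_pred in folds:
    # 0-1 error: fraction of misclassified test instances in this fold
    cvError.append(np.average(y_pred != y_test))

res_mean = np.average(cvError)
res_std = np.std(cvError)
print("%1.2g" % res_mean + " +/- " + "%1.2g" % res_std)  # prints "0.2 +/- 0"
```

Each fold above misclassifies one instance out of five, so the mean error
is 0.2 with zero spread.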
.. GENERATED FROM PYTHON SOURCE LINES 136-139

.. code-block:: default


    r2 = runMRC(phi="fourier", loss="log")
    r2.style.set_caption("Using log loss and fourier feature mapping")

.. raw:: html

    <table>
      <caption>Using log loss and fourier feature mapping</caption>
      <thead>
        <tr>
          <th></th>
          <th>dataset</th>
          <th>n_samples</th>
          <th>n_attributes</th>
          <th>n_classes</th>
          <th>error</th>
          <th>upper</th>
          <th>lower</th>
          <th>avg_train_time (s)</th>
        </tr>
      </thead>
      <tbody>
        <tr><td>0</td><td>mammographic</td><td>961</td><td>5</td><td>2</td><td>0.18 +/- 0.011</td><td>0.54</td><td>0.44</td><td>2.3</td></tr>
        <tr><td>1</td><td>haberman</td><td>306</td><td>3</td><td>2</td><td>0.27 +/- 0.016</td><td>0.58</td><td>0.5</td><td>1.1</td></tr>
        <tr><td>2</td><td>indian_liver</td><td>583</td><td>10</td><td>2</td><td>0.29 +/- 0.0035</td><td>0.6</td><td>0.59</td><td>1.8</td></tr>
        <tr><td>3</td><td>diabetes</td><td>768</td><td>8</td><td>2</td><td>0.24 +/- 0.027</td><td>0.6</td><td>0.52</td><td>2.7</td></tr>
        <tr><td>4</td><td>credit</td><td>690</td><td>15</td><td>2</td><td>0.14 +/- 0.034</td><td>0.5</td><td>0.39</td><td>2.5</td></tr>
      </tbody>
    </table>


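The per-fold normalization in the code above fits the ``StandardScaler`` on
the training split only and then transforms both splits, so no test-set
statistics leak into training. The same pattern can be sketched in plain
NumPy, with synthetic data standing in for the real folds:

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(loc=3.0, scale=2.0, size=(100, 4))
X_test = rng.normal(loc=3.0, scale=2.0, size=(20, 4))

# Estimate mean and standard deviation from the training split ONLY:
# the test split must not influence these statistics.
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)

# Apply the same transformation to both splits.
X_train_std = (X_train - mu) / sigma
X_test_std = (X_test - mu) / sigma

# The training split is exactly standardized; the test split is only
# approximately so, which is expected.
print(np.allclose(X_train_std.mean(axis=0), 0.0))  # prints "True"
```

This mirrors ``preprocessing.StandardScaler().fit(X_train)`` followed by
``transform`` on each split.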
.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 1 minutes 10.256 seconds)


.. _sphx_glr_download_auto_examples_plot_1_example_mrc.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_1_example_mrc.py <plot_1_example_mrc.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_1_example_mrc.ipynb <plot_1_example_mrc.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_