.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "packages/scikit-learn/auto_examples/plot_variance_linear_regr.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_packages_scikit-learn_auto_examples_plot_variance_linear_regr.py:

==================================================
Plot variance and regularization in linear models
==================================================

.. GENERATED FROM PYTHON SOURCE LINES 8-15

.. code-block:: default

    import numpy as np

    # Smaller figures
    import matplotlib.pyplot as plt
    plt.rcParams['figure.figsize'] = (3, 2)

.. GENERATED FROM PYTHON SOURCE LINES 16-17

We consider the situation where we have only two data points.

.. GENERATED FROM PYTHON SOURCE LINES 17-21

.. code-block:: default

    # Two training points (as a column vector) and their targets
    X = np.c_[ .5, 1].T
    y = [.5, 1]
    # Points at which the fitted line is evaluated
    X_test = np.c_[ 0, 2].T

.. GENERATED FROM PYTHON SOURCE LINES 22-23

Without noise, linear regression fits the data perfectly:

.. GENERATED FROM PYTHON SOURCE LINES 23-29

.. code-block:: default

    from sklearn import linear_model
    regr = linear_model.LinearRegression()
    regr.fit(X, y)
    plt.plot(X, y, 'o')
    plt.plot(X_test, regr.predict(X_test))

.. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_variance_linear_regr_001.png
   :alt: plot variance linear regr
   :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_variance_linear_regr_001.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 30-31

In real-life situations, we have noise (e.g. measurement noise) in our data:

.. GENERATED FROM PYTHON SOURCE LINES 31-38

.. code-block:: default

    np.random.seed(0)
    for _ in range(6):
        # Perturb the inputs, then refit and plot the resulting line
        noisy_X = X + np.random.normal(loc=0, scale=.1, size=X.shape)
        plt.plot(noisy_X, y, 'o')
        regr.fit(noisy_X, y)
        plt.plot(X_test, regr.predict(X_test))

.. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_variance_linear_regr_002.png
   :alt: plot variance linear regr
   :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_variance_linear_regr_002.png
   :class: sphx-glr-single-img

.. GENERATED FROM PYTHON SOURCE LINES 39-47

As we can see, our linear model captures and amplifies the noise in the
data: the fitted line changes substantially from one noisy sample to the
next. In other words, it displays a lot of variance.

We can use another linear estimator that uses regularization, the
:class:`~sklearn.linear_model.Ridge` estimator. This estimator
regularizes the coefficients by shrinking them toward zero, under the
assumption that very high correlations are often spurious. The ``alpha``
parameter controls the amount of shrinkage used.

.. GENERATED FROM PYTHON SOURCE LINES 47-57

.. code-block:: default

    regr = linear_model.Ridge(alpha=.1)
    # Same seed as before, so the noisy samples are identical
    np.random.seed(0)
    for _ in range(6):
        noisy_X = X + np.random.normal(loc=0, scale=.1, size=X.shape)
        plt.plot(noisy_X, y, 'o')
        regr.fit(noisy_X, y)
        plt.plot(X_test, regr.predict(X_test))

    plt.show()

.. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_variance_linear_regr_003.png
   :alt: plot variance linear regr
   :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_variance_linear_regr_003.png
   :class: sphx-glr-single-img
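
The variance reduction described above can also be checked numerically
rather than by eye. The following is a minimal sketch, not part of the
original example: the helper ``prediction_spread``, the number of noisy
draws (200), and the grid of ``alpha`` values are all assumptions chosen
for illustration. It refits each estimator on many noisy copies of ``X``
and reports the standard deviation of the predictions at ``X_test``, then
shows how the Ridge slope on the noise-free data shrinks as ``alpha``
grows.

```python
import numpy as np
from sklearn import linear_model

# Same two training points and test points as in the example above.
X = np.c_[.5, 1].T
y = [.5, 1]
X_test = np.c_[0, 2].T


def prediction_spread(regr, n_draws=200, scale=.1, seed=0):
    """Std. dev. of predictions at X_test over noisy refits.

    Hypothetical helper written for this sketch; n_draws, scale and
    seed are illustrative choices.
    """
    rng = np.random.RandomState(seed)
    preds = []
    for _ in range(n_draws):
        # Perturb the inputs exactly as in the plotting loops above
        noisy_X = X + rng.normal(loc=0, scale=scale, size=X.shape)
        regr.fit(noisy_X, y)
        preds.append(regr.predict(X_test))
    # One standard deviation per test point (x=0 and x=2)
    return np.std(preds, axis=0)


ols_spread = prediction_spread(linear_model.LinearRegression())
ridge_spread = prediction_spread(linear_model.Ridge(alpha=.1))
print("LinearRegression spread at x=0, x=2:", ols_spread)
print("Ridge(alpha=.1)  spread at x=0, x=2:", ridge_spread)

# Slope fitted on the noise-free data for increasing alpha:
# stronger regularization shrinks the coefficient toward zero.
alphas = (0.01, 0.1, 1, 10)
slopes = [linear_model.Ridge(alpha=a).fit(X, y).coef_[0] for a in alphas]
for a, s in zip(alphas, slopes):
    print("alpha=%g -> slope=%.3f" % (a, s))
```

Under these assumptions, the Ridge predictions should vary less across
noisy draws than the unregularized ones, at the price of a slope that is
biased toward zero even on the noise-free data.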