.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "packages/scikit-learn/auto_examples/plot_california_prediction.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_packages_scikit-learn_auto_examples_plot_california_prediction.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_packages_scikit-learn_auto_examples_plot_california_prediction.py:


A simple regression analysis on the California housing data
===========================================================

Here we perform a simple regression analysis on the California housing
data, comparing two types of regressors: a linear regression and
gradient-boosted trees.

.. GENERATED FROM PYTHON SOURCE LINES 9-13

.. code-block:: default


    from sklearn.datasets import fetch_california_housing

    data = fetch_california_housing(as_frame=True)


.. GENERATED FROM PYTHON SOURCE LINES 14-15

Plot a histogram of the quantity to predict: the price

.. GENERATED FROM PYTHON SOURCE LINES 15-22

.. code-block:: default

    import matplotlib.pyplot as plt

    plt.figure(figsize=(4, 3))
    plt.hist(data.target)
    plt.xlabel('price ($100k)')
    plt.ylabel('count')
    plt.tight_layout()


.. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_001.png
   :alt: plot california prediction
   :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 23-24

Plot the price against each feature, one scatter plot per feature

.. GENERATED FROM PYTHON SOURCE LINES 24-33

.. code-block:: default


    for index, feature_name in enumerate(data.feature_names):
        plt.figure(figsize=(4, 3))
        plt.scatter(data.data[feature_name], data.target)
        plt.ylabel('Price', size=15)
        plt.xlabel(feature_name, size=15)
        plt.tight_layout()


.. rst-class:: sphx-glr-horizontal


    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_002.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_002.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_003.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_003.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_004.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_004.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_005.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_005.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_006.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_006.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_007.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_007.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_008.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_008.png
         :class: sphx-glr-multi-img

    *

      .. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_009.png
         :alt: plot california prediction
         :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_009.png
         :class: sphx-glr-multi-img


.. GENERATED FROM PYTHON SOURCE LINES 34-35

Simple prediction with a linear regression

.. GENERATED FROM PYTHON SOURCE LINES 35-54

.. code-block:: default


    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(data.data, data.target)

    from sklearn.linear_model import LinearRegression

    clf = LinearRegression()
    clf.fit(X_train, y_train)
    predicted = clf.predict(X_test)
    expected = y_test

    plt.figure(figsize=(4, 3))
    plt.scatter(expected, predicted)
    plt.plot([0, 8], [0, 8], '--k')
    plt.axis('tight')
    plt.xlabel('True price ($100k)')
    plt.ylabel('Predicted price ($100k)')
    plt.tight_layout()


.. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_010.png
   :alt: plot california prediction
   :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_010.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 55-56

Prediction with gradient-boosted trees

.. GENERATED FROM PYTHON SOURCE LINES 56-73

.. code-block:: default


    from sklearn.ensemble import GradientBoostingRegressor

    clf = GradientBoostingRegressor()
    clf.fit(X_train, y_train)

    predicted = clf.predict(X_test)
    expected = y_test

    plt.figure(figsize=(4, 3))
    plt.scatter(expected, predicted)
    plt.plot([0, 5], [0, 5], '--k')
    plt.axis('tight')
    plt.xlabel('True price ($100k)')
    plt.ylabel('Predicted price ($100k)')
    plt.tight_layout()


.. image-sg:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_011.png
   :alt: plot california prediction
   :srcset: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_011.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 74-75

Print the root-mean-square (RMS) error of the gradient boosting
predictions on the test set, in units of $100k

.. GENERATED FROM PYTHON SOURCE LINES 75-79

.. code-block:: default

    import numpy as np

    print(f"RMS: {np.sqrt(np.mean((predicted - expected) ** 2))!r} ")

    plt.show()


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    RMS: 0.5314909993118918


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 3.857 seconds)


.. _sphx_glr_download_packages_scikit-learn_auto_examples_plot_california_prediction.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example


    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_california_prediction.py <plot_california_prediction.py>`

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_california_prediction.ipynb <plot_california_prediction.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
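
The RMS value printed by the script is a root-mean-square error: the square
root of the mean squared difference between predicted and true prices,
expressed in the same unit as the target ($100k). As a minimal sketch that is
not part of the original example, the hypothetical helper ``rms_error`` below
computes the same quantity in plain Python on made-up numbers, so the
arithmetic can be checked by hand.

```python
import math


def rms_error(predicted, expected):
    """Root-mean-square error between two equal-length sequences.

    Hypothetical helper for illustration; not part of the original example.
    """
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one value is required")
    squared_errors = [(p - e) ** 2 for p, e in zip(predicted, expected)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up prices in units of $100k, chosen only to keep the arithmetic simple.
toy_expected = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [1.0, 2.0, 3.0, 6.0]

# Squared errors are 0, 0, 0, 4; their mean is 1, so the RMS error is 1.0.
print(f"RMS: {rms_error(toy_predicted, toy_expected)}")
```

Read this way, an RMS error of about 0.53 in the run above corresponds to a
typical prediction miss of roughly $53,000 on the test set.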