{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Importing Libraries and Data\n", "In this cell, we start by importing the necessary libraries. We use Matplotlib for data visualization, NumPy for numerical operations, Pandas for data handling, and scikit-learn for machine learning tools. We then define our data: **X** represents the independent variable, and **Y** represents the dependent variable." ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "ExecuteTime": { "end_time": "2023-11-05T18:51:53.380858Z", "start_time": "2023-11-05T18:51:53.374238Z" } }, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "from sklearn.linear_model import LinearRegression\n", "from sklearn.preprocessing import PolynomialFeatures\n", "\n", "\n", "X = np.array([0, 1, 2, -1, -2]).reshape(-1, 1)\n", "Y = np.array([1, 6, 33, 0, 9])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Polynomial Regression Model Training\n", "In this cell, we calculate the degree of the polynomial regression model, which is one less than the number of data points. We use `PolynomialFeatures` to transform our input data into polynomial features up to the specified degree. Then, we create a linear regression model and train it using the polynomial features." 
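, "\n", "\n", "As a rough check of why degree $n - 1$ is a natural choice here: a polynomial of degree 4 has five coefficients, so five points with distinct $X$ values determine it uniquely and the fitted curve passes through every point. The coefficients used below are the rounded values this notebook reports in its later cells, so treat this as a hand verification rather than a derivation.\n", "\n", "$$p(x) = 1 + 2x + x^2 + x^3 + x^4$$\n", "\n", "$$p(2) = 1 + 4 + 4 + 8 + 16 = 33, \\qquad p(-2) = 1 - 4 + 4 - 8 + 16 = 9$$\n", "\n", "Both values agree with the corresponding entries of **Y**."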
] }, { "cell_type": "code", "execution_count": 75, "metadata": { "ExecuteTime": { "end_time": "2023-11-05T18:51:55.473985Z", "start_time": "2023-11-05T18:51:55.465488Z" }, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "LinearRegression()" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "degree = len(X)-1\n", "poly = PolynomialFeatures(degree=degree)\n", "X_poly = poly.fit_transform(X)\n", "\n", "regressor = LinearRegression()\n", "regressor.fit(X_poly, Y)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Visualization of Polynomial Regression\n", "This cell is responsible for visualizing the polynomial regression results. It generates a range of `x_values` to create a smooth curve and predicts `y_predicted` values. Then, it uses Matplotlib to plot the original data points in red and the predicted curve in blue, providing a visual representation of the model's performance." ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "ExecuteTime": { "end_time": "2023-11-05T18:52:00.190744Z", "start_time": "2023-11-05T18:52:00.065602Z" }, "scrolled": false }, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "x_values = np.linspace(min(X), max(X), 100).reshape(-1, 1)\n", "\n", "x_values_poly = poly.transform(x_values)\n", "y_predicted = regressor.predict(x_values_poly)\n", "\n", "plt.scatter(X, Y, color='red', label='Data')\n", "\n", "plt.plot(x_values, y_predicted, color='blue', label='Predicted')\n", "\n", "plt.title(f'Polynomial Regression Prediction (Degree {degree})')\n", "plt.xlabel('X')\n", "plt.ylabel('Y')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Data Table\n", "This cell creates a Pandas DataFrame `df` that contains the original data in the 'X' column, actual 'Y' values in the 'Y' column, and the predicted 'Y' values in the 'Y_pred' column." ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "ExecuteTime": { "end_time": "2023-11-05T18:52:15.416747Z", "start_time": "2023-11-05T18:52:15.394026Z" }, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
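<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>X</th>\n",
"      <th>Y</th>\n",
"      <th>Y_pred</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>0</th>\n",
"      <td>0</td>\n",
"      <td>1</td>\n",
"      <td>1.000000e+00</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1</th>\n",
"      <td>1</td>\n",
"      <td>6</td>\n",
"      <td>6.000000e+00</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>2</th>\n",
"      <td>2</td>\n",
"      <td>33</td>\n",
"      <td>3.300000e+01</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>3</th>\n",
"      <td>-1</td>\n",
"      <td>0</td>\n",
"      <td>-1.532108e-14</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>4</th>\n",
"      <td>-2</td>\n",
"      <td>9</td>\n",
"      <td>9.000000e+00</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>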
" ], "text/plain": [ " X Y Y_pred\n", "0 0 1 1.000000e+00\n", "1 1 6 6.000000e+00\n", "2 2 33 3.300000e+01\n", "3 -1 0 -1.532108e-14\n", "4 -2 9 9.000000e+00" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "y_predicted = regressor.predict(X_poly)\n", "data = {'X': X.ravel(), 'Y': Y, 'Y_pred': y_predicted}\n", "df = pd.DataFrame(data)\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Polynomial Coefficients\n", "In this cell, we extract the coefficients of the polynomial regression model and create a DataFrame `coeff_df`. The DataFrame presents the coefficients alongside the corresponding features, which are powers of 'X'. This helps in understanding the impact of each term on the model's predictions." ] }, { "cell_type": "code", "execution_count": 84, "metadata": { "ExecuteTime": { "end_time": "2023-11-05T18:53:55.286538Z", "start_time": "2023-11-05T18:53:55.277046Z" }, "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>Feature</th>\n",
"      <th>Coefficient</th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>0</th>\n",
"      <td>X^1</td>\n",
"      <td>2.0</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1</th>\n",
"      <td>X^2</td>\n",
"      <td>1.0</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>2</th>\n",
"      <td>X^3</td>\n",
"      <td>1.0</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>3</th>\n",
"      <td>X^4</td>\n",
"      <td>1.0</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>" ], "text/plain": [ "  Feature  Coefficient\n", "0     X^1          2.0\n", "1     X^2          1.0\n", "2     X^3          1.0\n", "3     X^4          1.0" ] }, "execution_count": 84, "metadata": {}, "output_type": "execute_result" } ], "source": [ "coefficients = regressor.coef_\n", "coeff_data = {'Feature': [f'X^{i}' for i in range(1, degree + 1)], 'Coefficient': coefficients[1:]}\n", "coeff_df = pd.DataFrame(coeff_data)\n", "coeff_df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Generating LaTeX Equation\n", "In this cell, we use the `IPython.display` module to render the fitted polynomial as a LaTeX equation. The equation includes the intercept and every term whose coefficient is not numerically zero, which keeps the displayed equation concise." ] }, { "cell_type": "code", "execution_count": 88, "metadata": { "ExecuteTime": { "end_time": "2023-11-05T18:55:37.776422Z", "start_time": "2023-11-05T18:55:37.766784Z" }, "scrolled": true }, "outputs": [ { "data": { "text/latex": [ "$\\displaystyle Our Equation: 1.000 + 2.000X^1 + 1.000X^2 + 1.000X^3 + 1.000X^4$" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from IPython.display import display, Math\n", "\n", "# coef_[0] belongs to the constant column added by PolynomialFeatures;\n", "# the constant term itself is stored in regressor.intercept_.\n", "coefficients = regressor.coef_[1:]\n", "\n", "# Keep only the terms whose coefficient is not numerically zero.\n", "terms = [f'{coeff:.3f}X^{i}' for i, coeff in enumerate(coefficients, start=1) if not np.isclose(coeff, 0)]\n", "latex_equation = \"$Our Equation: {:.3f} + {}$\".format(\n", "    regressor.intercept_,\n", "    ' + '.join(terms)\n", ")\n", "\n", "display(Math(latex_equation))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Calculating Polynomial Values\n", "\n", "In this cell, we define a function, `calculate_polynomial_value`, that evaluates the polynomial for the provided coefficients, a value, and an intercept term. 
The function raises `value` to the power given by each coefficient's index, sums the resulting terms, and then adds the intercept.\n", "\n", "The example usage calls the function with the coefficients and intercept from the `regressor` object. The code evaluates the polynomial at a value of 1 and prints the result to five decimal places. The result is 6, which matches the Y value for X = 1 in our initial data table." ] }, { "cell_type": "code", "execution_count": 90, "metadata": { "ExecuteTime": { "end_time": "2023-11-05T18:56:19.359906Z", "start_time": "2023-11-05T18:56:19.353891Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "6.00000\n" ] } ], "source": [ "def calculate_polynomial_value(coefficients, value, intercept):\n", "    # coefficients[i] is the weight of value**i, so index 0 pairs with the constant column.\n", "    result = sum(coeff * (value ** i) for i, coeff in enumerate(coefficients))\n", "    return result + intercept\n", "\n", "# Example usage:\n", "coefficients = regressor.coef_\n", "intercept = regressor.intercept_\n", "value = 1  # a separate name, so the data array X is not overwritten\n", "result = calculate_polynomial_value(coefficients, value, intercept)\n", "print(f\"{result:.5f}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 2 }