{ "cells": [ { "cell_type": "markdown", "id": "de4c7922-8621-40b8-8acf-ec4e51c6e377", "metadata": {}, "source": [ "# Linear Regression\n", "\n", "In this notebook, we will explore some more operations, such as calculating the logarithm in NumPy and doing a linear regression.\n", "\n", "You have the following isothermal reaction:\n", "$A+\\frac{1}{6}B \\rightarrow \\frac{1}{4}C+\\frac{1}{2}D$\n", "\n", "During a laboratory experiment, you take measurements in a constant-volume batch reactor. The initial concentration of A, $C_{A0}$, is 25 $mol \\cdot m^{-3}$." ] }, { "cell_type": "code", "execution_count": 1, "id": "13048039-3f7d-49ad-bbde-06353d36dfb0", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import scipy\n", "from scipy import stats\n", "import matplotlib.pyplot as plt\n", "\n", "# simulate 11 evenly spaced CC values between 0 and 4, one per time point\n", "cc = np.linspace(0, 4, 11)\n", "time = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20]\n", "\n", "# create a pandas dataframe with the data we simulated\n", "df = pd.DataFrame()\n", "df['Time'] = time\n", "df['Cc'] = cc" ] }, { "cell_type": "code", "execution_count": 2, "id": "c9883c57-fe68-4693-8250-dfbc0a86f8a8", "metadata": {}, "outputs": [], "source": [ "# create a function to calculate CA given CC and CA0\n", "def calculate_CA(C_A0, Cc):\n", "    C_A = C_A0-(4*Cc)\n", "    return C_A\n", "\n", "# From the stoichiometry (1 mol of A yields 1/4 mol of C), CA = CA0 - 4*CC\n", "df['Ca'] = calculate_CA(25, df['Cc'])" ] }, { "cell_type": "markdown", "id": "24a6a7bd-e3a8-472d-bb4e-a7d07e4002a6", "metadata": {}, "source": [ "## Finding the logarithm of a value in Python\n", "\n", "Here we will use the NumPy library to calculate the natural logarithm of the $C_A$ values computed above."
] }, { "cell_type": "code", "execution_count": 3, "id": "0ce6a9cf-25ca-4670-b920-b99660123d06", "metadata": {}, "outputs": [], "source": [ "# find the natural logarithm of CA, rounded to two decimals\n", "df['ln(Ca)'] = round(np.log(df['Ca']), 2)" ] }, { "cell_type": "markdown", "id": "3d886559-d7cd-4521-a6d9-f86446051a98", "metadata": {}, "source": [ "## Simple linear regression in Python with the SciPy library\n", "\n", "Here we calculate a linear least-squares regression for two sets of measurements.\n", "Check the documentation [here](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.linregress.html).\n", "\n", "The function returns:\n", "- Slope of the regression line\n", "- Intercept of the regression line\n", "- The Pearson correlation coefficient (rvalue). The square of rvalue is equal to the coefficient of determination\n", "- The p-value for a hypothesis test whose null hypothesis is that the slope is zero, using a Wald test with a t-distribution of the test statistic. See the `alternative` parameter in the documentation for other alternative hypotheses\n", "- Standard error of the estimated slope (gradient), under the assumption of residual normality\n", "- Standard error of the estimated intercept, under the assumption of residual normality" ] }, { "cell_type": "code", "execution_count": 4, "id": "ceb896e9-f581-4b1c-8507-a1ffb05d2cab", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "| | Time | Cc | Ca | ln(Ca) |\n",
"|---|---|---|---|---|\n",
"| 0 | 0 | 0.0 | 25.0 | 3.22 |\n",
"| 1 | 2 | 0.4 | 23.4 | 3.15 |\n",
"| 2 | 4 | 0.8 | 21.8 | 3.08 |\n",
"| 3 | 6 | 1.2 | 20.2 | 3.01 |\n",
"| 4 | 8 | 1.6 | 18.6 | 2.92 |\n",
"| 5 | 10 | 2.0 | 17.0 | 2.83 |\n",
"| 6 | 12 | 2.4 | 15.4 | 2.73 |\n",
"| 7 | 14 | 2.8 | 13.8 | 2.62 |\n",
"| 8 | 16 | 3.2 | 12.2 | 2.50 |\n",
"| 9 | 18 | 3.6 | 10.6 | 2.36 |\n",
"| 10 | 20 | 4.0 | 9.0 | 2.20 |\n", "