E. coli Fed-batch fermentation model, including recombinant protein production#
In this Python Script the ODE (Ordinary Differential Equation) of an Ecoli model is solved with the Python ODE Solver odeint.
The equations for E. coli growth have been based on the work of Anane et al, which can be found in https://doi.org/10.1016/j.bej.2017.05.013. The protein production dynamics have been based in the work of Chae et al. which can be found in https://doi.org/10.1002/1097-0290(20000805)69:3<275::aid-bit5>3.0.co;2-y. The combine implementation can be found in the MSc Thesis https://scholar.tecnico.ulisboa.pt/records/uuoC1N4NzN890o9vFCM5lQm9zjqNvBsOY3pp in Section 3.1.
Package import#
This portion of the code handles the import of all the relevant python packages.
# import packages
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
import plotly.express as px
import plotly.graph_objects as go
from scipy.integrate import solve_ivp
from plotly.subplots import make_subplots
import pandas as pd
%matplotlib inline
#%matplotlib qt
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
Cell In[1], line 3
1 # import packages
2 from __future__ import division
----> 3 import numpy as np
4 import matplotlib.pyplot as plt
5 from scipy.integrate import odeint
ModuleNotFoundError: No module named 'numpy'
Define timesteps#
This cell defines the initial and end-time, and the timestops to perform the simulation
# discretizing the time
dt = 0.01
# defining the initial time
t0 = 0
# defining the end-time
T = 25 #h
# generation of the time-points
t = np.linspace(t0, T, int(T/dt)+1)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[2], line 11
8 T = 25 #h
10 # generation of the time-points
---> 11 t = np.linspace(t0, T, int(T/dt)+1)
NameError: name 'np' is not defined
Define initial conditions and process parameters#
In this cell the initial conditions for the monitored variables as well as process parameters regarding the feed strategy. y0 is the vector which contains the initial conditions values.
# initial values at time 0
X0 = 0.025 # Biomass [g/L]
S0 = 20 # Substrate [g/L]
DOT0 = 90 # Dissolved oxygen [%]
A0 = 0 # Acetate [g/L]
V0 = 1400 # Volume [L]
I0 = 0 #Inducer concentration [g/L]
P0 =0 #Protein concentratuin [g/L]
# process parameters
Si = 300
If = 50
Fout = 0
KLa = 600
muset = 0.3
# define initial condition and save it in the y
y0 = [X0, S0, DOT0, A0, V0, I0, P0]
First pass of the model#
The feeding strategy employed is trigered by a decrease in Glucose concentration below a certain value. In the approach implemented here the timepoint at which this happens is determined by running the model a first time, without considering feed, i.e., running the model as if we were studying a batch process. We also extract the Biomass concetration at the end of the batch phase, which is necessary for the feed rate calculations
Note: If using e.g., the solve_ivp solver, a different approach to extract these values can be followed by the use of flags. This would be a more elegant implementation
Define the model equations#
This cell defines the model equation in Batch mode.
The first step is to assign the variable to simulate to the result vector y.
The feed parameters are defined, in this case set to 0, since we consider a batch process
The parameters characteristic for the strain are defined as well as physicochemical constants
The algebraic expressions that describe the kinetics of the system are defined
The ODE system is defined
# define the function; Run the model a first time, with no feeding to determibe tf
def eColi_first_pass(y, t):
X = y[0]
S = y[1]
DOT = y[2]
A = y[3]
V = y[4]
I = y[5]
P = y[6]
# process parameters
F1=0
Fi=0
# parameters describing the characteristics of the strain
qAmax = 1 # max Acetate consumption rate [g/g*h]
Kaq = 0.01 # affinity constant Acetate consumption [g/L]
Ksq = 0.1 # affinity constnat Substrate consumption [g/L]
Yam = 0.2 # yield acetate maintenance [g/g]
Yaresp = 0.2 # yield acetate respiratory [g/g]
Yem = 0.56 # yield excluding maintance [g/g]
qSmax = 1.4 # max glucose uptake rate [g/g*h]
Ks = 0.05 # affinity constant glucose consumption[g/L]
qm = 0.04 # specific maintenance coefficient[g/g*h]
Ko = 1 # Affinity constant, oxygen consumption [g/L]
Yosresp = 1.217 # yield from S to X, respiratory [g/g]
pAmax = 1 # max Acetate production rate [g/g*h]
Kap = 10 # affinity constant intracellular acetate production[g/L]
Yaof = 1 # aceate yield in overflow[g/g]
Yofm = Yem # [g/g]
k1=0.32
k2=0.00044
k3=0.6
Ki=0.55
muset = 0.3 #Growth rate set in feed phase
# physicochemical constants
Cs = 0.391 # ratio of substrate per C in [gC/gS]
Cx = 0.488 # ratio of biomass per C in [gC/gX]
H = 14000 # conversion factor
DOTstar = 90
# algebraic variables
qS = qSmax * S / (S + Ks) * DOT / (DOT + Ko) # substrate uptake
qSof = pAmax * qS / ( qS + Kap ) / Yaof # overflow substrate conversion
pA = pAmax * qS / (qS + Kap) # production of acetate
qSox = qS - qSof # substrate uptake excluding overflow
qSan =(qSox - qm) * Yem * Cx / Cs # anabolic substrate consumption
qsA = qAmax * A / (A + Kaq) * (Ksq / (Ksq + qS)) # acetate consumption
qA = pA - qsA # total acetate equilibrium
qO = Yosresp * (qSox - qSan) + qsA * Yaresp # oxygen uptake
# growth rate equation
my =(qSox - qm) * Yem + qsA * Yam + qSof * Yofm
# differential equation system
dXdt = -F1 / V * X + my * X # biomass growth
dSdt = -F1 / V * S - qS * X + F1 / V * Si # substrate evolution
dDOTdt = KLa * (DOTstar - DOT) - qO * X * H # oxygen dynamics
dAdt = qA * X - F1 / V * A # aetate evolution
dVdt = (F1 - Fout) # volume change
dIdt = -F1/V*I + (Fi/V)*(If-I) #inducer dynamics
dPdt = -F1/V*P + ((k1*my*I)/(Ki+I)+k2)* X - k3 * P #protein dynamics
#the solution of odeint is formulated as a system of an ODE
dydt = [dXdt, dSdt, dDOTdt, dAdt, dVdt, dIdt, dPdt]
return dydt
Extract the relevant variables#
From the first pass on the model solving we extract the values that correspond to S < 0.5
solution_first_pass = odeint(eColi_first_pass, y0, t)
# Extract S and find tf when S drops below 0.5
S = solution_first_pass[:, 1] #Substrate concentraion S
X = solution_first_pass[:, 0] # biomass concentration X
V = solution_first_pass[:, 4] # volume V
tf, Xbf, Vbf = None, None, None
for i in range(len(t)):
if S[i] < 0.5:
tf = t[i] # Time when S drops below 0.5
Xbf = X[i] # Biomass concentration X at that time
Vbf = V[i] # Volume V at that time
break # Exit the loop once we find tf
print(f"tf: {tf}, Xbf: {Xbf}, Vbf: {Vbf}")
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[5], line 1
----> 1 solution_first_pass = odeint(eColi_first_pass, y0, t)
3 # Extract S and find tf when S drops below 0.5
4 S = solution_first_pass[:, 1] #Substrate concentraion S
NameError: name 'odeint' is not defined
Second pass of the model#
Here the model will be defined including the Fed-batch dynamics
Define the model equations#
This cell defines the model equation in Fed-Batch mode.
The first step is to assign the variable to simulate to the result vector y.
The parameters characteristic for the strain are defined as well as physicochemical constants
The algebraic expressions that describe the kinetics of the system are defined
The feeding strategy is defined, both for the substrate and inductor. The expression used to calculate the substrate feed (F1) is dependednt on the substrate concentration. The expression used to calculate the inducer feed (Fi) is depedend on the biomass and inducer concentration.
The ODE system is defined
# define the function; Solve the model a second time, now including the feed eqs.
def eColi_second_pass(y, t, tf, Xbf, Vbf):
X = y[0]
S = y[1]
DOT = y[2]
A = y[3]
V = y[4]
I = y[5]
P = y[6]
# parameters describing the characteristics of the strain
qAmax = 1 # max Acetate consumption rate [g/g*h]
Kaq = 0.01 # affinity constant Acetate consumption [g/L]
Ksq = 0.1 # affinity constnat Substrate consumption [g/L]
Yam = 0.2 # yield acetate maintenance [g/g]
Yaresp = 0.2 # yield acetate respiratory [g/g]
Yem = 0.56 # yield excluding maintance [g/g]
qSmax = 1.4 # max glucose uptake rate [g/g*h]
Ks = 0.05 # affinity constant glucose consumption[g/L]
qm = 0.04 # specific maintenance coefficient[g/g*h]
Ko = 1 # Affinity constant, oxygen consumption [g/L]
Yosresp = 1.217 # yield from S to X, respiratory [g/g]
pAmax = 1 # max Acetate production rate [g/g*h]
Kap = 10 # affinity constant intracellular acetate production[g/L]
Yaof = 1 # aceate yield in overflow[g/g]
Yofm = Yem # [g/g]
k1=0.32
k2=0.00044
k3=0
Ki=0.55
# physicochemical constants
Cs = 0.391 # ratio of substrate per C in [gC/gS]
Cx = 0.488 # ratio of biomass per C in [gC/gX]
H = 14000 # conversion factor
DOTstar = 90
# algebraic variables
qS = qSmax * S / (S + Ks) * DOT / (DOT + Ko) # substrate uptake
qSof = pAmax * qS / ( qS + Kap ) / Yaof # overflow substrate conversion
pA = pAmax * qS / (qS + Kap) # production of acetate
qSox = qS - qSof # substrate uptake excluding overflow
qSan =(qSox - qm) * Yem * Cx / Cs # anabolic substrate consumption
qsA = qAmax * A / (A + Kaq) * (Ksq / (Ksq + qS)) # acetate consumption
qA = pA - qsA # total acetate equilibrium
qO = Yosresp * (qSox - qSan) + qsA * Yaresp # oxygen uptake
# growht rate equation
my =(qSox - qm) * Yem + qsA * Yam + qSof * Yofm
#Define the feeding eqautions
#no feed if t<tf
if t < tf:
F1 = 0
#We start with exponential feed
else:
if t <= (tf + 3):
F1 = (Xbf * Vbf * muset / (Yosresp * Si)) * np.exp(muset * (t - tf))
else:
#once S exceeds 0.03 we switch to a constant feeding strategy
if S >= 0.03:
F1 = 43.5
if F1 < 0:
F1 = 0
else:
F1 = (Xbf * Vbf * muset / (Yosresp * Si)) * np.exp(muset * (t - tf))
# Handle Fi dynamically based on X > 20; Start induction for rhGH production
if X < 20:
Fi = 0
else:
#We start with a flow rate of 20 to quicly increase IPTG concentration and start protein production
if I < 0.238:
Fi = 20
# Once the desired concentration of 0.238 (1mM) is reached, the flow rate is calculated such that this concentration remains constant
else:
Fi = 0.238 * F1 / (If - 0.238)
# differential equation system
dXdt = -F1 / V * X + my * X # biomass growth
dSdt = -F1 / V * S - qS * X + F1 / V * Si # substrate evolution
dDOTdt = KLa * (DOTstar - DOT) - qO * X * H # oxygen dynamics
dAdt = qA * X - F1 / V * A # aetate evolution
dVdt = (F1 + Fi - Fout) # volume change
dIdt = -F1/V*I + (Fi/V)*(If-I) #inducer dynamics
dPdt = -F1/V*P + (((k1*my*I)/(Ki+I))+k2)* X -k3 *P #protein dynamics
#the solution of odeint is formulated as a system of an ODE
dydt = [dXdt, dSdt, dDOTdt, dAdt, dVdt, dIdt, dPdt]
return dydt
Extract the solution of the ODE system#
# calling the numerical solver to approximate the integral of the differential equation system
y = odeint(eColi_second_pass, y0, t, args=(tf, Xbf, Vbf))
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[7], line 2
1 # calling the numerical solver to approximate the integral of the differential equation system
----> 2 y = odeint(eColi_second_pass, y0, t, args=(tf, Xbf, Vbf))
NameError: name 'odeint' is not defined
Generate the plots#
This cell is used to collect all the simulated variables in a dataframe, and add them to 4 different subplots.
def create_plot(t, y):
figure = make_subplots(rows=2, cols=2,specs=[[{},{"secondary_y": True}], [{},{}]]) #make figure with 2 subplots
#assign simulation results to variable for plotting
X = y[:, 0] # Extract data as 1D arrays
S = y[:, 1]
DOT = y[:, 2]
A = y[:, 3]
V = y[:, 4]
I = y[:, 5]
P = y[:, 6]
#collect all variables to plot in 1st subplot in a dataframe
df = pd.DataFrame({'t': t, 'X': X, 'S': S, 'DOT':DOT, 'Acet': A, 'V':V, 'Ind': I, 'P':P})
#add the different traces to 1st subplot
figure.add_trace(go.Scatter(x=df['t'], y=df['X'], name='Biomass'), row=1, col=1)
figure.add_trace(go.Scatter(x=df['t'], y=df['S'], name='Substrate'), row=1, col=1)
figure.add_trace(go.Scatter(x=df['t'], y=df['Acet'], name='Acetate'), row=1, col=2)
figure.add_trace(go.Scatter(x=df['t'], y=df['P'], name='protein'), row=1, col=2)
figure.add_trace(go.Scatter(x=df['t'], y=df['Ind'], name='inducer'), row=1, col=2, secondary_y=True)
figure.add_trace(go.Scatter(x=df['t'], y=df['DOT'], name='DOT'), row=2, col=1)
figure.update_yaxes(title_text="Concentration (g/L)", row=1, col=1) # Subplot (1, 1)
figure.update_yaxes(title_text="Protein and acetate (g/L)", row=1, col=2) # Subplot (1, 2)
figure.update_yaxes(title_text="Inducer (g/L)", row=1, col=2, secondary_y=True) # Subplot (1, 2), secondary y-axis
figure.update_yaxes(title_text="Volume (L)", row=2, col=2) # Subplot (2, 2)
figure.update_yaxes(title_text="DOT (%)", row=2, col=1) # Subplot (2, 1)
#add the title and axes labels
figure.update_layout(title=('Simulaton results'),
xaxis_title='time (h)')
#add trace to the 2nd subplot
figure.append_trace(go.Scatter(x=df['t'], y=df['V'], name='Volume'), row=2, col=2)
return figure
Create the figure#
fig = create_plot(t, y) # Call create_plot with the simulation results
fig.show()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[9], line 1
----> 1 fig = create_plot(t, y) # Call create_plot with the simulation results
3 fig.show()
NameError: name 't' is not defined
E. coli fed-batch simulation Created on Thu Dec 10 12:14:22 2015
@author: Terrance Wilms, Nicolas Cruz, Kevin Stegemann, Rosa Haßfurther Updated Sept 2024
Protein production dynamics added by Mariana Albino (October 2024)