# Introduction to modeling and coding in R

After introducing students to a simple mathematical model describing enzyme kinetics, I introduced them to coding and modeling in R.

Coding is traditionally done by first describing some coding ‘rules’, including ending the line with semicolons or how to define a variable. Personally I found these methods ineffective and frustrating. I took a few classes as an undergrad, and barely scraped by with a C in C++ 1 and a W in C++ 2.

In grad school I taught myself by copying working code from Q&A forums and playing around with it to understand how it worked. I took this approach and provided the students with the following code, separated into sections.
First the necessary packages were installed and loaded.
```install.packages(c("readxl","deSolve"))
library(deSolve)
```

Then the challenging part – importing data. This step is made challenging by students storing their data using different filenames, locations, and formatting thanks to operating systems and software. Because of this, I helped them individually and didn’t try to explain all the tiny irritating reasons why their code needed to be altered.

```# # # ## # # #
# Import Data
# # # ## # # #
my_data <- read_excel("experiment2_template.xlsx", sheet = "Data Entry",range="A2:Z9")
maxconc=max(my_data\$Mean)
averages=my_data\$Mean*(40/maxconc) #
std_dev=my_data\$`Std Dev`*(40/maxconc) #
t=my_data\$TIME

```

This also calculated maxconc, needed to create a ‘fudge’ parameter that will scale their data. Our response is in mM, but we don’t have the concentration of the enzyme and substrate they used in their experiments (simply potato extract). The fudge parameter allows us to ‘convert’ the unknown units of the experimental components into something like mM. I referred to it as a scaling parameter, since we are converting mL of solution into “mM”.

Then the parameters are defined:

```
# # # ## # # #
# Parameters
# # # ## # # #
r = .001 #
f=10 #
S_mL=5 #
E_mL=1 #
ce = E_mL*f #
se = S_mL*f #
tmax = 180 #
```

This model has two parameters – the enzymatic reaction rate r, and the fudge/scaling parameter f.

The equations are defined below, with the starting values yini defined first. We assume no product exists at the beginning of the reaction. However if delay occurred while they ran their experiments, they would have a significant amount of product formed initially. They could change that assumption by altering the starting product from zero to whatever they measured at time zero.

```# # # ## # # #
# Equations
# # # ## # # #
yini=c(S=se,P=0,E=ce) #
derivs = function(t,y,parms){
with(as.list(y),{
dS=-r*E*S #
dP=r*E*S #
dE=0 #
list(c(dS,dP,dE))
})
}

```

Then we solve the equations numerically. head(out) provides us with the first few lines of the solution. We can check if the amount of substrate, product, and enzyme are changing like we think they should be (ie, increasing or decreasing).

```
# # # ## # # #
# Calculate
# # # ## # # #
times = seq(from=0,to=tmax,by=15)
out = ode(y=yini,t=times,func=derivs,parms=c(r))

```

Finally, plot the simulation results against the data with error bars – if it fits well, you have a reaction rate estimate. If not, change r and re-run. Sometimes students had to modify the scaling parameter f as well – meaning our fudging wasn’t quite spot on, as we might expect.

```# # # ## # # #
# Plot
# # # ## # # #
par(bg = 'gray96')
plot(out[,"time"],out[,"P"]*1,type="l",lwd='3',col="bisque4",xlab="label (units)",ylab="label (units)",sub="Figure description",xlim=c(0,tmax),ylim=c(0,60))
lines(out[,"time"],out[,"S"]*1,type="l",lwd='3',col="rosybrown1")
lines(t,averages,type='p')
arrows(t, (averages+std_dev),t, (averages-std_dev), length=0.05, angle=90, code=3)
legend("right",legend=c("1","2"),col=c("bisque4","rosybrown1"),pch=c(1,1))
```

Wherever there was a free “#”, students were instructed to type what the content of that line meant. They had previously collected 6 sets of data in triplicate, which were plotted as means and standard deviations. They had to modify the reaction rate parameter, r, and the arbitrary scaling parameter f until the simulation looked like their data. These results were parameter estimation obtained via hand-fitting. The effect of the experimental conditions on the reaction rate was compared to their results obtained via fitting linear regressions to their data. Example figure with r and f modified to produce a fit.

By the end of the exercise, students had modified most lines of code (including initial product amount ‘yini’ and the figure axes and description). Code is available here.