Code
# create some data
<- x + rnorm(10)
y # fit a linear model
<- lm(y ~ x) model
inputObjects
outputObjects
Eliot McIntire
June 20, 2024
See Barebones R script for the code shown in this chapter
We will start by thinking about metadata: What is metadata?
Slightly modifying the example, we remove the line with x <- rnorm(10)
. This will make the code chunk not work because it needs the x
to run the next line. We can examine the following code chunk. First, we ask: what are the inputs and the outputs?
We use the terms expectsInput
and createsOutput
to describe the inputs and outputs in the metadata. This makes it clear that the metadata do not specify which they will go get; rather, it makes it clear that it doesn’t matter where the inputs are coming from. They could come from one of three sources: a user, another module, or defaults that the developer sets up. Likewise, the module specifies which outputs it creates, without specifying “for what other module”.
The inputs to this chunk are just one: the object x
. This code will not work (i.e., it will cause an error) if x
is not defined. We can say that this code chunk “expects” x
as an input.
The outputs are y
and model
. We can say that this code chunk “creates” y
and model
as outputs. However, we had said in Chapter 4 that we would only be interested in keeping model
. So, we will continue with only one output, model
.
Next, what are the package dependencies?. We call this reqdPkgs
in the SpaDES metadata. We see that there are three functions: rnorm
, lm
and plot
. We don’t know what packages they are in, so we can find out by typing them at the R prompt. At the bottom of the function, it says that the function rnorm
is in the stats
package. Fortunately for us, this is a default (“base”) package in R and it is always pre-loaded. So, nothing to do here.
So, our expectations, creations and dependencies are:
x
model
We will next put them into the correct places in the new SpaDES module.
Make the module again (see Chapter 4). This time we will add sim$
for the x
as we are now interested in the fact that it might be coming from outside this module.
Require::Require(c("reproducible", "SpaDES.core"), repos = c("https://predictiveecology.r-universe.dev", getOption("repos")))
# make a module
modulePath <- "~/SpaDES_book/NewModuleIntro/NewModule"
SpaDES.core::newModule(name = "My_linear_model", path = modulePath, open = FALSE,
events = list(
init = {
y <- sim$x + rnorm(10) # <--------- add sim$ here
# fit a linear model
sim$model <- lm(y ~ sim$x) # <--------- add sim$ here
}
))
newModule()
is not part of the “workflow”
Be aware that every time newModule()
is run with the same name
and path
arguments, it will overwrite your module folder/files.
As such, it is not meant to be part of a workflow. It is meant to provide the user with a tool to create module templates (once), which will then be changed as the module code develops.
Where is this module code? In the previous chapter, we didn’t look or care where the module code was.
newModule
actually creates a new folder, with the name
as provided by the argument, in the folder specified with path
. This folder has several files in it. See ?newModule
for details. For now, run the above and open the My_linear_model.R
script that it creates.
When we make a module, we get a message stating where the module code is. From here, open the file, e.g., by copy-pasting the file path (pick the .R
file NOT the .Rmd
file for now).
You can also set newModule(..., open = TRUE)
to have RStudio open the .R and .Rmd files automatically.
Opening a module file
From here onward, we will need to manually open the module code file.
Every SpaDES module is defined by having at least 1 file that is named with <modulePath>/<moduleName>/<moduleName>.R
. Even though we can do a lot with newModule()
, we will need to get used to opening, examining and changing the code in the module code file.
We will look at a few elements on the module R script in this chapter.
inputObjects
Scroll down to inputObjects
and expectsInputs()
. This is where we will put our inputs
that we noticed in our code chunk. We will declare x
as an “input” by putting it there, like this:
outputObjects
Next, scroll down to outputObjects
and createsOutput()
. We will declare model
as an “output” by putting it there. Don’t forget the comma at the end of each createsOutput()
as each is an argument to bindrows
(unless it is the last one).
Note that each input and output object gets a expectsInput
or createsOutput
entry, respectively.
Recall, we don’t have a value for x
, unlike in the previous chapter where we had x
defined in the init
event. This means that if you run the following, you will get an error.
Just like functions in R, we can supply default values for our module inputs. We put these in a function at the bottom called .inputObjects()
. See Chapter 6 for a model detailed explanation of module inputs and how to deal with them.
Copy this to the module, replacing the template .inputObjects()
function.
!suppliedElsewhere("x", sim)
will check if x
is in sim
and if not, will run the subsequent code lines (see ?suppliedElsewhere
).
After saving the module R script, we can run the module and inspect the output
You may have noticed that the init
event is now placed into a function called doEvent.My_linear_model.init
and it has an argument sim
. If you are familiar with making functions in R
, this is just a named argument "sim"
. This means that we can use sim
inside the function. We don’t know what sim
is yet, and we don’t know how to use it fully yet. But we do know that we have added sim$
to the init
event. We also see that there is a return(sim)
added at the bottom of the event function.
An event occurs in a special function that starts with doEvent.
. But, since it is “in a function”, each event has all the features of a function:
arguments – specifically sim
can be used
something returned – in this case, the simList
. return(sim)
must always be present - it is by default, but don’t delete it!
it can be run– but don’t worry about running it; SpaDES runs it when it is time.
The simList
The simList
is the data structure that is the foundation of SpaDES
. It can be used like a list; accessing objects can be done with $
or [["object_name"]]
, for example, and objects are manipulated as they would normally – e.g. if DF
is a data.frame
one would use sim$DF[1, 3]
to extract the value on the first row and third column.
In any function that has an argument sim
, we pass the simList
, so we can access it with sim$
from inside such a function.
See Chapter 7 for more details about the simList
.
To share objects between modules, we must assign them to the sim
and make sure that return(sim)
is at the end of the function.
Now we have a module that creates one object, model
and puts them inside sim
. This all happens in the event called init
.
Next: add the visualization module with metadata
We remake the second module from last chapter. But this time we will look at and update the metadata.
Here we start to see the “shared” objects. The module we just made above createsOutput
of model
. But this new visualization
module will expectInput
of model
. So, we can copy the same description if it is the same.
Now, we have inputs and outputs defined, our code has been places in 2 spots (events), and we have default value for x
.
We now have a SpaDES module that has metadata, generates random starting data (if the user doesn’t supply an alternative), fits a linear model, outputs that model, and plots the fit.
What are some things we “gained” from putting our simple 3 lines of code into a module?
.plotInitialTime = NA
in the simInitAndSpades
call.What are some things we “lost”?
What if we used an R package that wasn’t in the base packages list?
?defineModule
for all the metadata items. Specifically, see reqdPkgs
.What is the sim
? See ?'.simList-class'
Fill in the metadata from the Challenges you did in previous chapter.
Look at the other elements of the metadata and cross reference them with ?defineModule
Look at the Rmd
file of one of the modules that has been built (recall the message after you call newModule
), where you have filled in the metadata. Try to build it and look at the automatic tables that get built from the metadata.
Some common mistakes/bugs that module developers encounter:
Object doesn’t exist/is NULL. Errors like the ones below are usually the result of not providing default values for an input in the simList
(via .inputObjects()
), the user/another module not providing values for that object, or forgetting to assign an object to the simList
(i.e. sim$y <- <...>
) See Chapter 6.
Error in model.frame.default(formula = y ~ sim$x, drop.unused.levels = TRUE) :
invalid type (NULL) for variable 'sim$x'
Error in eval(predvars, data, env) : object 'y' not found
Parsing errors. These happen when simInit
has issues reading the module R script (or other associated scripts) and are usually easy to debug by reading the error message. For instance, the one below indicates that lines 43, 44, have a problem, which is a missing comma after the “)”. Error in parse(filename) : C:<modulePath>/My_linear_model/My_linear_model.R:44:3: unexpected symbol 43: ) 44: outputObjects ^
Environment- and simList
-related error. If the simList
(sim
) is not returned at the end of a function that takes it as an argument – e.g. an event function, the .inputObjects()
function… – it will either be lost or changed into something unexpected. Below are two examples of errors that may announce this problem.
Error in <...>
sim must be a simList
Error in as.environment(pos) : invalid 'pos' argument
Chapter 8 on Modules, Events and Functions
Chapter 6 on how to provide Module Inputs
Chapter 7 on the simList
?defineModule
describes all the metadata entries.
# create some data
y <- x + rnorm(10)
# fit a linear model
model <- lm(y ~ x)
Require::Require(c("reproducible", "SpaDES.core"), repos = c("https://predictiveecology.r-universe.dev", getOption("repos")))
# make a module
modulePath <- "~/SpaDES_book/NewModuleIntro/NewModule"
SpaDES.core::newModule(name = "My_linear_model", path = modulePath, open = FALSE,
events = list(
init = {
y <- sim$x + rnorm(10) # <--------- add sim$ here
# fit a linear model
sim$model <- lm(y ~ sim$x) # <--------- add sim$ here
}
))
out <- simInit(modules = "My_linear_model", paths = list(modulePath = modulePath))
out <- spades(out)
# or
out <- simInitAndSpades(modules = "My_linear_model", paths = list(modulePath = modulePath))
out$model
out <- simInitAndSpades(modules = "My_linear_model", paths = list(modulePath = modulePath))
out$model
newModule("visualize", path = modulePath, open = FALSE,
events = list(
init = {
plot(sim$model)
}
)
)
simInitAndSpades(modules = c("My_linear_model", "visualize"),
paths = list(modulePath = modulePath))