5 Module Files and Metadata
See Barebones R script for the code shown in this chapter
5.1 Continue example – a linear model
We will start by thinking about metadata: What is metadata?
Slightly modifying the example from Chapter 4, we remove the line with x <- rnorm(10). The code chunk no longer works on its own, because the next line needs x in order to run. We can now examine the remaining chunk (the first lines of the Barebones R script at the end of this chapter). First, we ask: what are the inputs and the outputs?
5.2 Input Expectations, Output Creations and Required Packages
5.2.1 Inputs and Outputs
We use the terms expectsInput and createsOutput to describe the inputs and outputs in the metadata. These terms make it clear that the metadata do not specify where the module will get its inputs; it does not matter where the inputs come from. They could come from one of three sources: a user, another module, or defaults that the developer sets up. Likewise, the module specifies which outputs it creates, without specifying “for what other module”.
The inputs to this chunk are just one: the object x. This code will not work (i.e., it will cause an error) if x is not defined. We can say that this code chunk “expects” x as an input.
The outputs are y and model. We can say that this code chunk “creates” y and model as outputs. However, we had said in Chapter 4 that we would only be interested in keeping model. So, we will continue with only one output, model.
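One way to see the “expects” and “creates” roles concretely is to run the chunk in a fresh environment, first without x and then with it. The sketch below is plain base R rather than SpaDES, and the names chunk, env and firstTry are illustrative only.

```r
# The chunk from this chapter, stored unevaluated so it can be run on demand
chunk <- quote({
  y <- x + rnorm(10)
  model <- lm(y ~ x)
})

# A fresh environment standing in for "the place the chunk runs" (illustrative)
env <- new.env()

# Without x, evaluation fails; capture the error message instead of stopping
firstTry <- tryCatch(eval(chunk, env), error = function(e) conditionMessage(e))
print(firstTry)

# Supply the expected input, then run the chunk again
env$x <- rnorm(10)
eval(chunk, env)

# x was expected; y and model were created
print(ls(env))
```

This assumes that no object called x already exists in the global environment; if one does, the first attempt would find it and succeed.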
5.2.2 Required Packages
Next, what are the package dependencies? We call this reqdPkgs in the SpaDES metadata. We see that there are three functions: rnorm, lm and plot. We don’t know what packages they are in, so we can find out by typing their names at the R prompt. At the bottom of the printed function, it says that rnorm is in the stats package. Fortunately for us, this is a default (“base”) package in R and it is always pre-loaded. So, nothing to do here.
Code
> rnorm
function (n, mean = 0, sd = 1) ...
<environment: namespace:stats>
So, our expectations, creations and dependencies are:
- Inputs: x
- Outputs: model
- Package dependencies: Base packages only
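The package lookup described above can also be done programmatically rather than by reading the printed function. This is a small base R sketch, not part of the SpaDES workflow; the variable names fns and pkgs are illustrative. The result for plot is printed but not relied on, since it may differ between R versions.

```r
# Look up the namespace in which each function is defined
fns <- c("rnorm", "lm", "plot")
pkgs <- vapply(
  fns,
  function(f) environmentName(environment(get(f, mode = "function"))),
  character(1)
)
print(pkgs)

# Packages that a standard R session attaches at startup
print(getOption("defaultPackages"))
```

If a function turned out to live in a package that is not attached by default, that package would be a candidate for the reqdPkgs metadata entry.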
We will next put them into the correct places in the new SpaDES module.
5.3 Module files
Make the module again (see Chapter 4). This time we will add sim$ for the x as we are now interested in the fact that it might be coming from outside this module.
Code
repos <- c("https://predictiveecology.r-universe.dev", getOption("repos"))
options(repos = repos)
if (!require("pak")) install.packages("pak")
pak::pak(c("Require", "SpaDES.project"), ask = FALSE)
library(SpaDES.project)
out <- setupProject(
options = list(repos = repos),
paths = list(projectPath = "~/SpaDES_book/NewModuleIntro")
)
Code
library(SpaDES.core)
# make a module
modulePath <- "~/SpaDES_book/NewModuleIntro/NewModule"
SpaDES.core::newModule(name = "My_linear_model", path = modulePath, open = FALSE,
events = list(
init = {
y <- sim$x + rnorm(10) # <--------- add sim$ here
# fit a linear model
sim$model <- lm(y ~ sim$x) # <--------- add sim$ here
}
))
newModule() is not part of the “workflow”
Be aware that every time newModule() is run with the same name and path arguments, it will overwrite your module folder/files.
As such, it is not meant to be part of a workflow. It is meant to provide the user with a tool to create module templates (once), which will then be changed as the module code develops.
Where is this module code? In the previous chapter, we didn’t look or care where the module code was.
newModule actually creates a new folder, with the name as provided by the argument, in the folder specified with path. This folder has several files in it. See ?newModule for details. For now, run the above and open the My_linear_model.R script that it creates.
When we make a module, we get a message stating where the module code is. From here, open the file, e.g., by copy-pasting the file path (pick the .R file NOT the .Rmd file for now).
You can also set newModule(..., open = TRUE) to have RStudio open the .R and .Rmd files automatically.
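If the message with the file path has scrolled out of view, the module folder can be inspected directly. The sketch below assumes the modulePath and module name used earlier in this chapter, and only lists files if the folder already exists; moduleDir, moduleFiles and mainScript are illustrative names.

```r
# Path and module name as used in this chapter
modulePath <- "~/SpaDES_book/NewModuleIntro/NewModule"
moduleDir <- file.path(modulePath, "My_linear_model")

# List the files that newModule() created, if the module has been made
moduleFiles <- if (dir.exists(moduleDir)) {
  list.files(moduleDir, recursive = TRUE)
} else {
  character(0)
}
print(moduleFiles)

# The main module script is the .R file named after the module
mainScript <- file.path(moduleDir, "My_linear_model.R")
print(file.exists(mainScript))
```

An empty result simply means that the module has not been created at that path yet.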
We will look at a few elements on the module R script in this chapter.
5.4 inputObjects
Scroll down to inputObjects and expectsInput(). This is where we will put the inputs that we identified in our code chunk. We will declare x as an “input” by putting it there, like this:
Code
inputObjects = bindrows(
expectsInput(objectName = "x", objectClass = "numeric",
desc = "The inputs for the linear model", sourceURL = NA)
)
5.5 outputObjects
Next, scroll down to outputObjects and createsOutput(). We will declare model as an “output” by putting it there. Don’t forget the comma at the end of each createsOutput() as each is an argument to bindrows (unless it is the last one).
Code
outputObjects = bindrows(
createsOutput(objectName = "model", objectClass = "lm",
desc = "A linear model object from the equation (y ~ x)")
)
Note that each input and output object gets an expectsInput or createsOutput entry, respectively.
5.6 Default Values
Recall, we don’t have a value for x, unlike in the previous chapter where we had x defined in the init event. This means that if you run the following, you will get an error.
Code
out <- simInit(modules = "My_linear_model", paths = list(modulePath = modulePath))
out <- spades(out)
# or
out <- simInitAndSpades(modules = "My_linear_model", paths = list(modulePath = modulePath))
out$model
Just like functions in R, we can supply default values for our module inputs. We put these in a function at the bottom of the module script called .inputObjects(). See Chapter 6 for a more detailed explanation of module inputs and how to deal with them.
Copy this to the module, replacing the template .inputObjects() function.
Code
.inputObjects <- function(sim) {
if (!suppliedElsewhere("x", sim))
sim$x <- rnorm(10, mean = 20, sd = 2)
return(invisible(sim))
}
!suppliedElsewhere("x", sim) will check if x is in sim and if not, will run the subsequent code lines (see ?suppliedElsewhere).
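The “use a default only if nothing was supplied” idea can be illustrated without SpaDES. In the sketch below a plain environment stands in for the simList, and a simple exists() check stands in for suppliedElsewhere(); this is an analogy under those assumptions, not how suppliedElsewhere() is implemented. The function name fillDefaults is hypothetical.

```r
# A plain environment stands in for the simList (an analogy only)
fillDefaults <- function(sim) {
  # Create x only when nobody has put one in sim already
  if (!exists("x", envir = sim, inherits = FALSE)) {
    sim$x <- rnorm(10, mean = 20, sd = 2)
  }
  invisible(sim)  # always hand sim back to the caller
}

# Case 1: nothing supplied, so the default is created
simA <- fillDefaults(new.env())
print(length(simA$x))

# Case 2: the "user" supplied x, so the default is skipped
userSim <- new.env()
userSim$x <- 1:10
simB <- fillDefaults(userSim)
print(simB$x)
```

In SpaDES itself, a user would typically supply x through the objects argument of simInit(), in which case the default in .inputObjects() is not used.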
After saving the module R script, we can run the module and inspect the output:
Code
out <- simInitAndSpades(modules = "My_linear_model", paths = list(modulePath = modulePath))
out$model
5.8 Adding a new module: visualization module
We remake the second module from last chapter. But this time we will look at and update the metadata.
5.8.1 Outputs of one module are Inputs of another
Here we start to see the “shared” objects. The module we just made above has a createsOutput entry for model, whereas this new visualization module has an expectsInput entry for model. Because it is the same object, we can copy the same description.
Code
inputObjects = bindrows(
expectsInput(objectName = "model", objectClass = "lm",
desc = "A linear model object from the equation (y ~ x)", sourceURL = NA)
)
5.9 Run the new module
Now, we have inputs and outputs defined, our code has been placed in two spots (events), and we have a default value for x.
Code
simInitAndSpades(modules = c("My_linear_model", "visualize"),
paths = list(modulePath = modulePath))
We now have a SpaDES module that has metadata, generates random starting data (if the user doesn’t supply an alternative), fits a linear model, outputs that model, and plots the fit.
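The way the two modules cooperate through a shared object can be mimicked in base R. In this sketch a plain environment plays the role of the simList and two ordinary functions play the role of the init events; linearModelInit and visualizeInit are hypothetical names, and nothing here uses the SpaDES scheduler.

```r
# Shared container, standing in for the simList (illustrative only)
sim <- new.env()
sim$x <- rnorm(10, mean = 20, sd = 2)

# "Module 1": expects x, creates model
linearModelInit <- function(sim) {
  y <- sim$x + rnorm(length(sim$x))
  sim$model <- lm(y ~ sim$x)
  invisible(sim)
}

# "Module 2": expects model, creates nothing
visualizeInit <- function(sim) {
  stopifnot(inherits(sim$model, "lm"))  # fail early if the input is missing
  plot(sim$model, which = 1)            # residuals vs fitted values
  invisible(sim)
}

# Send graphics to a null device so the sketch also runs non-interactively
grDevices::pdf(NULL)
sim <- visualizeInit(linearModelInit(sim))
invisible(grDevices::dev.off())

print(class(sim$model))
```

Neither function names the other: the second only relies on model being present in the shared container, which is the same decoupling that the expectsInput and createsOutput metadata describe.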
5.10 Questions
- What are some things we “gained” from putting our simple 3 lines of code into a module?
  - We can turn off plotting easily. Set .plotInitialTime = NA in the simInitAndSpades call.
- What are some things we “lost”?
  - More complicated. (overkill for these 3 lines?)
- What if we used an R package that wasn’t in the base packages list?
  - See ?defineModule for all the metadata items. Specifically, see reqdPkgs.
- What is the sim? See ?'.simList-class'
5.11 Try on your own
- Fill in the metadata from the Challenges you did in the previous chapter.
- Look at the other elements of the metadata and cross reference them with ?defineModule
- Look at the Rmd file of one of the modules that has been built (recall the message after you call newModule), where you have filled in the metadata. Try to build it and look at the automatic tables that get built from the metadata.
5.12 Common mistakes
Some common mistakes/bugs that module developers encounter:
- Object doesn’t exist/is NULL. Errors like the ones below are usually the result of not providing default values for an input in the simList (via .inputObjects()), the user/another module not providing values for that object, or forgetting to assign an object to the simList (i.e. sim$y <- <...>). See Chapter 6.
Error in model.frame.default(formula = y ~ sim$x, drop.unused.levels = TRUE) :
invalid type (NULL) for variable 'sim$x'
Error in eval(predvars, data, env) : object 'y' not found
- Parsing errors. These happen when simInit has issues reading the module R script (or other associated scripts) and are usually easy to debug by reading the error message. For instance, the one below indicates that lines 43 and 44 have a problem, which is a missing comma after the “)”.
Error in parse(filename) :
C:<modulePath>/My_linear_model/My_linear_model.R:44:3: unexpected symbol
43: )
44: outputObjects
^
- Environment- and simList-related error. If the simList (sim) is not returned at the end of a function that takes it as an argument (e.g. an event function, the .inputObjects() function…), it will either be lost or changed into something unexpected. Below are two examples of errors that may announce this problem.
Error in <...>
sim must be a simList
Error in as.environment(pos) : invalid 'pos' argument
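The three kinds of mistake listed above can be reproduced in miniature with base R, which can help in recognising them later. In this sketch a plain environment stands in for the simList, and badEvent and goodEvent are hypothetical functions; the exact error messages that SpaDES produces will differ.

```r
# A plain environment standing in for the simList (illustrative only)
sim <- new.env()
y <- rnorm(10)

# 1. Input never supplied: sim$x is NULL, so lm() fails
nullInputMsg <- tryCatch(lm(y ~ sim$x), error = function(e) conditionMessage(e))
print(nullInputMsg)

# 2. Parsing error: a comma is missing between two arguments
brokenCode <- "list(\n  a = c(1, 2)\n  b = 3\n)"
parseMsg <- tryCatch(parse(text = brokenCode), error = function(e) conditionMessage(e))
print(parseMsg)

# 3. Forgetting to return sim: the caller receives something else
badEvent <- function(sim) {
  sim$model <- "fitted"
  "done"                # last value is a string, not sim
}
goodEvent <- function(sim) {
  sim$model <- "fitted"
  invisible(sim)        # sim is handed back to the caller
}
print(is.environment(badEvent(sim)))
print(is.environment(goodEvent(sim)))
```

Running parse() on a module script before calling simInit is one quick way to locate a missing comma or bracket.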
5.13 See also
Chapter 8 on Modules, Events and Functions
Chapter 6 on how to provide Module Inputs
Chapter 7 on the simList
?defineModule describes all the metadata entries.
5.14 Barebones R script
Code
# create some data
y <- x + rnorm(10)
# fit a linear model
model <- lm(y ~ x)
repos <- c("https://predictiveecology.r-universe.dev", getOption("repos"))
options(repos = repos)
if (!require("pak")) install.packages("pak")
pak::pak(c("Require", "SpaDES.project"), ask = FALSE)
library(SpaDES.project)
out <- setupProject(
options = list(repos = repos),
paths = list(projectPath = "~/SpaDES_book/NewModuleIntro")
)
library(SpaDES.core)
# make a module
modulePath <- "~/SpaDES_book/NewModuleIntro/NewModule"
SpaDES.core::newModule(name = "My_linear_model", path = modulePath, open = FALSE,
events = list(
init = {
y <- sim$x + rnorm(10) # <--------- add sim$ here
# fit a linear model
sim$model <- lm(y ~ sim$x) # <--------- add sim$ here
}
))
out <- simInit(modules = "My_linear_model", paths = list(modulePath = modulePath))
out <- spades(out)
# or
out <- simInitAndSpades(modules = "My_linear_model", paths = list(modulePath = modulePath))
out$model
out <- simInitAndSpades(modules = "My_linear_model", paths = list(modulePath = modulePath))
out$model
newModule("visualize", path = modulePath, open = FALSE,
events = list(
init = {
plot(sim$model)
}
)
)
simInitAndSpades(modules = c("My_linear_model", "visualize"),
paths = list(modulePath = modulePath))