9  Scheduling Events

Author

Eliot McIntire

Published

November 20, 2024

See Barebones R script for the code shown in this chapter

Here, we will talk about scheduling events in a SpaDES module using a function called SpaDES.core ::scheduleEvents(). Please see Chapter 4 for how to create a simple module and Chapter 8 for an overview of what events are before attempting this section.

Repeated? When?

When starting to build a module or an event, we ask 2 key questions:

1. Do I need to repeat this operation?

2. If yes, how often and/or when?

Once we answer these, we can use the function scheduleEvents to set them up in whatever sequence we want.

Scheduling events allows us to control when things happen in a sequence. This can be as simple as “this module will just do one thing” to “this module will do several things, some of which repeat”.

Below, we will use a simple example where there is one thing that happens once and one thing that happens several times. In “normal” R-speak, we will write a loop.

We return to the example in Chapter 4, but this time, we will add a predict step from the linear model, with updated values of the y. In this simple example, we will pretend that each prediction is “one year into the future” and that our y will slowly get bigger.

Code
# .inputObjects  #####
x <- rnorm(10)

# Init #####
y <- x + rnorm(10)
# fit a linear model
out <- lm(y ~ x)
pred <- list()

# Prediction #####
startYear <- 2023
years <- startYear:(startYear + 10) + 1
for (year in years - startYear) {
  pred[[year]] <- predict(out, newdata = data.frame(x = rnorm(10) + year))
}

# Plot #####
predictions <- data.frame(year = rep(years, each = 10), prediction = unlist(pred))
# plot the predictions
ggplot(predictions, aes(x = year, y = prediction, colour = year)) + 
  geom_point() + geom_smooth()

9.1 Convert to a module

In the above code, there are 4 sections. These will each be copied into the SpaDES module below, in their respective sections. We have added 1 new section, called “Prediction”.

Code
SpaDES.core::newModule(name = "scheduling", path = "~/SpaDES_book/scheduling")

Open the module file, which will be printed in the console. Scroll down to the Init function and copy the Init, the .inputObjects(), and the Plot parts in the code above into the new module, as we did in Chapter 4.

We must make the same decisions about “local objects” (objects that are OK not to be kept for later) and about “global objects” (those that we want to keep). These global objects get put into sim. Your new module should have these 3 new sections/functions:

Code
Init <- function(sim) {
  y <- sim$x + rnorm(10)
  # fit a linear model
  sim$out <- lm(y ~ sim$x)
  sim$pred <- list()
  
  return(invisible(sim))
}

plotFun <- function(sim) {
  sim$predictions <- data.frame(year = rep(sim$years, each = 10), prediction = unlist(sim$pred))
  # plot the predictions
  sim$gg <- ggplot(sim$predictions, aes(x = year, y = prediction, colour = year)) + geom_point() + geom_smooth()
  print(sim$gg)
  
  return(invisible(sim))
}

.inputObjects <- function(sim) {
  sim$x <- rnorm(10)
  
  return(invisible(sim))
}

9.1.1 Update module metadata

Like in Chapter 5, we also have to update the module metadata… what the inputs, outputs, parameters are. SpaDES will not error if the metadata are not correct, but it will try to point out the problems.

Here’s an example:

Code
createsOutput(objectName = "predictions", objectClass = "data.frame", 
              desc = "Table of predictions and years")

Make sure the inputs (x) and all outputs (predictions, gg, pred and out) are all declared.

9.2 New Event Function – Prediction

The new part is the for loop. This part takes new covariates (new values for y), emulating “predicted” covariates, and derives a newly predicted x, which we then keep as the predicted x values each year.

We’ll use scheduleEvent() below to do this iteration. For now, copy the Prediction part into a new function:

Code
Prediction <- function(sim) {
  startYear <- 2023
  sim$years <- startYear:(startYear + 10) + 1
  for (year in sim$years - startYear) {
    sim$pred[[year]] <- predict(sim$out, newdata = data.frame(x = rnorm(10) + year))
  }
  return(invisible(sim))
}

Note the new output years, which should also be added to the metadata.

9.3 Scheduling

Now we need to tell SpaDES when to execute and when to iterate each of the steps or “events”. The init we do once, at the start. The .inputObjects always only happens once and “before the start” (init) – it is scheduled automatically so nothing to do here. The last part (i.e., the prediction event) must happen 10 times, and it must start after init and before the plot. The plot we do once.

9.3.1 doEvent.*() and scheduleEvent()

There is a special function defined in a SpaDES module called doEvent.*() (it usually is appended with the module name: doEvent.<module_name>()). SpaDES calls this function several times and executes it’s code depending on the values of the eventTime and eventType arguments.

By default the only mandatory event executed by spades is the init1, unless we specifically tell it to do more, using the SpaDES.core::scheduleEvent() function and declaring more values for the event argument (i.e. more events).

Order of event code (not to confuse with event execution)

Events do not have to be defined in any particular order in the doEvent.<module_name>() function, as their order of execution is solely determined by scheduling (scheduleEvent()) or, for some events, imposed by simInit() and spades() (e.g., .inputObjects() and init are always executed first).

Similarly, the order in which functions used by events appear in the module R script is arbitrary.

scheduleEvent() is the “scheduler” function. Using the arguments eventTime and eventType, we define when and what event to schedule, respectively. eventPriority can be specified to determine which event should occur first, when two are scheduled for the same time2. Finally, moduleName and sim always have to be passed.

We place scheduleEvent() anywhere we want inside each event’s code in doEvent.*(). In this example, we will place it in a few places:

9.3.1.1 In the init event

We look at the init event (code inside event = "init") and we see 2 scheduleEvent calls that come with the template. We will use those to learn. In each “event”, we often see one part that “does something” and another part that “schedules something”.

In the default template, we “do” the Init function and we schedule plot and save. We can delete the save event, as we won’t do that here. But we will replace it with “schedule prediction”.

See ?start, ?scheduleEvent, ?P

Code
sim <- Init(sim)

# schedule future event(s)
sim <- scheduleEvent(sim, P(sim)$.plotInitialTime, "scheduling", "plot")

# here we create a new event type == "predictions" and we schedule it for `start(sim)`
sim <- scheduleEvent(sim, start(sim), "scheduling", "predictions") 

Copy the above code chunk to the init section so that it looks like this:

Code
switch(
  eventType,
  init = {
    sim <- Init(sim)
    
    # schedule future event(s)
    sim <- scheduleEvent(sim, P(sim)$.plotInitialTime, "scheduling", "plot")
    
    # here we create a new event type == "predictions" and we schedule it for `start(sim)`
    sim <- scheduleEvent(sim, start(sim), "scheduling", "predictions") 
  },
  ...
)
Match scheduleEvent() arguments to event names

The name of an event is defined as argument values for doEvent.<module_name>(..., eventType):

Code
doEvent.scheduling = function(sim, eventTime, eventType) {
  switch(
    eventType,
    init = {    # <----- "init" is the eventType/name
      <...>
    },
    predictions = {  # <----- "predictions" is another eventType/name
      <...>
    },
    warning(noEventWarning(sim))
  )
  return(invisible(sim))
}

It must match what is provided to scheduleEvent(..., eventType).

9.3.2 The new events

We will add code to make two new events, called "predictions" and "plot" (the name must match what we wrote in the scheduleEvent() above). In these events, we “do” the prediction, schedule the next prediction at “time plus 1”, and plot.

See ?SpaDES.core::times().

Time in SpaDES

It is useful to understand time in SpaDES. Time has units (e.g., “year”) which is defined by the modules metadata timeunit. So, if time(sim) is 1 or 2024, that would be in year units, if the module is in year. To know what “time” it is in a module, the easiest function is time(sim). That will give you the “time”.

Time, in SpaDES, is a way of ordering events. If one event happens at time 2024, and another event happens at 2025, then the 2024 event will happen first.

The other two useful functions are start(sim) and end(sim), which will give you the starting time of the spades call and ending time respectfully. These would have been specified in simInit.

Code
switch(
  eventType,
  init = {...},
  predictions = {
    # do the Prediction
    sim <- Prediction(sim)
    
    # schedule the next Prediction
    sim <- scheduleEvent(sim, time(sim) + 1, "scheduling", "predictions") 
  },
  plot = {
    # do the plot
    sim <- plotFun(sim)
  },
  ...
)

Because the "predictions" event schedules itself we will have to adapt the Prediction() function, as the for-loop is no longer necessary:

Code
Prediction <- function(sim) {
  
  # need "as.character" because start(sim) is 0, so time(sim) is 0 at the start:
  #  R cannot index a list with a 0 (zero)
  sim$pred[[as.character(time(sim))]] <- predict(sim$out, newdata = data.frame(x = rnorm(10) + time(sim)))
  
  return(invisible(sim))
}
Top-down vs Bottom-up workflow sequencing

In a “top-down” workflow, we could imagine something like:

Code
scheduleEvent(sim, time(sim) + 1, "predictions")
scheduleEvent(sim, time(sim) + 2, "predictions")
scheduleEvent(sim, time(sim) + 3, "predictions")

This would schedule 3 predictions, during the next three “time steps”. This may be very clear, but it would be difficult to do this for dozens of years (could be done with a for-loop).

A “bottom up” approach is to schedule the predictions event once during init, then each time that predictions event occurs, reschedule it:

Code
init = {
  ...
  scheduleEvent(sim, start(sim), "predictions")   # schedule first occurrence of "predictions"
},
predictions = {
  ...
  scheduleEvent(sim, time(sim) + 1, "predictions")  # re-schedule "predictions"
}

Each time the predictions event happens, it will reschedule itself forever! Very simple. The workflow sequence emerges from the self-scheduling.

9.3.3 Double check existing events

If we ran the above changes, we would see an error because plotFun was scheduled to run at the time that is specified by .plotInitialTime, which by default is start(sim). We have no data yet at that time. So, we will choose to schedule plotFun to run at end(sim):

Code
scheduleEvent(sim, end(sim), "scheduling", "plot")
Events and event functions – not to be confused

The name of an event and of the function that an event calls (the “event function”) need not (and should not, to avoid confusion) be the same.

In the examples above the predictions event called the Prediction() event function and the plot event called the plotFun() function.

Conventionally, event names have been all lowercase; but this highly depends on developer coding style and the module.

9.4 Run

The resulting module can now be run.

Code
SpaDES.core::simInitAndSpades(modules = "scheduling", paths = list(modulePath = "~/SpaDES_book/scheduling"))

A complete version of this new module can be found at scheduling.R. You can copy this over top of the new module that you created here, i.e., keep the same folder structure.

9.5 Questions

  1. how are some ways that scheduleEvent is different from a “loop”?
  • We can change the “step” length between events.

  • We can interweave events, e.g., a fire event then a tree growth event

  1. When is a loop better than scheduling?
  • Loops are much faster than scheduling.

  • But really, avoid loops in R if you can help it

  1. Chances are that the first versions of this module didn’t work. How do you fix it?

9.6 Try on your own

  • Try creating a new parameter that controls the frequency of predictions.

Here’s a clue:

Code
predictions = {
scheduleEvent(sim, time(sim) + P(sim)$predFreq, "predictions")
}    

9.7 Common mistakes

  • Undefined event type. This is a warning, rather than an error, that is issued when an event is being scheduled with scheduleEvent(), but has not been defined in doEvent.<module_name>(). Check spelling and make sure all scheduled events appear in doEvent.<module_name>().

  • Event not being scheduled/executed. If an event doesn’t appear printed in spades() messages during workflow execution or in the event queues shown by events() or completed(), double check:

  • whether the first occurrence of the event is being scheduled by the init event;

  • if the chosen time for the first occurrence is correct;

  • whether the event is rescheduling itself appropriately;

  • whether the name of the event and module are spelled correctly in scheduleEvent();

  • whether the event has been defined in doEvent.<module_name>().

9.8 See also

Chapter 8 on Modules, Events and Functions

?scheduleEvent

?P

?time

?start

?end

?switch

9.9 More learning on this subject

My First Module

Creating a seed dispersal module

9.10 Barebones R script

Code
# .inputObjects  #####
x <- rnorm(10)

# Init #####
y <- x + rnorm(10)
# fit a linear model
out <- lm(y ~ x)
pred <- list()

# Prediction #####
startYear <- 2023
years <- startYear:(startYear + 10) + 1
for (year in years - startYear) {
  pred[[year]] <- predict(out, newdata = data.frame(x = rnorm(10) + year))
}

# Plot #####
predictions <- data.frame(year = rep(years, each = 10), prediction = unlist(pred))
# plot the predictions
ggplot(predictions, aes(x = year, y = prediction, colour = year)) + 
  geom_point() + geom_smooth()

SpaDES.core::newModule(name = "scheduling", path = "~/SpaDES_book/scheduling")























SpaDES.core::simInitAndSpades(modules = "scheduling", paths = list(modulePath = "~/SpaDES_book/scheduling"))

predictions = {
scheduleEvent(sim, time(sim) + P(sim)$predFreq, "predictions")
}    





defineModule(sim, list(
  name = "scheduling",
  description = "",
  keywords = "",
  authors = structure(list(list(given = c("First", "Middle"), family = "Last", role = c("aut", "cre"), email = "email@example.com", comment = NULL)), class = "person"),
  childModules = character(0),
  version = list(scheduling = "0.0.0.9000"),
  timeframe = as.POSIXlt(c(NA, NA)),
  timeunit = "year",
  citation = list("citation.bib"),
  documentation = list("NEWS.md", "README.md", "scheduling.Rmd"),
  reqdPkgs = list("SpaDES.core (>= 2.1.5.9003)", "ggplot2"),
  parameters = bindrows(
    #defineParameter("paramName", "paramClass", value, min, max, "parameter description"),
    defineParameter("length", "integer", 10L, 5L, 100L,
                    "Number of values to generate"),
    defineParameter(".plots", "character", "screen", NA, NA,
                    "Used by Plots function, which can be optionally used here"),
    defineParameter(".plotInitialTime", "numeric", start(sim), NA, NA,
                    "Describes the simulation time at which the first plot event should occur."),
    defineParameter(".plotInterval", "numeric", NA, NA, NA,
                    "Describes the simulation time interval between plot events."),
    defineParameter(".saveInitialTime", "numeric", NA, NA, NA,
                    "Describes the simulation time at which the first save event should occur."),
    defineParameter(".saveInterval", "numeric", NA, NA, NA,
                    "This describes the simulation time interval between save events."),
    defineParameter(".studyAreaName", "character", NA, NA, NA,
                    "Human-readable name for the study area used - e.g., a hash of the study",
                          "area obtained using `reproducible::studyAreaName()`"),
    ## .seed is optional: `list('init' = 123)` will `set.seed(123)` for the `init` event only.
    defineParameter(".seed", "list", list(), NA, NA,
                    "Named list of seeds to use for each event (names)."),
    defineParameter(".useCache", "logical", FALSE, NA, NA,
                    "Should caching of events or module be used?")
  ),
  inputObjects = bindrows(
    expectsInput(objectName = "x", objectClass = "numeric", 
                 desc = "A vector of values for x in your linear model", sourceURL = NA)
  ),
  outputObjects = bindrows(
    createsOutput(objectName = "predictions", objectClass = "data.frame", 
                  desc = "Table of predictions and years")
  )
))

doEvent.scheduling = function(sim, eventTime, eventType) {
  switch(
    eventType,
    init = {
      sim <- Init(sim)
      scheduleEvent(sim, start(sim), "predictions")   # schedule first occurrence of "predictions"
    },
    predictions = {
      sim <- Prediction(sim)
      scheduleEvent(sim, time(sim) + 1, "predictions")  # re-schedule "predictions"
    }
    warning(noEventWarning(sim))
  )
  return(invisible(sim))
}

Prediction <- function(sim) {
  
  sim$pred[[as.character(time(sim))]] <- predict(sim$out, newdata = data.frame(x = rnorm(10) + time(sim)))
  
  return(invisible(sim))
}

Init <- function(sim) {
  y <- sim$x + rnorm(10)
  # fit a linear model
  sim$out <- lm(y ~ sim$x)
  sim$pred <- list()
  
  return(invisible(sim))
}

plotFun <- function(sim) {
  sim$predictions <- data.frame(year = rep(sim$years, each = 10), prediction = unlist(sim$pred))
  # plot the predictions
  sim$gg <- ggplot(sim$predictions, aes(x = year, y = prediction, colour = year)) + geom_point() + geom_smooth()
  print(sim$gg)
  
  return(invisible(sim))
}

.inputObjects <- function(sim) {
  sim$x <- rnorm(10)
  
  return(invisible(sim))
}

9.11 Suggested solution all together

This might be a possible solution for your module:

Code
defineModule(sim, list(
  name = "scheduling",
  description = "",
  keywords = "",
  authors = structure(list(list(given = c("First", "Middle"), family = "Last", role = c("aut", "cre"), email = "email@example.com", comment = NULL)), class = "person"),
  childModules = character(0),
  version = list(scheduling = "0.0.0.9000"),
  timeframe = as.POSIXlt(c(NA, NA)),
  timeunit = "year",
  citation = list("citation.bib"),
  documentation = list("NEWS.md", "README.md", "scheduling.Rmd"),
  reqdPkgs = list("SpaDES.core (>= 2.1.5.9003)", "ggplot2"),
  parameters = bindrows(
    #defineParameter("paramName", "paramClass", value, min, max, "parameter description"),
    defineParameter("length", "integer", 10L, 5L, 100L,
                    "Number of values to generate"),
    defineParameter(".plots", "character", "screen", NA, NA,
                    "Used by Plots function, which can be optionally used here"),
    defineParameter(".plotInitialTime", "numeric", start(sim), NA, NA,
                    "Describes the simulation time at which the first plot event should occur."),
    defineParameter(".plotInterval", "numeric", NA, NA, NA,
                    "Describes the simulation time interval between plot events."),
    defineParameter(".saveInitialTime", "numeric", NA, NA, NA,
                    "Describes the simulation time at which the first save event should occur."),
    defineParameter(".saveInterval", "numeric", NA, NA, NA,
                    "This describes the simulation time interval between save events."),
    defineParameter(".studyAreaName", "character", NA, NA, NA,
                    "Human-readable name for the study area used - e.g., a hash of the study",
                          "area obtained using `reproducible::studyAreaName()`"),
    ## .seed is optional: `list('init' = 123)` will `set.seed(123)` for the `init` event only.
    defineParameter(".seed", "list", list(), NA, NA,
                    "Named list of seeds to use for each event (names)."),
    defineParameter(".useCache", "logical", FALSE, NA, NA,
                    "Should caching of events or module be used?")
  ),
  inputObjects = bindrows(
    expectsInput(objectName = "x", objectClass = "numeric", 
                 desc = "A vector of values for x in your linear model", sourceURL = NA)
  ),
  outputObjects = bindrows(
    createsOutput(objectName = "predictions", objectClass = "data.frame", 
                  desc = "Table of predictions and years")
  )
))

doEvent.scheduling = function(sim, eventTime, eventType) {
  switch(
    eventType,
    init = {
      sim <- Init(sim)
      scheduleEvent(sim, start(sim), "predictions")   # schedule first occurrence of "predictions"
    },
    predictions = {
      sim <- Prediction(sim)
      scheduleEvent(sim, time(sim) + 1, "predictions")  # re-schedule "predictions"
    }
    warning(noEventWarning(sim))
  )
  return(invisible(sim))
}

Prediction <- function(sim) {
  
  sim$pred[[as.character(time(sim))]] <- predict(sim$out, newdata = data.frame(x = rnorm(10) + time(sim)))
  
  return(invisible(sim))
}

Init <- function(sim) {
  y <- sim$x + rnorm(10)
  # fit a linear model
  sim$out <- lm(y ~ sim$x)
  sim$pred <- list()
  
  return(invisible(sim))
}

plotFun <- function(sim) {
  sim$predictions <- data.frame(year = rep(sim$years, each = 10), prediction = unlist(sim$pred))
  # plot the predictions
  sim$gg <- ggplot(sim$predictions, aes(x = year, y = prediction, colour = year)) + geom_point() + geom_smooth()
  print(sim$gg)
  
  return(invisible(sim))
}

.inputObjects <- function(sim) {
  sim$x <- rnorm(10)
  
  return(invisible(sim))
}

  1. although others appear in the template .R generated by newModule() .↩︎

  2. by default, all events get equal relative priority, and the one scheduled first will happen first.↩︎