23  AI-assisted module development

Author

Eliot McIntire

Published

June 15, 2026

See the Glossary for definitions of recurring terms (e.g. SpaDES module, SpaDES event, interoperable model, modular workflow).

23.1 AI as an accelerator for a reusable module ecosystem

The hardest part of growing a SpaDES ecosystem isn’t writing any one module – it’s keeping the interfaces consistent across many of them so they actually compose. A module that calls a covariate pixelGroupMap plugs into the same pipeline as Biomass_core; one that calls it pgMap does not, even if the data are identical. The metadata (expectsInput, createsOutput, parameter names, timeunit) is what turns a clever script into a piece of shared infrastructure.

That metadata is largely mechanical to write – and exactly the kind of work an AI assistant handles well. Given a linear R script and a few existing modules to mimic, a current LLM can translate the script into a defineModule + doEvent + scheduling skeleton in seconds, matched to the conventions used by LandR, scfm, RSFpredict, and the rest. The role of the scientist shifts from typing the boilerplate to checking that the boilerplate is right – which inputs are needed, which outputs are produced, which parameters belong to which event.

There is a network effect: the more existing modules conform to a shared pattern, the better an LLM can produce a new module that fits. Today’s Biomass_core becomes the template for tomorrow’s Biomass_disturbance; today’s RSFpredict becomes the template for the next species’ SDM-as-module. The ecosystem builds itself faster, and stays consistent because nobody has to remember the conventions – the AI re-applies them on every conversion.

This chapter is a worked example. We take the linear R script built in Forecasting wildlife habitat (Chapter 18) and walk through the prompts that turn it into the RSFpredict module. The goal isn’t to show off the prompts; it’s to show that the conversion is now a short conversation, not a refactor project.

23.2 Worked example: turning an RSF forecast into a SpaDES module

We start from the linear script in Converting habitat model prediction code into a SpaDES module (Chapter 22): one setup section, one annual loop, no sim$, no doEvent. Four steps get us to a working module: Prompt 0 scaffolds the module’s directory with newModule(), before any AI is involved, then three AI prompts fill in the metadata, the init event, and the recurring events.

23.2.1 Prompt 0 – scaffold the module

Before any AI work, generate the standard module directory layout so the AI has a file to edit rather than a clean slate to author from scratch. SpaDES.core::newModule() writes the template:

SpaDES.core::newModule(
  name = "RSFpredict",
  path = "~/SpaDES_book/RSF_forecast/modules",
  open = FALSE
)

The skeleton has # ! ----- EDIT BELOW ----- ! markers showing where each section goes – the AI uses these as anchors and the diffs become trivially reviewable. You could ask the AI to generate the whole file in one shot instead, but starting from newModule()’s scaffold gives you a stable target that already conforms to SpaDES conventions, and keeps each prompt narrow.

23.2.2 Prompt 1 – fill in the metadata

Here is a linear R script that forecasts a Resource Selection Function across a simulated landscape, year by year. I already ran SpaDES.core::newModule("RSFpredict", path = ...) so the scaffold exists at modules/RSFpredict/RSFpredict.R. Replace the placeholder defineModule() block in that file with a real one. Include:

  • the required packages (reproducible, terra, glmmTMB, plus the ones the script uses);
  • parameters for every value I had in the params list (numBins, simulationProcess, predictionInterval, predictStartYear, predictLastYear, ts_else, .studyAreaName) – declare type and a sensible default for each;
  • inputObjects for the variables the script assumed were in scope: model (glmmTMB), studyArea (SpatVector), cohortData (data.table), and modelLand, pixelGroupMap, timeSinceFire, rasterToMatch (all SpatRaster);
  • outputObjects for pred, binMap, simLand, simPred, simBinMap.

Use timeunit = "year". No event code yet – just the metadata.

The AI’s output is essentially the defineModule(...) block shown in Chapter 22: a declaration of what the module needs, what it produces, and what its parameters are. The single most important thing to check is that the inputObjects names match, character for character, what the upstream modules (Biomass_core, scfm) produce. If they don’t, the project won’t wire up; if they do, you get composition for free.
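The name check can be done mechanically rather than by eye. The sketch below uses base R only. The three name vectors are typed in by hand for illustration and are assumptions, not the real metadata of Biomass_core or scfm; in practice you would copy the names from those modules' createsOutput declarations.

```r
# Hypothetical names, typed in by hand for illustration only.
upstream_outputs <- c("cohortData", "pixelGroupMap", "timeSinceFire")

# Objects passed in directly at the setupProject() call site.
user_supplied <- c("model", "studyArea")

# Inputs declared by the AI-drafted module; "pgMap" is deliberately misnamed.
declared_inputs <- c("model", "studyArea", "cohortData", "pgMap", "timeSinceFire")

# Any declared input that nothing provides will not be wired up.
unmet_inputs <- setdiff(declared_inputs, c(upstream_outputs, user_supplied))
print(unmet_inputs)
```

An empty result means every declared input has a provider; anything left over is a name to fix in the metadata before moving on to event code.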

23.2.3 Prompt 2 – fill in the init event + baseline computation

Now edit the doEvent.RSFpredict() stub newModule() already wrote. Keep the switch() shell but replace the init branch contents. The init event should:

  • schedule a buildBaselineRSFmap event at time(sim) (runs once, immediately);
  • if P(sim)$simulationProcess == "dynamic", also schedule simLayers and simRSFmap events at P(sim)$predictStartYear;
  • if P(sim)$predictLastYear is TRUE, schedule a final simLayers at end(sim).

Then write the buildBaselineRSFmap event – that’s the “setup section” of my linear script (the part before the for (thisYear in years) loop), translated to use sim$X instead of bare variables and P(sim)$X instead of params$X.

Don’t write the simLayers / simRSFmap events yet.
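A response to this prompt might look roughly like the sketch below. It is an illustration under stated assumptions, not the code of the published RSFpredict module: the event and parameter names come from the prompt, the helper buildBaselineRSFmapEvent() is hypothetical, and the function is meant to live in the module file, where the SpaDES.core functions scheduleEvent(), P(), time() and end() are available.

```r
# Sketch of the init and baseline branches described in Prompt 2.
# Intended for modules/RSFpredict/RSFpredict.R, where SpaDES.core is attached.
doEvent.RSFpredict <- function(sim, eventTime, eventType) {
  switch(
    eventType,
    init = {
      # Baseline map: runs once, at the current simulation time.
      sim <- scheduleEvent(sim, time(sim), "RSFpredict", "buildBaselineRSFmap")

      # Recurring events are only scheduled for a dynamic forecast.
      if (P(sim)$simulationProcess == "dynamic") {
        startYear <- P(sim)$predictStartYear
        sim <- scheduleEvent(sim, startYear, "RSFpredict", "simLayers")
        sim <- scheduleEvent(sim, startYear, "RSFpredict", "simRSFmap")
      }

      # Optional final covariate stack at the end of the simulation.
      if (isTRUE(P(sim)$predictLastYear)) {
        sim <- scheduleEvent(sim, end(sim), "RSFpredict", "simLayers")
      }
    },
    buildBaselineRSFmap = {
      # Hypothetical helper holding the translated setup section of the script.
      sim <- buildBaselineRSFmapEvent(sim)
    },
    warning("Undefined event type: ", eventType)
  )
  invisible(sim)
}
```

Keeping each branch short and pushing the real work into a helper makes the scheduling logic easy to review in a diff.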

What you should check in the AI’s response:

  • Every sim$X that’s read corresponds to a declared expectsInput, and every sim$X that’s written corresponds to a declared createsOutput. Mismatches here are the most common source of “object not found” errors at run time.
  • Every parameter accessor is P(sim)$X (or Par$X in newer SpaDES.core) – not params$X, not a hard-coded literal. If the AI inlined 10 for numBins, push back: “use P(sim)$numBins instead”.
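Both checks can be roughed out with a few lines of base R run over the module source as text. This is a crude sketch, not a SpaDES tool: the code excerpt and the declared names below are hypothetical, and a regular expression will miss references built in less direct ways (for example sim[["x"]]).

```r
# Hypothetical excerpt of AI-written event code, held as text for inspection.
module_code <- c(
  'bins <- params$numBins',
  'sim$pred <- predict(sim$model, newdata = sim$modelLand)',
  'sim$binMap <- classify(sim$pred, n = P(sim)$numBins)',
  'x <- sim$pgMap'
)

# Names declared in the metadata (inputs and outputs together); assumed here.
declared <- c("model", "modelLand", "pixelGroupMap", "pred", "binMap")

# Pull every sim$name reference out of the code.
# P(sim)$numBins does not match because ")" sits between "sim" and "$".
pattern <- "\\bsim\\$[A-Za-z.][A-Za-z0-9._]*"
hits <- regmatches(module_code, gregexpr(pattern, module_code, perl = TRUE))
used <- unique(sub("^sim\\$", "", unlist(hits)))

# Objects read or written without a matching declaration.
undeclared <- setdiff(used, declared)
print(undeclared)

# Lines that still use the old params$X accessor from the linear script.
stray_params <- grep("\\bparams\\$", module_code, value = TRUE, perl = TRUE)
print(stray_params)
```

In real use, module_code would come from readLines() on the module file, and declared from the names in its expectsInput and createsOutput calls.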

23.2.4 Prompt 3 – add the recurring events

Now add two new branches to the switch() in doEvent.RSFpredict(): simLayers and simRSFmap. Each should be the body of one iteration of the original for (thisYear in years) loop – simLayers is the part that builds the year’s covariate stack, simRSFmap is the part that predicts + classifies + writes GeoTIFFs. Replace thisYear with as.integer(time(sim)) and replace the bare lists simLand / simPred / simBinMap with the simList versions sim$simLand[[key]], etc.

At the end of each event, reschedule the same event at time(sim) + P(sim)$predictionInterval so it recurs.

Also: write the GeoTIFFs under file.path(outputPath(sim), paste0(P(sim)$.studyAreaName, "_sims")), creating that directory if needed.

The check after this prompt is whether the self-rescheduling line is present at the end of both events. Without it each event fires once and stops; the dynamic forecast becomes a static one and the bug is silent: you get output for the first scheduled year only, with no error to flag the missing years.
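The effect of the missing line is easy to see in a toy scheduler. The function below is a base R stand-in written for this illustration; it is not SpaDES.core and only mimics the idea that an event must put its next occurrence back on the queue.

```r
# Toy stand-in for an event queue; run_toy_sim() is hypothetical.
run_toy_sim <- function(start, end, interval, reschedule = TRUE) {
  queue <- start        # times at which the toy event is due
  fired <- numeric(0)   # times at which it actually ran
  while (length(queue) > 0) {
    now <- queue[1]
    queue <- queue[-1]
    if (now > end) break
    fired <- c(fired, now)
    # The self-rescheduling line: without it the queue empties after one run.
    if (reschedule) {
      queue <- c(queue, now + interval)
    }
  }
  fired
}

with_line <- run_toy_sim(2020, 2022, interval = 1, reschedule = TRUE)
without_line <- run_toy_sim(2020, 2022, interval = 1, reschedule = FALSE)
print(with_line)
print(without_line)
```

With rescheduling the event runs in 2020, 2021 and 2022; without it the event runs in 2020 only, and nothing signals that the later years are missing.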

23.2.5 What you end up with

Applied to the scaffold from Prompt 0, the three AI prompts together produce a 200-300 line RSFpredict.R roughly identical to the JWTurn/RSFpredict module we used in Chapter 18. Drop it under <projectPath>/modules/RSFpredict/RSFpredict.R and put any helper functions (like reclassifyCohortData()) under <projectPath>/modules/RSFpredict/R/.

23.3 Testing the module

The fastest sanity check is a minimal setupProject() that runs just enough of the upstream chain to populate the inputs RSFpredict needs (cohortData, pixelGroupMap, timeSinceFire). Two years are enough to know whether simLayers + simRSFmap fire correctly:

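# Put the Predictive Ecology r-universe ahead of the default repositories
repos <- c("https://predictiveecology.r-universe.dev", getOption("repos"))
options(repos = repos)
# Install pak only if it is missing, without attaching it to the search path
if (!requireNamespace("pak", quietly = TRUE)) install.packages("pak")
pak::pak(c("Require", "SpaDES.project"), ask = FALSE)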

library(SpaDES.project)

out <- setupProject(
  paths   = list(projectPath = "~/SpaDES_book/RSFpredict_test"),
  modules = c(
    "PredictiveEcology/Biomass_borealDataPrep@main",
    "PredictiveEcology/Biomass_core@main",
    file.path("PredictiveEcology/scfm@development/modules",
              c("scfmDataPrep", "scfmIgnition", "scfmEscape",
                "scfmSpread", "scfmDiagnostics")),
    "JWTurn/RSFpredict@main"
  ),
  params  = list(
    .globals   = list(.studyAreaName = "dehchoN", .useCache = c(".inputObjects", "init")),
    RSFpredict = list(simulationProcess = "dynamic", predictionInterval = 1)
  ),
  times   = list(start = 2020, end = 2022),
  packages = c("LandR", "climateData"),
  studyArea = reproducible::prepInputs(
    url = "https://drive.google.com/file/d/1ma5qRk2NNidLhrQoiLIzd7W5ogeTaH5-/view",
    fun = "terra::vect"),
  model = reproducible::prepInputs(
    url = "https://drive.google.com/file/d/1ILE0WmePubjHiwynSjV_EyetnUhuQIJU/view",
    fun = "readRDS")
)

results <- SpaDES.core::simInitAndSpades2(out)
SpaDES.core::completed(results)

completed(results) should show every scheduled event having fired: buildBaselineRSFmap once at year 2020, then simLayers and simRSFmap at every year from 2020 to 2022, because we set predictionInterval = 1. If simLayers and simRSFmap each appear only once, the recurring events aren’t rescheduling themselves, which is exactly the regression the self-scheduling line was supposed to prevent. If you see only init and buildBaselineRSFmap, the recurring events were never scheduled by init in the first place; check that simulationProcess is "dynamic".
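Reading the event table by eye gets tedious once upstream modules add their own events, so a small tally helps. The sketch below uses a hand-made data frame in place of the real completed(results) output; the column names eventTime, moduleName and eventType are assumed to match what completed() returns, and the rows are invented for illustration.

```r
# Hypothetical stand-in for the table returned by completed(results).
completed_events <- data.frame(
  eventTime  = c(2020, 2020, 2020, 2020, 2021, 2021, 2022, 2022),
  moduleName = "RSFpredict",
  eventType  = c("init", "buildBaselineRSFmap",
                 "simLayers", "simRSFmap",
                 "simLayers", "simRSFmap",
                 "simLayers", "simRSFmap")
)

# Keep only this module's events and count them by type.
rsf <- completed_events[completed_events$moduleName == "RSFpredict", ]
event_counts <- table(rsf$eventType)
print(event_counts)

# Years in which each event type fired, as a named list.
years_fired <- split(rsf$eventTime, rsf$eventType)

# Years in which simLayers was expected but did not fire.
expected_years <- 2020:2022
missing_years <- setdiff(expected_years, years_fired$simLayers)
print(missing_years)
```

A count of one for simLayers, or a non-empty missing_years, points at the rescheduling line; a simLayers entry that is absent altogether points at the init branch.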

23.4 When things break: iterating with AI

Workflow-level errors fall into a few recognisable shapes. Each one maps to a short follow-up prompt.

23.4.1 “Object X not supplied”

When I run simInitAndSpades2(out), it errors with object 'pixelGroupMap' not supplied. RSFpredict declares it as an expectsInput, and Biomass_core declares it as a createsOutput. What’s wrong?

Three usual answers from the AI: (a) Biomass_core isn’t actually in the modules list (typo or missing); (b) it is in the list, but listed after RSFpredict, so init of RSFpredict runs before Biomass_core creates the object – fix by reordering; (c) the class declared in expectsInput doesn’t match what the upstream module declares in createsOutput, so the wiring is rejected. The AI will usually tell you which.
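The first two diagnoses can be ruled in or out before asking the AI at all. The sketch below is base R only and works on the module specification strings as they are written in a setupProject() call; the shortened module list is an assumption made for illustration.

```r
# Module specifications as they might appear in setupProject(); list shortened here.
modules <- c(
  "PredictiveEcology/Biomass_borealDataPrep@main",
  "PredictiveEcology/Biomass_core@main",
  file.path("PredictiveEcology/scfm@development/modules", "scfmDataPrep"),
  "JWTurn/RSFpredict@main"
)

# Strip the account or path prefix, then any trailing @branch, to get module names.
module_names <- sub("@.*$", "", basename(modules))
print(module_names)

# (a) Is the upstream module in the list at all?
has_core <- "Biomass_core" %in% module_names

# (b) Is it listed before the module that consumes its output?
core_before_rsf <- match("Biomass_core", module_names) <
  match("RSFpredict", module_names)

print(c(has_core = has_core, core_before_rsf = core_before_rsf))
```

If both come back TRUE, the remaining suspect is the declaration itself, which is where pasting the two metadata blocks into the prompt pays off.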

23.4.2 “CRS mismatch” / “extents don’t match”

The simLayers event errors in terra::resample with “CRS does not match”. Both rasters say EPSG:3978 when I print them. What else could it be?

This is almost always slight numerical drift in the CRS definition (the proj4 or WKT string) between two layers built by different modules. The fix is one line: terra::crs(b) <- terra::crs(a) before the resample. That line relabels b without reprojecting it, so it is only safe when the two CRS really are the same. The AI will suggest this, or point at reproducible::postProcess(to = a) as the SpaDES-idiomatic version.

23.4.3 “Predictions only run for some years”

completed(results) shows simLayers fired at year 2020 but never after. Loop should run every 5 years until 2075. What’s wrong?

This is the missing self-reschedule line. Specifically: at the end of the simLayers event there must be a sim <- scheduleEvent(sim, time(sim) + P(sim)$predictionInterval, "RSFpredict", "simLayers"). The AI will spot this immediately if you paste the event body.

23.4.4 “I want predictions every 10 years instead of 5”

You don’t change the module – you change a parameter at the setupProject() call site:

params = list(RSFpredict = list(predictionInterval = 10))

That is exactly the point of declaring the value with defineParameter() in Prompt 1. The AI will remind you of this if you ask “how do I change the prediction interval”; if you find yourself editing the module to change the value, that’s a hint the parameter wasn’t exposed.
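One way to picture the override is as a named list laid over the module's defaults. The sketch below is an analogy in base R, not a description of how SpaDES merges parameters internally, and the default values are hypothetical.

```r
# Hypothetical defaults, standing in for what defineParameter() declares.
module_defaults <- list(
  numBins            = 10,
  predictionInterval = 5,
  simulationProcess  = "dynamic"
)

# What the user writes at the setupProject() call site.
call_site <- list(predictionInterval = 10)

# Values supplied at the call site win; everything else keeps its default.
effective <- utils::modifyList(module_defaults, call_site)
str(effective)
```

Only predictionInterval changes; numBins and simulationProcess keep their declared defaults, and the module file is never touched.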

23.5 See also

Converting habitat model prediction code into a SpaDES module (Chapter 22)

Forecasting wildlife habitat (Chapter 18)

Module Files and Metadata (Chapter 5)

?sec-modulesAndEvents