parallelism_in_calm
Victor Navarro
Source: vignettes/parallelism_in_calm.Rmd
Running experiments in parallel
With the advent of time-based models, version 0.51 of calm uses the future package to parallelize some operations. Thanks to the design philosophy of future, running things in parallel takes a single line of code.
Why run things in parallel?
We often need to run a model over many iterations, either because the design contains enough trial types that order effects are a concern, or because we want to run the same model with different parameters.
Let’s run the HeiDI model (Honey et al., 2020) over a long, random design. Let’s also enable verbosity via calm_verbosity, which uses the progressr package.
library(calm)
# enables progress bars (try it on your computer)
# calm_verbosity(TRUE)
pav_inhib <- data.frame(
  group = "group",
  phase1 = "50(US)/50AB/50#A",
  rand1 = TRUE
)
# set options to introduce more randomness
pars <- get_parameters(pav_inhib, model = "HDI2020")
exp <- make_experiment(pav_inhib,
  parameters = pars,
  model = "HDI2020",
  iterations = 100,
  miniblocks = FALSE
)
# time it
start <- proc.time()
pav_res <- run_experiment(exp)
end <- proc.time() - start
end
#> user system elapsed
#> 4.892 0.063 3.611
The timings are shown above; the elapsed column gives the wall-clock time in seconds.
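To make the three columns concrete, here is a minimal sketch using base R's system.time(), which reports the same kind of timings as the proc.time() subtraction above. Sys.sleep() is only a stand-in for a slow call and is not part of calm.

```r
# Sys.sleep() waits without computing anything, so it costs
# wall-clock time but almost no CPU time.
timing <- system.time(Sys.sleep(0.2))

timing[["elapsed"]]   # wall-clock seconds, roughly the sleep duration
timing[["user.self"]] # CPU seconds used by this R process, close to zero
```

When comparing sequential and parallel runs, elapsed is the column to compare, since it measures how long you actually waited.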
Let’s try parallelizing now.
Running an experiment in parallel
To run the same experiment in parallel, you need to enable a future plan. A “plan” is one of the strategies the future package can use to parallelize work (consult the future documentation for the alternatives). If you are running calm on a single computer, you’ll be using plan(multisession):
library(future)
plan(multisession)
start <- proc.time()
pav_res <- run_experiment(exp)
end <- proc.time() - start
end
#> user system elapsed
#> 0.718 0.124 3.351
# go back to non-parallel evaluations
plan(sequential)
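If you want to control how many background R sessions are used, the future package lets you pass a workers argument to plan(). The sketch below is an illustration only: the choice of two workers is an assumption, and a sensible value depends on your machine.

```r
library(future)

# How many cores does future consider available on this machine?
availableCores()

# Assumption: two workers is an arbitrary, illustrative choice.
plan(multisession, workers = 2)
n_workers <- nbrOfWorkers() # number of workers in the current plan
n_workers

# Return to sequential evaluation when done.
plan(sequential)
```

Starting more workers than you have cores rarely helps, and each extra session adds startup time and memory use.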
In the timed runs above, parallel evaluation barely helped: elapsed time went from 3.611 to 3.351 seconds. The future package trades ease of use for bulkier overheads. As those overheads tend to be constant, parallelization pays off more as you run more iterations.
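A toy calculation shows why a constant overhead matters less as the workload grows. Every number below is a made-up assumption rather than a measurement of calm or future, and the model assumes work divides perfectly among workers.

```r
# Toy model; all values are hypothetical.
overhead <- 3    # seconds of fixed parallel setup cost
per_iter <- 0.04 # seconds of work per iteration
workers <- 4     # number of parallel workers

iterations <- c(100, 1000, 10000)
sequential <- iterations * per_iter
parallel <- overhead + sequential / workers

timings <- data.frame(
  iterations,
  sequential,
  parallel,
  speedup = sequential / parallel
)
timings
```

Under these assumptions the two approaches break even at 100 iterations, and the speedup climbs toward the number of workers as iterations increase.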