Use case: minimise execution times by performing long-running tasks concurrently in separate processes.
Multiple long computes (model fits etc.) can be performed in parallel on available computing cores.
Use mirai() to evaluate an expression asynchronously in a separate, clean R process.
The following mimics an expensive calculation that eventually returns a random value.
x <- list(time = 2L, mean = 4)
m <- mirai({Sys.sleep(time); rnorm(5L, mean)}, time = x$time, mean = x$mean)
The mirai expression is evaluated in another process and hence must be self-contained, not referring to variables that do not already exist there.
Above, the variables time and mean are passed as part of the mirai() call.
A ‘mirai’ object is returned immediately - creating a mirai never blocks the session.
Whilst the async operation is ongoing, attempting to access a mirai’s data yields an ‘unresolved’ logical NA.
m
#> < mirai [] >
m$data
#> 'unresolved' logi NA
To check whether a mirai remains unresolved (yet to complete):
unresolved(m)
#> [1] TRUE
To wait for and collect the return value, use the mirai’s [] method:
m[]
#> [1] 5.585340 5.414438 2.800432 4.279327 4.478866
As a mirai represents an async operation, it is never necessary to wait for it - other code can continue to be run.
Once it completes, the return value automatically becomes available at $data.
m
#> < mirai [$data] >
m$data
#> [1] 5.585340 5.414438 2.800432 4.279327 4.478866
For easy programmatic use of mirai(), ‘.expr’ accepts a pre-constructed language object, and also a list of named arguments passed via ‘.args’.
So, the following would be equivalent to the above:
expr <- quote({Sys.sleep(time); rnorm(5L, mean)})
args <- list(time = x$time, mean = x$mean)
m <- mirai(.expr = expr, .args = args)
m[]
#> [1] 4.592024 4.552095 3.791351 3.828170 6.445934
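The use case above mentions running several long computations at once. A minimal sketch of that pattern follows, assuming no daemons have been set so that each mirai() call starts its own background process. The squaring is only a placeholder for a real model fit, and the names fits and results are illustrative.

```r
library(mirai)

# Launch three independent tasks; each mirai() call returns immediately.
# The squaring below is a placeholder standing in for a long computation.
fits <- lapply(1:3, function(i) {
  mirai({Sys.sleep(0.5); i^2}, i = i)
})

# Collect the values; `[]` waits for each mirai to resolve.
results <- lapply(fits, function(m) m[])
```

As the three tasks run in separate processes, the total wait should be closer to the duration of one task than to the sum of all three.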
Use case: ensure execution flow of the main process is not blocked.
High-frequency real-time data cannot be written to file/database synchronously without disrupting the execution flow.
Cache data in memory and use mirai() to perform periodic write operations concurrently in a separate process.
Below, ‘.args’ is used to pass environment(), which is the calling environment.
This provides a convenient method of passing in existing objects.
x <- rnorm(1e6)
file <- tempfile()
m <- mirai(write.csv(x, file = file), .args = environment())
A ‘mirai’ object is returned immediately.
unresolved() may be used in control flow statements to perform actions which depend on resolution of the ‘mirai’, both before and after.
This means there is no need to actually wait (block) for a ‘mirai’ to resolve, as the example below demonstrates.
# unresolved() queries for resolution itself so no need to use it again within the while loop
while (unresolved(m)) {
  cat("while unresolved\n")
  Sys.sleep(0.5)  # pause between checks
}
#> while unresolved
#> while unresolved
cat("Write complete:", is.null(m$data))
#> Write complete: TRUE
Now actions which depend on the resolution may be processed, for example the next write.
Use case: isolating code that can potentially fail in a separate process to ensure continued uptime.
As part of a data science / machine learning pipeline, iterations of model training may periodically fail for stochastic and uncontrollable reasons (e.g. buggy memory management on graphics cards).
Running each iteration in a ‘mirai’ isolates this potentially-problematic code such that even if it does fail, it does not bring down the entire pipeline.
run_iteration <- function(i) {
  if (runif(1) < 0.1) stop("random error\n", call. = FALSE) # simulates a stochastic error rate
  sprintf("iteration %d successful\n", i)
}

for (i in 1:10) {
  m <- mirai(run_iteration(i), environment())
  # call_mirai() waits for resolution; retry for as long as an error value is returned
  while (is_error_value(call_mirai(m)$data)) {
    cat(m$data)
    m <- mirai(run_iteration(i), environment())
  }
  cat(m$data)
}
#> iteration 1 successful
#> iteration 2 successful
#> iteration 3 successful
#> iteration 4 successful
#> iteration 5 successful
#> iteration 6 successful
#> iteration 7 successful
#> iteration 8 successful
#> Error: random error
#> iteration 9 successful
#> iteration 10 successful
Further, by testing the return value of each ‘mirai’ for errors, error-handling code is then able to automate recovery and re-attempts, as in the above example. Further details on error handling can be found in the section below.
The end result is a resilient and fault-tolerant pipeline that minimises downtime by eliminating interruptions of long computes.
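The retry loop above can be wrapped into a reusable function with an upper bound on attempts, so that a task which always fails does not loop forever. This is a sketch only: retry_mirai(), max_attempts and fun_args are hypothetical names introduced here and are not part of mirai.

```r
library(mirai)

# Hypothetical helper: evaluate fun(...) in a mirai, retrying on error values.
retry_mirai <- function(fun, ..., max_attempts = 3L) {
  fun_args <- list(...)
  for (attempt in seq_len(max_attempts)) {
    m <- mirai(do.call(fun, fun_args), fun = fun, fun_args = fun_args)
    res <- call_mirai(m)$data          # wait for the mirai to resolve
    if (!is_error_value(res)) return(res)
  }
  stop("all ", max_attempts, " attempts failed: ", res, call. = FALSE)
}

answer <- retry_mirai(function(x) x + 1, 41)
```

Capping the number of attempts is a design choice: it trades unconditional persistence for a guarantee that the pipeline eventually reports a persistent failure.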
Daemons, or persistent background processes, may be set to receive ‘mirai’ requests.
This is potentially more efficient as new processes no longer need to be created on an ad hoc basis.
Daemons inherit the default system configuration and read in the relevant ‘.Renviron’ and ‘.Rprofile’ etc. on startup.
They also load the default packages.
To instead only load the base package (which cuts out more than half of R’s startup time), the environment variable R_SCRIPT_DEFAULT_PACKAGES=NULL may be set prior to launching daemons.
Call daemons() specifying the number of daemons to launch.
daemons(6)
#> [1] 6
To view the current status, status() provides the number of active connections, the URL daemons connect to, and a named vector showing the number of awaiting, executing and completed tasks:
awaiting: number of tasks queued for execution at dispatcher
assigned: number of tasks sent to a daemon for execution
complete: number of tasks for which the result has been received (either completed or cancelled)
status()
#> $connections
#> [1] 6
#> $daemons
#> [1] "abstract://d50d377409643074e3ba74b8"
#> $mirai
#> awaiting executing completed
#> 0 0 0
The default dispatcher = TRUE creates a dispatcher() background process that connects to individual daemon processes on the local machine.
This ensures that tasks are dispatched efficiently on a first-in first-out (FIFO) basis to daemons for processing.
Tasks are queued at dispatcher and sent to a daemon as soon as it can accept the task for immediate execution.
Dispatcher uses synchronisation primitives from nanonext, waiting upon tasks rather than polling for them at intervals. This is efficient both in consuming no resources while waiting, and in being fully synchronised with events (having no latency).
daemons(0)
#> [1] 0
Set the number of daemons to zero to reset. This reverts to the default of creating a new background process for each ‘mirai’ request.
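As a small illustration of persistent daemons, the sketch below sends four tasks to two daemons and records which process handled each. The variable names are illustrative; the process IDs returned will differ on every run.

```r
library(mirai)

daemons(2)  # two persistent local daemons (dispatcher is the default)

# Send more tasks than daemons; the surplus is queued until a daemon is free.
ms <- lapply(1:4, function(i) mirai(Sys.getpid()))
pids <- vapply(ms, function(m) m[], integer(1))

# At most two distinct process IDs: processes are reused rather than created per task.
n_workers <- length(unique(pids))

daemons(0)  # reset
```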
Alternatively, specifying dispatcher = FALSE, the background daemons connect directly to the host process.
daemons(6, dispatcher = FALSE)
#> [1] 6
Requesting the status now shows 6 connections, along with the host URL at $daemons:
status()
#> $connections
#> [1] 6
#> $daemons
#> [1] "abstract://e6b1473a5598094c8308a67f"
This implementation sends tasks immediately, and ensures that tasks are evenly-distributed amongst daemons. This means that optimal scheduling is not guaranteed as the duration of tasks cannot be known a priori. As an example, tasks could be queued at a daemon behind a long-running task, whilst other daemons are idle having already completed their tasks.
The advantage of this approach is that it is low-level and does not require an additional dispatcher process. It is well-suited to working with similar-length tasks, or where the number of concurrent tasks typically does not exceed available daemons.
everywhere() may be used to evaluate an expression on all connected daemons and persist the resultant state, regardless of a daemon’s ‘cleanup’ setting.
everywhere(library(DBI))
The above keeps the DBI package loaded for all evaluations.
Other types of setup task may also be performed, including making a common resource available, such as a database connection:
file <- tempfile()
everywhere(con <<- dbConnect(RSQLite::SQLite(), file), file = file)
By super-assignment, the connection ‘con’ will be available in the global environment of all daemon instances. Subsequent mirai calls may then make use of ‘con’.
m <- mirai(capture.output(str(con)))
m[]
#> [1] "Formal class 'SQLiteConnection' [package \"RSQLite\"] with 8 slots"
#> [2] " ..@ ptr :<externalptr> "
#> [3] " ..@ dbname : chr \"/tmp/RtmpsF24Di/file13e131bed651d\""
#> [4] " ..@ loadable.extensions: logi TRUE"
#> [5] " ..@ flags : int 70"
#> [6] " ..@ vfs : chr \"\""
#> [7] " ..@ ref :<environment: 0x5650701ad1e0> "
#> [8] " ..@ bigint : chr \"integer64\""
#> [9] " ..@ extended_types : logi FALSE"
Disconnect from the database everywhere, and set the number of daemons to zero to reset.
everywhere(dbDisconnect(con))
daemons(0)
#> [1] 0
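The same pattern works for any shared state, not only database connections. The sketch below, which assumes a single daemon to keep things simple, super-assigns a value with everywhere() and then reads it from a later mirai. The names scale_factor and result are illustrative.

```r
library(mirai)

daemons(1)  # a single daemon keeps the sketch simple

# Super-assign a value into the daemon's global environment.
everywhere(scale_factor <<- 10)

# A later mirai can refer to the persisted object.
m <- mirai(x * scale_factor, x = 2)
result <- m[]

daemons(0)  # reset
```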
daemons() has a with() method, which evaluates an expression with daemons created for the duration of the expression and automatically torn down upon completion.
It was designed for the use case of running a Shiny app with the desired number of daemons.
with(daemons(4), shiny::runApp(app))
Note: in the above case, it is assumed the app is already created.
Wrapping a call to shiny::shinyApp() would not work, as runApp() is implicitly called when the app is printed. Printing occurs only after with() has returned, hence the app would run outside of the scope of the with() statement.
In the case of a Shiny app, all mirai calls will be executed before the app returns. In the case of other expressions, be sure to call the results (or collect the values) of all mirai within the expression so that daemons are not reset before they have all completed.
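For a non-Shiny expression, the advice above amounts to collecting every result inside the with() call. A minimal sketch, with illustrative names:

```r
library(mirai)

res <- with(daemons(2), {
  ms <- lapply(1:3, function(i) mirai(i^2, i = i))
  # Collect inside the expression, before the daemons are torn down.
  lapply(ms, function(m) m[])
})
```

Returning the unresolved mirai objects instead of their values would risk the daemons being reset before the tasks complete.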
The daemons interface may also be used to send tasks for computation to remote daemon processes on the network.
Call daemons() specifying ‘url’ as a character string such as: ‘tcp://’ at which daemon processes should connect.
Alternatively, use host_url() to automatically construct a valid URL.
The host / dispatcher listens at this address, utilising a single port.
IPv6 addresses are also supported and must be enclosed in square brackets [] to avoid confusion with the final colon separating the port.
For example, port 5555 on the IPv6 address ::ffff:a6f:50d would be specified as tcp://[::ffff:a6f:50d]:5555
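The bracketing rule can be expressed as a small helper. This is an illustrative sketch: make_tcp_url() is a hypothetical function written for this example and is not part of mirai, which provides host_url() for constructing URLs.

```r
# Hypothetical helper: build a URL string, bracketing IPv6 addresses.
make_tcp_url <- function(host, port, tls = FALSE) {
  # An IPv6 address contains colons, so wrap it to separate it from the port.
  if (grepl(":", host, fixed = TRUE)) host <- paste0("[", host, "]")
  sprintf("%s://%s:%d", if (tls) "tls+tcp" else "tcp", host, as.integer(port))
}

make_tcp_url("::ffff:a6f:50d", 5555)
#> [1] "tcp://[::ffff:a6f:50d]:5555"
```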
For options on actually launching the daemons, please see the next section.
Below, calling host_url() without a port value uses the default of ‘0’.
This is a wildcard value that will automatically cause a free ephemeral port to be assigned:
daemons(url = host_url())
#> [1] 0
The actual assigned port may be queried at any time via status():
status()
#> $connections
#> [1] 0
#> $daemons
#> [1] "tcp://hostname:38115"
#> $mirai
#> awaiting executing completed
#> 0 0 0
Dispatcher automatically adjusts to the number of daemons actually connected. Hence it is possible to dynamically scale up or down the number of daemons according to requirements.
To reset all connections and revert to default behaviour:
daemons(0)
#> [1] 0
Closing the connection causes the dispatcher to exit automatically, and in turn all connected daemons when their respective connections with the dispatcher are terminated.
To launch remote daemons, supply a remote launch configuration to the ‘remote’ argument of daemons() when setting up daemons, or launch_remote() at any time afterwards.
ssh_config() may be used to generate a remote launch configuration if there is SSH access to the remote machine.
Otherwise remote_config() provides a flexible method for generating a configuration involving a custom resource manager / application.
This method is appropriate for internal networks and in trusted, properly-configured environments where it is safe for your machine to accept incoming connections on certain ports. In the examples below, the remote daemons connect back directly to port 5555 on the local machine.
In these cases, using TLS is often desirable to provide additional security to the connections.
The first example below launches 4 daemons on the machine (using the default SSH port of 22 as this was not specified), connecting back to the host URL:
daemons(
  n = 4,
  url = host_url(tls = TRUE, port = 5555),
  remote = ssh_config("ssh://")
)
The second example below launches one daemon on each of two machines, using the custom SSH port of 222:
daemons(
  n = 1,
  url = host_url(tls = TRUE, port = 5555),
  remote = ssh_config(c("ssh://", "ssh://"))
)
Use SSH tunnelling to launch daemons on any machine you are able to access via SSH, whether on the local network or the cloud. SSH key-based authentication must already be in place, but no other configuration is required.
This provides a convenient way to launch remote daemons without them needing to directly access the host. Firewall configurations or security policies often prevent opening a port to accept outside connections. In these cases, SSH tunnelling creates a tunnel once the initial SSH connection is made. For simplicity, the implementation in mirai uses the same tunnel port on both the host and daemon.
To use tunnelling, supply a URL with hostname of ‘’ to ‘url’ for the daemons() call.
local_url(tcp = TRUE) does this for you. For example, if local_url(tcp = TRUE, port = 5555) is specified, the tunnel is created using port 5555 on each machine.
The host listens on this port on its side, and the daemons each dial into the same port on their own respective machines.
The example below launches 2 daemons on the remote machine using SSH tunnelling:
daemons(
  n = 2,
  url = local_url(tcp = TRUE),
  remote = ssh_config("ssh://", tunnel = TRUE)
)
remote_config() may be used to run a command to deploy daemons using a resource manager.
Taking Slurm as an example, the following uses sbatch to launch a daemon on the cluster, with some additional arguments to sbatch specifying the resource allocation:
daemons(
  n = 2,
  url = host_url(),
  remote = remote_config(
    command = "sbatch",
    args = c("--mem 512", "-n 1", "--wrap", "."),
    rscript = file.path(R.home("bin"), "Rscript"),
    quote = TRUE
  )
)
As an alternative to automated launches, calling launch_remote() without specifying ‘remote’ may be used to return the shell commands for deploying daemons manually.
The printed return values may be copied and pasted directly to a remote machine.
daemons(url = host_url())
#> [1] 0
#> [1]
#> Rscript -e 'mirai::daemon("tcp://hostname:46387",dispatcher=TRUE,rs=c(10407,-750570516,-899706883,2027103898,-1599751341,-896714632,-1754509287))'
#> [2]
#> Rscript -e 'mirai::daemon("tcp://hostname:46387",dispatcher=TRUE,rs=c(10407,-1273243929,-450408256,-2121905579,1494473326,-1556967427,669883809))'
daemons(0)
#> [1] 0
TLS is available as an option to secure communications from the local machine to remote daemons.
An automatic zero-configuration default is implemented.
Simply specify a secure URL using the scheme tls+tcp:// when setting daemons, or use host_url(tls = TRUE), for example:
daemons(url = host_url(tls = TRUE))
#> [1] 0
Single-use keys and certificates are automatically generated and configured, without requiring any further intervention. The private key is always retained on the host machine and never transmitted.
The generated self-signed certificate is available via launch_remote().
This function conveniently constructs the full shell command to launch a daemon, including the correctly specified ‘tls’ argument to daemon().
#> [1]
#> Rscript -e 'mirai::daemon("tls+tcp://hostname:44215",dispatcher=TRUE,tls=c("-----BEGIN CERTIFICATE-----
#> ADCCAgoCggIBALJm+Q+mbBnwTUDHNBgsDwNAi8mbxv6uuFsZepQyDN7dvKLCdsz3
#> gnBABWv+0l58DZ1QkKnFZqDZdb8yvEZ1ECbSZuXXe7EklqV5kn6zrFxeBdaJ2VP7
#> ZEo6TcSuK5CLRssZLS70nkVS+uCEUIZFqeaQrV4f/cyJxZMey2flKNMHur08+MJ1
#> vvO9UJvzqWLGYUZtR0MuY8XUY1LQPUZhjFXvJS2YDi1pfRLvfeXSv2zr8+XDfDaj
#> URLCy1mc5Mg4y1gHM8ddMJyKjQKSgpiPYKfIFJPImiomvcDPBUsdf7tkgT5RKxEQ
#> opdBVXuhSKxazuGSUZaVkIfrzUUW23CwbQGOoJu6IJLvsiQtCi3dXu0X+Yoh50Op
#> S6IbYrGEDNfBPiT3QpLPdDqf6wf8KHr9XXpAFuuOKLOdDgsA/5apXLTDjv8/ILg8
#> xjDd4HGF5EKqENoV8dkmkge5lYuzIgBhVJcN7Rv/9erOvAjsiiHGuf+OXzQUCdCA
#> RVklcXZ3YynDL8teOs7MM8Uz/yTH21pJxywM6qz9/fM3PGan0BbnSkPQomxjlVJk
#> mPqqm9WoGznXKCgyjrBYRMt38FU8XzZXi1ank3Us8XSFtDzkBJkGVs1vrWXoPlEK
#> S/g8q6EFCc5u35AmET+vATNFH9LNBTfZT2dwC1SeIcUSp0UCuC+cn5rtAgMBAAGj
#> DQEBCwUAA4ICAQA4c+lz6wVI8FTvhdx9k7WWIlrTjWa7r/xK+NcQmDlQAZz9Jtal
#> DIMaNGxLUdhDKgjj2+kCAHjepSJQIwwyVjImCrjmv6vSs1rRhyh4Eohn+xYCeNJf
#> 7fTuKL/r2tgEoxu/XEoljMeaXbUL162698L4gVVB8xeKGT6x/weiPptRGkIGYqJW
#> 0p80w9DBdWWOKXqg3UBrH9epythB0jQ8t1LwBf/EBemZJ+GYtlLuAlJKHs2V0kWL
#> 4f9YePl6/eVs5dfKu4szMV3enNqGzssEP6ZMSJb3uEfqEkLsxCgbW76AuKNYCjZ4
#> TvkdBo4RQh0zgur1wy/ggqGG/OVLQb4gYv9QfpPJQq2TsEIXqtX6+eUohLfd2+P3
#> JOHvfPUv+y2FPA3G2X+GJwX+jLXPaAC3vRfcTXMxlyEprN0CFeTBM6b6SULxczmI
#> 1lidGnB8DKaMfc7mDn/Fk5a5nC2CB1zxG/lkwcz8k6Is/OhXxsbbgzqHQpMJpKSa
#> q00EfC9AAQmjGjwWkQG06zYC5nVdtw6q0b3uEIAAPUQIY0Jyhlj2vwMdhrwGgUug
#> HUjLmN5CqWoYSft4EVkLp+o6f/1N4j28VVLBEwghWCStvxHtZp9fJuHrHWao7BVf
#> RNb3SwGI7jI1iQn3WtnpreZ1hgbgn/Vumg0kpCOKY8LQUtd3lnvFrO8d7Q==
#> -----END CERTIFICATE-----
#> ",""),rs=c(10407,668596725,-181167566,-1822882805,-2021143600,954713489,473076254))'
The printed value may be deployed directly on a remote machine.
As an alternative to the zero-configuration default, a certificate may also be generated via a Certificate Signing Request (CSR) to a Certificate Authority (CA). The CA may be a public CA or internal to an organisation.
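The received certificate should be a block of text between the markers -----BEGIN CERTIFICATE----- and -----END CERTIFICATE-----. Make sure to request the certificate in the PEM format. If only available in other formats, the TLS library used should usually provide conversion utilities.
The private key should likewise be a block of text between the markers -----BEGIN PRIVATE KEY----- and -----END PRIVATE KEY-----.
If the certificate and private key have been imported as the character strings cert and key respectively, then the ‘tls’ argument may be specified as the character vector c(cert, key).
When launching daemons, the certificate chain should be supplied to the ‘tls’ argument of daemon() or launch_remote().
The certificate chain comprises certificates that are each enclosed between -----BEGIN CERTIFICATE----- and -----END CERTIFICATE----- markers. The first one should be the newly-generated TLS certificate, the same supplied to daemons(), and the final one should be a CA root certificate.
If these are concatenated together as a single character string certchain, then the character vector comprising this and an empty character string c(certchain, "") may be supplied to the relevant ‘tls’ argument.
The daemons() interface also allows the specification of compute profiles for managing tasks with heterogeneous compute requirements: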
Simply specify the argument .compute when calling daemons() with a profile name (which is ‘default’ for the default profile).
The daemons settings are saved under the named profile.
To create a ‘mirai’ task using a specific compute profile, specify the ‘.compute’ argument to mirai(), which defaults to the ‘default’ compute profile.
Similarly, functions such as status(), launch_local() or launch_remote() should be specified with the desired ‘.compute’ argument.
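A minimal sketch of two compute profiles side by side follows. The profile names "cpu" and "gpu" are purely illustrative labels chosen for this example; both profiles simply launch one local daemon each.

```r
library(mirai)

# Two independent profiles, each with its own daemon.
daemons(1, .compute = "cpu")
daemons(1, .compute = "gpu")

# Tasks are routed by naming the profile in the mirai() call.
m1 <- mirai(Sys.getpid(), .compute = "cpu")
m2 <- mirai(Sys.getpid(), .compute = "gpu")
pids <- c(m1[], m2[])

# Each profile is reset separately.
daemons(0, .compute = "cpu")
daemons(0, .compute = "gpu")
```

The two process IDs differ, as each profile is served by its own daemon.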
If execution in a mirai fails, the error message is returned as a character string of class ‘miraiError’ and ‘errorValue’ to facilitate debugging.
is_mirai_error() may be used to test for mirai execution errors.
m1 <- mirai(stop("occurred with a custom message", call. = FALSE))
m1[]
#> 'miraiError' chr Error: occurred with a custom message
m2 <- mirai(mirai::mirai())
m2[]
#> 'miraiError' chr Error in mirai::mirai(): missing expression, perhaps wrap in {}?
is_mirai_error(m2$data)
#> [1] TRUE
is_error_value(m2$data)
#> [1] TRUE
A full stack trace of evaluation within the mirai is recorded and accessible at $stack.trace on the error object.
f <- function(x) if (x > 0) stop("positive")
m3 <- mirai({f(-1); f(1)}, f = f)
#> 'miraiError' chr Error in f(1): positive
#> [[1]]
#> stop("positive")
#> [[2]]
#> f(1)
Elements of the original error condition are also accessible via $ on the error object.
For example, additional metadata recorded by rlang::abort() is preserved:
f <- function(x) if (x > 0) stop("positive")
m4 <- mirai(rlang::abort("aborted", meta_uid = "UID001"))
#> 'miraiError' chr Error: aborted
#> [1] "UID001"
If a daemon instance is sent a user interrupt, the mirai will resolve to an object of class ‘miraiInterrupt’ and ‘errorValue’.
is_mirai_interrupt() may be used to test for such interrupts.
m4 <- mirai(rlang::interrupt()) # simulates a user interrupt
#> [1] TRUE
If execution of a mirai surpasses the timeout set via the ‘.timeout’ argument, the mirai will resolve to an ‘errorValue’ of 5L (timed out). This can, amongst other things, guard against mirai processes that have the potential to hang and never return.
m5 <- mirai(nanonext::msleep(1000), .timeout = 500)
#> 'errorValue' int 5 | Timed out
#> [1] FALSE
#> [1] FALSE
#> [1] TRUE
is_error_value() tests for all mirai execution errors, user interrupts and timeouts.
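The three test functions can be combined to branch on the kind of failure. The sketch below uses a hypothetical helper, describe_result(), written for this example. The order of the checks matters: is_error_value() is also TRUE for errors and interrupts, so it is tested last.

```r
library(mirai)

# Hypothetical helper: label the outcome of a resolved mirai value.
describe_result <- function(x) {
  if (is_mirai_error(x)) {
    "execution error"
  } else if (is_mirai_interrupt(x)) {
    "interrupt"
  } else if (is_error_value(x)) {
    "other error value (e.g. timeout)"
  } else {
    "ok"
  }
}

ok <- mirai(1 + 1)
failed <- mirai(stop("boom"))
slow <- mirai(Sys.sleep(2), .timeout = 200)  # times out after 200 ms

outcomes <- vapply(
  list(ok, failed, slow),
  function(m) describe_result(m[]),
  character(1)
)
```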
Native R serialization is used for sending data between host and daemons. Some R objects by their nature cannot be serialized, such as those accessed via an external pointer. In these cases, performing ‘mirai’ operations on them would normally error.
Using the arrow package as an example:
library(arrow, warn.conflicts = FALSE)
#> [1] 1
x <- as_arrow_table(iris)
m <- mirai(list(a = head(x), b = "some text"), x = x)
#> 'miraiError' chr Error: Invalid <Table>, external pointer to null
#> [1] 0
However, serial_config() can be used to create custom serialization configurations, specifying functions that hook into R’s native serialization mechanism for reference objects (‘refhooks’).
This configuration may then be passed to the ‘serial’ argument of a daemons() call:
cfg <- serial_config(
  class = "ArrowTabular",
  sfunc = arrow::write_to_raw,
  ufunc = function(x) arrow::read_ipc_stream(x, as_data_frame = FALSE)
)
daemons(1, serial = cfg)
#> [1] 1
m <- mirai(list(a = head(x), b = "some text"), x = x)
m[]
#> $a
#> Table
#> 6 rows x 5 columns
#> $Sepal.Length <double>
#> $Sepal.Width <double>
#> $Petal.Length <double>
#> $Petal.Width <double>
#> $Species <dictionary<values=string, indices=int8>>
#> See $metadata for additional Schema metadata
#> $b
#> [1] "some text"
#> [1] 0
It can be seen that this time, the arrow table is seamlessly handled in the ‘mirai’ process. This is the case even when the object is deeply nested inside lists or other structures.
Different serialization functions may be registered for different compute profiles.
As an example, the ‘polars’ profile can be set up to use polars, a ‘lightning fast’ dataframe library written in Rust (requires polars >= 0.16.4).
daemons(
  n = 1,
  serial = serial_config(
    class = "RPolarsDataFrame",
    sfunc = function(x) polars::as_polars_df(x)$to_raw_ipc(),
    ufunc = polars::pl$read_ipc
  ),
  .compute = "polars"
)
#> [1] 1
x <- polars::as_polars_df(iris)
m <- mirai(list(a = head(x), b = "some text"), x = x, .compute = "polars")
m[]
#> $a
#> shape: (6, 5)
#> ┌──────────────┬─────────────┬──────────────┬─────────────┬─────────┐
#> │ Sepal.Length ┆ Sepal.Width ┆ Petal.Length ┆ Petal.Width ┆ Species │
#> │ --- ┆ --- ┆ --- ┆ --- ┆ --- │
#> │ f64 ┆ f64 ┆ f64 ┆ f64 ┆ cat │
#> ╞══════════════╪═════════════╪══════════════╪═════════════╪═════════╡
#> │ 5.1 ┆ 3.5 ┆ 1.4 ┆ 0.2 ┆ setosa │
#> │ 4.9 ┆ 3.0 ┆ 1.4 ┆ 0.2 ┆ setosa │
#> │ 4.7 ┆ 3.2 ┆ 1.3 ┆ 0.2 ┆ setosa │
#> │ 4.6 ┆ 3.1 ┆ 1.5 ┆ 0.2 ┆ setosa │
#> │ 5.0 ┆ 3.6 ┆ 1.4 ┆ 0.2 ┆ setosa │
#> │ 5.4 ┆ 3.9 ┆ 1.7 ┆ 0.4 ┆ setosa │
#> └──────────────┴─────────────┴──────────────┴─────────────┴─────────┘
#> $b
#> [1] "some text"
daemons(0, .compute = "polars")
#> [1] 0
The ‘vec’ argument to serial_config() may be specified as TRUE if the serialization functions are vectorized and take lists of objects, as is the case for safetensors, used for serialization in torch.
Please refer to the torch vignette for further examples.
mirai_map() performs asynchronous parallel/distributed map using mirai.
This function is similar to purrr::map(), but returns a ‘mirai_map’ object.
It is also more advanced as it allows multiple map over the rows of a dataframe or matrix, and can in fact be used to implement all map variations from that package.
The results of a mirai_map x may be collected using x[].
This waits for all asynchronous operations to complete if still in progress.
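mirai_map() does not split the work into chunks in advance, as is done for example in the parallel package. Chunking cannot take into account varying or unpredictable compute times over the indices. It can be optimal to rely on mirai for scheduling instead. This is demonstrated in the example below.
library(mirai)
library(parallel)
cl <- make_cluster(4)
daemons(4)
#> [1] 4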
vec <- c(1, 1, 4, 4, 1, 1, 1, 1)
system.time(mirai_map(vec, Sys.sleep)[])
#> user system elapsed
#> 0.005 0.004 4.007
system.time(parLapply(cl, vec, Sys.sleep))
#> user system elapsed
#> 0.007 0.004 8.011
#> [1] 0
.args is used to specify further constant arguments to .f - the ‘mean’ and ‘sd’ in the example below:
with(
  daemons(3, dispatcher = FALSE),
  mirai_map(1:3, rnorm, .args = list(mean = 20, sd = 2))[]
)
#> [[1]]
#> [1] 23.66101
#> [[2]]
#> [1] 16.24596 18.88544
#> [[3]]
#> [1] 21.08590 20.19166 18.56426
Use ... to further specify objects referenced but not defined in .f - the ‘do’ in the anonymous function below:
daemons(4)
#> [1] 4
ml <- mirai_map(
  c(a = 1, b = 2, c = 3),
  function(x) do(x, as.logical(x %% 2)),
  do = nanonext::random
)
ml
#> < mirai map [0/3] >
ml[]
#> $a
#> [1] "2a"
#> $b
#> [1] 5b aa
#> $c
#> [1] "45f24b"
Use of mirai_map() requires that daemons() have previously been set, and will error if not.
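Putting these pieces together, a minimal end-to-end sketch follows: set daemons, map with a constant argument supplied via ‘.args’, collect, then reset. The argument name power and the variable names are illustrative.

```r
library(mirai)

daemons(2)  # mirai_map() requires daemons to be set first

# Map over 1:4, passing `power` as a constant argument to .f.
mp <- mirai_map(1:4, function(x, power) x^power, .args = list(power = 2))

# Collect the results as a flat vector rather than a list.
squares <- mp[.flat]

daemons(0)  # reset
```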
When collecting the results, optionally specify arguments to []:
x[.flat] collects and flattens the results, checking that they are of the same type to avoid coercion.
x[.progress] collects results using a cli progress bar, if available, showing completion percentage and ETA, or else a simple text progress indicator of parts completed of the total. If the map operation completes quickly, the cli progress bar may not show at all, and this is by design.
x[.stop] collects the results applying early stopping, which stops at the first failure and cancels remaining computations. If the cli package is available, it will be used for displaying the error message.
Combinations of the above may be supplied in the fashion of x[.stop, .progress].
mirai_map(list(a = 1, b = "a", c = 3), function(x) exp(x))[.stop]
#> Error in `mirai_map()`:
#> ℹ In index: 2.
#> ℹ With name: b.
#> Caused by error in `exp()`:
#> ! non-numeric argument to mathematical function
with(
  daemons(4, dispatcher = FALSE, .compute = "sleep"),
  mirai_map(c(0.1, 0.2, 0.3), Sys.sleep, .compute = "sleep")[.progress, .flat]
)
#> [1] 0
Multiple map is performed over the rows of a dataframe or matrix, as this is most often the desired behaviour.
This allows map over 2 or more arguments by specifying a dataframe. One of those may be an index value for indexed map.
The function .f must take as many arguments as there are columns, either explicitly or via ‘...’.
fruit <- c("melon", "grapes", "coconut")
# create a dataframe for indexed map:
df <- data.frame(i = seq_along(fruit), fruit = fruit)
with(
  daemons(3, dispatcher = FALSE),
  mirai_map(df, sprintf, .args = list(fmt = "%d. %s"))[.flat]
)
#> [1] "1. melon" "2. grapes" "3. coconut"
As a dataframe often contains columns of differing type, it is unusual to want to map over the columns. This is nevertheless possible by simply transforming it beforehand into a list using as.list().
Similarly, the behaviour of lapply() or purrr::map() on a matrix is the same as that for a vector.
mirai_map() on the other hand does take into account the fact that the matrix has dimensions, and maps over its rows, consistent with the behaviour for dataframes.
If instead, mapping over the columns is desired, simply take the transpose of the matrix beforehand using t().
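The row-wise behaviour and the transpose idiom can be sketched as follows, using a small 2 x 3 matrix and a function that accepts its arguments via ‘...’. The variable names are illustrative.

```r
library(mirai)

mat <- matrix(1:6, nrow = 2)  # rows: (1, 3, 5) and (2, 4, 6)

sums <- with(daemons(2, dispatcher = FALSE), {
  # Each row is passed to .f as separate arguments, one per column.
  by_row <- mirai_map(mat, function(...) sum(...))[.flat]
  # Transposing first maps over the original columns instead.
  by_col <- mirai_map(t(mat), function(...) sum(...))[.flat]
  list(rows = by_row, cols = by_col)
})
```

Here sums$rows holds the two row totals and sums$cols the three column totals.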
mirai as a framework is designed to support completely transparent and inter-operable use within packages. A core design precept of not relying on global options or environment variables minimises the likelihood of conflict between use by different packages.
There are hence few requirements of package authors, but a few important points to note nevertheless:
daemons() settings should almost always be left to end-users.
Consider re-exporting daemons() in your package as a convenience.
Do not include a daemons() call if you use mirai_map() or a function that wraps it such as purrr::map(.parallel = TRUE).
This is important to ensure that there is no unintentional recursive creation of daemons on the same machine.
One case where a daemons() call may be appropriate is for async operation using only one dedicated daemon.
A representative example of this usage pattern is logger::appender_async(), where the logger package’s ‘namespace’ concept maps directly to mirai’s ‘compute profile’.
The output of a status() call must not be relied upon, as this user interface is subject to change at any time.
There is a developer interface, nextget(), for querying values such as ‘urls’ described in the function documentation.
Note: only the specifically-documented values are supported interfaces.
The functions unresolved(), is_error_value(), is_mirai_error(), and is_mirai_interrupt() should be used to test for the relevant state of a mirai or its value.
Where only one additional process should be used, launch a single daemon without dispatcher (dispatcher = FALSE).