% The builder pattern
Some data structures are complicated to construct, due to their construction needing:
- a large number of inputs
- compound data (e.g. slices)
- optional configuration data
- choice between several flavors
which can easily lead to a large number of distinct constructors with many arguments each.
If T
is such a data structure, consider introducing a T
builder:
- Introduce a separate data type
TBuilder
for incrementally configuring aT
value. When possible, choose a better name: e.g.Command
is the builder forProcess
. - The builder constructor should take as parameters only the data required to
make a
T
. - The builder should offer a suite of convenient methods for configuration,
including setting up compound inputs (like slices) incrementally.
These methods should return
self
to allow chaining. - The builder should provide one or more "terminal" methods for actually building a
T
.
The builder pattern is especially appropriate when building a T
involves side
effects, such as spawning a thread or launching a process.
In Rust, there are two variants of the builder pattern, differing in the treatment of ownership, as described below.
In some cases, constructing the final T
does not require the builder itself to
be consumed. The follow variant on
std::process::Command
is one example:
// NOTE: the actual Command API does not use owned Strings;
// this is a simplified version.
pub struct Command {
program: String,
args: Vec<String>,
cwd: Option<String>,
// etc
}
impl Command {
pub fn new(program: String) -> Command {
Command {
program: program,
args: Vec::new(),
cwd: None,
}
}
/// Add an argument to pass to the program.
pub fn arg<'a>(&'a mut self, arg: String) -> &'a mut Command {
self.args.push(arg);
self
}
/// Add multiple arguments to pass to the program.
pub fn args<'a>(&'a mut self, args: &[String])
-> &'a mut Command {
self.args.push_all(args);
self
}
/// Set the working directory for the child process.
pub fn cwd<'a>(&'a mut self, dir: String) -> &'a mut Command {
self.cwd = Some(dir);
self
}
/// Executes the command as a child process, which is returned.
pub fn spawn(&self) -> std::io::Result<Process> {
...
}
}
Note that the spawn
method, which actually uses the builder configuration to
spawn a process, takes the builder by immutable reference. This is possible
because spawning the process does not require ownership of the configuration
data.
Because the terminal spawn
method only needs a reference, the configuration
methods take and return a mutable borrow of self
.
By using borrows throughout, Command
can be used conveniently for both
one-liner and more complex constructions:
// One-liners
Command::new("/bin/cat").arg("file.txt").spawn();
// Complex configuration
let mut cmd = Command::new("/bin/ls");
cmd.arg(".");
if size_sorted {
cmd.arg("-S");
}
cmd.spawn();
Sometimes builders must transfer ownership when constructing the final type
T
, meaning that the terminal methods must take self
rather than &self
:
// A simplified excerpt from std::thread::Builder
impl ThreadBuilder {
/// Name the thread-to-be. Currently the name is used for identification
/// only in failure messages.
pub fn named(mut self, name: String) -> ThreadBuilder {
self.name = Some(name);
self
}
/// Redirect thread-local stdout.
pub fn stdout(mut self, stdout: Box<Writer + Send>) -> ThreadBuilder {
self.stdout = Some(stdout);
// ^~~~~~ this is owned and cannot be cloned/re-used
self
}
/// Creates and executes a new child thread.
pub fn spawn(self, f: proc():Send) {
// consume self
...
}
}
Here, the stdout
configuration involves passing ownership of a Writer
,
which must be transferred to the thread upon construction (in spawn
).
When the terminal methods of the builder require ownership, there is a basic tradeoff:
-
If the other builder methods take/return a mutable borrow, the complex configuration case will work well, but one-liner configuration becomes impossible.
-
If the other builder methods take/return an owned
self
, one-liners continue to work well but complex configuration is less convenient.
Under the rubric of making easy things easy and hard things possible, all
builder methods for a consuming builder should take and returned an owned
self
. Then client code works as follows:
// One-liners
ThreadBuilder::new().named("my_thread").spawn(proc() { ... });
// Complex configuration
let mut thread = ThreadBuilder::new();
thread = thread.named("my_thread_2"); // must re-assign to retain ownership
if reroute {
thread = thread.stdout(mywriter);
}
thread.spawn(proc() { ... });
One-liners work as before, because ownership is threaded through each of the
builder methods until being consumed by spawn
. Complex configuration,
however, is more verbose: it requires re-assigning the builder at each step.