On Collecting Result Types in Rust
Today's topic is somewhat complex, so there's a good chance that I'll end up explaining some things poorly. If I do, or if I leave something important out of the discussion, I apologize beforehand. As someone new to Rust, I'm trying not to get too overwhelmed while still squeezing out all the fun I can, so I'm not digging too deep for now, to avoid getting lost in confusion. For readers who are also fairly new to Rust, I hope this post helps your understanding and maybe piques your interest too, without casting an ominous shadow of fear over the language.
So... iterators...
Iterators are awesome! They're really handy when you're looping over a sequence of data. Iterators play a huge role in Rust and chances are that even in a trivial Rust program that you've written, there are iterators.
In Rust, a type needs to implement only one trait to make it possible to iterate over a sequence of its data.
Meet the Iterator trait:
pub trait Iterator {
    type Item;
    // Lots of methods below...
}
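To make the trait a little more concrete, here is a minimal sketch of a type that implements it. `UpTo` and `up_to` are made-up names for illustration, not anything from the standard library; the point is only that writing `next` is enough to get the other methods, `collect()` included.

```rust
// `UpTo` is a hypothetical type invented for this sketch.
struct UpTo {
    current: u32,
    limit: u32,
}

impl Iterator for UpTo {
    // The type of value handed out on each step.
    type Item = u32;

    // `next` is the only method we have to write ourselves.
    fn next(&mut self) -> Option<Self::Item> {
        if self.current < self.limit {
            self.current += 1;
            Some(self.current)
        } else {
            None
        }
    }
}

// Hypothetical convenience constructor.
fn up_to(limit: u32) -> UpTo {
    UpTo { current: 0, limit }
}

fn main() {
    // `collect()` comes for free once `next` exists.
    let numbers: Vec<u32> = up_to(3).collect();
    assert_eq!(numbers, vec![1, 2, 3]);
}
```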
The trait has a lot of methods. We'll look at only one of them: collect().
Let's look at the signature:
fn collect<B: FromIterator<Self::Item>>(self) -> B
where
    Self: Sized,
{
    FromIterator::from_iter(self)
}
The signature, using trait bounds, tells us that this method can produce any type B, provided that B implements the FromIterator trait. Under the hood, it calls the from_iter function of FromIterator.
This trait only has one method:
pub trait FromIterator<A> {
    fn from_iter<T>(iter: T) -> Self
    where
        T: IntoIterator<Item = A>;
}
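As a rough illustration of what implementing this trait involves, here is a small sketch with a made-up type. `Total` and `total_of` are hypothetical names; the only real API used is `FromIterator` itself. Once `from_iter` is defined, `collect()` can build the type.

```rust
use std::iter::FromIterator;

// `Total` is a hypothetical type invented for this sketch.
#[derive(Debug, PartialEq)]
struct Total(i32);

impl FromIterator<i32> for Total {
    fn from_iter<T>(iter: T) -> Self
    where
        T: IntoIterator<Item = i32>,
    {
        // Add up every element the iterator hands us.
        let mut sum = 0;
        for n in iter {
            sum += n;
        }
        Total(sum)
    }
}

// Hypothetical helper: the return type tells `collect()` which
// `from_iter` implementation to call.
fn total_of(values: &[i32]) -> Total {
    values.iter().copied().collect()
}

fn main() {
    assert_eq!(total_of(&[1, 2, 3]), Total(6));
}
```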
The collect() method is widely used to "collect" the data that we're iterating over into a collection. If you read between the lines, you'll see that you can use it to convert one kind of collection into another.
Take the following example from the docs where it iterates over an array (a collection) and turns it into a vector (another type of collection):
let a = [1, 2, 3];
let doubled: Vec<i32> = a.iter()
    .map(|&x| x * 2)
    .collect();
assert_eq!(vec![2, 4, 6], doubled);
Collecting Results
What about collections that are not your run-of-the-mill collections, like arrays and vectors? What about a vector of Result types?
collect() can also create instances of types that are not typical collections. For example, a String can be built from chars, and an iterator of Result<T, E> items can be collected into Result<Collection<T>, E>...
According to the docs, you can do that, because Result types implement the FromIterator trait.
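Both cases from the quote can be tried out in a few lines. This is only a sketch: `shout` and `parse_all` are hypothetical helper names, and parsing numbers from a string is just one situation I picked where an iterator of Result items shows up naturally.

```rust
use std::num::ParseIntError;

// Hypothetical helper: an iterator of `char`s collected into a `String`.
fn shout(word: &str) -> String {
    word.chars().map(|c| c.to_ascii_uppercase()).collect()
}

// Hypothetical helper: an iterator of `Result<i32, ParseIntError>`
// collected into a single `Result<Vec<i32>, ParseIntError>`.
fn parse_all(input: &str) -> Result<Vec<i32>, ParseIntError> {
    input.split_whitespace().map(|s| s.parse::<i32>()).collect()
}

fn main() {
    assert_eq!(shout("rust"), "RUST");
    assert_eq!(parse_all("1 2 3"), Ok(vec![1, 2, 3]));
    // One bad token turns the whole result into an `Err`.
    assert!(parse_all("1 two 3").is_err());
}
```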
Down the rabbit hole...
Now let's turn our eyes to the example that's at the heart of today's rant here:
let results = [Ok(1), Err("nope"), Ok(3), Err("bad")];
let result: Result<Vec<_>, &str> = results.iter().cloned().collect();
// gives us the first error
assert_eq!(Err("nope"), result);
let results = [Ok(1), Ok(3)];
let result: Result<Vec<_>, &str> = results.iter().cloned().collect();
// gives us the list of answers
assert_eq!(Ok(vec![1, 3]), result);
Here we have an array of Result types, containing both Ok and Err variants. Using collect(), we're trying to collect that into a Result...
Hmm, that feels.. awkward...
We're not collecting into a vector of Results, but rather into a single Result whose Ok variant holds a vector.
So what? It does implement FromIterator, so collect() should work. Period.
Let's take a step back and try to deduce from what we know so far:
- Result is an enum, so a Result value can only be one variant at a time, either Ok or Err.
- So after the operation on line 2, the result variable should contain only one variant, either the Ok variant holding a Vec (like the second result variable in the example) or the Err variant holding a &str.
It does end up having only one variant in each case (the Ok variant with the vector, or the Err variant with a &str), but how is it yielding only the first Err variant?
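Before digging into the source, the "first error only" behavior can be observed from the outside. The sketch below counts how many elements are actually taken from the array; `collect_counting` is a hypothetical helper, and `inspect()` is used purely as a way to peek at each element as it goes by.

```rust
// `collect_counting` is a hypothetical helper written for this sketch.
fn collect_counting(
    items: &[Result<i32, &'static str>],
) -> (Result<Vec<i32>, &'static str>, usize) {
    let mut pulled = 0;
    let result: Result<Vec<i32>, &'static str> = items
        .iter()
        .cloned()
        // Count every element that is actually taken from the slice.
        .inspect(|_| pulled += 1)
        .collect();
    (result, pulled)
}

fn main() {
    // Only Ok(1) and Err("nope") are looked at; the rest is never touched.
    let (result, pulled) = collect_counting(&[Ok(1), Err("nope"), Ok(3), Err("bad")]);
    assert_eq!(result, Err("nope"));
    assert_eq!(pulled, 2);

    // With no errors, every element is consumed.
    let (result, pulled) = collect_counting(&[Ok(1), Ok(3)]);
    assert_eq!(result, Ok(vec![1, 3]));
    assert_eq!(pulled, 2);
}
```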
Searching for answers
The magic happens inside the implementation of FromIterator for Result types.
As I said at the beginning, I'm not going to dive too deep into the mystery behind this magic because that'd be out of my depth (it is... for now). But we can get pretty far in understanding what's happening under the hood.
Let's take a look at the relevant part of the source code of that implementation:
impl<A, E, V: FromIterator<A>> FromIterator<Result<A, E>> for Result<V, E> {
    /// Takes each element in the `Iterator`: if it is an `Err`, no further
    /// elements are taken, and the `Err` is returned. Should no `Err` occur, a
    /// container with the values of each `Result` is returned.
    fn from_iter<I: IntoIterator<Item = Result<A, E>>>(iter: I) -> Result<V, E> {
        iter::process_results(iter.into_iter(), |i| i.collect())
    }
}
The comments mention the behavior we've just observed in the previous example. At the first encounter of an Err variant, it returns that variant and stops collecting. Otherwise, it collects all the values into a container (collection) and returns that instead.
We already know that FromIterator has a single method that a type needs to define in order to implement the trait, and that's from_iter; every type that implements it does so in its own way.
For Result, we see that it calls another utility function called process_results that does the actual job. process_results is tailored to work on Result type values.
Let's try to break it down as much as we can. Just keep in mind for now that we're iterating over Result types.
- From the signature, the single argument of from_iter has to implement the IntoIterator trait. If you scroll back a bit, we called cloned() before calling collect() on the array, and the return type of cloned() implements IntoIterator. cloned() creates a new iterator that clones (makes copies of) the underlying elements, i.e. the Result values. The clones aren't of type &T but of type T.
- process_results takes two arguments: an iterator and a closure. It needs an actual Iterator, while from_iter only knows that iter implements IntoIterator, so into_iter() is called on iter to make the conversion. For something that is already an iterator, like the return value of cloned(), into_iter() simply hands it back (refer to the docs on the iter module if you're confused; it summarizes the types of iteration existing in Rust).
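The two points above can be sketched in isolation. `owned_results` and `sum_via_into_iter` are hypothetical helpers made up for this illustration; the first shows cloned() turning references into owned values, and the second shows that a function asking for IntoIterator happily accepts both a collection and something that is already an iterator.

```rust
// Hypothetical helper: `iter()` yields `&Result<..>`, and `cloned()`
// turns each reference into an owned `Result<..>`.
fn owned_results(items: &[Result<i32, &'static str>]) -> Vec<Result<i32, &'static str>> {
    items.iter().cloned().collect()
}

// Hypothetical helper: accepts anything that can be turned into an iterator.
// Every `Iterator` also implements `IntoIterator` by returning itself.
fn sum_via_into_iter<I: IntoIterator<Item = i32>>(source: I) -> i32 {
    source.into_iter().sum()
}

fn main() {
    let results: [Result<i32, &'static str>; 2] = [Ok(1), Err("nope")];
    assert_eq!(owned_results(&results), vec![Ok(1), Err("nope")]);

    // A collection goes in...
    assert_eq!(sum_via_into_iter(vec![1, 2, 3]), 6);
    // ...and so does an iterator adapter.
    assert_eq!(sum_via_into_iter(vec![1, 2, 3].into_iter().map(|n| n * 2)), 12);
}
```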
The one who processes Results
Here's the whole process_results function:
fn process_results<I, T, E, F, U>(iter: I, mut f: F) -> Result<U, E>
where
    I: Iterator<Item = Result<T, E>>,
    for<'a> F: FnMut(ResultShunt<'a, I, E>) -> U,
{
    let mut error = Ok(());
    let shunt = ResultShunt { iter, error: &mut error };
    let value = f(shunt);
    error.map(|()| value)
}
Looks like the closure works on a type called ResultShunt... what the heck is that?
The iterator that wraps another iterator
ResultShunt is another iterator that wraps the first iterator, iter, passed to the function. Here's what ResultShunt looks like:
/// An iterator adapter that produces output as long as the underlying
/// iterator produces `Result::Ok` values.
///
/// If an error is encountered, the iterator stops and the error is
/// stored.
struct ResultShunt<'a, I, E> {
    iter: I,
    error: &'a mut Result<(), E>,
}
At this point, let's think through a couple of things:
- The closure calls collect() on the ResultShunt it receives. That can only mean that the ResultShunt type implements the Iterator trait.
- The doc comments describe the behavior we're investigating.
- Quite intuitively, the error field only cares about the E type of a Result, since it has to stop at the first encounter of an Err variant.
The return type of process_results is Result<U, E>, whereas the closure it takes as a parameter returns only the Ok value U.
That's how the collect method in the closure ends up collecting the Ok variants if there are no errors.
ResultShunt does implement Iterator. We don't need to go over all of that implementation. For now, notice the type parameters that show up in the impl signature:
impl<I, T, E> Iterator for ResultShunt<'_, I, E>
where
    I: Iterator<Item = Result<T, E>>,
{
    type Item = T;
    ...
Here I is the old iterator over Result types. But notice the associated type here:
...
type Item = T;
...
This indicates the type of the elements being iterated over. So collect is actually collecting values of this T type (in our case, integers) into a container that has a FromIterator<T> implementation (in our case, a Vec). That explains how it yielded the vector containing all the Ok values in our example.
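To see how these pieces could fit together, here is a simplified sketch that compiles on stable Rust. It is not the standard library's code: `Shunt`, `process` and `collect_results` are hypothetical names standing in for the private internals (whose names and details may differ between Rust releases), and the sketch implements a plain `next` instead of the real `try_fold` machinery.

```rust
// A simplified stand-in for the private `ResultShunt` (hypothetical name).
struct Shunt<'a, I, E> {
    iter: I,
    error: &'a mut Result<(), E>,
}

impl<I, T, E> Iterator for Shunt<'_, I, E>
where
    I: Iterator<Item = Result<T, E>>,
{
    // The shunt hands out the unwrapped `T`, not `Result<T, E>`.
    type Item = T;

    fn next(&mut self) -> Option<T> {
        match self.iter.next() {
            Some(Ok(value)) => Some(value),
            Some(Err(e)) => {
                // Remember the error and pretend the iterator is finished.
                *self.error = Err(e);
                None
            }
            None => None,
        }
    }
}

// A simplified stand-in for `process_results` (hypothetical name).
fn process<I, T, E, F, U>(iter: I, mut f: F) -> Result<U, E>
where
    I: Iterator<Item = Result<T, E>>,
    for<'a> F: FnMut(Shunt<'a, I, E>) -> U,
{
    let mut error = Ok(());
    let shunt = Shunt { iter, error: &mut error };
    let value = f(shunt);
    error.map(|()| value)
}

// Hypothetical helper that mirrors what `from_iter` does for `Result`.
fn collect_results(items: Vec<Result<i32, &'static str>>) -> Result<Vec<i32>, &'static str> {
    process(items.into_iter(), |shunt| shunt.collect())
}

fn main() {
    let mixed = vec![Ok(1), Err("nope"), Ok(3), Err("bad")];
    assert_eq!(collect_results(mixed), Err("nope"));
    assert_eq!(collect_results(vec![Ok(1), Ok(3)]), Ok(vec![1, 3]));
}
```

The closure only ever sees plain integers, which is why an ordinary `Vec<i32>` can be collected inside it.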
Finally getting there...
The implementation has a try_fold method. Again, only the following bit of code is relevant for now:
Err(e) => {
    *error = Err(e);
    ControlFlow::Break(try { acc })
}
We can set aside the urge to "understand everything", which is impossible, and appreciate how much we've been able to dig up incrementally and make sense of. And seeing the above bit of code should make something click in your brain...
- At the encounter of an Err, its value is stored in the error field of the ResultShunt type. Then it just stops iterating, as advertised.
- process_results returns either the success values or the error value (the last two lines):
let value = f(shunt);
error.map(|()| value)
This is the last stretch of explanation that we need to wrap this up. Let's break it down:
- Remember the closure signature we observed earlier in process_results. It returns the Ok value. So the value above is just that (a string, an integer, whatever; in our example, it's a vector of integers).
- Next we're mapping over the error variable, which is a Result type. This is how map works on Result types:
  - It doesn't touch the Err value, but applies the closure to the Ok value, transforming the initial Result into another. The closure argument is the "unit" value (), meaning it receives nothing useful and simply turns Ok(()) into Ok(value).
- Now if the program encountered an error, the error variable already contains it, and the map operation above doesn't matter because it only touches Ok values. That's how we get Err("nope") in the first case of our example.
- Otherwise, collect collects the Ok values into a collection, and we get Ok(vec![1, 3]) in the second case of our example.
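That final map step is small enough to try on its own. In this sketch, `finish` is a hypothetical helper that takes the two things process_results has in hand at the end: the error slot and the collected value.

```rust
// `finish` is a hypothetical helper written for this sketch.
fn finish(error: Result<(), &'static str>, value: Vec<i32>) -> Result<Vec<i32>, &'static str> {
    // `map` only runs the closure for `Ok(())`; an `Err` passes through untouched.
    error.map(|()| value)
}

fn main() {
    // No error was stored: the unit value is swapped for the collected vector.
    assert_eq!(finish(Ok(()), vec![1, 3]), Ok(vec![1, 3]));
    // An error was stored: whatever was collected so far is dropped.
    assert_eq!(finish(Err("nope"), vec![1]), Err("nope"));
}
```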
Phew! 😌 That was a wild ride but we've managed to solve much of the mystery! I'm going to scream with joy into my pillow and then pet my cat... 🎉🎉💪
Parting words
Thanks for reading up to this point and descending into a bit of madness with me. Feel free to hit me up on Twitter with any feedback that you might have. I hope you find it useful in some way.
This article would not have been possible without the following fine people and communities:
- The amazing community of Togglebit where I started the initial discussion around the topics.
- Jake Goulding for jumping in on the Rust Discord server so quickly and being patient with my questions.