Sakib's blog

On Collecting Result Types in Rust


Today's topic is somewhat complex, so there's a good chance I'll explain some of it poorly. If I do, or if I leave something important out of the discussion, I apologize in advance. As someone new to Rust, I'm trying not to get overwhelmed while still squeezing out all the fun I can, so I'm not digging too deep for now, lest I get lost in confusion. For readers who are also fairly new to Rust, I hope this post helps your understanding and maybe piques your interest too, without casting an ominous shadow of fear over the language.

So... iterators...

Iterators are awesome! They're really handy when you're looping over a sequence of data. Iterators play a huge role in Rust and chances are that even in a trivial Rust program that you've written, there are iterators.

In Rust, a type needs to implement only one trait to make it possible to iterate over a sequence of its data.

Meet the Iterator trait:

pub trait Iterator {
    type Item;
    // Lots of methods below...
}
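To make the trait a bit more concrete, here's a minimal sketch of my own (not from the docs). Countdown and countdown_from are made-up names. The only method a type has to write by hand is next(); everything else, including collect(), comes for free.

```rust
/// A hypothetical iterator (my own example) that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // The type of the elements this iterator hands out.
    type Item = u32;

    // The one required method: yield the next element, or None when done.
    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

/// Uses the provided `collect()` method on our hand-written iterator.
fn countdown_from(n: u32) -> Vec<u32> {
    Countdown { remaining: n }.collect()
}
```

Calling countdown_from(3) should give a vector holding 3, 2 and 1, in that order.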

It provides a bunch of methods. We'll look at only one of them: collect().

Let's look at the signature:

fn collect<B: FromIterator<Self::Item>>(self) -> B
where
    Self: Sized,

The signature, using trait bounds, tells us that this method can produce any type B, provided that B implements the FromIterator trait. Under the hood, it calls FromIterator's from_iter function.

This trait only has one method:

pub trait FromIterator<A> {
    fn from_iter<T>(iter: T) -> Self
    where
        T: IntoIterator<Item = A>;
}
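Here's a small sketch of what implementing this trait could look like for a type of your own. Stats and summarize are names I made up for illustration. The point is only that once from_iter exists, collect() can build the type.

```rust
use std::iter::FromIterator;

/// A hypothetical "collection" (my own example) that keeps only a count and a total.
#[derive(Debug, PartialEq)]
struct Stats {
    count: usize,
    total: i64,
}

impl FromIterator<i64> for Stats {
    fn from_iter<T>(iter: T) -> Self
    where
        T: IntoIterator<Item = i64>,
    {
        let mut stats = Stats { count: 0, total: 0 };
        // Anything that can be turned into an iterator of i64 works here.
        for value in iter {
            stats.count += 1;
            stats.total += value;
        }
        stats
    }
}

/// `collect()` picks `Stats::from_iter` because of the return type.
fn summarize(values: &[i64]) -> Stats {
    values.iter().copied().collect()
}
```

With this in place, summarize(&[1, 2, 3]) should produce a Stats with a count of 3 and a total of 6.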

collect() is widely used to "collect" the data we're iterating over into a collection. If you read between the lines, you'll see that you can also use it to convert one type of collection into another.

Take the following example from the docs, which iterates over an array (a collection) and turns it into a vector (another type of collection):

let a = [1, 2, 3];

let doubled: Vec<i32> = a.iter()
                         .map(|&x| x * 2)
                         .collect();

assert_eq!(vec![2, 4, 6], doubled);
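Since the target type is chosen by the annotation (or the return type), the same kind of iterator can land in quite different collections. Here's a quick sketch of that idea. The function names are mine, and I'm only using collection types from the standard library.

```rust
use std::collections::{BTreeSet, HashMap};

/// Collects into a vector: duplicates and order are kept.
fn doubled_vec(a: &[i32]) -> Vec<i32> {
    a.iter().map(|&x| x * 2).collect()
}

/// Collects into a sorted set: duplicates are dropped.
fn doubled_set(a: &[i32]) -> BTreeSet<i32> {
    a.iter().map(|&x| x * 2).collect()
}

/// Collects (key, value) pairs into a map from each number to its double.
fn doubled_lookup(a: &[i32]) -> HashMap<i32, i32> {
    a.iter().map(|&x| (x, x * 2)).collect()
}
```

The iterator chains are nearly identical; only the type that collect() is asked to produce changes.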

Collecting Results

What about targets that aren't run-of-the-mill collections like arrays and vectors? What about a vector of Result types?

collect() can also create instances of types that are not typical collections. For example, a String can be built from chars, and an iterator of Result<T, E> items can be collected into Result<Collection<T>, E>...

According to the docs, you can do that, because Result types implement the FromIterator trait.
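Both cases from that quote can be tried out in a few lines. This is a sketch of my own, with made-up function names (shout and parse_all), using str::parse as a convenient source of Result values.

```rust
use std::num::ParseIntError;

/// Builds a String out of an iterator of chars.
fn shout(word: &str) -> String {
    word.chars().map(|c| c.to_ascii_uppercase()).collect()
}

/// Turns an iterator of Result<i32, ParseIntError> into
/// a single Result<Vec<i32>, ParseIntError>.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}
```

If every string parses, parse_all returns Ok with all the numbers; if any of them fails, it returns the Err from that failure instead.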

Down the rabbit hole...

Now let's turn our eyes to the example that's at the heart of today's rant here:

let results = [Ok(1), Err("nope"), Ok(3), Err("bad")];

let result: Result<Vec<_>, &str> = results.iter().cloned().collect();

// gives us the first error
assert_eq!(Err("nope"), result);

let results = [Ok(1), Ok(3)];

let result: Result<Vec<_>, &str> = results.iter().cloned().collect();

// gives us the list of answers
assert_eq!(Ok(vec![1, 3]), result);

Here we have an array of Result types, containing both Ok and Err variants. Using collect(), we're trying to collect that into a Result...

Hmm, that feels.. awkward...

We're not collecting into a vector of Results, but rather into a single Result whose Ok variant holds a vector.

So what? It does implement FromIterator, so collect() should work. Period.
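To see the difference between the two targets side by side, here's a small sketch. The Outcome alias and both function names are my own; the data is the same as in the docs example.

```rust
/// Alias (my own) for the element type used in the docs example.
type Outcome = Result<i32, &'static str>;

/// Vec<Result<..>>: every element is kept, Ok and Err alike.
fn keep_every_outcome(items: &[Outcome]) -> Vec<Outcome> {
    items.iter().cloned().collect()
}

/// Result<Vec<..>, ..>: either all the Ok values, or the first Err.
fn all_or_first_error(items: &[Outcome]) -> Result<Vec<i32>, &'static str> {
    items.iter().cloned().collect()
}
```

The bodies are identical. Only the return type differs, and that alone decides which FromIterator implementation collect() ends up calling.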

Let's take a step back and try to deduce from what we know so far:

It does end up having only one variant in each case, (the Ok variant with the vector, and the Err variant with an &str), but how is it only yielding the first Err variant?

Searching for answers

The magic happens inside the implementation of FromIterator for Result types.

As I said at the beginning, I'm not going to dive too deep into the mystery behind this magic because that'd be out of my depth (it is... for now). But we can get pretty far in understanding what's happening under the hood.

Let's take a look at the relevant part of the source code of that implementation:

impl<A, E, V: FromIterator<A>> FromIterator<Result<A, E>> for Result<V, E> {
    /// Takes each element in the `Iterator`: if it is an `Err`, no further
    /// elements are taken, and the `Err` is returned. Should no `Err` occur, a
    /// container with the values of each `Result` is returned.
    fn from_iter<I: IntoIterator<Item = Result<A, E>>>(iter: I) -> Result<V, E> {
        iter::process_results(iter.into_iter(), |i| i.collect())
    }
}

The comments mention the behavior we've just observed in the previous example. At the first encounter of an Err variant, it returns that variant and stops collecting. Otherwise, it collects all the values into a container (collection) and returns that instead.
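One way to convince yourself that collecting really stops at the first Err is to count how many elements get looked at. This is a sketch of my own; parse_and_count is a made-up helper, and it relies on map being lazy so the closure only runs for elements that are actually pulled.

```rust
use std::num::ParseIntError;

/// Parses each input, counting how many inputs were actually visited.
fn parse_and_count(inputs: &[&str]) -> (Result<Vec<i32>, ParseIntError>, usize) {
    let mut visited = 0usize;
    let parsed: Result<Vec<i32>, ParseIntError> = inputs
        .iter()
        .map(|s| {
            // Runs once per element that collect() pulls from the iterator.
            visited += 1;
            s.parse::<i32>()
        })
        .collect();
    (parsed, visited)
}
```

For the input ["1", "x", "3"], only two elements should be visited: the third is never parsed because "x" already produced an Err.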

We already know that FromIterator has a single method that a type needs to define in order to implement the trait, and that's from_iter; every type that implements it does so in different ways.

For Result, we see that it calls another utility function called process_results that does the actual job. process_results is tailored to work on Result type values.

Let's try to break it down as much as we can. Only keep in mind for now that we're iterating over Result types.

The one who processes Results

Here's the whole process_results function:

fn process_results<I, T, E, F, U>(iter: I, mut f: F) -> Result<U, E>
where
    I: Iterator<Item = Result<T, E>>,
    for<'a> F: FnMut(ResultShunt<'a, I, E>) -> U,
{
    let mut error = Ok(());
    let shunt = ResultShunt { iter, error: &mut error };
    let value = f(shunt);
    error.map(|()| value)
}

Looks like the closure works on a type called ResultShunt... what the heck is that?

The iterator that wraps another iterator

ResultShunt is another iterator that wraps the original iterator, iter, passed to the function. Here's what ResultShunt looks like:

/// An iterator adapter that produces output as long as the underlying
/// iterator produces `Result::Ok` values.
/// If an error is encountered, the iterator stops and the error is
/// stored.
struct ResultShunt<'a, I, E> {
    iter: I,
    error: &'a mut Result<(), E>,
}

At this point, let's think through a couple of things:

The return type of process_results is Result<U, E>, whereas the closure it takes as a parameter returns only U, the value that ends up inside the Ok variant.

That's how the collect method in the closure ends up collecting the Ok variants if there are no errors.
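The step from a bare U to a Result<U, E> can be tried in isolation. Below is a tiny sketch of my own; attach_value is a hypothetical helper that does what the last line of process_results does with its error slot, using the standard Result::map.

```rust
/// If `status` is Ok(()), swap the unit for `value`;
/// if it is Err(e), the error wins and `value` is dropped.
fn attach_value<U, E>(status: Result<(), E>, value: U) -> Result<U, E> {
    status.map(|()| value)
}
```

So attach_value(Ok(()), vec![1, 3]) gives Ok(vec![1, 3]), while attach_value(Err("nope"), vec![1]) gives Err("nope"), no matter what was collected up to that point.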

ResultShunt also implements Iterator. We don't need to go over all of that implementation. For now, notice the type parameters that show up in the impl signature:

impl<I, T, E> Iterator for ResultShunt<'_, I, E>
where
    I: Iterator<Item = Result<T, E>>,
{
    type Item = T;
    // ...
}

Here I is the original, wrapped iterator over Result types. But notice the associated type here:

type Item = T;

This indicates the type of elements being iterated over. So collect is actually collecting items of this T type (in our case, integers), and the target container (in our case, Vec<i32>) already has a FromIterator implementation for them. That explains how it yielded the vector containing the values from all the Ok variants in our example.
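Since ResultShunt and process_results are private to the standard library, we can't call them directly, but the idea is small enough to rebuild. What follows is a simplified sketch of my own, not the real implementation: Shunt, process_results_sketch and collect_via_shunt are made-up names, and I use a plain next() where the standard library goes through try_fold.

```rust
/// Hypothetical stand-in for the standard library's private `ResultShunt`.
struct Shunt<'a, I, E> {
    iter: I,
    error: &'a mut Result<(), E>,
}

impl<'a, I, T, E> Iterator for Shunt<'a, I, E>
where
    I: Iterator<Item = Result<T, E>>,
{
    // The shunt hands out bare T values, not Results.
    type Item = T;

    fn next(&mut self) -> Option<T> {
        match self.iter.next()? {
            Ok(value) => Some(value),
            Err(e) => {
                // Park the error in the shared slot and pretend we're done.
                *self.error = Err(e);
                None
            }
        }
    }
}

/// Simplified take on `process_results`, same shape as the original.
fn process_results_sketch<I, T, E, F, U>(iter: I, mut f: F) -> Result<U, E>
where
    I: Iterator<Item = Result<T, E>>,
    for<'a> F: FnMut(Shunt<'a, I, E>) -> U,
{
    let mut error = Ok(());
    let shunt = Shunt { iter, error: &mut error };
    let value = f(shunt);
    // If the shunt stored an error, it replaces whatever was collected.
    error.map(|()| value)
}

/// Mirrors what `FromIterator for Result` does, for a Vec target.
fn collect_via_shunt<T, E>(items: Vec<Result<T, E>>) -> Result<Vec<T>, E> {
    process_results_sketch(items.into_iter(), |shunt| shunt.collect::<Vec<T>>())
}
```

Feeding it the data from the docs example should behave the same way as the real thing: the first Err comes back when there is one, and a vector of all the values comes back otherwise.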

Finally getting there...

The implementation has a try_fold method. Again, only the following bit of code is relevant for now:

Err(e) => {
    *error = Err(e);
    ControlFlow::Break(try { acc })
}
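The try { acc } part is internal, unstable syntax, but the same pattern can be sketched with stable tools. The example below is my own and assumes a toolchain where std::ops::ControlFlow is stable (Rust 1.55 or later, as far as I know). sum_ok_prefix is a made-up function that folds over Results, stores the first error in an outside slot, and breaks out with the accumulator it had so far.

```rust
use std::ops::ControlFlow;

/// Sums the Ok values until the first Err, which is stored separately.
fn sum_ok_prefix(items: Vec<Result<i32, String>>) -> (i32, Result<(), String>) {
    // Plays the role of the `error` slot that ResultShunt points at.
    let mut error = Ok(());
    let flow = items.into_iter().try_fold(0_i32, |acc, item| match item {
        Ok(x) => ControlFlow::Continue(acc + x),
        Err(e) => {
            error = Err(e);
            // Stop folding, but hand back what was accumulated so far.
            ControlFlow::Break(acc)
        }
    });
    // Whether we finished or broke out early, we end up with a total.
    let total = match flow {
        ControlFlow::Continue(total) | ControlFlow::Break(total) => total,
    };
    (total, error)
}
```

For [Ok(1), Ok(2), Err("boom"), Ok(4)] this should return a total of 3 together with the stored "boom" error; the trailing Ok(4) is never reached.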

We can let go of the urge to "understand everything", which is impossible anyway, and instead appreciate how much we've been able to dig up incrementally and make sense of. And seeing that Err arm from try_fold next to the tail end of process_results should make something click in your brain...

let value = f(shunt);
error.map(|()| value)

This is the last stretch of explanation that we need to wrap this up. Let's break it down:

Phew! 😌 That was a wild ride but we've managed to solve much of the mystery! I'm going to scream with joy into my pillow and then pet my cat... 🎉🎉💪

Parting words

Thanks for reading up to this point and descending into a bit of madness with me. Feel free to hit me up on Twitter with any feedback that you might have. I hope you find it useful in some way.

This article would not have been possible without the following fine people and communities: