Notes on Errors and Flow in Rust
I am a big fan of Go Considered Harmful.
One of the biggest takeaways I got is how important it is to have a well-defined control flow.
Ideally, someone calling your function can just look at the signature of the function and know
it will probably not do something absolutely crazy beyond well-understood and broad rules.
If this is the case, we can then treat the function as a black box and our code will be much easier to reason about.
Errors #
.unwrap is harmful #
If a function calls .unwrap(), .expect(), or anything else that can panic, it is breaking the contract of the function.
This is because instead of a function just returning a result, it is now returning a result or panicking.
Panicking is not part of the function signature, so it is not part of the contract of the function.
However, even if it were, it is nice to allow the caller of the function to decide how to handle the error.
If code panics, the only way for the caller to prevent the program from crashing is to wrap the function call in std::panic::catch_unwind, which is not very ergonomic and does not work at all when the program is compiled with panic = "abort".
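To make the difference concrete, here is a minimal sketch using only the standard library. The function names parse_port_or_panic and parse_port are hypothetical and exist only for illustration. The first hides a panic behind an innocent-looking signature; the second puts the failure into the signature so the caller can choose what to do.

```rust
use std::num::ParseIntError;
use std::panic;

// Hypothetical helper: the signature promises a u16 and says nothing about panicking.
fn parse_port_or_panic(s: &str) -> u16 {
    s.parse().unwrap()
}

// The same logic with the failure made part of the signature.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    s.parse()
}

fn main() {
    // The caller decides what a bad input means: here, fall back to a default.
    let port = parse_port("not a port").unwrap_or(8080);
    println!("using port {port}");

    // Containing the panicking version requires catch_unwind around the call,
    // and this only works when panics unwind (the default) rather than abort.
    let caught = panic::catch_unwind(|| parse_port_or_panic("not a port"));
    println!("panicked: {}", caught.is_err());
}
```

With parse_port, every call site can pick its own policy (default value, retry, propagate with ?), which is exactly the decision the panicking version takes away.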
Result<T,E> where E is concrete can be harmful #
Generally, the first thing I see Rust beginners do after being told to avoid panic! is to make every fallible function return a Result with its own concrete error type. However, this comes with several problems:
1. No backtrace #
If an unexpected error occurs, the error will propagate all the way up to the top of the call stack before it is
printed out or the code panics. This is an issue because a plain error value does not record where it was created, which makes the error hard (nearly impossible) to debug.
2. Complicated Types and boilerplate #
There are no anonymous enums in Rust, so you have to create a new enum for every error type. This is a lot of boilerplate. For instance, if function foo returns error FooError, which has variants A and B, and function bar returns error BarError, which has variants B and C, then the function foobar, which calls foo and bar, will have to define a new enum FooBarError, which has variants A, B, and C. In addition, to be able to use the ? operator, FooBarError must implement From<FooError> and From<BarError>, which is yet more boilerplate.
In a perfect world with anonymous enums, however, this would be an advantage: we could easily match over all error types.
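The foo/bar/foobar scenario described above can be sketched as follows. The enum and function names come from the text; the integer argument and the conditions that trigger each variant are made up purely so the example runs.

```rust
#[derive(Debug, PartialEq)]
enum FooError {
    A,
    B,
}

#[derive(Debug, PartialEq)]
enum BarError {
    B,
    C,
}

// The union of both error sets has to be spelled out by hand.
#[derive(Debug, PartialEq)]
enum FooBarError {
    A,
    B,
    C,
}

// Needed so that `?` can convert a FooError inside foobar.
impl From<FooError> for FooBarError {
    fn from(e: FooError) -> Self {
        match e {
            FooError::A => FooBarError::A,
            FooError::B => FooBarError::B,
        }
    }
}

// Needed so that `?` can convert a BarError inside foobar.
impl From<BarError> for FooBarError {
    fn from(e: BarError) -> Self {
        match e {
            BarError::B => FooBarError::B,
            BarError::C => FooBarError::C,
        }
    }
}

// Arbitrary failure conditions, chosen only for illustration.
fn foo(n: i32) -> Result<(), FooError> {
    match n {
        n if n < 0 => Err(FooError::A),
        0 => Err(FooError::B),
        _ => Ok(()),
    }
}

fn bar(n: i32) -> Result<(), BarError> {
    match n {
        0 => Err(BarError::B),
        n if n > 100 => Err(BarError::C),
        _ => Ok(()),
    }
}

fn foobar(n: i32) -> Result<(), FooBarError> {
    foo(n)?;
    bar(n)?;
    Ok(())
}

fn main() {
    println!("{:?}", foobar(500));
}
```

Roughly thirty lines of the sketch are type definitions and conversions; the three functions that do the actual work are the short part.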
Box<dyn Error> is better but still not great #
We can solve the second problem by using Box<dyn Error> as the error type.
However, this is still not great because the signature no longer says which errors a function can return.
We can also use io::Error as the error type, but this is not great either because we cannot match over only the kinds of errors that the function can actually return.
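A small sketch of the Box<dyn Error> trade-off, using only standard-library error types. The function parse_number is hypothetical. The ? operator needs no From impls at all, but the caller can only find out what went wrong by guessing concrete types and downcasting.

```rust
use std::error::Error;
use std::num::ParseIntError;

// Two unrelated error types flow through `?` without any hand-written conversions.
fn parse_number(bytes: &[u8]) -> Result<i32, Box<dyn Error>> {
    let text = std::str::from_utf8(bytes)?; // Utf8Error is boxed automatically
    let n = text.trim().parse::<i32>()?; // ParseIntError is boxed automatically
    Ok(n)
}

fn main() {
    match parse_number(b"forty-two") {
        Ok(n) => println!("parsed {n}"),
        // Nothing in the signature tells us ParseIntError is a possibility;
        // we have to know it from reading the function body.
        Err(e) if e.downcast_ref::<ParseIntError>().is_some() => {
            println!("not a number: {e}")
        }
        Err(e) => println!("some other error: {e}"),
    }
}
```

The compiler cannot check that the downcast arms are exhaustive, so a new error source inside parse_number silently lands in the catch-all arm.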
Manual, Global Error Types #
A perhaps better (but still meh) solution is to define a global error type that can be used in all functions.
This is a good solution because it allows us to easily match over all errors that can be returned from a function,
and we do not have to define a new error type for every function. For instance, suppose we are using the libraries serde, reqwest, and tokio. We can define a global error type as follows:
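// serde has no single concrete error type and tokio has no top-level one, so this
// sketch assumes serde_json::Error and tokio::task::JoinError as stand-ins.
type Result<T> = std::result::Result<T, Error>;

#[derive(Debug)]
enum Error {
    Serde(serde_json::Error),
    Reqwest(reqwest::Error),
    Tokio(tokio::task::JoinError),
}

// One From impl per wrapped error so that `?` converts automatically.
impl From<serde_json::Error> for Error {
    fn from(e: serde_json::Error) -> Self {
        Self::Serde(e)
    }
}

impl From<reqwest::Error> for Error {
    fn from(e: reqwest::Error) -> Self {
        Self::Reqwest(e)
    }
}

impl From<tokio::task::JoinError> for Error {
    fn from(e: tokio::task::JoinError) -> Self {
        Self::Tokio(e)
    }
}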
Suppose we have a function foo that does some serde stuff, a function bar that does some reqwest stuff, and a function foobar that calls foo and bar. We can easily match over all errors that can be returned from foo, bar, and foobar:
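fn foo() -> Result<()> {
    // ... some serde work that may fail with Error::Serde ...
    Ok(())
}

fn bar() -> Result<()> {
    // ... some reqwest work that may fail with Error::Reqwest ...
    Ok(())
}

fn foobar() -> Result<()> {
    foo()?;
    bar()?;
    Ok(())
}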
In addition, suppose we are calling foobar. We can easily decide how to handle errors as we know the type of error.
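fn main() {
    loop {
        match foobar() {
            Err(Error::Serde(_e)) => {
                // handle the serde error, then stop looping
                return;
            }
            Err(Error::Reqwest(_e)) => {
                // handle the reqwest error, then try again on the next iteration
            }
            _ => {}
        }
    }
}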
Yet we still have the first problem: no backtrace. And furthermore, often the exact context of the error is not apparent. For instance, if we are calling foobar in a loop, we do not know which iteration of the loop the error occurred on.
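The missing-context problem is easy to reproduce with a standard-library-only version of the same pattern. In this sketch the Utf8 and Parse variants and the functions parse_record and sum_records are hypothetical stand-ins for the serde and reqwest cases above.

```rust
use std::num::ParseIntError;
use std::str::Utf8Error;

// A single crate-wide error type, as in the section above.
#[derive(Debug)]
enum Error {
    Utf8(Utf8Error),
    Parse(ParseIntError),
}

impl From<Utf8Error> for Error {
    fn from(e: Utf8Error) -> Self {
        Self::Utf8(e)
    }
}

impl From<ParseIntError> for Error {
    fn from(e: ParseIntError) -> Self {
        Self::Parse(e)
    }
}

type Result<T> = std::result::Result<T, Error>;

fn parse_record(bytes: &[u8]) -> Result<i64> {
    let text = std::str::from_utf8(bytes)?;
    Ok(text.trim().parse::<i64>()?)
}

fn sum_records(records: &[&[u8]]) -> Result<i64> {
    let mut total = 0;
    for record in records {
        // The error is forwarded unchanged; the loop position is not recorded.
        total += parse_record(record)?;
    }
    Ok(total)
}

fn main() {
    let records: [&[u8]; 3] = [b"1", b"x", b"3"];
    match sum_records(&records) {
        Ok(total) => println!("total = {total}"),
        // We can tell what kind of failure it was, but not which record caused it.
        Err(Error::Parse(e)) => println!("parse error: {e}"),
        Err(Error::Utf8(e)) => println!("utf-8 error: {e}"),
    }
}
```

Matching is exhaustive and cheap here, but recovering the record index would mean adding fields to every variant by hand.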
anyhow #
anyhow is a crate that solves both of these problems. It provides a global error type (anyhow::Error) and utilities that allow you to easily add context to errors.
In addition, it also provides the ability to downcast errors to their original type. For instance, suppose we are using the libraries serde, reqwest, and tokio and we have a function foobar that calls foo and bar.
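use anyhow::Result;

fn foo() -> Result<()> {
    // ... some serde work; its errors convert into anyhow::Error via `?` ...
    Ok(())
}

fn bar() -> Result<()> {
    // ... some reqwest work; its errors convert into anyhow::Error via `?` ...
    Ok(())
}

fn foobar() -> Result<()> {
    foo()?;
    bar()?;
    Ok(())
}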
Matching over errors #
We can still match over the errors that can be returned from foo, bar, and foobar by downcasting to the original type:
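fn main() {
    loop {
        match foobar() {
            // serde itself exposes no concrete error type; serde_json::Error is assumed here
            Err(e) if e.downcast_ref::<serde_json::Error>().is_some() => {
                return;
            }
            Err(e) if e.downcast_ref::<reqwest::Error>().is_some() => {
                // handle the reqwest error, then try again on the next iteration
            }
            _ => {}
        }
    }
}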
Backtraces #
Perhaps one of the most useful features of anyhow is that it records the backtrace of the error.
This allows us to easily debug the error. For instance, if foobar fails while we are calling it in a loop, the backtrace shows exactly which chain of calls produced the error.
fn main() {
    loop {
        match foobar() {
            Err(e) => {
                eprintln!("error: {}", e);
                // a backtrace is only captured when RUST_BACKTRACE=1 or RUST_LIB_BACKTRACE=1 is set
                eprintln!("backtrace: {}", e.backtrace());
                return;
            }
            _ => {}
        }
    }
}
The backtrace is also printed automatically when the error is formatted with {:?}, for example when it is returned from main or passed to panic! further up the call stack.
Adding context #
anyhow also provides utilities for adding context to errors. For instance, suppose we are calling foobar in a loop, and we want to know which iteration of the loop the error occurred on.
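use anyhow::{Context, Result};

fn run() -> Result<()> {
    for i in 0.. {
        // attach the iteration number to whatever error foobar returns
        foobar().with_context(|| format!("iteration {}", i))?;
    }
    Ok(())
}

fn main() {
    if let Err(e) = run() {
        // {:#} prints the context followed by the underlying cause
        eprintln!("error: {:#}", e);
        eprintln!("backtrace: {}", e.backtrace());
    }
}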
Different approach for public libraries #
anyhow is great for applications, but it is not great for libraries, specifically those that are not managed by your team.
If you are confident the library is sound, the errors it produces will not be internal errors that need
to be debugged, but external errors that need to be handled. Therefore, having a backtrace and context
string will not be very useful.
Note, I specifically am talking about libraries that are not managed by your team. If you are managing the library, then you can just use anyhow and be done with it. The errors produced by the library are much more likely to be internal errors that need to be debugged.
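The text does not spell out what such a library should return instead. One common option, assumed here, is a small concrete error enum that implements std::error::Error, so that consumers can match on the external failures they are expected to handle. ConfigError and port_from are hypothetical names used only for this sketch.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical public error type of a small config-parsing library.
#[derive(Debug)]
pub enum ConfigError {
    MissingKey(String),
    InvalidPort(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingKey(key) => write!(f, "missing key `{key}`"),
            ConfigError::InvalidPort(_) => write!(f, "invalid port number"),
        }
    }
}

impl Error for ConfigError {
    // Expose the underlying cause so callers (or anyhow) can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ConfigError::MissingKey(_) => None,
            ConfigError::InvalidPort(e) => Some(e),
        }
    }
}

// Hypothetical library entry point: parses a line of the form `port=<number>`.
pub fn port_from(line: &str) -> Result<u16, ConfigError> {
    let value = line
        .strip_prefix("port=")
        .ok_or_else(|| ConfigError::MissingKey("port".to_string()))?;
    value
        .trim()
        .parse::<u16>()
        .map_err(ConfigError::InvalidPort)
}

fn main() {
    // The consumer handles each external failure explicitly.
    match port_from("port=eighty") {
        Ok(port) => println!("port {port}"),
        Err(ConfigError::MissingKey(key)) => println!("please set `{key}`"),
        Err(e @ ConfigError::InvalidPort(_)) => println!("bad config: {e}"),
    }
}
```

Because ConfigError implements std::error::Error, an application that uses anyhow can still propagate it with ? and attach its own context on top.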
tracing #
However, few libraries are perfect. To help debug errors, it can be useful to instead use tracing to emit error log messages in the cases where there truly is an internal error. In addition, the tracing library is incredibly helpful for reasoning about the flow of your program. The flow of async programs in particular is often difficult to follow, and tracing (even when used with correctly programmed libraries) can still help the consumers of a library reason about the flow of the program.