Month of Rust Update 2: Error Handling Concerns

2020-05-03 in LANGUAGE EXPERIMENTS • SE COMMENTARY • SOFTWARE ENGINEERING

rust software engineering

I’ve spent the last few days somewhat diligently playing around with Rust. That has mostly meant studiously reading the Rust book (The Rust Programming Language) and doing some of the examples. I’m quickly tiring of that and will have to move on to koans, tutorials, or just some projects. However, each day brings a little more insight into Rust, mostly positive, but one area where I have some concerns is error handling. Specifically, I’m concerned about the lack of any traditional exception handling. In its place there is only returning error objects or panicking (crashing) the whole program.

I understand some of the logic they spell out for why they are going down this path with error handling. I agree with them that there should probably be multiple error control paths. They break theirs down into “recoverable” and “unrecoverable”. That split is so obvious it is a given, but Rust actually codifies it very explicitly. For unrecoverable errors, like a bad array index access, there are panics. Panics shut the entire program down just like a segmentation fault would in a traditional C/C++ program. For recoverable errors, Rust leverages its enum types to pass back either a result or an error. I’m actually very excited about the latter part. While other languages have things like Optionals that let us avoid passing around nulls, none of them seem as flexible as the Rust enum system. Enums are used for non-error conditions as well; the Rust standard library even includes an Option type. It allows us to write cleaner one-liners with chaining, in the same way we would with the safe-call and Elvis operators in Kotlin. It gives us the performance and fine-grained control of handling errors through return types, but without nulls or some hack to return multiple values. So where’s my problem?
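As a rough sketch of the chaining described above: the `User` and `Address` types and the `city_or_unknown` function are hypothetical, invented only for this illustration. The combinators `and_then` and `unwrap_or_else` are standard methods on `Option`.

```rust
// Hypothetical types, invented for this illustration only.
struct Address {
    city: Option<String>,
}

struct User {
    address: Option<Address>,
}

// Walks user -> address -> city and falls back to a default when any link
// is missing. Roughly the Kotlin expression `user?.address?.city ?: "unknown"`.
fn city_or_unknown(user: Option<&User>) -> String {
    user.and_then(|u| u.address.as_ref())
        .and_then(|a| a.city.clone())
        .unwrap_or_else(|| String::from("unknown"))
}

fn main() {
    let with_city = User {
        address: Some(Address {
            city: Some(String::from("Lisbon")),
        }),
    };
    let without_address = User { address: None };

    println!("{}", city_or_unknown(Some(&with_city)));
    println!("{}", city_or_unknown(Some(&without_address)));
    println!("{}", city_or_unknown(None));
}
```

No null checks appear anywhere: a missing value at any step simply short-circuits the rest of the chain.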

The first problem I see is that, without a stack-jumping exception system, error handling now has to be done at each incremental stage. That isn’t necessarily bad. In some ways it fits the ethos of Rust, which encourages good code hygiene from the developer. As I wrote above, there are many cases where we want to do that. However, one has no choice but to do it with Rust. Let’s say I’m going to be opening a file on the disk. I know that trying to open a file with a bad path, or one I lack permissions for, will produce an error. In a language with exception handling I may throw caution to the wind, since something above me in the call stack should eventually catch the exception, and it’s not for me to adjudicate. In Rust, if I don’t set it up or handle it correctly, it is not just my code that will fail to execute: the entire application will crash. There isn’t even a chance for some overarching exception handling at the root to handle it and prevent the process from crashing. If the goal is preventing laziness, that’s fine, but there are many instances where the adjudication of what to do on an error should happen at a higher layer, not where the error is produced. I can still get that sort of behavior by having my own method return Result objects, which carry either a success value or an error. This is how we did things in old-fashioned C, by passing back an integer response code. The enums mean we don’t have to sacrifice our return value to make room for an error code, but we still have the situation where whichever code called my method has to explicitly handle the error condition. Whichever code called that has to as well, and so on. In a very simple flat application that isn’t too bad. What about a very complex, long-lived one, or worse, a library?
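A minimal sketch of that layer-by-layer hand-off, assuming a hypothetical three-layer program. The function names and the file name are made up for the example. Rust’s `?` operator shortens the pass-it-up step to one character, but every function in the chain still has to declare the error in its return type.

```rust
use std::fs;
use std::io;

// Lowest layer: does the I/O and hands any failure back to its caller.
// `read_config` and the file name used in `main` are hypothetical.
fn read_config(path: &str) -> Result<String, io::Error> {
    let text = fs::read_to_string(path)?; // on Err, return it immediately
    Ok(text)
}

// Middle layer: has no opinion about the error and just passes it up,
// yet its signature must still mention `io::Error`.
fn count_config_lines(path: &str) -> Result<usize, io::Error> {
    let text = read_config(path)?;
    Ok(text.lines().count())
}

// Top layer: the one place that decides what a failure means.
fn main() {
    match count_config_lines("no-such-file.conf") {
        Ok(n) => println!("config has {} lines", n),
        Err(e) => println!("using defaults, could not read config: {}", e),
    }
}
```

The decision is made at the top, much as it would be with a catch block at the root, but only because each intermediate function explicitly opted in to forwarding the error.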

I believe the guidance from the Rust book on when to use a panic versus an error code goes very much awry when it comes to libraries. They write that it is often appropriate to panic if you got data you weren’t expecting, or if you called into external code and it returned an invalid state that you have no way of fixing. For an application I may buy that. If you are writing a third-party library, though, I can’t imagine that ever makes sense. As a developer I now have no idea, without thoroughly inspecting the library, whether I need to worry about it panicking and under what conditions. Sure, the API documentation may specify when a given method panics, but what about the libraries it calls into? What about the library two steps over from that (and so on)? What’s worse is that I don’t even have a recourse if one of the libraries or its dependencies actually panics. I just have a dead program. I therefore can’t imagine a case where it makes sense for a third-party library to legitimately panic, since it brings the entire program down. That’s like a C library calling exit with a parameter if it “got into a bad state.” Sadly, they have already started using it on one of the most used data structures: the vector (Vec), their dynamically resizable array structure.
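To make the alternative concrete, here is a sketch of a library-style function that reports unexpected data as a value rather than panicking. `parse_percent` and `PercentError` are hypothetical names invented for this example and do not come from any real crate.

```rust
use std::fmt;

// Hypothetical error type for a hypothetical library function.
#[derive(Debug, PartialEq)]
enum PercentError {
    NotANumber,
    OutOfRange(i64),
}

impl fmt::Display for PercentError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PercentError::NotANumber => write!(f, "not a number"),
            PercentError::OutOfRange(n) => write!(f, "{} is outside 0..=100", n),
        }
    }
}

// Library side: unexpected input becomes an Err value instead of a panic,
// so the calling application keeps control of what happens next.
fn parse_percent(input: &str) -> Result<u8, PercentError> {
    let n = input
        .trim()
        .parse::<i64>()
        .map_err(|_| PercentError::NotANumber)?;
    if (0..=100).contains(&n) {
        Ok(n as u8)
    } else {
        Err(PercentError::OutOfRange(n))
    }
}

// Application side: decides per call whether bad data is fatal.
fn main() {
    for raw in ["42", "250", "abc"] {
        match parse_percent(raw) {
            Ok(p) => println!("{} -> {}%", raw, p),
            Err(e) => println!("{} -> rejected: {}", raw, e),
        }
    }
}
```

The application can still choose to treat an `Err` as fatal, but that choice now sits with the caller instead of being made inside the dependency.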

One of the biggest problems in C/C++ is accidentally walking off the end of an array or string. The native array type has no knowledge of its length, so when you ask for the 11th element of a 10-element array it’s going to do something undefined. Practically, it’s going to do the pointer arithmetic and point to a piece of memory it shouldn’t. Reading that value is bad. Writing to it is catastrophic, often ending in a segmentation fault. The whole point of managed languages, a language like Rust, or even the STL array and vector data types is that the container knows its size and won’t let you access a value outside of it. What do all of these languages and structures do if you attempt this operation (for the STL containers, through the checked at() accessor rather than the unchecked operator[])? They throw an exception. What does Rust do? It panics and quits the program. Fortunately, Rust provides a more verbose syntax for the vector to get around this:

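    fn main() {
        let v = vec![1, 2, 3];

        // Returns None: index 99 is out of bounds, but nothing crashes.
        println!("{:?}", v.get(99));

        // Panics at runtime: index 99 is out of bounds.
        println!("{:?}", v[99]);
    }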

As the comments state, the first call, with the far more verbose syntax, actually runs without error, and the result returned and printed is None. The syntax we’d prefer to use, however, has the potential to crash the whole program. In many ways that feels like a giant step backwards.
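For completeness, a small sketch of what acting on that None might look like. `value_or` is a hypothetical helper written for this example, not part of the standard library. `get`, `copied`, and `unwrap_or` are standard methods.

```rust
// Returns the element at `index`, or `fallback` when the index is out of range.
// `value_or` is a hypothetical helper, not part of the standard library.
fn value_or(values: &[i32], index: usize, fallback: i32) -> i32 {
    values.get(index).copied().unwrap_or(fallback)
}

fn main() {
    let v = vec![1, 2, 3];

    // Branch explicitly on whether the element exists.
    match v.get(99) {
        Some(x) => println!("found {}", x),
        None => println!("index 99 is out of range"),
    }

    // Or collapse the lookup and the fallback into one expression.
    println!("{}", value_or(&v, 1, 0));
    println!("{}", value_or(&v, 99, 0));
}
```

Here `value_or(&v, 1, 0)` yields 2 and `value_or(&v, 99, 0)` yields the fallback 0, so the out-of-range case never reaches a panic. The cost is that every access site has to spell out what it wants to happen.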

It’s not that panics aren’t a good idea for extreme cases. It’s also not that I dislike passing around error codes as return values that leverage the power of the enum types. It’s just that there seem to be many cases where exception handling, with errors percolating up the stack, is not only a legitimate route but the preferred one. It also seems like panics at any point in a third-party library could lead to a very onerous path of validation, just to make sure that a program can’t be brought down accidentally by something in the dependency tree. It therefore feels like there is a gap left by the total absence of exception handling, for cases where error code handling isn’t appropriate but neither is crashing the whole program. Whether that’s one step forward and two steps backwards, a neutral thing, or perhaps still better than the alternative due to the enforced hygiene and flexibility with return types, I can’t say yet. I guess that will come with more practical experience.