Thursday, September 8, 2011

The rule of the [3] 4: swap

Object oriented programming focuses on bundling together the data with the operations that can be applied to them. Access specifiers allow you to control the invariants of the type, and deterministic destruction of objects provides the glue that makes management of all types of resources.

But homogeneous does not mean trivial, and attention has to be paid. In a language with exceptions special care needs to be taken so that resources don't leak, but resources are not the only problem, when we model a domain, we need to maintain the invariants on top of which our algorithms are built.

Managing resources: The Rule of the Three

One of the widely known rules for resource management is the rule of the three, that according to the wikipedia article says:

[...] if a class defines one of the following it should probably explicitly define all three: destructor, copy constructor, copy assignment operator

The rule is based on the principle that if any of the three needs to be implemented, it is most probably because the class encapsulates management of a resource for which you need to provide sensible copy-semantics and release in the destructor.

Optionally if you encapsulate each resource in it's own RAII manager, as you should specially if you have to manage more than one resource, then you can avoid the addition of the destructor.

Whenever I have encountered the rule of the three, my next thought was for swap, not the one implemented in the std namespace, but rather a handcrafted swap function that provides the no throw exception guarantee and is cheap to run. The main point of providing swap is implementing exception safety by means of the copy-and-swap idiom.

The copy and swap idiom

This is a common idiom to implement exception safety in general, but more importantly in types that manage resources. The idea is that instead of executing the operations directly on the object we can copy the object, perform the operation aside and then, and only after the operation has completed successfully, swap the contents of the two objects. We provide the operation as a single transaction: either it succeeds and modifies the object or it fails and leaves the object as it was before the operation started.

Consider for example the number that we sketched the other day, and assume that it is a big number, usually implemented with a dynamic array of digits1. Addition of a big number to our existing big number (operator+=) may require growing of the array to accomodate a potentially larger number, and performing the operation2 on the bigger array.

If while performing the actual operations anything failed and an exception is thrown (there aren't many things that can go wrong with the addition of the individual digits but bear with me), we would have to take care of ensuring that there are no memory leaks and that the object is left in a consistent state. To ensure that no memory is leaked, we should manage the memory using RAII, but our number already implements RAII for the buffer, so why not reuse it?

Instead of managing the memory through a suitable smart pointer we can create a number with automatic storage that is big enough to hold the result. Then we operate on that local temporary3, and if all goes right we exchange the contents of the object and the temporary with a swap function that offers the no throw guarantee.

If an exception is thrown during the operation, the stack will be unwound, the local object will be destroyed and that will release the memory. As a nice side effect, because the operation has not modified the current object at all, we provide the strong exception guarantee.

class number {
    int * data;
    std::size_t digits;
public:
    // other code omitted 
    std::size_t size() const {
        return digits;
    }
    number& operator+=( number const & rhs );
    number& swap( number& rhs ) throw();
private:
    number( std::size_t size ) : data( new int[size]), digits(size) {}
};
number& number::swap( number& rhs ) throw() {
    using std::swap;
    swap( data, rhs.data );
    swap( digits, rhs.digits );
    return *this;
}
number& number::operator+=( number const & rhs ) {
    number tmp( std::max( size(), rhs.size() ) + 1 );
    // actual operations here, might throw
    return swap( tmp );
}
// Offer a free function for commodity
void swap( number & lhs, number & rhs ) {
   lhs.swap( rhs );
} 

When swap meets the three

The rule of the three focuses on providing appropriate value semantics to your types while ensuring that resources won't leak. On the other hand, the copy-and-swap idiom focuses on exception safety, on how to implement operations so that they either succeed or fail in a graceful way (i.e. leaving the state of the original objects intact). They do share a common ground: both of them are important to handle exceptions correctly, so you might want to think on implementing both in your user defined types.

The good news are that the cost of implementing both idioms for your type is not greater than implementing the rule of the three alone. Instead of implementing the assignment operator, you can move that effort into implemeting swap, and the assignment operator will come for free:

class number {
    int * data;
    std::size_t digits;
public:
    // Rule of the three: copy constructor
    number( number const & rhs ) : data( new int[ rhs.digits ] ), digits( rhs.digits ) {
        std::copy( rhs.data, rhs.data+rhs.digits, data );
    }
    // Rule of the three: assignment operator
    number& operator=( number rhs ) {
        return swap( rhs );
    }
    // Rule of the three: destructor
    ~number() {
        delete [] data;
    }
};

If we had not implemented swap, then the assignment operator would have been much more complex than our swap, and harder to make it right. Given a choice, implement swap, it will not add development cost, and it will help in many ways.

When not to swap

It is always a good exercise to think before starting to code. While implementing a no-throw swap is in general a good idea, you should consider whether it does make sense in your particular problem.

Copy and swap is a simple generic solution to provide the strong exception guarantee, but it adds the cost of creating a copy of the original object on which to perform the operation. If none of the steps in the algorithm can throw, then you can avoid the cost altogether. Sometimes you cannot provide a swap function that offers the no-throw guarantee, or the swap operation is too expensive. While you should never sacrifice correctness for performance, it might be worth thinking on ad-hoc ways of provide the safety. As always with performance, measure first, and decide whether you even need to consider optimizing later.

He who sacrifices correctness for performance deserves neither

But if you can implement an efficient swap that does not throw, then that should be the very first function to write.

Move semantics

There is quite a bit of fuzz going on with the new standard and rvalue-references, and in particular with one of the two primary uses: move semantics. While native support for moving was not present in the C++98/03 standards, people have been moving all along: the copy-and-swap idiom is moving.

The function swap has a stricter set of requirements than moving. When you move from an object a to an object b, the semantics only require that the state of b after the move operation is equivalent to the state of a before the operation and that a is at the very least destructible. On the other hand, swap states a stronger guarantee, the state of each object after the operation is equivalent to the state of the other object before the operation, which by definition fulfills all the guarantees of moving.

In the whole discussion, we have not mentioned the state in which the temporary object is left, and the only operation that is applied to that object is the destructor. We don't care about that temporary at all, we only use it as the source from which to move the result of the operation that was performed aside.

Besides exception safety, as discussed above, swap can also be used to improve performance in your code. For an illustrative example, we can consider performance of the implementation of operator+ mentioned here together with the discussions on copy elision and the final question of whether we could do better. To refresh the memory, the signature of that operator would be:

number operator+( number lhs, number const & rhs );

And the question is Why define it as that if the compiler cannot elide copying the argument to the returned object?. The answer is that this signature allows the compiler to optimize where you cannot: avoiding temporaries passed as arguments or a returned object used for initialization cannot be controlled by the programmer. It then leaves us with the opportunity to improve on the only copy that is left by moving from lhs to the returned object:

number operator+( number lhs, number const & rhs ) {
    lhs += rhs;
    number tmp;         // [1]
    swap( tmp, rhs );   // [2]
    return tmp;
}

The code is a bit more cumbersome now, so is it worthy? Well, the first thing is whether there is something to improve and for that copying must be expensive, expensive enough to compensate the two operations with which we are replacing it. In line [2], swap must be cheap and provide the no throw guarantee (or at least the same guarantee that copy contruction, remember: never optimize at the cost of correctness!), and the construction in [1] must be relatively cheap. There are still two objects in the code, but the cost of keeping those two objects might be much smaller than letting the compiler copy for us.

For our number implementation, we could implement the value 0 as a number that holds no memory (data is a null pointer, digits is 0), which will make the construction in [1] equivalent to just two assignments. Then we can use swap to move lhs into tmp, which will require another 6 assignments, that overall amount to no cost. We don't even need to modify the destructor to take into account this particular state, as the delete [] data; will be a no-op when data is null.

This might seem like a forced example just to show how cool swap can be, but it has been implemented in production code, and you might have even used it unknowingly. In the Dinkumware implementation of the STL4 types are tagged with a type trait _Move_operation_category that can be used with SFINAE to optimize operations based on this knowledge. In particular, when a std::vector needs to grow, it will allocate the new buffer in memory, and if the contained type has an efficient swap, it will default-initialize the elements in the new buffer, and then use swap to move the values. The effect is that with a std::vector<std::vector<T> > the cost of growing is proportional to the size of the outer vector, regardless of the sizes of the contained vectors, converting an O( N*M ) operation into O(N) (where N is the size of the outer vector, and M the average size of the contained vectors)

swap and moving in the future

The recommendation of implementing swap for your types will probably fall in disgrace in the near future. As developers embrace the upcoming standard they will start using move-constructors and move-assignment instead of resorting to using swap as a poor man's move operation, taking swap away from the spotlight that it deserved but did not really enjoy.

Not only will swap not be used for moving, but move semantics will be used to trivially implement efficient swapping of resources. The two lines marked with [1] and [2] in the operator+= above disappear and yet we maintain the same behavior of the code. In C++0x, even std::swap will use rvalue-references when the type implements move semantics. Interestingly, the simplest implementation of move-assignment will probably be that implementation of the swap function we will not write.

But all that is a story for another day, this post is already too long and you don't want to get bored...

P.S.: It's taken me a long time to write this post, and it will take me some time to write the next as I will be doing some traveling (There I go NY!). I hope this huge post makes up for the delay and hope to see you around!

----
1 Here digit does not denote a decimal or hexadecimal digit, but in most cases a much larger entity. A common approach in big number libraries uses half of the bits in the native memory word (16bit in a 32bit architecture) for a digit. That ensures that we can apply any basic operation to a digit and the result will fit in a word. Then we can post-process all the digits normalizing the data to maintain the invariant that each element in the array never uses more than half the bits. That is, we can process all digits separately without overflows, and then fix the carryovers in a single pass.

2 A dynamic array? Manually implemented? Really? In production code, this would be implemented with a std::vector<int>. Using a vector will ease some of the problems of manual management, but many of the things in the post still apply when using a vector, but it seems more graphical with manual memory management.

3 Not in the C++ sense, only a temporary in the sense that its a short lived object (only within the scope of this operation)

4 But who exactly is Dinkumware? Dinkumware is the company that licenses the STL implementation that is shipped with Visual Studio, so if you have used Visual Studio 2003 or later, they are the implementors of the STL you used.

4 comments:

  1. Its a wonderful thing that people like you share your knowledge so selflessly and passionate about blogging things which can be useful to others ...Regards

    ReplyDelete
  2. Why is that you don't blog anymore? . I really miss articles from your side :(

    ReplyDelete
    Replies
    1. Long story... changed job, city, country, got married and have a kid. Writing takes time, and that is something I don't have much now. I have a couple of things I'd like to write about, lookup will require a number of entries but should be an interesting topic.

      Delete
  3. Typo in your "operator+" example:

    swap( tmp, rhs );

    ...should read...

    swap ( tmp, lhs );

    The description in the subsequent text is correct.

    ReplyDelete