Smallpearl - Practices to reduce memory leaks in C++

One of the fundamental differences between C++ and the more modern and, if I dare say, more popular programming languages such as Java and C# is the lack of a garbage collector in C++. This is often used as an example to portray how C++ has not kept up with the times and how code written in the language is a potential minefield of memory leaks.

While the debate about the merits of one language over the other continues (and in all probability will continue for many more years), there are a few steps a programmer can take to minimize, if not eliminate, the chances of a memory leak.

I would like to list some of the techniques that I have employed over the years and that have worked well for me.

Use RAII

Use objects to abstract concepts

Use std::auto_ptr

Turn on memory tagging

Use std::vector<> to allocate memory

Declare base class destructor as virtual

[Use RAII]

Use the RAII (Resource Acquisition Is Initialization) programming paradigm: perhaps the best technique one can employ to minimize memory leaks.

RAII is a term coined by Stroustrup for the technique of using an object's constructor and its automatically invoked destructor to manage a resource. In C++, there are two ways to allocate objects: from the stack and from the heap. Unlike Java and C#, where class instances are created by allocating memory from the heap, objects in C++ declared with local scope (within a function or method) are allocated from the stack, and when these objects go out of scope, as happens when a function returns, they are automatically destroyed.

This behavior can be exploited to make local resource management very simple. For example:

void foo()
{
    FILE* fp = fopen(...);
    if (!fp)
        return;
    // ...
    // do some other operation
    bakeacake();
    // call bar() giving the file as argument
    bar(fp);
    fclose(fp); // close the file
}

In this example, consider what happens if either bakeacake() or bar() throws an exception. Control would immediately be transferred to the caller of foo() (and if it doesn’t handle the exception, to a higher-level caller in the call chain that hopefully traps the exception), skipping the fclose() call and leaving the file open.

Given the above scenario, you can always write this function as

void foo()
{
    FILE* fp = fopen(...);
    if (!fp)
        return;
    try {
        // ...
        // do some other operation
        bakeacake();
        // call bar() giving the file as argument
        bar(fp);
    } catch (...) {     // catch the error
        fclose(fp); // remember to close
        throw;
    }
    fclose(fp); // close the file
}

and solve the problem of the potential resource leak. However, this version repeats the cleanup code. That makes it harder to maintain (any change in resource management requires updating two different places in the code), and it requires the programmer to exercise discipline while writing code (to track every control path and every resource allocated so that they can all be released).

Another way to write this function would be to use a local object (allocated on the stack of foo()) that provides the required file operations. Consider this example:

void foo()
{
    File f("a.out");
    // ...
    // do some other operation
    bakeacake();
    // call bar() giving the file as argument
    bar(f);
}

The above approach will only work if you have access to the code of bar() and can change it to accept a reference to a File object rather than a FILE*. The downside is that you now have to change the implementation of another function to fit your new programming style.

Now consider the scenario where such a File object does not exist (and for some reason it is difficult to build one yourself), or where you’re not at liberty to modify bar() because it is used in many other modules whose code you cannot access. In that case, you may try this approach:

class FilePtr {
    FilePtr(const FilePtr&);            // not copyable: declared, never defined
    FilePtr& operator=(const FilePtr&);
public:
    FilePtr(const char* name, const char* mode) : fp_(0)
    { fp_ = fopen(name, mode); }
    FilePtr(FILE* fp) : fp_(fp)
    {}
    ~FilePtr()
    { 
        if (fp_) fclose(fp_);
    }
    operator FILE*() // crucial to making this work!
    { 
        return fp_;
    }
protected:
    FILE* fp_;
};

In this example, we first create a small wrapper class to which we delegate the responsibility of managing the FILE* resource. Inside foo(), a FilePtr object is declared as a local variable in place of the raw FILE*. When foo() returns, the FilePtr object, being allocated on the stack, is automatically destroyed, and its destructor closes the FILE* gracefully.

There are two key points here that merit highlighting.

The bar() function does not have to be modified. This is because FilePtr declares a conversion operator to FILE*, which the compiler invokes automatically when a FilePtr is passed to a function that expects a FILE*.

The FILE* is released automatically when foo() exits, whether it returns normally or leaves because bakeacake() or bar() threw an exception.
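To make the two points concrete, here is a minimal sketch of foo() rewritten around the wrapper. The class is repeated in compact form so the sketch compiles on its own. The extra `fail` parameter and the body of bar() are assumptions added purely to simulate an error; they are not part of the original functions.

```cpp
#include <cstdio>
#include <stdexcept>

// Compact copy of the FilePtr wrapper so this sketch is self-contained.
class FilePtr {
    FilePtr(const FilePtr&);            // not copyable
    FilePtr& operator=(const FilePtr&);
public:
    FilePtr(const char* name, const char* mode) : fp_(std::fopen(name, mode)) {}
    ~FilePtr() { if (fp_) std::fclose(fp_); }
    operator FILE*() const { return fp_; }
private:
    FILE* fp_;
};

// Hypothetical stand-in for bar(): an unmodified C-style function taking FILE*.
void bar(FILE* fp, bool fail)
{
    std::fputs("hello\n", fp);
    if (fail)
        throw std::runtime_error("bar failed");
}

// Returns false if the file could not be opened. There is no fclose() call.
bool foo(const char* path, bool fail)
{
    FilePtr fp(path, "w");
    if (!fp)            // the FILE* conversion makes the null test work
        return false;
    bar(fp, fail);      // implicit conversion from FilePtr to FILE*
    return true;        // ~FilePtr() closes the file here, or during unwinding
}
```

Whether bar() returns or throws, the destructor runs exactly once, so the try/catch block and the duplicated fclose() from the earlier version are no longer needed.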

[Use objects to abstract concepts]

Use objects to abstract concepts as much as possible rather than using procedural style code.

One of the strengths of C++ is also its weakness. C++ allows you to mix C coding style (where program logic is expressed purely as procedural functions) into C++ code. While this provides immense flexibility and backward compatibility, it also leads you to maintain the same coding style that you’re used to (assuming that you spent considerable time coding in C and then eventually migrated to C++) preventing you from exploiting the language to its fullest.

Coupled with RAII as explained before, this approach yields excellent results. Following are a few examples of scenarios where this approach pays rich dividends:

1. Synchronisation locks

2. Files as in the previous example

3. Temporary buffers (we’ll examine this soon)

4. Any other form of OS resources that have local scope

A few examples:

void a::foo()
{
    Guard g(mutex_);
    // code below, until the function exits, is
    // guaranteed to be safe across all threads
    // that access the resources protected by
    // the mutex mutex_.
    // update shared resources
    status_ = 1;
}

Here the constructor of the Guard class acquires the mutex (waiting for it indefinitely) and its destructor automatically releases the same mutex.
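The Guard class itself is not shown above, so the following is only a sketch of what it might look like, assuming a C++11 compiler and std::mutex as the lock type. The Counter class and its `fail` parameter are hypothetical and exist only to exercise the guard.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical Guard: locks in the constructor, unlocks in the destructor.
class Guard {
public:
    explicit Guard(std::mutex& m) : m_(m) { m_.lock(); }
    ~Guard() { m_.unlock(); }
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
private:
    std::mutex& m_;
};

// Hypothetical class with a shared counter protected by a mutex.
class Counter {
public:
    Counter() : value_(0) {}

    int increment(bool fail = false)
    {
        Guard g(mutex_);
        if (fail)
            throw std::runtime_error("simulated failure");
        return ++value_;
    }   // g's destructor releases mutex_ on both the normal and the throwing path

private:
    std::mutex mutex_;
    int value_;
};
```

If the mutex were not released when increment() throws, the next call would block forever. The standard library's std::lock_guard provides the same behavior, so in C++11 and later there is rarely a need to write this class by hand.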

void b::OnPaint()
{
    CPaintDC dc(this);    // RAII
    dc.SetBkColor(RGB(127, 127, 127));
    CRect rc; GetClientRect(&rc);   // GetClientRect is a CWnd member, not a CDC one
    dc.FillSolidRect(&rc, RGB(0, 0, 0));
}

This is classic MFC code. In Windows, drawing to a window in response to a WM_PAINT message requires one to call the BeginPaint API and, once the drawing has been completed, the EndPaint API. These calls are encapsulated in the constructor and destructor of the CPaintDC class respectively, insulating the programmer from calling the APIs explicitly.

[Use std::auto_ptr]

Despite its destructive copy semantics, auto_ptr is still a very useful template that can be effectively employed to automatically release dynamic objects.

An example scenario where auto_ptr can be effective is in UI widgets that implement optional behaviors. For instance, imagine a window widget class that allows the user to set a background image as an option. Until the user sets this option, the widget draws the background using a solid color.

class Window {
    ...
    std::auto_ptr<Background> background_;
    ...
public:
    ...
    void setBackground(const wchar_t* imagefilename) throw(std::exception)
    {
        background_.reset(new Background(imagefilename));
    }
    void Paint()
    {
        ...
        if (background_.get()) {
            // paint the background
        }
    }
};

Objects of the Window class start out with no Background object: the background_ member is NULL by default. When a user sets the background, a Background object is created and owned by background_. The window paint routine tests the member for NULLness, and if it points to a valid object (and hence is not NULL), the image is painted as the background of the window.

The key point here is that the programmer does not have to remember to do anything explicit to manage the Background object. When the window is destroyed and the corresponding Window object is deleted, the associated Background object’s destructor is called automatically, and all the resources used to paint the background image can be released there.

[Caveat 1:] As mentioned in the beginning, auto_ptr has destructive copy semantics: copying an auto_ptr transfers ownership and leaves the source NULL. Any object containing a member of this type is therefore not safe to copy-construct or assign. If your object requires either of these operations, you should stay away from auto_ptr. An alternative is boost::shared_ptr<>.

[Caveat 2:] For the above reason, C++0x (published as C++11) deprecates auto_ptr and provides a new template, std::unique_ptr, in its place. auto_ptr was later removed entirely in C++17.

[Caveat 3:] std::auto_ptr is also not capable of managing array types, whereas std::unique_ptr is.
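For readers on a C++11 or later compiler, the same Window design can be sketched with std::unique_ptr. The Background body, its liveCount counter and the hasBackground() method are assumptions added so that the behavior can be observed; the original classes are elided in the listing above.

```cpp
#include <memory>
#include <string>

// Hypothetical Background: only records the file name and counts live objects.
struct Background {
    explicit Background(const std::wstring& file) : file_(file) { ++liveCount; }
    ~Background() { --liveCount; }
    std::wstring file_;
    static int liveCount;   // instrumentation for this sketch only
};
int Background::liveCount = 0;

class Window {
public:
    void setBackground(const wchar_t* imagefilename)
    {
        // reset() deletes any previously owned Background.
        background_.reset(new Background(imagefilename));
    }
    bool hasBackground() const { return background_ != nullptr; }
private:
    std::unique_ptr<Background> background_;   // empty until a background is set
};
```

Because std::unique_ptr cannot be copied, an accidental copy of a Window now fails at compile time instead of silently stealing the background from the source object, which is the hazard described in Caveat 1.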

[Turn on memory tagging]

Turn on tagging of heap-allocated memory blocks in debug builds and compare heap memory snapshots taken at program startup and at termination to ensure that all memory blocks are being released.

In Windows one can use the _CrtMemCheckpoint() and _CrtMemDifference() CRT calls to check if a function leaks any memory. Visual C++ also provides other Heap State Reporting Functions that can be used to monitor memory leaks.
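As a rough sketch of how the two CRT calls fit together, the helper below takes a snapshot before and after running a function and reports whether the debug heap changed. The helper name heapChangedDuring() and the sample function tidy() are hypothetical. The check only exists in Visual C++ debug builds, so on any other toolchain this sketch simply runs the function and reports no difference.

```cpp
#if defined(_MSC_VER) && defined(_DEBUG)
#include <crtdbg.h>
#endif

// Hypothetical helper: runs fn() and reports whether the debug heap differs
// before and after the call. Always returns false outside MSVC debug builds.
template <typename Fn>
bool heapChangedDuring(Fn fn)
{
#if defined(_MSC_VER) && defined(_DEBUG)
    _CrtMemState before, after, diff;
    _CrtMemCheckpoint(&before);
    fn();
    _CrtMemCheckpoint(&after);
    if (_CrtMemDifference(&diff, &before, &after)) {
        _CrtMemDumpStatistics(&diff);   // writes a summary to the debug output
        return true;
    }
    return false;
#else
    fn();
    return false;
#endif
}

// Sample function that releases everything it allocates.
void tidy()
{
    char* p = new char[64];
    delete[] p;
}
```

Wrapping a suspect function this way in a unit test narrows a leak down to a single call without having to inspect a full leak dump at program exit.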

On Linux, I believe it is the GNU C library (glibc), rather than gcc itself, that provides mtrace(), which logs memory allocation requests to a log file. The log can then be parsed by the mtrace utility to report any leaks.

[Use std::vector<char> instead of malloc.]

std::vector<> uses RAII to acquire its memory, and its destructor releases the memory. This way you don’t have to remember to release the memory you allocated at every one of your return points.

For example:

void foo()
{
    char* buf = new char[1024];
    if (!buf)
        return;

    // dummy frame buffer
    volatile ulong* fb = (volatile ulong*)0x00040000;
    memcpy((void*)buf, (const void*)fb, 1024);

    if (cond1) {
        dosomething(buf);
        delete[] buf;
        return;
    }
    if (cond2) {
        dosomethingelse(buf);
        delete[] buf;
        return;
    }
    // do some other processing
    doyetanotherthing(buf);
    delete[] buf;
}

You’ll notice that there are three different locations from which the function returns to the caller, and hence three different calls to release the memory allocated at the beginning of the function. As mentioned in the first point, this requires the programmer to exercise discipline and remember to release the memory on every return path. Secondly, if any of dosomething(), dosomethingelse() or doyetanotherthing() throws an exception, the memory would never be released.

Now let’s look at the same function, this time using std::vector for its memory allocation.

void foo() throw(std::exception)
{
    std::vector<char> buf(1024);

    // dummy frame buffer
    volatile ulong* fb = (volatile ulong*)0x00040000;
    memcpy((void*)&buf[0], (const void*)fb, 1024);

    if (cond1) {
        dosomething(&buf[0]);
        return;
    }
    if (cond2) {
        dosomethingelse(&buf[0]);
        return;
    }

    doyetanotherthing(&buf[0]);
}

Admittedly this is not such a great example as the control paths can be combined to eventually lead to a single point of return where the delete can be placed. But I hope you get the idea.
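The listing above depends on a fixed hardware address, so it cannot be run as is. The following self-contained sketch shows the same pattern with an ordinary input buffer. The functions countByte() and process(), and the rules for the early exits, are made up for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical C-style consumer that expects a raw buffer and a length.
std::size_t countByte(const char* buf, std::size_t len, char wanted)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < len; ++i)
        if (buf[i] == wanted)
            ++n;
    return n;
}

// Copies src into a temporary buffer, then leaves through one of several exits.
// The vector releases its memory on every one of them.
std::size_t process(const char* src, std::size_t len)
{
    if (len == 0)
        throw std::invalid_argument("empty input");

    std::vector<char> buf(len);
    std::memcpy(&buf[0], src, len);

    if (buf[0] == '#')
        return 0;                                    // early return
    if (buf[0] == '!')
        throw std::runtime_error("rejected input");  // exceptional exit
    return countByte(&buf[0], buf.size(), 'a');      // normal exit
}
```

Passing `&buf[0]` hands the vector's contiguous storage to code that expects a plain pointer, so existing C-style functions do not need to change.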

[Declare base class destructors as virtual.]

Forgetting this is an often-repeated mistake, especially as a project's scope expands and classes that were designed and implemented in earlier versions are reused by inheriting from them to alter their behavior.

When a class derives from a base class whose destructor is not declared virtual, deleting a derived object through a pointer to the base class is formally undefined behavior. In practice it typically invokes only the destructor of the base class and not that of the derived class. Thus, the derived class's resources would not be cleaned up.
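A small sketch of the correct arrangement follows. The class names and the `released` counter are hypothetical; the counter stands in for whatever resource the derived class would free in its destructor.

```cpp
// Base class intended for inheritance: its destructor is virtual.
struct Base {
    virtual ~Base() {}
};

struct Derived : Base {
    explicit Derived(int& released) : released_(released) {}
    ~Derived() { ++released_; }   // stands in for freeing the derived class's resources
    int& released_;
};

// Deletes a Derived object through a Base pointer and reports how many
// times the derived destructor ran.
int destroyThroughBase()
{
    int released = 0;
    Base* p = new Derived(released);
    delete p;          // virtual destructor: ~Derived() runs first, then ~Base()
    return released;
}
```

With the `virtual` keyword removed from ~Base(), the same `delete p` would no longer be guaranteed to run ~Derived() at all.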