Using Valgrind to Find Memory Leaks and Invalid Memory Use


By Alex Allain

Valgrind is a multipurpose code profiling and memory debugging tool for Linux on the x86 and, as of version 3, AMD64 architectures. It lets you run your program in Valgrind's own environment, which monitors memory usage such as calls to malloc and free (or new and delete in C++). If you use uninitialized memory, write off the end of an array, or forget to free a pointer, Valgrind can detect it. Since these problems are particularly common, this tutorial focuses mainly on using Valgrind to find them, though Valgrind can do a lot more.





Alternatively, for Windows users who want to develop Windows-specific software, you might be interested in IBM's Purify, which has features similar to Valgrind for finding memory leaks and invalid memory accesses. A trial download is available.

Getting Valgrind

If you're running Linux and you don't have a copy already, you can get Valgrind from the Valgrind download page.

Installation should be as simple as decompressing the archive with bzip2 and unpacking it with tar (XYZ is the version number in the examples below):
bzip2 -d valgrind-XYZ.tar.bz2
tar -xf valgrind-XYZ.tar
which will create a directory called valgrind-XYZ; change into that directory and run
./configure
make
make install
Now that you have Valgrind installed, let's look at how to use it.

Finding Memory Leaks With Valgrind

Memory leaks are among the most difficult bugs to detect because they don't cause any outward problems until you've run out of memory and your call to malloc suddenly fails. In fact, when working with a language like C or C++ that doesn't have garbage collection, almost half your time might be spent making sure memory is freed correctly. Even one mistake can be costly if your program runs for long enough and follows that branch of code.

When you run your code under Valgrind, you'll need to specify the tool you want to use; simply running valgrind will give you the current list. We'll focus mainly on the memcheck tool in this tutorial, since it checks for correct memory usage. With no other arguments, Valgrind presents a summary of calls to free and malloc. (Note that 18515 is the process id on my system; it will differ between runs.)
% valgrind --tool=memcheck program_name
...
==18515== malloc/free: in use at exit: 0 bytes in 0 blocks.
==18515== malloc/free: 1 allocs, 1 frees, 10 bytes allocated.
==18515== For a detailed leak analysis,  rerun with: --leak-check=yes
If you have a memory leak, then the number of allocs and the number of frees will differ (you can't use one free to release the memory belonging to more than one alloc). We'll come back to the error summary later, but for now, notice that some errors might be suppressed -- this is because some errors will be from standard library routines rather than your own code.

If the number of allocs differs from the number of frees, you'll want to rerun your program with the --leak-check=yes option. This will show you all of the calls to malloc/new/etc. that don't have a matching free.

For demonstration purposes, I'll use a really simple program that I'll compile to the executable called "example1":
#include <stdlib.h>

int main()
{
    char *x = malloc(100); /* or, in C++, "char *x = new char[100];" */
    return 0;
}
% valgrind --tool=memcheck --leak-check=yes example1
This will result in some information about the program showing up, culminating in a list of calls to malloc that did not have subsequent calls to free:
==2116== 100 bytes in 1 blocks are definitely lost in loss record 1 of 1
==2116==    at 0x1B900DD0: malloc (vg_replace_malloc.c:131)
==2116==    by 0x804840F: main (in /home/cprogram/example1)
This doesn't tell us quite as much as we'd like, though -- we know that the memory leak was caused by a call to malloc in main, but we don't have the line number. The problem is that we didn't compile using the -g option of gcc, which adds debugging symbols. So if we recompile with debugging symbols, we get the following, more useful, output:
==2330== 100 bytes in 1 blocks are definitely lost in loss record 1 of 1
==2330==    at 0x1B900DD0: malloc (vg_replace_malloc.c:131)
==2330==    by 0x804840F: main (example1.c:5)
Now we know the exact line where the lost memory was allocated. It's still up to you to work out where that memory ought to be freed, but since every call to malloc or new should come with a plan for releasing the memory, knowing where the lost block was allocated tells you where to start looking.

There will be times when the --leak-check=yes option will not show you every block of unfreed memory. To find absolutely every unpaired call to malloc or new, you'll need to add the --show-reachable=yes option. Its output is almost exactly the same, but it will also list memory that was never freed yet is still pointed to when the program exits.

Finding Invalid Pointer Use With Valgrind

Valgrind can also find the use of invalid heap memory using the memcheck tool. For instance, if you allocate an array with malloc or new and then try to access a location past the end of the array:
char *x = malloc(10);
x[10] = 'a';
Valgrind will detect it. For instance, running the following program, example2, through Valgrind
#include <stdlib.h>

int main()
{
    char *x = malloc(10);
    x[10] = 'a';
    return 0;
}
with
valgrind --tool=memcheck --leak-check=yes example2
results in the following warning
==9814==  Invalid write of size 1
==9814==    at 0x804841E: main (example2.c:6)
==9814==  Address 0x1BA3607A is 0 bytes after a block of size 10 alloc'd
==9814==    at 0x1B900DD0: malloc (vg_replace_malloc.c:131)
==9814==    by 0x804840F: main (example2.c:5)
What this tells us is that we took a pointer to a block with room for 10 bytes and wrote outside that range -- consequently, we have an 'Invalid write'. If we were to try to read from that memory, we'd be alerted to an 'Invalid read of size X', where X is the number of bytes we try to read. (For a char, it'll be one, and for an int, it would be either 2 or 4, depending on your system.) As usual, Valgrind prints the stack trace of function calls so that we know exactly where the error occurs.

Detecting The Use Of Uninitialized Variables

Another type of operation that Valgrind will detect is the use of an uninitialized value in a conditional statement. Although you should be in the habit of initializing all variables that you create, Valgrind will help find those cases where you don't. For instance, running the following code as example3
#include <stdio.h>

int main()
{
    int x;
    if(x == 0)
    {
        printf("X is zero"); /* replace with cout and include 
                                iostream for C++ */
    }
    return 0;
}
through Valgrind will result in
==17943== Conditional jump or move depends on uninitialised value(s)
==17943==    at 0x804840A: main (example3.c:6)
Valgrind is even smart enough to know that if a variable is assigned the value of an uninitialized variable, the new variable is in an "uninitialized" state too. For instance, running the following code:
#include <stdio.h>

void foo(int x) /* returns nothing, so it is declared void */
{
    if(x < 10)
    {
        printf("x is less than 10\n");
    }
}

int main()
{
    int y; /* deliberately left uninitialized */
    foo(y);
    return 0;
}
in Valgrind as example4 results in the following warning:
==4827== Conditional jump or move depends on uninitialised value(s)
==4827==    at 0x8048366: foo (example4.c:5)
==4827==    by 0x8048394: main (example4.c:14)
You might think that the problem was in foo, and that the rest of the call stack probably isn't that important. But since main passes in an uninitialized value to foo (we never assign a value to y), it turns out that that's where we have to start looking and trace back the path of variable assignments until we find a variable that wasn't initialized.

This will only help you if you actually test that branch of code, and in particular, that conditional statement. Make sure to cover all execution paths during testing!

What Else Will Valgrind Find?

Valgrind will detect a few other improper uses of memory: if you call free twice on the same pointer value, Valgrind will detect this for you; you'll get an error:
Invalid free()
along with the corresponding stack trace.

Valgrind also detects improperly chosen methods of freeing memory. For instance, in C++ there are three basic options for freeing dynamic memory: free, delete, and delete[]. The free function should only be matched with a call to malloc rather than a call to, say, new -- on some systems, you might be able to get away with mixing them, but it's not very portable. Moreover, delete should only be paired with new (for allocation of single objects), and delete[] should only be paired with new[] (for allocation of arrays). (Though some compilers will allow you to get away with using the wrong version of delete, there's no guarantee that all of them will. It's just not part of the standard.)

If you do trigger one of these problems, you'll get this error:
  Mismatched free() / delete / delete []
which really should be fixed even if your code happens to be working.

What Won't Valgrind Find?

Valgrind doesn't perform bounds checking on fixed-size arrays declared inside a function (these are allocated on the stack), nor on statically allocated arrays. So if you declare an array inside your function:
int main()
{
    char x[10];
    x[11] = 'a';
}
then Valgrind won't alert you! One possible solution for testing purposes is simply to change your fixed-size arrays into dynamically allocated memory taken from the heap, where you will get bounds-checking, though this could leave you with a mess of unfreed memory unless you remember to free each one.

A Few More Caveats

What's the drawback of using Valgrind? It's going to consume more memory -- up to twice as much as your program normally does. If you're testing an absolutely huge memory hog, you might have issues. It's also going to take longer to run your code when you're using Valgrind to test it. This shouldn't be a problem most of the time, and it only affects you during testing. But if you're running an already slow program, this might affect you.

Finally, Valgrind isn't going to detect every error you have -- if you don't test for buffer overflows by using long input strings, Valgrind won't tell you that your code is capable of writing over memory that it shouldn't be touching. Valgrind, like any other tool, needs to be used intelligently as a way of illuminating problems.

Summary

Valgrind is a tool for the x86 and AMD64 architectures and currently runs under Linux. Valgrind allows the programmer to run the executable inside its own environment, in which it checks for unpaired calls to malloc, other uses of invalid memory (such as uninitialized memory), and invalid memory operations (such as freeing a block of memory twice or calling the wrong deallocator function). Valgrind does not bounds-check arrays allocated on the stack or statically.
