Login
 

Maze for Linux

  • Increase font size
  • Default font size
  • Decrease font size

Detecting a race condition

In this simple example the main thread creates two threads, waits while each thread increments the counter count by 1, and prints the final value of the counter.

#include <assert.h>
#include <stdio.h>
#include <pthread.h>

static void * increment(void *);


int main()
{
int count = 0; // initialize the counter

const int num_threads = 2;
pthread_t tid[num_threads];
int error;
int i;

for(i = 0; i < num_threads; i++) // create all threads
{
error = pthread_create(&tid[i], 0, &increment, (void *)(&count));
assert(error == 0);
}

for(i = 0; i < num_threads; i++) // wait for every thread
{
error = pthread_join(tid[i], NULL);
assert(error == 0);
}

printf("%d\n", count); // print the counter

return 0;
}

static void * increment(void * count)
{
(*((int *)count))++; // increment the counter

return NULL;
}

Compile the code and run it several times:

$ gcc -g race_condition.c -o race_condition -lpthread

$ ./race_condition

In all probability the value printed by race_condition is 2 every time. Indeed, while main thread is busy creating the second thread, the first thread has a plenty of time to increment the counter. By the time the second thread gets to the counter, its value is already 1.

The 2nd thread reads counter after its value has being incremented


Yet there is still a possibility that the second thread reads the counter before the first thread assigns it the incremented value. In this case the final value printed by race_condition must be 1.

The 2nd thread reads counter before its value has being incremented

 

Maze must be able to reveal this behavior.

Let maze test the program with 5000 TEST mode runsTip and examine the output.

$ maze ./race_condition -r 5000 > /dev/nullTip &


If maze pid was, for instance, 315, there are 5000 ASCII files in the directory .maze/315/: 1.out, 2.out, ..., 5000.out, containing count values printed by race_condition during the corresponding runs.

Most of the files contain 2 – the most probable counter value. Check if any count values are different:

$ grep -v 2 .maze/315/*.out

The result may look like this:

.maze/315/1392.out:1
.maze/315/2613.out:1

 

 

In other words, in the runs #1392 and #2613 the final value of the counter was 1.

At this point a reasonable question to ask is “Why bother with maze, instead of simply running the program 5000 times?” The most important reasons are:

  • In real life the thread scheduling is affected by the threads' priorities and the system-specific features, such as the number of CPU on your machine. This bias may further reduce the probability of the corner case eventsTip. Maze generates diverse and unbiased schedule.
  • If a process blocks (i. e. “hangs”) during a regular run, the user has to address the problem before proceeding further. This is not very practical, especially for the overnight regression runs. Maze does not block when the application process does. Instead, maze prints the diagnostic information and proceeds with the next run.
  • Even if a problem is detected during a “regular” run, there is no reliable way to reproduce it on request. Maze can reproduce the exact behavior of the process during every single run.


As a demonstration, reproduce run #1392 from the TEST mode session.

$ maze -R 1392 -p 315

The result is the same:
.maze/315/1392.new_pid.out is identical to .maze/315/1392.out, with the counter value equal to 1.