pthread_cond_timedwait issues?



Marco Alanen
09-11-2007, 08:37 AM
Are there any known issues with pthread_cond_timedwait returning far too early in TotalView 8.2 and 8.3B? My program happily waits for X seconds when running outside of TotalView, but inside TotalView it instantly claims that the condition didn't trigger in time (I tried timeouts from 1 to 50 seconds).

If any further information is needed I can provide it, and possibly a small test case if I can get one up and running with the same problem.

Edit: This is on Linux x86 2.6.22-11 with gcc 4.1.2 and glibc 2.4 using NPTL.

Josh-TotalView-Tech
09-12-2007, 07:08 AM
TotalView, like other debuggers, uses signals (SIGCONT, SIGSTOP) to do things like halt and continue target threads. Some system calls will return an error with errno=EINTR if the thread is interrupted by a signal. The pthread library makes system calls, and it is possible that one of those calls is getting interrupted. The NPTL man page for pthread_cond_timedwait (i.e. man 3p pthread_cond_timedwait) states the following:


If a signal is delivered to a thread waiting for a condition variable, upon return from the signal
handler the thread resumes waiting for the condition variable as if it was not interrupted, or it shall
return zero due to spurious wakeup.


So I guess what happens when a waiting thread gets interrupted by a signal isn't fully defined: the call to pthread_cond_timedwait either keeps waiting or it returns early :(. So what can you do about it? I would suggest checking abstime against the current time of day when the call returns. If that time has not yet passed, call pthread_cond_timedwait again.

You may also want to see this related post: http://forum.totalviewtech.com/gforum.cgi?post=44

One other thing worth noting: if this is in fact the issue, I think it is considered a bug in the application. The same sort of issue can occur if you use standard shell job control functions.

I hope this helps.

Marco Alanen
09-12-2007, 07:14 AM
Thanks for the reply. I'll have to make sure that I haven't made any obvious errors (I can't get a small test example to behave the same way in TV) and then I'll start experimenting with different kernel and glibc versions. I can't recall having this problem before (in TV <=8.2), so it must be due to some recent change in my system.

At least I know where to start digging now. Thanks!

Josh-TotalView-Tech
09-12-2007, 07:19 AM
I would suggest trying your small test case by running it at a shell prompt. Use ^Z to suspend it, which sends a SIGTSTP signal to the process, then use the fg command, which sends a SIGCONT to the process, and see what happens with the pthread_cond_timedwait call.