Workaround failure with EINTR



acheroua
12-06-2006, 06:02 AM
Hi,

platform:

# uname -a
Linux wokingham 2.6.9-42.EL #1 SMP Wed Jul 12 23:25:09 EDT 2006 ia64 ia64 ia64 GNU/Linux
totalview: v7.2.0


I am trying to use TotalView to debug memory in an application, but the application exits on an unhandled EINTR failure once TotalView has stopped the process and I issue the "dgo" command to resume execution.

The code where the EINTR issue occurs is in the core of an application server, not in our custom code. To work around this issue, we have implemented an interceptor library (libfixaccept.so.1) which catches and retries accept calls that return EINTR (fixaccept.c source code copied below).

Before starting the application and Totalview, we set the following in the environment:

export LD_PRELOAD=/<fullpath>/libfixaccept.so.1


1- Application started
The library libfixaccept.so.1 is doing its job, since debug messages from fixaccept.c such as the following are sent to stderr:
fixaccept: accept(5, 0x200000020b2fd800, 187686624)
2- Start TotalView
3- Bring the process under TotalView control

d1.<> dattach <executable> <process id>

4- TotalView pauses the process
d1.<> dstatus
1 (1303) Stopped [/opt/DynamicEngine]
1.1 (1303/2305843009226806688) Stopped PC=0xa000000000010641
1.2 (1303/14566814304) Stopped PC=0xa000000000010641
...
...
1.25 (1303/9479123552) Stopped PC=0xa000000000010641
1.26 (1303/9468637792) Stopped PC=0xa000000000010641
1.27 (1303/9424597600) Stopped PC=0xa000000000010641

5- Use dgo to resume execution
d1.<> dgo
Process 1 has exited

So, after the process resumes, it seems that the accept() from libfixaccept.so.1 is no longer used and the one from the standard library is called instead.

The questions:

Is there any explanation for why the library that overrides accept() is no longer the one called, and the standard library version is used instead?
Does TotalView take the LD_PRELOAD environment variable (and others) into account?

Please note that this occurs with both versions of TotalView (command line and GUI).

Thanks.

# cat fixaccept.c

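#include <stdio.h>
#include <stdlib.h>   /* abort() */
#include <errno.h>
#include <dlfcn.h>    /* dlsym(), RTLD_NEXT (needs _GNU_SOURCE, set in the makefile) */
#include <sys/types.h>
#include <sys/socket.h>

typedef int (*ACCEPT_FUNC)(int s, struct sockaddr *addr, socklen_t *addrlen);
static ACCEPT_FUNC _map_accept;   /* next accept() in library search order */

/* Interposed accept(): forwards to the real accept() and retries on EINTR. */
int accept(int s, struct sockaddr *addr, socklen_t *addrlen)
{
    int ret;

    /* Look up the real accept() on first use. */
    if (_map_accept == 0)
    {
        _map_accept = (ACCEPT_FUNC)dlsym(RTLD_NEXT, "accept");
    }
    if (_map_accept == 0)
    {
        fprintf(stderr, "fixaccept: dlsym failed: %s\n", dlerror());
        abort();
    }

#if defined(SW_DEBUG)
    /* addr and addrlen are pointers, so print them with %p. */
    fprintf(stderr, "fixaccept: accept(%d, %p, %p)\n", s, (void *)addr, (void *)addrlen);
#endif

    while (1)
    {
        ret = _map_accept(s, addr, addrlen);
        if (ret < 0 && errno == EINTR)
        {
#if defined(SW_DEBUG)
            fprintf(stderr, "fixaccept: accept returned EINTR\n");
#endif
            continue;   /* interrupted by a signal: retry the call */
        }
        break;
    }

    return ret;
}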


# cat makefile
all: libfixaccept.so.1

libfixaccept.so.1: fixaccept.c
	gcc -pthread -g -DSW_DEBUG=1 -fPIC \
		-D_REENTRANT -D_POSIX_PTHREAD_SEMANTICS -D_GNU_SOURCE \
		fixaccept.c -c
	gcc -shared -Wl,--enable-new-dtags -pthread fixaccept.o \
		-ldl -lc -o libfixaccept.so
	mv libfixaccept.so libfixaccept.so.1

Josh-TotalView-Tech
12-08-2006, 07:24 AM
This should certainly work. TotalView should recognize all libraries that are linked, preloaded, or opened dynamically. You can confirm this by checking that the library is visible in the Program Browser (Tools > Program Browser). Also, if you do an 'f' on accept (View > Lookup Function), it should bring up your source code.

Also, whether TotalView recognizes the library should be independent of what is actually called. When you run the program (i.e. do a dgo), TotalView does not interfere with the runtime bindings.
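One way to check, independently of the debugger, that the preloaded library is actually mapped into the attached process is to look at its /proc/<pid>/maps file. The following is only a minimal sketch, assuming a Linux /proc filesystem; the function name maps_file_mentions is made up for illustration and is not part of TotalView or of fixaccept.c.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if any line of the maps file at maps_path contains needle,
 * 0 if no line does, and -1 if the file cannot be opened.
 * maps_path is assumed to look like "/proc/<pid>/maps" on Linux.
 * (Hypothetical helper, for illustration only.)
 */
int maps_file_mentions(const char *maps_path, const char *needle)
{
    char line[4096];
    int found = 0;
    FILE *fp = fopen(maps_path, "r");

    if (fp == NULL)
        return -1;

    /* Each mapping is one line; the backing file's path comes last.
     * A line longer than the buffer is read in pieces, so a name that
     * straddles a piece boundary would be missed. */
    while (fgets(line, sizeof line, fp) != NULL) {
        if (strstr(line, needle) != NULL) {
            found = 1;
            break;
        }
    }

    fclose(fp);
    return found;
}
```

For the process in the dstatus output above, the call would be maps_file_mentions("/proc/1303/maps", "libfixaccept.so.1"). A result of 1 only shows that the library is loaded; it does not prove which accept() a given call site is bound to.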

I am speculating that one of the following two things is occurring:

1. accept really is being interposed. The best way to find out is to set a breakpoint on accept.

2. LD_PRELOAD is not set in the environment where TotalView was invoked and you restart the program. In this case the LD_PRELOAD setting will be lost and if you want to restart the program (after the attachment) you will need to set LD_PRELOAD in the Process > Startup Parameters dialog.
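To rule out the second case, it can help to check whether LD_PRELOAD, as seen by a given environment, really names the interceptor library. Below is a small sketch under stated assumptions: the helper name preload_lists_library is hypothetical, and it only compares file names of the entries, which the dynamic loader accepts separated by spaces or colons.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the LD_PRELOAD-style list in preload contains an entry
 * whose file name (the part after the last '/') equals libname,
 * otherwise 0. Entries may be separated by spaces or colons.
 * (Hypothetical helper, for illustration only.)
 */
int preload_lists_library(const char *preload, const char *libname)
{
    size_t want;
    const char *p = preload;

    if (preload == NULL || libname == NULL || libname[0] == '\0')
        return 0;
    want = strlen(libname);

    while (*p != '\0') {
        size_t len, i, base_len;
        const char *base;

        p += strspn(p, " :");        /* skip separators */
        len = strcspn(p, " :");      /* length of the next entry */
        if (len == 0)
            break;

        /* Find where the file name starts inside this entry. */
        base = p;
        for (i = 0; i < len; i++) {
            if (p[i] == '/')
                base = p + i + 1;
        }
        base_len = len - (size_t)(base - p);

        if (base_len == want && memcmp(base, libname, want) == 0)
            return 1;

        p += len;
    }
    return 0;
}
```

A caller would typically pass getenv("LD_PRELOAD") (from <stdlib.h>) as the first argument. Note that this inspects the environment of the calling process, for example a small helper started from the same shell as TotalView, not the environment of the process being debugged.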

Does this help?

BTW, apart from the fact that you want to use the memory debugger, I'm not sure how this question relates to memory debugging, except that TotalView itself uses LD_PRELOAD to interpose the memory manager of your program.

acheroua
12-08-2006, 07:42 AM
Thanks for the reply.
Actually, the interceptor library should have wrapped both accept() and poll(); the accept() wrapper was there, but the poll() one was not.
Now it's OK. The original problem was resolved in the core.
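For readers with the same problem, the retry logic for poll() can follow the same pattern as the accept() wrapper in fixaccept.c. The updated library was not posted, so what follows is only a sketch of the idea: the names POLL_FUNC and poll_retry_eintr are assumptions, and in an interposer the real_poll pointer would presumably come from dlsym(RTLD_NEXT, "poll"), as is done for accept.

```c
#include <errno.h>
#include <poll.h>

/* Same shape as ACCEPT_FUNC in fixaccept.c, but for poll(). */
typedef int (*POLL_FUNC)(struct pollfd *fds, nfds_t nfds, int timeout);

/*
 * Call real_poll and repeat the call for as long as it fails with EINTR.
 * Caveat: each retry passes the full timeout again, so the total wait
 * can exceed the requested timeout when signals keep arriving.
 * (Hypothetical helper, for illustration only.)
 */
int poll_retry_eintr(POLL_FUNC real_poll, struct pollfd *fds,
                     nfds_t nfds, int timeout)
{
    int ret;

    do {
        ret = real_poll(fds, nfds, timeout);
    } while (ret < 0 && errno == EINTR);

    return ret;
}
```

An interposed poll() in the library would then just return poll_retry_eintr(real_poll_pointer, fds, nfds, timeout), where real_poll_pointer stands for whatever variable holds the looked-up function.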