Robbie
09-18-2007, 09:22 AM
Hi,
I recently received my enterprise license. After installing the valid license, I ran into some confusing problems when launching an MPI debug session with the latest TotalView, 8.2.0-1. I installed TVD and the license correctly, and TVD works well with a serial program.
My platform is ia64-linux SMP, and I use MPICH2-1.0.5. When MPICH2 was configured, the TotalView location was in the "PATH" environment variable, so the "mtv" module was compiled and linked. The following are the configure options I used for MPICH2 1.0.5:
./configure --prefix=/data2/jiangjie/mpich2-svr --with-device=ch3:ssm --enable-fast --enable-mpe=yes CC=/opt/intel/compiler91/bin/icc CFLAGS=-O3 CXX=/opt/intel/compiler91/bin/icc F77=/opt/intel/compiler91/bin/ifort FFLAGS=-O3
My test program is compiled by mpicc (which calls "icc" internally) with the "-g -O0" options. It works well with "mpirun -np 2 ./testMPI".
When I run it with "mpirun -tv -np 2 ./testMPI", a popup window appears and asks whether to stop the parallel job. After I answer "YES", three processes appear in the Root window with the following status:
B python(1 active threads)
T python<testMPI>.0(1 active threads)
T python<testMPI>.1(1 active threads)
From this we can see that the MPI job has been stopped and is controlled by TVD.
Then, if I double-click the first process (status "B"), the Process window shows that it is stopped at the function MPIR_Breakpoint in mtv.c, and the stack trace pane lists all the stack frames from "start" to "MPIR_Breakpoint".
However, when I double-click the second item in the Root window, the source code (testMPI.c) does not appear in the source pane as expected; only the assembly instructions of the selected process are shown.
Note that the executable is in the same directory as its source code. Even after I specify the source search path in "File>Search Path...", no source code appears in the source pane, so I have no means to set a breakpoint. If I press the "GO" button, the parallel job runs to completion without stopping and no debugging is possible.
However, I can open the source file using "File>Open Source..." after selecting the second item in the Root window. After this, I can set a breakpoint or a barrier point. Note that the breakpoint or barrier point takes effect only if it is set after "MPI_Init". If it is set before or on MPI_Init, the process will not stop.
Here the main questions are:
1. Why is the source file not opened automatically after the process item in the Root window has been selected? Instead, I have to open it manually.
2. It seems that the parallel job has been stopped in the middle of "MPI_Init" (most likely at "MPIR_Breakpoint"). Why does it not stop at the entry to "main" or at the first line of the main routine? Is this related to the implementation, configuration, or installation of my MPICH2? Or is it because of how I set up TVD?