# Thread: Runtime issue with UMIAH

1. **Runtime issue with UMIAH**

I have written a Fortran 90 program that calculates maximum likelihood estimates for the parameters of a specified distribution using the UMIAH subroutine. I simulate my data from known parameters and generate 10,000 samples (with 2,000 bootstrap samples for each main sample).

For the most part, everything works well. However, some of my samples (I believe primarily the bootstrap samples) cause the UMIAH subroutine to hang, and the program becomes unresponsive. This continues until I either cancel the program (when running interactively) or exceed the amount of time allocated to the job (when running in batch mode).

Is there any way to determine what is causing the program to hang up? I have determined that the particular sample that first causes this issue varies based on the initial values that I input into the program. For example, if I specify that all initial estimates are 50% of the actual value, then the program hangs up on sample 25, but if I change the initial estimates to equal the true values, then the program hangs up on sample 4987.

Any help would be appreciated.

Thanks.

2. The speed of an optimization calculation depends mainly on three things: the nature of the function being minimized, the quality of the initial estimate of the minimizer, and the properties of the algorithm being used.

Since UMIAH makes repeated calls to your functions, a problem in your function, such as a memory leak, can cause behavior that matches what you described.

The first step in debugging is to arrange to run, say, sample 4987 as the first and only sample, with extra diagnostic printout enabled. To do so, note the state of the RNG at the moment sample 4987 is created in your present code, then start the debugging run from that same state.
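If you are using the IMSL random number generators, the state can be captured with RNGET and restored with RNSET. A minimal sketch of the idea (the loop body, `generate_and_fit`, and the interface module name are placeholders for your own code; check your IMSL version's manual for the exact module to USE):

```fortran
! Sketch: record the RNG seed before every sample so a failing sample
! can later be replayed in isolation with extra diagnostics.
program replay_sample
  use numerical_libraries          ! or your IMSL version's interface module
  implicit none
  integer :: iseed, isample
  integer, parameter :: nsamp = 10000

  call rnset(123457)               ! fixed initial seed for reproducibility
  do isample = 1, nsamp
     call rnget(iseed)             ! seed *before* this sample is generated
     write (*,*) 'sample', isample, 'seed', iseed
     ! call generate_and_fit(isample)   ! your simulation + UMIAH call
  end do
  ! To replay, say, sample 4987 alone: call RNSET with the seed that was
  ! printed for sample 4987, then call generate_and_fit once.
end program replay_sample
```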

3. Alternatively, you might be able to avoid programming the algorithm yourself by calling the IMSL subroutine MLE, recently added to chapter 17 of IMSL Stat, if it supports the distribution(s) you are interested in.

MLE: Calculates maximum likelihood estimates for the parameters of one of several univariate probability distributions.

hth,
-w

4. mecej4,

I appreciate the comments. I modified my code so that it outputs both the current value of the seed and the data set just before the hang. When I recompiled the program and used the final value of the seed from the first attempt as the new initial value, the program continued to run for a little while before experiencing the same problem. I believe this supports your assertion that it is some sort of problem with the underlying subroutines, rather than with the particular data set.

I did some internet searches on "Fortran 90" and "memory leaks," and the only common topic that appeared related to the use of allocatable arrays, which I am not using. So the problem must lie elsewhere.

As a side note that may or may not be related, I am employing at least one bad programming practice (per my now somewhat dated programming notes) in that I use global variables to pass the data into the subroutines. Unlike the well-defined Rosenbrock function given as an example in the reference material, I am trying to solve maximum likelihood equations. Therefore, the function I am minimizing is a function of the data, and it changes with every random sample that I generate. Might this be the problem? Alternatively, is there a better way to program this?

Thanks!

5. omega,

I am currently using version 6.0, which doesn't appear to have the MLE subroutine. Either way, my problem is a little bit more difficult. The underlying distribution is a Weibull distribution (so no problem there), but it is an accelerated degradation model where the shape parameter is constant, but the scale parameter is a function of both time and stress.

I appreciate your help. I'll have to look at the documentation and see if there are any other changes in the new version that might help me out.

6. You may also consider experimenting with IMSL optimization routines other than UMIAH. Although the alternatives may provide more functionality than you require (e.g., bounds and constraints), a different minimization algorithm may provide some insight into the source of the hang you are seeing.

Alternatives to consider include LCONF/LCONG and BCONF/BCONG/BCODH/BCOAH. Note that if you do not have bounds or constraints, you can either specify very generous bounds or set the number of bounds and constraints to zero when calling the optimization routine(s).

hth,
-w

7. Originally Posted by steven.alferink
mecej4,

When I recompiled the program and used the final value of the seed from the first attempt as the new initial value, the program continued to run for a little while before experiencing the same problem.
If in the "checkpoint/restart" run the freeze-up did not occur at the same program state as in the original run, I suspect that there are array overruns or usage of uninitialized variables.

Are you able to post the mathematical description and the source code for the function being minimized?

As a side note that may or may not be related, I am employing at least one bad programming practice (per my now somewhat dated programming notes) in that I use global variables to pass the data into the subroutines. Unlike the well-defined Rosenbrock function given as an example in the reference material, I am trying to solve maximum likelihood equations. Therefore, the function I am minimizing is a function of the data, and it changes with every random sample that I generate. Might this be the problem? Alternatively, is there a better way to program this?

Thanks!
I don't understand how you have global variables in a Fortran program. As to the other questions, I am unable to respond until I understand the problem better.

8. mecej4,

This code is a little bit long, but I'll go ahead and post it. Looks like it exceeds the number of characters for a message...

Code:
```!
! ARHN_MLE_NEG_LOGLF
!

subroutine arhn_mle_neg_loglf(num_parm,mle_theta,neg_loglf)

! Define the output variables

double precision, intent(out)     :: neg_loglf

! Define the input variables

integer, intent(in)               :: num_parm
double precision, intent(in)      :: mle_theta(num_parm)

! Define the local variables

double precision                  :: loglf
double precision                  :: loc_alpha(0:2)
double precision                  :: loc_beta
double precision                  :: loc_theta(1:num_ntot)

! Calculate the local variables

loc_alpha(0:2) = mle_theta(1:3)
loc_beta  = mle_theta(4)
loc_theta = exp(loc_alpha(0) - loc_alpha(1) * vect_t * exp(loc_alpha(2) * vect_v))

! Calculate the log likelihood function

loglf = (sum(n_dp) * log(loc_beta)) - (loc_beta * sum(loc_alpha(0) - loc_alpha(1) * vect_t * exp(loc_alpha(2) * vect_v))) + &
& ((loc_beta - 1d0) * sum(log(vect_x))) - (sum((vect_x / loc_theta)**loc_beta))

! Calculate the negative log likelihood function

neg_loglf = -loglf

endsubroutine arhn_mle_neg_loglf

!
! ARHN_MLE_NEG_LOGLF_1DER
!

subroutine arhn_mle_neg_loglf_1der(num_parm,mle_theta,neg_loglf_1der)

! Define the output variables

double precision, intent(out)     :: neg_loglf_1der(num_parm)

! Define the input variables

integer, intent(in)               :: num_parm
double precision, intent(in)      :: mle_theta(num_parm)

! Define the local variables

double precision                  :: loglf_1der(num_parm)
double precision                  :: loc_alpha(0:2)
double precision                  :: loc_beta
double precision                  :: loc_theta(1:num_ntot)

! Calculate the local variables

loc_alpha(0:2) = mle_theta(1:3)
loc_beta  = mle_theta(4)
loc_theta = exp(loc_alpha(0) - loc_alpha(1) * vect_t * exp(loc_alpha(2) * vect_v))

! Calculate the first order partial derivatives of the log likelihood function

loglf_1der(1) = -loc_beta * sum(n_dp) + loc_beta * sum((vect_x / loc_theta)**loc_beta)
loglf_1der(2) =  loc_beta * sum(vect_t * exp(loc_alpha(2) * vect_v) * (1d0 - (vect_x / loc_theta)**loc_beta))
loglf_1der(3) =  loc_beta * sum(vect_t * exp(loc_alpha(2) * vect_v) * (1d0 - (vect_x / loc_theta)**loc_beta) * loc_alpha(1) * vect_v)
loglf_1der(4) =  sum((1d0 / loc_beta) + log(vect_x / loc_theta) * (1d0 - (vect_x / loc_theta)**loc_beta))

! Calculate the first order partial derivatives of the negative log likelihood function

neg_loglf_1der = -loglf_1der

endsubroutine arhn_mle_neg_loglf_1der

!
! ARHN_MLE_NEG_LOGLF_2DER
!

subroutine arhn_mle_neg_loglf_2der(num_parm,mle_theta,neg_loglf_2der,LDH)

! Define the output variables

double precision, intent(out)     :: neg_loglf_2der(LDH,num_parm)

! Define the input variables

integer, intent(in)               :: LDH
integer, intent(in)               :: num_parm
double precision, intent(in)      :: mle_theta(num_parm)

! Define the local variables

double precision                  :: loglf_2der(LDH,num_parm)
double precision                  :: loc_alpha(0:2)
double precision                  :: loc_beta
double precision                  :: loc_theta(1:num_ntot)

! Calculate the local variables

loc_alpha(0:2) = mle_theta(1:3)
loc_beta  = mle_theta(4)
loc_theta = exp(loc_alpha(0) - loc_alpha(1) * vect_t * exp(loc_alpha(2) * vect_v))

! Calculate the second order partial derivatives of the log likelihood function

loglf_2der(1,1) = -(loc_beta**2) * sum((vect_x / loc_theta)**loc_beta)
loglf_2der(1,2) =  (loc_beta**2) * sum(vect_t * exp(loc_alpha(2) * vect_v) * (vect_x / loc_theta)**loc_beta)
loglf_2der(1,3) =  (loc_beta**2) * sum(vect_t * exp(loc_alpha(2) * vect_v) * loc_alpha(1) * vect_v * (vect_x / loc_theta)**loc_beta)
loglf_2der(1,4) = -sum(n_dp) + sum((1d0 + loc_beta * log(vect_x / loc_theta)) * (vect_x / loc_theta)**loc_beta)
loglf_2der(2,1) =  loglf_2der(1,2)
loglf_2der(2,2) = -(loc_beta**2) * sum((vect_t * exp(loc_alpha(2) * vect_v))**2 * (vect_x / loc_theta)**loc_beta)
loglf_2der(2,3) = -(loc_beta**2) * sum((vect_t * exp(loc_alpha(2) * vect_v))**2 * loc_alpha(1) * vect_v * (vect_x / loc_theta)**loc_beta) + loc_beta * sum(vect_t * vect_v * exp(loc_alpha(2) * vect_v) * (1d0 - (vect_x / loc_theta)**loc_beta))
loglf_2der(2,4) =  sum((vect_t * exp(loc_alpha(2) * vect_v)) * (1d0 - (1d0 + loc_beta * log(vect_x / loc_theta)) * (vect_x / loc_theta)**loc_beta))
loglf_2der(3,1) =  loglf_2der(1,3)
loglf_2der(3,2) =  loglf_2der(2,3)
loglf_2der(3,3) = -(loc_beta**2) * sum((vect_t * exp(loc_alpha(2) * vect_v) * loc_alpha(1) * vect_v)**2 * (vect_x / loc_theta)**loc_beta) + loc_beta * sum(vect_t * exp(loc_alpha(2) * vect_v) * (1d0 - (vect_x / loc_theta)**loc_beta) * loc_alpha(1) * vect_v**2)
loglf_2der(3,4) =  sum((vect_t * exp(loc_alpha(2) * vect_v) * loc_alpha(1) * vect_v) * (1d0 - (1d0 + loc_beta * log(vect_x / loc_theta)) * (vect_x / loc_theta)**loc_beta))
loglf_2der(4,1) =  loglf_2der(1,4)
loglf_2der(4,2) =  loglf_2der(2,4)
loglf_2der(4,3) =  loglf_2der(3,4)
loglf_2der(4,4) = -sum(n_dp) / (loc_beta**2) - sum((log(vect_x / loc_theta))**2 * (vect_x / loc_theta)**loc_beta)

! Calculate the second order partial derivatives of the negative log likelihood function

neg_loglf_2der = -loglf_2der

endsubroutine arhn_mle_neg_loglf_2der```
The global variables come up because n_dp, vect_x, vect_t, and vect_v are defined in the main program, but not in the internal subroutines. I understand the manual says the subroutines need to be EXTERNAL. I tried several different approaches, but I couldn't find an efficient way to pass the data to the subroutines except by using internal subroutines with host-associated ("global") variables. In my test programs (all based on the sample problem in the reference manual), the results were the same with my method.
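For what it's worth, one common alternative to host association is to put the shared data in a module and USE it from external subroutines; those subroutines can then be passed as EXTERNAL actual arguments to UMIAH while still seeing the data. A sketch, with names mirroring the posted code (the module name `arhn_data` is made up for illustration; fixed-size arrays would work just as well as allocatables):

```fortran
! Shared data lives in a module instead of the host program.
module arhn_data
  implicit none
  integer :: num_ntot
  double precision, allocatable :: vect_x(:), vect_t(:), vect_v(:)
  double precision, allocatable :: n_dp(:)
end module arhn_data

! Now an *external* subroutine, suitable for passing to UMIAH,
! that still has access to the data via USE association.
subroutine arhn_mle_neg_loglf(num_parm, mle_theta, neg_loglf)
  use arhn_data
  implicit none
  integer, intent(in)           :: num_parm
  double precision, intent(in)  :: mle_theta(num_parm)
  double precision, intent(out) :: neg_loglf
  ! ... same body as the internal version above ...
end subroutine arhn_mle_neg_loglf
```

The main program also USEs the module, fills the arrays before each sample (main or bootstrap), and passes `arhn_mle_neg_loglf` as the EXTERNAL argument.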

As for the mathematical problem, I am attempting to solve maximum likelihood equations for a random sample of data from a Weibull population where the shape parameter is constant and the scale parameter is a function of V and T:

$
\ln(\Theta_i) = a_0 - a_1 T_i \exp(a_2 V_i)
$
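Written out, the log-likelihood that the posted code evaluates is (writing $n = \sum n_{dp}$):

$
\ell(a_0, a_1, a_2, \beta) = n \ln\beta \;-\; \beta \sum_i \ln\Theta_i \;+\; (\beta - 1) \sum_i \ln x_i \;-\; \sum_i \left( \frac{x_i}{\Theta_i} \right)^{\beta}
$

with $\ln\Theta_i$ as above; the subroutines return $-\ell$ and its derivatives for minimization.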

Hope this helps explain my problem better.

9. Originally Posted by mecej4
If in the "checkpoint/restart" run the freeze-up did not occur at the same program state as in the original run, I suspect that there are array overruns or usage of uninitialized variables.
Unfortunately, I am still trying to solve this problem. I have had a small measure of success using the BCOAH subroutine instead of UMIAH, but the program still hangs.

In a previous post, I noted that the program did not crash at the same point when I used the random seed from one execution as the input to the next execution of the program. This is not actually correct. After examining this issue further, I realized that I was using the new seed in the second program execution to generate the initial data set, not the bootstrap data set that caused the program to hang. Since the bootstrap samples are generated via resampling with replacement from residuals, and not directly from the underlying distribution, I was getting a different sample.
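In other words, the bootstrap draw consumes RNG state of its own, so the seed has to be recorded immediately before the resampling step, not just before the main sample. A sketch of that step, assuming the IMSL routines RNGET and RNUND (`resid`, `boot_resid`, `idx`, and `n` are placeholders for your variables):

```fortran
! Record the seed immediately before the bootstrap draw so that this
! exact resample can be replayed, then resample the residuals with
! replacement via uniform integer indices.
call rnget(iseed)                 ! seed that reproduces this resample
call rnund(n, n, idx)             ! n pseudorandom integers in 1..n
boot_resid(1:n) = resid(idx(1:n))
```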

Now that I have taken this into account properly, I am able to get the program to hang at the exact same sample from one execution to the next when I start with the same random seed for the data generation. I'm still not sure what that tells me, but it paints a more complete picture.

10. Originally Posted by steven.alferink
mecej4,

This code is a little bit long, but I'll go ahead and post it. Looks like it exceeds the number of characters for a message...
I looked at your code. Since it contains variables that are declared in the host, and I have not seen the declarations of these variables, I cannot comment. If you can provide a complete example that can be compiled and run, I'll take a look.

You can attach longer files instead of pasting them in-line. See the "Attach Files" box in the "Additional Options" box that is visible when posting a reply.
