confidence interval for standard deviation?

hcrisp

10-29-2009, 08:57 AM

The PV-WAVE: IMSL Statistics Reference states that I can use ANOVA1 to determine the confidence intervals on all pairwise differences of means (using one of six methods). How would I go about computing the confidence intervals for the pairwise differences of standard deviations?

totallyunimodular

10-30-2009, 09:15 AM

I do not believe there are any routines in PV-WAVE that can directly get these results. That being said, I don't think this functionality is available in most software either. I think you'll need to research and apply Levy's method for multiple comparisons on statistics other than the mean. For example, see this link (http://epm.sagepub.com/cgi/content/abstract/55/5/795) and this link (http://cat.inist.fr/?aModele=afficheN&cpsidt=3663210).

hcrisp

11-12-2009, 03:35 PM

Another question along the same lines:

SIMPLESTAT returns the following:

result(12): lower confidence limit for the variance (assuming normality)

result(13): upper confidence limit for the variance (assuming normality)

Are these the same as the lower/upper confidence limits for the standard deviation? I didn't know if they were since variance = (standard deviation)^2. If they are not, how can I go about getting them for standard deviation?

hcrisp

01-07-2010, 03:00 PM

In case anyone is interested, here is how you can get the confidence interval for standard deviation. I did conclude that the CI for the standard deviation is not the same as the CI for the variance.

npts = 30

x = RANDOMN(s, npts)

; get mean confidence interval

df = npts - 1

one_minus_alpha = 0.95 ; 95% Confidence Interval

alpha = 1. - one_minus_alpha

prob = 1. - alpha / 2.

x_sigma = STDEV(x, x_mean)

t_ahalf = TCDF(prob, df, /INVERSE, /DOUBLE)

error = t_ahalf * x_sigma / SQRT(npts)

mean_ci_low = x_mean - error

mean_ci_high = x_mean + error

info, mean_ci_low, mean_ci_high

; get standard deviation confidence interval

chi_sq_rt = CHISQCDF(prob, df, /INVERSE, /DOUBLE)

chi_sq_lt = CHISQCDF(1-prob, df, /INVERSE, /DOUBLE)

stdev_ci_low = SQRT((df * x_sigma^2) / chi_sq_rt)

stdev_ci_high = SQRT((df * x_sigma^2) / chi_sq_lt)

info, stdev_ci_low, stdev_ci_high

; compare to SIMPLESTAT

res = SIMPLESTAT(x)

info, res(10) ; mean_ci_low

info, res(11) ; mean_ci_high

info, res(12) ; var_ci_low

info, res(13) ; var_ci_high
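
For readers who want to reproduce the chi-square interval outside PV-WAVE, here is a minimal Python sketch that uses only the standard library. The helper names (chi2_cdf, chi2_quantile, variance_and_stdev_ci) are made up for this example and are not PV-WAVE or IMSL routines. The quantile is found by bisection on a series expansion of the CDF, which should be adequate for the modest degrees of freedom discussed here, and it says nothing about how CHISQCDF is implemented. It assumes the sample is Normal and that the sample variance uses the n-1 divisor.

```python
import math
import random
import statistics


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a = df / 2.0
    z = x / 2.0
    # P(a, z) = z^a e^-z / Gamma(a+1) * sum_{k>=0} z^k / ((a+1)...(a+k))
    term = 1.0
    total = 1.0
    k = 1
    while term > 1e-16 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))


def chi2_quantile(p, df):
    """Invert chi2_cdf by bisection; p must lie strictly between 0 and 1."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < p:  # widen the bracket until it contains p
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def variance_and_stdev_ci(sample, confidence=0.95):
    """Two-sided CIs for the variance and the standard deviation (Normal data)."""
    df = len(sample) - 1
    alpha = 1.0 - confidence
    s2 = statistics.variance(sample)  # sample variance, n-1 divisor
    chi_hi = chi2_quantile(1.0 - alpha / 2.0, df)
    chi_lo = chi2_quantile(alpha / 2.0, df)
    var_ci = (df * s2 / chi_hi, df * s2 / chi_lo)
    # The standard deviation limits are the square roots of the variance limits.
    sd_ci = (math.sqrt(var_ci[0]), math.sqrt(var_ci[1]))
    return var_ci, sd_ci


random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(30)]
var_ci, sd_ci = variance_and_stdev_ci(x)
print("variance CI:", var_ci)
print("stdev CI:   ", sd_ci)
```

The random sample here will not match the PV-WAVE session, since RANDOMN and Python's generator produce different draws; only the method is comparable.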

totallyunimodular

01-08-2010, 09:27 AM

Thanks for posting this! Can you say more about your comment:

"I did conclude that the CI for the standard deviation is not the same as the CI for the variance."

It looks to me like the squares of the standard deviation limits you computed manually are the same as the variance limits returned by SIMPLESTAT...

WAVE> info, res(10) ; mean_ci_low

<Expression> FLOAT = -0.382960

WAVE> info, res(11) ; mean_ci_high

<Expression> FLOAT = 0.409303

WAVE> info, res(12) ; var_ci_low

<Expression> FLOAT = 0.713816

WAVE> info, res(13) ; var_ci_high

<Expression> FLOAT = 2.03385

WAVE> pm, stdev_ci_low^2

0.71381576

WAVE> pm, stdev_ci_high^2

2.0338474
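
This agreement is what the chi-square construction predicts. As a sketch, assume the data are Normal and write s for the sample standard deviation, n for the sample size, and \chi^2_{p,\,n-1} for the p-quantile of the chi-square distribution with n-1 degrees of freedom (notation introduced here, not taken from the PV-WAVE documentation). Because the square root is monotone increasing on positive numbers, the variance interval and the standard deviation interval describe the same event:

```latex
\begin{align*}
1-\alpha
  &= P\!\left( \chi^2_{\alpha/2,\,n-1} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{1-\alpha/2,\,n-1} \right) \\
  &= P\!\left( \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \right) \\
  &= P\!\left( \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}}} \right)
\end{align*}
```

In the posted code, chi_sq_rt plays the role of the 1-\alpha/2 quantile and chi_sq_lt the \alpha/2 quantile, so the standard deviation limits are the square roots of the variance limits. The two intervals differ numerically but carry the same information.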

I think an important caveat here is gleaned from the output of SIMPLESTAT: the limits "assume normality" because the scaled sample variance, (n-1)s^2/sigma^2, follows a Chi-square distribution with n-1 degrees of freedom only if the underlying sample is Normal. Referring to the original question about ANOVA, though, the F-test in an ANOVA is generally robust to departures from Normality, although the test is then no longer the "most powerful". But I am still not sure what the correct technique is for estimating confidence intervals for pairwise differences between standard deviations.

Thanks again for posting your code!

hcrisp

01-11-2010, 08:29 AM

Ah, thanks! I had forgotten to compare the squares of the standard deviation limits to the variance limits calculated by SIMPLESTAT. With that additional step, the two agree for n = 30.

As to the assumption of normality, I can live with that, although I may have sample sizes less than 30. In that case, I should use the Student-t distribution, not the normal distribution. Unfortunately, the PV-WAVE documentation does not say which distribution is used in its algorithm for small sample sizes, only that the confidence limits "assume normality". Empirical tests do show that the calculated limits are equal to the results of my code, however, so it may be using the Student-t after all.

npts = 15

x = RANDOMN(s, npts)

res = SIMPLESTAT(x)

info, res(10) ; mean_ci_low

;<Expression> FLOAT = -0.488818

info, res(11) ; mean_ci_high

;<Expression> FLOAT = 0.326370

info, SQRT(res(12)) ; stdev_ci_low

;<Expression> FLOAT = 0.538860

info, SQRT(res(13)) ; stdev_ci_high

;<Expression> FLOAT = 1.16078
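
The n = 15 output above can be used to check which distribution the mean interval is based on. The Python sketch below (Python only so the arithmetic can be run without PV-WAVE) recovers the sample standard deviation from the two standard deviation limits, then compares the observed half-width of the mean interval with what a Student-t and a Normal (z) multiplier would give. The quantiles are standard table values for 14 degrees of freedom entered by hand, an assumption of this example rather than anything returned by PV-WAVE.

```python
import math

# Values transcribed from the n = 15 session output in this thread.
n = 15
mean_ci_low, mean_ci_high = -0.488818, 0.326370
stdev_ci_low, stdev_ci_high = 0.538860, 1.16078

# Standard table quantiles (assumed here, not computed by PV-WAVE).
CHI2_0975_DF14 = 26.1189  # chi-square, p = 0.975, df = 14
CHI2_0025_DF14 = 5.6287   # chi-square, p = 0.025, df = 14
T_0975_DF14 = 2.1448      # Student-t, p = 0.975, df = 14
Z_0975 = 1.9600           # standard Normal, p = 0.975

df = n - 1

# Invert stdev_ci = sqrt(df * s^2 / chi2) to recover s from each limit.
s_from_low = stdev_ci_low * math.sqrt(CHI2_0975_DF14 / df)
s_from_high = stdev_ci_high * math.sqrt(CHI2_0025_DF14 / df)
s = 0.5 * (s_from_low + s_from_high)

observed_half_width = 0.5 * (mean_ci_high - mean_ci_low)
t_half_width = T_0975_DF14 * s / math.sqrt(n)
z_half_width = Z_0975 * s / math.sqrt(n)

print(f"s recovered from the limits: {s_from_low:.4f}, {s_from_high:.4f}")
print(f"observed half-width:   {observed_half_width:.4f}")
print(f"Student-t half-width:  {t_half_width:.4f}")
print(f"Normal (z) half-width: {z_half_width:.4f}")
```

Under these assumptions the Student-t half-width comes out at roughly 0.408, in line with the observed value, while the z-based one is roughly 0.372. That is consistent with SIMPLESTAT using the Student-t distribution for the mean interval. "Assuming normality" and "using Student-t" are not alternatives: the t interval is what follows when the data are Normal and the variance is estimated from the sample, and the variance limits rely on the chi-square distribution at any sample size.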
