Two different confidence intervals calculated by ProStat


BabcockHall

Hello Everyone,

 

I regularly use linear and nonlinear regression, but I realized recently that I don't know much about confidence intervals. After consulting several books and the ProStat manual, I am stuck on why ProStat calculates two sets of 95% confidence limits. One is labeled "univariate" and the other "supporting plane." In the examples from the manual, the supporting plane limits are wider than the univariate ones. The manual says that when something called k equals p (the number of parameters) we get the supporting plane CIs, and when k = 1 we get the univariate CIs. It is unclear from my reading of the manual what k is. The manual provides a formula and a citation to Seber (1989), and the formula contains a function F that is not explained. I have also consulted Motulsky and Christopoulos's book. Their calculation of the CI of a parameter is BestFit ± (t*)•SE, where SE is the standard error. I don't see how one could generate two values of t*•SE that might correspond to the two intervals generated by ProStat. Does anyone have any thoughts?


I have not come across supporting plane limits before, so I had to read around a little. k must simply be the number of parameters you are estimating: in the univariate case you are estimating only one parameter, and the supporting plane method reduces to the univariate case. When k = p, with p > 1, you are estimating the parameters in the model while taking into account any correlation between those parameters.
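For reference, the kind of formula being described usually has the following shape. This is a sketch of the standard linearized result written from general knowledge, not a quote from the ProStat manual; the symbols s, c_ii and n are my own notation and are assumptions here.

```latex
\hat{\theta}_i \;\pm\; \sqrt{k \, F_{\alpha}(k,\; n-p)} \;\, s \, \sqrt{c_{ii}},
\qquad s^2 = \frac{\mathrm{RSS}}{n-p}
% n: number of data points, p: number of fitted parameters
% F_alpha(k, n-p): upper-alpha critical value of the F distribution
% c_ii: i-th diagonal element of (J^T J)^{-1}, J = Jacobian at the best fit
% s * sqrt(c_ii) is the standard error SE of parameter i
```

With k = 1, the identity F_α(1, n-p) = t²_{α/2, n-p} turns this into BestFit ± t*•SE, the univariate interval. With k = p > 1 the multiplier becomes √(p·F_α(p, n-p)), which is larger than t*, so the interval is wider. That would be consistent with the supporting plane limits being the larger of the two.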

 

Exactly how this is calculated I'm not sure. The only sources I could quickly find were not particularly illuminating, saying only that it was computationally expensive, but I'd guess it makes use of the Mahalanobis distance. Are the books you are using available online?


Thank you; I am already understanding it a bit better. I am attaching a page from the ProStat manual. I have not yet looked into whether the two books I mentioned are available online. However, I can type out a paragraph or so from each. I already wrote a Word document for myself to summarize some of this information, and I typed some quotes from the books into that (yeah, pretty old-school).

ProStat_Manual65_p180.pdf


Here are the books or articles that I have consulted or seen referenced in the ProStat manual:

Seber GAF and Wild CJ, Nonlinear Regression, J. Wiley and Sons (1989)

Motulsky H and Christopoulos A, Fitting Models to Biological Data Using Linear and Nonlinear Regression, Oxford University Press (2004)

Motulsky H, Intuitive Biostatistics, 3rd ed., Oxford University Press (2014)

Motulsky H and Ransnas, "Fitting curves to data using nonlinear regression: a practical and nonmathematical review," FASEB J. (1987) (PMID: 3315805)

 

At least some parts of Seber and Wild's book are online. Confidence intervals are discussed on p. 192 and p. 235, which is the chapter on Statistical Inference.

 

 

Motulsky and Christopoulos give the confidence interval of a parameter as (BestFit - t*•SE) to (BestFit + t*•SE) (p. 103, Motulsky and Christopoulos).
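The interval quoted from Motulsky and Christopoulos can be computed in a couple of lines. Below is a minimal Python sketch; the function name `asymptotic_ci` and the numbers are made up for illustration. The value t* = 2.228 is the two-sided 95% critical value of Student's t for 10 degrees of freedom.

```python
def asymptotic_ci(best_fit, se, t_star):
    """Return (lower, upper) limits of BestFit +/- t* x SE."""
    half_width = t_star * se
    return best_fit - half_width, best_fit + half_width


# Hypothetical fit: best-fit value 4.0, standard error 0.5,
# 10 degrees of freedom (n - p = 10), so t* = 2.228 for a 95% interval.
low, high = asymptotic_ci(best_fit=4.0, se=0.5, t_star=2.228)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

There is only one t* for a given confidence level and number of degrees of freedom, which is why this recipe by itself yields a single interval per parameter.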

Edited by BabcockHall

I came across this paper which explains confidence regions quite well, particularly chapter 3, and references Seber and Wild. I'm still not sure why the F distribution describes the joint sampling of the regression coefficients though: maybe the ratio of normal distributions.

 

But don't think of it as generating two values of t*•SE. It's rather that independently finding the CI of each regression coefficient like this will give a confidence region, which is a multivariate extension of a CI. So where you can visualise a CI as two limits on a line around a point estimate, a confidence region for two regression coefficient estimates will be a rectangle around a point, a cube for 3 regression coefficient estimates, a hypercube for more...

 

We would want the property that, if we have found the 95% CI for each regression coefficient in this way, the hypercube for all the regression coefficients will contain the true population parameters 95% of the time (on average). Seber and Wild are claiming this is not the case: finding the 95% CIs separately and just putting them together does not result in a 95% confidence region, because there is a correlation between the regression coefficients that has not been taken into account. Other methods are needed, and the F-test method is apparently one of the better ones.
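The coverage claim is easy to check by simulation. Here is a rough Python sketch under simplifying assumptions that are mine, not from any of the references: two coefficient estimates with normally distributed errors, known standard errors equal to 1 (so the critical value is z = 1.96 rather than t*), and a chosen correlation `rho`. The function name `box_coverage` is made up.

```python
import math
import random


def box_coverage(rho, n_trials=20000, z=1.96, seed=0):
    """Fraction of trials in which BOTH separate 95% intervals cover the truth."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Draw two standardized estimation errors with correlation rho.
        u = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        e1 = u
        e2 = rho * u + math.sqrt(1.0 - rho * rho) * v
        # Each interval is estimate +/- z*SE with SE = 1, so it covers the
        # true value exactly when the absolute error is below z.
        if abs(e1) < z and abs(e2) < z:
            hits += 1
    return hits / n_trials


for rho in (0.0, 0.5, 0.9):
    print(f"rho = {rho:.1f}: joint coverage ~ {box_coverage(rho):.3f}")
```

For independent estimates the joint coverage of the box is 0.95² ≈ 0.9025 in theory, and it stays below 0.95 for the correlated cases too. So the box built from separate 95% intervals is too small to be a 95% region, which is why a larger multiplier is needed for joint statements.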

 

I'll check the references you gave later to see if I can get to the bottom of the F-distribution mystery.


Motulsky and Christopoulos presented and compared three methods of generating confidence intervals: asymptotic (Chapter 16), Monte Carlo (Chapter 17), and model comparison (F ratio or F test, Chapter 18). In Chapter 19 they compare all three methods. Only the asymptotic method gives symmetrical intervals.
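To make the Monte Carlo idea concrete, here is a small Python sketch as I understand the general approach: simulate many data sets around the fitted curve, refit each one, and take the 2.5th and 97.5th percentiles of the refitted parameter. It is not taken from the book. The one-parameter model y = b·x, the data, and the function names are all made up to keep the example short.

```python
import math
import random


def fit_slope(xs, ys):
    """Least-squares slope for the one-parameter model y = b*x."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def monte_carlo_ci(xs, ys, n_sim=2000, seed=0):
    """Return (best fit, lower, upper) for a 95% Monte Carlo interval."""
    rng = random.Random(seed)
    b_hat = fit_slope(xs, ys)
    # Residual SD with n - 1 degrees of freedom (one fitted parameter).
    resid = [y - b_hat * x for x, y in zip(xs, ys)]
    sd = math.sqrt(sum(r * r for r in resid) / (len(xs) - 1))
    # Simulate data sets around the fitted line and refit each one.
    sims = sorted(
        fit_slope(xs, [b_hat * x + rng.gauss(0.0, sd) for x in xs])
        for _ in range(n_sim)
    )
    lo = sims[int(0.025 * n_sim)]
    hi = sims[int(0.975 * n_sim) - 1]
    return b_hat, lo, hi


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]  # made-up data
b_hat, lo, hi = monte_carlo_ci(xs, ys)
print(f"slope = {b_hat:.3f}, 95% Monte Carlo CI: {lo:.3f} to {hi:.3f}")
```

Because the limits come from percentiles of the simulated fits rather than from ± t*•SE, nothing forces them to be symmetrical about the best-fit value. For a straight line like this one they will be nearly symmetrical; the asymmetry shows up with nonlinear models.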


  • 2 weeks later...

I now have a copy of Seber and Wild's book from the library, but it is written at a level that is not easy for me to follow. In the section that covers confidence intervals, they do not use the nomenclature that the ProStat manual does. In other words, I cannot find terms like "supporting plane" and "univariate" in this book.

