Why do we use n-1 for sample denominator?

2 replies to this topic

#1 CuriousBanker



  • New Members
  • 16 posts

Posted 22 August 2012 - 01:38 PM


So I understand the concept of degrees of freedom: it is the number of values we are free to choose. For example, if there are three numbers and the mean is 20, there are 2 degrees of freedom, because once we choose 2 numbers, the third one is determined.

I don't get, however, why samples of a population use n-1 in the denominator, when populations just use N.

Let's say this is a population: 1, 3, 5, 2, 7, 8, 6, 9, 2, 7. The mean would be 50/10 = 5. The variance is 7.2.

Let's take a sample of this population: 5, 7, 6, 8, 7. The mean would be 33/5 = 6.6. The variance, using N in the denominator (which I know is incorrect), would be 1.04. With N-1 as the denominator, the answer is 1.3. I fail to see how 1.3 is any more accurate an estimate of the population variance than 1.04, since neither is anywhere close to 7.2. I guess because it is ever so slightly closer to 7.2?

I don't really get it conceptually.

Thanks in advance.

Also, if the sample were 1, 3, 7, 8, 9, the mean would be 5.6. The variance using N would be 9.44. The variance using N-1 would be 11.8. So in this case, using N is actually closer to the true value of 7.2 than using N-1 would be.
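
For anyone who wants to check the arithmetic above, here is a minimal sketch using Python's standard `statistics` module. The variable names are only illustrative; `pvariance` divides by n and `variance` divides by n-1.

```python
import statistics

population = [1, 3, 5, 2, 7, 8, 6, 9, 2, 7]
sample_a = [5, 7, 6, 8, 7]
sample_b = [1, 3, 7, 8, 9]

# Population variance: divide by N.
pop_var = statistics.pvariance(population)

# For each sample, compute both versions:
# pvariance divides by n, variance divides by n-1.
a_div_n = statistics.pvariance(sample_a)
a_div_n_minus_1 = statistics.variance(sample_a)
b_div_n = statistics.pvariance(sample_b)
b_div_n_minus_1 = statistics.variance(sample_b)

print("population:", pop_var)
print("sample a:", a_div_n, a_div_n_minus_1)
print("sample b:", b_div_n, b_div_n_minus_1)
```

The printed values should match the figures quoted in the post, up to floating-point rounding.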

#2 DJBruce



  • Senior Members
  • 890 posts
  • Location: Brighton, MI

Posted 22 August 2012 - 02:30 PM

When you apply the standard formula for variance (dividing by N) to a sample, the statistic is in fact biased. Intuitively, this is because the variance is measured around the sample mean, which in turn depends on the sample. To correct this and form an unbiased statistic, we must multiply the standard formula by \frac{N}{N-1}. This is called Bessel's correction. Unbiased here means the estimate is right on average over many samples, not that it is closer for every individual sample.
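
One way to see the bias, as a rough sketch rather than a proof: take the population from the first post, list every possible ordered sample of size 5, and average each estimator over all of them. The sketch assumes the samples are drawn with replacement, so that the draws are independent. The helper names `variance` and `average_estimates` are made up for this illustration.

```python
from itertools import product

population = [1, 3, 5, 2, 7, 8, 6, 9, 2, 7]

def variance(values, ddof=0):
    """Sum of squared deviations from the mean, divided by len(values) - ddof."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - ddof)

def average_estimates(pop, n):
    """Average both estimators over every ordered size-n sample drawn
    with replacement (an assumption of this sketch)."""
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    count = 0
    for sample in product(pop, repeat=n):
        total_div_n += variance(sample, ddof=0)
        total_div_n_minus_1 += variance(sample, ddof=1)
        count += 1
    return total_div_n / count, total_div_n_minus_1 / count

pop_var = variance(population)  # divides by N, the population formula
avg_div_n, avg_div_n_minus_1 = average_estimates(population, 5)

print("population variance:    ", pop_var)
print("average using n:        ", avg_div_n)
print("average using n-1:      ", avg_div_n_minus_1)
```

Under that assumption, the n-1 average should come out at about 7.2, matching the population variance, while the n average should come out at about 5.76, which is 4/5 of 7.2. Any single sample can still land closer with n; the correction only fixes the average.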

#3 ecoli



  • Moderators
  • 8,669 posts
  • Location: NY, NY

Posted 22 August 2012 - 03:09 PM

To add to what DJBruce said, as I understand it, the degrees of freedom are just that: the number of entries in the residual vector that are 'allowed' to vary.

Since the residuals, (x_1 - \overline{x}, \ldots, x_n - \overline{x}), must sum to zero, the entire vector is fully determined by its first n-1 entries.
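
A small sketch of that point, using the sample 5, 7, 6, 8, 7 from the first post (the variable names are only illustrative): the residuals sum to zero, so the last one can be recovered from the other n-1.

```python
sample = [5, 7, 6, 8, 7]
mean = sum(sample) / len(sample)

# Residuals: each value minus the sample mean.
residuals = [x - mean for x in sample]

# The residuals sum to zero (up to floating-point rounding).
total = sum(residuals)

# So the last residual is fixed by the first n-1 residuals alone.
recovered_last = -sum(residuals[:-1])

print("residuals:", residuals)
print("sum of residuals:", total)
print("last residual, recovered:", recovered_last)
```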