uncool

Senior Members
  • Content Count: 1174
  • Days Won: 1

uncool last won the day on September 17

uncool had the most liked content!

Community Reputation

216 Beacon of Hope

2 Followers

About uncool

  • Rank
    Atom

  1. The statement of the theorem itself, as I have repeatedly shown. For that I apologize; the forum changed software a couple years ago in a really frustrating way, and I still haven't really tried to learn how the new LaTeX embedding works. Then ask about them. 1) That really is a bad reason for never having had to take it. 2) Either you misunderstood, or you were misinformed. 3) If you've never taken probability before, you should not be making pronouncements on it. 4) To be honest, you should learn the basics of probability theory before teaching it. Probability theory is a very well-developed field of math. The theorem you are trying to talk about is one of the most basic theorems in statistics - one of the subfields of probability - and one of the most necessary.
  2. "They" would not, as "they" have not in the past 118 years. You could try reading my post to find out. Or actually learning probability theory rather than making pronouncements about it. Choose any N, and some small epsilon. Then:
     For any epsilon and delta, there exists an N such that for n > N, Pr(|Y_n - p| > epsilon) < delta
     Equivalently, sum_{n*(p - epsilon) < i < n*(p + epsilon)} Pr((sum X_j) = i) > 1 - delta
     Equivalently, sum_{n*(p - epsilon) < i < n*(p + epsilon)} nCi * p^i * (1 - p)^(n - i) > 1 - delta.
     So take n large (say, larger than 100*(p - p^2)/epsilon^2), and find sum_{n*(p - epsilon) < i < n*(p + epsilon)} nCi * p^i * (1 - p)^(n - i). I claim it will always be larger than 99%.
     Then explain what in them you disagree with, or think doesn't apply. What you've written shows no indication you even read what I wrote (especially since most of what I wrote are inequalities, not equations). Because, once again, it's about looking in a range. Every indication shows the problem is on the receiving end here, since you have failed to show that you even read what I wrote.
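The 99% claim in the post above can be spot-checked with exact integer arithmetic. This is only a sketch under stated assumptions: the function names `prob_within` and `n_above_threshold` are made up for illustration, p and epsilon are passed as exact fractions, and the binomial term is taken to be nCi * p^i * (1 - p)^(n - i).

```python
from fractions import Fraction
from math import comb, floor, ceil


def prob_within(n, p, eps):
    """Exact Pr(n*(p - eps) < number of successes < n*(p + eps)).

    p and eps are Fractions, so the range endpoints and the sum are exact;
    only the final division is rounded to a float.
    """
    p, eps = Fraction(p), Fraction(eps)
    a, b = p.numerator, p.denominator        # p = a/b, so 1 - p = (b - a)/b
    lo = max(floor(n * (p - eps)) + 1, 0)    # smallest i strictly above n*(p - eps)
    hi = min(ceil(n * (p + eps)) - 1, n)     # largest i strictly below n*(p + eps)
    # nCi * p^i * (1 - p)^(n - i) = nCi * a^i * (b - a)^(n - i) / b^n
    numerator = sum(comb(n, i) * a**i * (b - a)**(n - i) for i in range(lo, hi + 1))
    return numerator / b**n


def n_above_threshold(p, eps):
    """First integer n with n > 100*(p - p^2)/eps^2 (hypothetical helper)."""
    p, eps = Fraction(p), Fraction(eps)
    return floor(100 * (p - p * p) / (eps * eps)) + 1


# Two arbitrary test cases: a fair coin and a biased one.
for p, eps in [(Fraction(1, 2), Fraction(1, 10)), (Fraction(3, 10), Fraction(1, 20))]:
    n = n_above_threshold(p, eps)
    print(n, prob_within(n, p, eps))
```

Both printed probabilities should come out above 0.99. Any other p, epsilon, or larger n can be substituted to test the claim further.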
  3. 1) Mathematics has no Nobel Prize. 2) The proof has existed for centuries. As I have told you repeatedly. And as Ghideon has been more than willing to provide data for, and as I have been more than willing to provide data for.
     For any epsilon and delta, there exists an N such that for n > N, Pr(|Y_n - p| > epsilon) < delta
     Equivalently, sum_{n*(p - epsilon) < i < n*(p + epsilon)} Pr((sum X_j) = i) > 1 - delta
  4. In other words, there is literally nothing that could possibly convince you, and you aren't even going to try to understand the arguments we post. Do I have that right? I repeat: If you flip a coin 200 times, the probability of (number of heads/number of flips) being between 0.495 and 0.505 is about 17%. If you flip a coin 2000 times, the probability of (number of heads/number of flips) being between 0.495 and 0.505 is about 36%. If you flip a coin 20000 times, the probability of (number of heads/number of flips) being between 0.495 and 0.505 is about 84%. This is an example where I literally chose the numbers (.495, .505, 200, 2000, 20000) arbitrarily (because they were easy for me). The probability of being close to 1/2 approaches 1 - for any reasonable choice of "close".
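The three percentages quoted in the post above can be reproduced by summing binomial probabilities directly. The following is a minimal sketch for a fair coin only. The function name and the per-mille parameters are illustrative choices, and the range 0.495 to 0.505 is treated as inclusive.

```python
from math import comb


def prob_heads_fraction_between(n, lo_permille=495, hi_permille=505):
    """Return (probability, number of outcomes) for a fair coin flipped n times.

    Counts every number of heads k with lo_permille/1000 <= k/n <= hi_permille/1000.
    Integer arithmetic keeps the range endpoints and the sum exact.
    """
    lo = -(-lo_permille * n // 1000)   # ceiling of 0.495*n
    hi = hi_permille * n // 1000       # floor of 0.505*n
    favourable = sum(comb(n, k) for k in range(lo, hi + 1))
    return favourable / 2**n, hi - lo + 1


for n in (200, 2000, 20000):
    prob, outcomes = prob_heads_fraction_between(n)
    print(f"n = {n}: {outcomes} outcomes, probability {prob:.4f}")
```

The printed probabilities should land near the 17%, 36% and 84% figures given in the post.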
  5. As with any proof, we should start with the statement we are trying to prove, and then start the proof proper. The statement of the weak law of large numbers for a binary variable is the following:
     Let X_1, X_2, ..., X_n be independent, identically distributed binary variables (in layman's terms: they're coinflips that don't affect each other and have the same probability p). Define Y_n = (X_1 + X_2 + ... + X_n)/n. Then for any epsilon > 0, lim_{n -> infinity} Pr(|Y_n - p| > epsilon) = 0.
     Writing out the limit: for any delta > 0 and any epsilon > 0, there is some N such that for any n > N, Pr(|Y_n - p| > epsilon) < delta.
     To prove it, we will need a few lemmas.
     Definition: X and Y are independent if for any outcomes i, j, P(X = i, Y = j) = P(X = i) * P(Y = j).
     Definition: For a discrete variable X, E(X) = sum_i i*P(X = i). Note the summation in the above.
     Lemma 1: For any two independent variables X and Y, E(XY) = E(X) E(Y).
     Proof: E(XY) = sum_{i, j} i*j*P(X = i, Y = j) = sum_{i, j} i*P(X = i) * j*P(Y = j) = (sum_i i*P(X = i)) (sum_j j*P(Y = j)) = E(X) E(Y)
     Lemma 2: Assume X is a variable with all nonnegative outcomes. Then for any a > 0, P(X > a) <= E(X)/a.
     Proof: E(X) = sum_i i*P(X = i) = sum_{i > a} i*P(X = i) + sum_{i <= a} i*P(X = i) >= sum_{i > a} a*P(X = i) + sum_{i <= a} 0*P(X = i) = a*sum_{i > a} P(X = i) = a*P(X > a), so P(X > a) <= E(X)/a.
     Lemma 3: If X and Y are independent, then X - a and Y - b are independent. Left to the reader.
     Lemma 4: For each of the X_i, E(X_i - p) = 0. Left to the reader.
     Lemma 5: For each of the X_i, E((X_i - p)^2) = p - p^2. Left to the reader.
     Lemma 6: For any variables X and Y, E(X + Y) = E(X) + E(Y) (no assumption of independence needed). Left to the reader.
     Now, as is usual for limit proofs, we work backwards from the statement we want to prove to the statements we can prove.
     We want to prove that for any delta > 0 and any epsilon > 0, there is some N such that for any n > N, Pr(|Y_n - p| > epsilon) < delta
     Equivalently, for any delta > 0 and any epsilon > 0, there is some N such that for any n > N, Pr(|(sum(X_i))/n - p| > epsilon) < delta
     Equivalently, for any delta > 0 and any epsilon > 0, there is some N such that for any n > N, Pr(|sum(X_i) - p*n| > epsilon*n) < delta
     I want to note here, once again, that this shows what I've been saying: that this is about a range of possibilities. In this case, that range is epsilon*n around the "perfect" outcome.
     Equivalently, for any delta > 0 and any epsilon > 0, there is some N such that for any n > N, Pr((sum(X_i) - p*n)^2 > epsilon^2*n^2) < delta
     Equivalently, for any delta > 0 and any epsilon > 0, there is some N such that for any n > N, Pr((sum(X_i - p))^2 > epsilon^2*n^2) < delta
     Applying lemma 2 (since squares are never negative), we know this is true as long as E((sum(X_i - p))^2) < delta*epsilon^2*n^2, because then Pr((sum(X_i - p))^2 > epsilon^2*n^2) <= E((sum(X_i - p))^2)/(epsilon^2 * n^2) < delta.
     (sum(X_i - p))^2 = sum_{i, j} (X_i - p)(X_j - p) = sum_i (X_i - p)^2 + sum_{i =/= j} (X_i - p)(X_j - p), so E((sum(X_i - p))^2) = E(sum_i (X_i - p)^2 + sum_{i =/= j} (X_i - p)(X_j - p))
     By lemma 6, we can split this sum up into individual terms.
     The first term is sum_i E((X_i - p)^2) = sum_i (p - p^2) = n*(p - p^2) by lemma 5.
     The second term is sum_{i =/= j} E((X_i - p)(X_j - p)) = sum_{i =/= j} E(X_i - p) E(X_j - p) by lemmas 1 and 3 (X_i - p and X_j - p are independent for i =/= j), = 0 by lemma 4.
     So the condition we want is n*(p - p^2) < n^2*delta*epsilon^2, or n > (p - p^2)/(delta*epsilon^2). Which means choose N = ceil((p - p^2)/(delta*epsilon^2)), and the statement follows.
     This proof generalizes quite easily; all that's necessary is to replace p by E(X_i) and (p - p^2) by E((X_i - E(X_i))^2).
     I was waiting for you to show any interest in the actual proof, rather than insisting that it hadn't been proven.
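As a numerical sanity check on the last step of the proof above (not a replacement for it), one can compute N = ceil((p - p^2)/(delta*epsilon^2)) and compare the exact binomial tail just past N against delta. The sketch below assumes p, epsilon and delta are given as exact fractions. The names `chebyshev_N` and `exact_tail` are invented for this example.

```python
from fractions import Fraction
from math import comb, floor, ceil


def chebyshev_N(p, eps, delta):
    """N = ceil((p - p^2)/(delta*eps^2)), the cutoff chosen at the end of the proof."""
    p, eps, delta = Fraction(p), Fraction(eps), Fraction(delta)
    return ceil((p - p * p) / (delta * eps * eps))


def exact_tail(n, p, eps):
    """Exact Pr(|Y_n - p| > eps) for n independent flips with success probability p."""
    p, eps = Fraction(p), Fraction(eps)
    a, b = p.numerator, p.denominator      # p = a/b
    lo = max(ceil(n * (p - eps)), 0)       # smallest count with |count/n - p| <= eps
    hi = min(floor(n * (p + eps)), n)      # largest such count
    inside = sum(comb(n, i) * a**i * (b - a)**(n - i) for i in range(lo, hi + 1))
    # Subtract as integers so a tiny tail is not lost to floating-point rounding.
    return (b**n - inside) / b**n


# Arbitrary example values for (p, epsilon, delta).
for p, eps, delta in [(Fraction(1, 2), Fraction(1, 10), Fraction(1, 20)),
                      (Fraction(3, 10), Fraction(1, 20), Fraction(1, 10))]:
    N = chebyshev_N(p, eps, delta)
    print(N, exact_tail(N + 1, p, eps), float(delta))
```

The exact tail is typically far smaller than delta. The N from the proof is a guaranteed cutoff, not the smallest one that works.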
  6. I am writing the following in large font at the beginning and end of the post because it is an offer you have repeatedly ignored, and it is likely central to your confusion. I am offering to prove the weak law of large numbers for a binary variable (i.e. a biased or unbiased coin) using "summing probabilities", i.e. the method I have been using this entire thread. Do you accept? Then why are you rejecting the summation? Why not, when the law of large numbers isn't about the single possible outcome? I don't know where you are pulling this bullshit from, but it is bullshit. That is exactly the point I am saying is irrelevant. This isn't about "benefits and problems". It's about which method is correct. And I guarantee that summation is correct, by the laws of probability. I am offering to prove the weak law of large numbers for a binary variable (i.e. a biased or unbiased coin) using "summing probabilities", i.e. the method I have been using this entire thread. Do you accept?
  7. And why should we want area under a curve for this specific application? Why is that relevant to this particular calculation? The question isn't about whether one application is "better" than the other. The question is which one is correct. And the laws of probability explicitly say that summing is correct. If you mean that someone who thinks "adding together probabilities" (I assume you mean the method I have been demonstrating) can't get the result of the law of large numbers, then you are wrong.
  8. Because an integral is a particular limit that is not being taken here. Further, an integral is a limit of summation, not the other way around. I have offered to provide the proof of the law of large numbers in this thread multiple times.
  9. It is not contradictory, because the range is growing. For 200 flips, the range is between 99 and 101 heads - 3 outcomes. For 2000 flips, it's between 990 and 1010 heads - 21 outcomes. For 20000 flips, it's between 9900 and 10100 heads - 201 outcomes. For 200 flips, the probability of exactly 100 heads is about 5.6%. For 2000 flips, the probability of exactly 1000 heads is about 1.8%. For 20000 flips, the probability of exactly 10000 heads is about 0.56%. The probability of getting exactly the "perfect" number of heads is decreasing, but the probability of hitting a range including 1/2 is increasing.
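The "exactly half heads" figures in the post above come from a single binomial term, nC(n/2) / 2^n, and can be checked in a couple of lines. This sketch assumes a fair coin and an even n. The function name is made up for illustration.

```python
from math import comb


def prob_exactly_half_heads(n):
    """Pr(exactly n/2 heads in n fair flips); n is assumed to be even."""
    return comb(n, n // 2) / 2**n


for n in (200, 2000, 20000):
    print(f"n = {n}: {prob_exactly_half_heads(n):.4%}")
```

The values shrink as n grows, in line with the post's point that the single "perfect" outcome becomes less likely even as the range around it becomes more likely.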
  10. I've demonstrated an example of the law of large numbers (ish; technically, all I've done is point out some increasing values, but careful investigation will continue to show that the point I make with those values is correct), in direct contradiction to one of your statements. It warrants you retracting that statement, as a start.
  11. And as defined by the law of large numbers, it grows with n. Which is what I showed. It sounds like you disagree with the axiom that the probability of a union of disjoint events is the sum of the probability of each of those events. Actually, it does change that. If you flip a coin 200 times, the probability of (number of heads/number of flips) being between 0.495 and 0.505 is about 17%. If you flip a coin 2000 times, the probability of (number of heads/number of flips) being between 0.495 and 0.505 is about 36%. If you flip a coin 20000 times, the probability of (number of heads/number of flips) being between 0.495 and 0.505 is about 84%. Because the law of large numbers is about that range around "perfect".
  12. I am demonstrating that the weak law of large numbers talks about an increasing range of "accepted" outcomes as n grows larger. Which is what you asked me about.
  13. I'm not. I'm applying probability theory. It's not. Don't. It means what it says, and implies exactly what I said it did.
  14. "But instead of taking the sum of a discrete number of things, you're taking the sum of an infinite number". We're looking at discrete numbers of heads. That comes into play because of the expression.
      |bar(X)_n - \mu| > \epsilon
      is equivalent to
      |(X_1 + X_2 + ... + X_n)/n - \mu| > \epsilon
      is equivalent to
      |X_1 + X_2 + ... + X_n - n \mu| > n * \epsilon
      The right-hand side grows with n. It says that the sum can be n*epsilon away from n*mu (n times the mean).
  15. In which case you use a sum, because the outcomes are discrete. As I have repeatedly shown. In the expression "converge in probability" (for the weak law, which is all that is necessary here). The "Pr" in "lim_{n -> infinity} Pr(...) = 0" denotes probability.