Doh! Sorry, you're right. I wrote what I wrote, went out and realized while I was out that I had worked out the probability of at least N matches rather than exactly N matches.
I expanded your new formula and it comes out the same as the previous one, but I find the new reasoning behind it easier to understand. I will work on my brute force program tomorrow so I can try to calculate these probabilities in an independent way and compare with the formula answers.
Welcome to ScienceForums.Net!
|
After you've registered, come in and introduce yourself, or visit the forum index. If you need any help registering, posting, or if you just have some questions about our site, please feel free to contact us at staff at scienceforums dot net.
|
|
| Guest Message © 2012 DevFuse | |
Probability puzzle Picking samples from sets
#22 15 December 2011 - 10:51 PM
beaumontfletcher, on 15 December 2011 - 09:53 PM, said:
Doh! Sorry, you're right. I wrote what I wrote, went out and realized while I was out that I had worked out the probability of at least N matches rather than exactly N matches.
I expanded your new formula and it comes out the same as the previous one, but I find the new reasoning behind it easier to understand. I will work on my brute force program tomorrow so I can try to calculate these probabilities in an independent way and compare with the formula answers.
I expanded your new formula and it comes out the same as the previous one, but I find the new reasoning behind it easier to understand. I will work on my brute force program tomorrow so I can try to calculate these probabilities in an independent way and compare with the formula answers.
No worries. Figuring out the problems is what's interesting.
I actually expected the formula to come out different!, that it would be simpler with some obscure simplification.
Working with combinations instead of factorials should be a lot easier (programmatically or spreadsheet-ically) because you can have large values yet avoid calculating the full factorials. Edit: Actually I guess the same type of shortcuts could be made with either.
This post has been edited by md65536: 15 December 2011 - 11:11 PM
- Posts: 790 | Joined: 28-July 10
Reply
#23 16 December 2011 - 08:09 PM
Success! The formula is confirmed. I wrote my program, which works out probabilities as follows:
Given integers T, X, Y and N:
1. Initialise a counter S to 0.
2. Pick X different random integers in the range 1-T.
3. Pick Y different random integers in the range 1-T.
4. Iff the sets picked in steps 2 and 3 have exactly N integers in common, then add 1 to S.
5. Repeat steps 2-4 a million times.
6. The actual probability = (S / 1,000,000).
Doing this for a few examples today, I found the actual probability equals the theoretical probability from your formula every time, though sometimes I had to go as high as ten million iterations to get the actual probability to be the same as the theoretical probability to three decimal places. A million iterations usually gives an answer correct to two decimal places.
I think we can be sure the formula is correct. I also did some small examples by hand, e.g. for T=6, X=Y=3 and N=2, and got the same answer as formula.
Just to recap, the formula is:
p = X!Y!(T-X)!(T-Y)! / T!N!(X-N)!(Y-N)!(T-X-Y+N)!
Thanks once again for all your help with this over the past two weeks! I was going to say to you that if I get a publishable paper out of this, I will contact you so you can be credited under your real name. I am not one of those people who use other people's work without acknowledgement. That still applies, though for the kind of sets I am interested in (with large T but small N) the probabilities are so close to 0 as to be useless for any real-life work. But I've learnt from doing all this and I hope it's been stimulating for you. Merry Christmas!
Given integers T, X, Y and N:
1. Initialise a counter S to 0.
2. Pick X different random integers in the range 1-T.
3. Pick Y different random integers in the range 1-T.
4. Iff the sets picked in steps 2 and 3 have exactly N integers in common, then add 1 to S.
5. Repeat steps 2-4 a million times.
6. The actual probability = (S / 1,000,000).
Doing this for a few examples today, I found the actual probability equals the theoretical probability from your formula every time, though sometimes I had to go as high as ten million iterations to get the actual probability to be the same as the theoretical probability to three decimal places. A million iterations usually gives an answer correct to two decimal places.
I think we can be sure the formula is correct. I also did some small examples by hand, e.g. for T=6, X=Y=3 and N=2, and got the same answer as formula.
Just to recap, the formula is:
p = X!Y!(T-X)!(T-Y)! / T!N!(X-N)!(Y-N)!(T-X-Y+N)!
Thanks once again for all your help with this over the past two weeks! I was going to say to you that if I get a publishable paper out of this, I will contact you so you can be credited under your real name. I am not one of those people who use other people's work without acknowledgement. That still applies, though for the kind of sets I am interested in (with large T but small N) the probabilities are so close to 0 as to be useless for any real-life work. But I've learnt from doing all this and I hope it's been stimulating for you. Merry Christmas!
- Posts: 13 | Joined: 03-December 11
Reply

Help
Sign In »
Register Now!












