'The prosecutor's fallacy': abduction and probability

June 6, 2006

Could someone please explain to me why the following is statistically fallacious:

The prosecutor's fallacy

The DNA profile found at the scene matches the suspect's. The probability of a randomly chosen person having the same DNA profile is calculated as 1/100.

So... if the suspect left the DNA at the scene of the crime, the probability that the DNA from the crime scene matches the suspect's DNA is 1. If some random person left the DNA at the scene of the crime, the chance of the DNA matching the suspect's is 1/100.

Therefore, the fact that the DNA from the crime scene matches the suspect's is 100 times more probable if the suspect left the DNA at the crime scene than if some unknown person left it.

(This next bit is apparently the fallacy)

It is therefore 100 times more probable that the suspect left the DNA at the crime scene than some unknown person.

As I see it:

A --> C

B --> C

Both A and B can result in C (A being the suspect leaving DNA at the scene, B being some random bloke leaving DNA at the scene who happens to have the same DNA profile as the suspect, and C being a DNA profile retrieved from the scene that matches the suspect's).

C

We have observed C (i.e., found DNA at the crime scene, profiled it, and found it to match the suspect's profile).

Either:

A --> C

or

B --> C

Because C is true, either A or B must be true to have caused it (i.e. the DNA profile was found, so we can deduce/abduce that either the suspect left it, or someone who coincidentally has a matching DNA profile left it).

P(A | A --> C) = 1

P(B | B --> C) = 1/100*

A is calculated as being 100 times more likely to result in C than B is.

P(A) != 100*P(B)

Why can we not now say that because we have observed C, and A is 100 times more likely than B to result in C, that A is 100 times more likely to have been the case than B is?

===

* in case i've got my notation wrong:

I'm taking P(X | Y) = Z to mean 'the probability of Y, given that X is true, equals Z'

June 6, 2006

But isn't the problem that a) the suspect could be "the random person" and some other guy could be the actual culprit and b) 1/100 means that out of 100 people 1 person would match (I'm assuming here, otherwise you could simply eliminate the problem by retesting) who wasn't guilty, however that one person's chance of matching will be 1, not 1/100, because they will always match, not 1/100 times, every time (i.e. both the random person and the culprit are equally likely to match).

Maybe I'm misunderstanding, I'm tired.

June 6, 2006

Author

But isn't the problem that a) the suspect could be "the random person" and some other guy could be the actual culprit and

Sort of, yeah. But note that he couldn't be the 'random person' in the example, 'cos that's the actual perpetrator.

b) 1/100 means that out of 100 people 1 person would match (I'm assuming here, otherwise you could simply eliminate the problem by retesting) who wasn't guilty,

Indeed.

however that one person's chance of matching will be 1, not 1/100, because they will always match, not 1/100 times, every time (i.e. both the random person and the culprit are equally likely to match).

Not quite sure what you mean, but:

If some random person left the DNA at the scene of crime, the chances of the DNA matching the suspects is 1/100.

In other words, if 1/100 people have DNA matching the suspect's, and one (random) person left DNA at the crime scene, there is a 1/100 chance that that DNA would match the suspect's.

Which is why I'm not getting the 'we can't say that he's 100 times more likely to be the originator of the crime-scene DNA than not'.

June 6, 2006

It seems to me that you can't relate statistics and probability here. I think that just because something is statistically true, doesn't mean you can say that it's probabilistically true.

You're kind of working backwards in a way that only leaves you with half truths.... does that make any sense?

June 6, 2006

Author

nnnnnnnnnnnnnnnnnnnnnnooooooooooo... :confused:

June 6, 2006

Why can't he be the random person? Are you assuming that you already know the suspect is the culprit? What I'm saying is: say they picked as their suspect a person whose DNA matched but who wasn't the culprit (i.e. they were that 1 out of 100 people, but by some fluke they were suspected). That person's chance of matching was always 1, because they match.

The chance that a completely random person's DNA would match is 1/100, but a match doesn't by itself imply that the person isn't that 1 in 100.

June 6, 2006

Assuming only one person is arrested and tested, the probability of guilt is given by Bayes' theorem:

P(A|C) = P(A and C)/ (P(A and C) + P(B and C)) = 1/1.01

However, if a whole bunch of people are tested until one is found positive, then other distributional assumptions need to be made and the calculation is more complicated. A geometric distribution could be used here.

Edit: mistake in this post; see post below.

June 6, 2006

OK, sorry. Let me put it this way.

probability of suspect: 1

probability of random: 1/100

It is 100X more likely that the sample matches the suspect than a random person.

That doesn't mean that it is 100X more likely that the suspect left the sample, because the matched sample could still have been left by somebody else, even if it matches.

I believe this is what it's saying.

June 6, 2006

http://en.wikipedia.org/wiki/Prosecutor's_fallacy seems to have some examples and such explaining why it doesn't work.

June 6, 2006

Author

@ Aeternus

Ah, I see. We're both taking 'random person' to refer to different things. You're taking it to refer to a non-guilty suspect; I'm taking it to refer to a non-suspect perpetrator.

Focus on the suspect's DNA:

If the suspect is actually guilty and actually left his own DNA at the scene, then the chance of the scene DNA matching the suspect's will be 1 ('cos it's his).

If the suspect is innocent and someone else left the DNA at the scene, then that person will basically be a random person from the population, and the chance that their DNA will coincidentally match the suspect's is 1/100 (hence why the suspect can't be that random person: if he was, he'd be the guilty one).

Logically equivalent to what you said, but that's why I said he can't be the random person in my example.

June 6, 2006

Author

http://en.wikipedia.org/wiki/Prosecutor's_fallacy seems to have some examples and such explaining why it doesn't work.

Ah, I assumed that the 'prosecutor's fallacy' term was made up by my tutor, so didn't bother googling.

Cheers for the link. I kinda understand now. (And sorry for getting your name wrong earlier.)

Tartaglia: what values were you using for P(A) and P(B)?

June 6, 2006

Dak - I made a mistake when I put it up. In my calculation they would be the same which clearly would not be true, but this does lead you into how to demonstrate the fallacy mathematically.

When I put the post up I was thinking in terms of testing a long line of suspects

June 6, 2006

Ah, I assumed that the 'prosecutor's fallacy' term was made up by my tutor, so didn't bother googling.

Cheers for the link. I kinda understand now. (And sorry for getting your name wrong earlier.)

Tartaglia: what values were you using for P(A) and P(B)?

Heh, it's ok, you should see what Klaynos ends up calling me sometimes.

June 6, 2006

Author

Cheers.

Would this be right?

Note that I have finally realised that my notation was wrong. Henceforth, P(A|B) means the prob of A given B.

A = suspect being at the scene

C = DNA profile that matches suspect's

As forensic scientists are required to be unbiased, assume a prior P(A) of 0.5

P(A|C) = P(C|A)P(A)/P(C)

P(A|C) = 1*0.5/0.01 = 50... umm.. OK, that'd be 'no, dak, that's not right' then :embarass:

what'd i do wrong?

June 6, 2006

P(A|C) = P(C|A)*P(A)/(P(C|A)*P(A) + P(C|B)*P(B))

where P(C|A) = 1, P(C|B) =0.01 and P(A) and P(B) are chosen appropriately
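For anyone who wants to plug numbers in, here is a minimal Python sketch of the formula above. The function and parameter names are made up for illustration, and it assumes A and B are the only two possibilities, so P(B) = 1 - P(A).

```python
def posterior_suspect(prior_a, p_c_given_a=1.0, p_c_given_b=0.01):
    """Return P(A|C) via Bayes' theorem.

    A = the suspect left the DNA, B = someone else left it,
    C = the crime-scene profile matches the suspect's.
    Assumes A and B are exhaustive, so P(B) = 1 - P(A).
    """
    prior_b = 1.0 - prior_a
    numerator = p_c_given_a * prior_a
    denominator = numerator + p_c_given_b * prior_b
    return numerator / denominator


if __name__ == "__main__":
    # Equal priors reproduce the 1/1.01 figure quoted in the thread.
    print(posterior_suspect(0.5))
    # A sceptical prior of 1 in 100 gives a posterior of about one half.
    print(posterior_suspect(0.01))
```

With P(A) = 0.5 this gives 1/1.01, about 0.990; with P(A) = 0.01 it gives 100/199, about 0.503. The same 1/100 match probability leads to very different answers depending on the prior.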

June 6, 2006

Author

P(A|C) = P(C|A)*P(A)/(P(C|A)*P(A) + P(C|B)*P(B))

P(A|C) = 1*0.5/1*0.5 + 0.01*0.5 = 0.99

umm... that's just the same as saying if there's only a 1/100 chance of someone else having the profile, there's a 99/100 chance of him being guilty, which is pretty much the prosecutor's fallacy?

June 6, 2006

You have forgotten a bracket - assuming P(A) = P(B) = 0.5 (which is not necessarily a good assumption) then your answer should be 1/1.01

June 6, 2006

Author

I think I noticed that and edited about the same time you noticed it.

And it's a necessary assumption in forensics (according to my lecture notes... I found a bit that briefly touches on Bayes' theorem, tho it's not much help)

June 6, 2006

Clearly if the police are just rounding up suspects P(A) is a lot less than 0.5, but if they are making a single arrest after a lot of detective work then P(A) could be a lot more than 0.5
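To see how much the choice of P(A) matters, here is a rough Python sketch using the odds form of Bayes' theorem: posterior odds = likelihood ratio x prior odds. The likelihood ratio of 100 is P(C|A)/P(C|B) = 1/0.01 from the original post; the particular priors in the loop are arbitrary illustrative values, not figures from any real case.

```python
def posterior_odds(prior_odds, likelihood_ratio=100.0):
    """Odds form of Bayes' theorem: posterior odds = LR * prior odds."""
    return likelihood_ratio * prior_odds


def odds_to_prob(odds):
    """Convert odds in favour of A into a probability of A."""
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Illustrative priors only: from "rounded up at random" to "strong case".
    for prior in (0.001, 0.01, 0.1, 0.5, 0.9):
        prior_odds = prior / (1.0 - prior)
        post = odds_to_prob(posterior_odds(prior_odds))
        print(f"P(A) = {prior:<5} -> P(A|C) = {post:.3f}")
```

The match always multiplies the odds by 100, but "100 times more likely than before" only becomes "100 times more likely than not" when the prior odds happen to be exactly 1 to 1.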

June 6, 2006

Author

hmm... actually, I didn't go to that lecture, and the notes that I picked up off someone else are a tad confusingly worded, so maybe you're right.

How would one deal with that? Could Bayes' theorem not be applied if we can't estimate a prior P(A)?

June 6, 2006

Your estimate of P(A) would be a prior estimate. After applying Bayes' theorem you have your posterior estimate P(A|C).

June 6, 2006

Author

Yeah, but how would I work out P(A)? I can't take other evidence into account, otherwise it becomes tautologous: 'assuming the guy is probably guilty, then this is probably his DNA, indicating that he's probably guilty'.

I can't take on the prosecutor's or the defendant's view, otherwise P(A) would have to be set at either 0 or 1, and if I take the view that I strictly speaking should (i.e., unbiased -- no prior assumptions) then both P(A) and P(B) are going to be 0.5, and, given that P(C|A) is always going to be 1, it becomes 1/(1+P(C|B)), which seems to return awfully high results regardless of P(C|B) :confused:

Can I not, then, use Bayes' theorem in this case?

June 6, 2006

That's where experience comes into it. You would have to collate data from similar situations in the past or guess

June 7, 2006

Whenever I hear of something like this I tend to think of this case:

(fallacy) There is a 1/1,000,000 chance of someone's DNA test returning as a match. If you match the DNA of the criminal, then the chance of a match being 1/1,000,000 means you must be that criminal 999,999 times out of 1,000,000.

Now, really, what you should think is this: there are (in the UK) 65 people who match that DNA sample. So if that's the only evidence they have, then the suspect really has only a 1/65 chance of being the correct person.

June 7, 2006

Author

That would be the defendant's fallacy.

The reason it's invalid is that it assumes each person with that DNA profile was in the area, and thus capable of leaving their DNA, when in actual fact it's very unlikely that the other 64 were anywhere near the crime scene.
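Both views can be put into one rough Python sketch. It assumes a hypothetical pool of people who could plausibly have left the DNA, each equally likely beforehand, with the suspect being one of them. The pool sizes below are illustrative assumptions; the 65,000,000 figure is only what the "65 matches at 1 in 1,000,000" numbers above imply.

```python
def posterior_from_pool(pool_size, match_prob):
    """P(suspect left the DNA | suspect's profile matches).

    Assumes every one of the pool_size people (suspect included) is
    equally likely a priori to be the source, the true source always
    matches, and anyone else matches by chance with match_prob.
    """
    others = pool_size - 1
    return 1.0 / (1.0 + others * match_prob)


if __name__ == "__main__":
    p = 1e-6  # match probability from the example above
    # Whole population as the pool: close to the "1 in 65" figure.
    print(posterior_from_pool(65_000_000, p))
    # Only 1,000 people could realistically have been at the scene.
    print(posterior_from_pool(1_000, p))
```

With the whole population as the pool, the result is about 1/66 (the suspect plus roughly 65 expected coincidental matchers), which is close to the 1/65 figure. With a pool of 1,000 it is about 0.999. So the answer depends on how many people could realistically have left the sample, which is exactly the extra information neither fallacy accounts for.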
