
Local False Discovery Rate


Dagl1


Hey everyone,

At some point during my studies I had to read and then discuss a paper. I remember it as one of the more difficult papers I had ever read, although after the discussion I thought I understood everything. I recently looked back at the paper and now see that I no longer understand it (or maybe never did). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5474757/ My question is specifically about the false discovery rate metric used there (see the text and picture at the end of this post). For context, I will briefly explain what they do in this paper before asking my question:

In this paper the researchers introduced random mutations into cells (on average each cell would have a single mutation) with the aim of identifying genes involved in oxidative phosphorylation (a process that can utilise both galactose and glucose). Galactose is only used in oxidative phosphorylation, while cells can use other pathways to create energy from glucose. Thus, by letting the mutated cells replicate a few times and then splitting them between glucose- and galactose-containing medium, they can identify mutations that specifically affect oxidative phosphorylation (cells with these mutations will stay alive in the glucose-containing medium but die in the galactose-containing medium).

What I am wondering about is their usage of the false discovery rate, as they obtain genes 'with a given false discovery rate'. This implies to me that they are talking about local false discovery rates, in contrast to the false discovery rate of a given experiment (I know it as the rate of false discoveries versus the total number of hypotheses tested). From what I understand (from this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC520755/), the local false discovery rate is the chance that a given gene is a false positive.
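To make the distinction concrete, here is a minimal sketch of how a per-gene "FDR" value can arise without being a local FDR. It assumes the reported value is a Benjamini-Hochberg adjusted p-value (a q-value); the excerpt does not say how the FDR was computed, so that is an assumption, and the p-values below are made up for illustration. Calling everything with q ≤ 0.10 a hit controls the expected fraction of false discoveries among the hits as a group; it does not say that each individual hit has a 10% chance of being false.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping the adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values, not taken from the paper.
pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
qvals = benjamini_hochberg(pvals)

# Genes "at 10% FDR": every gene whose q-value is at most 0.10.
hits = [i for i, q in enumerate(qvals) if q <= 0.10]
print(qvals)
print(hits)
```

Under this reading, each gene carries its own number, but the number describes the whole list of genes at least as significant as that gene, which is why it can look like a per-gene quantity.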

I am, however, not sure whether this interpretation is correct or whether my mind has made it up so that everything makes sense. Below is an excerpt of the original paper with the specific passages that discuss the false discovery rate. If anyone could correct me, verify that my interpretation is right, or supply general comments, that would be greatly appreciated!

 -Dagl
Stay safe and healthy!

We also observed this expected enrichment of positive controls at the gene level, and the known OXPHOS disease genes scored significantly better in galactose compared with glucose, as measured by the false discovery rate (FDR) (Figure 1E). Moreover, the fraction of OXPHOS disease genes below 10% FDR in galactose was enriched 39-fold compared with the background of all genes.

We used the MAGeCK scores to define a set of genes that were specifically necessary for survival in galactose relative to glucose. We filtered out 92 genes with an FDR below 30% in glucose medium because these likely represent broadly lethal genes that cause non-specific cell death. We then identified hit genes enriched for lethality in galactose at three FDR thresholds: 191 “high confidence” hits at 10%, an additional 48 hits between 10% and 20%, and 61 hits between 20% and 30% (Figures 1F and 2).
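The filtering described in this excerpt can be sketched as follows. This is only an illustration under stated assumptions: the gene names and FDR values are hypothetical, the function `classify_hits` is not part of MAGeCK, and treating each threshold as a strict "below" cutoff is my reading of the text rather than something the paper specifies.

```python
def classify_hits(genes, glucose_cutoff=0.30, tiers=(0.10, 0.20, 0.30)):
    """Group genes into galactose FDR tiers after removing broadly lethal genes.

    genes maps a gene name to (fdr_glucose, fdr_galactose).
    """
    result = {t: [] for t in tiers}
    for name, (fdr_glc, fdr_gal) in genes.items():
        # Drop genes that also score in glucose: likely non-specific lethality.
        if fdr_glc < glucose_cutoff:
            continue
        # Assign the gene to the first (most stringent) tier it falls under.
        for t in tiers:
            if fdr_gal < t:
                result[t].append(name)
                break
    return result


# Hypothetical values for illustration only.
toy = {
    "GENE_A": (0.90, 0.02),  # specific to galactose, high confidence
    "GENE_B": (0.05, 0.01),  # also lethal in glucose, filtered out
    "GENE_C": (0.80, 0.15),  # between 10% and 20%
    "GENE_D": (0.70, 0.25),  # between 20% and 30%
    "GENE_E": (0.95, 0.60),  # not a hit
}
tiered = classify_hits(toy)
print(tiered)
```

Each tier here is a cumulative cut on a ranked list, so the "additional hits between 10% and 20%" are simply the genes that enter the list when the threshold is relaxed.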

[Attached image: image.png]

Prometheus

My interpretation of the local FDR from the paper you gave is: given a p-value, what is the probability that the null hypothesis is true, adjusted to take into account all the pairwise hypothesis tests in the set? But there are lots of nuances in that paper which would take a while to pick apart.

It seems to rely on the independence of the p-values to estimate some of its properties though - is that a reasonable assumption for these kinds of genetic studies?
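For reference, the interpretation above is often written using a two-groups mixture model. The notation below is a common textbook formulation and is an assumption on my part; it is not necessarily the notation of the linked paper. Here z is a gene's test statistic (or p-value), π₀ is the prior probability that a gene is null, f₀ and f₁ are the null and non-null densities, and F₀ and F are the corresponding cumulative distributions.

```latex
\begin{align}
  % Mixture density of the observed statistics
  f(z) &= \pi_0 f_0(z) + (1 - \pi_0) f_1(z) \\
  % Local FDR: probability of the null for a gene observed at exactly z
  \mathrm{fdr}(z) &= \Pr(\text{null} \mid Z = z) = \frac{\pi_0 f_0(z)}{f(z)} \\
  % Tail-area FDR: probability of the null among all genes at z or more extreme
  \mathrm{Fdr}(z) &= \Pr(\text{null} \mid Z \le z) = \frac{\pi_0 F_0(z)}{F(z)}
\end{align}
```

The definitions themselves do not need independence. As far as I can tell, that assumption matters when π₀ and f are estimated from the observed statistics.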

Dagl1

1 hour ago, Prometheus said:

My interpretation of the local FDR from the paper you gave is: given a p-value, what is the probability that the null hypothesis is true, adjusted to take into account all the pairwise hypothesis tests in the set? But there are lots of nuances in that paper which would take a while to pick apart.

It seems to rely on the independence of the p-values to estimate some of its properties though - is that a reasonable assumption for these kinds of genetic studies?

Is the paper in question the first paper, which looks at the different genes, or the paper explaining local FDR?
I am not sure whether the first paper actually uses the ordinary FDR: they just call it FDR, but from the description I think it is the local FDR (do you agree?).

Regarding the independence of the p-values: each cell line is randomly mutated with some mutagen (if I remember correctly), so each cell line's mutation should be independent of all the others. This assumes the mutagen does not target one gene over another, which won't be entirely true, as the size of a gene may make it more or less susceptible to mutations that cause problems (not all mutations do). I am not comfortable enough with the statistics to say whether studies like this rely on the independence of p-values, but I might be able to fill in details about the general methods so that you can judge whether that is the case (i.e. I can explain the biology; maybe that helps you interpret the data/study better).

If we go by what you said: 

2 hours ago, Prometheus said:

given a p-value what is the probability that the null hypothesis is true, adjusted to take into account all the pairwise hypothesis tests in the set.

I don't really understand how that would work with Figure 1E, as it shows the fraction of the total genes at certain FDR values. Could you elaborate? (I am not saying your explanation/interpretation isn't true, I just can't connect it with that figure.)

Thanks for taking the time to look into it and replying!

-Dagl

Prometheus

I only looked at the second paper, the one explaining local FDR. I have a little familiarity with FDRs but had never heard of local FDR. If I get the chance I'll take the time to look at the first paper and see if I can apply my understanding to it. In normal times I would just ask one of my colleagues... haven't seen them for months now.
