Not all equalities are the same. There are at least these variants:

**Definition** (also called symbol substitution or symbol replacement)

Example: fine structure constant,

\[\alpha\overset{{\scriptstyle \textrm{def}}}{=}\frac{e^{2}}{4\pi\epsilon_{0}\hbar c}\]

**Identity** (a universal equivalence under a known or specified set of prior rules, which may be algebraic, geometric, etc.)

Example: the constraint relating sine and cosine,

\[\sin^{2}\theta+\cos^{2}\theta\equiv1\]

**Equation** (a proposed equivalence considered in relation to finding its solutions)

Example: the Pythagorean theorem assumed true; solve for c_{1}, given h = 5 and c_{2} = 3,

\[5^{2}=\left(c_{1}\right)^{2}+3^{2}\]

**Formula** (a proposed equivalence under non-universal, hidden, or not necessarily specified assumptions)

Example: the Pythagorean theorem,

\[h^{2}\overset{\cdot}{=}\left(c_{1}\right)^{2}+\left(c_{2}\right)^{2}\]

Some formulas can become equations once you give values to terms or supply additional information. And yes, sometimes the distinction between what is a formula and what is an identity can be blurry, depending on how we look at the defining "valid rules." E.g., the constraint between sine and cosine can be seen as an identity if we take Euler's formulas,

\[\sin\theta\equiv\frac{e^{i\theta}-e^{-i\theta}}{2i}\]

\[\cos\theta\equiv\frac{e^{i\theta}+e^{-i\theta}}{2}\]

as the point of departure; or as a formula, if we adopt Pythagoras' theorem plus the geometrical definitions of sine and cosine. It may also depend on assumptions about the curvature of space, etc.
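The operational difference can be checked with a computer algebra system. A minimal sketch (my own toy example, using sympy): the identity simplifies to a tautology for every θ, while the equation is solved for particular values.

```python
import sympy as sp

theta, c1 = sp.symbols('theta c1', real=True)

# Identity: holds for every theta, so the difference simplifies to 0
assert sp.simplify(sp.sin(theta)**2 + sp.cos(theta)**2 - 1) == 0

# Equation: holds only for particular values, which we solve for
solutions = sp.solve(sp.Eq(5**2, c1**2 + 3**2), c1)
print(solutions)
```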

Needless to say, most people who use mathematics on a regular basis don't need to be reminded of these distinctions, because they intuitively know what they're about. The danger arises when people start playing with equalities (especially definitions, as I've seen) thinking they have a different value than they really do. Also needless to say, but better said: the symbols for eq., id., form., and def. above are not intended for general use, but just to illustrate how confusing all this proves to be for many people.

At some point during my studies I had to read and subsequently discuss a paper. I remember this as one of the more difficult papers I had ever read, although after the discussion I understood everything. I recently looked back at the paper and now see that I do not understand it anymore (or maybe never did). https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5474757/ My question is specifically about the false discovery rate metric used there (see the text and picture at the end of this post). For context, I will briefly explain what they do in this paper before asking my question:

In this paper the researchers introduced random mutations into cells (on average each cell would carry a single mutation) with the aim of identifying genes involved in oxidative phosphorylation (a process that can utilise both galactose and glucose). Galactose is only used in oxidative phosphorylation, while cells can use other pathways to create energy from glucose. Thus, by letting mutated cells replicate a few times and then splitting them up into glucose- or galactose-containing medium, they can identify mutations that specifically target oxidative phosphorylation (as cells with these mutations will remain alive in the glucose-containing medium but die in the galactose-containing medium).

What I am wondering about is their usage of the false discovery rate, as they obtain genes 'with a given false discovery rate'; this implies to me that they are talking about local false discovery rates, in contrast to the false discovery rate of a given experiment (which I know as the ratio of false discoveries to the total number of hypotheses tested). From what I understand (from this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC520755/), the **local** false discovery rate is the chance that a given gene is a false positive.

I am, however, not sure whether that is the case or whether my mind has made up this interpretation to let it all make sense. Below is an excerpt of the original paper and the specific passages that discuss the false discovery rate. If anyone could correct me, verify that my interpretation is right, or supply general comments, that would be greatly appreciated!

-Dagl

Stay safe and healthy!

*We also observed this expected enrichment of positive controls at the gene level, and the known OXPHOS disease genes scored significantly better in galactose compared with glucose, as measured by the false discovery rate (FDR) (Figure 1E). Moreover, the fraction of OXPHOS disease genes below 10% FDR in galactose was enriched 39-fold compared with the background of all genes.*

*We used the MAGeCK scores to define a set of genes that were specifically necessary for survival in galactose relative to glucose. We filtered out 92 genes with an FDR below 30% in glucose medium because these likely represent broadly lethal genes that cause non-specific cell death. We then identified hit genes enriched for lethality in galactose at three FDR thresholds: 191 “high confidence” hits at 10%, an additional 48 hits between 10% and 20%, and 61 hits between 20% and 30% (Figures 1F and 2).*
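For contrast with the local fdr: the "global" tail-area FDR that most people mean is the Benjamini–Hochberg quantity. I am not claiming MAGeCK computes its FDR exactly this way (it may use a permutation scheme); this is just a minimal sketch of the classical procedure for a list of p-values:

```python
import numpy as np

def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values (tail-area FDR estimates)."""
    p = np.asarray(pvals, dtype=float)
    n = len(p)
    order = np.argsort(p)
    ranked = p[order] * n / (np.arange(n) + 1)
    # enforce monotonicity from the largest p-value downwards
    q = np.minimum.accumulate(ranked[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.clip(q, 0, 1)
    return out

pvals = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
print(bh_qvalues(pvals))
```

A gene "below 10% FDR" in this sense means its adjusted value is below 0.10, which is a statement about the expected fraction of false positives in the whole reported list, not about that single gene's probability of being a false positive (the latter is the local fdr).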


I was reading an article about the Monte Carlo method. The link to the article is:

http://ib.berkeley.edu/labs/slatkin/eriq/classes/guest_lect/mc_lecture_notes.pdf

I found the equation in the following attached image.

1) In the equation shown in the attached image, we are assigning a value to a function. What does this mean?

2) What will the expected-value function do with this value?

3) Please guide me by providing examples of the functions f_X(x), E(g(X)), and g(x).
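In case it helps to see the pieces concretely, a minimal sketch (my own example, not from the lecture notes): take f_X as the Uniform(0, 1) density and g(x) = x², so E(g(X)) = ∫₀¹ x² dx = 1/3. The Monte Carlo estimate is the average of g over random draws from f_X.

```python
import random

# Monte Carlo estimate of E[g(X)]: draw samples from the density f_X,
# evaluate g at each sample, and average.
random.seed(0)
n = 100_000
samples = [random.random() for _ in range(n)]   # draws from f_X = Uniform(0, 1)
estimate = sum(x**2 for x in samples) / n        # average of g(x) = x^2
print(estimate)  # close to the true value 1/3
```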

Zulfi.

Specifically: by using regularisation when fitting a very high-order polynomial, intuitively the regularisation part of the objective function should kill off some of the higher-order terms.

However, since the penalty is on the L2 norm of the weights, it will penalise larger weights preferentially, and at first glance this would appear to penalise the *lower*-order terms first.

Since the data should be normalised this is not a problem in reality (i.e. higher-order terms in the data matrix are reduced, so their weights are larger). But in that case my intuition would be that the weight penalty reduces all weights at roughly similar rates, i.e. unrelated to the exponent.
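A small numerical experiment along these lines (my own sketch, using the closed-form ridge solution rather than any particular library): with standardised polynomial features, increasing the L2 penalty shrinks the whole weight vector rather than singling out particular orders.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 50)
y = np.sin(np.pi * x) + 0.1 * rng.normal(size=50)

# Degree-9 polynomial design matrix, columns standardised so the
# penalty treats every order comparably.
X = np.vander(x, 10, increasing=True)[:, 1:]  # columns x^1 .. x^9
X = (X - X.mean(axis=0)) / X.std(axis=0)

def ridge(X, y, lam):
    # closed-form ridge solution: w = (X^T X + lam I)^{-1} X^T y
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features),
                           X.T @ (y - y.mean()))

w_small = ridge(X, y, 0.01)
w_large = ridge(X, y, 100.0)
# stronger regularisation shrinks the whole weight vector
print(np.linalg.norm(w_small), np.linalg.norm(w_large))
```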

Hope that roughly makes sense ... any thoughts?

Thanks in advance.

Ed

Thanks for the answer.

Thank you for reading this question. I am trying to understand dynamic programming in order to find a solution to an optimization problem. I understand optimization through linear programming, but it is not enough to solve my problem.

What I would like to know is how dynamic programming deals with the optimization of the process as a whole. I mentioned LP because I can see how LP works (objective and constraints), and maybe DP has similarities.

Maybe the answer to my question is to study more DP, but any advice would be good.

I feel like a loop should be closed, as in LP, but I do not see the concept or concepts which ensure that.
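For what it's worth, the "loop" in DP is usually the Bellman recursion: the optimal value of the whole problem is defined in terms of optimal values of its subproblems, which plays roughly the role that the objective plus constraints play in LP. A minimal sketch (my own example, the 0/1 knapsack, not your specific problem):

```python
def knapsack(values, weights, capacity):
    # best[c] = best total value achievable with capacity c
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        for c in range(capacity, w - 1, -1):  # go downwards so each item is used once
            best[c] = max(best[c], best[c - w] + v)  # Bellman step
    return best[capacity]

print(knapsack([60, 100, 120], [1, 2, 3], 5))  # 220
```

The `max(...)` line is the closed loop: optimality of the whole follows because every subproblem value stored in `best` is itself optimal.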

I'd appreciate any comments, thank you.

I have attached both the question and the solution.

I can see the application of Fourier's law to the right-hand side of the solution, but why is the right-hand side multiplied by the area? Also, on the left, what has happened to Ko?
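Without seeing the attachment I can only guess at the specific solution, but on the area question: Fourier's law gives a heat *flux* (power per unit area), q = -k dT/dx, so multiplying by the cross-sectional area A converts it into a total heat rate through the section. Illustrative numbers (my own, not from the problem):

```python
k = 50.0        # thermal conductivity, W/(m*K)
A = 0.01        # cross-sectional area, m^2
dT_dx = -200.0  # temperature gradient, K/m

q = -k * dT_dx  # heat flux, W/m^2 (per unit area)
Q = q * A       # total heat rate through the section, W
print(q, Q)
```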

Thanks

I am part of a research project on the intuitive understanding of probabilities in contour plots, and we are seeking participants for a short online experiment.

Participation is completely **anonymous** and will take **less than 15 min of your time**. Just use the browser of your laptop or desktop PC (no mobile devices).

With your participation in the experiment you directly contribute to current basic research at the Friedrich Schiller University Jena and the German Aerospace Center. We would be very happy if you could support our work.

Thank you!

I have just started going through a Python programming book, and one of the functions presented is the error function. I understand it is not strictly necessary for me to understand the mathematics behind it, but I would like to. From looking at it, reading a bit about it, and watching a video on how it is derived, my current understanding is that the function computes the probability that a random variable (if assumptions of normal distribution, standard deviation, and expected value are all met) can be found within [-x, x].

I think two things are unclear to me. Firstly: what exactly is a random variable? The wiki article and some other websites talk about it as if it is just any variable that is determined by chance, such as the rolling of dice (I presume this cannot be used in error functions due to the lack of a normal distribution). However, I don't understand how this could be used (so most likely I am misinterpreting the explanations on the internet). Let's say I measure how tall some people are, and the people I choose are randomly picked with no bias in selection. That random variable's value would be.... what?

Secondly, I don't understand the usage of x in this case (erf(x)). The function's domain is minus infinity to infinity, but for values of x around -3 or 3 we already have almost a 100% chance of finding our random variable. In my imagination those numbers would be arbitrary, and I can't see how one could use them (let's say we measure length in centimetres; why is it that the random variable will almost always be present when we do erf(3) = 0.9999978?).
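On the units point: the argument of erf is not in centimetres or any physical unit; it is a standardised distance from the mean. For a standard normal variable X, P(-x ≤ X ≤ x) = erf(x/√2), which is why the function saturates around 3 ("three sigma"). A quick check with Python's math.erf:

```python
import math

# For a standard normal X (mean 0, sd 1): P(-x <= X <= x) = erf(x / sqrt(2)).
# erf(x) itself corresponds to a normal with sd 1/sqrt(2), which is why
# erf(3) = 0.9999978 is even closer to 1 than the three-sigma value below.
x = 3.0
prob = math.erf(x / math.sqrt(2))
print(prob)  # ~0.9973, the familiar three-sigma rule
```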

I hope the explanation of my thought process and (possibly false) assumptions is enough for someone to point out the faults in my reasoning.

I would like to know how I could apply this function and when it is useful. On Wikipedia it is written: "this is useful, for example, in determining the bit error rate of a digital communication system." But that doesn't (yet) make it clear for me ;/

Hope someone can help, and please forgive my ignorance on the subject.

-Dagl

Graphically, projecting normals to the centres of the lines *A-A″, B-B″, C-C″* (also *p-p″*), where they cross at point *r*, gives me the centre of the single rotation I seek.

Question: how to do that mathematically given only: *p* = (7.160299318411282, 0) rotation1 = -360/21° & rotation2 = 18°?

Caveat: The above procedure does not work for all combinations of two rotations. E.g., in the next image *p* = (50,0) and the rotations are (-45° & +45°), which results in the normals to the bisectors all being parallel!

I know that affine transformation using homogeneous coordinates can be composed [https://en.wikipedia.org/wiki/Transformation_matrix#Composing_and_inverting_transformations], but I am stuck for how to utilise that here as in the environment in which I am doing this (LUA embedded in a FEA package), I only have two mechanisms available: rotation about a point and translation in the XY plane.

Question2: Assuming that I get a solution to Q1 above, is my only option to deal with the Caveat case, to compare the angles of rotation and do something different if they are equal?
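A possible route to Q1 and Q2 in one go, sketched with complex numbers (my own construction, not tested in your FEA/Lua environment): composing a rotation by a₁ about c₁ with a rotation by a₂ about c₂ gives a rotation by a₁ + a₂ about a computable fixed point, except when a₁ + a₂ is a multiple of 360°, in which case the result is a pure translation. That is exactly your caveat case, so the check for it falls out naturally rather than needing a separate angle comparison.

```python
import cmath
import math

def compose_rotations(c1, a1, c2, a2):
    """Compose rotation about c1 by a1 degrees, then about c2 by a2 degrees.

    Points are complex numbers x + 1j*y. Returns either
    ('rotation', centre, angle_degrees) or ('translation', offset)."""
    r1 = cmath.exp(1j * math.radians(a1))
    r2 = cmath.exp(1j * math.radians(a2))
    # Each rotation is z -> r*(z - c) + c; composing gives z -> r*z + t with:
    r = r2 * r1
    t = r2 * (c1 - r1 * c1 - c2) + c2
    if abs(r - 1) < 1e-12:
        # angles cancel (mod 360): no fixed point, pure translation by t
        return ('translation', t)
    # otherwise the centre is the fixed point solving z = r*z + t
    return ('rotation', t / (1 - r), math.degrees(cmath.phase(r)))

# The caveat case: -45 and +45 sum to zero, so no single rotation centre exists.
print(compose_rotations(0, -45, 50 + 0j, 45))
```

In the translation branch your only option in the FEA package is indeed the translation mechanism; in every other case you get a single centre and angle to feed to "rotation about a point".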

Thanks.


Can anyone clarify this for me, please?


I saw this article on ChaCha that said it was about nine millimeters, but that CAN'T be right!

How thick is a sheet of printer paper?
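For a sanity check, a 500-sheet ream of ordinary printer paper is roughly 5 cm tall (my own round figures), which puts one sheet near a tenth of a millimetre, not nine:

```python
# Rough estimate: divide the height of a ream by the number of sheets.
ream_height_mm = 50.0  # a 500-sheet ream is roughly 5 cm tall
sheets = 500
print(ream_height_mm / sheets)  # 0.1 mm per sheet
```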

I would like to know whether or not there is a statistic that can differentiate between the case at the top left versus the top right. Clearly R^{2} does not do so. One could plot the residuals, and the non-random distribution sometimes becomes apparent. However, what I was hoping to find is some number, preferably one that would be calculated by a statistics program, that could be compared in the two situations. I am reading Motulsky's book Intuitive Biostatistics (that is where I first saw the Anscombe quartet), but I have not found anything in his book yet. I am presently using ProStat, which has both a calculation of COD (which I am pretty sure is R^{2}) and a calculation of "Corrl", which is said by the user manual to indicate "how closely the two variables approximate a linear relationship to each other." I note the presence of squared differences in the numerator of COD, which are not found in Corrl.
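One number that does separate those two cases is a lack-of-fit style test: fit a straight line and a quadratic, and compute the F-statistic for the added x² term. A sketch with the first two Anscombe data sets (the published values; I am not claiming ProStat offers this statistic directly):

```python
import numpy as np

# Anscombe sets I (linear + noise) and II (smooth curve): same R^2 under a
# straight-line fit, but the x^2 term is only dramatically significant for II.
x = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y1 = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])
y2 = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])

def sse(X, y):
    # residual sum of squares of an ordinary least-squares fit
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

lin = np.column_stack([np.ones_like(x), x])
quad = np.column_stack([np.ones_like(x), x, x**2])

def lack_of_fit_F(y):
    # F-statistic for adding an x^2 term to the straight-line model
    return (sse(lin, y) - sse(quad, y)) / (sse(quad, y) / (len(x) - 3))

print(lack_of_fit_F(y1), lack_of_fit_F(y2))
```

A huge F for the second set flags the curvature that R² (and Corrl) cannot see.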

Joe has investments in Company A, Company B, and Company C.

Joe is fated to earn $25.00 from Company A within 2 days from now.

Joe is fated to earn $45.00 from Company B within 3 days from now.

Joe is fated to earn $100.00 from Company C within 5 days from now.

Joe is fated to earn no more than $26.00 from Company C and Company B on day 1 (1 day from now).

Joe is fated to earn at least $14.00 from Company A and Company C on day 2 (2 days from now).

Joe has to earn twice as much money on the first day as on the second day from Companies A, B, and C, and twice as much money on the second day as on the third day from Companies A, B, and C. This can be expressed algebraically as Joe earning x money on day 3 (3 days from now), 2x money on day 2 (2 days from now), and 4x money on day 1 (1 day from now).

Joe can earn whatever amount of money (that satisfies the other conditions) from Companies A, B, and C on day 4 and day 5 (4 and 5 days from now).

What is the lowest amount of money Joe can earn on day 1 (1 day from now) from Companies A, B, and C? Explain your reasoning.

P.S. How come there don't seem to be good formulas to use for this question?


Using PSPP, I was doing some basic linear regressions, examining the following correlation:

I included data from over 100 countries and looked at both baseline values and values 15 years later, calculating the differences in value for both the independent and dependent variable and plotting them in a graph.

Results were as follows:

A weak R-squared value, which was highly significant nonetheless (P < 0.0001), with a negative trendline. No confounding was detected from other variables.

I found a "strong" correlation between baseline values of the dependent variable and its successive changes in value during the 15-year follow-up period. The R-squared value was > 0.7. There was a positive correlation: higher changes during follow-up were related to higher baseline values.

My problem is as follows:

Most values for the dependent variable dropped over the 15-year follow-up period. When I added baseline values for the dependent variable to the model, there was no noteworthy correlation left between the independent- and dependent variable (P for sig: > 0.50).

**Would it be correct to assume the negative correlation between the independent and dependent variable was (probably) caused by the strong correlation between the two values for the dependent variable?**

Subgroup analyses of the correlation between the independent- and dependent variable after 15 years showed the following:

-Decreases in values for the independent variable were not linked to changes of the dependent variable.

-Increases in values for the dependent variable were not linked to changes of the independent variable.
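What you describe sounds like mathematical coupling / regression to the mean: because change is computed as follow-up minus baseline, it is correlated with baseline automatically. A tiny simulation (my own, nothing to do with your country data) shows the effect appear from pure measurement noise:

```python
import numpy as np

# Baseline and follow-up are noisy measures of the same underlying quantity,
# so change = follow-up - baseline correlates with baseline automatically,
# even though nothing real changed between the two time points.
rng = np.random.default_rng(1)
true_value = rng.normal(size=1000)
baseline = true_value + rng.normal(scale=0.5, size=1000)
followup = true_value + rng.normal(scale=0.5, size=1000)
change = followup - baseline
r = np.corrcoef(baseline, change)[0, 1]
print(r)  # negative: high baselines tend to show larger drops
```

That a spurious baseline-change correlation arises here without any real effect is why adjusting for baseline can make an apparent independent-dependent correlation vanish.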


]]>

I need your help to calculate (approximate) a double series as shown in the attached.

Thank you so much.

Best,

Steve

(attachment: double_series.pdf)

This problem arises in data compression; consider the bits that make up a file (or a substring of bits of the file) and treat them as a number (i.e. the bits are the binary representation of this number). If we could find a function-plus-input(s) pair whose output happens to be the substring, the whole substring could be replaced by the function and its input(s).

I've thought of expressing the number as sums (or differences) of relatively big powers of prime numbers. Is this a good approach? If not, what would be a good one? And how should I proceed?

Motivation for the question: A simple function like raising the nth prime number p to a power S can result (depending on the values of p and S) in various outputs, each of which is unique (given that any number has only one prime factorization). If we pick p = 17 and S = 89435, for example, that is computationally trivial to compute (it takes logarithmic time) and will result in a somewhat gigantic number. We can then generate a file whose bits are the same as the binary representation of this number (or at least some of the bits are). (This is just a rough example.) The problem is going the other way: given a bit string (hence, a number), how to express this specific bit string with fewer bits (very few, actually) through a function that results in the number.
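The forward direction is indeed cheap; for the example above (p = 17, S = 89435), Python's built-in pow expands the two small inputs into a bit string hundreds of thousands of bits long:

```python
# The "forward" direction from the example: a tiny description (two
# small integers) expands into a huge, fully determined bit string.
p, S = 17, 89435
n = pow(p, S)           # fast: square-and-multiply on big integers
print(n.bit_length())   # length of the expanded bit string
```

The hard part, as you say, is the inverse: almost no bit strings are exact prime powers, so most inputs have no short description of this particular form.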

Any ideas/answers/comments are welcome!

f(x) = [f(x) + f(-x)]/2 + [f(x) - f(-x)]/2 = f_even(x) + f_odd(x)
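A quick numerical check of the split (my own toy example): for f(x) = x³ + x², the even part recovers x² and the odd part recovers x³.

```python
def f(x):
    return x**3 + x**2           # neither even nor odd

def even_part(x):
    return (f(x) + f(-x)) / 2    # recovers x**2

def odd_part(x):
    return (f(x) - f(-x)) / 2    # recovers x**3

x = 1.7
print(even_part(x), odd_part(x), f(x))
```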

So, for even functions, the odd part equals zero, and vice versa. It may be surprising that such simple logic reveals a truth that can seem counterintuitive. It is interesting, however, that symmetry in the microscopic world, for example in the world of elementary particles, is exact, while in the macroscopic world, for example in biology, it is only approximate. Why is it so?

By that I mean that while a hydrogen molecule is perfectly symmetrical, consisting of two identical atoms, our bodies are not perfectly symmetrical, nor can we produce any macroscopic object that is perfectly symmetrical. Is there a mathematical explanation for that fact, or does this question belong in a philosophy forum?


Take the formula for momentum, p = mv, for example - you have (say) kg times m/s. I know how to interpret m/s - for every second that passes by, so many meters are traversed. But what does kg·m mean? For every second, there are so many kilogram-meters. But what is a "kilogram-meter"?


I wonder if anyone might have anything to say on the subject and whether it can be shown in more detail how this is the case.


In the machine learning library scikit-learn for Python, the logistic regression function has an argument "class_weight". When you set a higher class weight for a class while fitting the logistic model, you get higher predictive accuracy for this class. I wish to know what the mathematical principle behind setting class_weight is. Is it related to modifying the objective function of logistic regression (https://drive.google.com/open?id=16TKZFCwkMXRKx_fMnn3d1rvBWwsLbgAU)?
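As far as I understand scikit-learn's behaviour (treat this as a sketch of the idea, not its exact internals), class_weight multiplies each sample's term in the log-likelihood, so errors on the up-weighted class cost more during fitting. A minimal illustration with a weighted negative log-loss:

```python
import numpy as np

# Class weights enter the logistic objective by scaling each sample's
# loss term: J(w) = -(1/n) * sum_i c_{y_i} * [y_i log p_i + (1-y_i) log(1-p_i)]
def weighted_log_loss(y, p, class_weight):
    w = np.where(y == 1, class_weight[1], class_weight[0])
    return float(-np.mean(w * (y * np.log(p) + (1 - y) * np.log(1 - p))))

y = np.array([1, 0, 1, 1])
p = np.array([0.9, 0.2, 0.7, 0.6])
# up-weighting class 1 makes its misclassifications cost more,
# pushing the fitted model toward predicting class 1 correctly
print(weighted_log_loss(y, p, {0: 1.0, 1: 1.0}),
      weighted_log_loss(y, p, {0: 1.0, 1: 5.0}))
```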

Thank you in advance.
