
Validity of Statistics


AdvRoboticsE529


What would you say is the validity of statistics?

 

Also, if you're going to support your argument with definitions, please don't; ask yourself why you have so much confidence in those definitions in the first place.

 

I believe the uncertainty just encourages people not to search for the *real* relationship or variables/constants, and it also encourages pointless studies that emphasise correlation when the true relationship (if it exists at all) is *not* proven.

 

This also goes for probability; however, I have heard that probability is quite applicable in quantum physics.

Well, you can derive the basic rules of probability from the propositional calculus and the fact that we can order our beliefs in terms of credence. From there, it's a short hop, skip, and jump to Bayes's theorem. Bayesian statistics has a firm grounding.
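As a small illustration of that grounding, Bayes's theorem drops out of the product and sum rules; a quick numerical sketch (the test characteristics below are invented for illustration):

```python
# Bayes's theorem: P(H|E) = P(E|H) * P(H) / P(E)
# Invented numbers: a test with 99% sensitivity and a 5% false-positive
# rate, for a condition with 1% prevalence.
p_h = 0.01                # prior P(H)
p_e_given_h = 0.99        # likelihood P(E|H)
p_e_given_not_h = 0.05    # false-positive rate P(E|~H)

# Sum rule (total probability): P(E) = P(E|H)P(H) + P(E|~H)P(~H)
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Product rule rearranged: the posterior
p_h_given_e = p_e_given_h * p_h / p_e
print(round(p_h_given_e, 3))  # 0.167 -- a positive test is far from certainty
```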

 

Unfortunately for frequentist statistics, it's more of a messy jumble of stuff.


I have no opinion on the approximation theory. The purpose of replacing statistics should be to eliminate uncertainty, hence estimations will not be preferred regardless of their relations.

 

You use too many metaphors and are very unspecific, which makes it difficult to discuss with you, but I will try. You also shouldn't be so presumptuous in claiming that many problems cannot be solved with "explicit" equations; this is the mentality that should be discouraged, and that statistics encourages. Just because certain problems seem difficult at present does not mean they are unsolvable; we should work towards certainty rather than living in uncertainty. I previously asked my teacher how to determine the steepest gradient of any given function, and she said it was not possible (I searched online and found a method, which is beyond my current skill yet is possible). Other questions I have asked include the summation of root numbers, which is premature, yet that does not mean precise formulas are impossible. You are not unlike the many authorities who are confident that what they know is the truth; relativity showed that time is relative, in contrast to the earlier authorities who refused such concepts.

The nature of science is that statistical arguments are gradually replaced with exact mathematical ones -- when they can be. Kepler's orbital laws, for example, were basically an empirical and statistical argument from data, with no theoretical grounding. Once Isaac Newton formulated a theory of gravity, he was able to replace the empirical argument with a theoretical one and show why Kepler's laws had to be true.

 

Statistics is a way of making progress in science even without the exact theory. Without statistics to give us a rough picture, we would not know what our final theory should look like.

 

On the other hand, my current research (I am getting a PhD in statistics) isn't something that can be replaced with exact formulas. I'm trying to map the background radiation levels over a wide area. Background radiation levels are a function of the radioisotopes buried in the ground and in common building materials, such as concrete. There's no a priori way to figure out where the radioisotopes are, unless you can devise math to predict exactly how the Earth would be shaped from four billion years ago to today, including construction and man-made activity. So I have to measure empirically, and to do that I need statistics.
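The empirical workflow described here can be sketched in miniature. The numbers below are invented stand-ins (not real radiation data), but they show how a measured estimate carries a quantified uncertainty:

```python
import math
import random

random.seed(0)
true_level = 3.0  # hypothetical background level, arbitrary units
# 100 noisy readings of the same spot; the noise scale is an assumption
readings = [random.gauss(true_level, 0.5) for _ in range(100)]

n = len(readings)
level_est = sum(readings) / n  # sample mean
var_est = sum((x - level_est) ** 2 for x in readings) / (n - 1)  # sample variance
sem = math.sqrt(var_est / n)  # standard error of the estimated mean

# The estimate comes with quantified uncertainty, which is the point:
print(f"estimated level: {level_est:.2f} +/- {sem:.2f}")
```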


@Bignose

 

No, that is not what I'm saying: you can substitute statistics, yet the mentality impedes the encouragement of certainty.

 

Putting all the other aggressive comments aside:

 

Statistics is not certain; for most concepts there are no proofs. Take the definition of variance (I have made this statement before): one component is the sum of the squared differences between the 'x' values and the mean. Depending on the person I have spoken with, the reasons given are different; the most common answers are that squaring inflates the values, that it ensures a positive value (hence eliminating any possible application of complex numbers), or, subjectively for some people, that it is mathematically "nicer". The formulas for correlation are based on the definition of variance and are not proven, only derived from uncertain definitions; standard deviation is another example derived from the definition of variance. One of the reasons I asked before about a method to determine the steepest gradient of any given function is that I want to see what I can do with correlation in statistics, to make correlation mathematically purer and not just based on the ambiguous definition of variance.

 

Now, you have different analytic statistical methods, such as the classical median, mean, or mode. It has been made clear that these classics are also ambiguous and uncertain: depending on the set of data, the value can be skewed and biased, and they are not completely applicable. You also have methods for skewness and the distribution of data, which are essentially more definitions that fail to seek out true relationships as pure maths does, only altering various parts of definitions to produce the numbers they want.

 

Finally, in the case of probability: probability can only be proven within the context of probability theory; in any other given context it is false, unlike proofs found in other branches of maths. If you determine the probability that a huge comet will crash into Earth and destroy large portions of the life on it, you can apply probability and presume to understand the picture. Yet you could instead search for the variables, calculate the size of comet necessary to pass through Jupiter's gravitational field and so on, and implement these in computer algorithms to arrive at 100% certainty. This is the thing I keep repeating.

 

Also, I study electronics, and in modern electronics there is no uncertainty; that's why at the beginning I asked about the application of probability in quantum physics. In the real world, including economics (which I also study), probability is usually based on luck; it does not give a picture of anything, as probability is derived from limited variables with unknown relationships that are not accounted for.


"And, there is no way to be 100% sure what is infecting you will kill you" -- more presumptuous statements from the people who defend statistics; clearly you're not interested in discussion, only in decisive statements.

Edited by AdvRoboticsE529

That post contains a number of misconceptions -- sadly, very common ones.

Statistics is -- like the rest of maths -- objective.

You can apply a number of statistical tests to some data and get different outcomes.

There are two reasons for that

1) some of the tests are simply not appropriate.

2) the tests have different powers, and they always give an "answer" that's probabilistic in nature, so they can legitimately differ.

 

It is the job of scientists to be objective.

It's also their job to either know what statistical methods to apply or to get help from a statistician.

The problem is that most scientists think they know what they are doing, so they don't ask the experts. Unfortunately, they also overestimate their own abilities and use the wrong methods and tests.

 

So, when you say "it is up to the professionals of the field to decide if it is useful":

Do you mean you should choose the right tools for the job (in which case I agree) or are you saying

"keep on doing different tests until you get the answer you want"?

in which case you are talking about the antithesis of science.

 

 

It's also bizarre to claim, on a science website that

"right and wrong are subjective ideas in themselves"

The right answer in science or maths is the right answer.

 

 

i took care to not say that statistics was subjective, although i see where you may have misread.

statistics is clearly a form of math used in almost every field i can think of at least to some degree.

every time you deal with a number from a real world thing you are looking at a statistic.

 

your last was a swing and a miss also. any time someone has to tell me what i am saying, i immediately start looking at the ground for trails of straw.

 

i neither implied nor said outright anything resembling statistics being invalid.

are you sure that you were reading my posts?



I still think there is a gross misunderstanding. You don't prove a definition. You state a definition. Then you can use proofs to show what rules that defined quantity follows, how it interacts with other defined or proved quantities, and so on.

 

You are still railing against how the terms are interpreted: "such as with the classical median, mean or mode, these are classics in which have been made clear that they are also ambiguous". The definitions of these terms are ironclad. How people use them, how they are interpreted and so on -- that's all fair game for discussion. But I don't see how you can call the definitions of these terms ambiguous. The definitions use well-established notions from mathematics, like addition and division -- clearly not ambiguous.

 

Or, let me put it this way... what is ambiguous in the definition of a mean = sum of all elements divided by the count of those elements?
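That definition is short enough to state as code (a trivial sketch):

```python
def mean(xs):
    """Arithmetic mean: the sum of all elements divided by their count."""
    return sum(xs) / len(xs)

print(mean([2, 4, 6, 8]))  # 5.0 -- no ambiguity anywhere in the recipe
```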

 

In other words, I still don't understand exactly what your problem with statistics is, nor do I understand exactly what you think can replace it.


"Statistics is not certain, for most concepts there are no proofs. In the definition of variance, I have made this statement before, a component is the summation of the difference of 'x' squared values minus the mean, depending on the person I have spoken with before, the set of reasons are different, most common answers is that it inflates the values, ensures positive value"

 

Nope.

The choice of squared (rather than, for example, the absolute difference or the 4th power) is a decision made on the basis of the underlying distribution.

If that distribution is normal (or is modeled as such) then the square of the difference is the one that gives the right answer.
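One way to make the "measure of wrongness" point concrete (a numerical sketch, not a proof): the penalty you choose determines which summary statistic falls out as best. Squared error is minimised by the mean, the quantity at the centre of the normal distribution, while absolute error is minimised by the median:

```python
data = [1.0, 2.0, 2.0, 3.0, 10.0]  # a deliberately skewed sample

def sum_sq(c):
    """Squared-error 'measure of wrongness' about centre c."""
    return sum((x - c) ** 2 for x in data)

def sum_abs(c):
    """Absolute-error 'measure of wrongness' about centre c."""
    return sum(abs(x - c) for x in data)

# Brute-force search over a fine grid of candidate centres
candidates = [i / 100 for i in range(0, 1101)]
best_sq = min(candidates, key=sum_sq)
best_abs = min(candidates, key=sum_abs)

print(best_sq)   # 3.6 -- the arithmetic mean (18 / 5)
print(best_abs)  # 2.0 -- the median
```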

 

 

And re Davidivad's comment "i implied nor said outright anything resmbling statistics being invalid."

Nope, and I didn't say otherwise, so it's you who is straw manning.

However you did say "right and wrong are subjective ideas in themselves, we pick the interpretations of those best able to pick the least of evils."

And that's not science.


Perhaps the definitions of classics such as the "averages" should be interpreted; however, this does not change the fact that definitions such as that of variance, or of probability theory, are far from reality. This isn't an f = m*a type of definition, or theta = l/r for radians; the definitions are purposely engineered to result in a number which fits *their version* of reality, and they do not look at the real relationship, which I keep repeating. If you link the variables in relations that distinguish the variation of one set of data from others, yet arrive at different or undesirable numbers such as complex numbers, then by changing the definition to avoid those results you are already making it artificial, to fit their own ideal world.

 

There are countless respective fields where I do not have methods to replace statistics; that does not mean it shouldn't be improved, nor does it validate the method. Anyway, I intend to work in the medical field to eliminate uncertainty, hence statistics.

 

In probability theory, an outcome is only as likely as its supposed "concentration" (basically a fraction or percentage, represented with decimals -- again, more artificial constructs) within a set, which is assumed to be random when it is more likely simply not understood properly. If statistics were 100% valid, akin to the other branches of maths, this should hold true when statistics is applied to unbiased gambling; yet it depends on the gambling, and the relative differences are usually produced by unknown variables that affect the known variables and are not accounted for.

 

I have heard that probability is quite applicable in quantum physics; either it is not understood thoroughly due to complexity, or perhaps probability really does hold 100% valid there, as I am told is confirmed by experiments.


@John Cuthber

 

Yes, it is based on an underlying distribution that is specifically engineered to produce desired numbers and properties of the set of data, not the value given by the true relationships themselves.

Edited by AdvRoboticsE529

Looking over this thread I see that several times I have offered friendly pointers and even made a few short unequivocal statements.

 

All of which you have wafted away with an imperious wave of the hand and some rudeness.

 

Considering your fixation with 'certainty', especially in mechanics, I am surprised that, having been told that

 

There are mechanical systems where the mathematics cannot be solved

 

and

 

There are mechanical systems where there are no mathematical formulae

 

You have not enquired what I think these are.

 

Are you afraid you might learn something?

 

Did you follow up my pointer to the theory (famous in the history of science) that matched your statements and ideas exactly?

 

What do you actually mean by certainty?

 

"Give three warfarin tablets daily" is certainty (if three are indeed handed out).

 

But that could kill the patient with nearly 10 times too much warfarin if blue ones are taken instead of brown ones.


Distribution is defined so as to make the set of data possess specific properties and values; if you disagree, you might as well show me the relationship of distribution (extremely subjective) and prove it.

 

To make it less subjective, I had the idea of finding the steepest gradient of any given function, and of course of knowing the difference in gradient over an infinite set of limiting values, which shows the true relationship, in contrast to statistics. Not that I'm anywhere close; and if I ever am, I will not publish it here.


@studiot

 

Please don't make it personal.

 

There are mechanical systems which are currently not conclusive; that doesn't mean they cannot be made conclusive. That has always been my point.

 

Certainty means 100% accuracy.

 

Giving three warfarin tablets daily is not certainty, as the patient you are treating may be quite different from the average (statistical analysis); personalised treatment will prevail over statistical analysis.

Edited by AdvRoboticsE529

OK.

Start with a binomial distribution -- say, tossing a coin.

Half the time you get heads and half the time you get tails.

You can calculate the distribution for the number of heads you get with any given number of trials.

 

That's fine, but it's a discrete distribution, and a lot of real ones are continuous, so a replacement was found: the normal distribution, the distribution you get from a very (in principle, infinitely) large number of coin tosses.

 

If you look at the variance (a measure of how wrong you are likely to be), you can calculate it.

It turns out that, if you want to use the same "measure of wrongness" for binomial and normal distributions you need to use (x-mu) squared.

 

Neither the normal nor the binomial distribution is "subjective". They are mathematically deterministic.
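The coin-toss convergence can be checked rather than taken on trust; a quick simulation sketch (the trial counts are arbitrary choices):

```python
import random

random.seed(1)
n, trials = 100, 5000  # 100 tosses per trial, repeated many times

# Each count of heads is one draw from Binomial(n, 0.5)
counts = [sum(random.random() < 0.5 for _ in range(n)) for _ in range(trials)]

# Theory says mean = n*p = 50 and variance = n*p*(1-p) = 25; compare:
sample_mean = sum(counts) / trials
sample_var = sum((c - sample_mean) ** 2 for c in counts) / trials

print(round(sample_mean, 1))  # close to 50
print(round(sample_var, 1))   # close to 25, i.e. sigma close to 5
```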


Exactly.

"Half the time you get heads and half the time you get tails."

That can only be proven given that the context matches -- in this case, probability theory.

 

It doesn't just "turn out"; it is exactly that they prefer the value of the variance to be seemingly logical, hence positive.

 

The binomial distribution is not subjective, as it was invented; the distribution of data, however, is.


You do know that the binomial distribution holds whatever the probability of getting heads is, don't you?

It assumes that there is a probability and that the probability is constant with time- that's all.

After that, it's just maths and, as such, objective.

 

Perhaps it would be better if you said what problem you are actually trying to solve. Otherwise I think we will just go round in circles.


You have already entered development after applying probability theory, with uncertainty and impure proofs from artificial definitions that do not look at the true relationship but are engineered specifically from previous, simpler definitions.


If it were pure maths this would not hold at all, because the relationship between the properties of the set of data itself is not looked at sufficiently.

This is akin to decision maths, where there are algorithms which work yet can only be proven given an artificial context made purposely to prove them, only to be replaced by a better algorithm until the field delves more into pure maths, as with the Taylor series.


Perhaps the definitions of classics such as the "averages" should be interpreted; however, this does not change the fact that definitions such as that of variance, or of probability theory, are far from reality,

Can I ask you to please answer this direct question: what is ambiguous in the definition of a mean = sum of all elements divided by the count of those elements?

 

You claimed it was ambiguous here:

 

Now, you have different analytic statistical methods, such as the classical median, mean, or mode. It has been made clear that these classics are also ambiguous and uncertain: depending on the set of data, the value can be skewed and biased, and they are not completely applicable.

Let's start simple and clear up what ambiguities you have on the mean before we talk about variances.


Sorry, typed a bit quickly.

 

The application of such methods in order to analyse the properties of the set of data is ambiguous,

Ok, so we're back to my point, which is that you have a problem with the application of statistics, not the math itself. If this is correct, please confirm, because it certainly hasn't been clear to date.


Well, it is subjective whether it is maths at all. When I do maths, I always try to find the relationships, whether the value is difficult to obtain or not; I don't try to create definitions and methods specifically to fit the range of values I want, or the type of values.

 

The application is also ambiguous. The mean is not used as it is in pure maths, for example looking at the average coordinate, given several, for geometry purposes. In statistics, the notion of the mean is applied to a given set of data in an attempt to minimise its difference from every single value, in contrast to determining the true average coordinate as in geometry, which is linked to all the given coordinates and can be represented with a function.


Well, it is subjective whether it is maths at all,

Really? Now you're going to claim that sums and division are not math at all?

 

I always try to find the relationships, whether the value is difficult to obtain or not; I don't try to create definitions and methods specifically to fit the range of values I want, or the type of values.

You are free to find whatever relationship you want. But to simplify the language, statisticians and mathematicians have set a specific definition of the mean.

 

It doesn't do any good to try to discuss what you think a 'mean' is, if you are going to use your own definition.

 

It doesn't do us any good to argue about a Tesla Roadster when what I call a Tesla Roadster, the rest of you call a 'banana'. So let's just stick with the accepted definitions, shall we?

 

The application is also ambiguous. The mean is not used as it is in pure maths, for example looking at the average coordinate, given several, for geometry purposes. In statistics, the notion of the mean is applied to a given set of data in an attempt to minimise its difference from every single value, in contrast to determining the true average coordinate as in geometry, which is linked to all the given coordinates and can be represented with a function.

Please write out in formulas what the difference between these two calculations is, because I don't see how an average coordinate is "pure, for geometry purposes" while the average in statistics is somehow not pure.

 

So, please write out exactly the pure calculation, and the not-pure one, please.

Edited by Bignose

I'm not claiming sums and divisions aren't maths; I'm claiming that their false application does not truly address the relationships.

 

Averages are of course common, especially in geometry and many fields of pure maths; however, their application in the field of statistics is quite artificial.

 

There is no difference in the calculation, only in its application and purpose. You don't seem to understand: for the different relationships you search for, you apply different methods, and the methodology of the average in statistics only serves to minimise the difference of the final value from every single value. I'll say it again: it is quite artificial.


Averages are of course common, especially in geometry and many fields of pure maths; however, their application in the field of statistics is quite artificial.

 

There is no difference in the calculation, only in its application and purpose. You don't seem to understand: for the different relationships you search for, you apply different methods, and the methodology of the average in statistics only serves to minimise the difference of the final value from every single value. I'll say it again: it is quite artificial.

Please back this up instead of just stating it. You're right, I don't understand, because you aren't explaining it very well. I don't understand why it is 'pure' in geometry, but 'artificial' in statistics.

Edited by Bignose

Certainty means 100% accuracy.

 

Certainty is bad science.

 

All hypothesis tests should accept the possibility of the observations recorded being potential anomalies. Therefore, any scientific study worth reporting will be reported with measurements of possible error.
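As a concrete example of reporting "with measurements of possible error": a point estimate plus a 95% confidence interval, here using the standard normal approximation for a proportion (the study counts are invented):

```python
import math

# Hypothetical study: 47 of 120 subjects responded to a treatment
successes, n = 47, 120
p_hat = successes / n

# Normal-approximation 95% CI: p_hat +/- 1.96 * sqrt(p_hat * (1 - p_hat) / n)
se = math.sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se

print(f"response rate {p_hat:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```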


For example, the average coordinate, given several coordinates, can in the case of geometry be expressed with a function; you look at the true relationship. Or take the average of different vectors in mechanics, which gives the true resultant vector. In their respective fields, these applications of the average hold true.

 

In statistics, at its most basic, the mean is applied as the average of a set of data in which the known and unknown variables affecting every single value are different, not constant, and hence not equal; yet the average is still applied, and so it only serves to minimise the difference of the mean value from every single value within the set of data.

 

In pure maths or mechanics, the coordinates or vectors are such that a function can give you a prediction of any coordinate or vector; hence they are, shall we say, the same thing.

In statistics, especially when applied to reality, the values within the set of data are mostly very different, with unknown variables affecting them.

 

You can try to test such a function in economics with statistics; it usually will not work and is quite inaccurate.


This is the ambiguity I find: when you calculate the average of a set of data in statistics, it mostly does not give an accurate picture of that set of data.

Edited by AdvRoboticsE529

Do you know that, because of the effects of quantum mechanics, this makes no real sense:

"suppose the average of different vectors in mechanics, which gives the true final vector"

 

Also this

"You can try to test the function in economics with statistics, it usually will not work and is quite inaccurate."

indicates a problem with the model, not with statistics.

Edited by John Cuthber
