
Did the Definition of Variance in Probability Change?


TakenItSeriously


Below is from Wikipedia:

In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its mean, and it informally measures how far a set of (random) numbers are spread out from their mean. The variance has a central role in statistics. It is used in descriptive statistics, statistical inference, hypothesis testing, goodness of fit, Monte Carlo sampling, amongst many others. This makes it a central quantity in numerous fields such as physics, biology, chemistry, cryptography, economics, and finance. The variance is the square of the standard deviation, the second central moment of a distribution, and the covariance of the random variable with itself, and it is often represented by [math]\sigma^2[/math], [math]s^2[/math], or [math]\operatorname{Var}(X)[/math].
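For reference, the definition that excerpt describes, written out in standard notation (discrete and continuous cases):

[math]\operatorname{Var}(X) = \operatorname{E}\!\left[(X-\mu)^2\right] = \sum_x (x-\mu)^2 P(x) \quad\text{or}\quad \int (x-\mu)^2 f(x)\,dx, \qquad \mu = \operatorname{E}[X][/math]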

Not that long ago, probability and statistics had different definitions of variance because of their different frames of reference: probability is about future expectations, and statistics is about past results.

 

In probability, it used to be something like: the maximum amount something could change based on a single event. However, now I can't find any evidence of that old definition.

 

My question has nothing to do with the merits of the change.

 

Without having thought that hard about it, the new form seems more useful in some respects but worse in others, so perhaps there should be two terms instead of one. However, just changing a definition without any reference to the change is confusing and dangerous.

 

My questions are:

  • When did it change?
  • How could it have changed with no evidence of the change turning up in a quick Google search?

"In probability, it used to be something like: the maximum amount something could change based on a single event. However, now I can't find any evidence of that old definition."

 

I'm pretty darn elderly and I have never seen such a definition. The probability definition of "variance" is the sum of the squared deviation of x from the mean times the probability that x occurs, [math]\sum (x- \mu)^2P(x)[/math], where x varies over all possible outcomes (and, in the case of a continuous variable, the sum becomes an integral).
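As a minimal sketch of that sum, here it is for a single fair six-sided die (the die is just an illustrative choice):

[code]
from fractions import Fraction

# pmf of a fair six-sided die: each outcome 1..6 occurs with probability 1/6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# mean: mu = sum over x of x * P(x)
mu = sum(x * p for x, p in pmf.items())

# variance: sum over x of (x - mu)^2 * P(x)
variance = sum((x - mu) ** 2 * p for x, p in pmf.items())

print(mu)        # 7/2
print(variance)  # 35/12
[/code]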

Edited by Country Boy

"In probability, it used to be something like: the maximum amount something could change based on a single event. However, now I can't find any evidence of that old definition."

 

I'm pretty darn elderly and I have never seen such a definition. The probability definition of "variance" is the sum of the square of x times the probability x occurs, [math]\sum (x- \mu)^2P(x)[/math], where x varies over all possible outcomes (and, in the case of a continuous variable, the sum becomes an integral).

I'm certain they were different as recently as 10 years or so ago. I remember it was extremely annoying in the poker forums when long debates would break out over variance because people didn't realize they were discussing two different definitions.

 

Specifically, it was an issue of variance as used in tournaments, where large entry fees and extremely large first-place prizes make a huge difference, versus variance as used in online cash games, where it was defined in terms of a database of stats.

 

edit to add: and yes I did look up their specific definitions back then.

Edited by TakenItSeriously

Only 10 years?

 

Perhaps your memory has slipped?

 

There was a time when they used to distinguish between 'absolute variation' and 'relative variation'.

 

Perhaps you are referring to this?

 

Relative variation is the forerunner of the modern approach.

 

Absolute variation was renamed 'range' in about 1920.



1920! I'm not that old, lol.

 

It may have been an issue of the times, due to the poker boom, or of the reliability of my sources, though I know there were at least three of them.

 

I also remember Wikipedia had many definitions for different fields, such as statistics, probability, accounting, etc.

 

edit to add:

Is standard deviation even used in probability? How could that work without a data sample?

Edited by TakenItSeriously

Well, the definition of range sounds like what you are describing, since it defines 'how far you can go'.

 

It comes from the oldest stats book I have:

 

Statistical Methods by F.C. Mills, Professor of Economics and Statistics at Columbia (1924).


Maybe it's the population and sample variance you are conflating?

 

A definition based on the maximum amount something can change sounds odd; it would mean, for instance, that any Brownian motion would have infinite variance.
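On the population/sample point, the two differ only in the denominator (n versus n − 1); a minimal sketch with some made-up numbers, using Python's statistics module:

[code]
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample, mean = 5

# population variance: average squared deviation, dividing by n
print(statistics.pvariance(data))  # 4.0

# sample variance: divides by n - 1 (Bessel's correction), so it is larger
print(statistics.variance(data))   # ~4.571
[/code]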

 

edit:

 

 

"Is standard deviation even used in probability? How could that work without a data sample?"

 

It is often easier to work with the variance rather than the standard deviation, but given that the latter is the principal square root of the former, yes, it's used all the time in probability theory. Maybe one way to think about it is that the theoretical variance is the amount you expect a random variable to vary before you roll the dice (or whatever your outcome is), assuming that the theoretical distribution models your dice-roll outcomes well.
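A minimal sketch of that 'before you roll the dice' idea, assuming a fair die (the seed is fixed only to make the run reproducible):

[code]
import random
import statistics

# theoretical variance of one fair die, computed from the distribution alone
outcomes = range(1, 7)
mu = sum(outcomes) / 6                                      # 3.5
theoretical_var = sum((x - mu) ** 2 for x in outcomes) / 6  # 35/12 ~ 2.917

# empirical variance from actually rolling the die many times
random.seed(0)
rolls = [random.randint(1, 6) for _ in range(100_000)]
empirical_var = statistics.pvariance(rolls)

print(theoretical_var)  # 2.9166...
print(empirical_var)    # should land close to 2.9166...
[/code]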

Edited by Prometheus


I may not have been clear; my point was that variance in statistics and in probability used to be defined differently, and now they're defined the same.

 

The statistics definition was the same as the one given in the article: effectively, the square of the standard deviation.

 

The probability definition was effectively the largest positive or negative swing.

 

But it seems like they need to be different because of the past vs. future points of view. As I mentioned earlier, statistics is an analysis of data from past or current events,

 

while probability is about estimating the odds of future outcomes. So how could you provide a standard deviation when you have no data and can only rely on something like combinatorics to calculate the odds? Is there a standard deviation that could be predicted?

 

Edit to add:

I see the definition was already given, so perhaps I need to look into the sources a little more deeply.

"In probability, it used to be something like: the maximum amount something could change based on a single event. However, now I can't find any evidence of that old definition."

 

I'm pretty darn elderly and I have never seen such a definition. The probability definition of "variance" is the sum of the square of x times the probability x occurs, [math]\sum (x- \mu)^2P(x)[/math], where x varies over all possible outcomes (and, in the case of a continuous variable, the sum becomes an integral).

Edited by TakenItSeriously

Perhaps it's an issue of the context in which I mostly applied probability theory, which is poker and other games, but I see it as a matter of deterministic vs. non-deterministic results: only games of chance could have a predictable variance, while for games of skill the variance can only be measured from past data, which would be a statistical variance. Without data, I'm sure only the fundamental probabilities have any practical use.

 

It may be possible to improve on that based on the tournament structure, but it still wouldn't be very practical, I would think. At least not without quantum computers, because thousands of entries would have some incredible number of permutations to work out. I just don't see a mathematical way around that, except maybe with first-order approximations, which isn't really math in any case, because any valid form of approximation requires using logic, I think.

 

For example:

The variance from throwing 5 dice would be predictable; I would imagine it could be calculated using combinatorics or simulated using Monte Carlo, because the odds are fully determined.
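A minimal sketch of both routes for the five-dice case: exact enumeration of all 6^5 equally likely outcomes, and a Monte Carlo estimate (the seed and sample size are arbitrary choices):

[code]
import itertools
import random
import statistics

# exact: enumerate all 6**5 = 7776 equally likely outcomes of 5 fair dice
sums = [sum(combo) for combo in itertools.product(range(1, 7), repeat=5)]
exact_var = statistics.pvariance(sums)  # 175/12 ~ 14.583

# Monte Carlo: estimate the same variance by simulating rolls
random.seed(1)
simulated = [sum(random.randint(1, 6) for _ in range(5)) for _ in range(50_000)]
mc_var = statistics.pvariance(simulated)

print(exact_var)  # 14.583...
print(mc_var)     # should land near 14.58
[/code]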

 

For the examples of variance in poker tournaments, it depends upon human factors relative to other human factors, which is not that predictable. At least not without past results, at which point it would be a statistical variance even if applied in a probability context.

 

I actually spent a great deal of time studying tournament variance based on personal results, and I found that there were some very big surprises that I would never have predicted, though I was able to explain them in hindsight.

 

Examples:

For a winning player, you might think that the lower-entry-fee events would have lower relative variance. However, at least in my case, I found that the higher-entry-fee events had far less relative variance, due to the more homogeneous play of the more advanced players vs. the more erratic play of beginners.

 

In fact, not only was my variance much, much lower, I even had a much higher ROI at the top levels vs. the lower levels, which surprised me quite a bit. That's because the styles I was facing had become so homogenized through high-volume play and through learning from poker forums. Those players weren't very adaptive, and their style was better suited to the more diverse fields of players at the lower levels.

 

I've always been a highly adaptive player, so given the choice of playing 9 pros with a homogeneous, non-adaptive style that had been optimized against a wide range of players, vs. 9 amateurs who all made moderately larger but more random mistakes on average, I'd rather play the all-pro event, and it wouldn't even be close.

 

I was actually able to see the same kind of phenomenon going on in the field of digital security when every type of account started using identical security techniques. It created an extremely exploitable situation where discovering a single exploit could threaten all accounts, especially if exploited using malware.

 

I don't know if it was what made the difference, but I did warn CERT about this threat on two occasions before I started to see divergence in the kinds of security again.

Edited by TakenItSeriously


 

It sounds like a problem of finding a suitable model.

 

The sum of five dice, for instance, can be well modelled by a theoretical distribution. We need only that some assumptions hold approximately true: mostly the assumption that all 5 dice are fair. From this theoretical distribution, and without rolling a single die, we can calculate the variance: the variance is a property of the theoretical distribution. No rolls are necessary for this to exist, just as no perfect circles need physically exist for pi to exist: variance is a mathematical property. We are hoping to model the dice rolling with this theoretical distribution, and the degree to which we are successful depends on how well our assumptions hold.
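To make that concrete, a minimal sketch that builds the distribution of the sum of five fair dice by repeated convolution and reads the mean and variance straight off it, with no rolls anywhere:

[code]
from collections import Counter
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}  # pmf of one fair die

# convolve the single-die pmf with itself five times: pmf of the sum of 5 dice
pmf = {0: Fraction(1)}
for _ in range(5):
    next_pmf = Counter()
    for s, p in pmf.items():
        for x, q in die.items():
            next_pmf[s + x] += p * q
    pmf = dict(next_pmf)

mu = sum(s * p for s, p in pmf.items())               # 35/2
var = sum((s - mu) ** 2 * p for s, p in pmf.items())  # 175/12

print(mu, var)  # exact values, no dice rolled
[/code]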

 

We may attempt to extend this model to, for instance, outcomes in poker games (about which I know very little, so forgive any technical errors). But here we need to make very many more assumptions, and some of those will be about the nature of the humans involved. It would be very difficult to find a theoretical distribution that models these outcomes as well as we could for the dice rolls. Another method to model these poker outcomes, one which captures the 'human' factor, is to collect data from real games and construct an empirical distribution, i.e., data. We could then use this empirical distribution to predict future outcomes on the assumption that the future will be similar enough to the past for the model to hold.
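A minimal sketch of that empirical route, with an entirely made-up list of past tournament results (in buy-ins) standing in for real game data:

[code]
import random
import statistics

# entirely made-up net results, in buy-ins, from past tournaments
past_results = [-1, -1, -1, 3, -1, -1, 10, -1, -1, -1, 2, -1, 25, -1, -1]

mean = statistics.mean(past_results)
sample_var = statistics.variance(past_results)  # n - 1 in the denominator

# "predict" future outcomes by resampling the empirical distribution,
# assuming the future is similar enough to the past for the model to hold
random.seed(2)
future = random.choices(past_results, k=10)

print(mean, sample_var)
print(future)
[/code]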

