
least action principle ~ Occam's razor ~ dynamics



In the quest to reformulate current physics in terms of an information view, I pose this reflection.

 

The idea behind the principle of least action is that a system evolves along the "trajectory" that minimises the action. Implicit in this is a known relation between trajectories in general and the action.

 

So how can one interpret the meaning of this action?

 

One intuitive interpretation is that the action is simply a measure defined on the set of possible trajectories; it is basically a rating system for trajectories, ultimately related to the a priori probability for a given trajectory to occur.

 

In that case the least action principle can be reinterpreted as an idealisation in which one assumes that the chosen trajectory is simply the most likely one, if the action is associated with a transition probability.
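
To make that association concrete, here is a minimal numerical sketch of my own (not something taken from the thread), assuming a Boltzmann-like rating weight ∝ exp(-S) in arbitrary units: discretised free-particle paths between fixed endpoints are enumerated, and the path with the smallest action comes out as the highest-rated, i.e. "most likely", one. The grid, step count and unit mass are purely illustrative choices.

[code]
import itertools, math

# Toy sketch (my own illustration): rate discrete trajectories by a
# Boltzmann-like weight exp(-S), so "least action" = "highest rated".
# A free particle of unit mass goes from x=0 at t=0 to x=1 at t=1 in
# N time steps; intermediate positions are drawn from a coarse grid.

N = 4                                  # number of time steps
dt = 1.0 / N
grid = [i / 20 for i in range(21)]     # allowed intermediate positions

def action(path):
    """Discretised free-particle action: sum of (1/2) v^2 dt per segment."""
    return sum(0.5 * ((b - a) / dt) ** 2 * dt for a, b in zip(path, path[1:]))

best, total_weight = None, 0.0
for interior in itertools.product(grid, repeat=N - 1):
    path = (0.0,) + interior + (1.0,)
    w = math.exp(-action(path))        # the rating ~ a priori "probability"
    total_weight += w
    if best is None or w > best[1]:
        best = (path, w)

print("highest-rated path:", best[0])  # the straight line (0, 0.25, 0.5, 0.75, 1)
print("relative weight   :", best[1] / total_weight)
[/code]

Whether exp(-S) is the right rating is of course exactly the kind of question raised here; the point of the sketch is only that once a rating is fixed, "least action" and "most likely" become the same statement.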

 

In this view, the physics of the "least action principle" lies mainly in defining the action as a function of the set of possibilities.

 

Thus, if we forget about the notion of action itself (which we may suspect is just a _choice_ of measure for a kind of transition probability anyway, just as entropy is a measure related to a priori probabilities), we might as well directly ask:

 

What is the transition probability for a certain transformation, where the transformation is identified as the one producing a certain change, or trajectory?

 

So what we are looking for is a rating system for transformations, i.e. a sort of physical "subjective probability measure" on the set of transformations.

 

It seems that this measure must be related to our confidence in the initial state, so that one can relate the initial and final states by a "minimum information change". That is, the diffusion of the system is determined by the initial confidence in the system. A highly confident initial state will naturally have a lower measure of transformation, and thus effectively possess "inertia".

 

So this seems to relate, and maybe unify, entropies with actions.

 

Entropy is a measure on the set of states.

Action is a measure on the set of transformations on states.

 

Both are unified at the upper level via confidence ratings, which, taken together, should imply a relational dynamics.

 

So far that seems nice, but where does the set of states come from in the first place? It seems the SET of states should itself be part of the initial information, and be subject to change if one also includes transformations that change topologies. And these should also be rated.

 

Ultimately, it seems it boils down to the rating systems themselves. These must ALSO be part of the initial conditions, and be subject to change?

 

How can we put these pieces together? Any suggestions? This is what I'm currently trying to figure out.

 

I am trying to start simple and take a simple example. How can the transformation of, say, a "q-space" into a q-p space be understood as spontaneous? What exactly drives this expansion? And how can the defining relation between the dual variables be understood as selected as the most fit one? Somehow I think there is an answer to this.

 

And maybe the duality relation can be understood as the most efficient information-sharing split?

 

Anyone aware of papers relating to this?

 

/Fredrik


Surely the path chosen is not the most likely one but the least energetic one (which may also happen to be the most likely, but it is not chosen because it is the most likely; it is chosen because it is the least energetic...)

 

It depends on how you define the probability measure. And how do we define energy, and mass, and space and time?

 

The "energy constraints" should of course we accounted for, so I see no contradiction. The transition probability, should certainly include this issue.

 

And as for "least energetic": that is also relative to the environment. If the system releases energy, then the environment is excited; clearly there is an equilibrium somewhere, a most likely balance.

 

A "high energy" fluctuation is usually more unlikely than a low energy one, so "extreme" or "unlikely" options are expected to be self-supressing.

 

Anyway, maybe I was unclear. I am trying to unify the formalism and understand how certain concepts are actually expected to emerge.

 

"Energy" is definitely on the list to be properly defined. Classically we know what it is, but in the big picture it's still a fuzzy concept.

 

/Fredrik

 

I attempted above to suggest a meaning for entropy and action that relates both to a rating system. I did this without introducing the concept of energy.

 

Of course, in classical mechanics the action is computed from the energies via the Lagrangian, say T-V, and so on. But this is all "classical thinking".

 

What I was aiming at, is to find a deeper, first principle, understanding of this, that doesn't depend on classical ontologies.

 

/Fredrik

 

Maybe I should have said that the purpose of the post is to stimulate reflections in this direction. I surely can't be alone in pursuing these lines...

 

As I see it, the action is a more advanced concept than entropy, so one can start by reviewing entropy. What I'm suggesting is that the principle of least action and the principle of maximum entropy are, in a sense, different expressions of an underlying, more fundamental idea.

 

Try to interpret [math]1/T = k \frac{\partial S}{\partial U}[/math]

 

in terms of a choice of information measure, which can be thought of as the entropy of a probability distribution over distinguishable states: the higher the entropy of the distribution, the more plausible it is that this distribution will be found, given ignorance of the underlying microstates.
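
That last statement is just the standard counting argument, and a small sketch of my own (with arbitrary numbers) may make it explicit: for a fixed number of samples, the multinomial multiplicity of an occupation grows with the Shannon entropy of its relative frequencies, so the maximum entropy occupation is the one realised by the most microstates.

[code]
import itertools, math

# Sketch (standard counting, not specific to this thread): for N objects in
# k bins, the number of microstates W = N!/(n1!...nk!) that realise a given
# occupation (n1,...,nk) is largest for the occupation whose relative
# frequencies have the largest Shannon entropy.

N = 30

def multiplicity(counts):
    w = math.factorial(N)
    for n in counts:
        w //= math.factorial(n)
    return w

def shannon(counts):
    return -sum((n / N) * math.log(n / N) for n in counts if n > 0)

# enumerate all occupations (n1, n2, n3) with n1 + n2 + n3 = N
occupations = [(a, b, N - a - b) for a in range(N + 1) for b in range(N + 1 - a)]
best = max(occupations, key=multiplicity)

print("most probable occupation:", best)        # (10, 10, 10)
print("ln W =", math.log(multiplicity(best)))
print("N * H =", N * shannon(best))             # same order as ln W (Stirling)
[/code]

By Stirling's approximation ln W ≈ N·H, which is the usual bridge between the combinatorial count and the entropy.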

 

The question is: is there a more plausible measure of the a priori plausibility of a state than to actually try to find the "probability" of the state?

 

I.e., what is the mathematical relation between the entropy of a distribution and some probability of the distribution, defined on some space of all possible probabilities? Exactly how would one construct such a measure?

 

And what would the correspondence to "energy" be in such a picture? Any "natural candidates"?

 

Somehow, classically, energy is often seen as a kind of potential impact, or significance in nature, so that there is a limit to the impact a small system with low energy can have on a high energy system, regardless of internal structure. This is intriguing.

 

Maybe energy can be related to the sample size defining the structure? If we consider a relative frequency, this approaches a continuum as the sample size goes to zero, but then so does the information content. What is possibly the physical sense in such a continuum?

 

Could relating energy to the information capacity from the start be at all on the right track?

 

If one tries a combinatorial trick to represent distributions with finite sample sizes, the first idea seems to be discrete relative frequencies. But clearly the _confidence_ in a relative frequency depends on the _absolute_ frequency.

 

[math]m \rho_i = f_i[/math]

 

Here m can be taken to be the "mass", or sample size, of the distribution, and it factors out of the relative frequencies. However, if you perform the combinatorial calculation for the distribution and try to compute the combinatorial "probability" that you will randomly find a particular absolute frequency, the distribution mass does not factor out, and thus one cannot simply take the continuum limit.
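
A tiny numerical sketch of that point (my own numbers; a uniform prior over the bins is an assumption made purely for illustration): at fixed relative frequencies, the combinatorial probability of observing the corresponding absolute frequency vector still depends on the sample size m.

[code]
import math

# Sketch of the point that the "mass" m does not factor out. With an assumed
# prior q over the bins,
#   P(f) = m!/(f1!...fk!) * prod(q_i^f_i)
# is the probability of observing the absolute frequencies f = m * rho,
# and it depends on m even when rho is held fixed.

def multinomial_prob(freqs, q):
    m = sum(freqs)
    p = math.factorial(m)
    for f in freqs:
        p //= math.factorial(f)        # exact multinomial coefficient
    for f, qi in zip(freqs, q):
        p *= qi ** f
    return p

q = [0.5, 0.5]                         # assumed prior over two bins
rho = [0.5, 0.5]                       # same relative frequencies in every case

for m in (4, 40, 400):
    f = [int(m * r) for r in rho]
    print(f"m={m:4d}  P(f)={multinomial_prob(f, q):.3e}")
# The relative frequencies are identical, but the probability of hitting the
# corresponding absolute frequency vector shrinks as m grows.
[/code]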

 

Then a reflection is to try to make sense of this expression

[math]1/T_{info} = \frac{\partial \ln P[\rho,m]}{\partial m}[/math]

 

In this interpretation, the better choice instead of entropy is the "probability of a probability", which can be shown to be more closely related to the relative entropy http://en.wikipedia.org/wiki/Kullback-Leibler_divergence than to the Shannon-like entropy.
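
One standard way to see that link is a Sanov-type estimate, sketched here with arbitrary illustrative ρ and q (so treat it as a check rather than a derivation): per sample, the log of the multinomial "probability of a probability" tends to minus the Kullback-Leibler divergence between the observed relative frequencies and the prior.

[code]
import math

# Numerical check of the known large-deviation relation behind the KL link:
#   -(1/m) * ln P(f)  ->  D_KL(rho || q)   as m grows, with f = m * rho.
# The specific rho and q below are arbitrary illustrative choices.

def ln_multinomial_prob(freqs, q):
    m = sum(freqs)
    lp = math.lgamma(m + 1) - sum(math.lgamma(f + 1) for f in freqs)
    lp += sum(f * math.log(qi) for f, qi in zip(freqs, q))
    return lp

def kl(rho, q):
    return sum(r * math.log(r / qi) for r, qi in zip(rho, q) if r > 0)

rho = [0.7, 0.2, 0.1]
q   = [1/3, 1/3, 1/3]

print("D_KL(rho||q) =", kl(rho, q))
for m in (10, 100, 1000, 10000):
    f = [round(m * r) for r in rho]
    print(f"m={m:6d}  -(1/m) ln P = {-ln_multinomial_prob(f, q)/m:.4f}")
[/code]

This also matches the Wikipedia remark quoted further down, that it is the KL divergence rather than the Shannon entropy that stays meaningful once a reference distribution is in play.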

 

This is how energy or mass might be seen as an overall rating of distributions. When two distributions clash, the impact of a low-measure distribution is bounded by its sample size.

 

Then it might be possible to generalise this to change. So far I have only talked about "static distributions", but how does one distribution actually morph into another? Can this _change_ be rated with a probability? And would the relational changes suggested by that possibly have any relation to dynamics relevant to physics?

 

Similarly, one can try to find the probability not for a static distribution, but for transitions! Or, if you like, probabilities defined on the space of "differential distributions". This can also be interpreted as the probability that you were mistaken in the first place. And it is the fact that one never knows anything exactly that allows this dynamics to take place.

 

And things that you are confident in will not change much relative to other things. This suggests a relational change. Now, if one could also explain structure formation and time from this, it would be a strong case. One should also be able to explain the notion of complex amplitudes rather than real probabilities, but I expect that to come out of this too.

 

I'm an amateur with very little time to spend on this, so my progress is slow. But there must be lots of papers on this that I haven't seen yet. Most standard texts on stat mech I've seen are so entangled with classical thinking that they are more or less useless; even though the parallels are clearly there, it's just not clean enough to expose the beauty.

 

This only serves to briefly express the ideas, for discussion and for relating them to current parallel ideas. If you think bits are missing, that's right, and those are what I'm looking for.

 

I'm basically trying to further develop the probability formalism to better fit reality and physics. A lot of questions at once, but that's life.

 

/Fredrik

 

From http://en.wikipedia.org/wiki/Kullback-Leibler_divergence, the setting of the reflections can be understood:

 

"The idea of Kullback–Leibler divergence as discrimination information led Kullback to propose the Principle of Minimum Discrimination Information (MDI): given new facts, a new distribution f should be chosen which is as hard to discriminate from the original distribution f0 as possible; so that the new data produces as small an information gain DKL( f || f0 ) as possible"

 

and also

 

"MDI can be seen as an extension of Laplace's Principle of Insufficient Reason, and the Principle of Maximum Entropy of E.T. Jaynes. In particular, it is the natural extension of the principle of maximum entropy from discrete to continuous distributions, for which Shannon entropy ceases to be so useful (see differential entropy), but the KL divergence continues to be just as relevant."

 

The interesting thing happens when part of our information says something about the expected change of that very information! This means that information about change is already encoded in the initial conditions.

 

Still, the Kullback–Leibler divergence is just one measure, and the question remains how it relates to something we would call the transition probability, which I think is the relevant measure.

 

Normally, from standard stat mech, the type of "dynamics" we see coming out naturally is basically a kind of diffusion, where the states diffuse down the probability gradient, and the ultimate equilibrium state is that of maximum entropy.
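
For concreteness, here is a minimal sketch of that "trivial" diffusion (the transition matrix and initial state are toy choices of mine): repeatedly applying a symmetric, doubly stochastic transition matrix to a probability vector drives it toward the uniform, maximum entropy state, and the Shannon entropy never decreases along the way.

[code]
import math

# Toy relaxation toward maximum entropy: a symmetric, doubly stochastic
# transition matrix applied repeatedly to a probability vector. For such
# matrices the Shannon entropy is non-decreasing, and the fixed point is
# the uniform distribution.

T = [[0.8, 0.1, 0.1],      # symmetric; rows and columns each sum to 1
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]

def step(p):
    return [sum(T[j][i] * p[j] for j in range(len(p))) for i in range(len(p))]

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

p = [0.98, 0.01, 0.01]     # sharply peaked (high confidence) initial state
for t in range(0, 31, 5):
    print(f"t={t:2d}  p={[round(x, 3) for x in p]}  S={entropy(p):.3f}")
    for _ in range(5):
        p = step(p)
print("maximum entropy ln(3) =", math.log(3))
[/code]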

 

How can we understand more complex dynamics coming out of this, rather than the somewhat "trivial" diffusion and asymptotic approach to heat death? One possibility lies in the fact that entropy and disorder are relative, and thus the a priori "maximum entropy" state may "a posteriori" prove NOT to be the most preferred one, since the conditions have changed.

 

One certainly asks why they don't somehow "meet in the middle", but here is where inertia comes into play. The _confidence_ in the change is so high that it may continue past the presumed equilibrium point. This type of dynamics seems not too unlike that of general relativity! I find this extremely intriguing.

 

So "momentum" can the associated with confidence or "inertia of change". So the concepts are unified. If you ultimately related everything to information, in the epistemological spirit, the unification should be unavoidable. Also seems to fit well in line with the GR nuts and bolts, although possible far more general.

 

But this suggests that we need to reconstruct a lot of the formalism in current models. In particular, any background references, or a priori assumed objective facts, are prime suspects IMO.

 

One interesting key is to try to understand how relations evolve. For example, the relation between information that defines the dual spaces, say x and p.

 

To just postulate, or accept, that relation seems way too easy IMO. The key to understanding it in depth seems to require understanding its history, and perhaps the observable status of both these spaces can be understood via emergent relations?

 

/Fredrik

