# How can information (Shannon) entropy decrease?

Posted (edited)
10 hours ago, Ghideon said:

Let I be information.

'sigh'. If that's what you want I to be, you should also define how the information is encoded.

If you want to call J information that can be deduced from I, you should also define what that means. Is J a copy of I or another encoding of information?

10 hours ago, Ghideon said:

Information K is required to create the function f. This could be a formula, an algorithm or some optimisation.
In the bookcase example I is the book shelf and J is the list of books. Function is intuitively easy, we create a list by looking at the books.

Yes; what you do is copy information from the form it's in when you look at the books, to a different form, a list. The list is not the books and nor is what you see when you look at them.

So, information is something you defined, in terms of books. An arbitrary choice. Information is a choice of an encoding.

Even when it's information that you know is unreadable, say the individual velocities of molecules of gas, in a bottle of gas. You still know there is information, and you know something about the encoding of it. You can design an algorithm; the algorithm doesn't have to have a physical realization, it doesn't have to work. It just has to be an algorithm.

Edited by SuperSlim

5 hours ago, SuperSlim said:

So, information is something you defined, in terms of books. An arbitrary choice. Information is a choice of an encoding.

I disagree.

Information is the meaning of the coding.

##### Share on other sites

12 hours ago, SuperSlim said:

If that's what you want I to be, you should also define how the information is encoded.

Why is encoding important in this part of the discussion? In the bookcase example, does it matter how the titles are encoded?

And in the coin example @studiot used "0101" (0=no, 1=yes); Shannon entropy is as far as I know unaffected if we used N=no, Y=yes instead, resulting in "NYNY".
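A quick way to check the relabelling claim is to compute the empirical entropy of both strings. The sketch below is only an illustration: `shannon_entropy` is a helper name made up for this post (it is not from any library), and it estimates the symbol probabilities from the string itself.

```python
from collections import Counter
from math import log2


def shannon_entropy(message):
    """Empirical Shannon entropy of the symbol frequencies, in bits per symbol."""
    counts = Counter(message)
    total = len(message)
    return -sum((n / total) * log2(n / total) for n in counts.values())


original = "0101"
# Rename the symbols one-to-one: 0 -> N, 1 -> Y.
relabelled = original.translate(str.maketrans("01", "NY"))

print(relabelled)                   # NYNY
print(shannon_entropy(original))    # 1.0
print(shannon_entropy(relabelled))  # 1.0
```

Both strings give 1 bit per symbol, because the calculation only sees how often each symbol occurs, not what the symbols are called.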

12 hours ago, SuperSlim said:

'sigh'.

?

##### Share on other sites

Posted (edited)
7 hours ago, studiot said:

I disagree.

Information is the meaning of the coding.

No. An encoding doesn't encode meaning. The meaning of information is something above and beyond how it's encoded.

For instance if you try to run a program that isn't a program (it's not the right format), you'd expect the computer to reject it, to output an error message. If the program is say, an mpg file it wouldn't be expected to run like a program.

If you interpret information, you apply meaning. This is something a digital computer does all the time: it has to be able to distinguish the meaning of an address word from an instruction word, and so on.

1 hour ago, Ghideon said:

Why is encoding important in this part of the discussion?

Encoding is only important if you want to encode some . . . information!

Edited by SuperSlim
##### Share on other sites

13 minutes ago, SuperSlim said:

No. An encoding doesn't encode meaning. The meaning of information is something above and beyond how it's encoded.

For instance if you try to run a program that isn't a program (it's not the right format), you'd expect the computer to reject it, to output an error message. If the program is say, an mpg file it wouldn't be expected to run like a program.

If you interpret information, you apply meaning. This is something a digital computer does all the time: it has to be able to distinguish the meaning of an address word from an instruction word, and so on.

Encoding is only important if you want to encode some . . . information!

I'm sorry but you are either being disingenuous here or mistaken.

You specifically related information to encoding by the use of the word 'choice' , which I highlighted from your post.

Information exists, regardless of the method of encoding or even whether it is encoded or not.

Furthermore, the more complex examples we are discussing in this thread demonstrate that there is even more to information than this. Entanglement brings yet another layer.

Information is most definitely not the choice of encoding since otherwise there would be an infinite amount of information since there is an infinite count of different ways of encoding even a simple binary piece of information.
Since the choice is infinite, the information is infinite, by your definition.

##### Share on other sites

2 hours ago, SuperSlim said:

Encoding is only important if you want to encode some . . . information!

I did not want to encode information, I tried to generalise @studiot's example so we could compare our points of view. You stated I should define encoding:

15 hours ago, SuperSlim said:

If that's what you want I to be, you should also define how the information is encoded.

My question is: why?

##### Share on other sites

Agreed that this is a difficult topic. I think I understand pretty well what entropy is in the context of physics. Yet I'm finding it pretty tricky to build a completely unambiguous conceptual bridge between what entropy means in physics and what it means in computer science. There may well be a good reason why that is so. While entropy is completely central to physics, it seems, unless I'm proven otherwise, that it's not a matter of life and death in computer science, at least from the practical point of view.

It's very tempting to me to start talking about control parameters and how they really determine what information is available to anyone trying to describe a system, but that would make it a discussion too heavily imbued with a purely-physics outlook.

You've been very dismissive of every attempt at defining information, entropy, as well as of other qualifications, comments, etc. by the other three of us. I think you made a good point when you talked about evolution. I was kinda waiting for the illuminating aspects that were to follow. But they never came.

Would you care to offer any insights? I'm all ears. And I'm sure @Ghideon and @studiot are too.

##### Share on other sites

Posted (edited)
8 hours ago, joigus said:

You've been very dismissive of every attempt at defining information, entropy, as well as of other qualifications, comments, etc. by the other three of us.

Ok. Well, I'm sorry if that's how it comes across. I'm "dismissive" generally of people's naive ideas about certain subjects. It's fairly apparent that people generally believe they know all about the subject: what is information, and what is computation, and I'm sorry again, but that is clearly not the case with this thread.

11 hours ago, studiot said:

I'm sorry but you are either being disingenuous here or mistaken.

You specifically related information to encoding by the use of the word 'choice' , which I highlighted from your post.

Information exists, regardless of the method of encoding or even whether it is encoded or not.

Yes, you choose how to encode information. Or maybe the universe does. Information does not exist if it isn't encoded somehow; so your third sentence there is incorrect. The encoding must have a physical basis.

11 hours ago, studiot said:

Information is most definitely not the choice of encoding since otherwise there would be an infinite amount of information since there is an infinite count of different ways of encoding even a simple binary piece of information.

Yes it is. There may well be an infinite number of ways to encode the same finite amount of information; each encoding will be the same information, unless it's been transformed irreversibly.

9 hours ago, Ghideon said:

I did not want to encode information,

Yes you did. You said you could write a list that represented some books.

Quote

. . . I tried to generalise

@studiot's example so we could compare our points of view. You stated I should define encoding:

9 hours ago, Ghideon said:

My question is: why?

Because . . . you said I is some information and information is always encoded.

Seriously, if you handed in an assignment that said "I is information" then didn't specify what kind, what physical units, what the encoding is, a professor would probably not give you a passing grade.

p.s. the level of criticism I'm using in this thread is nothing compared to the real thing; when you study at university, particularly hard science subjects, you get criticised if you say something that's incorrect, that isn't quite the whole story.

This is not because university lecturers are nasty people, it's because they want you to learn something. One lesson is accepting there are things you don't know about, so you don't understand them.

A review of what I just posted: You and I and everyone actually has a good idea of what information is, and what computation is; however, getting those marks in that exam means you need to understand it in a formal way; you need to be able to trot out those equations.

I guess I've been kind of arguing the point, somewhat. But so far, neither studiot, nor Ghideon, has managed to leave the pier. With binary computers a lot of the heavy lifting has been done, thanks to computer designers.

Binary information is pretty obvious. But as I say, how do you know a particular binary word represents an address, or an address offset, or is an instruction? How do you tell which binary strings are which? You don't have to, it's all been done for you . . .

On the other hand, information from say, the CMB, is not different kinds of strings, instructions or addresses. It is encoded though.

8 hours ago, joigus said:

I think you made a good point when you talked about evolution. I was kinda waiting for the illuminating aspects that were to follow. But they never came.

Would you care to offer any insights?

Well, have you heard of Baez' cobordism hypothesis? In which a program is a cobordism between manifolds?

Edited by SuperSlim
##### Share on other sites

Posted (edited)

Also, I'd recommend looking into monoidal categories;  a free monoid is just a set of strings over an alphabet, i.e. an alphabet with concatenation. Formal languages can be free monoids, or monoids with restrictions on string composition (i.e. not free).

A monoid is basically a group with no inverses, but there's an identity (the empty string).

A monoid is, according to category theory, a de-categorification of a symmetric monoidal category (!)
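For anyone who wants to see the string monoid concretely, here is a minimal sketch. It assumes a two-letter alphabet and only checks words up to length 3 (the free monoid itself is the set of all finite strings, which can't be enumerated); `ALPHABET` and `words_up_to` are names invented for the example.

```python
from itertools import product

ALPHABET = "ab"  # hypothetical two-letter alphabet
EMPTY = ""       # the identity element: the empty string


def words_up_to(max_len):
    """All strings over ALPHABET with length 0..max_len."""
    return ["".join(p) for k in range(max_len + 1) for p in product(ALPHABET, repeat=k)]


words = words_up_to(3)

# Concatenation is associative.
assert all((x + y) + z == x + (y + z) for x in words for y in words for z in words)

# The empty string is a two-sided identity.
assert all(EMPTY + x == x == x + EMPTY for x in words)

# Concatenation never shortens a string, so only the empty string has an inverse.
invertible = [x for x in words if any(x + y == EMPTY for y in words)]
print(invertible)  # ['']
```

This is the sense in which strings under concatenation behave like a group without inverses.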

Edited by SuperSlim
##### Share on other sites

3 hours ago, SuperSlim said:

Well, have you heard of Baez' cobordism hypothesis? In which a program is a cobordism between manifolds?

Not really that much of an insight TBH, but I'll take it anyway. The Baez-Dolan cobordism hypothesis is used to classify topological quantum field theories. What does it have to do with computation? I'm familiar with Baez's motto "A QFT is a functor." However interesting such statements may be in physics, what do these manifolds represent in computer science? Topological information on a manifold could be coded in topological invariants such as the Betti numbers. Is that the connection?

2 hours ago, SuperSlim said:

Also, I'd recommend looking into monoidal categories;  a free monoid is just a set of strings over an alphabet, i.e. an alphabet with concatenation. Formal languages can be free monoids, or monoids with restrictions on string composition (i.e. not free).

A monoid is basically a group with no inverses, but there's an identity (the empty string).

A monoid is, according to category theory, a de-categorification of a symmetric monoidal category (!)

Thank you, but no, thank you. I spent three months studying those under a teacher who was a mathematical physicist.

Very nice, very good teacher. A fountain of mathematical knowledge in the vein of Baez. I thank him for the A he gave me, which I didn't deserve. But I don't know what the hell we were doing studying monoids and alphabets as a substitute for group theory for physicists. Apparently, no one in the department wanted to teach the Lorentz group because it's not compact, and that's very very bad, for some reason. I had to learn that from Weinberg's QFT Volume I.

I tell you this just to explain I'm a person very much shaped by the effort of trying to free myself from the shackles of being exposed to layers and layers of abstraction that don't lead anywhere useful.

As to,

4 hours ago, SuperSlim said:

A monoid is, according to category theory, a de-categorification of a symmetric monoidal category (!)

I don't know what to do with that, or in what sense it clarifies what information is, or how it's stored, deleted, encrypted, etc. It certainly doesn't clarify Landauer's principle to me. You might as well have given me the procedure to pluck a chicken.

What particular aspect does it clarify in information theory? That's what I call an insight.

##### Share on other sites

Posted (edited)
15 hours ago, SuperSlim said:

Because . . . you said I is some information and information is always encoded.

Ok. I had hoped for something more helpful to answer the question I asked.

15 hours ago, SuperSlim said:

Seriously, if you handed in an assignment that said "I is information" then didn't specify what kind, what physical units, what the encoding is, a professor would probably not give you a passing grade.

I tried to sort out if I disagreed with @studiot or just misunderstood some point of view, I did not know this was an exam that required such a level of rigor. "List" was just an example that studiot posted, we could use any structure. Anyway, since you do not like generalisations I tried, here is a practical example instead, based on Studiot's bookcase.

1: Studiot says: “Can you sort the titles in my bookcase in alphabetical order and hand me the list of titles?”
me: “Yes”
2: Studiot:  “can you group the titles in my bookcase by color?”
me: “yes, under the assumption that I may use personal preferences to decide where to draw the lines between different colors”
3: Studiot: “can you sort the titles in my bookcase in the order I bought them?” Me: “No, I need additional information*”

I was curious about the differences between 1,2 and 3 above and if and how it applied to the initial coin example. One of the aspects that got me curious about the coin example was that it seemed open for interpretation whether it is most similar to 1,2 or 3. I wanted to sort out if that was due to my lack of knowledge, misunderstandings or other.
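One way to make the difference between the three cases concrete is to write them as functions and look at what each one needs as input. This is only a sketch: the `Book` class, the shelf contents and the colour rule are all invented for illustration, and the colour is assumed to be stored as a hue angle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    hue: int  # spine colour as a hue angle in degrees, visible on the shelf


# Hypothetical shelf contents, invented for this sketch.
SHELF = [
    Book("Thermodynamics", 5),
    Book("Coding Theory", 230),
    Book("Information", 350),
]


def titles_alphabetical(shelf):
    """Case 1: everything needed is already on the shelf."""
    return sorted(book.title for book in shelf)


def titles_by_colour(shelf, name_colour):
    """Case 2: needs a personal rule (name_colour) for where colours begin and end."""
    groups = {}
    for book in shelf:
        groups.setdefault(name_colour(book.hue), []).append(book.title)
    return groups


def titles_by_purchase_order(shelf, purchase_dates=None):
    """Case 3: impossible unless extra information (purchase dates) is supplied."""
    if purchase_dates is None:
        raise ValueError("purchase dates are not stored in the bookcase")
    return sorted((book.title for book in shelf), key=lambda t: purchase_dates[t])


def my_colour_names(hue):
    """One person's arbitrary choice of colour boundaries."""
    return "red" if hue < 30 or hue >= 330 else "blue"


print(titles_alphabetical(SHELF))
print(titles_by_colour(SHELF, my_colour_names))
try:
    titles_by_purchase_order(SHELF)
except ValueError as err:
    print("Case 3 fails:", err)
```

Case 1 takes only the shelf, case 2 takes the shelf plus a rule chosen by whoever does the grouping, and case 3 cannot run at all until information from outside the shelf is handed in.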

15 hours ago, SuperSlim said:

however, getting those marks in that exam means you need to understand it in a formal way; you need to be able to trot out those equations.

Thanks for the info but I already know how exams work.

*) Assuming, for this example, that the date of purchase is not stored in Studiot's bookcase.

Edited by Ghideon
spelling
##### Share on other sites

Posted (edited)
10 hours ago, joigus said:

Baez-Dolan cobordism hypothesis is used to classify topological quantum field theories. What does it have to do with computation?

Perhaps monoidal categories have more to do with computation than most scientists realise. You understand, I hope, that the "big idea" is that category theory can provide a common language that spans QFT, QIS, classical IS, and maybe some other disciplines?

Unfortunately I'm in the process of moving house and I've stashed all my notes in storage. But that's the concept, that category theory can fill the gaps in understanding. However, it seems to still be largely not understood.

I'll just trot this out, since I do know what the connection is between field theories and monoids.

If you've looked at the computational problem of Maxwell's demon, the monoid in question is N molecules of gas in thermal equilibrium. The demon sees a "string of characters" which are all the same. If the demon could get some information about just one of the molecules and store it in a memory, then the  second law is doomed.

Since the second law doesn't seem to be doomed and time keeps ticking forwards, the demon can't store information about the molecules. It can't encode anything except a string with indeterminate length, from an alphabet of 1 character.

But I'll leave you all to carry on, figuring out whatever it is you think you need to figure out. I can't help you it seems.

So good luck with your search.

Edited by SuperSlim
##### Share on other sites

1 hour ago, Ghideon said:

Ok. I had hoped for something more helpful to answer the question I asked.

I tried to sort out if I disagreed with @studiot or just misunderstood some point of view, I did not know this was an exam that required such a level of rigor. "List" was just an example that studiot posted, we could use any structure. Anyway, since you do not like generalisations I tried, here is a practical example instead, based on Studiot's bookcase.

1: Studiot says: “Can you sort the titles in my bookcase in alphabetical order and hand me the list of titles?”
me: “Yes”
2: Studiot:  “can you group the titles in my bookcase by color?”
me: “yes, under the assumption that I may use personal preferences to decide where to draw the lines between different colors”
3: Studiot: “can you sort the titles in my bookcase in the order I bought them?” Me: “No, I need additional information*”

I was curious about the differences between 1,2 and 3 above and if and how it applied to the initial coin example. One of the aspects that got me curious about the coin example was that it seemed open for interpretation whether it is most similar to 1,2 or 3. I wanted to sort out if that was due to my lack of knowledge, misunderstandings or other.

Thanks for the info but I already know how exams work.

*) Assuming, for this example, that the date of purchase is not stored in Studiot's bookcase.

Very interesting.  +1

Also to @joigus for his latest thoughts.

I had in mind to extend the bookcase/list example and you took (some of) the thoughts right out of my head.

Information is certainly a slippery concept, which is why it is carefully specified (limited) in information theory and associated information entropy, so that case 3 for instance cannot arise within the theory.

I was thinking that one needs definitions / explanations for

Data
Message
Encoding
Encryption
Meaning
and perhaps some other concepts I haven't thought of.

to properly parse the various manifestations of information.

I also have some new examples.

A field marshal about to engage the enemy tells his general A that the plan is to attack at 3am. He will send a signal '1' if this is confirmed and a '0' if he is not to attack at 3am.

A perfectly good message, but the situation is not the same as with the coin and squares game as it is open ended.

A variation in a sort of entanglement occurs if a pincer movement in conjunction with general B is envisaged.
Because general A will not only know what he himself is doing, but also general B's movements.

A second new example concerns nautical signal flags.
A certain admiral operates the following practice.
His flagship flies two signal flags.
The top one carries the fleet number of the ship he is signalling.
Underneath, the second flag carries a sentence from section 5 of Marryat's signals book, say 'Report to Flag'.
Each flag is actually an easily distinguishable colour pattern.
So the message is encoded twice, but not encrypted.

##### Share on other sites

18 hours ago, joigus said:

Baez-Dolan cobordism hypothesis is used to classify topological quantum field theories. What does it have to do with computation?

Unfortunately, these different fields [, physics, topology, logic and computation,] focus on slightly different kinds of categories. Though most physicists don’t know it, quantum physics has long made use of ‘compact symmetric monoidal categories’.
Knot theory uses ‘compact braided monoidal categories’, which are slightly more general. However, it became clear in the 1990’s that these more general gadgets are useful in physics too. Logic and computer science used to focus on ‘cartesian closed categories’ — where ‘cartesian’ can be seen, roughly, as an antonym of ‘quantum’.
However, thanks to work on linear logic and quantum computation, some logicians and computer scientists have dropped their insistence on cartesianness: now they study more general sorts of ‘closed symmetric monoidal categories’.
--https://www.cs.auckland.ac.nz/research/groups/CDMTCS/researchreports/352mike.pdf

##### Share on other sites

Posted (edited)

A note on the initial question @studiot; How can Shannon entropy decrease.

It’s been a while since I encountered this so I did not think of it until now; it's a practical case where Shannon entropy decreases, as far as I can tell.
The Swedish alphabet has 29 letters*: the English letters a-z plus "å", "ä" and "ö". When using terminal software way back in the day, the character encoding was mostly 7-bit ASCII, which lacks the Swedish characters åäö. Sometimes the solution** was to simply 'remove the dots' so that “å”, “ä”, “ö” became “a”, “a”, “o”. This results in fewer symbols and an increased probability of “a” and “o”. The result is a decreased Shannon entropy for the text entered into the program compared to the unchanged Swedish original.

Note: I have not (yet) provided a formal mathematical proof so there’s room for error in my example.

*) I’m assuming case insensitivity in this discussion
**) Another solution was to use the characters “{”, “}” and “|” from 7-bit ASCII as replacements for the Swedish characters.
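Short of a formal proof, the effect can at least be checked numerically. The following is a rough sketch under stated assumptions: the sample sentence is arbitrary (picked only because it contains å, ä, ö, a and o), probabilities are estimated from that one sentence, spaces count as symbols, and `shannon_entropy` and `FOLD` are names made up for the example.

```python
from collections import Counter
from math import log2


def shannon_entropy(text):
    """Empirical Shannon entropy of the character frequencies, in bits per symbol."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * log2(n / total) for n in counts.values())


# 'Remove the dots': map å, ä, ö onto a, a, o (lower case only).
FOLD = str.maketrans("åäö", "aao")

# Arbitrary sample sentence, used only as test data.
sample = "en gammal räksmörgås på ön är inte god"
folded = sample.translate(FOLD)

print(folded)
print(f"original: {shannon_entropy(sample):.3f} bits/symbol")
print(f"folded:   {shannon_entropy(folded):.3f} bits/symbol")
```

The folded text uses fewer distinct symbols and its entropy comes out lower than that of the original. A single sentence is of course not a model of the Swedish language, so the numbers themselves mean little; only the direction of the change is the point.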

Edited by Ghideon
clarified a sentence
##### Share on other sites

1 hour ago, Ghideon said:

A note on the initial question @studiot; How can Shannon entropy decrease.

It’s been a while since I encountered this so I did not think of it until now; it's a practical case where Shannon entropy decreases, as far as I can tell.
The Swedish alphabet has 29 letters*: the English letters a-z plus "å", "ä" and "ö". When using terminal software way back in the day, the character encoding was mostly 7-bit ASCII, which lacks the Swedish characters åäö. Sometimes the solution** was to simply 'remove the dots' so that “å”, “ä”, “ö” became “a”, “a”, “o”. This results in fewer symbols and an increased probability of “a” and “o”. The result is a decreased Shannon entropy for the text entered into the program compared to the unchanged Swedish original.

Note: I have not (yet) provided a formal mathematical proof so there’s room for error in my example.

Interesting, there's life in the old thread yet.  +1

##### Share on other sites

On 3/10/2022 at 11:09 AM, Ghideon said:

The Swedish alphabet has 29 letters*: the English letters a-z plus "å", "ä" and "ö". When using terminal software way back in the day, the character encoding was mostly 7-bit ASCII, which lacks the Swedish characters åäö. Sometimes the solution** was to simply 'remove the dots' so that “å”, “ä”, “ö” became “a”, “a”, “o”. This results in fewer symbols and an increased probability of “a” and “o”. The result is a decreased Shannon entropy for the text entered into the program compared to the unchanged Swedish original.

The way to really consider the Shannon entropy is as a sender, a receiver, and a channel. It's about how to encode a set of messages "efficiently". I consider your example wouldn't change the coding efficiency much; not many words would be "surprising".

##### Share on other sites

1 hour ago, SuperSlim said:

The way to really consider the Shannon entropy is as a sender, a receiver, and a channel. It's about how to encode a set of messages "efficiently". I consider your example wouldn't change the coding efficiency much; not many words would be "surprising".

What's with the quotation marks?

You seem to imply that no word in a code like Swedish would be surprising.

What word in Swedish am I thinking about now?

Spoiler

Besserwisser

##### Share on other sites

7 hours ago, joigus said:

What's with the quotation marks?

You seem to imply that no word in a code like Swedish would be surprising.

What word in Swedish am I thinking about now?

Besserwisser

Jörk? 😄

##### Share on other sites

Posted (edited)
9 hours ago, joigus said:

You seem to imply that no word in a code like Swedish would be surprising.

What I meant was no reader of Swedish would find the missing marks surprising, because they expect to see them. So they would understand written Swedish with or without the marks; it's like how you can ndrstnd nglsh wtht vwls n t.

Or wat.

ys. mst frms r fr jrks. Lk xchmst.

9 hours ago, joigus said:

What word in Swedish am I thinking about now?

What a mature question; you must feel so proud of yourself; you don't even have to try, do you?

Seriously, you don't have anything better than schoolboy jokes? What a bunch of clowns.

Seriously. What a pack of goddam idiots. Patting each other on the back about how much you like each other's inane posts. Jesus Christ.

You can keep this shit. I'm wasting my time with it

Eat shit and die, you dumb fucks.

Edited by SuperSlim
##### Share on other sites

1 hour ago, SuperSlim said:

What I meant was no reader of Swedish would find the missing marks surprising, because they expect to see them. So they would understand written Swedish with or without the marks; it's like how you can ndrstnd nglsh wtht vwls n t.

Or wat.

ys. mst frms r fr jrks. Lk xchmst.

What a mature question; you must feel so proud of yourself; you don't even have to try, do you?

Seriously, you don't have anything better than schoolboy jokes? What a bunch of clowns.

Seriously. What a pack of goddam idiots. Patting each other on the back about how much you like each other's inane posts. Jesus Christ.

You can keep this shit. I'm wasting my time with it

Eat shit and die, you dumb fucks.

My code seems to have worked. True colours in full view.

##### Share on other sites

Posted (edited)
14 hours ago, SuperSlim said:

The way to really consider the Shannon entropy is as a sender, a receiver, and a channel. It's about how to encode a set of messages "efficiently". I consider your example wouldn't change the coding efficiency much; not many words would be "surprising".

Explicit definition of sender, receiver and channel is not required. A difference between 7-bit and 8-bit ASCII encoding can be seen in the mathematical definition of Shannon entropy.

4 hours ago, SuperSlim said:

What I meant was no reader of Swedish would find the missing marks surprising, because they expect to see them. So they would understand written Swedish with or without the marks; it's like how you can ndrstnd nglsh wtht vwls n t.

The example I provided is not about what a reader may or may not understand or find surprising (that is subjective); it's about mathematical probabilities due to the changed number of available symbols. In Swedish it is trivial to find a counterexample to your claim. Also note that you are using a different encoding than the one I defined, so your comparison does not fully apply.
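The probability argument can be written out explicitly. As a sketch, assume the text is modelled as independent draws of a symbol $X$ with probabilities $p_i$ (a simplifying assumption of this note), and let $p$ and $q$ be the probabilities of two symbols that get merged into one, for example "a" and "å":

$$
H(X) = -\sum_i p_i \log_2 p_i ,
\qquad
H_{\text{merged}} - H(X) = p \log_2 \frac{p}{p+q} + q \log_2 \frac{q}{p+q} \le 0 .
$$

Both logarithms are of numbers no larger than 1, so the difference is never positive, and it is strictly negative whenever both symbols actually occur. The upper bound drops as well, from $\log_2 29 \approx 4.86$ bits per symbol for 29 equally likely letters to $\log_2 26 \approx 4.70$ for 26.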

Edited by Ghideon
##### Share on other sites

5 hours ago, exchemist said:

Jörk? 😄

😆

I would seriously like this conversation to get back on track.

I don't know what relevance monoidal categories would have in the conversation. Or functors and categories, or metrics of algorithmic complexity (those, I think, came up in a previous but related thread and were brought in out of the blue by the offended member). The aforementioned offended member then shifts to using terms such as "efficient" or "surprising," apparently implying some unspecified technical sense.

Summoning highfalutin concepts by name without explanation and dismissing everything else everyone is saying on the grounds that... well, that they don't live up to your expectations in expertise, I don't think is the most useful strategy.

I think entropy can be defined at many different levels depending on the level of description that one is trying to achieve. In that sense, I think it would be useful to talk about control parameters, which I think say it all about what level of description one is trying to achieve.

Every system (whether a computer, a gas, or a coding machine) would have a set of states that we can control, and a set of microstates, that have been programmed either by us or by Nature, that we can't see, control, etc. It's in that sense that the concept of entropy, be it Shannon's or Clausius/Boltzmann, etc. is relevant.

It's my intuition that in the case of a computer, the control parameters are the bits that can be read out, while the entropic degrees of freedom correspond to the bits that are being used by the program, but cannot be read out --thereby the entropy. But I'm not sure about this and I would like to know of other views on how to interpret this.

The fact that Shannon entropy may decrease doesn't really bother me because, as I said before, a system that's not the whole universe can have its entropy decrease without any physical laws being violated.

##### Share on other sites

Posted (edited)
On 3/14/2022 at 4:07 AM, Ghideon said:

Explicit definition of sender, receiver and channel is not required.

Right. According to you a computer can be switched off and still be computing! What a fascinating worldview.

On 3/14/2022 at 4:07 AM, Ghideon said:

The example I provided is not about what a reader may or may not understand or find surprising (that is subjective),

More completely dumbass stuff from an "expert". Shannon entropy is about the frequency of messages; it's about information content and how to encode that efficiently. The surprise factor is not some kind of highfalutin terminology. Expectation is not an ill-defined term in communication theory. My guess is you probably think data and information are different things too.

You provide an example: the Swedish language without the extra marks. A change of encoding that makes almost no difference to the information content. So it has about the same entropy.

What a pack of retards.

Edited by SuperSlim
##### Share on other sites

1 hour ago, SuperSlim said:

My guess is you probably think data and information are different things too.

Despite the fact that this thread is primarily about entropy, not information, I think a digression about the difference between data and information is worth noting.

Data can contain other things, besides information.
In communications theory and encryption theory this might be padding data.
Interestingly, padding may contain additional information that is not part of the 'message' or 'information content' of the message. This is because it is possible to analyse the padding to deduce the proportion of padding and therefore isolate the information bits.
Further, in computing theory data may also contain a further section, known as a 'key', for the purpose of information storage and retrieval.

The persistent incivility has been reported.
