
Why does our DNA use a 4 bit code?


alpha2cen


Our cells use DNA as an information storage medium. This is not a 2 bit (0, 1) code, but a 4 bit (A, T, G, C) code.

This is a highly developed method, which has been tested for 4.5 billion years, since the beginning of the Earth.

What is the benefit of this 4 bit storage method?

Link to comment
Share on other sites

DNA is not 4-bit storage; it's a lot more than that. I also think DNA is used to store unique genetic data that identifies the human, so it's more like the perfect hash. It is also the reference every component of the body uses to get information about how to work. For example, the color of the skin is stored in the DNA all over the body ...

Link to comment
Share on other sites

Khaled,

I think Alpha's point is that you could do that equally well with a binary code. Why are there 4 bases, not 2?

Mapping the quaternary code onto a binary one wouldn't be difficult.

However, if you did so then the data strings would be longer.

It may be that 4 bases constitute a compromise between shorter genes and having a more complicated system of lots of bases.

Of course, it might simply be that, because it works, it has carried on.

Link to comment
Share on other sites

2-bit DNA would be more susceptible to damage. With 64 possible combinations (each codon coding for an amino acid is 3 bases long) coding for 20 amino acids, you get a lot of redundancy. Even better: if a mutation occurs that codes for a different amino acid, it is very likely that an amino acid with comparable functionality will be built into the protein.
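To put rough numbers on the redundancy point, here is a small Python sketch. It assumes the standard genetic code (NCBI translation table 1) written in the usual compact T, C, A, G order; the names `BASES`, `AMINO`, `CODON_TABLE` and `synonymous_fraction` are made up for this illustration. It counts how many codons map to each amino acid and what fraction of single-base substitutions leave the encoded meaning unchanged. It says nothing about whether a changed amino acid is chemically similar, which is the second half of the argument.

```python
from collections import Counter
from itertools import product

# Assumption: standard genetic code, DNA alphabet, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"   # codons starting with T ('*' marks a stop codon)
    "LLLLPPPPHHQQRRRR"   # codons starting with C
    "IIIMTTTTNNKKSSRR"   # codons starting with A
    "VVVVAAAADDEEGGGG"   # codons starting with G
)

# Map each of the 4**3 = 64 codons to its one-letter amino acid code.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# How many codons encode each amino acid (or stop).
codons_per_amino = Counter(CODON_TABLE.values())


def synonymous_fraction() -> float:
    """Fraction of all single-base substitutions that keep the same meaning."""
    same = 0
    total = 0
    for codon, aa in CODON_TABLE.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a mutation
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                same += CODON_TABLE[mutant] == aa
    return same / total


if __name__ == "__main__":
    print(len(CODON_TABLE), "codons for", len(codons_per_amino) - 1, "amino acids")
    print("codons per amino acid:", dict(codons_per_amino))
    print("synonymous single-base substitutions:", round(synonymous_fraction(), 3))
```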

Link to comment
Share on other sites

You mean 4-bit, i.e. { 0000, 0001, .. , 1111 }, which means [latex]2^4 = 16[/latex] possibilities and can be represented in hexadecimal { 0, .. , 9, A, .. , F }.

Every digit (or two or three) is enough to encode a value:

1 four-bit digit = [latex]2^4 = 16[/latex] possible values

2 four-bit digits = [latex]2^4 \times 2^4 = 256[/latex] possible values

3 four-bit digits = [latex]2^4 \times 2^4 \times 2^4 = 4096[/latex] (4K) possible values

Amazingly, if we go to 4 four-bit digits = [latex]2^4 \times 2^4 \times 2^4 \times 2^4 = 65536[/latex] (64K) possible values!

And if you go a little bit further, it becomes astronomical ...

Also, about the safety of the data: I don't think there is any safety. It's genetics-based, relying on population approximation and the possibility of mutations ...

Link to comment
Share on other sites

I think he meant two bit [math] 2^2 [/math] as in:

G 00
A 01
T 10
C 11

If that were represented with, say, just guanine and cytosine, a codon would have to be six pairs long, and the pairs would have to be read in succession as well, doubling the length of the chain. What was

GAT
CTA

would now be

GG GC CG
CC CG GC

assuming that

G = GG
A = GC
T = CG
C = CC
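The re-encoding above can be sketched in a few lines of Python. The mapping is exactly the one assumed in this post (G = GG, A = GC, T = CG, C = CC); the names `TWO_LETTER`, `to_two_letter` and `from_two_letter` are hypothetical and only for illustration.

```python
# Mapping assumed in the post: each of the four bases becomes a G/C pair.
TWO_LETTER = {"G": "GG", "A": "GC", "T": "CG", "C": "CC"}
FOUR_LETTER = {pair: base for base, pair in TWO_LETTER.items()}


def to_two_letter(seq: str) -> str:
    """Rewrite a four-letter sequence using only G and C (doubles the length)."""
    return "".join(TWO_LETTER[base] for base in seq)


def from_two_letter(seq: str) -> str:
    """Recover the four-letter sequence by reading the G/C string two at a time."""
    if len(seq) % 2:
        raise ValueError("two-letter sequence must have even length")
    return "".join(FOUR_LETTER[seq[i:i + 2]] for i in range(0, len(seq), 2))


if __name__ == "__main__":
    for codon in ("GAT", "CTA"):
        encoded = to_two_letter(codon)
        print(codon, "->", encoded, "->", from_two_letter(encoded))
```

The round trip loses no information; the only cost in this toy model is that every three-base codon becomes six letters long.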

 

Also, in real life there are other nucleotides, and DNA is not represented the same way in all species ... I would assume this change would have a large impact on the ability to transcribe codons, since the properties needed to code for proteins are probably imparted by their chemical interaction with the nucleotides in question!

 

Oh and if Adenine didn't exist neither would ATP right?

Link to comment
Share on other sites

How about this unproven assumption?

min(2bit(information storage energy + information reading energy + information writing energy)) >= min(4bit(information storage energy + information reading energy + information writing energy))

or

max(2bit(information reading speed + information writing speed)) <= max(4bit(information reading speed + information writing speed))

where min( ) is the minimization function and max( ) is the maximization function.

Edited by alpha2cen
Link to comment
Share on other sites


There isn't actually a six-pair codon, right? I know that I might sound stupid for asking, but you're saying the 6 pair codon would exist hypothetically if what the OP said were true, am I correct?

Because, I think that codons only come in groups of three, right?

Oh and if Adenine didn't exist neither would ATP right?

 

You're right. I remember ATP from high school.

 

 

Our cells use DNA as an information storage medium. This is not a 2 bit (0, 1) code, but a 4 bit (A, T, G, C) code.

 

As another poster pointed out, it's not necessarily a 4-state number system. With binary you have two states, with zero and one representing false and true, respectively. The number of bits, however, really just represents the number of places that are used to represent the number. For instance, the numbers 8 to 15 need 4 bits, or places, to be represented in binary, but you can represent those numbers with any number of places greater than 4.

But, all the numbers in binary are the same as the numbers that can be represented in the octal, decimal, and hexadecimal number systems - you just convert them.

This is a highly developed method, which has been tested for 4.5 billion years, since the beginning of the Earth.

There's nothing really "developed" about it though. A number system with a base of two is no less developed than one that uses a base of 4, or 8, 10, or 16. They're really just the same numbers anyways. The advantage of using number systems with a higher base would be that they're cheaper to represent on paper or any visual media. However, the larger the base the more visual symbols are needed that represent the digits that are smaller than the base number.

What is the benefit of this 4 bit storage method?

 

Not a whole lot, I guess. 4-bits doesn't really store anything these days, right?

 

Khaled,

I think Alpha's point is that you could do that equally well with a binary code. Why are there 4 bases, not 2?

Mapping the quaternary code onto a binary one wouldn't be difficult.

However, if you did so then the data strings would be longer.

It may be that 4 bases constitute a compromise between shorter genes and having a more complicated system of lots of bases.

Of course, it might simply be that, because it works, it has carried on

 

It depends on the system that you're using. Binary only works for computers because it's easier to have gates which only need to be able to tell the difference between HIGH or LOW voltage.

I recall from my computer science Networking class, that one Line Coding technique uses three states to represent a digit. Meaning, that you needed 3 states to represent something like -5, 0, and +5 volts. But, this form of line coding has its uses and isn't something that you use with everything else. I guess that the same would apply to a base-4 number system.

Edited by liars_paradox
Link to comment
Share on other sites

 

There isn't actually a six-pair codon, right? I know that I might sound stupid for asking, but you're saying the 6 pair codon would exist hypothetically if what the OP said were true, am I correct?

Because, I think that codons only come in groups of three, right?

 

 

Yes, if DNA was binary RNA would be transcribed six at a time if the codon/amino acid tables were to maintain the same essential structure. The current coding structure is highly redundant however so the question remains would it be necessitated to maintain? I am not familiar with the influence of codon order and ribosome binding on bio-chemical activity which may be some of the reasoning behind the redundant nature of the code. The code could simply be imperfect and maligned even despite its evolution and progress; it wasn't like someone had engineered the process! But if this last bit was the case how could one explain ATP Synthase and the Krebs Cycle a highly sophisticated and complex system which is as of yet unparalleled by anything man has ever made .... Point? The validity of this question is questionable given the scope it is being asked in.

Edited by Xittenn
Link to comment
Share on other sites

Yes, if DNA was binary RNA would be transcribed six at a time if the codon/protein tables were to maintain the same essential structure.

Okay, I'm going to try see if I understand this sentence. If I did the following to your sentence above, would it have the intended meaning of your original sentence?

Yes, if DNA was binary, then RNA would be transcribed six at a time - that is if the codon/protein tables were to maintain the same essential structure.

 

When I first read your sentence, I did actually think that you said something like "if (DNA==binaryRNA) ....", which threw me off and took me a while to figure out your sentence. lol. I'm sure most of you don't actually experience that, but I know that I did and just had to make sure.

The current coding structure is highly redundant however so the question remains would it be necessitated to maintain?

The "current coding structure"? Which one is that, the hypothetical one or the real-life one? I think that in real-life, the codon's structure isn't redundant. If I'm not mistaken, the 3 base pairs is the smallest unit of "code" for DNA/RNA. The only redundancy that you encounter is between the valid string of codons. I think that the valid string starts with a specific codon (I forget what), and ends with another codon (I also don't remember that one either).

But, in between the finish and start codons is the redundant DNA/RNA. This technique is much like one technique that they use for data transmission in networks. They surround the good data with redundant data, before they transmit the data, in the hopes that the useless data will protect the good data.

I am not familiar with the influence of codon order and ribosome binding on bio-chemical activity which may be some of the reasoning behind the redundant nature of the code. The code could simply be imperfect and maligned even despite its evolution and progress; it wasn't like someone had engineered the process!

I know that you don't believe, so it definitely wouldn't seem like anything did, and I understand your reasons for seeing things like this. But, for me, it sort of seems like something did. Just as transmitted data in networks has redundant code surrounding the valid code, I think that the same applies to DNA. In case something happens to a cell's DNA that would cause it to mutate, the redundant code acts as a barrier around the valid string of codons. The readable DNA is broken up into smaller segments, each surrounded by redundant DNA, so as to decrease the chances that the readable DNA would be mutated.

 

 

 

 

Link to comment
Share on other sites

Yeah, if DNA were binary you could get away with using 5 binary base pairs instead of 3 base-4 base pairs (so 5 bits instead of 6). There would still be some redundancy. However, the current organization is pretty good at preventing dangerous mutations, and dropping one bit would lose half of that redundancy.

Link to comment
Share on other sites

From my experience, I think a 2 bit coding system is more energy efficient. Making two more compounds for coding is more difficult to do.

But information processing speed is also important. There are limits on how fast molecules can move in a liquid, i.e., diffusivity. So a 4 bit system might be more efficient than a 2 bit system in the cell liquid.

Link to comment
Share on other sites

Our 3-place base-4 codons result in 64 combinations (ie, 4³=64) for only 20 amino acids. So, we could say that 2-place base-5 codons (5²=25 combinations) would be more "information efficient", but perhaps evolution, in effect, abandoned efficiency for the sake of flexibility and expansion (ie, evolution).
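The counting in this post and the one above can be checked with a short Python sketch. It assumes the code must distinguish 21 meanings (20 amino acids plus one stop signal); that figure, and the name `min_codon_length`, are assumptions for illustration only.

```python
def min_codon_length(base: int, meanings: int = 21) -> int:
    """Smallest codon length n such that base**n >= meanings."""
    if base < 2:
        raise ValueError("an alphabet needs at least two symbols")
    length = 1
    while base ** length < meanings:
        length += 1
    return length


if __name__ == "__main__":
    for base in (2, 4, 5):
        n = min_codon_length(base)
        # Prints the base, the minimum codon length, and the combinations available.
        print(f"base {base}: {n} places, {base ** n} combinations")
```

Under that assumption a binary alphabet needs 5 places (32 combinations), the real base-4 alphabet needs 3 (64), and a hypothetical base-5 alphabet needs only 2 (25), which matches the figures quoted in the thread.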

 

A strong correlation exists between the first two bases in the codons and the amino acids they code for, and a weak correlation exists between the third base and the amino acids. So, maybe codons were originally 2-place base-4 (4²=16) and coded for maybe a dozen amino acids. I wonder what is the least number of amino acids required for life, and in what species does this occur?

 

I think if we worked out the evolution of amino acids and codons, we would understand the progression of what we now see. For example, glycine (GGU and GGC) is an intermediate of serine (AGU and AGC), and they show the strong base-amino correlation described above and differ only by the first base. Phenylalanine (UUU and UUC) is an intermediate for tyrosine (UAU and UAC), and they show the strong correlation and differ only by the second base. The three Stop codons (UAA, UAG and UGA) differ only by the second or third base, and the codon that differs by both the second *and* third base (UGG) is the only codon that codes for Tryptophan.

 

Someone must have done research on the evolution of codons and amino acids.

Link to comment
Share on other sites

 

I know that you don't believe, so it definitely wouldn't seem like anything did, and I understand your reasons for seeing things like this. But, for me, it sort of seems like something did. Just as transmitted data in networks has redundant code surrounding the valid code, I think that the same applies to DNA. In case something happens to a cell's DNA that would cause it to mutate, the redundant code acts as a barrier around the valid string of codons. The readable DNA is broken up into smaller segments, each surrounded by redundant DNA, so as to decrease the chances that the readable DNA would be mutated.

 

 

Yeah I was referring to the degeneracy of the amino acid/codon code where there are different codons coding for the same amino acids. I was processing how this might affect the resulting Electrostatic Potential Map formed at the binding site. This, in my thoughts, would be a consequence of what ewmon is mentioning and would have bearing on the probability of finding certain formations including malformations ....

 

The wrapping of redundant code around functional code is a pretty well documented issue and I do believe it is formally dubbed non-coding DNA!

Edited by Xittenn
Link to comment
Share on other sites

Someone must have done research on the evolution of codons and amino acids.

 

I think this is very important for understanding how life is designed.

From this we can understand the basic operating principles of all biochemical reactions.

...

As I mentioned above, one of the dominant factors in selecting 4 bit is speed.

If we recoded the same information in 2 bit, we would need molecules twice as long, i.e., mRNA twice as long.

This would make the movement from the transcription site to the translation site very slow.

The diffusivity of macromolecules falls off according to the relation [latex]M^{-1/2}[/latex], and steric interference from other molecules in the liquid is high. I suppose that this is one of the most important factors in selecting 4 bit during the evolution process.
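
As a rough worked example of that scaling, take the inverse-square-root dependence of diffusivity on mass M stated above at face value, and assume (hypothetically) that molecular mass is proportional to chain length, so that a 2 bit mRNA carrying the same information has mass 2M. The subscripts below are just labels for the two cases:

[latex]\frac{D_{2bit}}{D_{4bit}} = \left(\frac{2M}{M}\right)^{-1/2} = \frac{1}{\sqrt{2}} \approx 0.71[/latex]

So under these assumptions the doubled molecule would diffuse at about 71% of the speed of the original, i.e. roughly 30% slower. This is only an estimate from the stated relation, not a measured value.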

Edited by alpha2cen
Link to comment
Share on other sites
