
Base 256 character set, and "Base Byte" numbering system.


tar
 Share


Thread,

 

Over 1/4 century ago, I was reading about base 256, and the book stated that if you used all the English letters and numbers, and borrowed characters from other languages, you would still not have 256 identifiable characters, and nobody was about to come up with such a character set... so I took that as a challenge and came up with 256 unique and sensible characters. I wrote a BASIC program on my Commodore 64 Executive that printed them all out. I sent it in to the patent office, but they said you couldn't patent a number system, but I might be able to copyright the characters. I told various people about the system over the years, and entertained various ideas that used the characters, but never fully liked the system because I had a binary character wrapped around a center dot, and if you looked at the character sideways or upside down, it looked like another character. About 1/2 year ago, I came up with a solution. Make the center dot a right triangle, with the 90 degree corner always up and to the right. Then you could recognize an upside down, or sideways character as unique.

 

So I am presenting here the Base Byte numbering system, for my friends on Scienceforum and the visitors here. With the newly developed IPv6, the need to simply visualize and designate huge exact numbers is more prevalent today than ever. And with the thousands of expert app creators and coders around, it would not be unlikely that ways to use such a system as Base Byte could come into existence.

 

The principle is simple and straightforward. It is base 256, with places to the left of the byte point being worth 1, 256, 65536, 16777216, and so on, as you add places to the left, and places to the right of the byte point being worth 1/256, 1/65536, and so on, as you add places off to the right.
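To make the place values concrete, here is a minimal Python sketch of the positional rule described above. The function names `to_base256` and `from_base256` are made up for illustration, and exact fractions are used so that the places right of the byte point stay exact.

```python
from fractions import Fraction

def to_base256(n):
    """Return the base-256 digits of a non-negative integer, most significant first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, d = divmod(n, 256)
        digits.append(d)
    return digits[::-1]

def from_base256(whole, frac=()):
    """Evaluate digits left of the byte point (whole) and right of it (frac)."""
    value = Fraction(0)
    for d in whole:
        value = value * 256 + d              # each step left is worth 256 times more
    for i, d in enumerate(frac, start=1):
        value += Fraction(d, 256 ** i)       # 1/256, 1/65536, and so on
    return value

print(to_base256(65536))             # [1, 0, 0]
print(from_base256([1, 0], [128]))   # 513/2, that is 256.5
```

So 65536 takes three places, and a 128 in the first place right of the byte point is worth exactly one half.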

 

Then you need the characters that would occupy the places. You would need a zero character, or place holder, which is the small right triangle with the right angle pointing up and to the right. Then you would need 255 other characters, which you can get by taking a byte of binary and hanging the ones on the triangle, around in a circle, each in its place. 00000001 is a line, up and to the right from the triangle (like an hour hand on an analogue clock at 1:30).
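As a rough sketch of how a byte maps to arms, the Python below lists which arms get drawn for a value. Only the 1 arm (up and to the right) is stated here; the clockwise order for the rest is an assumption taken from later posts in this thread, which put the 8 arm at the bottom, the 16 arm to the SW, the 32 arm to the left and the 128 arm straight up. The names `COMPASS` and `arms` are hypothetical.

```python
# Arm direction for each bit, least significant bit first.
# Assumed layout: start at NE for the 1 arm and go clockwise, doubling each step.
COMPASS = ["NE", "E", "SE", "S", "SW", "W", "NW", "N"]

def arms(value):
    """Return the compass directions of the arms drawn for a byte value."""
    if not 0 <= value <= 255:
        raise ValueError("a Base Byte character holds a value from 0 to 255")
    return [COMPASS[bit] for bit in range(8) if value & (1 << bit)]

print(arms(0))            # [] (just the bare center mark, the zero character)
print(arms(1))            # ['NE']
print(arms(0b10001001))   # ['NE', 'S', 'N']
```

Only the ones are drawn, so zero is the center mark alone and 255 has all eight arms.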

 

post-15509-0-20816800-1434561614_thumb.jpg

 

Very large, exact numbers, can be designated with relatively few characters, using this system.

 

Regards, TAR

 


I have yet to come up with a way (simple, systematic and pleasing) to pronounce these symbols. Anybody have a good idea?

Edited by tar
Link to comment
Share on other sites

Must I climb to the top of the highest hill to consult with this oracle "him, that was formerly known as Prince"?

Thread,

 

If I (we) (or him, that was formerly known as Prince) can come up with a way to pronounce these 256 characters, we would have simultaneously come up with a way to pronounce a byte worth of binary characters, as well. Other than zero, one, one, zero, one, zero, zero, one, zero. Which is hard to identify with anything and takes too long to say. It would be good to come up with a way to pronounce a character, to where, when you heard it, you visualized its shape.

 

Regards. TAR

Edited by tar
Link to comment
Share on other sites

Thorham,

 

And so you probably should.

 

I was just remembering that one of my solutions for how one might pronounce BaseByte was to think in terms of hex. The four left hand binary characters in a byte have a hex representative, and so do the four on the right. Therefore a binary byte can be pronounced in Hex, as in A6. Perhaps where I left this thought, before, was in coming up with a particular way to pronounce 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F each in one syllable when designating the first digit in a hex pair, and a slightly different way to pronounce the same digit when it was the second in the pair.
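The split into a left and a right hex digit can be sketched in a few lines of Python. The helper name `hex_pair` is made up for illustration.

```python
def hex_pair(byte):
    """Split a byte into its left (x16) and right (x1) hex digits."""
    if not 0 <= byte <= 255:
        raise ValueError("expected a value from 0 to 255")
    high, low = divmod(byte, 16)      # left four bits, right four bits
    digits = "0123456789ABCDEF"
    return digits[high] + digits[low]

print(hex_pair(0b10100110))   # A6
```

The left digit stands for the four left hand bits and the right digit for the four right hand bits, so 10100110 reads as A6.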

 

Regards, TAR

Link to comment
Share on other sites

"Perhaps where I left this thought, before, was in coming up with a particular way to pronounce 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F "

 

un deux trois quatre cinq six sept huit neuf dix

http://www.frenchtutorial.com/en/learn-french/pronunciation/alphabet

You are stuffed with F which is pretty similar in English and French.

You might have to try German or something.

Link to comment
Share on other sites

John Cuthber,

 

Thank you, that is good. Maybe French for the first Hex digit and German for the second.

 

But that is just the numbers. What about the A,B,C,D,E and F?

 

Regards, TAR


I don't know French, and will have to remember how the Germans pronounce the alphabet. Maybe that will work out as well.

 

 

Good idea. Thanks John Cuthber. I will work on that.

Link to comment
Share on other sites

John Cuthber,

 

Thank you, that is good. Maybe French for the first Hex digit and German for the second.

 

But that is just the numbers. What about the A,B,C,D,E and F?

 

Regards, TAR.

 

 

What about:

 

Artph, banth, cantomine, diatherm, econtad and fartaguil?

Link to comment
Share on other sites

John Cuthber,

 

Thanks for the French alphabet pronunciation link. I skimmed past that before.

 

As for dimreeper's letter pronunciations, they are not one syllable, sensible, meaningful, and easy to remember.

 

Regards, TAR

Link to comment
Share on other sites

Over 1/4 century ago, I was reading about base 256 and the book stated that if you used all the english letters and numbers, and borrowed characters from other languages you would still not have 256 identifiable characters

 

That is a bizarre, if not profoundly ignorant, thing for someone to have said. I'm fairly sure you would get close to 256 just using European characters (Greek and Cyrillic upper and lower case, the various accented characters, etc). You might have to throw in Korean and Hebrew if you wanted to avoid punctuation. But this ignores things like Japanese and Chinese (traditional and simplified), the many Indian scripts, Thai, Vietnamese ...

 

But kudos for coming up with an imaginative and regular scheme. It reminds me of the phase encoding used in high speed modems.

 

I have yet to come up with a way (simple, systematic and pleasing) to pronounce these symbols. Anybody have a good idea?

 

You could try allocating a consonant to each position and then devise some rules for the ordering of these to produce pronounceable words (with vowels inserted as necessary to break up impossible consonant clusters).

 

I don't know if it is possible for these all to be single syllables in the rules of English phonemics. I doubt it. But maybe it is if you allow multiple choices for the consonant in each position, so that a larger number of legal consonant clusters can be created. This might mean there is no unique mapping from a symbol to its name (i.e. a symbol could have multiple names) but there is a unique symbol for each name.

 

You could start by just labelling the angles with the consonants in order (you can go round just over twice) and then seeing how various numbers come out as words. I would probably drop the redundant letters (c, x, q) and replace them with the consonants we don't have letters for (sh, th, ...)

Link to comment
Share on other sites

Strange,

 

I probably misquoted the book about nobody being likely to come up with 256 characters. The drawback though was not that you couldn't come up with 256 characters, but that most such collections would be arbitrary and it would be hard to remember what was assigned to what value.

 

I like your angle idea, and that might be workable in giving each character a characteristic "look" when its angles are pronounced, but that might be too long when the character has 6 arms.

 

The hex idea I think is a good one to work on, because the sound would actually be the same for a Base Byte character, or a Hex pair. This would keep the numerical "meaning" of the character within its pronounceable name.

 

Using the John Cuthber, French/German two syllable approach to name the Hex pair that Thorham would rather work with, satisfies a number of my criteria for a good system.

 

In fact, if you can say 255 by saying FF, that is more efficient than saying two-fif-ty-five.

 

Regards, TAR

Link to comment
Share on other sites

Using the John Cuthber, French/German two syllable approach to name the Hex pair that Thorham would rather work with, satisfies a number of my criteria for a good system.

My reason for preferring normal hexadecimal notation is that you don't have too many digits. You could, for example, use base 100 to replace base 10, and you'd have the same problem as replacing base 16 with base 256: Too many digits. There's a point where more digits doesn't improve readability, and doesn't make numbers easier to write.

 

As for naming conventions, just pronounce a hex number the way it reads: 0xBC50 -> hex bee see fifty. Don't make it more complicated than it is.

 

Link to comment
Share on other sites

I like your angle idea, and that might be workable in giving each character a characteristic "look" when its angles are pronounced, but that might be too long when the character has 6 arms.

6 arms (digits?) could be represented by "sklumpf", for example. 8 arm numbers would require two syllables (e.g. "blondsting").

 

But I think pronouncing as hex is a more sensible approach...

Link to comment
Share on other sites

However, I am liking John Cuthber's idea of having the x16 left portion of the character being in one language and the x1 right portion being in another.

 

Hex does only a partial job, because there is no differentiation between the value of the left and right symbols. The pair makes the difference and the left hand symbol should be thought of, as times 16.

 

In addition, since the beauty of the system is that there are 256 distinct looking characters, each character should have a distinct sounding name. It should not be a pair of words, each should have one word, its own word. Great though, to have the hex idea built in. Then a character can simultaneously show the binary reality of the number, and the hex nature, while remaining a "Base Byte" character.

 

So, I am thinking to rename the hex characters in a Latin way for the first syllable (left, x16 part of the character) and in an English way for the right.

 

Also abbreviating quindecim to a one syllable quind, and fifteen to fift. The ABCDEF portion of the hex count would go dec, und, dude, tred, quad, quind, and the English portion would go ten, elt, twelt, thirt, fourt, fift.

 

So, going down the rows in the last diagram, each "first part" of a character would be (decimal number=syllable) 0=Null, 16=un, 32=du, 48=tres, 64=qua, 80=quin, 96=sex, 112=sep, 128=oct, 144=nov, 160=dec, 176=und, 192=dude, 208=tred, 224=quad, 240=quid.

 

Going across the columns, each second part, or right part of a character would be pronounced (decimal number=syllable) 0=zer, 1=one, 2=two, 3=three, 4=four, 5=five, 6=six, 7=sev(en), 8=eight, 9=nine, 10=ten, 11=elt, 12=twet, 13=thirt, 14=fort, and 15=fift.

 

Thus there are 256 combinations of the above 32 syllables. Keeping the Latin first, and the English second, each combination is an exact number, e.g. quaelt is 64+11 or 75 decimal, which is also 4B hex, and the basebyte character labeled 4B in the diagram, fifth row down, 12th column over.

 

Thus by labeling the rows

 

Null, un, du, tres, qua, quin, sex, sep, oct, nov, dec, und, dude, tred, quad, quid,

 

and the columns

 

zer, one, two, three, four, five, six, sev(en), eight, nine, ten, elt, twet, thirt, fort, fift,

 

you can pronounce each of the 256 base byte characters in one word of two syllables.
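Here is a small Python sketch of the naming rule, using the row and column labels exactly as listed above, except that "sev(en)" is shortened to "sev" and names are lowercased. The names `ROWS`, `COLS`, `name_of` and `value_of` are made up for illustration.

```python
# Left (x16) syllables label the rows, right (x1) syllables label the columns.
ROWS = ["Null", "un", "du", "tres", "qua", "quin", "sex", "sep",
        "oct", "nov", "dec", "und", "dude", "tred", "quad", "quid"]
COLS = ["zer", "one", "two", "three", "four", "five", "six", "sev",
        "eight", "nine", "ten", "elt", "twet", "thirt", "fort", "fift"]

def name_of(value):
    """Return the one-word name of a Base Byte character (0 to 255)."""
    if not 0 <= value <= 255:
        raise ValueError("expected a value from 0 to 255")
    high, low = divmod(value, 16)
    return (ROWS[high] + COLS[low]).lower()

# Reverse lookup: every name maps back to exactly one value.
VALUES = {name_of(v): v for v in range(256)}

def value_of(word):
    """Return the number a name stands for."""
    return VALUES[word.lower()]

print(name_of(75))           # quaelt
print(name_of(255))          # quidfift
print(value_of("quaelt"))    # 75
```

Building the reverse table also confirms that the 256 names are all distinct, so a spoken name picks out exactly one character.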

 

Regards, TAR

Edited by tar
Link to comment
Share on other sites

I told various people about the system over the years, and entertained various ideas that used the characters, but never fully liked the system because I had a binary character wrapped around a center dot, and if you looked at the character sideways or upside down, it looked like another character. About 1/2 year ago, I came up with a solution. Make the center dot a right triangle, with the 90 degree corner always up and to the right. Then you could recognize an upside down, or sideways character as unique.

 

I think a raindrop shape or D shape would be easier to draw by hand.

Link to comment
Share on other sites

You could easily expand it to more bytes. It's merely a shape with 8 pieces that lend themselves to an intuitive ordering, and where included pieces are 1 and excluded are 0. You could express the next byte as a square along the perimeter, with straight line pieces touching the vertical and horizontal lines, and corner pieces touching the diagonals. The next byte could be a 45° rotated square, and the next could be an extra set of lines beyond the perimeter, and so on and so on.


It certainly has an advantage in only requiring you to draw the 1s and not the 0s.


Furthermore, your shape allows 180° opposite lines to be fused, making it a very concise drawing indeed.

Link to comment
Share on other sites

MonDie,

 

I am not sure about the D, as it looks the same upside down.

However, the raindrop might be good, as gravity would always have the bottom toward the 8 arm and the point straight up in the 128 place.

 

Now that I think of it, the D works too, as the middle of the straight part of the D, is always in the direction of the 32 arm.

 

How about we combine the triangle and the D and have a D tilted left 45 degrees? The hypotenuse would then be orthogonal to the 16, SW arm, and the curve of the D would be symbolic of the right angle pointing to the "first" place. Then the way to draw the nullzer symbol would be to draw a small line, like the NW to SE part of an X and then a 45 degree tilted, backward C to close the figure and connect the ends of the first line.

 

Actually MonDie, I am liking the D a lot. The D gives the drawer of the rest of the character a reference line from which to draw the other lines. The middle of the straight line of the D would be the center of the character. I am thinking that maybe not tilting it 45 degrees, but keeping it in D orientation might be ideal. Well actually, after drawing the 255 characters around a D in the 8 different orientations, the one that makes the most sense is a backward D, because the arms to the right of the line (or the extension of the line downward) are the 1, 2, 4, 8 arms, and the arms with "something extra", the lines that cross the arc or the one that goes straight up, are the x16 arms, or 16, 32, 64 and 128.

 

So I think we might go with the MonDie backward D.

 

Nice.

 

Regards, TAR


Mondie,

 

Expanding on your ideas.

 

The inside, first byte could be designated by the backward D semi circle, and then outside this you could draw concentric circles, the area between the first and second circle would be your second byte to the left, in the normal positional numbering sense. Then another larger circle would give you an area between the second and third circles that would be your third byte to the left and the area just inside your fourth circle would be your fourth byte to the left. You couldn't show fractions though, unless you had a second circle arrangement next to, or below the first arrangement that showed a "divided by" quantity.

 

Using this scheme, with 34 concentric circles, each with eight points that could show every number between 0 and 255 decimal, you could show the exact number of atoms in the observable universe (if you knew that number) with a large diagram of 34 concentric circles, and the appropriate marks at 45, 90, 135, 180, 225, 270, 315, 360 degrees in each concentric circular space.

 

Each circle would be worth 256 times the value the "character" had in the circle inside. So the 34th circle is the 256 to the 33rd power place, which is about 3.0e+79. 255 times that (largest base byte character in the 34th position) is about 7.6e+81. So, within the 34th area (including the 33 areas below) you should be able to exactly show any amount up to 256 to the 34th power minus 1, about 7.6e+81, which covers all but the very top of the 10^79 to 10^82 estimates for the amount of atoms in the observable universe. Which means each atom in the observable universe could have its own 68 syllable name.
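The arithmetic is easy to check with Python's exact integers. This is only a sketch: it counts the innermost circle as the x1 place, so the 34th circle is the 256 to the 33rd power place, and the helper name `bytes_needed` is made up for illustration.

```python
def bytes_needed(n):
    """Number of Base Byte characters (circles) needed to show n exactly."""
    return max(1, (n.bit_length() + 7) // 8)

place_34 = 256 ** 33          # value of a single 1 arm in the 34th circle
largest_34 = 256 ** 34 - 1    # all eight arms drawn in all 34 circles

print(f"{float(place_34):.2e}")     # 2.96e+79
print(f"{float(largest_34):.2e}")   # 7.59e+81
print(bytes_needed(10 ** 80))       # 34
print(bytes_needed(10 ** 82))       # 35
```

So 34 circles cover any count up to about 7.6e+81, and a count as large as 10^82 would need a 35th circle.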

 

Regards, TAR


So pronunciationwise we have to come up with a way to say that a character is in the x256th place or the x65536 place or the times 16777216 place, etc..

 

Some derivation of the names for the 256 characters should do the trick though, since 256 places should be quite enough. 256 to the 256th power is 3.23e+616.
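That figure is too large for an ordinary floating point number, but it can be checked from the decimal digits of the exact integer. A quick Python sketch (the variable names are just for illustration, and the mantissa is truncated, not rounded):

```python
n = 256 ** 256                # exact integer, the same as 2 ** 2048
digits = str(n)

print(len(digits))                                         # 617
print(f"{digits[0]}.{digits[1:3]}e+{len(digits) - 1}")     # 3.23e+616
```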

 

But in counting, what comes after quidfift, when you have one in the second place?


MonDie,

 

Actually your continuation idea is nice in another way as well. The first position in the x256 position is twice the value of the last position (128) in the x1 position. That means that each 1/8 position on a tightly wrapped spiral is worth double the value of the previous position. We all know how powerful doubling is, so big exact numbers could be designated on such a MonDie spiral by simple marks on the spiral in the appropriate 1/8th spots. Any length binary number you wanted, could be shown in this fashion.
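A rough Python sketch of where the marks would fall on such a spiral. It assumes the 1 place sits at 45 degrees clockwise from straight up and each following 1/8th spot is worth double, so 128 lands at 360 degrees and 256 starts the next turn. The name `spiral_marks` is made up for illustration.

```python
def spiral_marks(n):
    """Return (turn, degrees) for every 1 bit of n, innermost turn first.

    Degrees are measured clockwise from straight up, in steps of 45.
    """
    marks = []
    bit = 0
    while n >> bit:
        if (n >> bit) & 1:
            turn, slot = divmod(bit, 8)          # 8 spots per turn of the spiral
            marks.append((turn, 45 * (slot + 1)))
        bit += 1
    return marks

print(spiral_marks(128))   # [(0, 360)]
print(spiral_marks(256))   # [(1, 45)]
print(spiral_marks(257))   # [(0, 45), (1, 45)]
```

Since Python integers have no fixed size, the same function handles a binary number of any length.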

 

Regards, TAR


Thread,

 

The MonDie Spiral

post-15509-0-13403900-1434819888_thumb.jpg

 

Regards, TAR


equal to 11111111111111111111111111111111111111111111111111111111111111111111111111111111(binary) but somehow easier to see what that means

Edited by tar
Link to comment
Share on other sites

Thorham,

 

Or 10 Base Byte characters.
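A quick check in Python, assuming the run of ones above is 80 bits long (which is what ten characters implies). The variable names are just for illustration.

```python
n = int("1" * 80, 2)                   # eighty 1 bits
chars = list(n.to_bytes(10, "big"))    # ten base-256 places, most significant first

print(chars)                  # [255, 255, 255, 255, 255, 255, 255, 255, 255, 255]
print(n == 256 ** 10 - 1)     # True
```

So the number is ten copies of the all-arms character, one less than 256 to the 10th power.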

 

I understand the system is already set, but Base Byte is a way to show a pair of hex characters in one character, and it shows the binary nature at the same time. It is everything that hex is, plus everything that binary is, and it is base 256, all rolled into one.

 

Base 256 is sometimes shown in hex pairs, sometimes shown in decimal. It can be shown in BaseByte characters as well. Certainly there is no pressure for anybody to use it. Only if it makes doing or visualizing something easier, would it be used.

 

I understand the current systems handle the situation fine. It is not broken. I just am offering another neat and simple way to do it where the doubling and times 256 has more visual power.

 

You are probably right though, nobody will be interested...but maybe if I write a program that will display the characters and work with the characters there will be some new and interesting ways to display numbers and charts and graphs, that have more visual meaning, than hex numbers. In your argument's favor however are constructed languages like Esperanto. Nicely put together, good ideas, but not organic, not automatic, not built from the ground up, in concert with all the accessories.

 

So I accept your opinion as probably the right way to look at it, but I will offer the system anyway. Just in case it makes sense to anyone, or is at least neat to ponder.

 

Here is the revised notation, with the backward D and the Latin/English syllables.

post-15509-0-92177500-1434865374_thumb.jpg

Regards, TAR

Edited by tar
Link to comment
Share on other sites
