Jump to content

Compression slow but good

Featured Replies

And here it is, the long waited algorithm thanks to Staphen that provided the algorithm in Crunchyroll. You are free to test it out on bigger files, currently it only runs for Blank.gif without going too slow. You are free to criticize only if you found a way to make this faster. Starting with iteratedbitcount. Thanks in advance!

 

using System;
using System.IO;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
using System.Threading.Tasks;

namespace HelloWorld
{
class Hello
{
public static byte[] bytes = System.IO.File.ReadAllBytes(@"E:\data\Blank.gif");

public static byte[] bytes_new;

static void Main()
{
BigInteger number_compress = new BigInteger(bytes);

Console.WriteLine(number_compress);
Console.WriteLine();

int[] one = new int[50];
int[] zero = new int[50];
int[] position = new int[50];

for (int i = 0; i < 50; i++)
{
//Console.WriteLine(i);

one = IteratedBitcount(number_compress);
zero = IteratedZerocount(number_compress);
position = one + zero;

BigInteger fac = Factorial(one);
BigInteger n = FindN(number_compress, one, position);

number_compress = n;


}

Console.WriteLine(number_compress);
Console.WriteLine();

BigInteger number_decompress = 0;

for (int i = 50-1; i >= 0; i--)
{
//Console.WriteLine(i);

number_decompress = FindNumber(number_compress, one, position);

number_compress = number_decompress;
}

Console.WriteLine(number_decompress);
Console.WriteLine();

bytes_new = number_decompress.ToByteArray();

File.WriteAllBytes(@"E:\data\New_Blank.gif", bytes_new);

}

static BigInteger FindNumber(BigInteger n, int one, int position)
{
BigInteger number=0;
int k = one;

for (int j = position-1; j >=0 ; j--)
{
BigInteger mult = j;
BigInteger num = 0;

for (BigInteger i = 1; i < (BigInteger)k; i++)
{
mult *= (j - i);
}


num = (mult / Factorial(k));

if (num <= n)
{
number = number + BigInteger.Pow(2, j);
k--;
n = n - num;
}
if (k == 0)
break;
}
return number;
}

static BigInteger FindN(BigInteger number, int one, int position)
{
BigInteger findMSB = number;
BigInteger test = number;
var numBits = (int)Math.Ceiling(BigInteger.Log(2));
BigInteger MSB=1;
while (findMSB > 0)
{
MSB <<= 1;
findMSB >>= 1;
}
MSB >>= 1;

BigInteger n = 0;
int k = one;
int pos = position;

while (MSB != 0)
{
if ((test & MSB) > 0)
{
int a = pos-1;
BigInteger mult = a;

for (BigInteger i = 1; i < (BigInteger)k; i++)
{
mult *= (a - i);
}


n += (mult/Factorial(k));
k--;

}
MSB >>= 1;
pos--;

}
return n;
}

static int IteratedBitcount(BigInteger n)
{
BigInteger test = n;
int count = 0;

while (test != 0)
{
if ((test & 1) == 1)
{
count++;
}
test >>= 1;
}
return count;
}

static int IteratedZerocount(BigInteger n)
{
BigInteger test = n;
int count = 0;

while (test != 0)
{
if ((test & 1)==0)
{
count++;
}
test >>= 1;
}
return count;
}

static BigInteger Factorial(int arg)
{
BigInteger value = 1;
for (int index = 2; index <= arg; index++)
{
//Console.WriteLine(index);
value *= index;
}
return value;
}


}
}

You are free to test it out on bigger files, currently it only runs for Blank.gif without going too slow.

 

 

What language is this written in?

 

How large is "blank.gif"?

 

Have you checked that the output of compressing and then decompressing the file is identical to the input?

 

Have you tested it with other files of a similar size (e.g. a sequence of random bytes)?

 

 

 

You are free to criticize only if you found a way to make this faster.

 

Not if we find it doesn't work? :)

Again, and again, the same..

You have no idea whether it truly works, as you have no decompression function...

Writing off data to file, disallows you checking whether every single byte is equal to original (after compressing, then decompressing data).

 

Result from this line of code is nowhere used, as fac is nowhere else referenced:

BigInteger fac = Factorial(one);

 

What language is this written in?

C# .NET Framework

Edited by Sensei

  • Author

 

 

What language is this written in?

 

How large is "blank.gif"?

 

Have you checked that the output of compressing and then decompressing the file is identical to the input?

 

Have you tested it with other files of a similar size (e.g. a sequence of random bytes)?

 

 

Not if we find it doesn't work? :)

Yes the output is identical to the input Strange :D. Gonna try optimizing this weekend, let me know if you can help lol

Again, and again, the same..

You have no idea whether it truly works, as you have no decompression function...

Writing off data to file, disallows you checking whether every single byte is equal to original (after compressing, then decompressing data).

 

Result from this line of code is nowhere used, as fac is nowhere else referenced:

BigInteger fac = Factorial(one);

 

C# .NET Framework

Cool, I'll take it out, thanks!

  • Author

This step is taking the most amount of my time. My best bet is a multiplication table stored in database, anyone else got any idea? The for loop here is taking too long, no for loop

for (int i = 1; i < k; i++)
{
    mult *= (a - (BigInteger)i);
}

As far as I can see you are calculating mult = a*(a-1)*...*(a-k+1) in that loop. Special cases like k>a+1 and k<2 aside (consider if they can appear in your code), this is the same as mult=a!/k!. You already have a factorial function in your code: Use it, then optimize it with a lookup-table if needed. Note: You are dividing by another k! later on. You only need to calculate it once, of course.

I would still like to see some evidence that the algorithm works, and how well it works. What is the size of the inout file? What is the size of the compressed file? How does that change for (a) random data; (b) ordered data; (c ) all zeroes; etc. Is the result of uncompressing the data identical in all cases?

  • Author

Let's see, I got a question first, I converted bytes to BigInteger like this

BigInteger number_compress = new BigInteger(bytes.Concat(new byte[] { 0 }).ToArray());

Now the byte array is [378] byte array, as you know byte array is 8 bits each, so that's a total of 3024 bits. The thing is after conversion to number_compress, this BigInteger has 911 base 10 digits, so I'm not sure if BigInteger actually made the value bigger since I can't convert BigInteger to bits in c#. Below is the number.

10511498515079005786001923156952989929760765099887642049490645075407147670074787123453081119964496075798572683908765969937737368112628851278719151548358246532408860460103033076417881911156352613863717173123230737730516444716700786319847260734876596336573683014182274170198577808390323212579797450822781696155303651128091350419064337358263354652969443506081373456328170981909504950042936463577106765595222710454155651057937886003720893239530662179029448821080213785811445808269660735189910921681751787499664128734166678222596180132877088540155530232472178185615057339811168768357276447018187314658464330483545764600882883706505887000156889348127462631364561412835589414227905626001556124920721381180916353004195357239703395912173839712380569892467769276028873585749529029704900277393477308046170568306899216653561856937648186383356499772886802103617644079045242679876437789706198682378675257965584060895470702729
  • Author

I used a StringBuilder to append the bits, the compression rate is much better. But I still need to deal with the time = =


As far as I can see you are calculating mult = a*(a-1)*...*(a-k+1) in that loop. Special cases like k>a+1 and k<2 aside (consider if they can appear in your code), this is the same as mult=a!/k!. You already have a factorial function in your code: Use it, then optimize it with a lookup-table if needed. Note: You are dividing by another k! later on. You only need to calculate it once, of course.

Hmm, the thing is it is (a - i), and it could be something like mult = 15*14*13/5!, I'm interested in using a look up table, possibly a database? But I am a newbie at that, how do you think I should start? Other than the look up table, is there other possible methods?


Btw I'm applying the recursive "a" computation I asked here.


As an example of the multiplication

 

BigInteger mult = 274341;

 

for (int j = 1; j < 157594; j++)
{
mult *= (BigInteger)(274341 - j);
}

 

P.S. Right I see it, mult=a!/k!, looks like I need a look up table

Edited by fredreload

Archived

This topic is now archived and is closed to further replies.

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.

Configure browser push notifications

Chrome (Android)
  1. Tap the lock icon next to the address bar.
  2. Tap Permissions → Notifications.
  3. Adjust your preference.
Chrome (Desktop)
  1. Click the padlock icon in the address bar.
  2. Select Site settings.
  3. Find Notifications and adjust your preference.