Jump to content

Q:Fast Multiplication Algorithms


Tiberius

Recommended Posts

I was looking up the fastest multiplication algorithms on Wikipedia. Karatsuba showed up [O(N1.585)], Strassen showed up[O(N log N log log N)], all with respectable big O scores, but a friend recently told me of an algorithm that seems to work in linear time (wrt to the number of digits in the numbers).

 

I can't seem to find it on Google, so I'll just describe it here with an example:

 

Suppose you have two numbers 141 and 45 and we want to multiply them. We keep dividing the first number by 2 until it becomes 1 (integer division), and keep multiplying the 2nd number the same number of times.

 

141 45

70 90

35 180

17 360

8 720

4 1440

2 2880

1 5760

 

Now, we look at the LHS and add up all the terms that correspond to ODD terms on the left: 45+180+360+5760=6345 which is 141*45

 

Now as far as I could tell, there is a linear relationship between the number of iterations of division before you get down to 1 and the number of digits. (You end up adding 3 or 4 iterations per digit increase because multiplying by 10 is multiplying by 1010 in binary).

 

The relationship is something like K=3.322N +0.5, where N is the number of digits and K is the number of iterations. ( I wasn't too sure of a mathematical approach to the relation between K and N, so I plotted K vs N for N=1 to 600 and found that a linear relationship existed)

 

So in the worst case, for an N digit number, one has to do K multiplications by 2 and K divisions by 2. During the process, one has to do at max (if all LHS terms are odd) K additions as well. [Multiplying/Dividing a number by two is a constant time operation right ? Just a simple bit-shift ?] So shouldn't the overall run-time be O(2K+K)

 

=O(9.966N+1.5) = O(N) ?

 

Am I missing something here ? Can I consider division by 2 as a constant time operation as well ? I assumed it was because bit-shifting seems like something that would be classified as constant time, just as adding zero's in Karatsuba's algorithm was considered a constant time operation. Is there something wrong with the relationship between K and N ? Will it deviate for higher values on N ? (I couldn't really go beyond 600, nor could I derive a formal mathematical relationship).

Link to comment
Share on other sites

I was looking up the fastest multiplication algorithms on Wikipedia. Karatsuba showed up [O(N1.585)], Strassen showed up[O(N log N log log N)], all with respectable big O scores, but a friend recently told me of an algorithm that seems to work in linear time (wrt to the number of digits in the numbers).

 

I can't seem to find it on Google, so I'll just describe it here with an example:

 

Suppose you have two numbers 141 and 45 and we want to multiply them. We keep dividing the first number by 2 until it becomes 1 (integer division), and keep multiplying the 2nd number the same number of times.

 

141 45

70 90

35 180

17 360

8 720

4 1440

2 2880

1 5760

 

Now, we look at the LHS and add up all the terms that correspond to ODD terms on the left: 45+180+360+5760=6345 which is 141*45

 

Now as far as I could tell, there is a linear relationship between the number of iterations of division before you get down to 1 and the number of digits. (You end up adding 3 or 4 iterations per digit increase because multiplying by 10 is multiplying by 1010 in binary).

 

The relationship is something like K=3.322N +0.5, where N is the number of digits and K is the number of iterations. ( I wasn't too sure of a mathematical approach to the relation between K and N, so I plotted K vs N for N=1 to 600 and found that a linear relationship existed)

 

So in the worst case, for an N digit number, one has to do K multiplications by 2 and K divisions by 2. During the process, one has to do at max (if all LHS terms are odd) K additions as well. [Multiplying/Dividing a number by two is a constant time operation right ? Just a simple bit-shift ?] So shouldn't the overall run-time be O(2K+K)

 

=O(9.966N+1.5) = O(N) ?

 

Am I missing something here ? Can I consider division by 2 as a constant time operation as well ? I assumed it was because bit-shifting seems like something that would be classified as constant time, just as adding zero's in Karatsuba's algorithm was considered a constant time operation. Is there something wrong with the relationship between K and N ? Will it deviate for higher values on N ? (I couldn't really go beyond 600, nor could I derive a formal mathematical relationship).

This is known as peasant multiplication. Both schoolbook multiplication and peasant multiplication are examples of a more general algorithm

called Shift and add.

 

Your reasoning is okay except for your definition of operation.

Addition is order N as well (ie. the amount of time increases linearly with the length of the number), so you have to do K bitshifts and then one order N addition for each bitshift that passes your oddness test. N*N = O(N^2).

You can muddle with the time constants a bit by using different bases and memorizing a multiplication table for smaller numbers, larger is generally better.

This is basically the principle on which karatsuba/divide and conquer is based on iirc, but the 'multiplication table' is the same multiplication algorithm on two smaller numbers. Because this process of picking a smaller base is recursive and starts with a base dependant on your original number, it can have lower asymptotic complexity.

Link to comment
Share on other sites

Seeing the term "shift and add" reminded me of a mechanical office machine common in the 1950's called a comptometer.

This machine had several columns of keys, each column numbered 1 to 9. They used a form of shift and add (perhaps in this case add and shift). In the example attached an operator would start from the right and press keys 8 and 7 simultaneously six times. They would then move their fingers one row to the left and press keys 8 and 7 five times. The result (87 multiplied by 56) would appear in the display. What the machine did in this example was add 87*6 to 870*5.

The operators of these machines could usually produce results faster than the early electronic devices and remained in use for many years.

 

 

http://www.vintageca...omptometer.html

post-68560-0-39672000-1335360016_thumb.jpg

Edited by Joatmon
Link to comment
Share on other sites

This is known as peasant multiplication. Both schoolbook multiplication and peasant multiplication are examples of a more general algorithm

called Shift and add.

 

Your reasoning is okay except for your definition of operation.

Addition is order N as well (ie. the amount of time increases linearly with the length of the number), so you have to do K bitshifts and then one order N addition for each bitshift that passes your oddness test. N*N = O(N^2).

You can muddle with the time constants a bit by using different bases and memorizing a multiplication table for smaller numbers, larger is generally better.

This is basically the principle on which karatsuba/divide and conquer is based on iirc, but the 'multiplication table' is the same multiplication algorithm on two smaller numbers. Because this process of picking a smaller base is recursive and starts with a base dependant on your original number, it can have lower asymptotic complexity.

 

Ah, yes, that would make sense. For some reason I believed that addition was a constant time operation. How silly of me.

So K=Nlog210, and a bunch of O(N) additions for each pass in the worst case--- O(N2)

 

Got it, thanks.

 

 

 

Link to comment
Share on other sites

i'm not sure its as bad as O(n^2) for peasent multiplication.

for example, lets take 32767 * 8191; the worst case.

with a length of 14 and 12 bits, it should take roughly 168 operations.

so let's count them and see.

a = 0b11111111111111
b = 0b111111111111
r1 = 0
r2 = 0
count = 0
while a > 0:
  if a & 1 == 1:
     r2 = b
     while r2 != 0:
         r1, r2 = r1^r2, (r1&r2)<<1  #add function
         count += 3
  a >>= 1
  b <<= 1
  count += 2
print(count)

i get 142 for an expected 168 operations. but more importantly, the above code runs in about .02 seconds.

now what happens if i double the size if the input?

i get roughly 290 operations, for an expected 28*24 = 672 number of operations.

but agian, more importantly the code still runs in about .02 seconds.

Link to comment
Share on other sites

i'm not sure its as bad as O(n^2) for peasent multiplication.

for example, lets take 32767 * 8191; the worst case.

with a length of 14 and 12 bits, it should take roughly 168 operations.

so let's count them and see.

a = 0b11111111111111
b = 0b111111111111
r1 = 0
r2 = 0
count = 0
while a > 0:
  if a & 1 == 1:
     r2 = b
     while r2 != 0:
         r1, r2 = r1^r2, (r1&r2)<<1  #add function
         count += 3
  a >>= 1
  b <<= 1
  count += 2
print(count)

i get 142 for an expected 168 operations. but more importantly, the above code runs in about .02 seconds.

now what happens if i double the size if the input?

i get roughly 290 operations, for an expected 28*24 = 672 number of operations.

but agian, more importantly the code still runs in about .02 seconds.

 

I don't think this is a fair way to test the complexity of an algorithm. O(n2) is an upper bound to the running time. In general, the actual T(n) terms could have fractional constants, or could be a very complex polynomial in the number of digits as well.

Even in this case, The N2log210 is rounding up a lot of things [for example, as you move down the table, the complexity of addition keeps reducing by a fraction, but it is still O(n)].

 

I don't think doubling the input length is sufficient to draw conclusions. Look at the graphs of 0.1N2 and NlogN:

Untitled.png

Clearly, the hidden constants in the big O notation are relevant when you have small input sizes of 14 and 12, and you are simply doubling them.

I guess if you raise the input size to a sufficiently large power, then it might be a good indication of the complexity. (say N^5 ?)

 

I'm just learning this stuff, so I could be wrong.

Edited by Tiberius
Link to comment
Share on other sites

hmm, fair enough. let's double it 3 more times. with 56 and 48 bits, 568 operations, and i still get .02 seconds with 112 and 96 bits, 1178 operations, and i still get .02 seconds. finally, with 224 and 192 bits, 2362 operations, and i still get .02 seconds. (from www.ideone.com ) <BR>edit: i tried timing it with my own cpu rather than a website, even at 224, 192 bits, it was taking 0 seconds!<BR>now that is a fast multiplication algorithm!<BR>

Edited by phillip1882
Link to comment
Share on other sites

hmm, fair enough. let's double it 3 more times. with 56 and 48 bits, 568 operations, and i still get .02 seconds with 112 and 96 bits, 1178 operations, and i still get .02 seconds. finally, with 224 and 192 bits, 2362 operations, and i still get .02 seconds. (from www.ideone.com ) <BR>edit: i tried timing it with my own cpu rather than a website, even at 224, 192 bits, it was taking 0 seconds!<BR>now that is a fast multiplication algorithm!<BR>

 

Modern CPUs will tend to do about as much as you can fit in cache as quickly as you put it there for simple n or n^2 algorithms. Also if you don't exceed word length the computer probably won't know the difference.

Also -- as Tiberius said -- there will be larger numbers on the constant and first order terms that change the results significantly for small n.

Try it again with a few thousand bits and then double that (may have to use assembly unless you want to go shuffling things between different words -- not sure -- my C knowledge is rather weak).

Link to comment
Share on other sites

I was going to make a similar point. For small operations, there is computational overhead from just getting the numbers from memory and printing to the terminal, so it depends how you're counting system time. If the operation is very quick, probably you can't measure with enough precision to tell the difference. Definitely sweep through a range of sizes and see what happens.

Link to comment
Share on other sites

at roughly multiplying two 2000 bit integers, i finally got an elapsed time recording of 0.016 seconds.

doubling that, i got an elapsed time of 0.032 seconds.

doubling it once more, i get roughly .125 seconds.

doubling yet again, i get roughly .465 seconds.

so it does indeed seem to be exhibiting n^2 behavior in terms of time, even though number of operations is linearly increasing.

i guess this isn't too surprising, since I'm doubling the size of two inputs, which is more like multiplying by 4 than 2.

still, it's an interesting algorithmic approach.

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.