raw code

Optimizing integer multiplication

Robert Eisele

I was quite obsessed with the idea of optimizing integer multiplication by using bit shifts so that I noticed a beautiful pattern in the binary representation when I looked at some results of previous equations, I wrote. I wrote a small brute-force algorithm to get all possible numbers which can be optimized a bit as constant factor of a multiplication. The algorithm is implemented in JavaScript, so if you're interested in the source code, feel free to read it.

0 * n000000000000n - n
1 * n000000000001n
2 * n000000000010(n << 1)
3 * n000000000011(n << 1) + n
4 * n000000000100(n << 2)
5 * n000000000101(n << 2) + n
6 * n000000000110(n << 2) + (n << 1)
7 * n000000000111(n << 3) - n
8 * n000000001000(n << 3)
9 * n000000001001(n << 3) + n
10 * n000000001010(n << 3) + (n << 1)
12 * n000000001100(n << 3) + (n << 2)
14 * n000000001110(n << 4) - (n << 1)
15 * n000000001111(n << 4) - n
16 * n000000010000(n << 4)
17 * n000000010001(n << 4) + n
18 * n000000010010(n << 4) + (n << 1)
20 * n000000010100(n << 4) + (n << 2)
24 * n000000011000(n << 4) + (n << 3)
28 * n000000011100(n << 5) - (n << 2)
30 * n000000011110(n << 5) - (n << 1)
31 * n000000011111(n << 5) - n
32 * n000000100000(n << 5)
33 * n000000100001(n << 5) + n
34 * n000000100010(n << 5) + (n << 1)
36 * n000000100100(n << 5) + (n << 2)
40 * n000000101000(n << 5) + (n << 3)
48 * n000000110000(n << 5) + (n << 4)
56 * n000000111000(n << 6) - (n << 3)
60 * n000000111100(n << 6) - (n << 2)
62 * n000000111110(n << 6) - (n << 1)
63 * n000000111111(n << 6) - n
64 * n000001000000(n << 6)
65 * n000001000001(n << 6) + n
66 * n000001000010(n << 6) + (n << 1)
68 * n000001000100(n << 6) + (n << 2)
72 * n000001001000(n << 6) + (n << 3)
80 * n000001010000(n << 6) + (n << 4)
96 * n000001100000(n << 6) + (n << 5)
112 * n000001110000(n << 7) - (n << 4)
120 * n000001111000(n << 7) - (n << 3)
124 * n000001111100(n << 7) - (n << 2)
126 * n000001111110(n << 7) - (n << 1)
127 * n000001111111(n << 7) - n
128 * n000010000000(n << 7)
129 * n000010000001(n << 7) + n
130 * n000010000010(n << 7) + (n << 1)
132 * n000010000100(n << 7) + (n << 2)
136 * n000010001000(n << 7) + (n << 3)
144 * n000010010000(n << 7) + (n << 4)
160 * n000010100000(n << 7) + (n << 5)
192 * n000011000000(n << 7) + (n << 6)
224 * n000011100000(n << 8) - (n << 5)
240 * n000011110000(n << 8) - (n << 4)
248 * n000011111000(n << 8) - (n << 3)
252 * n000011111100(n << 8) - (n << 2)
254 * n000011111110(n << 8) - (n << 1)
255 * n000011111111(n << 8) - n
256 * n000100000000(n << 8)
257 * n000100000001(n << 8) + n
258 * n000100000010(n << 8) + (n << 1)
260 * n000100000100(n << 8) + (n << 2)
264 * n000100001000(n << 8) + (n << 3)
272 * n000100010000(n << 8) + (n << 4)
288 * n000100100000(n << 8) + (n << 5)
320 * n000101000000(n << 8) + (n << 6)
384 * n000110000000(n << 8) + (n << 7)
448 * n000111000000(n << 9) - (n << 6)
480 * n000111100000(n << 9) - (n << 5)
496 * n000111110000(n << 9) - (n << 4)
504 * n000111111000(n << 9) - (n << 3)
508 * n000111111100(n << 9) - (n << 2)
510 * n000111111110(n << 9) - (n << 1)
511 * n000111111111(n << 9) - n
512 * n001000000000(n << 9)
513 * n001000000001(n << 9) + n
514 * n001000000010(n << 9) + (n << 1)
516 * n001000000100(n << 9) + (n << 2)
520 * n001000001000(n << 9) + (n << 3)
528 * n001000010000(n << 9) + (n << 4)
544 * n001000100000(n << 9) + (n << 5)
576 * n001001000000(n << 9) + (n << 6)
640 * n001010000000(n << 9) + (n << 7)
768 * n001100000000(n << 9) + (n << 8)
896 * n001110000000(n << 10) - (n << 7)
960 * n001111000000(n << 10) - (n << 6)
992 * n001111100000(n << 10) - (n << 5)
1008 * n001111110000(n << 10) - (n << 4)
1016 * n001111111000(n << 10) - (n << 3)
1020 * n001111111100(n << 10) - (n << 2)
1022 * n001111111110(n << 10) - (n << 1)
1023 * n001111111111(n << 10) - n
1024 * n010000000000(n << 10)
1025 * n010000000001(n << 10) + n
1026 * n010000000010(n << 10) + (n << 1)
1028 * n010000000100(n << 10) + (n << 2)
1032 * n010000001000(n << 10) + (n << 3)
1040 * n010000010000(n << 10) + (n << 4)
1056 * n010000100000(n << 10) + (n << 5)
1088 * n010001000000(n << 10) + (n << 6)
1152 * n010010000000(n << 10) + (n << 7)
1280 * n010100000000(n << 10) + (n << 8)
1536 * n011000000000(n << 10) + (n << 9)
1792 * n011100000000(n << 11) - (n << 8)
1920 * n011110000000(n << 11) - (n << 7)
1984 * n011111000000(n << 11) - (n << 6)
2016 * n011111100000(n << 11) - (n << 5)
2032 * n011111110000(n << 11) - (n << 4)
2040 * n011111111000(n << 11) - (n << 3)
2044 * n011111111100(n << 11) - (n << 2)
2046 * n011111111110(n << 11) - (n << 1)
2047 * n011111111111(n << 11) - n
2048 * n100000000000(n << 11)
2049 * n100000000001(n << 11) + n
2050 * n100000000010(n << 11) + (n << 1)
2052 * n100000000100(n << 11) + (n << 2)
2056 * n100000001000(n << 11) + (n << 3)
2064 * n100000010000(n << 11) + (n << 4)
2080 * n100000100000(n << 11) + (n << 5)
2112 * n100001000000(n << 11) + (n << 6)
2176 * n100010000000(n << 11) + (n << 7)
2304 * n100100000000(n << 11) + (n << 8)
2560 * n101000000000(n << 11) + (n << 9)
3072 * n110000000000(n << 11) + (n << 10)
4096 * n1000000000000(n << 11) + (n << 11)

As you can see, on the left most column, a simple multiplication formula is given, which can also be expresed by the right most column. The basic approach was already discussed in the prior article, where I developed the following relation:

\[(2^x - 2^y)\cdot n = (n<<x)-(n<<y)\]

I want to give you an example which is not so trivial on the first sight. Please consider the equation 496 * n. The table above says, we can describe 496 by \(2^9 - 2^4\), so that we could rewrite the formula to \((n << 9) - (n << 4)\)

and after the reshaping, we'll get the same result as with a perhaps slower multiplication. Of course, the trick will only bring a performance benefit, if the architecture uses a slow multiplication implementation and fast binary/subtraction operations and if the compiler does not already do a similar thing.