Codesignal Solution: totalBinSum

Original Problem

Given a number num in its binary representation, your task is to sum all the numbers in base 10 formed by the prefixes of num. More formally, sum up int(num[0, i]) for all possible i.

Since the answer can be very big, return it modulo 10⁹ + 7.

Example

For bin = "1001", the output should be
totalBinSum(bin) = 16.

Here are all the prefixes:

1₂ = 1₁₀;
10₂ = 2₁₀;
100₂ = 4₁₀;
1001₂ = 9₁₀.

Thus, the answer is 1 + 2 + 4 + 9 = 16.

Input/Output

[time limit] 4000ms (rb)

[input] string num
A binary representation of some number, i.e. a string consisting of the characters '0' and '1'.
Guaranteed constraints:
2 ≤ num.length ≤ 5 · 10⁴.
[output] integer
The sum of all the prefixes of num in base 10.

Solution

These kind of problems almost always hide a nice mathematical pattern. Generating a few elements of the sequence and looking them up in oeis.org reveils nice formulas quite often. For this problem there are two ways to describe them. First in a recursive way, \(a(n) = n + a(\lfloor n/2\rfloor)\) and a closed form solution \(2n-\text{popcount}(n)\). This is pretty amazing! But how can this work at all? Let's proof the identity. Let \(\ell:= \lfloor 1+\log_2(n)\rfloor\) and \(n_k\) be the \(k\)th digit of \(n\)

\[\begin{array}{rl} a(n) &= 2n-\text{popcount}(n) \\ &=2\sum\limits_{i=0}^\ell 2^in_i - \sum\limits_{i=0}^\ell n_i\\ &= \sum\limits_{i=0}^\ell 2^{i+1}n_i - \sum\limits_{i=0}^\ell n_i\\ &= \sum\limits_{i=0}^\ell (2^{i+1}n_i-n_i)\\ &= \sum\limits_{i=0}^\ell (2^{i+1}-1)n_i\\ &\overset{*}{=} \sum\limits_{i=0}^\ell \left\lfloor\frac{n}{2^i}\right\rfloor\\ &=n + a(\lfloor n/2\rfloor) \end{array}\]

\(*\) This step might look weird. But things might make more sense with the following example. Let's say we want to sum the prefixes of 1101. That means the last row sums \(1101_2 + 110_2 + 11_2 + 1_2 = 23\). When looking at the previous row, the values become pretty similar. The only difference is that masks are created that get summed conditionally: \(1_2\cdot 1 + 11_2\cdot 0 + 111_2\cdot 1 + 1111_2\cdot 1 = 23\).

Implementing the formula in ruby, which has has arbitrary large integer precision is then quite trivial:

def totalBinSum n
    2 * n.to_i(2) % 1000000007 - n.count('1')
end

Solution

Related Stuff

Codesignal Solution: Near Square

Project Euler Solution #32: Pandigital products

Project Euler Solution #13: Large sum