Count the Zeros

6

For a given n, count the total number of zeros in the decimal representation of the positive integers less than or equal to n. For instance, there are 11 zeros in the decimal representations of the numbers less than or equal to 100, in the numbers 10, 20, 30, 40, 50, 60, 70, 80, 90, and twice in the number 100. Expect n to be as large as 10^10000.

Your task is to write a program to count zeros. Most efficient algorithm wins (Big O complexity)

EDIT: post your code and your answer for n = 9223372036854775807.

Xinbi

Posted 2014-01-09T18:24:11.207

Reputation: 79

7How are you going to measure the speed of the submissions? – Howard – 2014-01-09T18:42:47.477

Most efficient algorithm. – Xinbi – 2014-01-09T19:11:18.150

2@Xinbi most efficient isn't particularly well defined. Do you mean in Big O complexity? Benchmark? If we measure straight time, it's unfair if I have an 8 core processor and parallelise this. – Cruncher – 2014-01-09T19:26:33.973

I did mean Big O complexity. I'm not interested in the CPU power of the machine it's running on. I'm interested in the algorithm that solves the problem. – Xinbi – 2014-01-09T19:57:45.533

@Xinbi In that case, you're very likely to get ties. Do you have a tie breaking condition? – Cruncher – 2014-01-09T20:21:20.217

4I got an O(1) solution. Do I win now? xD – Cruncher – 2014-01-09T21:03:48.287

4I agree with @Cruncher's approach to game a Big-O challenge with a massive lookup table. It's foolish to not freely spend a free resource if it helps your answer. RAM, code space, and setup time, in this problem specification, are all free. A pure lookup approach may be practically impossible but a hybrid solution of lookup/computation could be the best in practical implementation. – Darren Stone – 2014-01-09T22:38:26.787

You might limit program size to get the actual computation, without too much lookup. In that case, be specific about the bound. – MvG – 2014-01-09T23:13:04.863

Relevant Euler problem... I solved it years ago, my code works for all other digits except zero, so it needs some refactoring.

– swish – 2014-01-10T00:23:21.450

At the moment, you have at least three different answers: 16577013436077779652, 16577013372932555467 and my own 16577013372932555466 which is off by one from that before. Can we establish that for e.g. n = 30270501, the correct answer is 20398096 as per this simple approach? Or does anyone have reason to doubt even that simple approach?

– MvG – 2014-01-10T01:01:57.587

@MvG I think yours is correct. – Xinbi – 2014-01-10T01:06:51.950

Answers

8

Some thoughts up front:

f[i]: number of zeros in [10^i, 2*10^i)
g[i]: number of zeros in [10*10^i, 11*10^i)
h[i]: number of zeros in [0, 10^i)

i   f       g     h
1  1*     10*     *
2  1**    10**    **
3  1***   10***   ***
4  1****  10****  ****

f[i] = 9*f[i - 1] + g[i - 1]
g[i] = f[i] + 10^i
h[i] = h[i - 1] + 9*f[i - 1]

Now the code:

def zeros(n):
    s = str(n)
    f = [0]
    g = [1]
    h = [0]
    for i in range(1, len(s) + 1):
        f.append(9*f[-1] + g[-1])
        g.append(f[-1] + 10**i)
        h.append(h[-1] + 9*f[-2])
    k = len(s) - 1
    a = h[k]                  # all numbers with fewer digits
    a += (int(s[0]) - 1)*f[k] # same number of digits but first digit less
    s = s[1:]
    while s:
        if s[0] == '0':
            a += int(s) + 1
        else:
            a += int(s[0])*f[len(s) - 1] + 10**(len(s) - 1)
        s = s[1:]
    return a

For n = 9223372036854775807 the result is 16577013372932555466. This is different from the currently accepted answer by @user14325, but I have more trust in it since starting at n=1000, @user14325 disagrees from the naive implementation suggested by @user2509848 in which I have a lot of trust due to its simplicity.

Complexity of my code is O(log n), at least if you assume uniform cost for arithmetic operations. Taking bit cost into account, it will probably be O(log(n)3log log(n)) or something like that, depending on how Python does its big integers and the exponentiation on them. If you care about the constants, then there is a lot of room for optimizations. Converting between numbers and strings all the time is hardly efficient. But as long as you only care for asymptotic behavior, I'll leave it at that.

MvG

Posted 2014-01-09T18:24:11.207

Reputation: 726

I agree with your assessment. My O(Log(n)) solution came up with the same answer as yours. (I didn't look close enough at the resulting digits) – Xinbi – 2014-01-10T01:02:41.323

9

Python O(1):

Since the challenge has been specified to be using Big O. I concluded that the best way to do it(since the upper bound is limited by a constant) is to just to create a giant look-up array.

However, this program would be too big to post here, so I made a program that generates it:

def zeroesUpToN(n):
    zeros = 0
    for i in range(n):
        s = str(i+1)
        zeros += s.count('0')
    return zeros

s = "print(["
for i in range(10^10000):
    s += str(zeroesUpToN(i+1)) + ",";
s = s[:-1] + "][int(input())])"

print(s)

So, this will ALWAYS take a long time to run(I mean the resulting program, not the generating program. The generating program would take even longer). But, how long it takes to run will not depend on n. Therefore, by definition it is an O(1) solution.

Here is an actual program that works for smaller outputs(up to 200):

print([0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,6,6,7,7,7,7,7,7,7,7,7,7,8,8,8,8,8,8,8,8,8,8,9,9,9,9,9,9,9,9,9,9,11,12,13,14,15,16,17,18,19,20,21,21,21,21,21,21,21,21,21,21,22,22,22,22,22,22,22,22,22,22,23,23,23,23,23,23,23,23,23,23,24,24,24,24,24,24,24,24,24,24,25,25,25,25,25,25,25,25,25,25,26,26,26,26,26,26,26,26,26,26,27,27,27,27,27,27,27,27,27,27,28,28,28,28,28,28,28,28,28,28,29,29,29,29,29,29,29,29,29,29,31][int(input())])

The real one is obviously much bigger.

P.S This is the problem with Big O problems.

EDIT: Oops, I forgot to give credit to @user2509848 whom I actually stole the code for generating each N.

Cruncher

Posted 2014-01-09T18:24:11.207

Reputation: 2 135

3Unfortunately, I don't think there's a computer out there that can hold an array of ten tremilliatrecendotrigintillion integers. – Xinbi – 2014-01-09T21:04:03.537

2@Xinbi I don't believe a computer being able to run it was in the requirements. This program provably works, even if I can't run it. – Cruncher – 2014-01-09T21:06:02.017

Post the algorithm that comes up with the array, and I'll call it a win. :-) – Xinbi – 2014-01-09T21:07:14.693

@Xinbi I did. However it just produces source code. The program that this produces would absolutely solve your program to the requirements, in O(1). – Cruncher – 2014-01-09T21:08:06.947

I'd contend that your solution is actually O(n+1), since the generation is part of the solution. – Xinbi – 2014-01-09T21:10:19.003

3No. The generation is there, simply because I cannot write the program here. It's vastly too big. It doesn't mean that the program doesn't conceptually exist. (FYI, O(n+1) is nonsense. It's just O(n)) – Cruncher – 2014-01-09T21:12:00.073

3@Xinbi actually. I would argue that the generating code is O(1) as well. It takes NO input, therefore there is no n. So it's O(1) by default. It will take the same amount of time every time you run it. for(i=0;i<1000000;i++){print("hi")} Is O(1) for the same reason. Unless you accept the number as an input. – Cruncher – 2014-01-09T21:19:17.447

What if there were no upper constraint for N in problem definition. This constraint is artificial. Would you generate an O(1) program which solves given task for arbitrary N ? – SergeyS – 2014-01-09T21:53:52.630

2@SergeyS If there was no upper bound constraint, then an O(1) solution would be impossible for this problem. The O(1) is entirely a product of the bound. It's a bit of a rule abuse. Oh well. – Cruncher – 2014-01-09T21:54:48.923

@SergeyS Thanks for pointing out the error in my answer. Deleting. – Dr. belisarius – 2014-01-09T21:56:47.323

4

Java:

public class ZeroCount {
    static final int max = 65498;

    public static void main(String[] args) {
        int res = 0;
        for (int i = 10; i <= max; i*=10) {
            res += max/i + max/(10*i) * (i-1);
            res -= (max % (10*i) < i) ? (i-1) - (max % i) / (i/10) : 0;
        }
        System.out.println(res);
    }
}

Should be O(log(n))

Edit:

For n=9223372036854775807 I decided to use Python instead of Java since 9223372036854775807 is Long.MAX_VALUE in Java and I didn't want to use BigInt

def zeros(n):
    res = 0
    i = 10
    while i <= n:
        res += (n // i) + n // (10 * i) * (i - 1)
        res -= (i - 1) - (n % i) if n % (10 * i) < i else 0
        i *= 10
    return res

print(zeros(9223372036854775807))

> 16577013372932555466

user14325

Posted 2014-01-09T18:24:11.207

Reputation: 349

Won't this skip 101? – Xinbi – 2014-01-09T20:27:34.863

Most efficient algorithm wins (Big O complexity). – Cruncher – 2014-01-09T20:33:37.097

It does. Why didn't I thought of it? – user14325 – 2014-01-09T20:33:51.597

I updated my code and it should be right and faster now. – user14325 – 2014-01-09T22:47:06.123

It gives the wrong answer, why it is accepted? – swish – 2014-01-10T00:48:11.260

@swish agreed. I didn't look close enough at the resulting answer (my brain couldn't handle that many digits at once. :-) – Xinbi – 2014-01-10T01:08:06.393

@swish I fixed the mistake. The basic algorithm is still the same but now it works. Furthermore I replaced the floating point divisions (/) by integer divisions (//). I think there won't be a simpler algorithm with O(log n). – user14325 – 2014-01-10T13:32:36.647

2

Python 3

n = int(input())
zeros = 0
for i in range(n):
  s = str(i+1)
  zeros += s.count('0')

print(zeros)

Fast up to 10,000,000. It gets slower when I bring it to 100,000,000.

user10766

Posted 2014-01-09T18:24:11.207

Reputation:

I borrowed your code for generating my solution btw – Cruncher – 2014-01-09T21:45:33.057

@Cruncher Big deal since I don't know Big O (whatever that is). – None – 2014-01-09T21:46:54.710

http://en.wikipedia.org/wiki/Big_O_notation if you're interested it all. It's very academic. It's a way of formally measuring the complexity of algorithms. Not always a good metric in practice though (as evidenced by my perfect solution.) O(1) is the smallest complexity. It means constant time. – Cruncher – 2014-01-09T21:49:05.773

Sure, thanks. (extra length) – None – 2014-01-09T21:49:48.907

@Cruncher Oh, so my solution is correct, yours just takes forever to run always (a joke solution)? – None – 2014-01-09T21:55:13.113

In all cases, your solution is faster. But since mine is ALWAYS slow, even for small n's, the runtime does not depend on n, so we call it a constant time(O(1)) algorithm. – Cruncher – 2014-01-09T21:57:03.183

@Cruncher Very interesting, you should make it an if-else statement for the size of the number, I won't mind being beaten - I've never won yet, and I won't win either way. – None – 2014-01-09T22:02:52.927

1Theoretically. My solution has lower complexity than yours. And by the winning condition of this question is better than yours. Even though its always slower. My answer was to point out the flaws in the question – Cruncher – 2014-01-09T22:07:18.147

Your code is a very good reference to compare other computations against, because it is so very simple. It simply has to be correct, even if it is slow. – MvG – 2014-01-10T00:51:03.250

@MvG This is why I used his solution to generate mine xD – Cruncher – 2014-01-10T13:38:58.360

2

Haskell - O(n) for 10^n input

import Data.Char
import Data.List

p n = (10^n-10)`div`9 -- number of padded zeroes below 10^n
t n = n*10^(n-1) -- number of zeroes below 10^n with padded
digsToInt = foldl1 (\x y -> x*10+y)

-- number of zeroes below n
f :: Integer -> Integer
f n = h ds (len - 1) - p len 
  where
  ds = map (fromIntegral . digitToInt) $ show n
  len = genericLength ds
h [x] _ = 0
h (x:ys) n | x == 0 = 1 + digsToInt ys + h ys (n-1)
           | otherwise = t (n+1) - (10-x) * t n + h ys (n-1)

Answer for 9223372036854775807 is 16577013372932555466.

swish

Posted 2014-01-09T18:24:11.207

Reputation: 7 484

Comparing your answer to mine, it seems like one of us is off by one. Did you perhaps include the number 0 in the count? Do you get the correct result of 11 for input 100?

– MvG – 2014-01-10T00:59:20.093

@MvG It indeed counts the first 0 as I didn't read the problem carefully, now it's fine. – swish – 2014-01-10T01:09:24.710

@swish now it's the correct value. Yay! – Xinbi – 2014-01-10T01:12:04.407

1

C# O(log n):

using System;

namespace Zeroes
{
    class program
    {
        static void Main(string[] a)
        {
            int n = int.Parse(a[0]);
            int result = 0;
            for (int i = 1; i < Math.Log(n); i++) 
                result += n/(int)Math.Pow(10, i);
            Console.Write(result);
        }
    }
}

Only works for integers, of course, but the logic's sound. I have a feeling there's a better algorithm to be found, though. Nope, there's a bug in this.

Rik

Posted 2014-01-09T18:24:11.207

Reputation: 781

1

Python O(log10 n)

def zerosA(n):
  '''finds the number of zeros from [1, 10**n]'''
  if n < 1: return 0
  return 10*zerosA(n-1) + 10**(n-1) - 9*(n-1)

def zerosB(n):
  '''finds the number of zeros from (10**n, 2*10**n]'''
  if n < 1: return 0
  return 10**n + n*10**(n-1) - 1

def zerosC(n):
  '''finds the number of zeros from (2*10**n, 3*10**n]'''
  if n < 1: return 0
  return n*10**(n-1)

def count_zeros(n):
  p = z = 0
  while n >= 10:
    n, r = divmod(n, 10)
    if r > 0:
      z += zerosB(p) + zerosC(p)*(r-1) + `n`.count('0')*r*10**p
    p += 1
  z += zerosA(p) + zerosC(p)*(n-1)
  return z

if __name__ == "__main__":
  print count_zeros(9223372036854775807)

Output:

 16577013372932555466

For the number 12345, for example, this works in the following manner:

[12341, 12345] ⇒ 0
[12301, 12340] ⇒ 13
[12001, 12300] ⇒ 159
[10001, 12000] ⇒ 1599
[1, 10000] ⇒ 2893

For a total sum of 4664. No more than log10 n iterations are needed to reach the final answer.

primo

Posted 2014-01-09T18:24:11.207

Reputation: 30 891