Reverse Engineer Polling Statistics

22

2

Introduction

Given a set of percentages of choices in a poll, calculate the minimum number of voters there must be in the poll to generate those statistics.

Example: What is your favorite pet?

  • Dog: 44.4%
  • Cat: 44.4%
  • Mouse: 11.1%

Output: 9 (minimum possible # of voters)

Specs

Here are the requirements for your program/function:

  • You are given an array of percentage values as input (on stdin, as function argument, etc.)
  • Each percentage value is a number rounded to one decimal place (e.g., 44.4 44.4 11.1).
  • Calculate the minimum possible number of voters in the poll whose results would yield those exact percentages when rounded to one decimal place (on stdout, or function return value).
  • Bonus: -15 characters if you can solve in a "non-trivial" way (i.e., doesn't involve iterating through every possible # of voters until you find the first one that works)

Example

>./pollreverse 44.4 44.4 11.1
9
>./pollreverse 26.7 53.3 20.0
15
>./pollreverse 48.4 13.7 21.6 6.5 9.8
153
>./pollreverse 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 99.6
2000
>./pollreverse 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 98.7
667
>./pollreverse 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 98.7
2000
>./pollreverse 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 97.8
401

Scoring

This is code-golf, so shortest possible characters wins. Any bonuses are further subtracted from the total character count.

mellamokb

Posted 2012-04-24T04:04:33.953

Reputation: 5 544

2I think this could do with a few more awkward cases for testing. 26.7 53.3 20.0 (4 8 3 of 15), 48.4 13.7 21.6 6.5 9.8 (74 21 33 10 15 of 153) etc. – Gareth – 2012-04-24T09:44:39.910

@Gareth: Good thought. Updated with your test cases. – mellamokb – 2012-04-24T12:06:19.043

shouldn't sum of all votes be 100%? it's not in last four testcases – Ali1S232 – 2012-04-25T18:56:40.273

@Gajet: No it does not always equal 100%. Every time there is a rounding down, you lose up to 0.5% from the total, and every time there is a rounding up, you add up to 0.5% to the total. The last four test cases were purposely constructed to optimally exploit this phenomenon. In the first test case that results in 2000, each of the first 9 entries represents 1 vote (and are all rounded up 0.5%), whereas the last one represents 1991 votes (and is rounded down ~0.5%). If you calculate those percentages manually and round to 1 decimal place, you will see they are all correct. – mellamokb – 2012-04-25T19:14:10.437

I am struggling with the non-trivial answer in VBA (trying since so far, there have been none), but I'm working on it! – Gaffi – 2012-04-26T03:31:58.350

I rescind. [tag:VBA] doesn't like to mod with non-integers, and the code to correct is getting too long. I can create a working, non-trivial function for SIMPLE percentages, at least. :-) – Gaffi – 2012-04-26T17:39:15.303

Answers

2

APL (Dyalog Classic), 48 43 bytes

-5 bytes by Adám

+/0(⊢+{(⌈/⍷⊢)⍺-⍵÷+/⍵})⍣{z≡⍎3⍕⍺÷+/⍺}⍨z←.01×⎕

Full program taking input from stdin.

Try it online! Link is to dfn version.

Ungolfed

normalize ← ⊢ ÷ +/
find_max ← {⍵⍷⍨⌈/⍵}
round ← {⍎3⍕⍵}
increase ← {find_max ⍺ - normalize ⍵}
vote_totals ← {z←⍺ ⋄ ⍺ (⊢+increase)⍣{z ≡ round normalize ⍺} ⍵}
h ← {+/ (.01×⍵) vote_totals 0}

Try it online!

  • normalize divides (÷) all elements of its right argument () by its sum (+/).
  • round(y) rounds y to 3 decimal places by formatting () and then evaluating () each element of y.
  • find_max(y) returns an array with 1 where max(y) is found and 0 elsewhere.
  • increase(x,y) takes x (the goal percentages) and y (the array of current vote totals) and calculates where to add 1 in y to bring the percentages closer to x.
  • vote_totals(x,y) takes x (the goal percentages) and y (the starting vote totals) and executes f repeatedly, adding votes until the percentages round to x.
    • The syntax f ⍣ g means to execute f repeatedly until g(y,f(y)) is true. In this case we ignore f(y).
  • h(x) sets y to 0 (equivalent to an array of 0s due to vectorization), executes g, and sums the final vote totals.

lirtosiast

Posted 2012-04-24T04:04:33.953

Reputation: 20 331

7

Python, 154

def p(x):
 n=[1]*len(x);d=2;r=lambda z:round(1000.*z/d)/10
 while 1:
    if(map(r,n),sum(n))==(x,d):return d
    d+=1
    for i in range(len(x)):n[i]+=r(n[i])<x[i]

It works for the last example now.

Example runs:

>>> p([44.4, 44.4, 11.1])
9
>>> p([26.7, 53.3, 20.0])
15
>>> p([48.4, 13.7, 21.6, 6.5, 9.8])
153
>>> p([0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 99.6])
2000
>>> p([0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 98.7])
667
>>> p([0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 98.7])
2000
>>> p([0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 97.8])
401

grc

Posted 2012-04-24T04:04:33.953

Reputation: 18 565

I think something may be wrong in your last example; perhaps you meant 99.1 as the last value – Cristian Lupascu – 2012-04-25T06:46:25.260

2I think it's right but it's quite confusing. 1/2000 = 0.05% (0.1% rounded) and 1991/2000 = 99.55% (99.6% rounded). So if there are ten options in a poll and nine of them get voted for once while the last gets 1991 votes, then it would give those percentages. – grc – 2012-04-25T07:32:35.063

You're right. Great solution, BTW. – Cristian Lupascu – 2012-04-27T15:09:57.383

I think you can save 3 more chars by following this tip: http://codegolf.stackexchange.com/a/58/3527

– Cristian Lupascu – 2012-04-27T15:10:37.790

Thanks, w0lf. I've updated it now to include tabs. Tabs show up as four spaces if anyone's wondering. – grc – 2012-04-28T01:53:06.567

4

Python, 154

def r(l):
 v=0
 while 1:
  v+=1;o=[round(y*v/100)for y in l];s=sum(o)
  if s: 
    if all(a==b for a,b in zip(l,[round(y*1000/s)/10for y in o])):return s

Blazer

Posted 2012-04-24T04:04:33.953

Reputation: 1 902

+1 Looks good! http://ideone.com/k2Mgb. I tried to find a pathological case to break it and I couldn't.

– mellamokb – 2012-04-24T19:55:14.647

I can't generate on ideone due to time-limit exceeding, but what result do you get for [0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,99.6]? – mellamokb – 2012-04-24T20:00:56.667

hmm... half an hour and the program is still running. I think its probably safe to say that's a breaker. however, I dont see how that can be a valid answer because it totals 100.5% and not 100% – Blazer – 2012-04-25T05:21:18.363

21/2000 = 0.05% (0.1% rounded) and 1991/2000 = 99.55% (99.6% rounded). So it does actually total 100%, but the rounding makes it really confusing. – grc – 2012-04-25T05:31:17.350

4

J, 57 characters

t=:".>'1'8!:0|:100*%/~i.1001
{.I.*/"1(t{~i.#t)e."1~1!:1[1

Used the trivial method. It takes input from the keyboard. t creates a lookup table and the second line looks for the input within the table. I can provide an expanded explanation of the code if anyone's interested.

I had looked into using the percentage to create a fraction then get the lowest form of the fraction to figure out the number, but I couldn't figure out a way to make it work with the rounding of the results.

Gareth

Posted 2012-04-24T04:04:33.953

Reputation: 11 678

Hmm, this fails for the new test case. I'll have to look for a fix. – Gareth – 2012-04-25T11:06:19.913

3

VBA - 541

This has got some glaring errors, but it was my attempt to find a non-trivial/looping-until-I-get-the-right-number solution. I have not fully golfed it, though I don't think there's much to add in that regard. However, I've spent too much time on this, and it hurts my head now. Not to mention, the rules are probably very broken and apply more or less to these examples only.

This does very well for a lot of simple tests I ran, (i.e. even totals, 2 or 3 inputs) but it fails for some of the tests presented by the challenge. However, I found that if you increase the decimal precision of the input (outside the scope of the challenge), the accuracy improves.

Much of the work involves finding the gcd for the set of numbers provided, and I sort of got that through Function g(), though it is for sure incomplete and likely a source of at least some of the errors in my outputs.

Input is a space-delimited string of values.

Const q=10^10
Sub a(s)
e=Split(s)
m=1
f=UBound(e)
For i=0 To f
t=1/(e(i)/100)
m=m*t
n=IIf(n>t Or i=0,t,n)
x=IIf(x<t Or i=0,t,x)
Next
h=g(n,x)
i=(n*x)/(h)
If Int(i)=Round(Int(i*q)/q) Then
r=i
ElseIf (n+x)=(n*x) Then
r=(1/(n*x))/h/m
ElseIf x=Int(x) Then
r=x*(f+1)
Else
z=((n+x)+(n*x)+m)*h
y=m/(((m*h)/(f+1))+n)
r=IIf(y>z,z,y)
End If
Debug.Print Round(r)
End Sub
Function g(a,b)
x=Round(Int(a*q)/q,3)
y=Round(Int(b*q)/q,3)
If a Then
If b Then
If x>y Then
g=g(a-b,b)
ElseIf y>x Then
g=g(a,b-a)
Else
g=a
End If
End If
Else
g=b
End If
End Function

Testcases (input ==> expected/returned):

Passed:  

"95 5" ==> 20/20
"90 10" ==> 10/10
"46.7 53.3" ==> 15/15
"4.7 30.9 40.4 23.8" ==> 42/42
"44.4 44.4 11.1" ==> 9/9
"26.7 53.3 20.0" ==> 15/15
"48.4 13.7 21.6 6.5 9.8" ==> 153/153
"0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 0.05 99.55" ==> 2000/2000
"0.15 0.15 0.15 0.15 0.15 0.15 0.15 0.15 0.15 98.65" ==> 2000/2000
"0.149925 0.149925 0.149925 0.149925 0.149925 0.149925 0.149925 0.149925 0.149925 98.65067" ==> 667/667


Failed:  

"0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 99.6" ==> 2000/1000
"0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 98.7" ==> 2000/5000
"0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 98.7" ==> 667/1000
"0.14 0.14 0.14 0.14 0.14 0.14 0.14 0.14 0.14 98.65" ==> 667/10000
"0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 97.8" ==> 401/500
"0.24 0.24 0.24 0.24 0.24 0.24 0.24 0.24 0.24 97.75" ==> 401/235
"0.249377 0.249377 0.249377 0.249377 0.249377 0.249377 0.249377 0.249377 0.249377 97.75561" ==> 401/14010

Gaffi

Posted 2012-04-24T04:04:33.953

Reputation: 3 411

you can loose 6 bytes by converting Debug.Print to Debug.? – Taylor Scott – 2017-05-31T16:28:14.910

2

C# (.NET Core), 286 bytes

double M(string[]a){var p=a.Select(double.Parse).ToList();var n=p.Select(x=>1d).ToList();var c=2;for(;;){Func<double,double>f=x=>Math.Round(x*1000/c,(MidpointRounding)1)/10;if(n.Select(f).Zip(p,(x,y)=>x==y).All(z=>z)&&c==n.Sum())return c;c++;n=n.Zip(p,(x,y)=>x+(f(x)<y?1:0)).ToList();}}

Try it online!

Saved lots of bytes thanks to Peter Taylor and Embodiment of Ignorance

Cristian Lupascu

Posted 2012-04-24T04:04:33.953

Reputation: 8 369

How could I modify this to test it on ideone.com? – Gareth – 2012-04-24T09:10:37.300

I think you're missing a } at the end. – grc – 2012-04-24T10:37:25.070

@Gareth I tried to execute it on ideone.com, but I think it's using a .NET framework version earlier than 4.0, because it does not recognize the Linq Zip method. – Cristian Lupascu – 2012-04-24T10:48:51.967

@grc Thanks for pointing that out. Updated. – Cristian Lupascu – 2012-04-24T10:49:07.097

I've got it running on ideone.com: http://ideone.com/GofZ4. Even for the simple test case 44.4 44.4 11.1 it errors out because it takes too long (6+ seconds). Is that expected? Edit: works just fine in LINQPad, maybe mono just isn't well optimized for LINQ.

– mellamokb – 2012-04-24T12:22:09.880

@mellamokb thanks for testing the code. I don't know why it takes so long in ideone. I ran it in Visual Studio and there were no strange delays. – Cristian Lupascu – 2012-04-24T12:34:04.487

I'm not familiar with C#. Is while(1>0) necessary? Does while(1) not work? – Gaffi – 2012-04-24T16:10:30.480

1@Gaffi: No, C# has strict typing (like Java) so it must be a boolean. Since 1>0 is shorter than true, it is preferred. – mellamokb – 2012-04-24T16:29:34.077

Why is the argument called args? That's 6 characters straight down the pan. You can also save 2 chars by moving ++c into the while guard, and 1 by changing the first if guard to s>0. And I'm not sure whether you're counting any unnecessary whitespace, but you seem to be. – Peter Taylor – 2012-04-24T22:05:25.453

@PeterTaylor Thank you for all your great suggestions. I've updated my code and saved 7 characters. Leaving args in there was a really silly mistake. – Cristian Lupascu – 2012-04-25T06:30:47.730

@PeterTaylor the while(++c>0) is an ingenious trick, but unfortunately it won't compile unless I add another return statement, after the while loop. That's because the compiler knows it's never going to get out of while(1>0) unless there's a return in the while, but with the ++c>0 condition it does not have that guarantee at compile time. – Cristian Lupascu – 2012-04-25T06:34:14.683

Looks like this solution also goes into an infinite loop for some of my new test cases. See the update to the question for the new test cases. Here's a LINQPad-ready test case runner: http://pastebin.com/811wZtFG

– mellamokb – 2012-04-25T17:15:41.760

@mellamokb You are right; I'll check why this happens and then fix the code – Cristian Lupascu – 2012-04-26T09:08:28.607

I have fixed the code. – Cristian Lupascu – 2012-04-27T15:05:43.397

I know this is really old, but can't you change the while(1>0) to for(;;)? Also, I'm pretty sure the code is only 300 bytes, without linebreaks. You can also use (MidpointRounding)1 instead of MidpointRounding.AwayFromZero – Embodiment of Ignorance – 2019-02-13T02:39:52.320

@EmbodimentofIgnorance Right, thanks! This is in fact a submission so old that now I realized I missed some basic golfing tricks, like cutting whitespace. Fixed now. – Cristian Lupascu – 2019-02-15T11:51:44.570

0

Python 3, 140 139 137 bytes

f=lambda l,m=1,i=0,c=0,x=0:round(x*100,1)-l[i]and(x<1and f(l,m,i,c,x+1/m)or f(l,m+1))or l[i+1:]and f(l,m,i+1,c+x)or c+x-1and f(l,m+1)or m

Try it online!

Gives the right answer for the first two test cases, and runs into Python's recursion limits for the others. This is not very surprising, since every check is made on a new recursion level. It's short, though...

(An explanation of the used variables can be found in the TIO link)

f=lambda l,m=1,i=0,c=0,x=1:round(x*100,1)-l[i]and(x and f(l,m,i,c,x-1/m)or f(l,m+1))or l[i+1:]and f(l,m,i+1,c+x)or c+x-1and f(l,m+1)or m

should work for 136 bytes, but doesn't due to float precision.

ArBo

Posted 2012-04-24T04:04:33.953

Reputation: 1 416