22
1
Background:
I originally posted this question last night, and received backlash on its vagueness. I have since consulted many personnel concerning not only the wording of the problem, but also its complexity (which is not O(1)). This programming problem is an evil spin on an Amazon interview question.
Question:
Given a String of randomly concatenated integers [0, 250), 0 to 250 exclusive, there is ONE number missing in the sequence. Your job is to write a program that will calculate this missing number. There are no other missing numbers in the sequence besides the one, and that is what makes this problem so difficult, and possibly computationally hard.
Doing this problem by hand on smaller Strings, such as examples 1 and 2 below are obviously very easy. Conversely, computing a missing number on incredibly large datasets involving three-digit or four-digit numbers would be incredibly difficult. The idea behind this problem is to construct a program that will do this process FOR you.
Important Information:
One thing that appeared as rather confusing when I posted this problem last night was: what exactly a missing number is defined as. A missing number is the number INSIDE of the range specified above; NOT necessarily the digit. In example 3, you will see that the missing number is 9, even though it appears in the sequence. There are 3 places the DIGIT 9 will appear in a series of [0, 30): “9”, “19”, and “29”. Your objective is to differentiate between these, and discover that 9 is the missing NUMBER (inside of example 3). In other words, the tricky part lies in finding out which sequences of digits are complete and which belong to other numbers.
Input:
The input is a String S, containing integers from 0 to 249 inclusive, or 0 to 250 exclusive (in other words, [0, 250)). These integers, as stated above, are scrambled up to create a random sequence. There are NO delimiters (“42, 31, 23, 44”), or padding 0’s (003076244029002); the problems are exactly as described in the examples. It is guaranteed that there is only 1 solution in the actual problems. Multiple solutions are not permitted for these.
Winning Criteria:
Whoever has the fastest, and lowest memory usage will be the winner. In the miraculous event that a time ties, lower memory will be used for the time breaker. Please list Big O if you can!
Examples:
Examples 1 and 2 have a range of [0, 10)
Examples 3 and 4 have a range of [0, 30)
(Examples 1-4 are just for demonstration. Your program needn't to handle them.)
Examples 5 has a range of [0, 250)
1. 420137659
- Missing number => 8
2. 843216075
- Missing number => 9
3. 2112282526022911192312416102017731561427221884513
- Missing number => 9
4. 229272120623131992528240518810426223161211471711
- Missing number => 15
5. 11395591741893085201244471432361149120556162127165124233106210135320813701207315110246262072142253419410247129611737243218190203156364518617019864222241772384813041175126193134141008211877147192451101968789181153241861671712710899168232150138131195104411520078178584419739178522066640145139388863199146248518022492149187962968112157173132551631441367921221229161208324623423922615218321511111211121975723721911614865611197515810239015418422813742128176166949324015823124214033541416719143625021276351260183210916421672722015510117218224913320919223553222021036912321791591225112512304920418584216981883128105227213107223142169741601798025
- Missing number => 71
Test Data:
Problem 1: 6966410819610521530291368349682309217598570592011872022482018312220241246911298913317419721920718217313718080857232177134232481551020010112519172652031631113791105122116319458153244261582135510090235116139611641267691141679612215222660112127421321901862041827745106522437208362062271684640438174315738135641171699510421015199128239881442242382361212317163149232839233823418915447142162771412092492141987521710917122354156131466216515061812273140130240170972181176179166531781851152178225242192445147229991613515911122223419187862169312013124150672371432051192510724356172282471951381601241518410318414211212870941111833193145123245188102
Problem 2: 14883423514241100511108716621733193121019716422221117630156992324819917158961372915140456921857371883175910701891021877194529067191198226669314940125152431532281961078111412624224113912011621641182322612016512820395482371382385363922471472312072131791925510478122073722091352412491272395020016194195116236186596116374117841971602259812110612913254255615723013185162206245183244806417777130181492211412431591541398312414414582421741482461036761192272120204114346205712198918190242184229286518011471231585109384415021021415522313136146178233133168222201785172212108182276835832151134861116216716910511560240392170208215112173234136317520219
Problem 3: 1342319526198176611201701741948297621621214122224383105148103846820718319098731271611601912137231471099223812820157162671720663139410066179891663131117186249133125172622813593129302325881203242806043154161082051916986441859042111711241041590221248711516546521992257224020174102234138991752117924457143653945184113781031116471120421331506424717816813220023315511422019520918114070163152106248236222396919620277541101222101232171732231122301511263822375920856142187182152451585137352921848164219492411071228936130762461191564196185114910118922611881888513917712153146227193235347537229322521516718014542248813617191531972142714505519240144
Problem 4: 2492402092341949619347401841041875198202182031161577311941257285491521667219229672211881621592451432318618560812361201172382071222352271769922013259915817462189101108056130187233141312197127179205981692121101632221732337196969131822110021512524417548627103506114978204123128181211814236346515430399015513513311152157420112189119277138882021676618323919018013646200114160165350631262167910238144334214230146151171192261653158161213431911401452461159313720613195248191505228186244583455139542924222112226148941682087115610915344641782142472102436810828123731134321131241772242411722251997612923295223701069721187182171471055710784170217851
1Clarification: I see you tagged [tag:fastest-algorithm], but it's a bit unclear in the description. is this challenge [tag:fastest-algorithm] (as in, lowest time complexity) or [tag:fastest-code] (as in, taking least amount of time on a particular machine)? – JungHwan Min – 2017-12-06T03:15:54.350
2Also, must the program support any values of
N
, not just250
? / What about the232
issue? All possibilities or one any? I realize that you knew about that issue, but I find it unclear in the question. / If this is [tag:fastest-code] there must be a way to measure them. Of course running on a supercomputer is different from running on an old computer. / Because no one said that, -- Welcome to PPCG! – user202729 – 2017-12-06T03:22:19.703Ah, I was wondering what had happened, welcome back! I think the case remains that some problems might have multiple solutions. – fede s. – 2017-12-06T03:23:04.863
Why are the examples have different ranges, but the question specify fixed range (
250
)? Which one do you want? – user202729 – 2017-12-06T03:28:43.820250 is what I want; I was using the others to DEMONSTRATE the idea. Nevertheless, the problem has been solved in Java and C++. – TheProgrammer – 2017-12-06T03:31:10.953
And once again, there is only ONE number missing; it's impossible to have multiple solutions. – TheProgrammer – 2017-12-06T03:32:44.853
The program should be capable of solving N = 250. – TheProgrammer – 2017-12-06T03:34:40.637
@JoshuaCrotts Let's continue discussing in the chatroom.
– user202729 – 2017-12-06T03:39:11.5871This is a fascinating problem, but (at least according to the answers thus far) is too trivial to get enough computational complexity to be able to meaningfully differentiate between the answers to determine a winner, which is a bummer. – AdmBorkBork – 2017-12-06T13:32:59.293
@AdmBorkBork, you are correct... I'm excited to see the solutions though! I do apologize for there not being enough clarity to determine a true WINNER; in all honesty, I just wanted to see if this problem was solvable. – TheProgrammer – 2017-12-06T17:04:50.107
1@JoshuaCrotts you could always raise
N
to, say, 1000 or 10000. – Οurous – 2017-12-06T18:23:09.010@Ourous But wouldn't that invalidate some answers? Would I need to repost the question? – TheProgrammer – 2017-12-07T01:40:13.243
4Congrats on PPCG post #150,000 ;) – ETHproductions – 2017-12-07T02:21:32.737
@JoshuaCrotts as far as I can tell, all the current answers either take
N
as an argument or have it in a variable, which would require almost no change. None have methods dependent on a certain bound of N. – Οurous – 2017-12-07T06:24:41.233@Ourous should I just repost the question? Or should I update it? If I want people to get a fair chance/actually SEE the post, would updating it bump it? – TheProgrammer – 2017-12-08T20:21:07.627
I think, I found an instance which is fairly regular, but seems to be difficult to solve for the programs I've tried (clingo + cpp). Maybe someone can confirm its validity and the results.
– politza – 2017-12-09T00:24:09.197@politza My (fixed) clingo program solves it in ≈ 0.03 seconds, and
clingo -n0
(compute all models) validates that 231 is the only solution. – Anders Kaseorg – 2017-12-09T21:02:59.130@JoshuaCrotts Is there an unstated assumption that no number is repeated (so
23456789100
must parse as 2, …, 9, 10, 0 rather than 2, …, 9, 1, 0, 0)? – Anders Kaseorg – 2017-12-09T21:05:20.623@Anders Kaseorg That is correct; none of the numbers are repeated. – TheProgrammer – 2017-12-11T01:32:15.683
@JoshuaCrotts Great (please edit in this clarification though).
– Anders Kaseorg – 2017-12-11T03:11:08.943