11
1
Challenge
Find the smallest cover of bases (e.g., moduli) whose sets of quadratic residue can be tested via table-lookup to definitively determine whether or not a given non-negative integer n is a perfect square. The bases must all be less than or equal to the square root of the maximum value of n.
The answer with the smallest set of bases for a given category of n wins the challenge. (This means there may potentially be more than one winner.) The categories of n are:
Category Maximum allowed n Maximum allowed modulus/base
------------- -------------------- ----------------------------
8-bit values 255 15
16-bit values 65535 255
32-bit values 4294967295 65535
64-bit values 18446744073709551615 4294967295
In the event of a tie with two sets having equal cardinality, the tie will go to the set having the greater ability to detect non-squares earlier in the sequence.
In the event that no complete covers are found (which is entirely likely for the 32-bit and 64-bit categories), the winner will be the set of bases which statistically or provably rules out the highest percentage of non-squares (without incorrectly reporting squares as non-squares). See below for a discussion of incomplete covers.
Background
In many number theory applications, the question arises whether or not some number n is a perfect square (0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, etc.). One way to test whether n is square is to test whether floor(√n)² = n, that is, whether the rounded-down square root of n, when squared, gives back n. For example, floor(√123)² = 11² = 121, which is not 123, so 123 is not square; but floor(√121)² = 11² = 121, so 121 is square. This method works fine for small numbers, especially when a hardware square-root operation is available. But for large numbers (hundreds or thousands of bits) it can be very slow.
Another way to test for squareness is to rule out non-squares using quadratic residue tables. For example, all squares in base 10 must have a final (ones-place) digit that is either 0, 1, 4, 5, 6, or 9. These values form the set of quadratic residue for base 10. So if a base-10 number ends in 0, 1, 4, 5, 6, or 9, you know that it might be square, and further examination will be required. But if a base-10 number ends in 2, 3, 7, or 8, then you can be certain that it is not square.
So let's look at another base. All squares in base 8 must end in 0, 1, or 4, which conveniently is only 3 of 8 possibilities, meaning a 37.5% chance of a random number being possibly square, or 62.5% chance of a random number definitely not being square. Those are much better odds than what base 10 gives. (And note that a base-8 modulus operation is simply a logical-and operation, as opposed to base-10 modulus, which is a division by 10 with remainder.)
Are there even better bases? Well, yes, actually. Base 120 has 18 possibilities (0, 1, 4, 9, 16, 24, 25, 36, 40, 49, 60, 64, 76, 81, 84, 96, 100, and 105), which represents only a 15% chance of possibly being square. And base 240 is better yet, with only 24 possibilities, representing only 10% chance of possibly being square.
But no single base alone can definitively determine squareness (unless it is larger than the maximum number being tested, which is eminently impractical). A single base alone can only rule out squareness; it cannot conclusively verify squareness. Only a carefully selected set of bases, working together, can conclusively verify squareness over a range of integers.
So, the question becomes: What set of bases forms a minimal cover which together permit the definitive deduction of squareness or non-squareness?
Example of a correct but non-minimal cover
The cover 16-base cover { 3, 4, 5, 7, 8, 9, 11, 13, 16, 17, 19, 23, 25, 29, 31, 37 } is sufficient to definitively determine squareness or non-squareness of all 16-bit values 0 to 65535. But it is not a minimal cover, because at least one 15-base cover exists that is also easily discoverable. In fact, it is likely that far smaller covers exist — perhaps with as few as 6 or 7 bases.
But for illustration, let's take a look at testing a sample value of n using this 16-base cover set. Here are the sets of quadratic residues for the above set of bases:
Base m Quadratic residue table specific to base m
------ ----------------------------------------------------
3 {0,1}
4 {0,1}
5 {0,1,4}
7 {0,1,2,4}
8 {0,1,4}
9 {0,1,4,7}
11 {0,1,3,4,5,9}
13 {0,1,3,4,9,10,12}
16 {0,1,4,9}
17 {0,1,2,4,8,9,13,15,16}
19 {0,1,4,5,6,7,9,11,16,17}
23 {0,1,2,3,4,6,8,9,12,13,16,18}
25 {0,1,4,6,9,11,14,16,19,21,24}
29 {0,1,4,5,6,7,9,13,16,20,22,23,24,25,28}
31 {0,1,2,4,5,7,8,9,10,14,16,18,19,20,25,28}
37 {0,1,3,4,7,9,10,11,12,16,21,25,26,27,28,30,33,34,36}
Now let's test the number n = 50401 using this set of bases, by converting it into each base. (This is not the most efficient way examine residue, but it is sufficient for explanatory purposes.) It's the 1's place that we're interested in here (marked below in parentheses):
Base "Digits" in base m
m m^9 m^8 m^7 m^6 m^5 m^4 m^3 m^2 m^1 ( m^0 )
---- -----------------------------------------------------------------
3 2 1 2 0 0 1 0 2 0 ( 1 ) ✓
4 3 0 1 0 3 2 0 ( 1 ) ✓
5 3 1 0 3 1 0 ( 1 ) ✓
7 2 6 6 6 4 ( 1 ) ✓
8 1 4 2 3 4 ( 1 ) ✓
9 7 6 1 2 ( 1 ) ✓
11 3 4 9 5 ( 10 )
13 1 9 12 3 ( 0 ) ✓
16 12 4 14 ( 1 ) ✓
17 10 4 6 ( 13 ) ✓
19 7 6 11 ( 13 )
23 4 3 6 ( 8 ) ✓
25 3 5 16 ( 1 ) ✓
29 2 1 26 ( 28 ) ✓
31 1 21 13 ( 26 )
37 36 30 ( 7 ) ✓
So we can see that in 13 of these bases, the residue matches a known quadratic residue (call this a "hit" in the table), and in 3 of these bases, the residue does not match a known quadratic residue (call this a "miss"). All it takes is 1 miss to know that a number is non-square, so we could stop at 11, but for illustrative purposes, we've examined all 16 of the bases here.
Example of an incomplete cover
Technically, an incomplete cover is not a cover, but that's beside the point. The set of bases { 7, 8, 11, 15 } nearly covers all the 8-bit values of n from 0 to 255 correctly, but not quite. In particular, it incorrectly identifies 60 and 240 as being square (these are false positives) — but it does correctly identify all of the actual squares (0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, and 225) and makes no other false positives. So this is a 4-set which almost succeeds as a solution, but ultimately fails, because an incomplete cover is not a valid solution.
For 8-bit n, the set of bases { 7, 8, 11, 15 } is one of two sets of 4 bases which produce two errors, and there are seven sets of 4 bases which only produce one error. No sets of 4 bases actually exist which form a complete and accurate cover of 8-bit values. Can you find a set of 5 bases which produce no errors, covering all 8-bit values correctly? Or do you need 6 or more? (I do know the answer for 8-bit n, but I'm not going to give it away. I do not know the answer for 16-bit, 32-bit, or 64-bit, and I believe even the 16-bit case is impossible to solve via brute-force searching. Solving the 32-bit and 64-bit cases will certainly require genetic, heuristic, or other search techniques.)
A comment on cryptographically large numbers
Beyond 64-bit numbers — up in the hundreds or thousands of binary digits — this is where a quick squareness check really comes in most handy, even if the cover is incomplete (which it most certainly will be for really large numbers). How could a test like this still be useful even if it is insufficiently decisive? Well, imagine you had an extremely fast test for squareness which worked correctly 99.9% of the time and gave false negatives the remaining 0.1% of the time and never gave false positives. With such a test, you would be able to determine non-squareness of a number nearly instantly, and then in the exceptional cases of indecisiveness, you could then resort to a slower method to resolve the unknown a different way. This would save you quite a bit of time.
For example, the set { 8, 11, 13, 15 } is correct 99.61% of the time for 8-bit values of n from 0 to 255, is correct 95.98% of the time for 16-bit values of n from 0 to 65535, and is correct 95.62% of the time for 24-bit values of n from 0 to 16777215. As n goes to infinity, the percentage of correctness for this set of bases goes down, but it asymptotically approaches and never drops below 95.5944% correctness.
So even this very tiny set of 4 small bases is useful to nearly immediately identify about 22 out of 23 arbitrarily large numbers as being non-squares — obviating the need for further inspection of those numbers by slower methods. The slower methods then need only be applied in the small percentage of cases which could not be ruled out by this quick test.
It is interesting to note that some 16-bit bases achieve better than 95% all on their own. In fact, each of the bases below is able to weed out better than 97% of all numbers up to infinity as not being square. The quadratic residue set for each of these bases can be represented as a packed-bit array using only 8192 bytes.
Here are the 10 most powerful single bases less than 2^16:
Rank Base Prime factorization Weeds out
---- ------------------------------ ---------
1. 65520 = 2^4 x 3^2 x 5 x 7 x 13 97.95%
2. 55440 = 2^4 x 3^2 x 5 x 7 x 11 97.92%
3. 50400 = 2^5 x 3^2 x 5^2 x 7 97.56%
4. 52416 = 2^6 x 3^2 x 7 x 13 97.44%
5. 61200 = 2^4 x 3^2 x 5^2 x 17 97.41%
6. 44352 = 2^6 x 3^2 x 7 x 11 97.40%
7. 63360 = 2^7 x 3^2 x 5 x 11 97.39%
8. 60480 = 2^6 x 3^3 x 5 x 7 97.38%
9. 63840 = 2^5 x 3 x 5 x 7 x 19 97.37%
10. 54720 = 2^6 x 3^2 x 5 x 19 97.37%
See anything interesting that these bases all have in common? There's no reason to think they might be useful in combination together (maybe they are, maybe they aren't), but there are some good clues here as to what bases are likely to be most influential for the larger categories of numbers.
Side challenge: One of the most influential bases (if not the most) up to 2^28 is 245044800, which alone can correctly weed out 99.67% of non-squares, or about 306 of 307 random numbers thrown at it. Can you find the most influential single base less than 2^32?
Related
There are some very nice ideas in the following questions that are closely related, as well as several micro-optimization tricks for making certain operations faster. Although the linked questions do not specifically set out to find the strongest set of bases, the idea of strong bases is implicitly central some of the optimization techniques used there.
How will you determine the tie-breaker short of testing every single number in the given range and counting how many checks were made in total? – Martin Ender – 2014-08-03T09:18:20.397
I'll look at the cardinality of the sets of quadratic residues for each base. For example, 4 is a better base than 3, because only half of the values modulo 4 are quadratic residues, whereas two-thirds of the values modulo 3 are quadratic residues. Thus, 4 has a greater ability to weed out numbers earlier. The worst base is 2, because it cannot rule out any number, and the best base less than 256 is 240, which is capable of ruling out 90% of numbers. Might have to do Monte Carlo sampling for really large bases. – Todd Lehman – 2014-08-03T09:23:23.973
Yeah, that makes sense. But will you decide the tie just by the first base whose probability differs, or how will you figure out the efficiency of the entire set based on the probabilities? I'm also thinking that the probabilities aren't independent any more once you've checked other bases. – Martin Ender – 2014-08-03T09:30:33.167
2In the case of large n spaces, I think I will have to decide the tie based on the overall estimated efficiency, as calculated by multiplying the probabilities predicted by each residue set. For example, the bases {8,11,13,15} have probabilities of 0.375, 0.545455, 0.538462, and 0.4, respectively, which multiply to 0.044056. Subtracting from 1, this gives 0.955944, which agrees very closely with the exhaustive counting result of 95.62% as measured over all n in [0,2^24-1]. – Todd Lehman – 2014-08-03T09:47:13.173