20
I was intrigued by the design of this graphic from the New York Times, in which each US state is represented by a square in a grid. I wondered whether they placed the squares by hand or actually found an optimal placement of squares (under some definition) to represent the positions of the contiguous states.
Your code is going to take on a small part of the challenge of optimally placing squares to represent states (or other arbitrary two-dimensional shapes.) Specifically, it's going to assume that we already have all the geographic centers or centroids of the shapes in a convenient format, and that the optimal representation of the data in a diagram like this is the one in which the total distance from the centroids of the shapes to the centers of the squares that represent them is minimal, with at most one square in each possible position.
Your code will take a list of unique pairs of floating-point X and Y coordinates from 0.0 to 100.0 (inclusive) in any convenient format, and will output the non-negative integer coordinates of unit squares in a grid optimally placed to represent the data, preserving order. In cases where multiple arrangements of squares are optimal, you can output any of the optimal arrangements. Between 1 and 100 pairs of coordinates will be given.
This is code golf, shortest code wins.
Examples:
Input: [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
This is an easy one. The centers of the squares in our grid are at 0.0, 1.0, 2.0, etc. so these shapes are already perfectly placed at the centers of squares in this pattern:
21
03
So your output should be exactly these coordinates, but as integers, in a format of your choice:
[(0, 0), (1, 1), (0, 1), (1, 0)]
Input: [(2.0, 2.1), (2.0, 2.2), (2.1, 2.0), (2.0, 1.9), (1.9, 2.0)]
In this case all the shapes are close to the center of the square at (2, 2), but we need to push them away because two squares cannot be in the same position. Minimizing the distance from a shape's centroid to the center of the square that represents it gives us this pattern:
1
402
3
So your output should be [(2, 2), (2, 3), (3, 2), (2, 1), (1, 2)]
.
Test cases:
[(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)] -> [(0, 0), (1, 1), (0, 1), (1, 0)]
[(2.0, 2.1), (2.0, 2.2), (2.1, 2.0), (2.0, 1.9), (1.9, 2.0)] -> [(2, 2), (2, 3), (3, 2), (2, 1), (1, 2)]
[(94.838, 63.634), (97.533, 1.047), (71.954, 18.17), (74.493, 30.886), (19.453, 20.396), (54.752, 56.791), (79.753, 68.383), (15.794, 25.801), (81.689, 95.885), (27.528, 71.253)] -> [(95, 64), (98, 1), (72, 18), (74, 31), (19, 20), (55, 57), (80, 68), (16, 26), (82, 96), (28, 71)]
[(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.0, 0.1), (0.1, 0.1), (0.2, 0.1), (0.0, 0.2), (0.1, 0.2), (0.2, 0.2)] -> [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1), (0, 2), (1, 2), (2, 2)]
[(1.0, 0.0), (1.0, 0.1), (1.0, 0.2), (1.0, 0.3)] -> [(1, 0), (0, 0), (2, 0), (1, 1)] or [(1, 0), (2, 0), (0, 0), (1, 1)]
[(3.75, 3.75), (4.25, 4.25)] -> [(3, 4), (4, 4)] or [(4, 3), (4, 4)] or [(4, 4), (4, 5)] or [(4, 4), (5, 4)]
Total distance from the centroids of shapes to the centers of the squares that represent them in each case (please let me know if you spot any errors!):
0.0
3.6
4.087011
13.243299
2.724791
1.144123
Just for fun:
Here's a representation of the geographic centers of the contiguous United States in our input format, at roughly the scale the Times used:
[(15.2284, 3.1114), (5.3367, 3.7096), (13.0228, 3.9575), (2.2198, 4.8797), (7.7802, 5.5992), (20.9091, 6.6488), (19.798, 5.5958), (19.1941, 5.564), (17.023, 1.4513), (16.6233, 3.0576), (4.1566, 7.7415), (14.3214, 6.0164), (15.4873, 5.9575), (12.6016, 6.8301), (10.648, 5.398), (15.8792, 5.0144), (13.2019, 2.4276), (22.3025, 8.1481), (19.2836, 5.622), (21.2767, 6.9038), (15.8354, 7.7384), (12.2782, 8.5124), (14.1328, 3.094), (13.0172, 5.3427), (6.142, 8.8211), (10.0813, 6.6157), (3.3493, 5.7322), (21.3673, 7.4722), (20.1307, 6.0763), (7.5549, 3.7626), (19.7895, 7.1817), (18.2458, 4.2232), (9.813, 8.98), (16.8825, 6.1145), (11.0023, 4.2364), (1.7753, 7.5734), (18.8806, 6.3514), (21.3775, 6.6705), (17.6417, 3.5668), (9.9087, 7.7778), (15.4598, 4.3442), (10.2685, 2.5916), (5.3326, 5.7223), (20.9335, 7.6275), (18.4588, 5.0092), (1.8198, 8.9529), (17.7508, 5.4564), (14.0024, 7.8497), (6.9789, 7.1984)]
To get these, I took the coordinates from the second list on this page and used 0.4 * (125.0 - longitude)
for our X coordinate and 0.4 * (latitude - 25.0)
for our Y coordinate. Here's what that looks like plotted:
The first person to use the output from their code with the above coordinates as input to create a diagram with actual squares gets a pat on the back!
I believe the last point in your second example should be
(1, 2)
, not(1, 1)
. – Tim Pederick – 2016-01-18T12:08:47.237Good catch, thanks! – Luke – 2016-01-18T12:50:28.037
Can you please also post the total of all the distances in each test case? This is certainly a nontrivial problem, and that would allow us to verify whether an alternative solution is actually also optimal. – flawr – 2016-01-18T16:26:58.000
PS: Have you actually tested that the given map is actualy a valid result of your optimization problem? Because intuitively I do not think it is. – flawr – 2016-01-18T16:27:02.887
I can add the total distances. The map the Times used is almost certainly not optimal. – Luke – 2016-01-18T16:48:49.170
The given map also includes the District of Columbia, which isn't a state. – mbomb007 – 2016-10-18T20:07:43.117