The size of the moon mystery

I am sure you have heard that the moon changes its size. When you're in love and you're lucky, the moon is almost twice in size compared to normal situations. Some people say the reason is the atmosphere which acts as a lens. Others think that it is only a matter of comparison to other objects such as trees nearby. Whatever explanation you read, it's quite subjective.

The size of the moon science

Ok, we're programmers, aren't we? We rely on facts, right? So here's the experiment:

Take a nice camera which supports setting time and aperture manually.
Set your camera to maximum zoom level.
Go out, take some photographs of the moon in order to detect the best settings so that the moon is sharp and lighting is just fine.
Remember the settings
Take photos of the moon with those settings every time you think the moon is large or small.
Calculate the size of the moon in pixels

The camera won't lie, would it? By counting the bright pixels we can effectively measure the size of the moon - at least in pixels.

If the size is the same across all photos, then it's a bug in our brain. If the size differs, then there's room for speculation

the moon really grows (but what does it eat?)
there's an atmospherical lens effect
the moon has an elliptical curve and is sometimes nearer, sometimes further away from earth
...

But I'll leave that open until your task is completed. Of course you want to know in advance if your software can calculate the moon size accurately.

The task

Given a few optimized pictures of the moon, please calculate the size of the moon. The optimization is: the pixels are either black or white. Nothing in between. No antialiasing. That makes it easy, doesn't it?

The caveat: the moon is not always full, you know ... it can be a sickle! But even in shape of a sickle, the size of the moon is larger. So you'll calculate the full size, please.

Your program takes a PNG as input, e.g. as file name command line argument, piped into stdin or as a Bitmap object (of a standard framework library) if you write a function instead of a program.
Your program works with any reasonable input bitmap size, not necessarily square. Minimum width and height of 150 pixels are guaranteed.
The full moon covers at least 25% of the picture.
Your program outputs the calculated size of the moon in pixels as if it were a full moon.
We assume that the moon is a perfect sphere.
The exact size is always an integer number, but you can output a decimal number if your calculation returns that.
The accuracy should be between 98% and 102%. (That's rather a guess than something I could guarantee to be achievable. If you think it's too hard to reach, please leave a comment.)

Update:

The center of the moon isn't necessarily in the middle of the picture.
The minimum visible area is 5% of the moon or 1.25% of the total number of pixels.
The picture is taken in a way that the whole moon would fit the image, i.e. the total number of pixels is an upper boundary for the moon size.
The moon will not be cropped / clipped.

The samples

You can generate your own samples using the blend file if you like. I have created the following pictures for you. You can count pixels in a PNG file using WhitePixelCounter.exe (needs .NET) to check whether the image contains black and white pixels only and how many of them.

The following 256x256 pixel images differ in the amount of white pixels, but should all result in a calculated moon size of 16416 pixels.

Full moon Moon Moon Moon Moon Moon

And these 177x177 pixel images should return 10241 pixels. The images are basically the same, but this time a camera with a different focal length was used.

Moon Moon Moon Moon Moon Moon

Non-square and non-centered samples with a result of 9988:

Moon in a non-square frame

Oh, I don't have a reference implementation for now and I even don't know whether I am able to implement something. But in my brain there is a strong belief that tells me it must be mathematically solvable.

The rules

This is Code Golf. The shortest code on 2015-03-30 gets accepted.

Thomas Weller

Posted 2015-03-12T23:13:43.693

Reputation: 1 925

9In all the examples, the center of the moon appears to be centered within the picture. Can we assume the moon will always be centered? – Digital Trauma – 2015-03-12T23:24:57.043

1your accuracy of +/-2% on area corresponds to +/-1% on diameter: example r=100 pixels, area = 10000pi; r=101 pixels, area = 10201pi. Your smaller image has r=72 therefore d=144 so it should just be possible. For images below d=100 I think the accuracy could not be met, however. – Level River St – 2015-03-12T23:49:32.197

@DigitalTrauma: the center needn't be in the middle. – Thomas Weller – 2015-03-13T16:58:14.960

@MartinBüttner: the minimum visible percentage is 5% of the moon or 1,25% of the picture. – Thomas Weller – 2015-03-13T17:17:49.910

@MartinBüttner: ok, I've updated the question, updated the blend file to produce non-square, non-centered images by default. You can download all images here (*.png.zip). Updated pixel counter as well: outputs some more information and checks the 1.25% rule.

– Thomas Weller – 2015-03-13T20:27:10.287

@steveverrill: I have written a kind of "reference implementation". It fails if the sickle is pointing straight upwards, but for all other pictures, the area is from -1.66% to +1.88%. – Thomas Weller – 2015-03-13T22:17:07.440

Answers

Mathematica 126 119 109 bytes

Mathematica can measure the elongation of a component in an image. A full moon, being perfectly symmetric, has an elongation of 0, on a scale of 0 to 1.

A diminishing moon becomes increasingly elongated, to a maximum of roughly 0.8.

0.998 -0.788 x-0.578 x^2 was the empirically determined model (based on the large photos) for `predicting the fullness of the moon (by area), given its elongation.

I adjusted the model to 1- 0.788 x -0.578 x^2 so that with exactly zero elongation (full moon) the model will return 1 for the pixel scale factor. It saves 4 bytes and still stays within the accuracy limits.

This model is used for any size images. The moon image does not need to be centered. It also does not need to cover a fixed proportion of the photo.

Here are the data points (elongation, displayedMoonPixels/fullMoonPixels) for the large images and the parabolic model that was generated to fit the data. Linear models fit ok, but the quadratic model is dead on, within limits (see below).

Here the data are from the large pictures. So is the model

large crescents

Below, the data (the red points) are from the small pictures. The model (the blue curve) is the one generated by the large pictures, the same one as displayed above.

The smallest crescent has 7.5% the area of a full moon. (The smallest crescent among the large photos is 19% of a full moon.) If the quadratic model had been based on the small photos the fit below would be better, only because it accommodated the small crescent. A robust model, that would stand up under a wide range of conditions, including very small crescents, would be better made from a larger variety of pictures.

The closeness of fit shows that the model was not hard-coded for the given pictures. We can be fairly certain that the elongation of a moon is independent of the size of the photo, as one would expect.

small crescents

f takes the image, i, as input and outputs the predicted size of the full moon, in pixels. It works for off-center shots.

As the data below show, it all of the test cases except one. The moons were arranged from full to most diminished.

i_~c~t_ := Max@ComponentMeasurements[i, t][[All, 2]];
f@i_ := i~c~"Count"/(1 - 0.788 x - 0.578 x^2 /. x -> i~c~"Elongation")

More than one image component may turn up in a photo. Even a single pixel separated from the others will be considered a distinct component. For this reason, it is necessary to search "all" components, to find the one that has the greater number of pixels. (One of the small photos has more than one image component.)

Large pictures

Predictions of moon size made from the large photos were uniformly accurate.

{"predicted size of full moon", f[#] & /@ large}
{"accuracy", %[[2]]/16416}

{"predicted sizes of full moon", {16422., 16270.9, 16420.6, 16585.5, 16126.5, 16151.6}}

{"accuracy", {1.00037, 0.991161, 1.00028, 1.01033, 0.982367, 0.983891}}

Small pictures

Predictions of moon size made from the small photos were uniformly, with one great exception, the final picture. I suspect the issue stems from the fact that the crescent is very narrow.

{"predicted sizes of full moon", f[#] & /@ small}
{"accuracy", %[[2]]/10241}

{"predicted sizes of full moon",{10247.3, 10161., 10265.6, 10391., 10058.9, 7045.91}}
{"accuracy", {1.00061, 0.992192, 1.0024, 1.01465, 0.982221, 0.68801}}

DavidC

Posted 2015-03-12T23:13:43.693

Reputation: 24 524

Seems like I should learn Mathematica one day. How long did it take you to solve it without golfing? – Thomas Weller – 2015-03-13T22:47:57.957

1@Thomas W It took a 2-3 hours experimenting with various sorts of image processing features and other (linear) models until I obtained the graph you see posted. The coding was not very difficult. And there is almost no golfing other than uniting separate functions into a single function. – DavidC – 2015-03-13T23:22:56.053

104: i_~c~t_:=Max[#2&@@@i~ComponentMeasurements~t];f@i_:=i~c~"Count"/(1-0.788x-0.578x^2/.x->i~c~"Elongation") – Martin Ender – 2015-03-14T10:58:45.647

For unknown reasons, the #2&@@@ suggestion does not work – DavidC – 2015-03-14T12:47:24.343

Huh, I'll look into that later. Another way to shorten c is c=Max@ComponentMeasurements[##][[All,2]]& – Martin Ender – 2015-03-14T13:29:55.647

I can't get c=Max@ComponentMeasurements[##][[All,2]]& to work either. – DavidC – 2015-03-14T14:03:16.393

@ThomasW. The code now calls ComponentMeasures twice, once for determining the size (Count) and again for obtaining the elongation of the displayed moon. For that reason, what was a single function is now two functions, c (for ComponentMeasures) and f (which predicts the size of the full moon). – DavidC – 2015-03-14T14:28:19.593

J, 227 207 bytes (maximal error 1.9%)

My main idea is that if we can find 3 points on the contour of the moon which are on the contour of the full moon too we can compute the circumcircle of these points. That circumcircle will be to full moon.

If we find two white points with maximal distance those will always be such points as they will be either a real diagonal in the full moon or the endpoints of the crescent. ~~We can find the pair of points with the greatest distance in any graph by selecting the point furthest from any given starting point and then selecting the point furthest from the selected one.~~

We find a third point with a maximal value of the products of the distances from the previous points. This will always be on the contour and on the outer side of a crescent or the bigger side of a gibbous.

The diameter of the circumcircle is computed as the length of one side divided by the sinus of the opposite angle.

The time-complexity of this method is linear in the size of the input image.

Code

f=.3 :0
load'graphics/png'
i=.readpng y
p=.(,i=_1)#|:,"%.0 1|:,"0/&>/<@i."*$i
s=.%:+/|:*:(-1|.]) (([,],:m@(*&d))(m@d))(m=.p{~(i.>./)@])(d=.+/@:*:@((|:p)-])) 0{p
o.*:-:({.s)%0 o.((+/-2*{.)*:s)%2**/}.s
)

The function expects the input filename as a string.

(For a (little) more readable version check revision history.)

Code explanation

p is a list of white pixel coordinates (called points in the future)
function d computes distances between elements of p and a given point
the second part of the definition of s creates a 3-point list:
- A is the furthest point from the first point in the list
- B is the furthest point from A
- C is a point with a maximal value of distance form A times distance from B
s is the side-lengths of triangle ABC
the last line computes the area of the circumcircle of ABC which is the full moon

Results

The largest error is 1.9%.

Images are in the same order as in the question.

Output  Accuracy
----------------
  16407 0.999453 NB. Large images
16375.3 0.997523
16223.9 0.988301
16241.5 0.989369
16262.6 0.990654
16322.1 0.994279
10235.3 0.999445 NB. Small images
10235.3 0.999444
10221.2 0.998067
10220.3 0.997978
  10212 0.997169
10229.6  0.99889
9960.42 0.997239 NB. Offset images
9872.22 0.988408
10161.8   1.0174
9874.95 0.988681
 9805.9 0.981768

randomra

Posted 2015-03-12T23:13:43.693

Reputation: 19 909

+1 for participating and mentioning the approach. I'm sorry I didn't specify that the center needn't be in the middle. Accidentally the sample images are all centered. That's my fault. – Thomas Weller – 2015-03-13T17:00:36.557

@ThomasW. Temporally deleted my answer until I correct it. – randomra – 2015-03-13T19:08:00.530

Matlab 162 156 (not quite in the current error margin)

First of all: The accuracy is under 2% for all but one image in each of the two series, where it is greater (about 5% and 14%). My approach was finding the two pixels of the moon that are the furthest away from each other, and then using this as an estimate for the diameter.

a=imread(input(''));                 %read input image
b=a(:,:,1)>0;                        %binarize red channel
s=size(b);                           %get size of the image
[x,y]=meshgrid(1:s(1),1:s(2));       
z=(x+i*y).*b;z=z(z~=0);              %find the coordinates of all white pixels (as a list)
o=ones(size(z(:)))*z(:)';            
disp(max(max(abs(o-o.').^2))*pi/4);  %calculate the maximum of the distance between each possible pair and evaluate area formula

These are the accuracy results (reltative deviation 1 - (predicted size / real size))

0.0006 0.0025 0.0169 0.0500 0.0521 0.0113 0.0006 0.0006 0.0026 0.0472 0.1383 0.0131

flawr

Posted 2015-03-12T23:13:43.693

Reputation: 40 560

C# - 617

This solution does not work for all images, because on one of the images, the slope (m) becomes infinity.

The principle was mentioned before:

Find two points with maximum distance (red)
Imagine a line between them (red)
Imagine a line with rectangular angle in the middle (green)
Find white points on the green line
Use the one with maximum distance from other points (green)
Calculate the area of a circle from three points

Explanation

The problematic case is this one, where the slope is infinity. It is possible to workaround by rotating the image 90° or in code, loop over the y axis instead of x.

Problematic moon

double A(Bitmap b){var a=new List<P>();for(var y=0;y<b.Height;y++)for(var x=0;x<b.Width;x++)if(b.GetPixel(x,y).R>0)a.Add(new P{x=x,y=y});double c=0.0,d=0.0,e=0.0,f=0.0,g=0.0,n=double.MaxValue;foreach(var h in a)foreach(var i in a){var t=Math.Sqrt(Math.Pow(h.x-i.x,2)+Math.Pow(h.y-i.y,2));if(t>c){d=h.x;f=i.x;e=h.y;g=i.y;c=t;}}c=(f-d)/(e-g);for(int x=0;x<b.Width;x++){int y=(int)(c*x+((e+g)/2-c*(d+f)/2));if(y>=0&&y<b.Height&&b.GetPixel(x,y).R>0){var s=(g-e)/(f-d);var q=(y-g)/(x-f);var j=(s*q*(e-y)+q*(d+f)-s*(f+x))/(2*(q-s));var k=-(j-(d+f)/2)/s+(e+g)/2;var l=(j-d)*(j-d)+(k-e)*(k-e);if(l<n)n=l;}}return Math.PI*n;}

The minimum accuracy is

+1,89% for the 256 pixel images
-0.55% for the 177 pixel images
-1.66% for the non-square images

Thomas Weller

Posted 2015-03-12T23:13:43.693

Reputation: 1 925

Calculate the size of the moon

The size of the moon mystery

The size of the moon science

The task

The samples

The rules

Answers

Mathematica 126 119 109 bytes

J, 227 207 bytes (maximal error 1.9%)

Code

Code explanation

Results

Matlab 162 156 (not quite in the current error margin)

C# - 617