Computing the geometric median

7

0

Given a list of integer points in an n-dimensional space, output the geometric median. Note that the output is not necessarily integer. Your answer must be precise to at least two digits after the decimal point.

You may assume that n < 16 and that every component of each point fits in a 32-bit signed integer.

You may write a program or a function. You may take input and give output in any reasonable format. You may optionally pass n as a separate argument, instead of having to deduce it.

You must implement an algorithm to compute the geometric median, and not brute force over the output domain.

Examples:

(5), (14), (10), (7)                         -> any 7 <= x <= 10 is valid
(2, 3), (8, 4), (12, -3), (0, 0), (0, 0)     -> (2.17, 1.29)
(0, 0, 3), (20, 7, 3), (13, 14, 15)          -> (13.18, 7.94, 7.23)
(2, 20, 3, 30), (20, 2, 20, 2), (1, 2, 3, 4) -> (5.46, 5.12, 6.84, 8.06)

The answer is only unique if the set of points do not lie on a line. If the answer is not unique, you may return any valid answer.

orlp

Posted 2015-08-23T14:32:09.013

Reputation: 37 067

Any requirement on complexity? Double-loop from min to max with step 0.01 will probably do the job. Is this allowed? (not sure whether it's less or more fun) – anatolyg – 2015-08-23T14:47:31.520

@anatolyg You must not brute force over the output domain. (Previously I stated that your solution mustn't rely on the input domain, but this unfairly rules out some probabilistic / hill climbing algorithms). – orlp – 2015-08-23T14:55:04.270

An accuracy of two decimal points isn't meaningful without some bound on the input coordinates. Perhaps when you say "integers" you mean 32-bit signed integers? Also, how do you expect us to confirm that our iterative algorithm is sufficiently accurate? Is it OK to just check this on the test points? I haven't found a proof of a formal quantitative bound of the Weiszfeld's algorithm in the worst case. – xnor – 2015-08-23T18:19:38.110

@xnor 32-bit signed integers as input bound is fine. Similarly, let's say that n < 16. Sadly I do not know the answer to your second question. I do not consider just these 4 test cases enough, though. If someone finds another counterexample to your answer you must fix it. – orlp – 2015-08-23T18:33:35.950

@xnor A proven algorithm does exist however: http://www.pnas.org/content/97/4/1423.full.pdf. You can see my implementation (without weights, just like the challenge) here: http://stackoverflow.com/a/30305181/565635.

– orlp – 2015-08-23T18:39:01.097

@orlp I'm not seeing a rate of convergence in that paper though, just a claim of the correct limit. – xnor – 2015-08-23T18:42:31.297

Answers

1

Matlab, 77 75

I just use the definition of the geometric median with a numerical minimization. In this case I make use of the Nelder-Mead simplex algorithm. It sould always converge as we try to find the minimum of a convex function (according to Wikipedia).

n=@(m)trace((m'*m).^.5);f=@(m)fminsearch(@(y)n(bsxfun(@minus,m,y)),m(:,1))

Old version: n=@(m)sum(diag(m'*m).^.5);f=@(m)fminsearch(@(y)n(bsxfun(@minus,m,y)),m(:,1))

The input format is a matix with the input vectors as columns, e.g:

m = [[2; 20; 3; 30],[20; 2; 20; 2],[1; 2; 3; 4]];

flawr

Posted 2015-08-23T14:32:09.013

Reputation: 40 560