MM algorithm

The MM algorithm is an iterative optimization method which exploits the convexity of a function in order to find their maxima or minima. The MM stands for “Majorize-Minimization” or “Minorize-Maximization”, depending on whether the desired optimization is a maximization or a minimization. MM itself is not an algorithm, but a description of how to construct an optimization algorithm.

The expectation–maximization algorithm can be treated as a special case of the MM algorithm.[1][2] However, in the EM algorithm conditional expectations are usually involved, while in the MM algorithm convexity and inequalities are the main focus, and it is easier to understand and apply in most cases.

History

The historical basis for the MM algorithm can be dated back to at least 1970, when Ortega and Rheinboldt were performing studies related to line search methods.[3] The same concept continued to reappear in different areas in different forms. In 2000, Hunter and Lange put forth "MM" as a general framework.[4] Recent studies have applied the method in a wide range of subject areas, such as mathematics, statistics, machine learning and engineering.

Algorithm

The MM algorithm works by finding a surrogate function that minorizes or majorizes the objective function. Optimizing the surrogate function will drive the objective function upward or downward until a local optimum is reached.

Taking the minorize-maximization version, let $f(\theta )$ be the objective concave function to be maximized. At the $m$ step of the algorithm, $m=0,1...$ , the constructed function $g(\theta |\theta _{m})$ will be called the minorized version of the objective function (the surrogate function) at $\theta _{m}$ if

g(\theta |\theta _{m})\leq f(\theta ){\text{ for all }}\theta

g(\theta _{m}|\theta _{m})=f(\theta _{m})

Then, maximize $g(\theta |\theta _{m})$ instead of $f(\theta )$ , and let

\theta _{m+1}=\arg \max _{\theta }g(\theta |\theta _{m})

The above iterative method will guarantee that $f(\theta _{m})$ will converge to a local optimum or a saddle point as $m$ goes to infinity.[5] By the above construction

f(\theta _{m+1})\geq g(\theta _{m+1}|\theta _{m})\geq g(\theta _{m}|\theta _{m})=f(\theta _{m})

The marching of $\theta _{m}$ and the surrogate functions relative to the objective function is shown in the figure.

MM algorithm

Majorize-Minimization is the same procedure but with a convex objective to be minimised.

Constructing the surrogate function

One can use any inequality to construct the desired majorized/minorized version of the objective function. Typical choices include

Jensen's inequality
Convexity inequality
Cauchy–Schwarz inequality
Inequality of arithmetic and geometric means
Quadratic majorization/mininorization via second order Taylor expansion of twice-differentiable functions with bounded curvature.

gollark: Oh hypermemetic beeoids, what sort of horrible system doesn't even have *nano*?

gollark: Does anyone know some Android developers, by any chance? Because ææææææææææææÆÆÆÆÆÆÆÆÆÆÆÆÆÆÆæææÆÆÆÆÆÆÆÆÆæææÆÆÆÆÆa all is bee.

gollark: ```console=tty0 console=ttyS0,921600n1 vmalloc=400M slub_debug=OFZPU page_owner=on swiotlb=noforce androidboot.hardware=mt6765 maxcpus=8 loop.max_part=7 firmware_class.path=/vendor/firmware has_battery_removed=0 androidboot.boot_devices=bootdevice,11230000.mmc root=/dev/ram androidboot.verifiedbootstate=orange bootopt=64S3,32N2,64N2 androidboot.selinux=permissive buildvariant=eng androidboot.meta_log_disable=0 lpddr_used_index=2 prize_ddr_hardinfo= hall_up_cali_data=-16-16-16 hall_down_cali_data=-16-16-16 printk.disable_uart=1 bootprof.pl_t=4627 bootprof.lk_t=6878 bootprof.logo_t=619 androidboot.serialno=3082SH1001014876 androidboot.bootreason=kernel_panic gpt=1 usb2jtag_mode=0 mrdump_ddrsv=yes mrdump_cb=0x10e800,0x1400 androidboot.dtb_idx=0 androidboot.dtbo_idx=0```The kernel command line is quite something.

gollark: Based on the unhelpfulness of the internet I *may* actually have managed to create an entirely new class of horrible Android issue.

gollark: I also seem to not have blkid or lsblk.

References

Lange, Kenneth. "The MM Algorithm" (PDF).
Kenneth Lange: "MM Optimization Algorithms", SIAM, ISBN 978-1-611974-39-3 (2016).
Ortega, J.M.; Rheinboldt, W.C. (1970). Iterative Solutions of Nonlinear Equations in Several Variables. New York: Academic. pp. 253–255. ISBN 9780898719468.
Hunter, D.R.; Lange, K. (2000). "Quantile Regression via an MM Algorithm". Journal of Computational and Graphical Statistics. 9 (1): 60–77. CiteSeerX 10.1.1.206.1351. doi:10.2307/1390613. JSTOR 1390613.
Wu, C. F. Jeff (1983). "On the Convergence Properties of the EM Algorithm". Annals of Statistics. 11 (1): 95–103. doi:10.1214/aos/1176346060. JSTOR 2240463.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.

[1] Lange, Kenneth. "The MM Algorithm" (PDF).

[2] Kenneth Lange: "MM Optimization Algorithms", SIAM, ISBN 978-1-611974-39-3 (2016).

[3] Ortega, J.M.; Rheinboldt, W.C. (1970). Iterative Solutions of Nonlinear Equations in Several Variables. New York: Academic. pp. 253–255. ISBN 9780898719468.

[4] Hunter, D.R.; Lange, K. (2000). "Quantile Regression via an MM Algorithm". Journal of Computational and Graphical Statistics. 9 (1): 60–77. CiteSeerX 10.1.1.206.1351. doi:10.2307/1390613. JSTOR 1390613.

[5] Wu, C. F. Jeff (1983). "On the Convergence Properties of the EM Algorithm". Annals of Statistics. 11 (1): 95–103. doi:10.1214/aos/1176346060. JSTOR 2240463.