Smooth maximum

In mathematics, a smooth maximum of an indexed family $x_1, \ldots, x_n$ of numbers is a smooth approximation to the maximum function $\max(x_1, \ldots, x_n)$, meaning a parametric family of functions $m_\alpha(x_1, \ldots, x_n)$ such that for every $\alpha$ the function $m_\alpha$ is smooth, and the family converges to the maximum function as $\alpha \to \infty$. The concept of smooth minimum is similarly defined. In many cases, a single family approximates both: maximum as the parameter goes to positive infinity, minimum as the parameter goes to negative infinity; in symbols, $m_\alpha \to \max$ as $\alpha \to \infty$ and $m_\alpha \to \min$ as $\alpha \to -\infty$. The term can also be used loosely for a specific smooth function that behaves similarly to a maximum, without necessarily being part of a parametrized family.

Examples

Figure: smoothmax applied to $-x$ and $x$ for various values of the parameter $\alpha$; the result is very smooth for $\alpha = 0.5$ and sharper for $\alpha = 8$.

For large positive values of the parameter $\alpha$, the following formulation is a smooth, differentiable approximation of the maximum function. For negative values of the parameter that are large in absolute value, it approximates the minimum.

    $\mathcal{S}_\alpha(x_1, \ldots, x_n) = \frac{\sum_{i=1}^n x_i e^{\alpha x_i}}{\sum_{i=1}^n e^{\alpha x_i}}$

$\mathcal{S}_\alpha$ has the following properties:

  1. $\mathcal{S}_\alpha \to \max$ as $\alpha \to \infty$
  2. $\mathcal{S}_0$ is the arithmetic mean of its inputs
  3. $\mathcal{S}_\alpha \to \min$ as $\alpha \to -\infty$

The gradient of $\mathcal{S}_\alpha$ is closely related to softmax and is given by

    $\nabla_{x_i} \mathcal{S}_\alpha(x_1, \ldots, x_n) = \frac{e^{\alpha x_i}}{\sum_{j=1}^n e^{\alpha x_j}} \bigl[ 1 + \alpha \bigl( x_i - \mathcal{S}_\alpha(x_1, \ldots, x_n) \bigr) \bigr].$

Because its gradient is smooth and expressed in terms of softmax weights, the smooth maximum is well suited to optimization techniques that use gradient descent.
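
As a concrete illustration (not part of the original article), the following Python/NumPy sketch evaluates this smooth maximum and its gradient. The names smooth_max and smooth_max_grad are placeholders chosen for this example, and the subtraction of the true maximum is only a standard numerical-stability trick.

    import numpy as np

    def smooth_max(x, alpha):
        # S_alpha(x) = sum_i x_i * exp(alpha * x_i) / sum_i exp(alpha * x_i)
        x = np.asarray(x, dtype=float)
        # Shifting by max(x) before exponentiating avoids overflow;
        # the shift cancels in the ratio, so the value is unchanged.
        w = np.exp(alpha * (x - x.max()))
        return np.sum(x * w) / np.sum(w)

    def smooth_max_grad(x, alpha):
        # dS_alpha/dx_i = softmax(alpha*x)_i * (1 + alpha * (x_i - S_alpha(x)))
        x = np.asarray(x, dtype=float)
        w = np.exp(alpha * (x - x.max()))
        p = w / np.sum(w)          # softmax weights
        s = np.sum(x * p)          # S_alpha(x)
        return p * (1.0 + alpha * (x - s))

    x = [-1.0, 0.5, 2.0]
    print(smooth_max(x, alpha=0.0))        # arithmetic mean: 0.5
    print(smooth_max(x, alpha=20.0))       # close to max(x) = 2.0
    print(smooth_max(x, alpha=-20.0))      # close to min(x) = -1.0
    print(smooth_max_grad(x, alpha=20.0))  # mass concentrates on the largest entry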

LogSumExp

Another smooth maximum is LogSumExp:

    $\mathrm{LSE}_\alpha(x_1, \ldots, x_n) = \frac{1}{\alpha} \log \sum_{i=1}^n e^{\alpha x_i}$

This can also be normalized if the $x_i$ are all non-negative, yielding a function with domain $[0, \infty)^n$ and range $[0, \infty)$:

    $g(x_1, \ldots, x_n) = \log\left( \sum_{i=1}^n e^{x_i} - (n - 1) \right)$

The $(n - 1)$ term corrects for the fact that $e^0 = 1$ by canceling out all but one zero exponential, and $g(x_1, \ldots, x_n) = 0$ if all $x_i$ are zero.
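
A short Python/NumPy sketch of both variants (an illustration added here, with placeholder names log_sum_exp and normalized_lse; the max-shift is again only for numerical stability):

    import numpy as np

    def log_sum_exp(x, alpha=1.0):
        # LSE_alpha(x) = (1/alpha) * log(sum_i exp(alpha * x_i))
        x = np.asarray(x, dtype=float)
        m = x.max()
        # Factor out exp(alpha*m) so the remaining exponentials stay bounded.
        return m + np.log(np.sum(np.exp(alpha * (x - m)))) / alpha

    def normalized_lse(x):
        # log(sum_i exp(x_i) - (n - 1)) for non-negative x_i; zero when all x_i are zero.
        x = np.asarray(x, dtype=float)
        return np.log(np.sum(np.exp(x)) - (len(x) - 1))

    x = [0.0, 1.0, 3.0]
    print(log_sum_exp(x, alpha=1.0))        # about 3.17: overshoots max(x) by at most log(3)
    print(log_sum_exp(x, alpha=10.0))       # about 3.00
    print(normalized_lse([0.0, 0.0, 0.0]))  # exactly 0.0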

p-Norm

Another smooth maximum is the p-norm:

    $\| (x_1, \ldots, x_n) \|_p = \left( \sum_{i=1}^n |x_i|^p \right)^{1/p}$

which converges to $\| (x_1, \ldots, x_n) \|_\infty = \max_i |x_i|$ as $p \to \infty$.

An advantage of the p-norm is that it is a norm. As such it is "scale invariant" (homogeneous): $\| (\lambda x_1, \ldots, \lambda x_n) \|_p = |\lambda| \, \| (x_1, \ldots, x_n) \|_p$, and it satisfies the triangle inequality.
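
The convergence and the homogeneity can be checked numerically. Below is a small Python/NumPy sketch added here for illustration (the name p_norm is a placeholder); note that the p-norm approximates the maximum of the absolute values $|x_i|$, not of the signed values.

    import numpy as np

    def p_norm(x, p):
        # ||x||_p = (sum_i |x_i|^p)^(1/p); tends to max_i |x_i| as p -> infinity.
        x = np.abs(np.asarray(x, dtype=float))
        m = x.max()
        if m == 0.0:
            return 0.0
        # Factor out the largest entry so (x/m)**p stays bounded for large p.
        return m * np.sum((x / m) ** p) ** (1.0 / p)

    x = [1.0, -2.0, 3.0]
    for p in (2, 8, 64):
        print(p, p_norm(x, p))   # decreases toward max_i |x_i| = 3 as p grows
    # Homogeneity: ||2x||_p = 2 * ||x||_p
    print(np.isclose(p_norm([2.0, -4.0, 6.0], 8), 2 * p_norm(x, 8)))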

Use in numerical methods

Other choices of smoothing function


See also

