Taxicab geometry

A taxicab geometry is a form of geometry in which the usual distance function or metric of Euclidean geometry is replaced by a new metric in which the distance between two points is the sum of the absolute differences of their Cartesian coordinates. The taxicab metric is also known as rectilinear distance, L1 distance, L1 distance or norm (see Lp space), snake distance, city block distance, Manhattan distance or Manhattan length, with corresponding variations in the name of the geometry.[1] The latter names allude to the grid layout of most streets on the island of Manhattan, which causes the shortest path a car could take between two intersections in the borough to have length equal to the intersections' distance in taxicab geometry.

Taxicab geometry versus Euclidean distance: In taxicab geometry, the red, yellow, and blue paths all have the same shortest path length of 12. In Euclidean geometry, the green line has length and is the unique shortest path.

The geometry has been used in regression analysis since the 18th century, and today is often referred to as LASSO. The geometric interpretation dates to non-Euclidean geometry of the 19th century and is due to Hermann Minkowski.

Formal definition

The taxicab distance, , between two vectors in an n-dimensional real vector space with fixed Cartesian coordinate system, is the sum of the lengths of the projections of the line segment between the points onto the coordinate axes. More formally,

where are vectors

For example, in the plane, the taxicab distance between and is

Properties

Taxicab distance depends on the rotation of the coordinate system, but does not depend on its reflection about a coordinate axis or its translation. Taxicab geometry satisfies all of Hilbert's axioms (a formalization of Euclidean geometry) except for the side-angle-side axiom, as two triangles with equally "long" two sides and an identical angle between them are typically not congruent unless the mentioned sides happen to be parallel.

Circles

Circles in discrete and continuous taxicab geometry

A circle is a set of points with a fixed distance, called the radius, from a point called the center. In taxicab geometry, distance is determined by a different metric than in Euclidean geometry, and the shape of circles changes as well. Taxicab circles are squares with sides oriented at a 45° angle to the coordinate axes. The image to the right shows why this is true, by showing in red the set of all points with a fixed distance from a center, shown in blue. As the size of the city blocks diminishes, the points become more numerous and become a rotated square in a continuous taxicab geometry. While each side would have length using a Euclidean metric, where r is the circle's radius, its length in taxicab geometry is 2r. Thus, a circle's circumference is 8r. Thus, the value of a geometric analog to is 4 in this geometry. The formula for the unit circle in taxicab geometry is in Cartesian coordinates and

in polar coordinates.

A circle of radius 1 (using this distance) is the von Neumann neighborhood of its center.

A circle of radius r for the Chebyshev distance (L metric) on a plane is also a square with side length 2r parallel to the coordinate axes, so planar Chebyshev distance can be viewed as equivalent by rotation and scaling to planar taxicab distance. However, this equivalence between L1 and L metrics does not generalize to higher dimensions.

Whenever each pair in a collection of these circles has a nonempty intersection, there exists an intersection point for the whole collection; therefore, the Manhattan distance forms an injective metric space.

Applications

Measures of distances in chess

In chess, the distance between squares on the chessboard for rooks is measured in taxicab distance; kings and queens use Chebyshev distance, and bishops use the taxicab distance (between squares of the same color) on the chessboard rotated 45 degrees, i.e., with its diagonals as coordinate axes. To reach from one square to another, only kings require the number of moves equal to their respective distance; rooks, queens and bishops require one or two moves (on an empty board, and assuming that the move is possible at all in the bishop's case).

Compressed sensing

In solving an underdetermined system of linear equations, the regularization term for the parameter vector is expressed in terms of the -norm (taxicab geometry) of the vector.[2] This approach appears in the signal recovery framework called compressed sensing.

Differences of frequency distributions

Taxicab geometry can be used to assess the differences in discrete frequency distributions. For example, in RNA splicing positional distributions of hexamers, which plot the probability of each hexamer appearing at each given nucleotide near a splice site, can be compared with L1-distance. Each position distribution can be represented as a vector where each entry represents the likelihood of the hexamer starting at a certain nucleotide. A large L1-distance between the two vectors indicates a significant difference in the nature of the distributions while a small distance denotes similarly shaped distributions. This is equivalent to measuring the area between the two distribution curves because the area of each segment is the absolute difference between the two curves' likelihoods at that point. When summed together for all segments, it provides the same measure as L1-distance.[3]

History

The L1 metric was used in regression analysis in 1757 by Roger Joseph Boscovich.[4] The geometric interpretation dates to the late 19th century and the development of non-Euclidean geometries, notably by Hermann Minkowski and his Minkowski inequality, of which this geometry is a special case, particularly used in the geometry of numbers, (Minkowski 1910). The formalization of Lp spaces is credited to (Riesz 1910).

gollark: ...
gollark: Okay, so autogen up to is-one-thousand.
gollark: Or make a global Is-Fourteen web API.
gollark: Autogenerate all packages up to is-one-hundred?
gollark: Ridiculous. They should have factored it out into a microservice.

See also

Notes

  1. Black, Paul E. "Manhattan distance". Dictionary of Algorithms and Data Structures. Retrieved October 6, 2019.
  2. Donoho, David L. (March 23, 2006). "For most large underdetermined systems of linear equations the minimal -norm solution is also the sparsest solution". Communications on Pure and Applied Mathematics. 59 (6): 797–829. doi:10.1002/cpa.20132.
  3. Lim, Kian Huat; Ferraris, Luciana; Filloux, Madeleine E.; Raphael, Benjamin J.; Fairbrother, William G. (July 5, 2011). "Using positional distribution to identify splicing elements and predict pre-mRNA processing defects in human genes". Proceedings of the National Academy of Sciences of the United States of America. 108 (27): 11093–11098. Bibcode:2011PNAS..10811093H. doi:10.1073/pnas.1101135108. PMC 3131313. PMID 21685335.
  4. Stigler, Stephen M. (1986). The History of Statistics: The Measurement of Uncertainty before 1900. Harvard University Press. ISBN 9780674403406. Retrieved October 6, 2019.

References

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.