Canberra distance

The Canberra distance is a numerical measure of the distance between pairs of points in a vector space, introduced in 1966^[1] and refined in 1967^[2] by G. N. Lance and W. T. Williams. It is a weighted version of L₁ (Manhattan) distance.^[3] The Canberra distance has been used as a metric for comparing ranked lists^[3] and for intrusion detection in computer security.^[4]

Definition

The Canberra distance d between vectors p and q in an n-dimensional real vector space is given as follows:

d(\mathbf{p}, \mathbf{q}) = \sum_{i=1}^{n} \frac{|p_i - q_i|}{|p_i| + |q_i|}

where

\mathbf{p} = (p_1, p_2, \dots, p_n) \text{ and } \mathbf{q} = (q_1, q_2, \dots, q_n)
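
As a minimal sketch of the formula above, the following Python function accumulates the sum term by term. The name `canberra_distance` is illustrative only. The formula leaves a term undefined when both coordinates are zero (0/0); treating such a term as contributing zero is an assumption made here, though it is a common convention.

```python
from typing import Sequence


def canberra_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of |p_i - q_i| / (|p_i| + |q_i|) over all coordinates."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        denom = abs(pi) + abs(qi)
        if denom == 0:
            # Assumption: a 0/0 term (both coordinates zero) contributes nothing.
            continue
        total += abs(pi - qi) / denom
    return total


# p = (1, 2, 0), q = (3, 2, 0): 2/4 + 0/4 + (skipped 0/0 term) = 0.5
print(canberra_distance([1, 2, 0], [3, 2, 0]))
```

Each term lies between 0 and 1, and equals 1 whenever exactly one of the two coordinates is zero, which is why the measure is sensitive to small changes near zero.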

The Adkins form of the Canberra metric divides the distance d by (n − Z), where Z is the number of attributes that are zero in both p and q.
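
The Adkins normalization can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the name `canberra_adkins` is hypothetical, Z is read as the number of coordinates where both p and q are zero, and returning 0.0 when every coordinate is zero in both vectors (so that n − Z = 0) is a choice made here to avoid dividing by zero.

```python
from typing import Sequence


def canberra_adkins(p: Sequence[float], q: Sequence[float]) -> float:
    """Canberra distance divided by (n - Z), with Z = count of doubly-zero coordinates."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    both_zero = 0  # Z in the text
    for pi, qi in zip(p, q):
        denom = abs(pi) + abs(qi)
        if denom == 0:
            # Both coordinates are zero: count toward Z and skip the 0/0 term.
            both_zero += 1
            continue
        total += abs(pi - qi) / denom
    remaining = len(p) - both_zero  # n - Z
    if remaining == 0:
        # Assumption: two all-zero vectors are treated as distance 0.
        return 0.0
    return total / remaining


# p = (1, 2, 0), q = (3, 2, 0): d = 0.5, n = 3, Z = 1, so 0.5 / 2 = 0.25
print(canberra_adkins([1, 2, 0], [3, 2, 0]))
```

Because each term of the sum is at most 1, dividing by the number of contributing coordinates keeps the result in the range from 0 to 1.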

[1] Lance, G. N.; Williams, W. T. (1966). "Computer programs for hierarchical polythetic classification ("similarity analysis")". Computer Journal. 9 (1): 60–64. doi:10.1093/comjnl/9.1.60.
[2] Lance, G. N.; Williams, W. T. (1967). "Mixed-data classificatory programs I. Agglomerative Systems". Australian Computer Journal: 15–20.
[3] Jurman, G.; Riccadonna, S.; Visintainer, R.; Furlanello, C. (2009). "Canberra Distance on Ranked Lists". In Agrawal, S.; Burges, C.; Crammer, K. (eds.). Proceedings, Advances in Ranking – NIPS 09 Workshop. pp. 22–27.
[4] Emran, Syed Masum; Ye, Nong (2002). "Robustness of chi-square and Canberra distance metrics for computer intrusion detection". Quality and Reliability Engineering International. 18 (1): 19–28. doi:10.1002/qre.441.
