Orca
Orca: A Program for Mining Distance-Based Outliers. Orca is a program for mining outliers in large multivariate data sets. An outlier is an example that is substantially different from the examples in the remainder of the data. An outlier may have values for an attribute that are unusually large or small, or it may have an unusual combination of values that are rarely seen together. Orca mines distance-based outliers: it uses the distance from a given example to its nearest neighbors to determine how unusual that example is. The intuition is that if other examples lie close to the candidate in the feature space, then the candidate is probably not an outlier. If the nearest examples are substantially different, then the candidate is likely to be an outlier. Probabilistically, one can view distance-based outlier detection as identifying candidates that lie at points where the nearest-neighbor density estimate is small.
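The nearest-neighbor intuition above can be illustrated with a minimal Python sketch. This is not Orca's code or its algorithm: it is a brute-force, quadratic-time illustration. It assumes Euclidean distance and the average distance to the k nearest neighbors as the outlier score. The function names `knn_outlier_scores` and `top_outliers` and the sample data are hypothetical.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by its average distance to its k nearest neighbors.

    Assumption: Euclidean distance and an average-of-k score; other
    distance-based scores (e.g. distance to the k-th neighbor) are possible.
    """
    if not 0 < k < len(points):
        raise ValueError("k must satisfy 0 < k < number of points")
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def top_outliers(points, k=2, n=1):
    """Return the indices of the n points with the largest scores."""
    scores = knn_outlier_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Four tightly clustered points and one point far from the cluster.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(top_outliers(data, k=2, n=1))
```

In this toy data set the isolated point at index 4 receives the largest score, because its nearest neighbors are far away, while each clustered point has two neighbors at distance 0.1.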
References in zbMATH (referenced in 32 articles)
