This algorithm uses an approach like K-nearest neighbors (KNN) to quantify the dissimilarity of an observation compared to points in the same local region of data. At a high level, it compares the local density of an observation to that of its neighbors, and like Isolation Forest, lower values indicate the observation is less likely to be an outlier. It requires that the number of neighbors be provided as a hyperparameter. An advantage of this method is that it is better suited to identify outliers using a local perspective of the data, meaning that if an observation does not appear to be an outlier from a global view, if it is far enough away from the density of its neighbors, this algorithm is more likely to detect it. It depends on the context if it is of interest to detect outliers in this fashion.
What is Local Outlier Factor?
Help us improve this post by suggesting in comments below:
– modifications to the text, and infographics
– video resources that offer clear explanations for this question
– code snippets and case studies relevant to this concept
– online blogs, and research publications that are a “must read” on this topic
Partner Ad