Spatial outlier detection approaches identify outliers by first defining a spatial neighborhood. However, existing approaches suffer from two issues: (1) they primarily consider autocorrelation alone in forming the neighborhood and ignore heterogeneity among spatial objects; and (2) they treat attributes independently, whether working with a single attribute or with several, rather than considering the interrelationships among attributes when measuring how distinct an object is from its neighbors. As a result, they may miss truly unusual spatial objects and may also report spurious outliers. In this paper, we revisit the computation of the spatial neighborhood and propose an approach that addresses both issues. We first identify a spatially related neighborhood, capturing autocorrelation. We then consider interrelationships between attributes, together with multiple, multilevel distributions within these attributes, thus accounting for both autocorrelation and heterogeneity in various forms. Finally, we identify outliers within these neighborhoods. Our experimental results on several datasets (North Carolina SIDS data, New Mexico Leukemia data, etc.) indicate that our approach correctly identifies outliers in heterogeneous neighborhoods.
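To make the second issue concrete, the following is a minimal sketch of scoring an object against its spatial neighbors while taking attribute interrelationships into account. It is not the method proposed in the paper: the abstract does not specify the neighborhood construction or the scoring function, so this sketch assumes a plain k-nearest-neighbor neighborhood and uses the Mahalanobis distance as a stand-in, and it does not model heterogeneity at all. The names `k_nearest`, `solve`, and `local_mahalanobis`, and the parameters `k` and `ridge`, are hypothetical.

```python
import math

def k_nearest(coords, i, k):
    """Indices of the k objects spatially closest to object i (excluding i)."""
    others = [j for j in range(len(coords)) if j != i]
    others.sort(key=lambda j: math.dist(coords[i], coords[j]))
    return others[:k]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def local_mahalanobis(coords, attrs, k=4, ridge=1e-6):
    """Mahalanobis distance of each object's attribute vector from the
    attribute distribution of its k nearest spatial neighbors.

    Assumption (not from the paper): a k-NN neighborhood with k >= 2 and a
    small ridge term added to the diagonal so the local covariance matrix
    is always invertible.
    """
    d = len(attrs[0])
    scores = []
    for i in range(len(coords)):
        local = [attrs[j] for j in k_nearest(coords, i, k)]
        mean = [sum(row[a] for row in local) / k for a in range(d)]
        # Sample covariance of the neighbors' attributes, plus the ridge term.
        cov = [[sum((row[a] - mean[a]) * (row[b] - mean[b]) for row in local) / (k - 1)
                + (ridge if a == b else 0.0)
                for b in range(d)] for a in range(d)]
        delta = [attrs[i][a] - mean[a] for a in range(d)]
        q = sum(x * y for x, y in zip(delta, solve(cov, delta)))
        scores.append(math.sqrt(max(q, 0.0)))  # guard against tiny negative rounding
    return scores
```

Because the score uses the full local covariance matrix, an object whose individual attribute values look unremarkable but whose combination breaks the relationship seen among its neighbors receives a high score, which a per-attribute comparison would not capture.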
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Vision and Pattern Recognition
- Artificial Intelligence