Eclipse: Generalizing kNN and Skyline

Published 14 Jun 2019 in cs.DB | (1906.06314v1)

Abstract: $k$ nearest neighbor ($k$NN) queries and skyline queries are important operators on multi-dimensional data points. Given a query point, a $k$NN query returns the $k$ nearest neighbors based on a scoring function such as a weighted sum of the attributes, which requires predefined attribute weights (or preferences). A skyline query returns all possible nearest neighbors for any monotonic scoring function without requiring attribute weights, but the number of returned points can be prohibitively large. We observe that both $k$NN and skyline are inflexible and cannot be easily customized. In this paper, we propose a novel eclipse operator that generalizes the classic $1$NN and skyline queries and provides a more flexible and customizable query solution for users. In eclipse, users can specify rough and customizable attribute preferences and control the number of returned points. We show that both $1$NN and skyline are instantiations of eclipse. To process eclipse queries, we propose a baseline algorithm with time complexity $O(n^2 2^{d-1})$, and an improved $O(n \log^{d-1} n)$ time transformation-based algorithm, where $n$ is the number of points and $d$ is the number of dimensions. Furthermore, we propose a novel index-based algorithm utilizing duality transform with much better efficiency. The experimental results on the real NBA dataset and the synthetic datasets demonstrate the effectiveness of the eclipse operator and the efficiency of our eclipse algorithms.
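
The relationship between the three operators can be illustrated with a small Python sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not spell out: the query point is the origin, smaller attribute values are better, the score is a weighted sum with the last weight normalized to 1, and the "rough preferences" are given as a range (low, high) for each remaining weight. All function names are hypothetical. Because the score is linear in the weights, comparing two points at the corners of the weight ranges is enough, which is where a pairwise check over $2^{d-1}$ corners, in the spirit of the $O(n^2 2^{d-1})$ baseline, comes from.

```python
from itertools import product


def score(p, weights):
    # Weighted sum of attributes; the last attribute's weight is fixed to 1 (assumption).
    return sum(w * x for w, x in zip(weights, p[:-1])) + p[-1]


def dominates(p, q):
    # Classic skyline dominance: p is no worse everywhere and strictly better somewhere.
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    return [p for p in points if not any(dominates(q, p) for q in points)]


def eclipse_dominates(p, q, ranges):
    # p must score no worse than q at every corner of the weight ranges,
    # and strictly better at one of them (assumed tie-handling).
    strictly_better = False
    for corner in product(*ranges):
        sp, sq = score(p, corner), score(q, corner)
        if sp > sq:
            return False
        if sp < sq:
            strictly_better = True
    return strictly_better


def eclipse(points, ranges):
    # Pairwise check: n^2 pairs, 2^(d-1) corners per pair.
    return [p for p in points if not any(eclipse_dominates(q, p, ranges) for q in points)]


points = [(1, 6), (2, 3), (4, 2), (6, 1), (5, 5)]

print(skyline(points))               # no weights: four candidates
print(eclipse(points, [(0.5, 4)]))   # rough preference: first weight in [0.5, 4]
print(eclipse(points, [(1, 1)]))     # exact weights: behaves like 1NN
```

On this toy data the skyline keeps four points, the weight range [0.5, 4] narrows the result to (1, 6) and (2, 3), and the degenerate range [1, 1] leaves only the single nearest neighbor (2, 3). The sketch deliberately avoids an unbounded weight range, which would need special handling for infinite corners.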

Citations (6)
