Eclipse: Practicability Beyond kNN and Skyline

Published 5 Jul 2017 in cs.DB | (1707.01223v2)

Abstract: The $k$ nearest neighbor ($k$NN) query is a fundamental problem in databases. Given a set of multidimensional data points and a query point, $k$NN returns the $k$ nearest neighbors based on a scoring function such as weighted sum given an attribute weight vector. However, the attribute weight vector can be difficult to specify in practice. Skyline returns the points including all possible nearest neighbors without requiring the exact attribute weight vector or a scoring function but the number of returned points can be prohibitively large for practical use. In this paper, we propose a novel eclipse definition which provides a more flexible and customizable definition than the classic $1$NN and skyline. In eclipse, users can specify a range of attribute weights and control the number of returned points. We show that both $1$NN and skyline are instantiations of eclipse. To compute eclipse points, we propose a baseline algorithm with time complexity of $O(n^2 2^{d-1})$, and an improved $O(n\log^{d-1} n)$ time transformation-based algorithm by transforming the eclipse problem to the skyline problem, where $n$ is the number of points and $d$ is the number of dimensions. Furthermore, we propose a novel index-based algorithm utilizing duality transform with much better efficiency. The experimental results on the real NBA dataset and the synthetic datasets demonstrate the effectiveness and efficiency of our eclipse algorithms.
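
To make the idea concrete, the following is a minimal Python sketch of a pairwise eclipse computation. It is an illustration, not the paper's algorithm. The abstract does not give the formal definition, so the sketch assumes the following: each point holds per-attribute distances to the query point (smaller is better), the score is a weighted sum, the user supplies a range for each weight ratio $w_j / w_d$, and a point is dropped when some other point scores no worse for every weight vector in the range and strictly better for at least one. Because the score difference is linear in the weights, it is enough to check the $2^{d-1}$ corners of the ratio box, which is consistent with an $O(n^2 2^{d-1})$ pairwise cost. The names `eclipse`, `weighted_sum`, and `ratio_ranges` are hypothetical.

```python
from itertools import product


def weighted_sum(point, weights):
    """Weighted-sum score of a point; lower means closer to the query."""
    return sum(w * x for w, x in zip(weights, point))


def eclipse(points, ratio_ranges):
    """Return the points not dominated under any weight vector in the range.

    points       : list of d-tuples of per-attribute distances (assumption).
    ratio_ranges : list of d-1 pairs (lo, hi) bounding w[j] / w[d-1]
                   (hypothetical parameterization; the last weight is fixed to 1).
    """
    # The 2^(d-1) corner weight vectors of the ratio box.
    corners = [ratios + (1.0,) for ratios in product(*ratio_ranges)]

    def dominates(p, q):
        sp = [weighted_sum(p, w) for w in corners]
        sq = [weighted_sum(q, w) for w in corners]
        no_worse = all(a <= b for a, b in zip(sp, sq))
        strictly_better = any(a < b for a, b in zip(sp, sq))
        return no_worse and strictly_better

    # Pairwise check: n^2 comparisons, each over 2^(d-1) corners.
    return [q for q in points
            if not any(dominates(p, q) for p in points if p is not q)]


pts = [(1, 5), (4, 4), (6, 1), (5, 5)]

# A single weight ratio (lo == hi) reduces to 1NN under that weighting.
nearest = eclipse(pts, [(1.0, 1.0)])

# A moderate range returns a set between 1NN and skyline.
middle = eclipse(pts, [(0.5, 2.0)])

# A very wide range approximates the skyline (assumption: 1e9 stands in
# for an unbounded upper limit).
wide = eclipse(pts, [(0.0, 1e9)])
```

On this toy input the returned set grows as the ratio range widens, which is the control over result size that the abstract describes.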
