
K-Dominant Skyline Join Queries: Extending the Join Paradigm to K-Dominant Skylines (1702.03390v1)

Published 11 Feb 2017 in cs.DB

Abstract: Skyline queries enable multi-criteria optimization by filtering out objects that are worse in all the attributes of interest than another object. To handle the large answer set of skyline queries in high-dimensional datasets, the concept of k-dominance was proposed, where an object is said to dominate another object if it is better (or equal) in at least k attributes. This relaxes the full domination criterion of normal skyline queries and therefore produces fewer skyline objects. The result is called the k-dominant skyline set. Many practical applications, however, require that the preferences be applied on a joined relation. Common examples include flights having one or multiple stops, a combination of product price and shipping costs, etc. In this paper, we extend k-dominant skyline queries to the join paradigm by enabling such queries to be asked on joined relations. We call such queries KSJQ (k-dominant skyline join queries). The k attributes in which an object must dominate are drawn from the combined set of skyline attributes of the joined relation. We show how pre-processing the base relations helps in reducing the time of answering such queries over the naive method of joining the relations first and then running the k-dominant skyline computation. We also extend the query to handle cases where the skyline preference is on aggregated values in the joined relation (such as the total cost of the multiple legs of a flight), which are available only after the join is performed. In addition to these problems, we devise efficient algorithms to choose the value of k based on the desired cardinality of the final skyline set. Experiments on both real and synthetic datasets demonstrate the efficiency, scalability and practicality of our algorithms.
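
To make the k-dominance definition and the naive baseline mentioned in the abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm: it only illustrates the "join first, then compute the k-dominant skyline" approach that the paper aims to improve on. The function names (`k_dominates`, `k_dominant_skyline`, `naive_ksjq`), the tuple layout, the equality join on a single key, and the convention that smaller values are better are all assumptions made for illustration. The check also requires strict improvement in at least one attribute, which is the usual convention in the k-dominance literature and is assumed here rather than taken from the abstract.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_dominates(p: Sequence[float], q: Sequence[float], k: int) -> bool:
    """Return True if p k-dominates q (assumption: smaller is better).

    p must be better or equal in at least k attributes and strictly
    better in at least one attribute.
    """
    better_or_equal = sum(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return better_or_equal >= k and strictly_better


def k_dominant_skyline(points: List[Point], k: int) -> List[Point]:
    """Quadratic baseline: keep points not k-dominated by any other point.

    Every point is compared against all others, because k-dominance can be
    cyclic and a dominated point may still dominate someone else.
    """
    return [
        p
        for i, p in enumerate(points)
        if not any(k_dominates(q, p, k) for j, q in enumerate(points) if j != i)
    ]


def naive_ksjq(
    left: List[Tuple[str, Point]], right: List[Tuple[str, Point]], k: int
) -> List[Point]:
    """Naive baseline: equality-join on the key, concatenate the skyline
    attributes of both sides, then compute the k-dominant skyline."""
    joined = [la + ra for lk, la in left for rk, ra in right if lk == rk]
    return k_dominant_skyline(joined, k)


if __name__ == "__main__":
    # Hypothetical flight legs: (stopover city, (cost, duration)).
    first_leg = [("B", (100, 2)), ("B", (200, 1))]
    second_leg = [("B", (50, 3)), ("B", (300, 4))]
    print(naive_ksjq(first_leg, second_leg, k=4))
```

With k equal to the total number of joined attributes, the query reduces to an ordinary skyline over the join. Lowering k makes domination easier, so the result shrinks and can even become empty when objects k-dominate each other in a cycle.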

Citations (7)
