Papers
Topics
Authors
Recent
Search
2000 character limit reached

Mining Area Skyline Objects from Map-based Big Data using Apache Spark Framework

Published 4 Apr 2024 in cs.DC | (2404.03254v1)

Abstract: The computation of the skyline provides a mechanism for utilizing multiple location-based criteria to identify optimal data points. However, the efficiency of these computations diminishes and becomes more challenging as the input data expands. This study presents a novel algorithm aimed at mitigating this challenge by harnessing the capabilities of Apache Spark, a distributed processing platform, for conducting area skyline computations. The proposed algorithm enhances processing speed and scalability. In particular, our algorithm encompasses three key phases: the computation of distances between data points, the generation of distance tuples, and the execution of the skyline operators. Notably, the second phase employs a local partial skyline extraction technique to minimize the volume of data transmitted from each executor (a parallel processing procedure) to the driver (a central processing procedure). Afterwards, the driver processes the received data to determine the final skyline and creates filters to exclude irrelevant points. Extensive experimentation on eight datasets reveals that our algorithm significantly reduces both data size and computation time required for area skyline computation.

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Collections

Sign up for free to add this paper to one or more collections.