BOD: Blindly Optimal Data Discovery

Published 11 Jan 2024 in cs.DB | (2401.05712v3)

Abstract: Combining data discovery and data augmentation is important for predicting the outcome of tasks. However, requiring the user to supply a utility function in order to discover the small, optimal dataset for a given goal is not an optimal solution. Existing solutions do not make good use of this combination and therefore underutilize the data. In this paper, we introduce a novel goal-oriented framework, called BOD: Blindly Optimal Data Discovery, that keeps humans in the loop and compares utility scores at every query in the process, without knowing the utility function. This establishes the promise of using BOD for modern data science solutions.
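The idea of comparing utility at every query without knowing the utility function can be illustrated with a minimal sketch. This is not the BOD algorithm itself, which the abstract does not specify; it is a plain linear tournament under assumed names. `pick_best`, `make_simulated_user`, the candidate fields `rows` and `accuracy_gain`, and the hidden scoring formula are all hypothetical and exist only so the example runs.

```python
from typing import Callable, Dict, Sequence, Tuple

Candidate = Dict[str, float]


def pick_best(
    candidates: Sequence[Candidate],
    prefers: Callable[[Candidate, Candidate], bool],
) -> Tuple[Candidate, int]:
    """Return the preferred candidate and the number of comparison queries.

    `prefers(a, b)` stands in for the human in the loop: it answers only
    "is a better than b?" and never exposes a numeric utility.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    best = candidates[0]
    queries = 0
    for challenger in candidates[1:]:
        queries += 1  # one comparison query per challenger
        if prefers(challenger, best):
            best = challenger
    return best, queries


def make_simulated_user(
    hidden_utility: Callable[[Candidate], float],
) -> Callable[[Candidate, Candidate], bool]:
    """Wrap a hidden utility so the selection code sees only comparisons."""
    def prefers(a: Candidate, b: Candidate) -> bool:
        return hidden_utility(a) > hidden_utility(b)
    return prefers


# Hypothetical augmentation candidates; the field names are assumptions.
candidates = [
    {"name": "A", "rows": 120, "accuracy_gain": 0.02},
    {"name": "B", "rows": 400, "accuracy_gain": 0.07},
    {"name": "C", "rows": 900, "accuracy_gain": 0.08},
]

# Hypothetical hidden preference: reward accuracy gain, penalize size.
user = make_simulated_user(lambda d: d["accuracy_gain"] - 0.0001 * d["rows"])

best, queries = pick_best(candidates, user)
print(best["name"], queries)
```

The selection loop only ever calls `prefers`, so swapping the simulated user for a real person answering pairwise questions would not change `pick_best`. A linear scan costs one query per candidate after the first; a practical system would presumably aim to ask fewer questions.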
