
Using Positional Sequence Patterns to Estimate the Selectivity of SQL LIKE Queries (2002.01164v1)

Published 4 Feb 2020 in cs.DB and cs.DS

Abstract: With the dramatic increase in the amount of text-based data, which commonly contains misspellings and other errors, querying such data with flexible search patterns is becoming increasingly commonplace. Relational databases support the LIKE operator to allow searching with a wildcard predicate (e.g., LIKE 'Sub%', which matches all strings starting with 'Sub'). Because text data is large, executing such queries efficiently is critical for database performance. To build an efficient execution plan for a LIKE query, the query optimizer requires a selectivity estimate for the pattern-based query predicate. The recently proposed SPH algorithm employs a sequence pattern-based histogram structure to estimate the selectivity of LIKE queries. A drawback of the SPH approach is that it often overestimates the selectivity of queries. To alleviate the overestimation problem, in this paper we propose a novel sequence pattern type, called positional sequence patterns. The proposed patterns distinguish sequence item pairs that appear next to each other in all pattern occurrences from those that may have other items between them. In addition, we employ redundant pattern elimination based on pattern information content during histogram construction. Finally, we propose a partitioning-based matching scheme for selectivity estimation. Experimental results on a real dataset from DBLP show that the proposed approach outperforms the state of the art, improving error rates by around 20%.
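To make the quantity being estimated concrete, the following minimal Python sketch computes the exact selectivity of a LIKE predicate by brute force, that is, the fraction of rows the pattern matches. It is not the SPH algorithm or the positional-pattern estimator described in the abstract; it only shows the ground-truth value that such estimators approximate. The helper names (`like_to_regex`, `exact_selectivity`) and the sample titles are hypothetical, and the sketch assumes case-sensitive matching with only the standard `%` and `_` wildcards (no ESCAPE clause).

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL LIKE pattern into a compiled regular expression.

    Assumption: '%' matches any run of characters (including none) and
    '_' matches exactly one character; every other character is literal.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def exact_selectivity(pattern: str, rows: list[str]) -> float:
    """Return the fraction of rows matched by the LIKE pattern."""
    if not rows:
        return 0.0
    rx = like_to_regex(pattern)
    # LIKE must match the whole string, hence fullmatch rather than search.
    matches = sum(1 for row in rows if rx.fullmatch(row))
    return matches / len(rows)


# Hypothetical sample data, for illustration only.
titles = [
    "Subsequence mining",
    "Substring search",
    "Query optimization",
    "Histograms",
]

print(exact_selectivity("Sub%", titles))   # 0.5
print(exact_selectivity("%i%g", titles))   # 0.25
```

A full scan like this is exactly what an optimizer cannot afford at planning time, which is why histogram-based estimators are used instead. An estimator that overestimates would report a value above these exact fractions.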

Citations (1)
