
Subsequence Matching and Analysis Problems for Formal Languages (2410.07992v1)

Published 10 Oct 2024 in cs.FL and cs.DS

Abstract: In this paper, we study a series of algorithmic problems related to the subsequences occurring in the strings of a given language, under the assumption that this language is succinctly represented by a grammar generating it, or an automaton accepting it. In particular, we focus on the following problems: Given a string $w$ and a language $L$, does there exist a word of $L$ which has $w$ as subsequence? Do all words of $L$ have $w$ as a subsequence? Given an integer $k$ alongside $L$, does there exist a word of $L$ which has all strings of length $k$, over the alphabet of $L$, as subsequences? Do all words of $L$ have all strings of length $k$ as subsequences? For the last two problems, efficient algorithms were already presented in [Adamson et al., ISAAC 2023] for the case when $L$ is a regular language, and efficient solutions can be easily obtained for the first two problems. We extend that work as follows: we give sufficient conditions on the class of input-languages, under which these problems are decidable; we provide efficient algorithms for all these problems in the case when the input language is context-free; we show that all problems are undecidable for context-sensitive languages. Finally, we provide a series of initial results related to a class of languages that strictly includes the regular languages and is strictly included in the class of context-sensitive languages, but is incomparable to the class of context-free languages; these results deviate significantly from those reported for language-classes from the Chomsky hierarchy.

Summary

  • The paper establishes sufficient decidability conditions for subsequence matching across different formal language classes.
  • It develops efficient algorithms for context-free languages to tackle existence and universal subsequence queries.
  • The study shows that all four subsequence problems are undecidable for context-sensitive languages.

The paper "Subsequence Matching and Analysis Problems for Formal Languages" explores algorithmic problems concerning subsequences in strings derived from formal languages. The paper assumes these languages are succinctly represented by grammars that generate them or automata that accept them.

Key Problems Explored

  1. Existence Problem: Given a string $w$ and a language $L$, the paper investigates whether there exists a word in $L$ that contains $w$ as a subsequence.
  2. Universal Subsequence Problem: The paper examines whether all words in $L$ contain $w$ as a subsequence.
  3. $k$-Universal Problem: Given an integer $k$ and a language $L$, it explores whether there exists a word in $L$ that contains all strings of length $k$, over $L$'s alphabet, as subsequences.
  4. Universal $k$-Subsequence Problem: It assesses whether all words in $L$ have all strings of length $k$ as subsequences.
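For intuition, the first two problems can be sketched for the regular case, where the language is given by a finite automaton. This is a minimal illustration and not the paper's own construction: it assumes an NFA without epsilon transitions, encoded as a hypothetical dictionary `transitions` mapping `(state, letter)` to a set of successor states. The idea is to explore pairs `(state, i)`, where `i` is the length of the prefix of the pattern matched greedily so far.

```python
from collections import deque

def reachable_pairs(transitions, start, w):
    """Explore pairs (state, i), where i is the length of the longest
    prefix of w matched greedily by the letters read so far."""
    outgoing = {}
    for (state, letter), targets in transitions.items():
        outgoing.setdefault(state, []).append((letter, targets))
    seen = {(start, 0)}
    queue = deque(seen)
    while queue:
        state, i = queue.popleft()
        for letter, targets in outgoing.get(state, []):
            # Greedy matching: advance in w whenever the next letter fits.
            ni = i + 1 if i < len(w) and w[i] == letter else i
            for target in targets:
                if (target, ni) not in seen:
                    seen.add((target, ni))
                    queue.append((target, ni))
    return seen

def some_word_contains(transitions, start, accepting, w):
    """Problem 1: does some accepted word have w as a subsequence?"""
    pairs = reachable_pairs(transitions, start, w)
    return any(q in accepting and i == len(w) for q, i in pairs)

def every_word_contains(transitions, start, accepting, w):
    """Problem 2: do all accepted words have w as a subsequence?
    (Vacuously true when the automaton accepts nothing.)"""
    pairs = reachable_pairs(transitions, start, w)
    return all(i == len(w) for q, i in pairs if q in accepting)

# Illustrative automaton for the language a b* a (an assumption of this sketch).
example_nfa = {(0, "a"): {1}, (1, "b"): {1}, (1, "a"): {2}}
example_accepting = {2}
```

Greedy matching suffices because a word contains the pattern as a subsequence exactly when the leftmost embedding succeeds, so the index `i` is determined by the word read and not by the run chosen. The search visits at most (number of states) × (|w| + 1) pairs.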

Key Contributions

  • Decidability Conditions: The paper presents sufficient conditions for the class of input languages under which these subsequence problems are decidable.
  • Efficient Algorithms for Context-Free Languages: The authors provide efficient algorithms specifically for context-free languages across all the problems studied.
  • Undecidability for Context-Sensitive Languages: The research demonstrates that all the problems become undecidable when applied to context-sensitive languages.
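To illustrate why the existence problem is tractable for context-free languages, the following sketch decides whether some word generated by a grammar contains a given string as a subsequence. It is an assumption-laden illustration and not the algorithm from the paper: the grammar is assumed to be in Chomsky normal form, encoded with the hypothetical lists `terminal_rules` (pairs `(A, a)` for rules A → a) and `binary_rules` (triples `(A, B, C)` for rules A → BC). It computes, as a least fixed point, the triples `(A, i, j)` such that nonterminal `A` derives a word containing `w[i:j]` as a subsequence.

```python
def productive_nonterminals(terminal_rules, binary_rules):
    """Nonterminals that derive at least one terminal word."""
    productive = {head for head, _ in terminal_rules}
    changed = True
    while changed:
        changed = False
        for head, left, right in binary_rules:
            if head not in productive and left in productive and right in productive:
                productive.add(head)
                changed = True
    return productive

def cfg_has_word_with_subsequence(terminal_rules, binary_rules, start, w):
    """Decide whether some word derived from `start` has w as a subsequence."""
    productive = productive_nonterminals(terminal_rules, binary_rules)
    if start not in productive:
        return False  # the language is empty
    n = len(w)
    # (A, i, j) in can: A derives a word containing w[i:j] as a subsequence.
    can = {(A, i, i) for A in productive for i in range(n + 1)}
    for head, letter in terminal_rules:
        for i in range(n):
            if w[i] == letter:
                can.add((head, i, i + 1))
    changed = True
    while changed:  # iterate to a fixed point; dependencies may be cyclic
        changed = False
        for head, left, right in binary_rules:
            for i in range(n + 1):
                for j in range(i, n + 1):
                    if (head, i, j) in can:
                        continue
                    # Split w[i:j] between the two children of the rule.
                    if any((left, i, m) in can and (right, m, j) in can
                           for m in range(i, j + 1)):
                        can.add((head, i, j))
                        changed = True
    return (start, 0, n) in can

# Illustrative CNF grammar for { a^n b^n : n >= 1 } (an assumption of this sketch):
# S -> A B | A T,  T -> S B,  A -> a,  B -> b
example_terminal_rules = [("A", "a"), ("B", "b")]
example_binary_rules = [("S", "A", "B"), ("S", "A", "T"), ("T", "S", "B")]
```

The table has at most (number of nonterminals) × (|w| + 1)² entries and each pass either adds an entry or stops, so the procedure runs in polynomial time. A tuned implementation would use a worklist instead of repeated full passes.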

Additional Insights

The paper extends previous work by Adamson et al. (ISAAC 2023), which gave efficient algorithms for the two $k$-universality problems when $L$ is a regular language. The present work addresses all four problems for broader language classes.
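The notion of $k$-universality is easiest to see on a single word. A standard tool here is the arch factorization: scan the word from left to right and close an "arch" each time every letter of the alphabet has been seen since the previous arch ended. The word contains all strings of length $k$ as subsequences exactly when it has at least $k$ arches. The sketch below checks one word only and is not the automaton-level algorithm of the cited work; the function names are illustrative.

```python
def universality_index(word, alphabet):
    """Number of arches in the arch factorization of `word`, i.e. the
    largest k such that every length-k string over `alphabet` is a
    subsequence of `word`."""
    alphabet = set(alphabet)
    if not alphabet:
        raise ValueError("alphabet must be non-empty")
    seen = set()
    arches = 0
    for letter in word:
        if letter in alphabet:
            seen.add(letter)
        if len(seen) == len(alphabet):
            # Every letter has occurred since the last arch: close the arch.
            arches += 1
            seen = set()
    return arches

def is_k_universal(word, alphabet, k):
    """True if `word` has all strings of length k over `alphabet` as subsequences."""
    return universality_index(word, alphabet) >= k
```

For example, over the alphabet {a, b} the word `abba` has two arches (`ab` and `ba`) and is 2-universal, while `aabb` has only one arch and misses the subsequence `ba`.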

Finally, the paper gives initial results for a class of languages that strictly includes the regular languages, is strictly included in the class of context-sensitive languages, and is incomparable to the class of context-free languages. These results deviate significantly from those reported for the language classes of the Chomsky hierarchy.