Extremal Fitting CQs do not Generalize (2312.03407v1)
Abstract: A fitting algorithm for conjunctive queries (CQs) produces, given a set of positively and negatively labeled data examples, a CQ that fits these examples. In general, there may be many non-equivalent fitting CQs and thus the algorithm has some freedom in producing its output. Additional desirable properties of the produced CQ are that it generalizes well to unseen examples in the sense of PAC learning and that it is most general or most specific in the set of all fitting CQs. In this research note, we show that these desiderata are incompatible when we require PAC-style generalization from a polynomial sample: we prove that any fitting algorithm that produces a most-specific fitting CQ cannot be a sample-efficient PAC learning algorithm, and the same is true for fitting algorithms that produce a most-general fitting CQ (when it exists). Our proofs rely on a polynomial construction of relativized homomorphism dualities for path-shaped structures.
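To make the notion of "fitting" concrete, the following is a minimal sketch, not the paper's construction. It assumes Boolean CQs, with each data example given as a finite set of facts, and uses the standard definition that a CQ fits when it maps homomorphically into every positive example and into no negative example. The names `has_homomorphism` and `fits`, and the tuple encoding of atoms and facts, are illustrative assumptions. The search is brute force and exponential in the number of variables.

```python
from itertools import product

def has_homomorphism(query_atoms, instance):
    """Return True if the Boolean CQ has a homomorphism into the instance.

    query_atoms: list of (relation, (var, ...)) pairs (hypothetical encoding).
    instance:    set of (relation, (const, ...)) facts (hypothetical encoding).
    """
    variables = list({v for _, args in query_atoms for v in args})
    domain = list({c for _, args in instance for c in args})
    # Try every assignment of domain elements to query variables.
    for values in product(domain, repeat=len(variables)):
        h = dict(zip(variables, values))
        if all((rel, tuple(h[v] for v in args)) in instance
               for rel, args in query_atoms):
            return True
    return False

def fits(query_atoms, positives, negatives):
    """A CQ fits if it holds in all positive and in no negative examples."""
    return (all(has_homomorphism(query_atoms, e) for e in positives)
            and not any(has_homomorphism(query_atoms, e) for e in negatives))

# Path-shaped query: exists x, y, z with R(x, y) and R(y, z).
path2 = [("R", ("x", "y")), ("R", ("y", "z"))]
# Single-edge query: exists x, y with R(x, y).
edge = [("R", ("x", "y"))]

positive = {("R", ("a", "b")), ("R", ("b", "c"))}  # contains a path of length 2
negative = {("R", ("a", "b"))}                     # a single edge only

path2_fits = fits(path2, [positive], [negative])
edge_fits = fits(edge, [positive], [negative])
```

In this toy setting both queries hold in the positive example, but only the length-2 path query avoids the negative one, so it fits while the single-edge query does not. When several non-equivalent CQs fit, choosing among them (for instance, a most-specific or most-general one) is exactly the freedom the abstract refers to.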