Structure Learning of Probabilistic Logic Programs by Searching the Clause Space (1309.2080v1)

Published 9 Sep 2013 in cs.LG and cs.AI

Abstract: Learning probabilistic logic programming languages is receiving an increasing attention and systems are available for learning the parameters (PRISM, LeProbLog, LFI-ProbLog and EMBLEM) or both the structure and the parameters (SEM-CP-logic and SLIPCASE) of these languages. In this paper we present the algorithm SLIPCOVER for "Structure LearnIng of Probabilistic logic programs by searChing OVER the clause space". It performs a beam search in the space of probabilistic clauses and a greedy search in the space of theories, using the log likelihood of the data as the guiding heuristics. To estimate the log likelihood SLIPCOVER performs Expectation Maximization with EMBLEM. The algorithm has been tested on five real world datasets and compared with SLIPCASE, SEM-CP-logic, Aleph and two algorithms for learning Markov Logic Networks (Learning using Structural Motifs (LSM) and ALEPH++ExactL1). SLIPCOVER achieves higher areas under the precision-recall and ROC curves in most cases.

Citations (74)


Summary

The paper introduces SLIPCOVER, a novel algorithm that learns both the structure and parameters of LPADs through a two-phase process combining beam search for clause selection and greedy theory construction.
The method estimates clause parameters with Expectation Maximization (via EMBLEM), which operates on Binary Decision Diagrams that compactly encode the explanations of each query.
Experimental results show SLIPCOVER achieving higher AUCPR and AUCROC than the compared SRL systems in most cases, on datasets such as UW-CSE and Mutagenesis.

Structure Learning of Probabilistic Logic Programs by Searching the Clause Space

The paper by Bellodi and Riguzzi introduces SLIPCOVER, an algorithm for learning both the structure and the parameters of Logic Programs with Annotated Disjunctions (LPADs). The work belongs to Statistical Relational Learning (SRL), which combines logic and probability to model complex, uncertain relationships among entities.

Algorithm Overview

SLIPCOVER splits the learning task into two distinct phases:

Clause Selection: It begins by identifying promising candidate clauses with a beam search in the clause space. SLIPCOVER uses an ILP-style mechanism to generate "bottom clauses", which serve as templates that delimit the candidate clauses the search can reach.
Theory Construction: Subsequently, a greedy search adds these clauses one at a time to build a probabilistic theory. The search is guided by the log likelihood (LL) of the data, which serves as the heuristic measure of the theory’s quality.

In essence, SLIPCOVER first narrows the search space by selecting promising clauses and then incrementally builds a probabilistic logic program from these candidates.
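The two phases can be illustrated with a minimal Python sketch. It is not the authors' implementation: clauses are reduced to sets of body literals, `toy_score` is a hypothetical stand-in for the log likelihood that SLIPCOVER obtains by running EMBLEM, and names such as `beam_search_clauses` and `greedy_theory` are invented for this example.

```python
from typing import Callable, FrozenSet, List, Tuple

Clause = FrozenSet[str]      # assumption: a clause is just a set of body literals
Theory = Tuple[Clause, ...]  # a theory is an ordered collection of clauses
Score = Callable[[Theory], float]


def beam_search_clauses(bottom: Clause, score: Score,
                        beam_size: int, max_steps: int) -> List[Clause]:
    """Phase 1: refine clauses by adding literals taken from the bottom clause."""
    beam: List[Clause] = [frozenset()]
    seen = {frozenset()}
    candidates: List[Clause] = []
    for _ in range(max_steps):
        refinements = []
        for clause in beam:
            for literal in sorted(bottom - clause):
                refined = clause | {literal}
                if refined not in seen:
                    seen.add(refined)
                    refinements.append(refined)
        if not refinements:
            break
        # Each refinement is scored as a one-clause theory.
        refinements.sort(key=lambda c: score((c,)), reverse=True)
        beam = refinements[:beam_size]
        candidates.extend(beam)
    return sorted(candidates, key=lambda c: score((c,)), reverse=True)


def greedy_theory(candidates: List[Clause], score: Score) -> Tuple[Theory, float]:
    """Phase 2: add clauses one at a time, keeping those that improve the score."""
    theory: Theory = ()
    best = score(theory)
    for clause in candidates:
        trial = theory + (clause,)
        trial_score = score(trial)
        if trial_score > best:
            theory, best = trial, trial_score
    return theory, best


# Hypothetical scoring function standing in for the log likelihood.
TARGET = [frozenset({"a", "b"}), frozenset({"c"})]


def toy_score(theory: Theory) -> float:
    total = 0.0
    for target in TARGET:
        # Jaccard similarity between the target clause and its closest match.
        total += max((len(target & c) / len(target | c) for c in theory), default=0.0)
    return total - 0.05 * len(theory)  # small penalty per clause


candidates = beam_search_clauses(frozenset("abcd"), toy_score, beam_size=3, max_steps=2)
theory, final_score = greedy_theory(candidates, toy_score)
print(sorted(sorted(c) for c in theory), round(final_score, 2))
```

With this toy score the greedy phase keeps the two clauses `{c}` and `{a, b}` and rejects the remaining candidates, because adding them only incurs the per-clause penalty. In the real system each score evaluation involves parameter learning, so the number of evaluations matters far more than it does here.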

Technical Strengths

SLIPCOVER’s architecture showcases several technical advancements:

Use of Expectation Maximization: The algorithm employs EMBLEM, an Expectation-Maximization (EM) method, for parameter estimation. EM handles missing or incomplete information by alternating between computing expected counts under the current parameters and re-estimating the parameters that maximize the likelihood, until convergence.
Management of Complexity through BDDs: By using Binary Decision Diagrams (BDDs), the algorithm avoids enumerating the explanations of a query one by one. A BDD encodes them compactly, so probabilities and the expectations needed by EM can be computed by traversing the diagram, which keeps inference tractable.
Flexible Language Bias: SLIPCOVER supports mode declarations that allow the construction of disjunctive heads and varied body predicates, which increases the expressiveness and flexibility of the resulting theory.
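To make the role of BDDs concrete, the following Python sketch computes the probability of a Boolean formula from its BDD by recursive Shannon expansion. It is a simplified illustration under stated assumptions: variables are independent and Boolean (LPAD head choices are multivalued and would need an additional encoding), the diagram is built by hand, and the `Node` class and `probability` function are hypothetical names, not part of EMBLEM.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Node:
    """Internal BDD node: follow `high` if `var` is true, `low` otherwise."""
    var: str
    high: "BDD"
    low: "BDD"


BDD = Union[Node, bool]  # terminals are the Python booleans


def probability(root: BDD, p: Dict[str, float]) -> float:
    """P(formula is true) for independent Boolean variables with P(var) = p[var]."""
    cache: Dict[int, float] = {}

    def visit(node: BDD) -> float:
        if isinstance(node, bool):
            return 1.0 if node else 0.0
        key = id(node)
        if key not in cache:
            # Shannon expansion: P(f) = p * P(f | var) + (1 - p) * P(f | not var)
            cache[key] = (p[node.var] * visit(node.high)
                          + (1.0 - p[node.var]) * visit(node.low))
        return cache[key]

    return visit(root)


# Hand-built BDD for (x1 and x2) or x3 with variable order x1 < x2 < x3.
n3 = Node("x3", True, False)
n2 = Node("x2", True, n3)
n1 = Node("x1", n2, n3)  # n3 is shared by two parents

prob = probability(n1, {"x1": 0.6, "x2": 0.5, "x3": 0.2})
print(round(prob, 4))
```

The node `n3` is shared by two parents and is evaluated only once thanks to the cache; this sharing is what lets a BDD stay small when the explicit list of explanations would be large.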

Experimental Results

The authors evaluate SLIPCOVER against SLIPCASE, SEM-CP-logic, Aleph, and two algorithms for learning Markov Logic Networks, LSM and ALEPH++ExactL1. On five real-world datasets, ranging from bioinformatics (HIV) to academic settings (UW-CSE), SLIPCOVER achieves higher areas under the precision-recall curve (AUCPR) and the ROC curve (AUCROC) in most cases.

On the UW-CSE dataset, considered difficult due to its complex relational structure, SLIPCOVER outperforms the other systems, generating theories with higher coverage and predictive power. On bioinformatics datasets such as Mutagenesis, SLIPCOVER is competitive with other state-of-the-art methods such as ALEPH++ExactL1, which indicates that it works across domains.
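For readers unfamiliar with the two metrics, the Python sketch below computes them on a four-example toy ranking. The scores and labels are made up for illustration and are not results from the paper. AUCROC is computed as the fraction of positive/negative pairs ranked correctly, and the precision-recall area is approximated by average precision, which is one common convention and may differ from the interpolation used by the authors.

```python
from typing import Sequence


def auc_roc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Ties count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for sp in positives:
        for sn in negatives:
            if sp > sn:
                correct += 1.0
            elif sp == sn:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return precision_sum / hits


# Toy data, invented for this example only.
toy_scores = [0.9, 0.8, 0.7, 0.4]
toy_labels = [1, 0, 1, 0]

roc = auc_roc(toy_scores, toy_labels)
ap = average_precision(toy_scores, toy_labels)
print(roc, round(ap, 4))
```

The pairwise formulation is quadratic in the number of examples; it is used here for clarity rather than speed. Precision-recall curves are often preferred on relational datasets because negative examples typically far outnumber positive ones.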

Implications and Future Work

SLIPCOVER contributes to SRL mainly by combining ILP search techniques with probabilistic parameter learning. This broadens the applicability of LPADs and gives researchers another tool for building complex probabilistic models.

The implications of this work are twofold. Practically, it supports the development of decision-support systems in domains that require modeling intricate relationships, such as bioinformatics, natural language processing, and knowledge representation. Theoretically, it prompts further inquiry into optimizing clause refinement and integrating other probabilistic reasoning methods.

For future directions, the paper suggests local search strategies and integration with methods from related paradigms such as Markov Logic Networks, which could improve SLIPCOVER's efficacy and robustness.

In conclusion, Bellodi and Riguzzi organize the search for good logic program representations into a clause-level and a theory-level phase, offering a useful tool for both academic research and practical application within SRL and beyond.
