Lifted Relational Neural Networks
- Lifted Relational Neural Networks (LRNNs) are a declarative framework that integrates symbolic logic with neural networks by grounding weighted first-order rules.
- They instantiate ground feed-forward computation graphs with shared parameters, enabling hierarchical, interpretable feature and relational pattern learning.
- Empirical results show that LRNNs outperform traditional ILP-statistical hybrids and reproduce the behavior of GNNs while training faster per epoch.
Lifted Relational Neural Networks (LRNNs) are a declarative framework for integrating symbolic first-order logic with neural computation. By specifying weighted first-order rules as templates, LRNNs instantiate ground feed-forward neural networks tailored to specific relational examples. All groundings of a rule share parameters, coupling statistical learning with relational abstraction. This architecture enables hierarchical, interpretable feature learning and expressive modeling for relational domains such as molecular graphs, knowledge bases, and multi-relational data (Sourek et al., 2015, Sourek et al., 2017, Sourek et al., 2020).
1. Formal Framework
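An LRNN is defined as a finite set of weighted definite clauses:
$$\mathcal{N} = \{(R_i, w_i)\}_{i=1}^{n},$$
where each rule $R_i$ is a function-free first-order Horn clause and $w_i$ is its associated learnable weight. The unweighted rule set is $\mathcal{N}^* = \{R_i\}_{i=1}^{n}$.
Given a set of ground facts $E$ (the relational example), the template is grounded over the active domain, generating all rule groundings whose bodies are true in the least Herbrand model of $\mathcal{N}^* \cup E$. The grounded LRNN,
$$\overline{\mathcal{N}} = \{(R_i\theta, w_i) : (R_i, w_i) \in \mathcal{N},\ R_i\theta \text{ is a ground instance whose body is true in the least Herbrand model}\},$$
serves as a blueprint for constructing a computation graph with four primary node types: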
- Fact neurons: output fixed fact values.
- Rule neurons: encode logical conjunctions.
- Aggregation neurons: pool multiple groundings for the same head.
- Atom neurons: fuse incoming outputs for each ground atom.
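The grounding step can be illustrated with a minimal Python sketch. It assumes a hypothetical tuple encoding of atoms (predicate name followed by arguments, with capitalized strings as variables) and matches body literals directly against the given facts. Computing the least Herbrand model of a multi-layer template would require iterating this to a fixpoint, which is omitted here. The rule and facts are made up for illustration.

```python
def is_var(term):
    # Assumption of this sketch: variables start with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()

def match(literal, fact, subst):
    """Try to extend `subst` so that `literal` equals the ground `fact`."""
    if literal[0] != fact[0] or len(literal) != len(fact):
        return None
    subst = dict(subst)
    for term, const in zip(literal[1:], fact[1:]):
        if is_var(term):
            # Bind the variable, or check consistency with an earlier binding.
            if subst.setdefault(term, const) != const:
                return None
        elif term != const:
            return None
    return subst

def substitute(atom, subst):
    return (atom[0],) + tuple(subst.get(t, t) for t in atom[1:])

def ground_rule(head, body, facts):
    """Return all ground (head, body) instances whose body atoms are facts."""
    substs = [{}]
    for literal in body:
        extended = []
        for s in substs:
            for fact in facts:
                s2 = match(literal, fact, s)
                if s2 is not None:
                    extended.append(s2)
        substs = extended
    return [(substitute(head, s), [substitute(b, s) for b in body])
            for s in substs]

facts = [("bond", "c1", "o1"), ("bond", "c1", "h1"), ("oxygen", "o1")]
# Hypothetical rule: has_o_neighbor(X) :- bond(X, Y), oxygen(Y).
groundings = ground_rule(("has_o_neighbor", "X"),
                         [("bond", "X", "Y"), ("oxygen", "Y")], facts)
print(groundings)
```

Only the substitution {X: c1, Y: o1} satisfies both body literals, so a single ground rule is produced. In a ground network, that instance would become one rule neuron feeding the aggregation neuron for `has_o_neighbor(c1)`.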
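Activation functions approximate fuzzy logic connectives with parameterized sigmoids. For example, a “disjunction” over inputs $x_1, \dots, x_k$ takes a form such as
$$g_\vee(x_1, \dots, x_k) = \sigma\Big(a \cdot \Big(\sum_{i=1}^{k} x_i + b_0\Big)\Big),$$
while a “conjunction” is realized by
$$g_\wedge(x_1, \dots, x_k) = \sigma\Big(a \cdot \Big(\sum_{i=1}^{k} x_i - k + b_0\Big)\Big),$$
with learnable or fixed $a$, $b_0$ hyperparameters (slope and offset of the sigmoid $\sigma$). Average or max can be used for aggregations (Sourek et al., 2015, Sourek et al., 2017, Sourek et al., 2020).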
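As a rough illustration, the soft connectives can be sketched in Python. The exact parameterization is an assumption of this sketch: a sigmoid with a hypothetical `slope` and `offset`, where the conjunction subtracts the number of inputs so that it fires only when all of them are near 1. The default values are illustrative, not taken from the cited papers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_or(inputs, slope=6.0, offset=-0.5):
    # High as soon as the inputs sum to noticeably more than 0.5.
    return sigmoid(slope * (sum(inputs) + offset))

def soft_and(inputs, slope=6.0, offset=0.5):
    # Subtracting len(inputs) means the pre-activation is positive
    # only when (almost) every input is close to 1.
    return sigmoid(slope * (sum(inputs) - len(inputs) + offset))

def aggregate(inputs, mode="avg"):
    # Aggregation neurons pool the groundings that share a head atom.
    return max(inputs) if mode == "max" else sum(inputs) / len(inputs)

print(soft_and([1.0, 1.0]), soft_and([1.0, 0.0]))
print(soft_or([1.0, 0.0]), soft_or([0.0, 0.0]))
```

With these illustrative settings, `soft_and` is above 0.5 only when both inputs are 1, and `soft_or` is above 0.5 when at least one input is 1. A steeper slope brings both closer to their crisp logical counterparts.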
2. Network Instantiation and Weight Sharing
For each example, the LRNN template dynamically unfolds into a computation graph reflecting the structure of the specific data instance. This “ground network” consists of neurons corresponding to each ground predicate, rule, and aggregation node. The same rule (template clause) may generate multiple groundings across different data instances, but all share a common weight, ensuring parameter tying and facilitating generalization across examples (Sourek et al., 2015).
Learning takes place via standard back-propagation and gradient descent. When each example yields a set of “query atoms” with targets, the overall loss across a minibatch or dataset aggregates the cross-entropy or squared loss over all queries:
$$\mathcal{L} = \sum_{j} \sum_{i} \ell\big(y_i^{(j)}, t_i^{(j)}\big),$$
where $y_i^{(j)}$ denotes the activation for the i-th query in the j-th example, $t_i^{(j)}$ is its target, and $\ell$ is the per-query cross-entropy or squared loss. The training loop alternates grounding, forward pass, loss computation, backward pass, and weight updates, with parameters co-evolving across all instances where their rule applies (Sourek et al., 2017, Sourek et al., 2015).
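A toy version of this training loop can be sketched in plain Python. The sketch makes several simplifying assumptions: grounding has already been done, each example is given as a list of ground-body activations per rule, aggregation is the average, there is a single query atom per example, and the loss is the squared error. The data and the names `forward` and `train` are hypothetical. The point is that each rule weight is updated by every example in which that rule has groundings.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(weights, bias, example):
    # One pooled value per rule: the average over that rule's groundings.
    pooled = [sum(g) / len(g) if g else 0.0 for g in example]
    y = sigmoid(sum(w * p for w, p in zip(weights, pooled)) + bias)
    return y, pooled

def train(data, n_rules, epochs=2000, lr=0.5):
    weights, bias = [0.0] * n_rules, 0.0
    for _ in range(epochs):
        for example, target in data:
            y, pooled = forward(weights, bias, example)
            # Gradient of the squared loss w.r.t. the pre-activation.
            delta = 2.0 * (y - target) * y * (1.0 - y)
            for r, p in enumerate(pooled):
                weights[r] -= lr * delta * p   # same weight for every grounding
            bias -= lr * delta
    return weights, bias

# Each example: ([groundings of rule 0, groundings of rule 1], target).
data = [
    ([[0.9, 0.8, 1.0], [0.1]], 1.0),
    ([[0.2], [0.9, 0.7]], 0.0),
    ([[1.0, 0.9], []], 1.0),
    ([[0.1, 0.0], [0.8]], 0.0),
]
weights, bias = train(data, n_rules=2)
for example, target in data:
    print(target, round(forward(weights, bias, example)[0], 3))
```

The examples differ in how many groundings each rule has, yet all of them update the same two weights, which is the parameter tying described above.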
3. Hierarchical and Deep Relational Modeling
Stacking rules whose heads appear in further rule bodies produces hidden layers of relational “latent concepts.” Intermediate predicates serve as learned feature transformations, analogous to hidden units in conventional deep nets but organized by logical abstraction. For instance, in molecular domains, first-layer rules may softly cluster atoms by type; higher-layer rules define complex motifs or relational patterns (e.g., “chains” or “rings”) by recursively combining lower-level soft concepts (Sourek et al., 2015, Sourek et al., 2017).
The structure can encode various architectures depending on the logical connectivity:
- Depth equals the length of the longest chain of rules connecting input predicates to the target predicates.
- Each layer corresponds to aggregating over a set of rules at a given logical abstraction.
- Learned weights define “soft clusters” and complex, multi-hop relational features.
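The notion of depth can be made concrete with a short sketch. It assumes a hypothetical, non-recursive template stored as a mapping from head predicates to the predicates in each rule body; predicates that never appear as a head are treated as input facts at depth 0.

```python
from functools import lru_cache

# Hypothetical template: head predicate -> list of rule bodies (predicate names only).
template = {
    "atom_group": [["carbon"], ["oxygen"]],
    "motif": [["atom_group", "bond", "atom_group"]],
    "toxic": [["motif"], ["atom_group"]],
}

@lru_cache(maxsize=None)
def depth(predicate):
    """Length of the longest rule chain from input facts up to `predicate`.

    Assumes the template is acyclic (no recursive rules).
    """
    if predicate not in template:      # extensional predicate, given as facts
        return 0
    return 1 + max(depth(q) for body in template[predicate] for q in body)

print(depth("toxic"))  # 3
```

Here `toxic` depends on `motif`, which depends on `atom_group`, which depends on input facts, giving three layers of rules.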
This architectural flexibility enables not just deep composition of relations but also interpretable intermediates. Learned rules and weights retain human-readability and can be analyzed post hoc to reveal which soft concepts or composite features contribute to final predictions (Sourek et al., 2017).
4. Automated Structure Learning and Predicate Invention
While early LRNN applications relied on expert-crafted rule templates, subsequent work integrated automated structure learning. The process alternates between weight optimization and top-down beam search in the space of Horn clauses, iteratively expanding the hypothesis set by adding rules and inventing new latent predicates (“soft concepts”). Typical structure learning proceeds as follows:
- Initialize unary predicate clusters for first-layer soft concepts.
- Perform beam search to propose rules, scoring each by retraining current weights and evaluating log-loss.
- After the best rule is added, invent higher-arity latent predicates if warranted, with rules unifying variables accordingly.
- Retrain weights, then repeat until a stopping criterion is met (Sourek et al., 2017).
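The rule-proposal step can be sketched as a generic top-down beam search over rule bodies. This is a simplified stand-in, not the procedure from the cited work: rule bodies are plain sets of literal strings, refinement only adds one literal at a time, and the `score` callback is a hypothetical placeholder for "retrain weights and evaluate log-loss". Predicate invention and the outer retraining loop are left out.

```python
def beam_search(literals, score, beam_width=2, max_len=3):
    """Top-down search over rule bodies; a lower score is better."""
    beam = [frozenset()]
    best = (score(frozenset()), frozenset())
    for _ in range(max_len):
        # Refine every body in the beam by adding one unused literal.
        candidates = {body | {lit} for body in beam
                      for lit in literals if lit not in body}
        if not candidates:
            break
        # Sort by score, breaking ties deterministically by literal names.
        ranked = sorted(candidates, key=lambda b: (score(b), sorted(b)))
        beam = ranked[:beam_width]
        best = min(best, (score(beam[0]), beam[0]), key=lambda pair: pair[0])
    return best

literals = ["bond(X,Y)", "oxygen(Y)", "carbon(Y)", "ring(X)"]

# Stand-in score: distance to a made-up "ideal" body. In an LRNN setting
# this would instead be the loss after retraining with the candidate rule.
ideal = frozenset({"bond(X,Y)", "oxygen(Y)"})
def toy_score(body):
    return len(body ^ ideal)

best_score, best_body = beam_search(literals, toy_score)
print(best_score, sorted(best_body))
```

Because every candidate must be scored, the cost grows with the number of available literals and the beam width. The real scoring step retrains weights for each candidate, which multiplies this cost.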
Predicate invention in this setting generalizes its conventional ILP counterpart: new latent predicates are introduced whose semantics are learned as weighted combinations of existing predicates. This yields deep, multi-layer LRNNs with 4–7 layers, 20–50 latent concepts, and rule-chain lengths typically around 3.
5. LRNNs as a Declarative Deep Learning Language
LRNNs abstract neural computation with a declarative Datalog-style logic language. Each clause is annotated with a (possibly tensor-valued) weight, and the interpreter grounds and unfolds this template into a differentiable computation graph per data instance. This approach enables:
- Compact programmatic representations of advanced neural models (e.g., GNNs) without explicit procedural graph construction.
- Uniform treatment of varying relational structures.
- Seamless extension to higher-arity, heterogeneous, or structured-relational constructs through logic rules (Sourek et al., 2020).
Table: Comparison of LRNN and GNN template expressiveness
| Feature | GNN Frameworks | LRNNs |
|---|---|---|
| Relational order | Binary edges only | Arbitrary arity |
| Propagation type | Fixed (message passing) | User-defined by rules |
| Program abstraction | Imperative | Declarative Datalog |
LRNNs can compactly encode GNN architectures such as GCN, GraphSAGE, GIN, and arbitrary relational extensions (motifs, edge-embeddings, heterogeneous graphs) by specifying appropriate logic programs. At runtime, the interpreter ensures efficient grounding and weight sharing (Sourek et al., 2020).
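To make the correspondence with message passing concrete, the following sketch evaluates a single hypothetical template rule by enumerating its groundings and compares the result with a directly coded mean-neighbor layer. It is a simplified illustration with scalar features and a tanh activation, not the exact GCN normalization and not the syntax of any actual LRNN interpreter.

```python
import math

edges = [("a", "b"), ("a", "c"), ("b", "a"), ("c", "a")]  # edge(X, Y) facts
h0 = {"a": 1.0, "b": 0.5, "c": -1.0}                      # input node features
w = 0.8                                                   # weight shared by all groundings

# Hypothetical template rule, written Datalog-style:
#   w :: h1(X) :- edge(X, Y), h0(Y).
def lrnn_layer(edges, h0, w):
    groundings = {}                      # head atom h1(x) -> ground-body values
    for x, y in edges:                   # one grounding per substitution {X: x, Y: y}
        groundings.setdefault(x, []).append(w * h0[y])
    # Average aggregation, then a tanh activation on the atom neuron.
    return {x: math.tanh(sum(v) / len(v)) for x, v in groundings.items()}

def mean_neighbor_layer(edges, h0, w):
    # The same computation written imperatively, as in a typical GNN layer.
    out = {}
    for node in h0:
        neigh = [h0[y] for x, y in edges if x == node]
        if neigh:
            out[node] = math.tanh(w * sum(neigh) / len(neigh))
    return out

h1 = lrnn_layer(edges, h0, w)
reference = mean_neighbor_layer(edges, h0, w)
print(h1)
```

Both functions compute the same values up to floating-point rounding. Changing the rule body (for example, adding an edge-type literal or a third variable) changes the propagation scheme without touching the evaluation code, which is the appeal of the declarative formulation.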
6. Empirical Results and Computational Properties
Experiments across molecular property benchmarks (NCI datasets, Mutagenesis, PTC toxicity) demonstrate that LRNNs outperform classical ILP-statistical hybrids such as kFOIL and nFOIL (e.g., average error 0.18 for learned LRNN vs. 0.25–0.28 for baselines across 72 NCI datasets) (Sourek et al., 2017, Sourek et al., 2015). Learned LRNNs nearly match the performance of carefully handcrafted deep LRNNs and reveal meaningful latent clusters and hierarchical motifs in molecular data.
For neural graph learning, LRNNs recover the exact behaviors of GCN, GraphSAGE, and GIN: training loss curves are indistinguishable under matched templates. LRNNs are also more efficient per epoch (up to 10x faster than PyTorch Geometric and 50–100x faster than DGL on moderate-sized datasets), with a one-time grounding cost amortized over fast per-example computation. Nevertheless, beam-search-based structure learning can be costly on very large predicate sets, and overall generalization is limited primarily by the data and the expressivity of the rule template rather than by inference tractability (Sourek et al., 2020, Sourek et al., 2017).
7. Advantages, Limitations, and Future Directions
LRNNs provide a fully automated, end-to-end differentiable, and interpretable framework for relational deep learning. Key properties and open directions include:
- Full automation: Structure learning removes the need for manually engineered rule templates.
- Hierarchical concept learning: Facilitates learning of deep soft relational abstractions.
- Interpretability: Retains readable rules and fuzzy logic semantics.
- Expressiveness: Captures and extends GNN paradigms, enabling higher-order and heterogeneous relational modeling.
- Scalability: Effective for moderate-sized relational domains; reasoning and search costs present challenges for very large/complex schemas.
- Guidance for rule search: There is ongoing research into heuristics, meta-learning, and background knowledge integration for more efficient structure learning.
- Future extensions: Include probabilistic or non-definite clauses, multi-class settings, and achieving fully differentiable joint structure-parameter optimization (Sourek et al., 2017).
Collectively, LRNNs unify logic-based relational abstraction with neural weight sharing, offering a rigorous, extensible platform for interpretable and scalable relational machine learning (Sourek et al., 2015, Sourek et al., 2017, Sourek et al., 2020).