Extended Relational Model (R++)
- Extended Relational Model (R++) is a declarative framework that generalizes Codd’s relational model by integrating complete relations, constraint operators, and inherited attributes.
- It formalizes complete and σ–complete relations to enable intensional, potentially infinite query definitions while supporting navigation-free inherited attribute queries.
- R++ provides expressiveness equivalent to first-order constraint programming, streamlining complex data transformations and query optimization strategies.
The Extended Relational Model (R++, also denoted R⁺⁺) generalizes Codd’s original relational framework by integrating both set-oriented data transformation and constraint-based computation into a single declarative algebra. R++ encompasses innovations such as “complete relations,” σ–complete relations, inherited attributes (stored + inherited relations), and constraint-oriented operators, offering expressive power equivalent to first-order constraint programming while enriching schema modeling and query ergonomics.
1. Formalization of Complete Relations and σ–Complete Construction
R++ introduces the formal notion of a complete relation, which differs from Codd’s finite, extensionally defined set of tuples. In R++, a relation of arity $n$ may be defined as the entire domain cross-product, possibly infinite, and restricted intensionally by a Boolean predicate $P$. The key notations are:
$\text{COMPLETE}(A_1:\Dom_1,\dots,A_n:\Dom_n) = \Dom_1 \times \Dom_2 \times \cdots \times \Dom_n$
$\Sigma\textsf{–COMPLETE}(A_1:\Dom_1,\dots,A_n:\Dom_n;\;P(A_1,\dots,A_n)) = \sigma_{P}(\Dom_1\times\cdots\times\Dom_n)$
This construction allows arbitrary intensional, possibly infinite, extensions grounded by σ–selections (Pratten et al., 2023). The complete relation operator acts as a universal template, from which the selection operator $\sigma_P$ extracts the subsets defined by a predicate $P$.
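As a minimal sketch of the idea (not the authors' implementation), a σ–complete relation can be modeled in Python as a lazy filter over an enumeration of the complete relation. The helper names `nat_pairs` and `sigma_complete` are assumptions introduced here for illustration; the anti-diagonal enumeration is one way to ensure every tuple of an infinite product is eventually visited.

```python
from itertools import count, islice

def nat_pairs():
    """Enumerate COMPLETE(a: N, b: N) by anti-diagonals so every pair is eventually reached."""
    for s in count(0):
        for a in range(s + 1):
            yield (a, s - a)

def sigma_complete(complete_relation, predicate):
    """Lazily select the tuples of a (possibly infinite) complete relation that satisfy predicate."""
    return (t for t in complete_relation if predicate(*t))

# Sigma-COMPLETE(a: N, b: N; b = a * a): an infinite, intensionally defined relation.
squares = sigma_complete(nat_pairs(), lambda a, b: b == a * a)

# Only a finite prefix is ever computed.
first_four = list(islice(squares, 4))
print(first_four)  # [(0, 0), (1, 1), (2, 4), (3, 9)]
```

Because the relation is a generator, the infinite extension is never materialized; a consumer takes only as many tuples as it needs.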
2. Extended Operators and Algebraic Generalization
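R++ extends the standard relational algebra (RA) operators (selection $\sigma$, projection $\pi$, product $\times$, join $\bowtie$, union $\cup$, difference $-$, and renaming $\rho$) with: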
- Complete-relation constructor:
$\textsf{COMPLETE}(A_1:\Dom_1,\dots,A_n:\Dom_n)$
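- σ–complete selection (intensional constraint application):
$\Sigma\textsf{–COMPLETE}(A_1:\Dom_1,\dots,A_n:\Dom_n;\;P) = \sigma_{P}(\Dom_1\times\cdots\times\Dom_n)$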
- Constraint-join (generalized join/LATERAL semantics):
$R \bowtie_\theta \sigma_P(\Dom(\dots)) \equiv \{t \mid t \in R \times \Dom(\dots) \wedge \theta(t) \wedge P(t)\}$
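- Aggregation beyond grouping, with a higher-order set-aggregator operator $\mathrm{AGG}_{f,\,\pi}$ parameterized by an aggregate function $f$ and a projection $\pi$ (for example, $\mathrm{AGG}_{\mathrm{sum},\,\pi_i}$).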
These extensions enable nonprocedural declarative computation and general constraint application, subsuming both relational grouping and complex computations within one algebra (Pratten et al., 2023).
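The constraint-join definition above can be illustrated with a small Python sketch over a finite domain. The function `constraint_join`, the relation `R(item, price)`, and the `qty` domain are hypothetical names chosen for this example; the code simply enumerates $R \times \Dom$ and keeps the tuples satisfying both $\theta$ and $P$.

```python
from itertools import product

def constraint_join(r, attr, domain, theta, p):
    """Return {t | t in R x Dom(attr) and theta(t) and P(t)} as a list of dicts."""
    result = []
    for row, value in product(r, domain):
        t = {**row, attr: value}  # extend the stored tuple with the domain attribute
        if theta(t) and p(t):
            result.append(t)
    return result

# Hypothetical stored relation R(item, price) and a finite domain for qty.
R = [{"item": "a", "price": 10}, {"item": "b", "price": 25}]
joined = constraint_join(
    R, "qty", range(1, 6),
    theta=lambda t: t["price"] * t["qty"] <= 50,  # condition relating R to the domain attribute
    p=lambda t: t["qty"] % 2 == 0,                # intensional constraint on the domain side
)
pairs = [(t["item"], t["qty"]) for t in joined]
print(pairs)  # [('a', 2), ('a', 4), ('b', 2)]
```

The domain side behaves like a LATERAL subquery: each candidate `qty` is checked in the context of the current row of `R`.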
3. Constraint Programming Equivalence and Expressiveness
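R++ maps directly to first-order (FO) constraint satisfaction problems (CSPs). For any FO constraint system $S$ over variables $x_1,\dots,x_n$ with domains $\Dom_1,\dots,\Dom_n$ and constraint predicate $P$,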
there is an R++ expression
$\pi_{\text{outputs}}(\sigma_P(\Dom_1 \times ... \times \Dom_n)),$
where each variable corresponds to an attribute, its domain to a relation domain, and $P$ specifies the constraints. Formal soundness and completeness lemmas establish that R++’s expressive power matches that of FO constraint-solving engines such as SMT solvers (Pratten et al., 2023). Evaluating $S$ together with an additional predicate $G$ (for example, a query goal) gives:
$\operatorname{Evaluate}(S \wedge G) = \{ t \in \Dom_1 \times ... \times \Dom_n \mid P(t) \wedge G(t) \}$
Constraint pushdown optimizations apply, subsuming the semantics of constraint propagation in CP systems.
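To make the CSP correspondence concrete, the following Python sketch evaluates $\pi_{\text{outputs}}(\sigma_{P \wedge G}(\Dom_1 \times \cdots \times \Dom_n))$ by brute-force enumeration over finite domains. The function `evaluate` and the Pythagorean-triple constraint system are assumptions made for illustration; a real engine would use propagation or a solver rather than exhaustive search.

```python
from itertools import product

def evaluate(domains, p, g=lambda t: True, outputs=None):
    """Compute pi_outputs({t in Dom_1 x ... x Dom_n | P(t) and G(t)}) by enumeration."""
    names = list(domains)
    outputs = outputs or names
    result = set()
    for values in product(*(domains[n] for n in names)):
        t = dict(zip(names, values))  # one attribute per CSP variable
        if p(t) and g(t):
            result.add(tuple(t[o] for o in outputs))
    return result

# Variables x, y, z with finite domains 1..20.
domains = {"x": range(1, 21), "y": range(1, 21), "z": range(1, 21)}
P = lambda t: t["x"] < t["y"] and t["x"] ** 2 + t["y"] ** 2 == t["z"] ** 2
G = lambda t: t["z"] < 12  # additional goal predicate

triples = evaluate(domains, P, G)
hypotenuses = evaluate(domains, P, G, outputs=["z"])
print(sorted(triples))      # [(3, 4, 5), (6, 8, 10)]
print(sorted(hypotenuses))  # [(5,), (10,)]
```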
4. Schema Extension: Stored + Inherited Attributes
R++ (in the stored + inherited relations, or SIR, model) also generalizes the schema model to include inherited attributes (IAs) alongside stored attributes (SAs). For a base relation $R$, the extended schema comprises the SAs of $R$ together with its IAs, each IA being derived by an inheritance expression (IE) that references $R$ and other relations via foreign keys. At query time, the extended relation is realized as a projection over a join of $R$ and the referenced relations (Litwin, 2019). Inheritance expressions are formally projections (potentially involving joins) that yield navigation-free queries.
5. Declarative Query Patterns and Examples
Declarative modeling and query patterns in R++ remove the need for prescribed evaluation order and explicit navigation, supporting multidirectional reuse:
- System of Linear Equations:
$S = \sigma_{x+y=3 \wedge 2x-y=0}(\Dom(x)=\mathbb{Z}, \Dom(y)=\mathbb{Z}); \quad \pi_{(x,y)}(S) = \{(1,2)\}$
- Aggregation Beyond GROUP BY:
$R = \{ (n,\, \mathrm{AGG}_{\mathrm{sum},\,\pi_i}(\sigma_{1 \le i \le n}(\Dom(n)\times\Dom(i)))) \mid n \in \mathbb{N} \}$
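Both patterns above can be checked with a short Python sketch. Since $\mathbb{Z}$ and $\mathbb{N}$ cannot be enumerated exhaustively, the code substitutes finite windows (`WINDOW` and `I_DOMAIN`, both assumptions of this example) that are large enough to contain the answers.

```python
from itertools import product

WINDOW = range(-10, 11)  # assumption: finite stand-in for Dom(x) = Dom(y) = Z

# S = sigma_{x+y=3 and 2x-y=0}(Dom(x) x Dom(y)), followed by pi_(x,y)(S)
S = [(x, y) for x, y in product(WINDOW, WINDOW) if x + y == 3 and 2 * x - y == 0]
print(S)  # [(1, 2)]

I_DOMAIN = range(0, 101)  # assumption: finite stand-in for Dom(i) = N

def agg_sum(n):
    """AGG_{sum, pi_i} over sigma_{1 <= i <= n}: sum the selected i values for a fixed n."""
    return sum(i for i in I_DOMAIN if 1 <= i <= n)

# A finite prefix of R = {(n, AGG(...)) | n in N}
R = [(n, agg_sum(n)) for n in range(1, 6)]
print(R)  # [(1, 1), (2, 3), (3, 6), (4, 10), (5, 15)]
```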
Inherited attributes support navigation-free querying. In the classical supplier-part schema, a query over the extended schema can reference inherited attributes directly, without explicit join navigation (Litwin, 2019).
6. Implementation and Query Optimization Strategies
On popular DBMSs, the stored + inherited extension of R++ can be implemented by decomposing extended relations into stored tables and views. The algorithmic pattern involves storing SAs in a base table, defining IAs via view projections and join/outer-join expressions, and installing a rewrite layer (“SIR-layer”) that transparently translates schema and query operations:
- Parse DDL for attributes, IAs, and inheritance expressions.
- CREATE TABLE for SAs.
- CREATE VIEW for extension, joining referenced relations as required.
- Rewrite SELECTs to operate over the view; DML targets the base table.
- If supported, IAs may map to generated/virtual columns, but cross-table semantics typically require view-based evaluation.
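The steps above can be sketched with Python's standard `sqlite3` module. The supplier and supply tables (`S`, `SP_base`), the view `SP`, and all column names are hypothetical; the sketch shows only the table-plus-view decomposition that a rewrite layer would generate, not the layer itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stored attributes live in base tables.
CREATE TABLE S (sno TEXT PRIMARY KEY, sname TEXT, city TEXT);
CREATE TABLE SP_base (
    sno TEXT REFERENCES S(sno),
    pno TEXT,
    qty INTEGER,
    PRIMARY KEY (sno, pno)
);
-- The extended relation: stored attributes plus inherited sname and city.
CREATE VIEW SP AS
    SELECT b.sno, b.pno, b.qty, s.sname, s.city
    FROM SP_base AS b LEFT JOIN S AS s ON s.sno = b.sno;
""")

# DML targets the base tables only.
conn.execute("INSERT INTO S VALUES ('S1', 'Smith', 'London')")
conn.execute("INSERT INTO SP_base VALUES ('S1', 'P1', 300)")

# Queries read the view and name the inherited attribute without writing a join.
before = conn.execute("SELECT pno, sname FROM SP WHERE qty > 100").fetchall()

# Inherited values follow the referenced relation; nothing is copied into SP_base.
conn.execute("UPDATE S SET sname = 'Jones' WHERE sno = 'S1'")
after = conn.execute("SELECT pno, sname FROM SP WHERE qty > 100").fetchall()
print(before, after)  # [('P1', 'Smith')] [('P1', 'Jones')]
```

The outer join keeps supply rows visible even when the referenced supplier row is missing, in which case the inherited attributes come back as NULL.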
Query optimization involves pushing predicates down into joins and views and materializing IA values only as necessary. For σ–complete relations, cost modeling covers constraint propagation, enumeration, and projection, integrating CP heuristics (backtracking, domain filtering) with classical RA techniques (Pratten et al., 2023). The optimizer may choose pure-relational or constraint-solver-based evaluation depending on domain cardinality and constraint tightness.
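A rough Python sketch of why pushdown matters for σ–complete relations: filtering each domain with its unary constraints before forming the cross-product shrinks the number of candidate tuples to enumerate. The functions `evaluate_naive` and `evaluate_pushdown` and the constraint encoding (a tuple of variable names plus a function) are assumptions of this example, and the domain filtering shown is a much simpler stand-in for real CP propagation.

```python
from itertools import product

def evaluate_naive(domains, constraints):
    """Enumerate the full cross-product, then test every constraint on each tuple."""
    names = list(domains)
    examined, out = 0, []
    for values in product(*(domains[n] for n in names)):
        examined += 1
        t = dict(zip(names, values))
        if all(fn(*(t[v] for v in vs)) for vs, fn in constraints):
            out.append(values)
    return out, examined

def evaluate_pushdown(domains, constraints):
    """Apply unary constraints to their domains first, then enumerate the smaller product."""
    unary = [(vs, fn) for vs, fn in constraints if len(vs) == 1]
    rest = [(vs, fn) for vs, fn in constraints if len(vs) > 1]
    filtered = {
        n: [v for v in dom if all(fn(v) for vs, fn in unary if vs == (n,))]
        for n, dom in domains.items()
    }
    return evaluate_naive(filtered, rest)

domains = {"x": range(100), "y": range(100)}
constraints = [
    (("x",), lambda x: x % 10 == 0),
    (("y",), lambda y: y < 5),
    (("x", "y"), lambda x, y: x + y == 42),
]
naive_result, naive_examined = evaluate_naive(domains, constraints)
pushed_result, pushed_examined = evaluate_pushdown(domains, constraints)
print(naive_result, naive_examined)    # [(40, 2)] 10000
print(pushed_result, pushed_examined)  # [(40, 2)] 50
```

Both strategies return the same relation; only the number of tuples examined differs, which is the kind of quantity a cost model would weigh.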
7. Benefits, Tradeoffs, and Impact
Adoption of the extended relational model yields:
- Reusability and Multidirectionality: σ–complete definitions serve forward, reverse, or goal-seeking queries without procedural dichotomy; inherited attributes offer navigation-free access.
- Holistic Optimization: Unified algebra enables cost-driven selection among classical RA rewrites and constraint-based propagation, with parallel evaluation of σ–complete components by independent CP/SMT engines.
- Conceptual Schema Enrichment: All business attributes (stored and inherited) appear in the schema, while normalization guarantees are preserved because functional dependency rules apply to SAs only (Litwin, 2019).
- Simplified DDL and Querying: Inherited attributes are declared via minimal IE clauses, streamlining schema evolution and reducing procedural overhead.
- Potential Drawbacks: DBMS engine extensions are required (SIR-layer, query rewrite). Query performance depends on optimizer support for predicate inlining; write operations must ignore IAs, occasionally necessitating triggers or special handling. Backward compatibility issues can arise with legacy schema conventions.
A plausible implication is that R++ supports a transition towards more declarative, logic-centric database paradigms and constraint-integrated data modeling. Its synthesis of relational transformation and computation positions R++ as a uniform framework enabling expressive and efficient data-centric computation at scale (Pratten et al., 2023; Litwin, 2019).