
Similar-Concept Interfaces (SCI)

Updated 11 November 2025
  • Similar-Concept Interfaces (SCI) are interfaces with overlapping public method signatures that expose redundant API declarations in contract-driven architectures like Java.
  • Quantitative metrics such as IS, IIS, IC, IIC, and IRIH are used to detect these overlaps, guiding systematic refactoring of redundant service declarations.
  • Automated detection in large-scale systems shows that SCI increases maintenance overhead and cognitive load while violating principles like DRY, necessitating careful remediation.

A Similar-Concept Interface (SCI) arises when two or more software interfaces declare overlapping or redundant sets of public method signatures. Abdeen and Shata (2013) formalize SCIs as a design anomaly characteristic of contract-driven languages, particularly Java, where interfaces articulate module boundaries. Quantitative characterization and detection of SCIs, as distinct from broader class-level or semantic design anomalies, enables systematic identification of redundant service declarations, interface clones, and hierarchy redundancies. These anomalies can introduce maintenance overhead, increase cognitive load for clients, and contravene established software engineering principles such as “DRY” (Don’t Repeat Yourself) (Abdeen et al., 2013).

1. Formal Definition of Similar-Concept Interfaces

An interface $i$ is defined as the set of its public method signatures, denoted $is(i)$. Two interfaces $i_1$ and $i_2$ are said to be similar-concept interfaces (SCI) if $is(i_1) \cap is(i_2) \neq \emptyset$, implying overlap in the services they contractually define. This overlap can range from a single shared method to complete duplication. SCIs are not limited to direct duplication: any substantial intersection qualifies, regardless of the rest of the interface contents.
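As a rough illustration of this definition, the sketch below builds the signature set of an interface with Java reflection and tests whether two interfaces intersect. The interfaces `DiskPiece` and `PeerPiece`, the class `SciDefinition`, and the `name(params):return` rendering are hypothetical choices made for this example. It also assumes that only methods declared directly on the interface count, which the definition above does not settle.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical interfaces, used only for illustration.
interface DiskPiece {
    int getPieceNumber();
    int getLength();
    int getOffset();
}

interface PeerPiece {
    int getPieceNumber();
    int getLength();
    boolean isDone();
}

final class SciDefinition {
    private SciDefinition() {}

    /** is(i): public methods declared directly by the interface, rendered as "name(params):return". */
    static Set<String> signatures(Class<?> iface) {
        if (!iface.isInterface()) {
            throw new IllegalArgumentException(iface.getName() + " is not an interface");
        }
        Set<String> result = new TreeSet<>();
        for (Method m : iface.getDeclaredMethods()) {
            if (m.isSynthetic() || !Modifier.isPublic(m.getModifiers())) {
                continue; // skip compiler-generated and private interface methods
            }
            String params = Arrays.stream(m.getParameterTypes())
                    .map(Class::getTypeName)
                    .collect(Collectors.joining(","));
            result.add(m.getName() + "(" + params + "):" + m.getReturnType().getTypeName());
        }
        return result;
    }

    /** The intersection of is(i1) and is(i2). */
    static Set<String> shared(Class<?> i1, Class<?> i2) {
        Set<String> common = new TreeSet<>(signatures(i1));
        common.retainAll(signatures(i2));
        return common;
    }

    /** True when the two signature sets have a non-empty intersection. */
    static boolean isSimilarConcept(Class<?> i1, Class<?> i2) {
        return !shared(i1, i2).isEmpty();
    }
}
```

Here `SciDefinition.shared(DiskPiece.class, PeerPiece.class)` contains the two signatures `getLength():int` and `getPieceNumber():int`, so the pair satisfies the definition even though each interface also declares a method the other lacks.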

The motivation for identifying SCIs lies in the client perspective—developers must reconcile multiple interface identifiers to utilize or reason about essentially the same behavioral API. This drift toward redundant patterns tends to worsen as systems evolve and new functionalities are layered onto interface hierarchies without careful coordination.

2. Metrics for Interface Similarity, Clones, and Hierarchy Redundancy

Abdeen and Shata introduce metrics to rigorously quantify interface similarity, clones, and hierarchy redundancies, facilitating automatic detection and prioritization for refactoring. These are summarized in the table below:

| Metric | Formula | Targeted Anomaly |
|---|---|---|
| IS | $IS(i_1, i_2) = \frac{\lvert is(i_1) \cap is(i_2) \rvert}{\max(\lvert is(i_1) \rvert, \lvert is(i_2) \rvert)}$ | Shared similarity between $i_1$ and $i_2$ |
| IIS | $IIS(i) = \max_{x \in sim(i)} IS(i, x)$, where $sim(i) = \{x \mid IS(i, x) > 0\}$ | Aggregate shared similarity for $i$ |
| IC | $IC(i_1, i_2) = \frac{\lvert is(i_1) \cap is(i_2) \rvert}{\lvert is(i_1) \rvert}$ | Fractional clone of $i_1$ in $i_2$ |
| IIC | $IIC(i) = \max_{x \in sim(i)} IC(i, x)$ | Aggregate interface cloning for $i$ |
| IRIH | $IRIH(i) = \frac{rep(i)}{\lvert subH(i) \rvert}$ | Redundancy in interface hierarchy |

  • IS (Interface Similarity): Measures normalized pairwise signature overlap; $IS = 1$ iff both interfaces are identical in contents.
  • IIS (Index of Interface Similarity): Collapses all pairwise IS values for an interface to its highest overlap, capturing the worst duplication.
  • IC (Interface Clone): Directional measure giving the fraction of $i_1$’s methods hard-copied in $i_2$.
  • IIC (Index of Interface Clone): Maximum IC of $i$ with any other overlapping interface.
  • IRIH (Index of Redundancy in Interface Hierarchy): Proportion of redundant “implements” or inheritance relationships in the interface’s type sub-hierarchy.

Thresholds recommended for refactoring are $IIS > 0.5$ and $IIC > 0.5$, indicating that more than half of an interface is duplicated in some form, and $IRIH > 0$ (with higher urgency for values above $0.3$–$0.5$).
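The pairwise and aggregate metrics can be sketched directly from the formulas in the table, with each interface represented as a set of signature strings. The class `InterfaceMetrics` and its method names are hypothetical. Treating an empty interface (and an interface with no overlapping partner) as scoring 0 is an assumption made here to avoid division by zero, not something the source states.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class InterfaceMetrics {
    private InterfaceMetrics() {}

    /** Size of the intersection of two signature sets. */
    static int overlap(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }

    /** IS(i1, i2): shared signatures divided by the size of the larger interface (symmetric). */
    static double is(Set<String> a, Set<String> b) {
        int larger = Math.max(a.size(), b.size());
        return larger == 0 ? 0.0 : (double) overlap(a, b) / larger; // assumption: empty interfaces score 0
    }

    /** IC(i1, i2): fraction of i1's signatures that also appear in i2 (directional). */
    static double ic(Set<String> a, Set<String> b) {
        return a.isEmpty() ? 0.0 : (double) overlap(a, b) / a.size(); // assumption: empty interfaces score 0
    }

    /** IIS(i): highest IS between the named interface and any other interface in the map. */
    static double iis(String name, Map<String, Set<String>> all) {
        double best = 0.0;
        for (Map.Entry<String, Set<String>> e : all.entrySet()) {
            if (!e.getKey().equals(name)) {
                best = Math.max(best, is(all.get(name), e.getValue()));
            }
        }
        return best;
    }

    /** IIC(i): highest IC of the named interface against any other interface in the map. */
    static double iic(String name, Map<String, Set<String>> all) {
        double best = 0.0;
        for (Map.Entry<String, Set<String>> e : all.entrySet()) {
            if (!e.getKey().equals(name)) {
                best = Math.max(best, ic(all.get(name), e.getValue()));
            }
        }
        return best;
    }
}
```

Taking the maximum over all other interfaces gives the same result as taking it over $sim(i)$, because interfaces outside $sim(i)$ contribute 0. For an interface with four methods and another that repeats three of them and declares nothing else, IS is 3/4 in both directions, while IC is 3/4 from the larger interface and 1 from the smaller one, which is the directional asymmetry the IC metric is meant to expose.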

3. Detection and Classification of Interface Design Defects

Three principal interface-level design defects are operationalized via the above metrics:

  1. Shared Similarity (SCI): Two or more interfaces redundantly declare substantial subsets of API methods. Detected through IS and IIS. Any interface with $IIS > 0.5$ is flagged as a likely SCI suspect, implying the majority of its declared services appear elsewhere.
  2. Interface Clones: A special subclass of SCIs where one interface is entirely (or mostly) copied verbatim into another. The IC and IIC metrics quantify this directionally: if $IIC(i) > 0.5$ for an interface, more than half its methods are repeated elsewhere, which is typically justified only by intentional extension (and otherwise violates DRY).
  3. Redundancy in Interface Sub-Hierarchy: Owing to multiple interface inheritance (as in Java), classes may “implement” the same contract through multiple inheritance paths, leading to redundant, statically linked dependencies. The IRIH metric computes for each interface the proportion of its implementing types that redundantly reference it. Any IRIH above zero indicates extraneous declarations; values above $0.3$–$0.5$ signal critical anomalies.
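The hierarchy-redundancy case can be illustrated with a small reflection-based sketch. The source does not spell out $rep(i)$ and $subH(i)$ in detail, so the code assumes one plausible reading: $subH(i)$ is the set of known subtypes of $i$, and a subtype is counted in $rep(i)$ when it names $i$ directly while also inheriting it through its superclass or another declared interface. All type names below are hypothetical, and the candidate types are passed in explicitly because reflection cannot enumerate subtypes.

```java
import java.util.List;

// Hypothetical hierarchy, used only for illustration.
interface Service { void start(); }
interface ManagedService extends Service { void stop(); }

class BaseService implements Service {
    public void start() {}
}

// Redundant: Service is already inherited through BaseService.
class CachedService extends BaseService implements Service {}

// Redundant: Service is already inherited through ManagedService.
class RemoteService implements ManagedService, Service {
    public void start() {}
    public void stop() {}
}

final class HierarchyRedundancy {
    private HierarchyRedundancy() {}

    /** True when the type lists the interface in its own implements/extends clause. */
    static boolean declaresDirectly(Class<?> type, Class<?> iface) {
        for (Class<?> declared : type.getInterfaces()) {
            if (declared == iface) {
                return true;
            }
        }
        return false;
    }

    /** True when the interface also reaches the type via its superclass or another declared interface. */
    static boolean inheritsIndirectly(Class<?> type, Class<?> iface) {
        Class<?> parent = type.getSuperclass(); // null for interfaces
        if (parent != null && iface.isAssignableFrom(parent)) {
            return true;
        }
        for (Class<?> declared : type.getInterfaces()) {
            if (declared != iface && iface.isAssignableFrom(declared)) {
                return true;
            }
        }
        return false;
    }

    /** Assumed reading of IRIH(i): redundant direct declarations divided by the number of known subtypes. */
    static double irih(Class<?> iface, List<Class<?>> candidates) {
        int subtypes = 0;
        int redundant = 0;
        for (Class<?> type : candidates) {
            if (type == iface || !iface.isAssignableFrom(type)) {
                continue; // not part of the sub-hierarchy of iface
            }
            subtypes++;
            if (declaresDirectly(type, iface) && inheritsIndirectly(type, iface)) {
                redundant++;
            }
        }
        return subtypes == 0 ? 0.0 : (double) redundant / subtypes;
    }
}
```

With the four subtypes above as candidates, two of them (`CachedService` and `RemoteService`) name `Service` redundantly, so this reading gives an IRIH of 2/4 = 0.5 for `Service`. Removing `Service` from those two declarations leaves behavior unchanged and brings the value to 0.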

Typical detection involves computing IS and IC between all pairs (which may be optimized using index-based techniques), aggregating into IIS and IIC per interface, and evaluating IRIH for each.
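One way to avoid comparing every pair is an inverted index from each signature to the interfaces that declare it, so that only interfaces sharing at least one signature are ever compared. The source mentions index-based techniques without describing them, so this is an assumed approach rather than the authors' method. The class `SciDetector` and its methods are hypothetical names. IRIH is left out because it needs hierarchy information rather than signature sets.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class SciDetector {
    private SciDetector() {}

    /** sim(i) for every interface, built from an inverted index (signature -> declaring interfaces). */
    static Map<String, Set<String>> similar(Map<String, Set<String>> interfaces) {
        Map<String, Set<String>> bySignature = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : interfaces.entrySet()) {
            for (String signature : e.getValue()) {
                bySignature.computeIfAbsent(signature, k -> new TreeSet<>()).add(e.getKey());
            }
        }
        Map<String, Set<String>> sim = new TreeMap<>();
        for (String name : interfaces.keySet()) {
            sim.put(name, new TreeSet<>());
        }
        // Interfaces that appear under the same signature overlap by definition.
        for (Set<String> group : bySignature.values()) {
            for (String a : group) {
                for (String b : group) {
                    if (!a.equals(b)) {
                        sim.get(a).add(b);
                    }
                }
            }
        }
        return sim;
    }

    /** Per-interface scores as {IIS, IIC}; interfaces with no overlapping partner score 0. */
    static Map<String, double[]> scores(Map<String, Set<String>> interfaces) {
        Map<String, Set<String>> sim = similar(interfaces);
        Map<String, double[]> result = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : interfaces.entrySet()) {
            Set<String> self = e.getValue();
            double iis = 0.0;
            double iic = 0.0;
            for (String otherName : sim.get(e.getKey())) {
                Set<String> other = interfaces.get(otherName);
                Set<String> common = new HashSet<>(self);
                common.retainAll(other);
                double shared = common.size();
                iis = Math.max(iis, shared / Math.max(self.size(), other.size()));
                iic = Math.max(iic, shared / self.size());
            }
            result.put(e.getKey(), new double[] {iis, iic});
        }
        return result;
    }

    /** Names of interfaces whose IIS or IIC exceeds the given threshold. */
    static Set<String> flagged(Map<String, Set<String>> interfaces, double threshold) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, double[]> e : scores(interfaces).entrySet()) {
            if (e.getValue()[0] > threshold || e.getValue()[1] > threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

The index pays off when most interfaces share nothing with most others, which is the common case in large systems. A signature declared by very many interfaces still produces a large group, so the worst case remains quadratic.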

4. Empirical Findings in Large-Scale Java Systems

An empirical assessment was performed on three substantial Java applications (JBoss, Vuze, Hibernate), yielding the following aggregated results:

| System | avg IIS | avg IIC | avg IRIH |
|---|---|---|---|
| JBoss | 0.27 | 0.27 | ~0.25 |
| Vuze | 0.24 | 0.18 | ~0.30 |
| Hibernate | 0.17 | 0.16 | ~0.25 |

  • In each codebase, multiple interfaces exceeded the 0.5 thresholds for IIS and IIC. For example, JBoss contained a pair (HAManagementServiceMBean and MEJB, both with 15 methods) that were identical ($IIC = 1$).
  • Vuze exhibited several interfaces with significant overlap (IIS ≈ 0.75) differing only by packaging, each declaring similar methods such as getPieceNumber():int, getLength():int, and getOffset():int.
  • In Vuze, an interface with nine implementing classes had $IRIH = 0.55$, indicating that the majority of “implements” clauses were redundant.
  • The prevalence of such patterns, with system-wide IIS and IIC means approaching 0.2–0.3 and IRIH values of 0.25–0.30, highlights the widespread nature of SCI-related design debt.

5. Automated Detection and Refactoring Approaches

Automated detection of SCIs and related anomalies proceeds through the following stages:

  1. Extraction: Parse the Java AST to collect, for each interface $i$, its set of declared signatures $is(i)$ and its sub-hierarchy $subH(i)$.
  2. Metric Computation: For each interface pair, compute IS and IC; for each interface, compute IIS, IIC, and IRIH.
  3. Identification: Report all interfaces with $IIS > 0.5$, $IIC > 0.5$, or $IRIH > 0$ as candidates for refactoring.
  4. Assistance with Refactoring: For detected SCIs, suggest merging interfaces so that shared methods reside in a single contract and all client “implements” clauses are redirected accordingly. For clones, recommend replacing hard-copied interfaces with subtype relationships (e.g., interface i₂ extends i₁). For hierarchy redundancy, remove the superfluous “implements” declarations.
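The clone refactoring in the last step can be pictured with a before-and-after pair. All names here are hypothetical: `LegacyDocumentEditor` stands for a hard-copied interface, and `DocumentEditor` for the same contract after the copy is replaced by a subtype relationship. This is a sketch of the suggested direction only; it ignores the migration of existing clients.

```java
// Shared contract.
interface DocumentReader {
    String read();
    boolean isOpen();
}

// Before: both DocumentReader methods are hard-copied, so IC(DocumentReader, LegacyDocumentEditor) = 1.
interface LegacyDocumentEditor {
    String read();
    boolean isOpen();
    void write(String text);
}

// After: the clone becomes a subtype and declares only what it adds.
interface DocumentEditor extends DocumentReader {
    void write(String text);
}

// An implementer of the refactored interface can be used wherever a DocumentReader is expected.
final class InMemoryDocument implements DocumentEditor {
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public String read() {
        return buffer.toString();
    }

    @Override
    public boolean isOpen() {
        return true;
    }

    @Override
    public void write(String text) {
        buffer.append(text);
    }
}
```

After the change, `DocumentEditor` declares a single method of its own and shares no declared signatures with `DocumentReader`, so the pair no longer registers as a clone under a declared-methods reading of $is(i)$. Clients that only read can depend on `DocumentReader` and still accept an `InMemoryDocument`.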

The authors caution that interfaces serve as public APIs; thus, numerical thresholds are guidelines and should be tailored to domain or legacy constraints. Refactoring must honor semantic versioning and update all client references. The metrics operate solely on signature structure and do not consider method semantics: two methods with identical signatures but divergent intent will be conflated, necessitating manual review for possible false positives.

Staged remediation is advised: begin with high IIS/IIC/IRIH values in smaller hierarchies to mitigate backward-compatibility risks before addressing broader subsystems.

6. Significance and Limitations

By providing concrete, quantitative indices for SCI, interface clones, and hierarchy redundancy, these metrics offer a foundation for systematic auditing and prioritization of interface-level design debt. Because they operate at the level of interface signatures, they are especially suitable for integration into automated tooling and continuous integration pipelines.

Limitations include the exclusive reliance on method signatures (ignoring semantic equivalence or intent), the potential for false positives where overlapping signatures are intentional, and the requirement for context-sensitive threshold tuning. The authors highlight that interface evolution is backward-sensitive, and that careless application of automated merging or removal may disrupt client code, reinforcing the necessity for careful, context-aware adoption of these techniques.

A plausible implication is that while SCI and related anomalies are structural and tool-detectable, their resolution demands architectural foresight and developer intervention, particularly in public API evolution contexts.
