Graph Pattern Matching Preserving Label-Repetition Constraints (1804.04260v1)
Abstract: Graph pattern matching is a routine task in a wide variety of applications, such as social network analysis. It is typically defined in terms of subgraph isomorphism, which is NP-complete. To lower this complexity, many extensions of graph simulation have been proposed; each focuses on some topological constraints of pattern graphs that can be preserved in polynomial time over data graphs. In this paper, we discuss the satisfaction of a new topological constraint, called the Label-Repetition constraint. To the best of our knowledge, existing polynomial-time approaches fail to preserve this constraint, and the only way to preserve it is subgraph isomorphism, which is cost-prohibitive. We first present a necessary and sufficient condition that a data subgraph must satisfy to preserve the Label-Repetition constraints of the pattern graph. We then define matching based on a notion of triple simulation, an extension of graph simulation that takes the new topological constraint into account. We show that with this extension, graph pattern matching can be performed in polynomial time, and we provide an algorithm that does so. Our algorithm is sub-quadratic in the size of the data graph alone, and quartic in general. Finally, we show that our results can be combined with orthogonal approaches for more expressive graph pattern matching.
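
For readers unfamiliar with the baseline, the following is a minimal sketch of plain graph simulation, the polynomial-time relation that the abstract says is extended. It is not the triple simulation of the paper, which the abstract does not define. The function name `graph_simulation`, the adjacency-dictionary representation, and the toy graphs are all assumptions made for illustration. The example shows two pattern nodes with the same label being matched to a single data node, something subgraph isomorphism would reject; this is the kind of situation in which repeated labels may matter.

```python
from typing import Dict, Hashable, Set

# Hypothetical representation: node -> set of successor nodes.
Graph = Dict[Hashable, Set[Hashable]]
Labels = Dict[Hashable, str]


def graph_simulation(pattern: Graph, pattern_labels: Labels,
                     data: Graph, data_labels: Labels) -> Dict[Hashable, Set[Hashable]]:
    """Return the maximum simulation relation as pattern node -> data nodes.

    If some pattern node has no match, every set in the result is empty.
    This is a naive fixed-point refinement, not an optimized algorithm.
    """
    # Start from all label-compatible candidates.
    sim = {u: {v for v in data if data_labels[v] == pattern_labels[u]}
           for u in pattern}
    changed = True
    while changed:
        changed = False
        for u in pattern:
            for u_child in pattern[u]:
                # Keep v only if some successor of v can match u_child.
                keep = {v for v in sim[u] if data[v] & sim[u_child]}
                if keep != sim[u]:
                    sim[u] = keep
                    changed = True
    if any(not candidates for candidates in sim.values()):
        return {u: set() for u in pattern}
    return sim


# Pattern: one A node with two distinct B children (a repeated label).
pattern = {"a": {"b1", "b2"}, "b1": set(), "b2": set()}
pattern_labels = {"a": "A", "b1": "B", "b2": "B"}

# Data: x (label A) has a single B child y; z (label A) has no children.
data = {"x": {"y"}, "y": set(), "z": set()}
data_labels = {"x": "A", "y": "B", "z": "A"}

result = graph_simulation(pattern, pattern_labels, data, data_labels)
print(result)
```

In this toy run, `a` is matched to `x` even though `x` has only one child labeled B while `a` has two: both `b1` and `b2` are simulated by the same data node `y`. Plain simulation therefore does not distinguish how many same-labeled children a pattern node requires.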