
DLGNet: Hyperedge Classification through Directed Line Graphs for Chemical Reactions

Published 9 Oct 2024 in cs.LG and cs.AI | (2410.06969v1)

Abstract: Graphs and hypergraphs provide powerful abstractions for modeling interactions among a set of entities of interest and have attracted growing interest in the literature thanks to many successful applications in several fields. In particular, they are rapidly expanding in domains such as chemistry and biology, especially in the areas of drug discovery and molecule generation. One of the areas witnessing the fastest growth is the chemical reactions field, where chemical reactions can be naturally encoded as directed hyperedges of a hypergraph. In this paper, we address the chemical reaction classification problem by introducing the notion of a Directed Line Graph (DLG) associated with a given directed hypergraph. On top of it, we build the Directed Line Graph Network (DLGNet), the first spectral-based Graph Neural Network (GNN) expressly designed to operate on a hypergraph via its DLG transformation. The foundation of DLGNet is a novel Hermitian matrix, the Directed Line Graph Laplacian, which compactly encodes the directionality of the interactions taking place within the directed hyperedges of the hypergraph thanks to the DLG representation. The Directed Line Graph Laplacian enjoys many desirable properties, including admitting an eigenvalue decomposition and being positive semidefinite, which make it well-suited for adoption within a spectral-based GNN. Through extensive experiments on chemical reaction datasets, we show that DLGNet significantly outperforms the existing approaches, achieving on a collection of real-world datasets an average relative-percentage-difference improvement of 33.01%, with a maximum improvement of 37.71%.

Summary

  • The paper introduces DLGNet, a spectral Graph Neural Network designed for hyperedge classification using a novel Directed Line Graph representation.
  • DLGNet utilizes complex-valued edge weights within its Directed Line Graph to capture and leverage the directional information present in directed hypergraphs.
  • Experimental results demonstrate that DLGNet significantly outperforms existing methods across chemical reaction datasets, highlighting the value of modeling directionality.

The paper introduces Directed Line Graph Network (DLGNet), a spectral-based Graph Neural Network (GNN) designed for hyperedge classification in directed hypergraphs, with a specific application to chemical reaction classification.

The authors define the concept of a Directed Line Graph (DLG) associated with a directed hypergraph $\vec{H}$. In this $DLG(\vec{H})$, vertices represent the hyperedges of $\vec{H}$, and edges connect vertices if their corresponding hyperedges in $\vec{H}$ share at least one vertex. Complex-valued edge weights in $DLG(\vec{H})$ encode the directionality of interactions within $\vec{H}$.

Key contributions include:

  • A formal definition of a directed line graph associated with a directed hypergraph $\vec{H}$, denoted as $DLG(\vec{H})$.
  • The Directed Line Graph Laplacian $\mathbb{\vec{L}}_N$, a Hermitian matrix capturing both directed and undirected relationships between hyperedges in a directed hypergraph via its DLG. The paper proves that $\mathbb{\vec{L}}_N$ possesses desirable spectral properties, such as being positive semidefinite.
  • DLGNet, a spectral-based GNN designed to operate on directed line graphs by convolving hyperedge features.

The paper defines an undirected hypergraph as an ordered pair $H = (V, E)$, where $V$ is the set of vertices and $E$ is the set of hyperedges. The hyperedges' weights are stored in the diagonal matrix $W$, where $w_e$ is the weight of hyperedge $e \in E$. The vertex degree $d(v)$ and hyperedge degree $\delta(e)$ are defined as $d(v) := \sum_{e \in E:\, v \in e} w_e$ for $v \in V$, and $\delta(e) := |e|$ for $e \in E$, and are stored in the diagonal matrices $D_v$ and $D_e$. For 2-uniform hypergraphs (i.e., standard graphs), the adjacency matrix $A$ is defined such that $A_{uv} = w_{\{u,v\}}$ for each $\{u,v\} \in E$ and $A_{uv} = 0$ otherwise. A directed hypergraph $\vec{H} = (V, \vec{E})$ is a hypergraph in which each hyperedge $e \in \vec{E}$ is partitioned into a head set $H(e)$ and a tail set $T(e)$.
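As a concrete illustration of these definitions, the sketch below (all data hypothetical) builds the degree and weight matrices for a toy directed hypergraph with four vertices and two hyperedges:

```python
import numpy as np

# Toy directed hypergraph (hypothetical data): 4 vertices, 2 directed
# hyperedges, each split into a head set H(e) and a tail set T(e).
V = [0, 1, 2, 3]
heads = [{0, 1}, {2}]      # H(e0), H(e1)
tails = [{2}, {3}]         # T(e0), T(e1)
w = np.array([1.0, 2.0])   # hyperedge weights (diagonal of W)

# Vertex degree d(v): weighted number of hyperedges containing v.
d = np.array([sum(w[i] for i in range(len(w)) if v in heads[i] | tails[i])
              for v in V])
# Hyperedge degree delta(e): number of vertices in e, |H(e)| + |T(e)|.
delta = np.array([len(heads[i] | tails[i]) for i in range(len(w))])

D_v, D_e, W = np.diag(d), np.diag(delta), np.diag(w)
print(d)      # per-vertex degrees
print(delta)  # per-hyperedge degrees
```

Vertex 2 appears in both hyperedges (tail of the first, head of the second), so its degree accumulates both weights.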

The relationship between vertices and hyperedges in an undirected hypergraph $H$ is classically represented via an incidence matrix $B$ of size $|V| \times |E|$, where

$$B_{ve} = \begin{cases} 1 & \text{if } v \in e, \\ 0 & \text{otherwise}, \end{cases} \qquad v \in V,\; e \in E.$$

From $B$, the Signless Laplacian $Q$ and its normalized counterpart $Q_N$ are defined as

$$Q := B W B^{\top} \qquad \text{and} \qquad Q_N := D_v^{-1/2} B W D_e^{-1} B^{\top} D_v^{-1/2},$$

where $D_v$ and $D_e$ are the diagonal degree matrices defined above.

The Laplacian for a general undirected hypergraph is defined as:

$$L_N := I - Q_N = I - D_v^{-1/2} B W D_e^{-1} B^{\top} D_v^{-1/2}.$$
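A minimal numerical check of this construction, assuming the Zhou-style normalization $L_N = I - D_v^{-1/2} B W D_e^{-1} B^{\top} D_v^{-1/2}$ and a toy incidence matrix:

```python
import numpy as np

# Tiny undirected hypergraph (hypothetical): 4 vertices, 2 hyperedges.
B = np.array([[1, 0],
              [1, 0],
              [1, 1],
              [0, 1]], dtype=float)
w = np.array([1.0, 1.0])   # unit hyperedge weights

d = B @ w                  # vertex degrees
delta = B.sum(axis=0)      # hyperedge degrees

Dv_isqrt = np.diag(1.0 / np.sqrt(d))
L_N = np.eye(4) - Dv_isqrt @ B @ np.diag(w) @ np.diag(1.0 / delta) @ B.T @ Dv_isqrt

print(np.allclose(L_N, L_N.T))                    # symmetric
print(np.all(np.linalg.eigvalsh(L_N) >= -1e-12))  # positive semidefinite
```

Both checks should pass: the normalized hypergraph Laplacian is symmetric and positive semidefinite.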

Given a Laplacian matrix $L$ of a hypergraph $H$ that admits an eigenvalue decomposition $L = U \Lambda U^{*}$, where $U$ represents the eigenvectors, $U^{*}$ is its conjugate transpose, and $\Lambda$ is the diagonal matrix containing the eigenvalues, the convolution $x \ast y$ between $x$ and another graph signal $y$ is defined in the frequency space as $x \ast y := U\big((U^{*}x) \odot (U^{*}y)\big)$, where $\odot$ denotes the elementwise product.
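This frequency-space convolution can be sketched directly with NumPy; the small Hermitian matrix below stands in for the Laplacian and is purely illustrative:

```python
import numpy as np

# A small Hermitian PSD matrix standing in for the Laplacian L (illustrative).
L = np.array([[1.0, -0.5j],
              [0.5j, 1.0]])

# Eigendecomposition L = U diag(lam) U*; eigh is the right call for
# Hermitian matrices and returns real eigenvalues.
lam, U = np.linalg.eigh(L)

x = np.array([1.0 + 0j, 2.0 + 0j])   # two graph signals on the hyperedges
y = np.array([0.5 + 0j, -1.0 + 0j])

# Convolution in frequency space: transform both signals with U*, multiply
# elementwise, then transform back with U.
conv = U @ ((U.conj().T @ x) * (U.conj().T @ y))
print(conv)
```

Spectral GNNs avoid this explicit eigendecomposition in practice by approximating the filter with polynomials of the Laplacian.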

The adjacency matrix of $LG(H)$, the line graph of an undirected hypergraph $H$, is defined as:

$$A(LG(H)) := W B^{\top} B W - W D_e W,$$

where $B^{\top} B$ plays the role of the Signless Laplacian of $LG(H)$. The normalized Signless Laplacian $Q_N(LG(H))$ and the normalized Laplacian $L_N(LG(H))$ are defined as:

$$Q_N(LG(H)) := W^{1/2} D_e^{-1/2} B^{\top} D_v^{-1} B\, D_e^{-1/2} W^{1/2} \qquad \text{and} \qquad L_N(LG(H)) := I - Q_N(LG(H)).$$

The complex-valued incidence matrix $\vec{B}$ preserves the directionality of $\vec{H}$:

$$\vec{B}_{ve} := \begin{cases} 1 & \text{if } v \in H(e), \\ -\mathrm{i} & \text{if } v \in T(e), \\ 0 & \text{otherwise}, \end{cases} \qquad v \in V,\; e \in E.$$

The adjacency matrix of $DLG(\vec{H})$ is then:

$$A(DLG(\vec{H})) := W \vec{B}^{*} \vec{B} W - W D_e W.$$
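A sketch of the complex incidence matrix and the resulting DLG adjacency for a toy directed hypergraph (hypothetical data; unit hyperedge weights, and the trailing weight factor in the diagonal correction is an assumption):

```python
import numpy as np

# Complex incidence matrix for a toy directed hypergraph (hypothetical data):
# B[v, e] = 1 if v is in the head H(e), -i if v is in the tail T(e), else 0.
heads = [{0, 1}, {2}]
tails = [{2}, {3}]
n_v, n_e = 4, 2

B = np.zeros((n_v, n_e), dtype=complex)
for e in range(n_e):
    for v in heads[e]:
        B[v, e] = 1.0
    for v in tails[e]:
        B[v, e] = -1.0j

W = np.eye(n_e)                 # unit hyperedge weights, for simplicity
D_e = np.diag([3.0, 2.0])       # hyperedge degrees |H(e)| + |T(e)|

# DLG adjacency: W B* B W minus the weighted degree term, which zeroes the
# diagonal so only cross-hyperedge interactions remain.
A = W @ B.conj().T @ B @ W - W @ D_e @ W
print(A)  # the (0, 1) entry is purely imaginary: e0 and e1 share vertex 2
```

Because the two hyperedges share vertex 2 through a tail-to-head connection, the off-diagonal entry is purely imaginary, which is how directionality survives in the line graph.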

The normalized Signless Laplacian $\vec{Q}_N$ and the normalized Laplacian $\mathbb{\vec{L}}_N$ of $DLG(\vec{H})$ are:

$$\vec{Q}_N := W^{1/2} D_e^{-1/2}\, \vec{B}^{*} D_v^{-1} \vec{B}\, D_e^{-1/2} W^{1/2} \qquad \text{and} \qquad \mathbb{\vec{L}}_N := I - \vec{Q}_N.$$

The scalar form of $\mathbb{\vec{L}}_N$ for a pair of hyperedges $i, j \in E$ is:

$$\mathbb{\vec{L}}_N(ij) = \begin{cases} \displaystyle 1 - \sum_{u \in i}\frac{w_i}{d_u\,\delta_i} & i = j, \\[2ex] \displaystyle \left(-\sum_{\substack{u \in H(i) \cap H(j) \\ \vee\, u \in T(i) \cap T(j)}} \frac{\sqrt{w_i w_j}}{d_u} + \mathrm{i}\left(\sum_{u \in H(i) \cap T(j)} \frac{\sqrt{w_i w_j}}{d_u} - \sum_{u \in T(i) \cap H(j)} \frac{\sqrt{w_i w_j}}{d_u}\right)\right)\frac{1}{\sqrt{\delta_i}}\frac{1}{\sqrt{\delta_j}} & i \neq j. \end{cases}$$

For a signal $x = a + \mathrm{i}\,b \in \mathbb{C}^{n}$, the associated quadratic form is

$$x^{*}\mathbb{\vec{L}}_N\,x = \frac{1}{2}\sum_{u \in V}\frac{1}{d(u)}\sum_{i, j \in E}\sqrt{w(i)\,w(j)}\,\Bigg(\left(\left(\frac{a_i}{\sqrt{\delta(i)}} - \frac{a_j}{\sqrt{\delta(j)}}\right)^{2} + \left(\frac{b_i}{\sqrt{\delta(i)}} - \frac{b_j}{\sqrt{\delta(j)}}\right)^{2}\right)\mathbf{1}_{u \in H(i) \cap H(j) \,\vee\, u \in T(i) \cap T(j)} + \left(\left(\frac{a_i}{\sqrt{\delta(i)}} - \frac{b_j}{\sqrt{\delta(j)}}\right)^{2} + \left(\frac{a_j}{\sqrt{\delta(j)}} + \frac{b_i}{\sqrt{\delta(i)}}\right)^{2}\right)\mathbf{1}_{u \in H(i) \cap T(j)} + \left(\left(\frac{a_i}{\sqrt{\delta(i)}} + \frac{b_j}{\sqrt{\delta(j)}}\right)^{2} + \left(\frac{a_j}{\sqrt{\delta(j)}} - \frac{b_i}{\sqrt{\delta(i)}}\right)^{2}\right)\mathbf{1}_{u \in T(i) \cap H(j)}\Bigg).$$

Since every summand is nonnegative, the quadratic form is nonnegative for all $x$, which establishes that $\mathbb{\vec{L}}_N$ is positive semidefinite.
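The claimed spectral properties can be checked numerically; the sketch below assumes a normalization of the form $I - D_e^{-1/2}\vec{B}^{*} D_v^{-1} \vec{B}\, D_e^{-1/2}$ with unit hyperedge weights:

```python
import numpy as np

# Incidence matrix of a toy directed hypergraph (hypothetical data).
B = np.zeros((4, 2), dtype=complex)
for v in (0, 1):
    B[v, 0] = 1.0          # heads of e0
B[2, 0] = -1.0j            # tail of e0
B[2, 1] = 1.0              # head of e1
B[3, 1] = -1.0j            # tail of e1

d = np.sum(np.abs(B) ** 2, axis=1)           # vertex degrees (unit weights)
delta = np.sum(np.abs(B) ** 2, axis=0).real  # hyperedge degrees

Dv_inv = np.diag(1.0 / d)
De_isqrt = np.diag(1.0 / np.sqrt(delta))

# Normalized directed line graph Laplacian under the assumed normalization.
L = np.eye(2) - De_isqrt @ B.conj().T @ Dv_inv @ B @ De_isqrt

print(np.allclose(L, L.conj().T))               # Hermitian
print(np.all(np.linalg.eigvalsh(L) >= -1e-12))  # positive semidefinite
```

Both properties hold for this construction, matching the paper's claim that the Laplacian admits a real, nonnegative spectrum.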

The convolution operator of DLGNet is obtained by instantiating this frequency-space definition with $\mathbb{\vec{L}}_N$, approximating the filter with a low-order polynomial in $\mathbb{\vec{L}}_N$ so that no explicit eigendecomposition is required.

Given a $c$-dimensional graph signal on $\vec{H}$, the feature matrix for the vertices of $DLG(\vec{H})$ is derived from the feature matrix $X$ of the nodes of $\vec{H}$: since each vertex of $DLG(\vec{H})$ corresponds to a hyperedge of $\vec{H}$, its features are obtained by aggregating, through the incidence structure, the features of the nodes that the hyperedge contains.

The convolution is computed as:

$$X^{(\ell+1)} = \sigma\big(\hat{L}\, X^{(\ell)}\, \Theta^{(\ell)}\big),$$

where $\hat{L}$ is the filtering matrix derived from $\mathbb{\vec{L}}_N$, $\sigma$ is a complex ReLU activation function, and $\Theta^{(\ell)}$ are learnable parameters.
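A hedged sketch of one such layer (the first-order filter $I - \mathbb{\vec{L}}_N$ and the split-ReLU variant of the complex activation are assumptions, not the paper's exact choices):

```python
import numpy as np

def complex_relu(Z):
    # Assumed complex ReLU variant: apply ReLU to the real and imaginary
    # parts independently.
    return np.maximum(Z.real, 0) + 1j * np.maximum(Z.imag, 0)

def dlg_conv(L_N, X, Theta):
    # One convolution step: a first-order filter (I - L_N) acting on the
    # hyperedge features X, followed by a learnable mixing matrix Theta.
    n = L_N.shape[0]
    return complex_relu((np.eye(n) - L_N) @ X @ Theta)

rng = np.random.default_rng(0)
L_N = np.array([[0.5, 0.2j],
                [-0.2j, 0.5]])                    # toy Hermitian Laplacian
X = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
Theta = rng.standard_normal((3, 4))               # learnable parameters

H = dlg_conv(L_N, X, Theta)
print(H.shape)
```

Stacking several such layers and pooling the resulting hyperedge embeddings yields per-hyperedge class scores, which is how hyperedge (reaction) classification is performed.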

The paper presents experiments conducted on three real-world chemical reaction datasets: Dataset-1 (50K reactions from USPTO granted patents), Dataset-2 (5,300 reactions from five different sources), and Dataset-3 (649 competitive reactions extracted from [von2020thousands]). Node features are based on Morgan Fingerprints (MFs).

The results demonstrate that DLGNet outperforms existing methods, achieving an average relative percentage difference (RPD) improvement of 33.01% over the second-best method across the three real-world datasets. Specifically, DLGNet achieves the best improvement on Dataset-3, with an average RPD improvement of approximately 37.71% and an average additive improvement of 31.65 percentage points.

An ablation study demonstrates the importance of directionality, showing that DLGNet consistently outperforms its undirected counterpart.
