Multi-Frequency Graph Convolutional Network (MF-GCN)

Updated 3 July 2026

MF-GCN is a graph convolution architecture that employs adaptive low-, mid-, and high-frequency filters to capture diverse spectral features.
It uses learned subspace decompositions, adaptive polynomial filter banks, and gating mechanisms to preserve localized message passing and mitigate over-smoothing.
MF-GCN variants show improved performance on tasks like node classification, regression, depression detection, and EEG seizure detection with reduced model complexity.

Multi-Frequency Graph Convolutional Network (MF-GCN) denotes a class of graph convolutional architectures that depart from fixed low-pass message passing by modeling multiple spectral responses on graphs, typically through low-, mid-, and high-frequency filters, adaptive filter banks, or channel-specific frequency responses. In the literature, the designation covers both general-purpose graph representation models—such as BankGCN, AutoGCN, and AdaGNN—and task-specific systems, including tri-modal depression detection and frequency-aware EEG seizure detection (Gao et al., 2021, Wu et al., 2021, Dong et al., 2021, Rahman et al., 19 Nov 2025, Jibon et al., 31 Mar 2026). The common objective is to preserve graph-localized propagation while overcoming the restriction that conventional graph convolution is essentially low-pass, a restriction associated with loss of mid- and high-frequency information and, in deeper models, over-smoothing.

1. Conceptual scope and research lineage

Early 2021 work framed the central problem in explicitly spectral terms: most graph convolutional networks employ fixed low-pass filters, so potentially useful middle and high frequency components of graph signals are ignored, and the bandwidth of existing graph convolutional filters is fixed. BankGCN addressed this by decomposing multi-channel signals into learned subspaces and assigning a distinct adaptive filter to each subspace; AutoGCN introduced learnable low-, middle-, and high-pass branches with automatic bandwidth adjustment; AdaGNN introduced a trainable frequency-response filter spanning multiple layers and varying across feature channels (Gao et al., 2021, Wu et al., 2021, Dong et al., 2021).

The literature suggests that MF-GCN is better understood as a design principle than as a single canonical layer. In some formulations, “multi-frequency” means a bank of polynomial filters with different spectral responses; in others, it means explicit low/mid/high branches; in others still, it means per-channel adaptive responses that can preserve selected non-low-pass components over depth. Later application papers retained the name MF-GCN while instantiating it differently: the depression-detection model uses a two-channel low/high Multi-Frequency Filter Bank Module (MFFBM), whereas the seizure-detection model trains separate GCNs on five EEG frequency bands and a broadband condition (Rahman et al., 19 Nov 2025, Jibon et al., 31 Mar 2026).

2. Spectral foundations and message-passing interpretation

The shared mathematical starting point is the normalized graph Laplacian. For an undirected graph with adjacency matrix $A$ and degree matrix $D$, the symmetric normalized Laplacian is

$L = I - D^{-\frac12} A D^{-\frac12} = U \Lambda U^T,$

with eigenvalues in $[0,2]$. A graph signal $x$ is transformed to the graph Fourier domain by $\hat x = U^T x$, and spectral convolution with filter $g$ is

$x * g = U g(\Lambda) U^T x.$

In this setting, small eigenvalues correspond to smooth, low-frequency components, whereas large eigenvalues correspond to rapidly varying, high-frequency components (Wu et al., 2021, Dong et al., 2021).
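
The definitions above can be made concrete with a small numerical sketch. The graph, the signal, and the two filter responses ($1 - \lambda/2$ as a low-pass and $\lambda/2$ as a high-pass) are illustrative assumptions chosen for this example; they are not taken from any of the cited papers.

```python
import numpy as np

def normalized_laplacian(A):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    deg = A.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    return np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def spectral_filter(L, x, g):
    """Apply x * g = U g(Lambda) U^T x using an explicit eigendecomposition."""
    lam, U = np.linalg.eigh(L)          # L is symmetric, so eigh applies
    x_hat = U.T @ x                     # graph Fourier transform
    return U @ (g(lam) * x_hat)         # filter, then inverse transform

# Toy graph: a path on 4 nodes (hypothetical example data).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = normalized_laplacian(A)
x = np.array([1.0, -1.0, 1.0, -1.0])    # rapidly alternating signal

low = spectral_filter(L, x, lambda lam: 1.0 - lam / 2.0)   # illustrative low-pass
high = spectral_filter(L, x, lambda lam: lam / 2.0)        # illustrative high-pass
eigenvalues = np.linalg.eigvalsh(L)
```

Because the two illustrative responses sum to one, `low + high` reproduces `x`. For this alternating signal most of the energy ends up in `high`, which is the kind of content a purely low-pass layer would suppress.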

MF-GCN formulations preserve this spectral interpretation while avoiding explicit eigendecomposition. BankGCN applies a $K$-th order polynomial filter

$H_p(L)=\sum_{k=0}^K \alpha_p^{(k)} T_k(L)$

to each learned subspace $p$, where $T_k$ are Chebyshev or related polynomials; the resulting operator is equivalent to a local $K$-hop message-passing scheme in the spatial domain. AutoGCN pushes spectral filter families back onto $L$ so that the resulting kernels are polynomials in $L$ of order at most $2$, giving strict 1-hop or 2-hop locality. AdaGNN realizes its response through repeated application of

$X^{(l)} = X^{(l-1)} - L X^{(l-1)} \Phi_l, \qquad \Phi_l = \mathrm{diag}(\phi_{l,1}, \dots, \phi_{l,F}),$

where $X^{(l)}$ is the feature matrix after layer $l$ and $F$ is the number of feature channels, so that each channel $j$ acquires the spectral response

$f_j(\lambda) = \prod_{l} \left(1 - \phi_{l,j} \lambda\right).$

These constructions show that multi-frequency modeling is compatible with localized message passing rather than opposed to it (Gao et al., 2021, Wu et al., 2021, Dong et al., 2021).
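
The compatibility between polynomial spectral filters and localized message passing can be checked numerically. The sketch below assumes the common ChebNet convention of evaluating Chebyshev polynomials on the shifted Laplacian $L - I$ (so that the spectrum lies in $[-1, 1]$); the coefficient values and the graph are hypothetical, and the function name `chebyshev_filter` is introduced only for this illustration.

```python
import numpy as np

def normalized_laplacian(A):
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)     # assumes no isolated nodes
    return np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def chebyshev_filter(L, X, alpha):
    """Compute sum_k alpha[k] T_k(L - I) X with the three-term recurrence.

    Only matrix products with L are used, so no eigendecomposition is needed.
    """
    L_shift = L - np.eye(L.shape[0])    # spectrum moved from [0, 2] to [-1, 1]
    T_prev, T_curr = X, L_shift @ X     # T_0 X and T_1 X
    out = alpha[0] * T_prev
    if len(alpha) > 1:
        out = out + alpha[1] * T_curr
    for k in range(2, len(alpha)):
        T_next = 2.0 * (L_shift @ T_curr) - T_prev
        out = out + alpha[k] * T_next
        T_prev, T_curr = T_curr, T_next
    return out

# Path graph on 6 nodes and hypothetical coefficients for a K = 2 filter.
n = 6
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
L = normalized_laplacian(A)
alpha = np.array([0.5, -0.3, 0.2])

# Response to an impulse at node 0: nonzero only within K = 2 hops.
impulse = np.zeros(n)
impulse[0] = 1.0
spatial = chebyshev_filter(L, impulse, alpha)

# Same filter evaluated spectrally, for comparison only.
lam, U = np.linalg.eigh(L)
response = np.polynomial.chebyshev.chebval(lam - 1.0, alpha)
spectral = U @ (response * (U.T @ impulse))
```

The spatial and spectral results agree, and the impulse response vanishes at nodes more than two hops from node 0, which illustrates why an order-$K$ polynomial filter behaves as a $K$-hop message-passing operator.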

3. Representative MF-GCN formulations

The principal formulations differ in how they parameterize spectral diversity.

Formulation	Frequency mechanism	Distinctive property
BankGCN	Learned subspace decomposition plus adaptive polynomial filter bank	Joint learning of subspaces and diverse filters
AutoGCN	Parallel low-, mid-, and high-pass branches with learnable bandwidths	Complementary gating and over-parameterization
AdaGNN	Layer- and channel-specific adaptive response	Multi-frequency behavior across depth
Depression MF-GCN	Low/high MFFBM on a 3-node tri-modal graph	Cross-modal interaction and channel-wise fusion
EEG MF-GCN	Separate GCN per EEG band	Band-specific interpretability

BankGCN, introduced in “Message Passing in Graph Convolution Networks via Adaptive Filter Banks,” decomposes a multi-channel input signal into learned subspaces, then filters each subspace with its own polynomial response. A residual is added to stabilize high-pass filters, and the outputs are concatenated and passed through ReLU. The defining feature is that signal decomposition and the filter bank are jointly learned, with different subspaces specializing to different spectral bands (Gao et al., 2021).

AutoGCN, introduced in “Beyond Low-pass Filtering: Graph Convolutional Networks with Automatic Filtering,” explicitly constructs three filter families (low-pass, middle-pass, and high-pass), each with a learnable magnitude and a learnable bandwidth parameter. In spatial form, the three branches are followed by a complementary gating rule that mixes them. The model thus combines smoothing, sharpening, and medium-scale pattern extraction within a single layer (Wu et al., 2021).

AdaGNN, introduced in “AdaGNN: Graph Neural Networks with Adaptive Frequency Response Filter,” does not separate branches into named passbands. Instead, it equips each feature channel and layer with its own trainable smoothness coefficient, collected per layer in a diagonal matrix, making the overall filter response channel-specific. Observation 1 identifies GCN and GraphSAGE as special cases under particular choices of these coefficients and of the Laplacian normalization. Theoretical analysis further states that fixed low-pass filters force convergence toward the constant signal, whereas AdaGNN’s learned coefficients can avoid the zero-limit for selected frequencies and therefore mitigate over-smoothing (Dong et al., 2021).
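
A channel-specific response of this kind can be sketched as follows. The example assumes a layer update of the form $X \leftarrow X - L X \Phi_l$ with a diagonal coefficient matrix $\Phi_l$; this form, the random coefficients standing in for learned values, and the ring graph are assumptions made for illustration rather than a verified transcription of the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small ring graph (hypothetical example data); every node has degree 2.
n, channels, layers = 8, 3, 4
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.eye(n) - A / 2.0                  # normalized Laplacian of a 2-regular graph

X0 = rng.normal(size=(n, channels))
phi = rng.uniform(0.0, 1.0, size=(layers, channels))   # stand-in for learned values

# Layer-wise filtering: X <- X - L X diag(phi_l), one coefficient per channel.
X = X0.copy()
for l in range(layers):
    X = X - (L @ X) * phi[l]             # broadcasting scales column j by phi[l, j]

# Equivalent spectral view: channel j is scaled by prod_l (1 - phi[l, j] * lambda).
lam, U = np.linalg.eigh(L)
response = np.prod(1.0 - phi[:, None, :] * lam[None, :, None], axis=0)  # (n, channels)
X_spectral = U @ (response * (U.T @ X0))
```

Under these assumptions the layer-wise and spectral computations coincide, and each column of `response` is a different curve over the eigenvalues, so different channels can retain different parts of the spectrum.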

4. Optimization, regularization, and computational profile

BankGCN optimizes all parameters end-to-end with an objective that adds to the task loss a regularizer penalizing cosine similarity among the filter-coefficient vectors of different subspaces. The regularizer is intended to encourage filters in different subspaces to focus on different spectral bands without imposing perfect-reconstruction constraints. The per-layer parameter count depends on the number of subspaces, the output width, and the polynomial order $K$. The computation requires no eigendecomposition, and empirical comparisons indicate fewer parameters than ChebNet while remaining only marginally larger than GCN or GraphSAGE (Gao et al., 2021).
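
A diversity penalty of this type can be sketched in a few lines. The exact form used by BankGCN is not reproduced here; the version below, which averages the absolute cosine similarity over all distinct pairs of coefficient vectors, is one plausible assumption, and the function name `cosine_diversity_penalty` and the coefficient values are hypothetical.

```python
import numpy as np

def cosine_diversity_penalty(alpha, eps=1e-12):
    """Mean absolute cosine similarity over distinct pairs of coefficient vectors.

    alpha has shape (P, K + 1): one row of polynomial coefficients per subspace.
    """
    unit = alpha / (np.linalg.norm(alpha, axis=1, keepdims=True) + eps)
    cos = unit @ unit.T                      # (P, P) pairwise cosine similarities
    off_diag = ~np.eye(alpha.shape[0], dtype=bool)
    return np.abs(cos[off_diag]).mean()

# Hypothetical coefficient sets for P = 3 subspaces and order K = 2.
redundant = np.array([[1.0, 0.5, 0.0],
                      [2.0, 1.0, 0.0],
                      [0.5, 0.25, 0.0]])     # all rows are parallel
diverse = np.eye(3)                          # mutually orthogonal rows

penalty_redundant = cosine_diversity_penalty(redundant)
penalty_diverse = cosine_diversity_penalty(diverse)
```

Parallel coefficient vectors, which describe the same spectral shape up to scale, receive the maximum penalty, while mutually orthogonal vectors receive none. Adding such a term to the task loss pushes the filters toward different responses without forcing them to tile the spectrum exactly.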

AutoGCN uses two additional mechanisms beyond its branch design. First, it employs over-parameterization: instead of learning a single magnitude and bandwidth pair per branch, it fixes the bandwidth parameter at a set of equally spaced points and learns a nonnegative coefficient for each point. The cited theorems state that the resulting linear combination remains within the same low-, high-, or middle-pass family, while increasing the number of learnable parameters and stabilizing training. Second, complementary gating lets each branch weigh itself by the support of the other two. Because the kernels are at most second-order polynomials in $L$, each layer remains localized to 1-hop or 2-hop neighborhoods (Wu et al., 2021).

AdaGNN uses a sparse first-layer transformation and then repeated adaptive filtering without additional weights or nonlinearities in deeper layers, and it is trained with a regularized loss. This architecture directly targets the depth pathology of fixed low-pass GNNs: the theoretical discussion states that AdaGNN alleviates over-smoothing without explicit residuals or masking, precisely because the learned channel-wise frequency response need not annihilate all nontrivial eigencomponents as depth grows (Dong et al., 2021).
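
The depth argument can be illustrated by comparing stacked spectral responses. The sketch assumes each layer contributes a factor $(1 - \phi_l \lambda)$ to the response; the fixed coefficient $0.5$ and the decaying sequence $0.4 / l^2$ are hypothetical stand-ins chosen for the example, not values from the paper.

```python
import numpy as np

# Ring graph on 8 nodes (hypothetical example data); every node has degree 2.
n, depth = 8, 64
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
L = np.eye(n) - A / 2.0
lam = np.linalg.eigvalsh(L)                  # ascending, lam[0] is 0 up to rounding

def response_after(phi_per_layer):
    """Spectral response prod_l (1 - phi_l * lambda) of the stacked layers."""
    phi = np.asarray(phi_per_layer)
    return np.prod(1.0 - phi[:, None] * lam[None, :], axis=0)

fixed = response_after(np.full(depth, 0.5))                    # same low-pass in every layer
decaying = response_after(0.4 / np.arange(1, depth + 1) ** 2)  # stand-in for learned values

fixed_nonsmooth = np.abs(fixed[1:]).max()        # largest surviving gain for lambda > 0
decaying_nonsmooth = np.abs(decaying[1:]).min()  # smallest surviving gain for lambda > 0
```

With the same low-pass factor in every layer, the gain at every nonzero eigenvalue shrinks toward zero as depth grows, which is the over-smoothing regime. When the per-layer coefficients are free to vary, here by decaying with depth, the stacked response keeps a nonzero gain at all frequencies.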

5. Empirical performance on benchmark graph tasks

On generic graph benchmarks, BankGCN was evaluated on seven TU datasets for graph classification (ENZYMES, DD, PROTEINS, NCI1, NCI109, MUTAG, and FRANKENSTEIN), plus CIFAR-10 superpixel graphs and ogbg-molhiv. The reported protocol used stratified 80/10/10 splits, 20 random runs for TU datasets, four 64-channel hidden layers, mean/max concatenation for readout, Adam optimization, and early stopping. The reported outcome is that BankGCN outperforms GCN, GraphSAGE, GAT, GIN, and ChebNet in almost all settings, with gains in accuracy on the TU datasets and CIFAR-10 and in ROC-AUC on ogbg-molhiv. The model is also described as particularly robust in low-data regimes, with a smaller drop on CIFAR-10-1000 than the comparison methods (Gao et al., 2021).

AutoGCN was evaluated on node-classification tasks including PubMed, SBM-PATTERN, SBM-CLUSTER, Arxiv-year, YelpChi, and Squirrel, and on graph-level regression/classification tasks including ZINC, MNIST, and CIFAR-10. On PubMed with a random 60/20/20 split, accuracy is reported for GCN, ChebNet, and AutoGCN. On the non-homophilous Squirrel graph, pure low-pass methods are reported to collapse, whereas AutoGCN regains accuracy above that of GCN. On ZINC, solubility MAE is likewise reported for GCN, ChebNet, and AutoGCN. Ablation on SBM-CLUSTER and ZINC shows that removing any one branch, disabling over-parameterization, or removing gating degrades performance, and the learned effective bandwidths converge differently across branches during training (Wu et al., 2021).

AdaGNN was tested on BlogCatalog, Flickr, ACM, Cora, Citeseer, and Pubmed under semi-supervised node-classification protocols, with the number of layers varied from shallow to deep settings and extended further on ACM. The reported finding is that AdaGNN-R and AdaGNN-S consistently top the listed baselines (GCN, GraphSAGE, SGC, DropEdge-GCN, and PairNorm-GCN) and remain close to their shallow best even at depths where conventional GCN, GraphSAGE, and SGC collapse. Visualizations further report that fixed-filter SGC exhibits a single band-stop curve, whereas AdaGNN-S learns channel-specific curves that preserve mid- or high-frequency bands associated with classification gains (Dong et al., 2021).

6. Domain-specific variants, misconceptions, and open directions

The 2025 paper titled “MF-GCN: A Multi-Frequency Graph Convolutional Network for Tri-Modal Depression Detection Using Eye-Tracking, Facial, and Acoustic Features” uses MF-GCN for a 3-node fully connected undirected graph whose nodes correspond to audio, video, and gaze embeddings of equal dimension. Its MFFBM defines a low-pass branch and a high-pass branch and fuses their outputs.

After global average pooling across nodes, the cross-modal vector is concatenated with the three unimodal embeddings and passed to fully connected layers and softmax. On a trimodal dataset of 103 subjects with subject-wise 10-fold cross-validation, the paper reports sensitivity, specificity, F-score, and precision for both binary and three-class classification. It also reports results on CMDC, and its ablation states that removing the MFFBM degrades performance. The same source lists limitations: dataset size, only three modalities, hand-set hyperparameters, and shallow GCN depth (Rahman et al., 19 Nov 2025).
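
What low- and high-frequency content mean on a 3-node fully connected graph can be worked out directly. The sketch below is not the paper's MFFBM: the two filters are ideal projections chosen for illustration, and the 4-dimensional embeddings are hypothetical values.

```python
import numpy as np

# Fully connected undirected graph on 3 nodes: audio, video, gaze.
A = np.ones((3, 3)) - np.eye(3)
deg = A.sum(axis=1)                              # every node has degree 2
L = np.eye(3) - A / np.sqrt(np.outer(deg, deg))  # I - D^{-1/2} A D^{-1/2}
eigenvalues = np.linalg.eigvalsh(L)              # ascending order

# Hypothetical 4-dimensional modality embeddings, one row per node.
X = np.array([[1.0, 0.0, 2.0, 1.0],
              [1.0, 3.0, 2.0, 0.0],
              [1.0, 0.0, 2.0, 2.0]])

low = (np.eye(3) - L / 1.5) @ X    # keeps only the lambda = 0 component
high = (L / 1.5) @ X               # keeps only the lambda = 1.5 components
```

The spectrum of this graph contains only the eigenvalues $0$ and $1.5$ (the latter twice). The low-frequency part of each feature is therefore its average across the three modalities, and the high-frequency part is each modality's deviation from that average, which is one way to read the inter-modal contrasts that a high-pass branch can expose.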

The 2026 seizure-detection paper uses the same label, MF-GCN, in a different sense: separate GCNs are trained on EEG signals decomposed into five bands (delta, theta, alpha, lower beta, and higher beta), with 23 EEG channels as graph nodes and 11 statistical and spectral features per channel per 6-second segment. Accuracy and sensitivity are reported for each band, together with a broadband accuracy. Specificity drops in the higher-beta band, which the paper describes as nearly uninformative. Its stated contribution is interpretability: by isolating bands and respecting a 23-node montage graph defined by physical scalp adjacency, the model reveals which bands and channels carry ictal information (Jibon et al., 31 Mar 2026).

A recurrent misconception is that graph convolution is inherently low-pass or that multi-frequency processing requires explicit eigendecomposition. The cited work rejects both propositions in different ways: BankGCN learns diverse polynomial filters and subspace decompositions without eigendecomposition; AutoGCN preserves 1- or 2-hop spatial locality with polynomial kernels in $L$; AdaGNN implements adaptive spectral responses through repeated layerwise operators rather than explicit spectral transforms (Gao et al., 2021, Wu et al., 2021, Dong et al., 2021). Another misconception is that high-frequency content is uniformly noise. The empirical record is more conditional: AutoGCN reports gains on non-homophilous graphs, BankGCN learns filters emphasizing higher-frequency components, and the depression MF-GCN explicitly attributes complementary diagnostic cues to abrupt inter-modal contrasts, whereas the seizure paper finds higher-beta nearly uninformative in that particular application (Rahman et al., 19 Nov 2025, Jibon et al., 31 Mar 2026). Open directions explicitly proposed in the literature include relating multi-frequency $K$-hop schemes to Weisfeiler–Leman tests, extending universal filter-bank ideas to point-clouds, manifold data, and edge-attribute GCNs, learning channel-balance weights end-to-end in task-specific MFFBM designs, and exploring higher-order polynomial filter banks within those designs (Gao et al., 2021, Rahman et al., 19 Nov 2025).