HeMLGOP: Heterogeneous Generalized Perceptrons

Updated 11 May 2026

HeMLGOP is a neural architecture that extends classical MLPs by allowing each neuron to select its own transformation, pooling, and activation operators.
The progressive network construction strategy adapts both depth and width based on operator candidate performance, resulting in compact and efficient models.
Empirical results show that HeMLGOP achieves top accuracy with significantly smaller model sizes and faster training times compared to traditional deep neural networks.

Heterogeneous Multilayer Generalized Operational Perceptrons (HeMLGOP) define a neural network architecture that extends classical multilayer perceptrons (MLPs) by replacing the standard linear-threshold neuron with Generalized Operational Perceptrons (GOPs). In HeMLGOP, each neuron, regardless of its depth or layer, may independently select its synaptic transformation, dendritic pooling, and nonlinearity from predefined operator libraries. This neuron-level heterogeneity, coupled with a progressive algorithm for optimizing topology and operator selection, enables networks that are more compact and expressive than conventional deep neural models. HeMLGOP inherits foundational principles from Operational Neural Networks (ONNs), generalizing them to multilayer settings with automatic, data-driven architecture and operator adaptation (Tran et al., 2018, Kiranyaz et al., 2019).

1. The Generalized Operational Perceptron Formalism

Each GOP neuron in HeMLGOP generalizes the McCulloch–Pitts model by introducing three configurable stages: a nodal operator (synaptic transformation), a pooling operator (aggregation), and an activation function (nonlinearity). For neuron $i$ in layer $\ell+1$, with weights $w_{ki}^{(\ell+1)}$, bias $b_i^{(\ell+1)}$, nodal operator $\psi_i^{(\ell+1)} \in \Psi$, pooling operator $\rho_i^{(\ell+1)} \in P$, and activation $f_i^{(\ell+1)} \in F$, the computation is:

$z_{ki}^{(\ell+1)} = \psi^{(\ell+1)}_i(y^{(\ell)}_k,\, w_{ki}^{(\ell+1)})$
$x_i^{(\ell+1)} = \rho^{(\ell+1)}_i(z^{(\ell+1)}_{1i}, \dots, z^{(\ell+1)}_{N_\ell,i}) + b_i^{(\ell+1)}$
$y_i^{(\ell+1)} = f^{(\ell+1)}_i(x_i^{(\ell+1)})$

The libraries may include:

Nodal ($\Psi$): multiply, exp, sin, quadratic, Gaussian, difference-of-Gaussians, etc.
Pooling ($P$): sum, 1-correlation, max, etc.
Activation ($F$): sigmoid, tanh, ReLU, softplus, ELU, etc.

When all operators are set to standard choices (e.g., multiply/sum/sigmoid), the model reduces to an MLP neuron (Tran et al., 2018, Kiranyaz et al., 2019).
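The three-stage computation above can be sketched for a single neuron as follows. This is a minimal illustration, not the reference implementation: the dictionary names (`NODAL`, `POOL`, `ACT`), the function `gop_neuron`, and the exact formulas chosen for the `sin` and `quadratic` nodal operators are assumptions made for this example.

```python
import numpy as np

# Small illustrative operator libraries (assumed formulas, not the full sets).
NODAL = {
    "multiply": lambda y, w: w * y,
    "sin": lambda y, w: np.sin(w * y),
    "quadratic": lambda y, w: w * y ** 2,
}
POOL = {
    "sum": lambda z: z.sum(axis=0),
    "max": lambda z: z.max(axis=0),
}
ACT = {
    "sigmoid": lambda x: 1.0 / (1.0 + np.exp(-x)),
    "tanh": np.tanh,
    "relu": lambda x: np.maximum(x, 0.0),
}


def gop_neuron(y_prev, w, b, nodal, pool, act):
    """Forward pass of one GOP neuron.

    y_prev: outputs of the previous layer, shape (N_l,)
    w:      one weight per incoming connection, shape (N_l,)
    b:      scalar bias
    nodal, pool, act: keys into the operator libraries above
    """
    z = NODAL[nodal](y_prev, w)   # nodal operator, applied per connection
    x = POOL[pool](z) + b         # pooling over connections, then bias
    return ACT[act](x)            # activation
```

With `("multiply", "sum", "sigmoid")` the function computes `sigmoid(w @ y_prev + b)`, which is the standard MLP neuron; any other triple changes only the dictionary lookups, not the structure of the computation.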

2. Motivation and Representational Advantages

The classical MLP constrains every neuron to the same form: an affine transformation followed by a fixed nonlinearity. This uniformity limits diversity and modeling power, especially for highly nonlinear and multimodal tasks. Biological neurons exhibit diverse synaptic and dendritic transformations; standard MLPs do not reflect this heterogeneity. GOP-based architectures, by allowing each neuron to select its own operator triple $(\psi, \rho, f)$, drastically expand the class of nonlinear functions that can be represented, often with fewer neurons and layers. Fixing operator sets per layer or per network, as in conventional architectures, results in networks that are often oversized or under-optimized for real-world tasks. The capacity for per-neuron heterogeneity is a central driver of HeMLGOP's high representational efficiency (Tran et al., 2018).

3. Progressive, Neuron-Level Network Construction

HeMLGOP adopts a progressive, block-based search strategy to jointly optimize both depth (number of layers) and width (number of neurons per layer), as well as the set of operators for each new neuron. The learning process proceeds as follows:

Initialize with the input data and the corresponding targets, and set the layer index to its starting value.
Layer Growth: For each new layer, start with an initial block of GOPs.
Width Growth:
- For each candidate operator set $(\psi, \rho, f) \in \Psi \times P \times F$:
  - Randomly initialize candidate weights, compute the corresponding hidden outputs, and concatenate them with the existing activations.
  - Solve a linear regression (using the Moore-Penrose pseudoinverse or Tikhonov regularization) to fit the output.
  - Compute the candidate loss.
- Select the operator set achieving the lowest loss, and fine-tune the new neurons with backpropagation (BP) for a small number of epochs.
- Assess the relative improvement in loss; if it is below the width threshold, stop adding neurons to the current layer; otherwise, append the block and repeat.
Depth Growth: After width growth stops, compute the layer-level improvement. If it is below the depth threshold, halt; otherwise, add the next layer with its new inputs, and repeat.
Final Fine-tuning: Unfreeze all parameters and optionally perform full-network BP (Tran et al., 2018).

Pseudocode for the progression is explicitly detailed in (Tran et al., 2018), incorporating randomized evaluation, batch normalization, regularization, and stopping criteria for compactness.
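The control flow of the width and depth growth can be sketched as below. This is a simplified skeleton under stated assumptions, not the published pseudocode: the names `grow_network`, `relative_improvement`, `evaluate`, `eps_width`, and `eps_depth` are hypothetical, the relative improvement is assumed to be the fractional reduction in loss, and the BP fine-tuning of each accepted block is omitted. The caller supplies `evaluate`, which returns the loss obtained if a block with the given operator set were appended.

```python
def relative_improvement(prev_loss, new_loss):
    # Assumed definition: fractional reduction of the loss.
    return (prev_loss - new_loss) / prev_loss


def grow_network(candidates, evaluate, initial_loss, eps_width, eps_depth,
                 max_blocks=8, max_layers=4):
    """Progressively grow layers (depth) and blocks per layer (width).

    candidates:   iterable of operator sets, e.g. (nodal, pool, act) tuples
    evaluate:     callable (layers, current_layer, candidate) -> loss
    initial_loss: loss before any block has been added
    Returns the chosen operator sets per layer and the final loss.
    """
    layers = []
    loss = initial_loss
    for _ in range(max_layers):
        layer_start_loss = loss
        layer = []
        for _ in range(max_blocks):
            # Score every candidate operator set for the next block.
            scored = [(evaluate(layers, layer, c), c) for c in candidates]
            best_loss, best = min(scored, key=lambda t: t[0])
            # Width stopping rule: too little gain from one more block.
            if relative_improvement(loss, best_loss) < eps_width:
                break
            layer.append(best)
            loss = best_loss
        if not layer:
            break  # no block was worth adding to this layer
        layers.append(layer)
        # Depth stopping rule: too little gain from the whole layer.
        if relative_improvement(layer_start_loss, loss) < eps_depth:
            break
    return layers, loss
```

Keeping the scoring function outside the loop separates the stopping logic from how candidates are evaluated, so the same skeleton works with a cheap randomized evaluation or with a more expensive trained one.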

4. Training Methodology and Operator Search

Operator and weight selection is divided into two cooperating stages, followed by an optional third:

Randomized Network (RN) Evaluation: For each candidate operator triple, assign random weights, normalize hidden outputs, and solve for closed-form output weights. This enables rapid, low-cost evaluation across many operator candidates before committing to gradient-based optimization.
Block Fine-Tuning: Only the most recent block of selected neurons undergoes BP-based optimization, with prior layers fixed, during the progressive growth. This implicitly regularizes new parameters against prior structure.
Full-Network Optimization: Optionally, a final global fine-tuning is performed with all weights unfrozen. Batch normalization is essential, as concatenation of heterogeneous neuron blocks can skew activation statistics (Tran et al., 2018, Kiranyaz et al., 2019).
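The randomized evaluation stage can be illustrated with a short NumPy sketch. It is a rough approximation under several assumptions: the operator libraries are reduced to two entries each, random weights are drawn uniformly from $[-1, 1]$, normalization is plain per-feature standardization rather than Batch Normalization, and the function names (`random_block`, `ridge_fit`, `rn_rank`) and default values are hypothetical.

```python
import numpy as np

NODAL = {"multiply": lambda y, w: y * w, "sin": lambda y, w: np.sin(y * w)}
POOL = {"sum": lambda z: z.sum(axis=1), "max": lambda z: z.max(axis=1)}
ACT = {"relu": lambda x: np.maximum(x, 0.0), "tanh": np.tanh}


def random_block(X, ops, n_neurons, rng):
    """Hidden outputs of a block of GOP neurons with random weights."""
    nodal, pool, act = ops
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_neurons))
    z = NODAL[nodal](X[:, :, None], W[None, :, :])  # (samples, inputs, neurons)
    h = ACT[act](POOL[pool](z))                     # (samples, neurons)
    # Standardize each hidden feature (stand-in for batch normalization).
    return (h - h.mean(axis=0)) / (h.std(axis=0) + 1e-8)


def ridge_fit(H, Y, lam):
    """Closed-form output weights with Tikhonov (ridge) regularization."""
    A = H.T @ H + lam * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ Y)


def rn_rank(X, Y, H_prev, candidates, n_neurons=8, lam=1e-2, seed=0):
    """Rank candidate operator sets by mean squared error, lowest first.

    H_prev holds the activations already in the layer; it may include a
    constant column so that the linear readout has an intercept.
    """
    rng = np.random.default_rng(seed)
    results = []
    for ops in candidates:
        H = np.hstack([H_prev, random_block(X, ops, n_neurons, rng)])
        beta = ridge_fit(H, Y, lam)
        loss = float(np.mean((H @ beta - Y) ** 2))
        results.append((loss, ops))
    return sorted(results, key=lambda t: t[0])
```

Because only the linear readout is solved for, each candidate costs one small linear system rather than a full training run, which is what makes screening many operator combinations affordable before any gradient-based fine-tuning.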

Regularization is implemented using weight decay or weight-norm constraints, with dropout rates between 0.1 and 0.5. Loss functions typically include mean squared error or negative log-likelihood, with the same protocol applied for classification (Tran et al., 2018).

5. Empirical Performance and Architectural Properties

HeMLGOP demonstrates state-of-the-art or near state-of-the-art results on 11 real-world classification tasks, with datasets ranging from 768 to 60,000 samples, input dimensions from 8 to 512, and class counts from 2 to 500. The compared methods include the Progressive Operational Perceptron (POP), progressive MLP (PMLP), Progressive Learning Network (PLN), and Broad Learning System (BLS), as well as HeMLGOP variants (e.g., HoMLRN, HeMLRN, HoMLGOP).

Key outcomes:

HeMLGOP typically achieves top accuracy.
Model sizes are $3\times$–$50\times$ smaller than POP/PMLP and more compact than PLN/BLS.
Training is faster than POP and competitive with other variants.
Inference FLOPs are among the lowest measured, facilitating efficient deployment.
Operator selection in heterogeneous layers is highly diverse; common choices are $\psi =$ multiply, $\rho =$ sum, and $f =$ ReLU/ELU (see operator distribution analysis in Fig. 2 of (Tran et al., 2018)).

The progressive stopping criteria, which are based on loss improvement, limit overgrowth without stopping so early that the network underfits, producing networks that are both computationally and parametrically efficient.

Method	Model Size	Accuracy	Training Time
HeMLGOP	3–50× smaller than POP	Top/near-top	Faster than POP
POP/PMLP	Baseline	Varies	Reference
PLN/BLS	Larger, slower	Varies	Slower/more complex

6. Relation to Operational Neural Networks and Theoretical Context

HeMLGOP is a direct generalization of the Operational Neural Networks (ONNs) framework, which introduced the notion of heterogeneity at a neuron and layer level by allowing arbitrary choices of nodal, pooling, and activation operators. The ONN neuron is mathematically equivalent to the GOP neuron; the two differ only in architectural application (vanilla feedforward vs. progressive multilayer search). Both approaches demand specialized backpropagation through arbitrary operator chains.
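To see why backpropagation must be specialized, consider the gradient of a loss $L$ with respect to a single weight, obtained by applying the chain rule to the three-stage neuron computation of Section 1. This is a generic derivation under the assumption that all three operators are (sub)differentiable; it is not taken from either reference.

$\dfrac{\partial L}{\partial w_{ki}^{(\ell+1)}} = \dfrac{\partial L}{\partial y_i^{(\ell+1)}} \cdot f_i'^{(\ell+1)}\big(x_i^{(\ell+1)}\big) \cdot \dfrac{\partial \rho_i^{(\ell+1)}}{\partial z_{ki}^{(\ell+1)}} \cdot \dfrac{\partial \psi_i^{(\ell+1)}}{\partial w_{ki}^{(\ell+1)}}$

For sum pooling, $\partial \rho / \partial z_{ki} = 1$, and for the multiply nodal operator, $\partial \psi / \partial w_{ki} = y_k^{(\ell)}$, which recovers the familiar MLP gradient. For max pooling, $\partial \rho / \partial z_{ki}$ is 1 for the connection $k$ attaining the maximum and 0 otherwise, so only one incoming weight receives a gradient per neuron and sample.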

HeMLGOP extends these principles by:

Progressively searching over architecture and operator candidates at fine granularity (per neuron).
Employing RN-based operator evaluation, enabling efficient exploration of expressive building blocks.
Simultaneously optimizing network width and depth via explicit stopping rules based on loss improvement.

Operational heterogeneity is shown to amplify representational power per parameter and enable compact networks adaptive to modality-specific nonlinearities (e.g., in vision tasks). Notably, ONNs matched or outperformed CNNs of similar size on vision benchmarks, indicating that the core operator-diversity principle is instrumental even outside multilayer progressive architectures (Kiranyaz et al., 2019).

7. Implementation Considerations and Practical Guidelines

Practitioners are advised to:

Begin with modest initial block sizes and increments, tuning the width and depth stopping thresholds to balance compactness against accuracy.
Design operator libraries that span relevant linear and nonlinear relations for the target domain.
Rely on RN evaluation to filter operator sets efficiently, shifting to backpropagation fine-tuning only for promising candidates.
Always use normalization (e.g., Batch Normalization) when joining heterogeneous structures.
Employ short final full-network fine-tuning as necessary, reserving further epochs only where overfitting is controlled by validation.

This approach unifies structural flexibility (depth/width), algorithmic efficiency (progressive/RN evaluation), and functional diversity (per-neuron operator selection), yielding compact, high-performance neural networks with broad applicability (Tran et al., 2018).


References

Tran et al. (2018). Heterogeneous Multilayer Generalized Operational Perceptron.

Kiranyaz et al. (2019). Operational Neural Networks.
