Gabliterated-v1 Model Series
- The Gabliterated-v1 Model Series is a dual-approach framework that integrates VOneNet ensembles for visual robustness and adaptive weight projections in LLMs for targeted behavioral control.
- The vision branch ensembles eight VOneNet variants using diverse Gabor filter parameters to achieve up to a 38% improvement in robustness against image corruptions.
- The language branch employs adaptive multi-directional projections to reduce refusal rates by nearly 0.87 while preserving overall model accuracy.
The Gabliterated-v1 Model Series comprises two distinct lines of research spanning visual robustness in convolutional neural networks and targeted behavioral modification in LLMs. In both, "Gabliteration" refers to the systematic combination or adaptation of system components, either biologically inspired front-end neuron models or the model's weight matrices, to achieve selective improvements (robustness or refusal suppression) while minimizing undesirable side effects on general task performance (Gülmez, 21 Dec 2025, Baidya et al., 2021). Both approaches rely on rigorous empirical benchmarks and precise mathematical formalisms, and both focus on preserving overall model capability.
1. Model Series Overview
The term "Gabliterated-v1" designates:
- In vision, ensembles of convolutional networks with V1-inspired front-end variants (VOneNets), where each variant models distinct aspects of primate V1 (Table 1). Their ensemble increases out-of-distribution robustness (Baidya et al., 2021).
- In language modeling, a family of transformer-based LLMs, modified via the Gabliteration method, enabling selective suppression of specific behaviors (e.g., refusals) with minimal collateral loss in language understanding or generation capabilities. Model sizes range from 0.6B to 4B parameters, with dense transformer architectures identical to their Qwen2.5 or Llama3 baselines (Gülmez, 21 Dec 2025).
Table 1: Gabliterated-v1 Model Variants and Key Parameters
| Series | Key Variant(s) | Parameterization |
|---|---|---|
| Vision (CNN/V1) | 8 VOneBlock front-ends | Spatial-frequency bands, nonlinearity, noise |
| Language (LLM) | 0.6B, 1.2B, 2B, 4B variants | d=2048–5120, 24–48 layers |
Variant checkpoints and code are available at: https://huggingface.co/Goekdeniz-Guelmez/gabliterated-v1
2. Methodological Foundations
2.1. Vision: VOneBlock Variant Ensembling
Gabliterated-v1 for vision ensembles eight VOneNet models, each differing in Gabor spatial-frequency band coverage, simple/complex channel ratio, and stochasticity (Poisson noise parameters). Each input image is processed by a linear Gabor filter bank, followed by canonical simple- or complex-cell nonlinearities, and optionally corrupted by channel-wise Poisson-like noise:
- Standard: the default VOneNet spatial-frequency coverage (in cycles per degree), $256/256$ simple/complex channels, Poisson noise
- Others: variations across low/mid/high spatial-frequency bands, purely simple or purely complex channels, and low or no noise (Baidya et al., 2021)
Logits from each variant are averaged to produce classification predictions: $\hat{y}(x) = \arg\max_c \frac{1}{8}\sum_{i=1}^{8} z^{(i)}_c(x)$, where $z^{(i)}(x)$ denotes the logit vector produced by the $i$-th variant.
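A minimal sketch of this logit averaging, assuming `variants` holds eight trained VOneNet classifiers built with the code of Dapello et al. / Baidya et al. (model construction and loading are not shown):

```python
import torch

@torch.no_grad()
def ensemble_predict(variants, images):
    """Average logits over the VOneNet variants and return class predictions."""
    logits = torch.stack([model(images) for model in variants])  # (n_variants, B, C)
    mean_logits = logits.mean(dim=0)                             # (B, C)
    return mean_logits.argmax(dim=-1)                            # (B,)
```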
2.2. Language: Adaptive Multi-Directional Projection (Gabliteration)
For LLMs, Gabliteration implements an adaptive multi-directional projection update on weight matrices $W_\ell$, targeting layers selected by maximal behavior-feature separability. For each selected layer $\ell$, the top-$k$ singular vectors of the hidden-state difference matrix are extracted, forming $U_k \in \mathbb{R}^{d \times k}$, and a ridge-regularized projector is constructed:
$$P_\lambda = U_k \left(U_k^\top U_k + \lambda I_k\right)^{-1} U_k^\top$$
Model weights are updated:
$$W_\ell \leftarrow W_\ell - \alpha_\ell \, P_\lambda W_\ell$$
The layer-specific scaling $\alpha_\ell$ is set adaptively based on the layer's position in the network and the β hyperparameter.
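A minimal sketch of this update for a single layer, assuming the hidden-state difference matrix `D` has already been collected; the helper name `gabliterate_weight` and the per-layer scale `alpha` passed in are illustrative rather than the reference implementation:

```python
import torch

def gabliterate_weight(W: torch.Tensor, D: torch.Tensor, k: int,
                       lam: float, alpha: float) -> torch.Tensor:
    """Remove a ridge-regularized projection of the top-k behavior directions from W.

    W: (d, d_in) weight matrix of a selected layer (outputs live in the hidden space)
    D: (n, d) hidden-state difference matrix for the targeted behavior
    """
    # Top-k right singular vectors of D span the behavior subspace.
    _, _, Vh = torch.linalg.svd(D, full_matrices=False)
    U = Vh[:k].T                                                  # (d, k)
    # Ridge-regularized projector onto span(U).
    eye = torch.eye(k, dtype=D.dtype, device=D.device)
    P = U @ torch.linalg.inv(U.T @ U + lam * eye) @ U.T           # (d, d)
    # Scaled removal of the behavior subspace from the layer's weights.
    return W - alpha * (P @ W)
```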
3. Layer Selection, Scaling, and Theoretical Guarantees
Regularized layer selection employs a two-step process:
- Compute a behavior-feature separability score for each candidate layer
- Effective layers are those for which a temporary modification does not excessively increase the refusal rate; only these layers are updated
Scaling factors $\alpha_\ell$ are adapted according to each selected layer's normalized position within the network, modulated by the β hyperparameter.
Theoretical analysis shows that when the target and refusal subspaces are nearly orthogonal (principal angle close to $90^\circ$), interference with general capabilities is negligible. The regularization strength λ controls how closely the applied projector $P_\lambda$ approximates the unregularized projection onto the extracted directions, with an approximation error that shrinks as λ decreases.
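A schematic sketch of the two-step selection and position-based scaling described above; the linear position schedule and the precomputed `separability` and `refusal_increase` scores are illustrative stand-ins for the paper's exact criteria:

```python
def select_and_scale_layers(num_layers, separability, refusal_increase,
                            top_n, beta, tolerance):
    """Pick effective layers and assign each a position-dependent scaling factor.

    separability: dict layer_index -> behavior-feature separability score
    refusal_increase: dict layer_index -> refusal-rate change under a temporary edit
    """
    # Step 1: rank candidate layers by behavior-feature separability.
    ranked = sorted(range(num_layers), key=lambda i: separability[i], reverse=True)[:top_n]
    # Step 2: keep only layers whose temporary modification does not
    # excessively increase the refusal rate.
    selected = [i for i in ranked if refusal_increase[i] <= tolerance]
    # Position-based scaling modulated by beta (hypothetical linear schedule).
    return {i: beta * (i / max(num_layers - 1, 1)) for i in selected}
```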
4. Empirical Performance and Analysis
4.1. Vision
On Tiny ImageNet-C, the Variants Ensemble achieves:
- Clean accuracy: on par with or better than the baseline ResNet18
- Robustness to corruptions: 38% mean relative improvement across all 75 corruption sets versus ResNet18 baseline (Baidya et al., 2021)
- Control ensembles (seeds, data augmentation) yield smaller gains
Knowledge distillation from the ensemble into a single no-noise VOneNet variant compresses much of the gain, retaining a substantial fraction of the relative corruption-accuracy improvement with minimal sacrifice in clean performance.
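A generic distillation objective of the kind used to compress the ensemble into a single student, sketched under standard temperature-scaled KD assumptions (the temperature and loss weighting below are not values from the paper):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, w=0.5):
    """Blend a soft-target KL term (ensemble teacher -> student) with the usual CE loss."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return w * soft + (1 - w) * hard
```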
4.2. Language
Across 0.6B–7B models, standard Gabliteration settings result in:
- Mean refusal-rate reduction of nearly 0.87
- Minimal drop in MMLU accuracy
Ablation shows that SVD-pairing matches Fisher LDA at 60% lower computational cost. Stronger orthogonalization yields further refusal reduction but an unacceptable loss in MMLU accuracy.
5. Hyperparameterization, Usage, and Limitations
5.1. Hyperparameter Recommendations
| Model Size | Recommended Hyperparameters |
|---|---|
| <3B | (1, 0.2, 0.05, 0.8, 0.5) |
| 3–7B | (2, 0.3, 0.1, 0.8, 0.5) |
| >7B | (3, 0.4, 0.15, 0.8, 0.6) |
A grid search over these hyperparameters is recommended to balance refusal suppression and general performance.
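A simple way to run such a search; the parameter names placed in `grid`, the `evaluate` callback, and the MMLU-drop tolerance are hypothetical placeholders rather than the published procedure:

```python
from itertools import product

def grid_search(evaluate, grid, baseline_mmlu, max_mmlu_drop=0.02):
    """Return the configuration with the lowest refusal rate among those
    that keep the MMLU drop within tolerance."""
    best, best_refusal = None, float("inf")
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        refusal, mmlu = evaluate(**params)   # evaluate() is user-supplied
        if (baseline_mmlu - mmlu) <= max_mmlu_drop and refusal < best_refusal:
            best, best_refusal = params, refusal
    return best
```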
5.2. Implementation and Practical Details
For the LLM series, standard Hugging Face workflows are supported (PyTorch/Transformers). Model cards provide all hyperparameter defaults and dataset links for reproducibility. For the vision suite, code and evaluation protocols follow those from Dapello et al. and Baidya et al. (Gülmez, 21 Dec 2025, Baidya et al., 2021).
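A minimal usage sketch with standard Transformers APIs; the checkpoint identifier below is a placeholder for one of the variants listed on the collection page above, not a verified model id:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Goekdeniz-Guelmez/gabliterated-v1"  # placeholder: pick a specific variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Summarize the idea behind ridge-regularized projections."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```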
6. Limitations and Open Directions
- For LLMs, computational overhead grows rapidly with model size, making Gabliteration less tractable for >30B-parameter models.
- Hyperparameter sensitivity remains an open problem; robust automated tuning routines are not established.
- The single-pass update regime may underfit complex behavioral patterns, suggesting iterative extensions as plausible future work.
- Projection regularization assumes the extracted direction matrix is well-conditioned; rank deficiency or increased noise can force a larger λ, diminishing effectiveness.
- The current scope is limited to text-generation in LLMs; multimodal or reinforcement-learning extensions are untested.
- In vision, the ensemble and distillation operations themselves have little biological plausibility, but their combined effect demonstrates the value of simulating multiple V1-like circuits for robustness.
Overall, the Gabliterated-v1 model family exemplifies how systematically combining or modifying model subcomponents, grounded in mathematical regularization and empirical metric monitoring, can yield targeted improvements in system behavior and robustness with minimal trade-offs in overall performance (Gülmez, 21 Dec 2025, Baidya et al., 2021).