MIMONets: Multiple-Input-Multiple-Output Neural Networks Exploiting Computation in Superposition (2312.02829v1)

Published 5 Dec 2023 in cs.LG, cs.AI, and stat.ML

Abstract: With the advent of deep learning, progressively larger neural networks have been designed to solve complex tasks. We take advantage of these capacity-rich models to lower the cost of inference by exploiting computation in superposition. To reduce the computational burden per input, we propose Multiple-Input-Multiple-Output Neural Networks (MIMONets) capable of handling many inputs at once. MIMONets augment various deep neural network architectures with variable binding mechanisms to represent an arbitrary number of inputs in a compositional data structure via fixed-width distributed representations. Accordingly, MIMONets adapt nonlinear neural transformations to process the data structure holistically, leading to a speedup nearly proportional to the number of superposed input items in the data structure. After processing in superposition, an unbinding mechanism recovers each transformed input of interest. MIMONets also provide a dynamic trade-off between accuracy and throughput by an instantaneous on-demand switching between a set of accuracy-throughput operating points, yet within a single set of fixed parameters. We apply the concept of MIMONets to both CNN and Transformer architectures resulting in MIMOConv and MIMOFormer, respectively. Empirical evaluations show that MIMOConv achieves about 2-4× speedup at an accuracy delta within [+0.68, -3.18]% compared to WideResNet CNNs on CIFAR10 and CIFAR100. Similarly, MIMOFormer can handle 2-4 inputs at once while maintaining a high average accuracy within a [-1.07, -3.43]% delta on the long range arena benchmark. Finally, we provide mathematical bounds on the interference between superposition channels in MIMOFormer. Our code is available at https://github.com/IBM/multiple-input-multiple-output-nets.

Citations (8)

Summary

  • The paper demonstrates that MIMONets reduce computational overhead by processing multiple inputs simultaneously while maintaining competitive accuracy.
  • It employs variable binding with fixed-width representations and high-dimensional keys to encode inputs into a compositional data structure.
  • Empirical evaluations on CNN and Transformer variants show significant speed-ups with only marginal accuracy trade-offs on benchmark tasks.

MIMONets: Enhancing Efficiency of Neural Networks

Introduction

Deep learning models have grown progressively larger in pursuit of state-of-the-art performance, driving up computational cost. This poses a challenge for efficient deployment, especially at inference time, where per-input cost dominates. To address this, the paper introduces Multiple-Input-Multiple-Output Neural Networks (MIMONets), which reduce the computational burden per input by processing multiple inputs concurrently, thereby using the capacity of large networks more effectively.

Variable Binding in Neural Networks

MIMONets augment existing neural network structures by incorporating variable binding mechanisms. These mechanisms utilize fixed-width distributed representations to encode numerous inputs into a compositional data structure. Such a structure can undergo holistic nonlinear transformations within the network, effectively allowing multiple inputs to be processed in a single pass. The inputs are protected by high-dimensional keys that enable them to occupy quasi-orthogonal subspaces, thus reducing cross-input interference during superposition. After the network processes the combined structure, an unbinding mechanism recovers each input's transformed state.
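
To make the binding scheme concrete, the following is a minimal sketch, assuming random bipolar keys and plain NumPy vectors; in MIMONets the binding and unbinding functions are part of the trained network rather than fixed operations, so everything here is illustrative.

```python
# Minimal sketch of variable binding in superposition with random bipolar keys.
# Assumption: fixed +/-1 keys stand in for the paper's learned binding layers.
import numpy as np

rng = np.random.default_rng(0)
d = 4096        # width of the fixed-width distributed representation
n_inputs = 4    # number of items superposed in one data structure

# Random bipolar keys are quasi-orthogonal in high dimensions.
keys = rng.choice([-1.0, 1.0], size=(n_inputs, d))
inputs = rng.standard_normal((n_inputs, d))

# Bind each input to its key (elementwise product), then superpose by summing:
# the result is a single d-wide vector regardless of n_inputs.
superposition = np.sum(keys * inputs, axis=0)

# ... a MIMONet would now transform `superposition` holistically ...

# Unbind: multiplying by key i recovers input i plus cross-channel noise,
# because k_i * k_i = 1 elementwise while k_i * k_j acts as a fresh random key.
recovered = keys[0] * superposition
print(np.corrcoef(recovered, inputs[0])[0, 1])  # ~0.5 for 4 superposed items
```

In this simple scheme the cross terms behave as quasi-orthogonal noise rather than systematic corruption; bounding this interference for attention layers is exactly what the paper does for MIMOFormer.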

Dynamic Trade-off between Accuracy and Throughput

MIMONets offer a dynamic trade-off between model accuracy and throughput. The number of inputs superposed per forward pass can be switched on demand at inference time, moving between accuracy-throughput operating points within a single set of fixed parameters. This lets the network adapt to fluctuating computational load without loading different weights, a capability that is especially valuable in real-time applications.
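
A hypothetical serving-side sketch of this switching follows: the same parameters and keys are reused, and only the group size k (how many inputs share one forward pass) changes per request. The names `bind`, `unbind`, `infer`, and the identity `model` are illustrative stand-ins, not the paper's API.

```python
# Hypothetical sketch of on-demand accuracy/throughput switching.
# Assumption: bipolar-key binding as in the previous sketch; `model` is a
# placeholder for a network trained to process superpositions.
import numpy as np

def bind(group, keys):
    """Superpose up to len(keys) inputs into one fixed-width vector."""
    return np.sum(keys[: len(group)] * np.asarray(group), axis=0)

def unbind(output, keys, k):
    """Recover the k transformed items from the superposed output."""
    return [keys[i] * output for i in range(k)]

def infer(model, batch, keys, k):
    """Process `batch` in groups of k superposed inputs.

    k = 1 is the high-accuracy operating point; larger k trades some
    accuracy for roughly k-fold throughput with the same fixed weights.
    """
    results = []
    for start in range(0, len(batch), k):
        group = batch[start : start + k]
        out = model(bind(group, keys))
        results.extend(unbind(out, keys, len(group)))
    return results

def identity_model(v):
    """Placeholder network, just to exercise the plumbing."""
    return v

rng = np.random.default_rng(1)
d = 1024
keys = rng.choice([-1.0, 1.0], size=(4, d))
batch = [rng.standard_normal(d) for _ in range(8)]
fast = infer(identity_model, batch, keys, k=4)   # high-throughput point
exact = infer(identity_model, batch, keys, k=1)  # exact recovery at k = 1
```

Note that switching k requires no retraining or weight reloading; the operating point is purely a serving-time decision.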

Empirical Evaluations

The paper evaluates two instantiations: MIMOConv, which applies the MIMONets principle to Convolutional Neural Networks (CNNs), and MIMOFormer, which extends it to Transformer architectures. MIMOConv achieves about a 2-4x speedup over WideResNet baselines at an accuracy delta within [+0.68, -3.18]% on CIFAR10 and CIFAR100. Similarly, MIMOFormer handles 2-4 inputs at once while maintaining average accuracy within a [-1.07, -3.43]% delta on the Long Range Arena benchmark.

Conclusion

MIMONets represent a significant step toward computationally efficient deep learning inference. By processing multiple inputs in superposition, they lower the operation count per input, offering a new axis for performance optimization at a modest cost in accuracy. These networks have the potential to enable faster and more dynamic neural network inference, making them suitable for applications ranging from real-time systems to large-scale AI models.
