Filter Trimming in CNNs: ConvV2 & C-K-S

Updated 8 May 2026

Filter trimming is a set of structured techniques that remove redundant filters, kernel slices, or spatial supports in CNNs to reduce computation and memory costs.
ConvV2 and C-K-S skip multiplications by zeros that arise from padding, stride, and dilation, while related methods such as FilterSketch prune filters through matrix sketching; both routes reduce FLOPs and improve hardware efficiency.
Reported results show FLOP savings of 45-60% with minimal accuracy loss, making these approaches practical for both commodity and specialized hardware.

Filter trimming is a family of structured techniques designed to reduce the computational and memory cost of convolutional neural networks (CNNs) by strategically reducing, or “trimming,” the number of filters, kernel spatial extent, or zero-padding footprints in convolutional layers. The variants known as ConvV2 and C-K-S correspond to algorithmic and architectural methodologies that aggressively target redundant or zero-contributing weights and spatial supports. These strategies enable substantial savings in FLOPs and real-world inference times, while maintaining accuracy and compatibility with existing hardware platforms.

1. Mathematical Foundations of Filter Trimming

Filter trimming methods remove redundancy along one or more structural axes of the convolutional operator:

Channel (C) trimming: removes whole output filters/channels, leading to smaller activation maps and filter banks.
Kernel (K) trimming: reduces the spatial extent of filters, typically by removing outer stripes or “slices” of the kernel’s support.
Spatial (S) trimming: prunes individual weights, possibly resulting in sparse or structured-sparse kernels.
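To make the three axes concrete, the sketch below counts multiply-accumulates (MACs) for a single layer before and after each kind of trimming. The layer shape (64 filters, 5×5 kernels, 32×32 output map) and the helper `conv_macs` are illustrative assumptions, not values taken from the cited papers.

```python
def conv_macs(c_out, c_in, k_h, k_w, out_h, out_w, nnz_per_kernel=None):
    """MACs for one conv layer; nnz_per_kernel models weight-level sparsity."""
    taps = k_h * k_w if nnz_per_kernel is None else nnz_per_kernel
    return c_out * c_in * taps * out_h * out_w


# Hypothetical layer: 64 -> 64 channels, 5x5 kernels, 32x32 output map.
base = conv_macs(64, 64, 5, 5, 32, 32)
# C: drop 16 of the 64 output filters.
c_trim = conv_macs(48, 64, 5, 5, 32, 32)
# K: peel the outer ring, 5x5 -> 3x3 (output size assumed unchanged via padding).
k_trim = conv_macs(64, 64, 3, 3, 32, 32)
# S: keep 15 of the 25 weights in every kernel.
s_trim = conv_macs(64, 64, 5, 5, 32, 32, nnz_per_kernel=15)

for name, macs in [("C", c_trim), ("K", k_trim), ("S", s_trim)]:
    print(f"{name}-trimmed: {macs / base:.2f} of baseline MACs")
```

The C and K reductions shrink dense tensors and so carry over directly to standard dense kernels; the S reduction is only realized if the runtime can exploit the sparsity pattern.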

A formalization of channel- and kernel-wise pruning via matrix sketching is given in FilterSketch:

Given pre-trained convolution weights $W \in \mathbb{R}^{d \times c}$ (with $d = c_\mathrm{in} \times h \times w$ and $c$ output channels), one seeks a reduced matrix $\Omega \in \mathbb{R}^{d \times \tilde{c}}$ with $\tilde{c} < c$ such that

$\min_{\Omega}\ \| W W^T - \Omega \Omega^T \|_F$

This objective preserves the second-order (covariance) structure of the filters, thus retaining the representational capacity of the layer under aggressive channel or spatial pruning (Lin et al., 2020).
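As an illustration of this objective, the following is a minimal NumPy sketch of the basic Frequent Directions procedure, treating each column of $W$ as one flattened filter. It is a generic textbook variant, not the exact FilterSketch pipeline; the function name, the random weights, and the layer shape are assumptions made for the example. The check at the end uses the known spectral-norm guarantee of Frequent Directions, $\| W W^T - \Omega \Omega^T \|_2 \le \| W \|_F^2 / \tilde{c}$, rather than the Frobenius objective itself.

```python
import numpy as np


def frequent_directions(W, c_tilde):
    """Sketch W (d x c, one filter per column) into Omega (d x c_tilde)."""
    d, c = W.shape
    assert 1 <= c_tilde <= d, "this sketch assumes c_tilde <= d"
    B = np.zeros((c_tilde, d))               # sketch rows live in filter space
    for j in range(c):
        B[-1] = W[:, j]                      # last row is zero at this point
        _, s, Vt = np.linalg.svd(B, full_matrices=False)
        # Shrink all directions by the smallest singular value; this zeroes
        # the last row again and makes room for the next filter.
        shrunk = np.sqrt(np.maximum(s ** 2 - s[-1] ** 2, 0.0))
        B = shrunk[:, None] * Vt
    return B.T


rng = np.random.default_rng(0)
W = rng.standard_normal((27, 16))            # hypothetical: c_in=3, h=w=3, c=16
Omega = frequent_directions(W, 8)
err = np.linalg.norm(W @ W.T - Omega @ Omega.T, 2)
bound = np.linalg.norm(W, "fro") ** 2 / 8
print(Omega.shape, bool(err <= bound))
```

Because of the final shrink step, the last column of `Omega` is zero in this variant, so the sketch effectively carries one direction fewer than `c_tilde`.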

SMOF generalizes spatial support trimming by attaching learnable “filter skeleton” masks to each kernel, and using group-sparsity norms to drive entire spatial stripes to zero:

$L(W, \mathrm{FS}, \mathrm{FM}) = \sum_{(x,y)\in \mathcal{D}} \text{loss}(f(x; W \odot \mathrm{FS} \odot \mathrm{FM}), y) + \sum_{\ell,i} \alpha_i^\ell \| \mathrm{FS}_i^\ell \|_g + \sum_\ell \beta \| \mathrm{FM}^\ell \|_1$

where $\odot$ denotes broadcast multiplication, FS are the spatial masks, FM are the per-channel gates, and the group norm $\| \cdot \|_g$ couples the edges of concentric squares (Liu et al., 2021).
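One way to read the group norm is as a sum of L2 norms over concentric square rings of the mask, so that an entire ring is pushed to zero together and the kernel can then be "peeled" to a smaller size. The sketch below implements that reading for a single odd-sized mask; the exact grouping used by SMOF may differ, and both function names are hypothetical.

```python
import math


def ring_group_norm(mask):
    """Sum of L2 norms over concentric square rings of an odd k x k mask."""
    k = len(mask)
    center = k // 2
    ring_sq = [0.0] * (center + 1)
    for i in range(k):
        for j in range(k):
            ring = max(abs(i - center), abs(j - center))  # 0 = center tap
            ring_sq[ring] += mask[i][j] ** 2
    return sum(math.sqrt(v) for v in ring_sq)


def effective_kernel_size(mask, tol=1e-8):
    """Kernel size left after peeling outer rings whose entries are ~0."""
    k = len(mask)
    center = k // 2
    for ring in range(center, 0, -1):
        live = any(
            abs(mask[i][j]) > tol
            for i in range(k)
            for j in range(k)
            if max(abs(i - center), abs(j - center)) == ring
        )
        if live:
            return 2 * ring + 1
    return 1


# Assumed example: a 5x5 mask whose outer ring has already been driven to zero.
mask = [[1.0 if 1 <= i <= 3 and 1 <= j <= 3 else 0.0 for j in range(5)]
        for i in range(5)]
print(ring_group_norm(mask), effective_kernel_size(mask))
```

Grouping by ring rather than by individual weight is what makes the resulting sparsity structured: the surviving support is always a smaller square kernel.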

2. ConvV2 and C-K-S Algorithms: Procedures and Implementation

ConvV2 systematically removes zero-padding effects at convolution boundaries by adjusting the computation window for each output spatial position. For any location, only the weights that actually multiply nonzero input entries contribute:

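Given inputs $X$ of height $H$, filters $W$ of height $h$, stride $s$, and padding $p$, and writing the sum for a single input-output channel pair:

$Y[i, j] = \sum_{u = u_0(i)}^{u_1(i)} \sum_{v = v_0(j)}^{v_1(j)} W[u, v]\, X[i s - p + u,\ j s - p + v]$

where

$u_0(i) = \max(0,\ p - i s), \qquad u_1(i) = \min(h - 1,\ H - 1 + p - i s)$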

and similarly for the width (Zhang et al., 2023).
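The window adjustment can be sketched in plain Python for a single channel. The code below compares a baseline that materializes the zero padding against a version that clips each window to the taps that land on real input entries. Both function names are hypothetical, and this is a readable reference under those assumptions, not the GPU implementation of Zhang et al.

```python
def conv2d_padded(x, w, stride, pad):
    """Baseline: materialize the zero padding, then slide a dense window."""
    ih, iw, fh, fw = len(x), len(x[0]), len(w), len(w[0])
    xp = [[0] * (iw + 2 * pad) for _ in range(ih + 2 * pad)]
    for r in range(ih):
        for c in range(iw):
            xp[r + pad][c + pad] = x[r][c]
    oh = (ih + 2 * pad - fh) // stride + 1
    ow = (iw + 2 * pad - fw) // stride + 1
    y, macs = [[0] * ow for _ in range(oh)], 0
    for i in range(oh):
        for j in range(ow):
            for u in range(fh):
                for v in range(fw):
                    y[i][j] += w[u][v] * xp[i * stride + u][j * stride + v]
                    macs += 1
    return y, macs


def conv2d_skip_padding(x, w, stride, pad):
    """Same output, but each window is clipped to taps that hit real inputs."""
    ih, iw, fh, fw = len(x), len(x[0]), len(w), len(w[0])
    oh = (ih + 2 * pad - fh) // stride + 1
    ow = (iw + 2 * pad - fw) // stride + 1
    y, macs = [[0] * ow for _ in range(oh)], 0
    for i in range(oh):
        top = i * stride - pad                    # input row under w[0][*]
        u_lo, u_hi = max(0, -top), min(fh, ih - top)
        for j in range(ow):
            left = j * stride - pad               # input column under w[*][0]
            v_lo, v_hi = max(0, -left), min(fw, iw - left)
            for u in range(u_lo, u_hi):
                for v in range(v_lo, v_hi):
                    y[i][j] += w[u][v] * x[top + u][left + v]
                    macs += 1
    return y, macs


x = [[4 * r + c + 1 for c in range(4)] for r in range(4)]
w = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
y_ref, macs_ref = conv2d_padded(x, w, stride=1, pad=1)
y_skip, macs_skip = conv2d_skip_padding(x, w, stride=1, pad=1)
print(y_ref == y_skip, macs_ref, macs_skip)
```

On this 4×4 input with a 3×3 filter and padding 1, the outputs match while the MAC count drops from 144 to 100; the relative saving shrinks as the feature map grows.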

C-K-S extends ConvV2 to backward propagation and dilated operations, employing:

KS-deconv: Kernel-Split deconvolution transforms stride-$s$ operations into dense convolutions on sub-kernels, eliminating the multiply-accumulates that would otherwise be spent on inserted zeros.
Sk-dilated: For dilated convolutions, the kernel strides through $X$ or $\nabla Y$ in steps of the dilation, skipping both memory accesses and computation for the zeros.
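The kernel-split idea can be illustrated in one dimension. A stride-$s$ transposed convolution is computed twice below: once by inserting zeros between inputs and running a dense convolution, and once by splitting the kernel into $s$ sub-kernels, each of which produces one phase of the output with no zero operands. The function names are hypothetical and the code is a sketch of the principle, not the authors' implementation.

```python
def deconv1d_zero_insert(x, w, s):
    """Baseline: insert s-1 zeros between inputs, pad, convolve densely."""
    n, k = len(x), len(w)
    z = [0] * ((n - 1) * s + 1)
    for i, xi in enumerate(x):
        z[i * s] = xi
    zp = [0] * (k - 1) + z + [0] * (k - 1)
    y, macs = [], 0
    for t in range(len(zp) - k + 1):
        acc = 0
        for j in range(k):
            acc += zp[t + j] * w[k - 1 - j]   # flipped kernel
            macs += 1
        y.append(acc)
    return y, macs


def deconv1d_kernel_split(x, w, s):
    """Split w into s sub-kernels; each fills one output phase densely."""
    n, k = len(x), len(w)
    out_len = (n - 1) * s + k
    y, macs = [0] * out_len, 0
    for r in range(s):
        sub = w[r::s]                          # taps w[r], w[r+s], ...
        q = 0
        while q * s + r < out_len:
            for m, tap in enumerate(sub):
                if 0 <= q - m < n:             # only real inputs are touched
                    y[q * s + r] += x[q - m] * tap
                    macs += 1
            q += 1
    return y, macs


x, w = [1, 2, 3], [1, 10, 100]
y_ref, macs_ref = deconv1d_zero_insert(x, w, s=2)
y_ks, macs_ks = deconv1d_kernel_split(x, w, s=2)
print(y_ref == y_ks, macs_ref, macs_ks)
```

In this toy case the zero-insertion baseline performs 21 MACs and the kernel-split version 9, which is exactly one MAC per (input, tap) pair.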

These approaches can be combined with other classical acceleration techniques, such as Winograd transformations (Cariow et al., 2020).

3. Representative Structured Pruning Frameworks

Below is a comparative summary of several frameworks relevant to ConvV2 and C-K-S trimming:

Approach	Key Axes Pruned	Pruning Principle	Hardware Policy
FilterSketch (Lin et al., 2020)	C, K, S	Matrix sketching via Frequent Directions	General
SMOF (Liu et al., 2021)	C, K (square S)	Learnable masks/stripes + group sparsity	SIMD-aligned, off-the-shelf
C-K-S (Zhang et al., 2023)	Zero-padding, K (split), S (stride)	Algorithmic skipping of zeros	GPU, hardware-efficient
Winograd/Minimal Filtering (Cariow et al., 2020)	K	Algorithmic tile-based transform	FPGA, ASIC

FilterSketch relies on matrix sketching (Frequent Directions) and is axis-agnostic: the matrix view adapts by unfolding the filters along C, K, or S.
SMOF’s “peeling” is hardware-friendly: kernel-size reductions yield native speedups on ARM, Adreno, and DSPs without width-alignment penalties.
C-K-S integrates filter trimming by removing “dead” zeros (arising from padding or sparsity) and composes deconvolutions/dilations as standard dense convolutions for better SIMD utilization.
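The same zero-skipping principle applies to dilation. The 1-D sketch below contrasts a baseline that expands the kernel with explicit zeros against a version that steps through the input at the dilation rate. The function names are hypothetical, and the code shows only the forward cross-correlation, as an illustration rather than the Sk-dilated kernel itself.

```python
def dilated_conv1d_dense(x, w, dilation):
    """Baseline: insert zeros into the kernel, then run a dense convolution."""
    k = len(w)
    span = (k - 1) * dilation + 1
    wd = [0] * span
    for j in range(k):
        wd[j * dilation] = w[j]
    y, macs = [], 0
    for t in range(len(x) - span + 1):
        acc = 0
        for j in range(span):
            acc += wd[j] * x[t + j]
            macs += 1
        y.append(acc)
    return y, macs


def dilated_conv1d_skip(x, w, dilation):
    """Index the input at multiples of the dilation; no zero taps are visited."""
    k = len(w)
    span = (k - 1) * dilation + 1
    y, macs = [], 0
    for t in range(len(x) - span + 1):
        acc = 0
        for j in range(k):
            acc += w[j] * x[t + j * dilation]
            macs += 1
        y.append(acc)
    return y, macs


x, w = list(range(1, 11)), [1, 2, 3]
y_dense, macs_dense = dilated_conv1d_dense(x, w, dilation=3)
y_skip, macs_skip = dilated_conv1d_skip(x, w, dilation=3)
print(y_dense == y_skip, macs_dense, macs_skip)
```

With a 3-tap kernel and dilation 3, the dense baseline spends 7 MACs per output and the skipping version 3.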

4. Practical Algorithms and Pseudocode

C–K–S FilterSketch for ConvV2 (Lin et al., 2020):

[Pseudocode listing not preserved in the source.]

C-K-S Algorithm for efficient GPU convolution (Zhang et al., 2023):

[Pseudocode listing not preserved in the source.]

5. Computational Complexity and Empirical Results

Filter trimming achieves substantial reductions in resource utilization:

ConvV2/C-K-S: Arithmetic cost falls in proportion to the fraction of filter support that is retained, that is, the share of multiply-accumulates that touch real input values rather than padding.
KS-deconv/Sk-dilated: Cost falls by a multiplicative factor for strided and dilated operations, because multiply-accumulates on inserted zeros are skipped.

Empirically, ConvV2 and C-K-S implementations consistently surpass cuDNN/PyTorch for small-to-medium feature maps by 1.1×–1.8× on modern GPUs, while maintaining identical accuracy and convergence profiles (Zhang et al., 2023). On ImageNet, FilterSketch achieves FLOPs reductions of 45.5% with accuracy drops under 1% (ResNet-50: –0.69% Top-5) (Lin et al., 2020). SMOF reports wall-time reductions exceeding the corresponding FLOPs drop on DSP and GPU, due to alignment-friendly kernel size reduction (Liu et al., 2021).
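The dependence on feature-map size can be checked with a short calculation. The sketch below estimates, for a square input and a square kernel, what fraction of the MACs in a zero-padded convolution land on padding; this is the work that zero-skipping can remove. The helper names and the chosen map sizes are assumptions for illustration, not figures from the cited paper.

```python
def valid_taps_1d(n, k, pad, stride):
    """Count kernel taps that hit real inputs along one axis, and all taps."""
    out = (n + 2 * pad - k) // stride + 1
    valid = 0
    for i in range(out):
        start = i * stride - pad              # input index under tap 0
        valid += max(0, min(k, n - start) - max(0, -start))
    return valid, out * k


def padded_zero_fraction(n, k=3, pad=1, stride=1):
    """Fraction of MACs that multiply padding zeros for an n x n input."""
    valid, total = valid_taps_1d(n, k, pad, stride)
    return 1.0 - (valid / total) ** 2


for n in (7, 14, 28, 56):
    print(n, round(padded_zero_fraction(n), 4))
```

For a 3×3 kernel with padding 1, roughly 18% of the MACs hit padding on a 7×7 map, but only about 2.4% on a 56×56 map, which is consistent with the gains being concentrated on small-to-medium feature maps.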

6. Limitations, Hardware Compatibility, and Extensions

Limitations stem from hardware and tiling overheads:

For ConvV2 trimming, benefit is realized only when the fraction of computations due to padding is non-negligible (padded-zero proportion >6%). For large spatial maps with minimal padding, pointer adjustment costs can outweigh computational gains (Zhang et al., 2023).
Channel-trimming alone on CPUs/GPUs may incur alignment penalties due to SIMD width. SMOF, by jointly reducing channel and kernel size in square fashion, is not subject to this, enabling practical acceleration on off-the-shelf hardware (Liu et al., 2021).
Extensions include fusion with Winograd minimal filtering (saving 30–45% of multipliers for 3×3, 5×5 filters on FPGAs/ASICs), grouping and depthwise pruning, and hardware hard-wiring of C–K–S transformations (Cariow et al., 2020).
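As a small worked instance of minimal filtering, the sketch below implements the classic 1-D Winograd F(2,3) identity, which produces two outputs of a 3-tap filter with four data multiplications instead of six. This is the standard textbook form and is not necessarily one of the specific algorithms derived by Cariow et al.; the function names are hypothetical.

```python
def direct_f23(d, g):
    """Two outputs of a 3-tap filter over 4 inputs: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Same two outputs with 4 multiplications between data and filter terms."""
    # Filter-side combinations depend only on g and can be precomputed once.
    ga = (g[0] + g[1] + g[2]) / 2
    gb = (g[0] - g[1] + g[2]) / 2
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * ga
    m3 = (d[2] - d[1]) * gb
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]


d, g = [1, 2, 3, 4], [2, 4, 6]
print(direct_f23(d, g), winograd_f23(d, g))
```

Going from six multiplications to four is a saving of one third per output tile, at the price of a few extra additions; in hardware, this trade favors designs where multipliers are the scarce resource.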

7. Conclusion and Comparative Perspective

Filter trimming, especially as instantiated in the ConvV2 and C-K-S methodologies, encompasses algorithms that remove structural and arithmetic redundancy from convolutional neural networks. Approaches range from matrix sketching and group-sparsity regularization to direct algorithmic skipping of zeros and unused kernel support. These methodologies enable FLOPs reductions of 30–60%, maintain accuracy to within 1%, and yield measurable speedups on both commodity and embedded platforms. C-K-S in particular provides a hardware-friendly path to efficiency by unifying architectural and algorithmic pruning into a single, SIMD-optimized policy (Zhang et al., 2023; Lin et al., 2020; Liu et al., 2021; Cariow et al., 2020).


References (4)

Lin et al. (2020). Filter Sketch for Network Pruning.

Liu et al. (2021). SMOF: Squeezing More Out of Filters Yields Hardware-Friendly CNN Pruning.

Zhang et al. (2023). Reduce Computational Complexity for Convolutional Layers by Skipping Zeros.

Cariow et al. (2020). Minimal Filtering Algorithms for Convolutional Neural Networks.
