CompressARC: Advanced Compression Methods
- CompressARC is a suite of advanced compression techniques spanning neural context compression, visual reasoning, large-archive retrieval, and adaptive database indexing.
- The ARC-Encoder method compresses LLM context $4\times$ to $8\times$ by pooling token representations, preserving near open-book performance while enabling multi-decoder adaptation.
- RLZ-based archive compression and adaptive column strategies offer high compression ratios and rapid random access, while MDL-driven models enable data-efficient visual reasoning.
CompressARC refers to a collection of advanced compression techniques and systems applied across distinct domains, most notably in (1) neural context compression for LLMs (Pilchen et al., 23 Oct 2025), (2) data-efficient learning for ARC-AGI visual reasoning benchmarks (Liao et al., 5 Dec 2025), (3) efficient random-access archiving of large textual datasets via RLZ (relative Lempel-Ziv) (Petri et al., 2016), and (4) adaptive integer column compression for in-memory self-driving databases (Fehér et al., 2022). Each instantiation introduces a unique algorithmic and engineering approach to compression, emphasizing specific trade-offs between efficiency, fidelity, generalization ability, and deployment feasibility.
1. CompressARC for Neural Context Compression in LLMs
CompressARC, also formalized as ARC-Encoder, constitutes a plug-and-play “soft” context compressor designed for transformer-based LLMs without modifying their architectures (Pilchen et al., 23 Oct 2025). The architecture consists of a Transformer-based encoder, a pooling operation in the last self‐attention block, and a two-layer MLP projector.
Given a sequence of tokens, ARC-Encoder emits continuous representations (with $4$ or $8$ as typical pooling factors $P$), which are injected directly into the decoder's embedding layer. The decoder is frozen, and only the encoder/MLP are trainable. Pooling is performed by averaging non-overlapping $P$-length chunks of queries in the last encoder layer:

$$\bar{q}_j = \frac{1}{P} \sum_{i=(j-1)P+1}^{jP} q_i, \qquad j = 1, \dots, \lceil n/P \rceil,$$

so a length-$n$ context is reduced to $\lceil n/P \rceil$ soft context vectors.
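A minimal NumPy sketch of the pooling step follows; the function name, shapes, and the assumption that the sequence is pre-padded to a multiple of $P$ are illustrative, not taken from the paper.

```python
import numpy as np

def mean_pool_queries(q: np.ndarray, pool_factor: int) -> np.ndarray:
    """Average non-overlapping chunks of last-layer query states.

    q: (seq_len, d_model) token representations from the encoder's
    final self-attention block; seq_len is assumed to already be
    padded to a multiple of pool_factor (a simplification here).
    Returns (seq_len // pool_factor, d_model) soft context vectors.
    """
    seq_len, d_model = q.shape
    assert seq_len % pool_factor == 0, "pad the sequence first"
    chunks = q.reshape(seq_len // pool_factor, pool_factor, d_model)
    return chunks.mean(axis=1)

# 32 token states compressed 4x into 8 continuous vectors, which would
# then be projected by the MLP and injected into the decoder's embeddings.
q = np.random.randn(32, 64)
print(mean_pool_queries(q, pool_factor=4).shape)  # (8, 64)
```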
The encoder can be trained to serve multiple decoder LLMs simultaneously, with decoder-specific MLP adapters accounting for only a small fraction of the encoder's parameters.
Training utilizes a mix of two alternating objectives: (a) reconstruction of the original sequence, and (b) continuation, i.e., predicting the sequence that follows a compressed segment. The composite loss is

$$\mathcal{L} = \lambda\,\mathcal{L}_{\text{recon}} + (1-\lambda)\,\mathcal{L}_{\text{cont}},$$

where $\lambda$ mixes the two objectives; an intermediate mixing ratio proved optimal. Fine-tuning preserves in-context learning by interleaving few-shot examples and restricts updates to the encoder and MLP only.
Empirical results show that at $4\times$ compression, ARC-Encoder achieves nearly open-book performance on QA, translation, and summarization benchmarks, with a substantial measured prefill speed-up. At $8\times$ compression, degradation increases on tasks requiring token-level precision, but performance remains above the no-context (closed-book) baseline. The system generalizes across LLM families such as Llama and Mistral with minimal per-decoder adaptation cost.
2. CompressARC in ARC-AGI Without Pretraining (MDL-driven Visual Reasoning)
CompressARC in the ARC-AGI domain designates a 76,000-parameter model that forgoes pretraining and instead optimizes a Minimum Description Length (MDL) criterion at inference, learning from scratch on a per-puzzle basis (Liao et al., 5 Dec 2025). The objective is to discover the model/program minimizing the total description length

$$L(M) + L(D \mid M),$$

where $L(M)$ and $L(D \mid M)$ measure the bits to encode the model and the data given the model, respectively. CompressARC converts this to a differentiable variational form using a Gaussian latent seed $z$ and a neural decoder $f_\theta$, yielding the loss

$$\mathcal{L}(\theta, q) = \underbrace{D_{\mathrm{KL}}\big(q(z)\,\|\,p(z)\big)}_{\approx\,L(M)} + \underbrace{\mathbb{E}_{z \sim q}\big[-\log p_\theta(D \mid z)\big]}_{\approx\,L(D \mid M)}.$$
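The sketch below computes this objective in bits, assuming a diagonal-Gaussian posterior over the latent seed and a decoder negative log-likelihood supplied externally; all names and shapes are illustrative rather than drawn from the paper's code.

```python
import numpy as np

def description_length_bits(mu: np.ndarray, log_sigma: np.ndarray,
                            decoder_nll_bits: float) -> float:
    """Variational MDL objective in bits.

    The KL divergence of the diagonal-Gaussian posterior q(z) from the
    standard-normal prior p(z) approximates L(M), the bits to encode
    the latent seed; decoder_nll_bits = -log2 p_theta(D | z) (e.g. the
    cross-entropy of per-cell color predictions) approximates L(D | M).
    """
    sigma_sq = np.exp(2.0 * log_sigma)
    kl_nats = 0.5 * np.sum(mu**2 + sigma_sq - 2.0 * log_sigma - 1.0)
    return kl_nats / np.log(2.0) + decoder_nll_bits

# Toy usage: a 16-dim latent seed, and a decoder that assigns the
# correct color probability 0.9 to each cell of a 5x5 output grid.
mu, log_sigma = np.zeros(16), np.full(16, -1.0)
print(description_length_bits(mu, log_sigma, -25 * np.log2(0.9)))
```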
The architecture leverages multitensor representations and group-equivariant networks, supporting operations such as mean reduce/broadcast, softmax sharpening across spatial or color dimensions, and geometric directional communication. All learning occurs at inference time, with no use of any training set or pretrained weights.
Empirical results on the ARC-AGI benchmark show that CompressARC solves 20% of evaluation puzzles (pass@2), a substantial rate considering the zero-shot, no-pretraining condition. The model exhibits strong inductive biases in geometric reasoning (object localization, infilling, directional extension), but is limited in long-range recurrence and multi-step algorithmic reasoning. CompressARC stands out by providing a compression-based alternative to large-scale pretraining for intelligence benchmarks.
3. CompressARC for Large-Scale Archive Compression (RLZ for Random Access)
In high-redundancy, large-scale archives such as web or ARC files, CompressARC refers to an RLZ-based system for efficient compression and rapid random-access retrieval (Petri et al., 2016). The method constructs a semi-static dictionary $\mathcal{D}$ by uniformly sampling the archive, partitions the data into fixed-length blocks, and then greedily factorizes each block against $\mathcal{D}$.
Each block is encoded as a sequence of $\langle \text{offset}, \text{length} \rangle$ factors referencing $\mathcal{D}$, with literals used for unmatched bytes. Factor streams are then encoded with static bit-width and variable-byte codes. The compression ratio (compressed size over original size) and access cost are derived analytically; empirically, ratios in the 15–25% range are attainable with 0.3–0.4 ms block random-access latency on SSD for KiB-scale blocks and MiB-scale (or larger) dictionaries.
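The greedy factorization can be sketched as below. This quadratic-time version is for exposition only; a production RLZ system would match against a suffix-array index of the dictionary.

```python
def rlz_factorize(block: bytes, dictionary: bytes):
    """Greedy left-to-right factorization of one block against a
    semi-static dictionary. Emits (offset, length) copy factors and
    (byte, 0) literals for unmatched bytes. A real implementation
    would also write the factor streams with static bit-width and
    variable-byte codes rather than returning Python tuples.
    """
    factors, i = [], 0
    while i < len(block):
        best_off, best_len = 0, 0
        for off in range(len(dictionary)):  # longest dictionary match
            length = 0
            while (i + length < len(block)
                   and off + length < len(dictionary)
                   and dictionary[off + length] == block[i + length]):
                length += 1
            if length > best_len:
                best_off, best_len = off, length
        if best_len >= 2:                       # copy factor
            factors.append((best_off, best_len))
            i += best_len
        else:                                   # literal byte
            factors.append((block[i], 0))
            i += 1
    return factors

print(rlz_factorize(b"abracadabra", b"cadabra_abra"))
# [(3, 4), (0, 7)] -- two copy factors cover the whole block
```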
In practical terms, RLZ-based CompressARC provides near-optimal compression with sub-millisecond random access on SSDs, which is essential for workloads requiring intermittent, partial retrieval, such as web search archives. System integration requires keeping $\mathcal{D}$ and a block index in memory and, optionally, incorporating block priming or three-stream extensions for further marginal improvements.
4. CompressARC for Adaptive Column Compression in Databases
In columnar in-memory databases, CompressARC specifies an adaptive integer-column compressor based on generalized deduplication (GD) and the "LastBit" transformation (Fehér et al., 2022). Each 32-bit value $v$ is split into a base $b$ (the high $32-k$ bits) and a deviation $d$ (the low $k$ bits), where $k$ (the deviation size) is chosen adaptively; bases are deduplicated and stored in a sorted array, along with compressed deviation indices.
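A compact sketch of the base/deviation split and base deduplication, assuming the LastBit-style low-bits deviation described above (function and variable names are illustrative):

```python
from typing import List, Tuple

def gd_compress(values: List[int], k: int):
    """Split each 32-bit value into a base (high 32-k bits) and a
    deviation (low k bits), deduplicate the bases into a sorted
    array, and store a (base_index, deviation) pair per value."""
    mask = (1 << k) - 1
    bases = sorted({v >> k for v in values})
    index = {b: i for i, b in enumerate(bases)}
    pairs = [(index[v >> k], v & mask) for v in values]
    return bases, pairs

def gd_decompress(bases: List[int], pairs: List[Tuple[int, int]], k: int):
    """Reassemble original values from the base array and deviations."""
    return [(bases[i] << k) | d for i, d in pairs]

vals = [1000, 1001, 1007, 2048, 2050]
bases, pairs = gd_compress(vals, k=4)
print(bases)    # [62, 128] -- five values share two deduplicated bases
assert gd_decompress(bases, pairs, k=4) == vals
```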
Four GD variants mediate trade-offs between access speed, scan speed, and compression ratio, using per-value arrays, deduplication, and per-base deviation organization. The segmentation adaptation mechanism profiles all potential deviation sizes $k$, normalizes the resulting compression and latency metrics, and selects the $k$ maximizing an application-weighted utility function:

$$k^{\ast} = \arg\max_{k} \big[\, w_c\,\hat{C}(k) + w_r\,\hat{R}(k) + w_s\,\hat{S}(k) + w_q\,\hat{Q}(k) \,\big],$$

where $w_c$, $w_r$, $w_s$, and $w_q$ are tunable weights for compression, random access, sequential access, and scan, respectively, and $\hat{C}$, $\hat{R}$, $\hat{S}$, $\hat{Q}$ are the corresponding normalized scores.
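A toy version of the selection loop is sketched below; the bit-count size model and the access-score proxy are illustrative stand-ins for the paper's measured metrics and weights.

```python
import math

def compressed_bits(values, k: int) -> int:
    """Simple size model: each unique base costs 32 - k bits; each
    value stores a base index plus its k-bit deviation."""
    bases = {v >> k for v in values}
    index_bits = max(1, math.ceil(math.log2(len(bases))))
    return len(bases) * (32 - k) + len(values) * (index_bits + k)

def choose_k(values, w_compression: float = 1.0,
             w_access: float = 0.1) -> int:
    """Profile every candidate deviation size, normalize the metrics,
    and pick the k maximizing the weighted utility."""
    ks = range(1, 32)
    sizes = {k: compressed_bits(values, k) for k in ks}
    lo, hi = min(sizes.values()), max(sizes.values())

    def utility(k: int) -> float:
        c_score = 1.0 - (sizes[k] - lo) / max(1, hi - lo)  # 1 = smallest
        a_score = k / 31.0  # toy access proxy, not the paper's cost model
        return w_compression * c_score + w_access * a_score

    return max(ks, key=utility)

vals = [1000, 1001, 1007, 2048, 2050, 2051] * 100
print(choose_k(vals))
```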
Integrations into systems such as Hyrise yield compression notably superior to PFoR (5–15% better) with only 20–30% query overhead, and at least $8\times$ faster access compared to LZ4. CompressARC supports both late- and early-materialization workloads, automatically retraining segments as access patterns change.
5. Performance Trade-offs and Practical Considerations
CompressARC implementations are distinguished by their explicit attention to efficiency and accuracy under domain constraints:
- ARC-Encoder neural compressors maintain few-shot LLM capabilities, delivering prefill speed-ups at $4\times$ context compression with minimal loss in downstream metrics (average exact match drops from 49.2 to 45.5 at a pooling factor of 4).
- RLZ-based archive compressors reach compression ratios in the 15–25% range, with random-access latency below 0.4 ms (SSD) and on the order of milliseconds (HDD), suitable for large datasets with sporadic access.
- Adaptive columnar CompressARC attains strong compression on synthetic integer data (GD1-LM, default configuration) and further gains on real workloads, preserving random-access/query efficiency within 20–30% of PFoR.
- MDL-driven CompressARC models in ARC-AGI, despite severe data constraints, achieve a 20% (pass@2) solve rate, far surpassing baselines that use neither pretraining nor additional data.
Practical limitations include performance degradation at extreme compression factors (e.g., $8\times$ in ARC-Encoder), the need for adequate RAM to hold the RLZ dictionary, and the limited expressive power of MDL-driven models on algorithmically deep ARC puzzles. Configuration parameters, such as the pooling factor $P$ in LLMs, dictionary/block sizes in RLZ, and the deviation size $k$ in column compression, should be tuned to balance accuracy, speed, and resource constraints.
6. Research Impact and Future Directions
CompressARC methodologies represent state-of-the-art solutions across diverse compression challenges, with broad implications:
- In LLM applications, portable soft-compression (ARC-Encoder) enables scalable, efficient prompt engineering and retrieval-augmented reasoning, decoupling encoder and decoder development across model families (Pilchen et al., 23 Oct 2025).
- MDL-based puzzle solvers suggest that data-efficient intelligence is feasible via per-instance inference, motivating exploration of richer model classes and more efficient optimization for universal induction in vision (Liao et al., 5 Dec 2025).
- RLZ-based compression is highly relevant for large-scale NLP, scientific, and web archives, providing precise engineering guidance for system builders (Petri et al., 2016).
- Adaptive database compression addresses a longstanding gap in self-driving DBMSs, with automatic balancing of compression and access cost based on workload and data patterns (Fehér et al., 2022).
A plausible implication is that the convergence of soft, universal compression principles (MDL, RLZ, soft neural pooling) with practical system design will become increasingly critical as dataset scale and diversity increase. Further research may address multi-modal context compression, learnable block-dictionary optimization in RLZ, and compression-augmented self-supervised learning.
References
| Application Area | Approach/Implementation | Reference |
|---|---|---|
| Neural context compression (LLMs) | ARC-Encoder (soft compressor) | (Pilchen et al., 23 Oct 2025) |
| Visual reasoning (ARC-AGI puzzles) | MDL-driven, no-pretraining model | (Liao et al., 5 Dec 2025) |
| Archive compression & random access | RLZ-based CompressARC | (Petri et al., 2016) |
| Column-store database compression | GD-based adaptive CompressARC | (Fehér et al., 2022) |