Protosampling: Integrated Sampling & Prototyping

Updated 4 July 2026

Protosampling is a multidisciplinary approach that integrates sampling with prototyping to actively explore and refine large design or signal spaces.
It underpins creative AI workflows, engineering validations, and structured sub-Nyquist acquisition by replacing passive measurement with dynamic, prototype-based sampling.
Its applications range from free-form visual generation and detector calibration to sampled architectural simulation.

Protosampling is a term used in recent arXiv literature for several non-identical but structurally related practices in which sampling is coupled to prototyping, prototype construction, or representative-fragment selection rather than treated as a purely passive measurement step. In the most explicit definitional usage, it denotes the convergence of sampling and prototyping in generative visual creation, where practitioners can move “beyond sampling existing media towards instantly generating and remixing new ones” (Guo et al., 8 Jan 2026). In other literatures, the term or closely aligned formulations designate prototype-based sampling calorimetry R&D, engineering-array validation of large detectors, hardware realization of time-encoding sub-Nyquist acquisition, generalized paths from periodic to irregular sampling, and representative-region methods for sampled architectural simulation (Liu, 2017, Avrorin et al., 2013, Naaman et al., 2023, Lacaze, 2019, Liu et al., 2023).

1. Terminological scope

Across the cited literature, protosampling does not denote a single algorithm. Instead, it appears in at least four recurrent senses: a creative-process concept, an engineering-validation strategy, a structured sub-Nyquist acquisition paradigm, and a sampled-computation methodology. This suggests that the term is best understood as a family of practices in which a provisional construct—a prototype detector, engineering array, synthetic region, or generated artifact—functions simultaneously as an object of testing and as a mechanism for sampling a larger design or state space.

Domain	Usage of protosampling	Representative paper
Visual AI generation	Convergence of sampling and prototyping	(Guo et al., 8 Jan 2026)
Detector R&D	Prototype-based sampling calorimetry and engineering validation	(Liu, 2017)
Asynchronous ADCs	Time-encoding, event-driven sub-Nyquist sampling	(Naaman et al., 2023)
Computer architecture	Representative-region or targeted sampled simulation	(Liu et al., 2023)

A common misconception is that protosampling necessarily refers to either prototype construction or signal sampling alone. The literature instead uses it for coupled workflows in which provisional structures are not ancillary: they are the means by which a larger system, signal class, or creative space is interrogated.

2. Protosampling in canvas-driven visual AI generation

In "Protosampling: Enabling Free-Form Convergence of Sampling and Prototyping through Canvas-Driven Visual AI Generation" (Guo et al., 8 Jan 2026), protosampling is defined as the increasingly intertwined practice of sampling and prototyping in creative work under generative AI. The central claim is that these are no longer cleanly separable stages: practitioners now collect, remix, and generate materials in the same process, and each new generation is both a sample from the evolving design space and a proto-version of the eventual artifact.

The paper grounds this definition in prior creativity and design literature. Sampling is described as “collecting, organizing and transforming materials” and as opportunistic assimilation of relevant material from the world. Prototyping is treated not as only later-stage production but as purposeful making that helps understand the problem, explore a design space, and answer specific questions with incomplete solutions. The paper argues that both are forms of problem construction, and that generative AI collapses the distance between them because material can be instantly generated, immediately reused, remixed, or transformed (Guo et al., 8 Jan 2026).

The system used to operationalize this formulation is Atelier, a canvas-like environment in which references and generated assets co-exist in one space. Its three core properties are explicit. First, it blends the spaces for thinking and creation, so source materials, generated outputs, and intermediate artifacts live together on one canvas. Second, it provides encapsulated technical workflows centered on the activity at hand through “Easels.” Third, it supports navigation of emergence through provenance views, smart search, and collections (Guo et al., 8 Jan 2026).

The workflow abstractions are task-specific. The paper identifies prep easels such as Collage and Sketch, and generation easels such as Draw, Paint, Trace, Modify, and Animate. Provenance is exposed through views including Lineage, History, Trails, Activity, and Timeline. The system also supports drag-and-drop import of images, video, text, audio, and 3D models; direct creation of text and sketches in canvas; automatic image captioning for search; persistent Collections; and an Exhibit gallery (Guo et al., 8 Jan 2026).

The first-use study reported in the paper lasted 4 hours and involved five creative professionals. Participants valued the visibility of process, the ability to return to earlier parameter configurations, and the coexistence of sampled and generated media. The authors’ larger claim is that protosampling reframes creative work to emphasize the process itself and how seemingly disjointed thoughts can tightly interweave into a final solution (Guo et al., 8 Jan 2026).

3. Prototype-based detector engineering

In high-energy and astroparticle instrumentation, protosampling is used in a more engineering-centered sense. The CALICE collaboration’s AHCAL work is described as “protosampling / prototype-based sampling calorimetry R&D” for future hadronic calorimeters (Liu, 2017). Here the objective is not merely to demonstrate calorimeter physics performance, but to validate a technological prototype that can be scaled to a full detector with automated mass production, stable calibration, low power, and embedded electronics.

The AHCAL is a sampling calorimeter with steel or tungsten absorber plates and plastic scintillator tiles read out by silicon photomultipliers as active components. The scintillator tiles have dimensions $30 \times 30 \times 3~\mathrm{mm}^3$ , and iron and tungsten have been used as absorber materials. The front-end electronics are fully integrated into the active layers and designed for power pulsing. The 2015 CERN SPS campaign used a prototype with 14 active layers and 3744 channels, exposed to muon, electron, and hadron beams. LED data were used for channel-by-channel gain extraction, muons provided minimum-ionizing-particle calibration, and preliminary power-pulsing tests showed no observable gain drop with a switch-on time of $60~\mu\mathrm{s}$ relative to continuously running mode (Liu, 2017).

A major outcome of that campaign was the identification of a surface-mounted SiPM concept as the only design suitable for mass assembly. In this geometry, a surface-mounted SiPM is soldered onto the PCB and directly coupled to a scintillator tile with a dome-shaped cavity. The design was optimized with GEANT4 simulations for high and uniform light collection efficiency. Subsequent development included a first proof-of-principle 144-channel SMD-HBU, six new SMD-HBUs in 2016 using new SiPMs and updated tile design, automated pick-and-place assembly, reduced dark-count noise, strongly suppressed inter-pixel crosstalk, and improved uniformity in SiPM quality; the reported crosstalk level was $< 2\%$ at the nominal reverse voltage. The targeted next step was a steel-stack system with 40 active layers, $2\times2$ HBUs per layer, and about 1% of the barrel ILC-AHCAL (Liu, 2017).

The BAIKAL-GVD project presents a closely related engineering-array logic (Avrorin et al., 2013). Its prototyping phase aimed at “in situ comprehensive tests of all elements and systems of the future telescope as the parts of engineering arrays operating in Lake Baikal.” The sequence of deployments proceeded from a 6-OM reduced-size section in 2008 and a 12-OM prototype string in 2009 to a three-string engineering array in April 2011, a first full-scale 24-OM string in April 2012, and a three-string demonstration-cluster stage in 2013. The 2013 array had 72 OMs on three 345 m long full-scale strings plus an instrumentation string, while the optimized final design emphasized 10,386 PMTs arranged as 27 clusters with 8 strings each, an instrumented area of about $2~\mathrm{km}^2$ , and an instrumented volume of about $1.4~\mathrm{km}^3$ (Avrorin et al., 2013).

In both detector examples, the prototype is not only a reduced copy of the final apparatus. It is the mechanism by which calibration, timing, power, communication, positioning, trigger logic, manufacturability, and scalability are sampled under realistic operating conditions. This suggests that, in detector R&D, protosampling names the transition from proving a measurement principle to proving the engineering and production concept.

4. Structured sub-Nyquist acquisition

A second major technical usage concerns hardware that samples structured signals below the Nyquist rate by replacing uniform amplitude sampling with prototype acquisition mechanisms tailored to the signal model. In the IF-TEM ADC work, protosampling is realized as time-encoding based, event-driven sampling for finite-rate-of-innovation signals (Naaman et al., 2023). The integrate-and-fire time-encoding machine is asynchronous, clock-less, and outputs spike times rather than amplitudes. For bounded input $y(t)$ , firing times satisfy

$\frac{1}{\kappa}\int_{t_n}^{t_{n+1}} \big(y(s)+b\big)\,ds = \delta,$

and the derived interval measurements are

$y_n \triangleq \int_{t_n}^{t_{n+1}} y(s)\,ds = -b(t_{n+1}-t_n)+\kappa\delta.$
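The firing rule and the interval measurements can be checked numerically. The following is a minimal sketch, assuming an ideal integrator with instantaneous reset and a sinusoidal test input whose antiderivative is known in closed form; the function names and the parameter values for $b$, $\kappa$, $\delta$ are illustrative and are not taken from the paper. It finds each firing time by bisection and confirms that $y_n$ computed from spike times alone matches the true integral of $y$ over each interval.

```python
import math


def if_tem_fire_times(antiderivative, b, kappa, delta, c, t_start, t_end):
    """Firing times of an ideal integrate-and-fire time-encoding machine.

    antiderivative: callable Y with Y'(t) = y(t), where |y(t)| <= c < b.
    The integrator resets at each firing, so t_{n+1} solves
    (1/kappa) * integral_{t_n}^{t_{n+1}} (y(s) + b) ds = delta.
    """
    threshold = kappa * delta
    max_gap = threshold / (b - c)  # y + b >= b - c bounds the interval length
    times = [t_start]
    while True:
        t_n = times[-1]

        def charge(t):
            # Integral of y(s) + b over [t_n, t].
            return antiderivative(t) - antiderivative(t_n) + b * (t - t_n)

        lo, hi = t_n, t_n + max_gap
        for _ in range(100):  # bisection; charge is increasing because y + b > 0
            mid = 0.5 * (lo + hi)
            if charge(mid) < threshold:
                lo = mid
            else:
                hi = mid
        if hi > t_end:
            return times
        times.append(hi)


def interval_measurements(times, b, kappa, delta):
    """y_n = -b (t_{n+1} - t_n) + kappa * delta, computed from spike times only."""
    return [-b * (t1 - t0) + kappa * delta for t0, t1 in zip(times, times[1:])]


# Assumed test input y(t) = sin(2 pi t), so c = 1 and Y(t) = -cos(2 pi t) / (2 pi).
def Y(t):
    return -math.cos(2 * math.pi * t) / (2 * math.pi)


b, kappa, delta, c = 2.0, 1.0, 0.05, 1.0  # illustrative values with c < b
spikes = if_tem_fire_times(Y, b, kappa, delta, c, 0.0, 2.0)
y_n = interval_measurements(spikes, b, kappa, delta)
direct = [Y(t1) - Y(t0) for t0, t1 in zip(spikes, spikes[1:])]
max_error = max(abs(u - v) for u, v in zip(y_n, direct))
print(len(spikes), max_error)
```

Because $b - c \le y(t) + b \le b + c$, every inter-spike interval in this sketch lies between $\kappa\delta/(b+c)$ and $\kappa\delta/(b-c)$, which is why the bisection bracket can be fixed in advance.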

The hardware board implements an integrator, comparator, differentiator, and fast reset path using a FET. For periodic FRI signals, the reconstruction chain first prefilters with a sum-of-sincs kernel over

$\mathcal{K}=\{-K,\ldots,-1,1,\ldots,K\}, \qquad K\ge 2L,$

then estimates Fourier coefficients from spike times and finally recovers amplitudes and delays by the annihilating filter. The reported hardware retrieved FRI parameters with a reconstruction error that the paper quantifies in dB, while operating at rates approximately 10 times lower than the Nyquist rate; in a two-pulse example, 19 time instances produced a firing rate of 1.9 MHz, about 4.75 times the rate of innovation and about 10.5 times lower than the Nyquist rate (Naaman et al., 2023).
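The last stage of this chain, annihilating-filter recovery, can be illustrated in isolation. The sketch below makes several simplifying assumptions that are not from the paper: the pulses are ideal Diracs, there are exactly $L = 2$ of them, and the Fourier coefficients $X[k] = \frac{1}{T}\sum_l a_l e^{-j 2\pi k t_l / T}$ for $k = 1,\ldots,4$ are available exactly rather than estimated from spike times. It uses only nonzero indices, in line with the index set $\mathcal{K}$ above, and all function names are hypothetical.

```python
import cmath
import math


def fourier_coeffs(amps, delays, period, ks):
    """Exact Fourier-series coefficients of a periodic stream of Diracs."""
    return {
        k: sum(a * cmath.exp(-2j * math.pi * k * t / period)
               for a, t in zip(amps, delays)) / period
        for k in ks
    }


def recover_two_pulses(X, period):
    """Recover (delay, amplitude) pairs of two pulses from X[1..4].

    The annihilating filter 1, h1, h2 satisfies
    X[k+2] + h1 * X[k+1] + h2 * X[k] = 0, and its polynomial
    z^2 + h1 z + h2 has roots u_l = exp(-2j pi t_l / period).
    """
    x1, x2, x3, x4 = X[1], X[2], X[3], X[4]
    # Solve the 2x2 system for (h1, h2) with Cramer's rule.
    det = x2 * x2 - x1 * x3
    h1 = (x1 * x4 - x2 * x3) / det
    h2 = (x3 * x3 - x2 * x4) / det
    # Roots of the annihilating polynomial give the delays.
    disc = cmath.sqrt(h1 * h1 - 4 * h2)
    u1, u2 = (-h1 + disc) / 2, (-h1 - disc) / 2
    two_pi = 2 * math.pi
    delays = [((-cmath.phase(u)) % two_pi) * period / two_pi for u in (u1, u2)]
    # Amplitudes from the 2x2 Vandermonde-type system on X[1], X[2].
    b1, b2 = x1 * period, x2 * period
    det2 = u1 * u2 * (u2 - u1)
    a1 = (b1 * u2 * u2 - u2 * b2) / det2
    a2 = (u1 * b2 - u1 * u1 * b1) / det2
    return sorted(zip(delays, (a1.real, a2.real)))


period = 10e-6                      # assumed signal period
true_delays = [2e-6, 6.5e-6]        # assumed ground truth for the demo
true_amps = [1.0, 0.5]
X = fourier_coeffs(true_amps, true_delays, period, range(1, 5))
estimate = recover_two_pulses(X, period)
print(estimate)
```

With $L$ pulses the same idea needs $2L$ consecutive coefficients and a general null-space and root-finding step; the closed-form $2\times 2$ solve here is only practical for this two-pulse case.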

The sub-Nyquist radar prototype implements an allied logic through Xampling (Baransky et al., 2012). The received signal in one PRI is modeled as

$x(t)=\sum_{l=1}^{L} a_l\,h(t-t_l),$

so the signal is sparse in delay and can be described by roughly $2L$ unknowns per PRI, namely the amplitudes $a_l$ and delays $t_l$ of $L$ echoes of the known pulse $h(t)$. Rather than digitizing the full wideband waveform, the receiver performs analog preprocessing with a 4-channel crystal receiver that extracts four groups of consecutive Fourier coefficients. Each channel is sampled at 250 kHz, for a total rate of 1 MHz, even though the received radar signals would require about 30 MHz practical matched-filter sampling and about 20 MHz complex-equivalent Nyquist sampling. The prototype therefore achieved about a 30-fold reduction relative to the practical matched-filter implementation and about a 20-fold reduction relative to the signal Nyquist rate while maintaining reasonable detection capability (Baransky et al., 2012).
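As a quick check of the rate figures quoted for the radar prototype, using only the numbers stated in the text:

$4 \times 250~\mathrm{kHz} = 1~\mathrm{MHz}, \qquad \frac{30~\mathrm{MHz}}{1~\mathrm{MHz}} = 30, \qquad \frac{20~\mathrm{MHz}}{1~\mathrm{MHz}} = 20.$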

A more theoretical bridge from ordinary periodic sampling to irregular sampling is provided by Periodic Nonuniform Sampling (Lacaze, 2019). The paper starts from Shannon reconstruction, which in generic notation for a signal band-limited to $(-\tfrac{1}{2T},\tfrac{1}{2T})$ reads

$x(t)=\sum_{n\in\mathbb{Z}} x(nT)\,\frac{\sin\big(\pi(t-nT)/T\big)}{\pi(t-nT)/T},$

and generalizes to a PNS of order $L$, in which samples are taken on $L$ interleaved periodic sequences,

$t_{n,k}=nLT+\theta_k, \qquad n\in\mathbb{Z},\quad k=1,\ldots,L.$

For multiband spectra, recovery is formulated through a linear system that relates the $L$ sample sequences to the spectral components of the signal, and exact reconstruction requires that this system be uniquely solvable. The paper’s conceptual move is to generalize the baseband spectrum hypothesis linked to the Nyquist bound to spectra in a finite number of intervals suited to the Landau condition. It also explicitly argues that resampling is costly in calculation time and accuracy, and that irregular sampling should be treated directly rather than first regularized onto a periodic grid (Lacaze, 2019).
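To make the sampling pattern concrete, the sketch below generates the instants of a periodic nonuniform scheme as interleaved periodic subsequences. The notation is an assumption for illustration rather than the paper's own: `order` is the number of subsequences, `period` is the mean sampling period, and `offsets` are the per-subsequence shifts.

```python
def pns_sample_times(order, period, offsets, n_blocks):
    """Sample instants of a periodic nonuniform scheme.

    Each of the `order` subsequences is periodic with period order * period
    and is shifted by its own offset, so the mean rate stays at 1 / period.
    """
    if len(offsets) != order:
        raise ValueError("need exactly one offset per subsequence")
    block = order * period
    return sorted(n * block + theta for n in range(n_blocks) for theta in offsets)


# Illustrative values: two interleaved sequences with unequal spacing.
times = pns_sample_times(order=2, period=1.0, offsets=[0.0, 0.3], n_blocks=3)
mean_rate = len(times) / (3 * 2 * 1.0)  # samples per unit time over the 3 blocks
print(times, mean_rate)
```

Setting the offsets to equally spaced multiples of `period` recovers ordinary periodic sampling, which is the sense in which this pattern sits between uniform and irregular sampling.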

These three lines of work share a common structure: they replace brute-force uniform acquisition with a prototype measurement architecture—spike times, selected Fourier bands, or interleaved periodic subsequences—chosen to match the signal’s low-dimensional or multiband structure.

5. Algorithmic and computational interpretations

In computational sampling theory, protosampling also appears as a design strategy in which the sampler itself is reorganized around sparsity, time occupancy, or conditional structure. The co-prime sampling work on time-division multiplexing starts from prototype and extended co-prime samplers, each based on two low-rate sub-samplers with spacings $MT$ and $NT$, where $M$ and $N$ are co-prime and $T$ is the Nyquist period of the highest-bandwidth signal (Dias, 2021). Its central proposal is to use the “vacant” slots of extended co-prime operation to sample another signal via time division multiplexing. In the three-sampler version this permits acquisition of two signals with three samplers instead of four, and the paper further presents a two-sampler architecture in which one branch is shifted by half a period, introducing non-integer lag structure and connecting the method to super-Nyquist behavior and to generalized Extremely Sparse Co-Prime Arrays/Samplers (Dias, 2021).
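The notion of vacant slots can be illustrated on the Nyquist grid. The sketch below is a simplified picture covering one period of a prototype co-prime pair, not the extended scheme or the multiplexing architecture of the paper: two sub-samplers fire every `M` and every `N` grid slots, with `M` and `N` co-prime, and the function (a hypothetical name) lists which slots in one period of `M * N` slots are used and which remain free.

```python
from math import gcd


def coprime_slot_occupancy(M, N):
    """Occupied and vacant Nyquist-grid slots in one period of M * N slots."""
    if gcd(M, N) != 1:
        raise ValueError("M and N must be co-prime")
    period = M * N
    first = set(range(0, period, M))   # sub-sampler with spacing M * T
    second = set(range(0, period, N))  # sub-sampler with spacing N * T
    occupied = sorted(first | second)
    vacant = [k for k in range(period) if k not in first and k not in second]
    return occupied, vacant


occupied, vacant = coprime_slot_occupancy(3, 4)
print(occupied)  # [0, 3, 4, 6, 8, 9]
print(vacant)    # [1, 2, 5, 7, 10, 11]
```

Only slot 0 is shared by both sub-samplers, so $M + N - 1$ of the $MN$ slots are occupied; the remaining slots are the kind of idle time that a multiplexed second signal could use.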

A different algorithmic reinterpretation appears in "A Proximal Algorithm for Sampling" (Liang et al., 2022). There the target law is

$\nu(x)\propto \exp\big(-f(x)\big),$

with $f$ only semi-smooth or non-smooth and possibly non-convex. The method is based on the alternating sampling framework (ASF), a special case of Gibbs sampling on the joint distribution

$\pi(x,y)\propto \exp\Big(-f(x)-\frac{1}{2\eta}\|x-y\|^2\Big),$

with an exact rejection-sampling realization of the restricted Gaussian oracle

$\pi^{X\mid Y}(x\mid y)\propto \exp\Big(-f(x)-\frac{1}{2\eta}\|x-y\|^2\Big).$

ASF is characterized as “protosampling” because it converts a difficult sampling task into a proximal-style alternating scheme where the hard conditional is implemented exactly by rejection sampling. Under the stated step-size condition on $\eta$, the expected number of rejection steps admits an explicit bound,

and the paper reports improved non-asymptotic complexity for broad non-smooth and non-convex settings (Liang et al., 2022).
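The alternating structure can be sketched in one dimension. The code below is a toy illustration under stated assumptions, not the paper's algorithm: the potential is $f(x) = |x|$ (non-smooth, with target density proportional to $e^{-|x|}$), the step size `eta` is arbitrary, and the restricted Gaussian oracle is realized by a naive rejection sampler that proposes from the Gaussian factor and accepts with probability $e^{-f(x)}$, which is valid only because $f \ge 0$. The paper's own proposal construction and step-size condition are not reproduced here.

```python
import math
import random


def rgo_rejection(y, eta, f, rng):
    """Exact draw from the density proportional to exp(-f(x) - (x - y)^2 / (2 eta)).

    Naive scheme: propose x ~ N(y, eta), accept with probability exp(-f(x)).
    Requires f(x) >= 0 so that the acceptance probability is at most 1.
    """
    while True:
        x = rng.gauss(y, math.sqrt(eta))
        if rng.random() < math.exp(-f(x)):
            return x


def alternating_sampler(f, eta, n_steps, x0=0.0, seed=0):
    """Gibbs sampling on pi(x, y) proportional to exp(-f(x) - (x - y)^2 / (2 eta))."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        y = rng.gauss(x, math.sqrt(eta))   # y | x is Gaussian centered at x
        x = rgo_rejection(y, eta, f, rng)  # x | y is the restricted Gaussian oracle
        samples.append(x)
    return samples


samples = alternating_sampler(abs, eta=1.0, n_steps=20000)
mean_abs = sum(abs(s) for s in samples) / len(samples)
print(mean_abs)  # the target exp(-|x|) / 2 has E|x| = 1
```

Integrating the joint density over $y$ leaves a marginal in $x$ proportional to $e^{-f(x)}$, which is why the $x$ iterates target the desired law. The naive proposal becomes inefficient when $y$ falls where $f$ is large, which is the kind of cost a carefully centered proposal and a step-size condition are meant to control.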

Computer architecture uses a further extension of the idea in representative-region selection and live sampled simulation. Pac-Sim is presented as a sampled-simulation methodology for multithreaded workloads that combines the representative-region spirit of profile-driven methods with no-upfront-analysis online adaptivity (Liu et al., 2023). Its runtime loop includes marker detection, region profiling, clustering, prediction, and reconstruction; online BBVs use 16 dimensions; region sizes are bounded below and above, with both limits specified in millions of instructions; and the reported average runtime errors are 1.63% for statically scheduled benchmarks and 3.81% for dynamically scheduled benchmarks, with speedups reported both as a maximum and as an average for SPEC CPU2017 train workloads (Liu et al., 2023).
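The general pattern behind such a loop (profile a region, match it to a cluster, simulate in detail only when no cluster fits, and reconstruct total runtime from per-cluster results) can be sketched generically. The code below is not Pac-Sim's algorithm: the distance threshold, the toy 4-dimensional basic-block vectors, the first-match clustering rule, and every name are assumptions made for illustration, and the prediction step is omitted.

```python
def normalize(bbv):
    """Scale a basic-block vector so its entries sum to 1."""
    total = sum(bbv)
    return [x / total for x in bbv]


def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def sampled_estimate(regions, simulate, threshold=0.1):
    """Estimate total cycles while simulating one region per cluster.

    regions:  list of (bbv, instruction_count) in execution order.
    simulate: callable mapping a region index to its cycles per instruction;
              stands in for an expensive detailed simulation.
    """
    clusters = []  # (signature, cycles_per_instruction) of simulated regions
    total_cycles, simulated = 0.0, 0
    for idx, (bbv, n_instr) in enumerate(regions):
        sig = normalize(bbv)
        match = next((c for c in clusters if manhattan(c[0], sig) <= threshold), None)
        if match is None:
            cpi = simulate(idx)  # new behavior: pay for detailed simulation
            clusters.append((sig, cpi))
            simulated += 1
        else:
            cpi = match[1]       # known behavior: reuse the cluster's result
        total_cycles += cpi * n_instr
    return total_cycles, simulated


# Toy workload alternating between two phases (hypothetical numbers).
regions = [
    ([10, 0, 5, 0], 1_000_000), ([11, 0, 5, 0], 1_000_000),
    ([0, 8, 0, 8], 1_000_000), ([10, 0, 6, 0], 1_000_000),
    ([0, 9, 0, 8], 1_000_000), ([0, 8, 0, 9], 1_000_000),
]
cpi_table = [1.0, 1.0, 2.5, 1.0, 2.5, 2.5]
total, n_simulated = sampled_estimate(regions, lambda i: cpi_table[i])
print(total, n_simulated)
```

In this toy run only two of the six regions are simulated in detail, which is the source of speedup in sampled simulation; the accuracy of the estimate depends on how well regions within a cluster share performance behavior.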

Nugget provides an infrastructure-oriented counterpart for targeted sampling across simulators and real hardware (Qiu et al., 2 Sep 2025). It operates at the LLVM IR level, defines interval analysis in terms of executed LLVM IR instructions rather than machine instructions, emits portable “nuggets” with start and end markers, and supports hardware validation before simulation. The paper reports that Nugget reduces average interval-analysis overhead relative to gem5 functional simulation on multithreaded NPB workloads, and it gives average overhead figures separately for SPEC CPU2017, LSMS, and NPB (Qiu et al., 2 Sep 2025). A plausible implication is that, in architecture research, protosampling increasingly denotes not only how representative regions are chosen but also the tooling layer that makes such selection portable and experimentally tractable.

6. Unifying themes and conceptual divergences

The cited uses of protosampling are heterogeneous, but several common themes recur. First, the prototype is elevated from a downstream test artifact to an active sampling device. In Atelier, generated and sampled media co-exist and recursively seed further generation (Guo et al., 8 Jan 2026). In AHCAL and BAIKAL-GVD, the prototype is the vehicle for sampling calibration, timing, integration, power, and manufacturability constraints under realistic conditions (Liu, 2017, Avrorin et al., 2013). In IF-TEM, Xampling radar, and PNS, the acquisition front end is redesigned so that the measured object is a structured summary—event times, selected Fourier bands, or offset periodic subsequences—rather than a uniform waveform (Naaman et al., 2023, Baransky et al., 2012, Lacaze, 2019).

Second, protosampling repeatedly appears where exhaustive treatment is too costly. The creativity paper addresses the explosion of generated assets and the need to keep process visible (Guo et al., 8 Jan 2026). The detector papers address the impracticality of committing directly to full collider or kilometer-scale instruments without scalable technological validation (Liu, 2017, Avrorin et al., 2013). The sub-Nyquist ADC and radar systems address clock cost, power consumption, and Nyquist-rate data volume (Naaman et al., 2023, Baransky et al., 2012). Pac-Sim and Nugget address the slowness of full architectural evaluation and simulation-driven interval discovery (Liu et al., 2023, Qiu et al., 2 Sep 2025).

Third, the literature does not support reducing protosampling to a single normative doctrine. In some papers it is principally about emergence and reflective practice; in others, about manufacturing readiness; in others, about exact recovery from low-dimensional signal models; and in still others, about simulation methodology. This suggests that the most stable encyclopedic definition is relational rather than procedural: protosampling is a mode of inquiry in which a provisional construct is used to sample a larger possibility space while simultaneously being refined as a candidate realization.

That breadth is also the source of ambiguity. In creative AI, the term names a conceptual reframing. In detector engineering, it names prototype-centered validation. In signal acquisition, it is tied to concrete asynchronous or multiband measurement architectures. In sampled simulation, it denotes representative-region discovery and execution-fragment reuse. The concept therefore has explanatory value across fields, but only when its local technical meaning is specified.