
In-Bank Computing Approach

Updated 23 December 2025
  • In-Bank Computing is a dual paradigm combining centralized spreadsheet analytics and in-DRAM processing to enhance data throughput and reduce operational risks.
  • It integrates server-side validation and versioning in finance with high-performance Processing-In-Memory, enabling faster and more reliable analytic workflows.
  • The approach leverages parallel in-memory filtering and vectorized operations to deliver 5–6× speedups in database queries while maintaining low area and power overhead.

The in-bank computing approach denotes two distinct but conceptually related paradigms for accelerating data analytics and mitigating operational risk in high-assurance financial and analytic environments. In the context of spreadsheet analytics management in finance, in-bank computing refers to the unification of spreadsheet paradigms with centralized server-side computation—transforming user-developed analytics into centrally validated, versioned, and efficiently executed server routines. In the context of high-performance in-memory database analytics, in-bank computing refers to physically moving compute resources into DRAM banks (“Processing-In-Memory,” or PIM), such that filtering and scan operations execute in close proximity to the stored data, exploiting DRAM bandwidth and intrinsic parallelism. Both approaches dissolve traditional bottlenecks between user logic and system controls or between logic and memory, delivering scalable analytic capability while reducing operational, regulatory, or bandwidth risks (0802.2932, Shekar et al., 8 Apr 2025).

1. System Architectures and Component Layout

Spreadsheet Analytics Management in Financial Markets

The architecture consists of three tightly integrated tiers:

  • Formula Grid Editor: An Excel-style spreadsheet editor embedded within a financial workbench, allowing traders, quants, and risk personnel to specify cell-based formulas and bind named instrument attributes to spreadsheet cells.
  • TimeScape Repository: A centralized database and metadata store maintaining typed time-series (e.g., prices, sizes), instrument schemas, and versioned Formula Grid definitions with audit trails and access controls.
  • Formula Grid Calculation Server: A horizontally scalable back-end that retrieves raw arrays from the database, applies spreadsheet-style computations, and delivers computed analytics to all clients. All operations are executed engine-side using optimized vector/matrix routines in C/C++.

Component interaction is illustrated by the following conceptual flow:

Trader/Risk <-> Formula Grid Editor <-> TimeScape Repository <-> Formula Grid Calculation Server
All users—regardless of UI (Excel, .NET, APIs)—invoke a unified, centrally managed analytics pipeline (0802.2932).

Processing-In-Memory (PIM) for Database Filtering

The DRAM-centric in-bank PIM architecture arranges per-bank filtering units (BFUs) at the interface of each DRAM bank, downstream from the row buffer. Key hardware features include:

  • Control Unit: Schedules DRAM commands in “All-Bank” mode, dispatching SIMD-style comparisons each cycle.
  • Reconfigurable Comparator Block: Supports bit-widths from 2 to 64 bits, enabling equality/range checks across dictionary-encoded or bit-packed columns.
  • Bitmap Output Buffer: Accumulates one bit per data element for predicate matches—after 64 elements, the bitmap is flushed to DRAM.

System-wide, up to 512 concurrent BFUs can operate in parallel for scalable, high-bandwidth internal filtering (Shekar et al., 8 Apr 2025).
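The per-bank datapath described above can be captured in a small functional model: a reconfigurable comparator evaluates an equality/range predicate over bit-packed column values and accumulates a one-bit-per-element match bitmap that is flushed after every 64 elements. This is an illustrative sketch, not the paper's RTL; all names are our own.

```python
def bfu_filter(values, lo, hi, bit_width=16):
    """Functional (not cycle-accurate) model of one bank filtering unit (BFU):
    returns 64-bit bitmap words for the range predicate lo <= v <= hi."""
    assert 2 <= bit_width <= 64          # comparator supports 2..64-bit operands
    mask = (1 << bit_width) - 1
    bitmaps, word, count = [], 0, 0
    for v in values:
        v &= mask                        # value as stored in the packed column
        if lo <= v <= hi:                # equality is the special case lo == hi
            word |= 1 << (count % 64)
        count += 1
        if count % 64 == 0:              # flush the bitmap buffer to DRAM
            bitmaps.append(word)
            word = 0
    if count % 64:
        bitmaps.append(word)             # final partial word
    return bitmaps

bitmaps = bfu_filter(range(100), lo=10, hi=19)
matches = sum(bin(w).count("1") for w in bitmaps)
print(matches)  # 10 values fall in [10, 19]
```

With 100 input elements the unit emits two bitmap words (one full flush at 64 elements, one partial word at the end), mirroring the flush-after-64 behavior described above.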

2. Analytic Workflow and Data Flow

Financial Analytics Roll-out

The typical deployment involves the following steps:

  1. Schema Extension: Data architects declare new analytic attributes (e.g., VWAP) as “Formula Grid” types in the repository.
  2. Formula Grid Authoring: Business users drag relevant time-series (e.g., TradePrice, TradeSize) onto a spreadsheet canvas. Intermediate vectors and scalar outputs are constructed with cell-level formulas and may be hidden for clarity.
  3. Validation and Check-in: Grids are tested on reference instruments using preview panes, then checked into the central store with controlled access and versioning.
  4. Central Rollout: All clients immediately receive the new analytic as a selectable attribute with consistent, server-side semantics (0802.2932).

PIM-Based Query Acceleration

The execution flow for accelerating analytic SQL workloads is hybridized:

  • PIM (DRAM Bank Level): Executes scan and filter predicates (equality/range) directly on target columns, generating output bitmaps.
  • CPU: Handles complex relational operations (joins, aggregations, final output materialization) using only a small, filtered dataset resulting from the in-DRAM scan.

The detailed cooperative sequence involves initializing BFUs via mode registers, dispatching bulk PIM_FILTER operations from the CPU, collecting bitmaps, and completing aggregation on the CPU (Shekar et al., 8 Apr 2025).
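The cooperative sequence can be sketched end to end in a few lines. The function names below (`pim_filter`, `cpu_aggregate`) are illustrative stand-ins, not the paper's API: the "PIM" side produces a match bitmap for a scan predicate, and the CPU then touches only the surviving rows.

```python
def pim_filter(column, predicate):
    """Stand-in for a bulk PIM_FILTER command: one match bit per element."""
    return [1 if predicate(v) else 0 for v in column]

def cpu_aggregate(column, bitmap):
    """CPU-side step: aggregate only the rows whose bitmap bit survived."""
    return sum(v for v, bit in zip(column, bitmap) if bit)

prices = [5, 120, 42, 7, 300, 18]
bitmap = pim_filter(prices, lambda v: v < 50)   # in-bank scan/filter
total = cpu_aggregate(prices, bitmap)           # CPU works on filtered subset
print(total)  # 5 + 42 + 7 + 18 = 72
```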

3. Computational and Operational Properties

High-Volume Spreadsheet Computing

  • Intrinsic array operations: Each cell can contain entire time series (e.g., 25,000+ points per cell), operating in O(n) time per cell; up to 100,000+ points per instrument per calculation.
  • Linear scaling: Execution time grows linearly with input size (T(n) ≈ α·n, with α ≈ 2–5 μs/point for simple arithmetic). Benchmarked cases compute VWAP over 50,000 datapoints in sub-second latency on commodity servers.
  • Server-based scaling: Rather than distributing spreadsheets, the bank adds server nodes behind a load-balancer for concurrent clients (0802.2932).
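The linear cost model above can be checked with simple arithmetic: plugging both endpoints of the quoted coefficient range α ≈ 2–5 μs/point into T(n) ≈ α·n for n = 50,000 datapoints.

```python
# Back-of-the-envelope check of T(n) ≈ α·n for n = 50,000 datapoints.
n = 50_000
for alpha_us in (2, 5):                  # μs per point, quoted range endpoints
    t_seconds = alpha_us * 1e-6 * n
    print(f"alpha={alpha_us} µs/point -> T(n)={t_seconds:.2f} s")
# Both endpoints (0.10 s and 0.25 s) stay well within sub-second latency.
```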

DRAM-Level PIM Parallelism

  • Parallel lanes: Each BFU operates on 64 bits per t_{CCD_L} cycle; with B banks, theoretical DRAM-internal bandwidth exceeds 800 Gb/s per rank.
  • Area/power overhead: Silicon footprint is 0.1% per DRAM die; static power is approximately 118 μW per bank.
  • Data mapping optimizations: Cacheline and superpage configurations ensure data locality for efficient SIMD filtering (Shekar et al., 8 Apr 2025).
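The per-rank bandwidth figure follows from the per-cycle lane width. The timing and bank count below are our assumptions for illustration (DDR5-like t_{CCD_L} ≈ 2.5 ns and 32 banks per rank), not values from the paper:

```python
# Rough check of the internal-bandwidth claim: each BFU consumes 64 bits
# per t_CCD_L cycle. Timing and bank count are assumed, not from the paper.
bits_per_cycle = 64
t_ccd_l = 2.5e-9          # seconds (assumed DDR5-like timing)
banks_per_rank = 32       # assumed

per_bank_gbps = bits_per_cycle / t_ccd_l / 1e9
per_rank_gbps = per_bank_gbps * banks_per_rank
print(per_bank_gbps, per_rank_gbps)  # 25.6 Gb/s per bank, 819.2 Gb/s per rank
```

Under these assumptions the aggregate internal bandwidth lands just above the 800 Gb/s per rank quoted above.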

4. Integration of Logic and Data

Logic/Query Integration

  • Formula compilation: Spreadsheet cell formulas are compiled into execution plans mixing vectorized data retrieval and arithmetic directly on raw arrays, removing the need for explicit user JOIN logic.
  • Example: VWAP

\mathrm{VWAP} = \frac{\sum_{i=1}^{n} p_i\, v_i}{\sum_{i=1}^{n} v_i}

Server pseudo-code directly fetches time-series, multiplies elementwise, sums, and divides—with all alignment and vectorization logic, including matching timestamps, handled by underlying infrastructure (0802.2932).
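The server-side routine described above (fetch, multiply elementwise, sum, divide) can be sketched in a few vectorized lines. This is a minimal NumPy illustration, not the production C/C++ code, and it assumes the two series are already aligned on timestamp, the step the text attributes to the underlying infrastructure:

```python
import numpy as np

def vwap(prices, sizes):
    """Vectorized VWAP = (Σ p_i v_i) / (Σ v_i), assuming aligned series."""
    p = np.asarray(prices, dtype=float)
    v = np.asarray(sizes, dtype=float)
    return float((p * v).sum() / v.sum())

print(vwap([10.0, 11.0, 12.0], [100, 300, 100]))  # (1000+3300+1200)/500 = 11.0
```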

  • PIM/CPU task partitioning: All memory-bound, parallelizable filters are executed in-bank, and CPU-bound, memory-irregular operations remain on the processor, preserving the conventional query processing stack (Shekar et al., 8 Apr 2025).

Denormalization to Expose In-Bank Parallelism

  • Pre-join denormalization: Original equi-join

R \bowtie S = \bigcup_{s \in S} \{\, (r, s) : r \in R,\ r[C] = s[C] \,\}

is replaced with

\mathrm{FilterOperation}(R', C, v) = \{\, r \in R' : r[C] = v \,\}

after ‘folding’ the necessary S columns into R. This reduces query-stage costs from O(|R| + |S|) (build/probe) to O(|R'|) (brute scan) (Shekar et al., 8 Apr 2025).
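The trade can be demonstrated on a toy pair of tables: a hash-based equi-join on column C versus a brute scan over a pre-joined (denormalized) R'. Both yield the same rows; the denormalized form needs only the single-table FilterOperation that maps cleanly onto in-bank scans.

```python
R = [{"C": 1, "r": "a"}, {"C": 2, "r": "b"}, {"C": 1, "r": "c"}]
S = [{"C": 1, "s": "x"}, {"C": 2, "s": "y"}]

# Hash join: O(|R| + |S|) build/probe at query time.
build = {row["C"]: row for row in S}
joined = [{**r, **build[r["C"]]} for r in R if r["C"] in build]

# Denormalized table R' with the needed S columns folded in, materialized
# once ahead of query time; the query itself is an O(|R'|) brute scan,
# i.e. the FilterOperation(R', C, v) form.
R_prime = joined
filtered = [row for row in R_prime if row["C"] == 1]
print(filtered)  # the two rows with C == 1 ("a" and "c"), each carrying s
```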

5. Performance, Scalability, and Empirical Findings

Financial Analytics, Operational Risk

  • Empirical latency: In-bank computing computes VWAP for 25,481 points in under 200 ms and supports on-demand analytics for thousands of equities across the firm.
  • Macro migration: Migrating over 150 custom Excel macros to 30 centralized analytics cut report generation from 45 to under 5 minutes.
  • Risk mitigation: “One truth” analytics definitions and audit trails satisfy regulatory needs (SOX 404), and operational risk drops with centralization, with observed error-reduction factors (β) exceeding 90% for migrated analytics:

R_{\mathrm{new}} = R_{\mathrm{old}} \cdot (1 - \beta)
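Plugging the quoted reduction factor into this relation is a one-line check: β > 0.9 leaves the residual operational risk below 10% of its pre-migration level.

```python
R_old = 1.0                 # normalized baseline risk level
beta = 0.9                  # observed error-reduction factor for migrated analytics
R_new = R_old * (1 - beta)
print(f"{R_new:.2f}")       # residual risk: 0.10 of baseline, a 10x reduction
```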

PIM Database Analytics

  • End-to-end speedup: 5.92×–6.38× acceleration on standard analytics workloads (TPC-H, SSB) at modest (9–17%) memory overhead.
  • Selectivity dependence: Speedup is highest for highly selective filters (survival fraction below 10⁻⁴), diminishing as more data passes the filter.
  • Scale: At 23 GB (SSB scale 100), speedup reaches 5.9×; the effect is smaller (4.4×) at 1 GB (scale 10), indicating greater benefit for large databases.
  • Area/bandwidth efficiency: Hardware modifications add only 0.001 mm² per DRAM chip, enabling 16–512× internal parallelism (Shekar et al., 8 Apr 2025).
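The selectivity dependence has a simple first-order explanation. The model below is our own illustrative assumption, not the paper's: total accelerated time ≈ fixed scan cost plus CPU work on the surviving fraction, so speedup saturates near the reciprocal of the scan cost for highly selective filters and decays toward 1× as more rows survive.

```python
def speedup(selectivity, scan_frac=0.15):
    """First-order speedup model (assumed, not from the paper): baseline
    cost is normalized to 1.0; the accelerated path pays scan_frac for the
    in-bank scan plus the surviving fraction of the CPU-side work."""
    accelerated = scan_frac + selectivity * (1 - scan_frac)
    return 1.0 / accelerated

for s in (1e-4, 1e-2, 0.5):
    print(f"selectivity={s:g} -> speedup={speedup(s):.2f}x")
# Highly selective filters approach 1/scan_frac; speedup shrinks toward 1x
# as more rows pass the filter.
```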

6. Risk Management, Software Controls, and Best Practices

Centralized storage and logic in banking infrastructures:

  • Versioning and controls: All spreadsheet logic is versioned with full audit trail and role-based access; every analytic is centrally managed—eliminating uncontrolled model proliferation.
  • Transparency and reversal: All changes are timestamped, user-tagged, and reversible.
  • SOX alignment: Enables technical enforcement of spreadsheet risk controls; eliminates the necessity of error-prone, manual process reviews (0802.2932).

PIM-based database acceleration:

  • Best-fit workloads: Filtering kernels are ideal for in-bank processing. Joins and aggregation operations are less suitable, so hybrid execution is necessary.
  • Software practices: Denormalization and careful selection of target columns are essential to memory/bandwidth efficiency. Predicate parameters and filtering configuration are communicated in minimal cycles per query, amortizing setup overhead across massive parallel scans.
  • System stack retention: Most of the dataflow/programming stack remains unmodified, with DBMSes (e.g., DuckDB) extended via a PIM filter operator (Shekar et al., 8 Apr 2025).

In both paradigms, in-bank computing maximizes the exploitation of underlying hardware parallelism or workflow centralization, delivering greater analytic assurance and throughput while substantially mitigating operational, regulatory, or bandwidth bottlenecks.
