AtomicTableLLM: Modular Table Reasoning
- AtomicTableLLM is a modular, skill-based reasoning system designed for detailed, interpretable analysis of scientific tables and structured materials data.
- It employs a pre-trained sequence-to-sequence LLM with a skill-chaining controller to decompose table reasoning into discrete, atomic tasks for claim verification.
- The system achieves enhanced interpretability, reduced error propagation, and cross-domain generalization in areas such as materials science, medicine, and finance.
AtomicTableLLM is a modular, skill-based reasoning system designed for fine-grained, interpretable machine reasoning with scientific tables and structured materials data. In distinct applications, the term may refer to: (1) the OpenLAM large atom model for universal periodic table coverage in computational materials design (Peng et al., 20 Jan 2025), or (2) an LLM-based, modular table reasoning system for claim verification in scientific literature (Zhang et al., 8 Jun 2025). AtomicTableLLM architectures emphasize atomic decomposition of table reasoning, skill chaining, and robust table-structure representation, enabling enhanced interpretability, generalization, and precise verification in domains such as materials science, medicine, and finance.
1. Model Architecture and Inference Flow
AtomicTableLLM as introduced in (Zhang et al., 8 Jun 2025) is based on a pre-trained sequence-to-sequence LLM (e.g., Deepseek-Qwen-7B), augmented by a "skill-chaining controller." This controller coordinates a series of discrete, lightweight reasoning modules—termed "atomic skills"—each handling a well-defined subtask in the table claim verification pipeline:
- Interpretation Head: Extracts the semantic intent of the claim given the table.
- Planner Head: Decomposes the claim into an ordered list of explicit sub-goals.
- Cell-Grounding Head: For each sub-goal, identifies and extracts relevant cell values.
- Reasoning Head: Dynamically invokes one or more atomic skills (e.g., concept matching, value extraction, numerical computation).
- Recap Head: Summarizes outcomes of reasoning steps.
- Conclusion Head: Aggregates stepwise findings to yield a binary SUPPORT/REFUTE label.
This inference schema formalizes the process as a function

$$\hat{y} = f_\theta(T, c),$$

where $T$ is the table (including caption/context), $c$ is the claim, $\theta$ denotes the model parameters, and $\hat{y} \in \{\text{SUPPORT}, \text{REFUTE}\}$ is the output label.
Each component is individually supervised during training, and the overall system proceeds via sequential invocation of the reasoning modules, maintaining explicit chain-of-subplans and grounding evidence at every step.
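The sequential head invocation described above can be sketched as a minimal controller loop. This is an illustrative skeleton, not the paper's actual implementation: the head names, the `heads` dictionary interface, and the stub lambdas standing in for prompted LLM calls are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class VerificationTrace:
    """Accumulates the explicit chain of sub-plans and grounded evidence."""
    interpretation: str = ""
    sub_goals: list = field(default_factory=list)
    grounded_cells: dict = field(default_factory=dict)
    step_summaries: list = field(default_factory=list)
    label: str = ""

def verify_claim(table, claim, heads):
    """Sequentially invoke the six heads; each step conditions on the trace so far."""
    trace = VerificationTrace()
    trace.interpretation = heads["interpret"](table, claim)   # Interpretation Head
    trace.sub_goals = heads["plan"](trace.interpretation)     # Planner Head
    for goal in trace.sub_goals:
        cells = heads["ground"](table, goal)                  # Cell-Grounding Head
        trace.grounded_cells[goal] = cells
        result = heads["reason"](goal, cells)                 # Reasoning Head (atomic skills)
        trace.step_summaries.append(heads["recap"](goal, result))  # Recap Head
    trace.label = heads["conclude"](trace.step_summaries)     # Conclusion Head
    return trace

# Stub heads standing in for prompted LLM calls (toy logic only).
heads = {
    "interpret": lambda t, c: f"Verify: {c}",
    "plan": lambda i: ["extract value", "compare to claim"],
    "ground": lambda t, g: [t["rows"][0][1]],
    "reason": lambda g, cells: cells[0] > 0.9,
    "recap": lambda g, r: (g, r),
    "conclude": lambda s: "SUPPORT" if all(r for _, r in s) else "REFUTE",
}
table = {"header": ["model", "accuracy"], "rows": [["ours", 0.93]]}
print(verify_claim(table, "accuracy exceeds 0.9", heads).label)  # SUPPORT
```

The controller keeps the full trace rather than only the final label, which is what makes the chain-of-subplans auditable after inference.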
2. Atomic Skills and Modular Reasoning
A central concept in AtomicTableLLM is the decomposition of table reasoning into a finite set of atomic skills—elementary, reusable operations. The canonical inventory encompasses the following 12 skills (Zhang et al., 8 Jun 2025):
| # | Atomic Skill | Description |
|---|---|---|
| 1 | Concept Matching | Align claim noun phrases to headers |
| 2 | Concept Disambiguation | Normalize ambiguous column names |
| 3 | Value Extraction | Parse and extract numeric values |
| 4 | Unit Conversion | Canonicalize measurement units |
| 5 | Numerical Calculation | Arithmetic over extracted values |
| 6 | Numerical Comparison | Evaluate numeric relations |
| 7 | Range/Threshold Check | Assess value inclusion in intervals |
| 8 | Schema Understanding | Interpret table structure |
| 9 | Trend Detection | Identify monotonic or extremal trends |
| 10 | Logical Condition Matching | Check logical predicates |
| 11 | OOD Knowledge Utilization | Inject external domain knowledge |
| 12 | Causal Inference | Infer directed relationships |
For each sub-plan generated during inference, the SelectSkills module (a prompted LLM head) determines the appropriate subset and order of skills to invoke. This modularity allows the model to compose highly interpretable, stepwise reasoning chains.
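A skill registry plus an ordered dispatch loop is one natural way to realize this modularity. The sketch below uses skill names from the inventory above, but the implementations are toy stand-ins and the `execute_plan` interface is an assumption, not the paper's API; in the real system, SelectSkills is a prompted LLM head rather than a hand-written planner.

```python
# Registry mapping atomic-skill names to callables over a shared reasoning state.
SKILLS = {}

def skill(name):
    def register(fn):
        SKILLS[name] = fn
        return fn
    return register

@skill("Value Extraction")
def value_extraction(state):
    state["value"] = float(state["cell"])  # parse the grounded cell text
    return state

@skill("Unit Conversion")
def unit_conversion(state):
    # Toy canonicalization rule: GPa -> MPa.
    if state.get("unit") == "GPa":
        state["value"] *= 1000.0
        state["unit"] = "MPa"
    return state

@skill("Numerical Comparison")
def numerical_comparison(state):
    state["verdict"] = state["value"] >= state["threshold"]
    return state

def execute_plan(skill_names, state):
    """Apply the skills chosen by a SelectSkills-style controller, in order."""
    for name in skill_names:
        state = SKILLS[name](state)
    return state

state = {"cell": "1.2", "unit": "GPa", "threshold": 1000.0}
plan = ["Value Extraction", "Unit Conversion", "Numerical Comparison"]
print(execute_plan(plan, state)["verdict"])  # True: 1.2 GPa = 1200 MPa >= 1000 MPa
```

Because every step is a named operation over explicit state, the executed plan doubles as a human-readable reasoning trace.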
3. Data Sources, Training Protocols, and Supervision
AtomicTableLLM is trained and evaluated on SciAtomicBench, a multi-domain, fine-grained benchmark annotated for claim verification with fully detailed reasoning chains (Zhang et al., 8 Jun 2025). SciAtomicBench includes:
- Machine Learning: 1,376 tables
- Materials Science: 37 expert tables (multiple claims per table)
- Medical Science: 1,468 tables
- Finance: 343 tables
Each table includes 1–3 positive and negative claims, with ground-truth SUPPORT/REFUTE labels and stepwise annotation of interpretation, sub-plan decomposition, cell grounding, and invoked skills.
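A record with the annotation layers just described (label, interpretation, sub-plan decomposition, cell grounding, invoked skills) might look like the following. All field names and values here are hypothetical illustrations, not SciAtomicBench's published schema.

```python
# Hypothetical SciAtomicBench-style record: one claim over one table,
# annotated with the full reasoning chain described above.
example = {
    "table_id": "ml-0001",
    "caption": "Test accuracy of ablation variants.",
    "claim": "Removing the planner head lowers accuracy by more than 5 points.",
    "label": "SUPPORT",
    "interpretation": "Compare full-model accuracy against the no-planner ablation.",
    "sub_plans": [
        {"goal": "extract full-model accuracy",
         "grounded_cells": [{"row": 1, "col": 2, "value": "84.1"}],
         "skills": ["Concept Matching", "Value Extraction"]},
        {"goal": "extract no-planner accuracy",
         "grounded_cells": [{"row": 2, "col": 2, "value": "77.8"}],
         "skills": ["Concept Matching", "Value Extraction"]},
        {"goal": "check the gap exceeds 5 points",
         "skills": ["Numerical Calculation", "Numerical Comparison"]},
    ],
}

# The gold label is checkable from the grounded cells alone.
gap = 84.1 - 77.8
print(round(gap, 1), gap > 5)  # 6.3 True
```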
Fine-tuning employs a training split of 350 machine-learning examples, validation on 50 examples, and a token-level cross-entropy loss

$$\mathcal{L}(\theta) = -\sum_{t} \log p_\theta\!\left(y_t \mid y_{<t},\, T,\, c\right),$$

with explicit supervision at each reasoning stage. Hyperparameters include a batch size of 16, 3 epochs, learning rate, a maximum input length of 4096 tokens, and a generation temperature of 0.8.
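The cross-entropy objective over target tokens can be computed as in the generic sketch below; this is standard sequence-model training math, not the authors' training code.

```python
import numpy as np

def token_cross_entropy(logits, targets):
    """Mean negative log-likelihood of gold tokens under softmax(logits).

    logits:  (T, V) array of unnormalized scores, one row per target position.
    targets: (T,) array of gold token ids.
    """
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

# Two positions, vocabulary of size 3; gold tokens already get the top score.
logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 3.0, 0.0]])
targets = np.array([0, 1])
loss = token_cross_entropy(logits, targets)
print(loss)
```

The loss shrinks toward zero as the gold tokens' logits dominate, which is the signal used to supervise each head's output sequence.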
In the context of materials modeling (OpenLAM), AtomicTableLLM refers to a Deep Potential attention-based message-passing neural network pretrained on 19.8 million crystal structures spanning all stable elements (Peng et al., 20 Jan 2025).
4. Input Representations and Periodic Table Coverage
AtomicTableLLM in the OpenLAM context operates on graph-based representations of crystal structures:
- Atomic Encodings: Each chemical element receives a learnable embedding vector (128–256 dimensions), indexed by atomic number.
- Local Environment Features: Each node (atom) includes features such as coordination number and local radial distribution.
- Graph Edges: Represent interatomic distances (cutoff approximately 6 Å), and attention heads may access angular or triplet features.
- Periodicity: Primitive cells and neighbor lists enforce periodic boundary conditions.
- Elemental Scope: Dataset covers all stable elements (H to Og). Only unphysical entries (high hull energy) are filtered.
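Graph construction under periodic boundary conditions can be sketched as a brute-force neighbor search over the 27 adjacent periodic images. Production large-atom-model pipelines use optimized cell lists instead; the ~6 Å cutoff follows the description above, and everything else here is a simplified assumption.

```python
import numpy as np

def periodic_edges(frac_coords, lattice, cutoff=6.0):
    """Brute-force periodic neighbor list.

    frac_coords: (N, 3) fractional positions in the unit cell.
    lattice:     (3, 3) row-vector cell matrix in Angstroms.
    Returns (i, j, r) edges for all periodic-image pairs within the cutoff.
    """
    shifts = np.array([[i, j, k] for i in (-1, 0, 1)
                                 for j in (-1, 0, 1)
                                 for k in (-1, 0, 1)])
    edges = []
    n = len(frac_coords)
    for a in range(n):
        for b in range(n):
            for s in shifts:
                d = (frac_coords[b] + s - frac_coords[a]) @ lattice
                r = np.linalg.norm(d)
                if 1e-8 < r <= cutoff:   # exclude an atom's own image at r = 0
                    edges.append((a, b, r))
    return edges

# Simple cubic cell, 4 Angstrom edge, one atom: within a 4.5 A cutoff the atom
# sees exactly its 6 face-adjacent periodic images at r = 4.0 A.
lattice = 4.0 * np.eye(3)
edges = periodic_edges(np.array([[0.0, 0.0, 0.0]]), lattice, cutoff=4.5)
print(len(edges))  # 6
```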
Structurally, the model applies stacked message-passing layers with attention, aggregating geometric and atomic-type information and outputting structure-level properties via pooling and readout networks.
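The message-passing-with-readout pattern can be illustrated by a single attention-free layer with a smooth distance envelope and mean pooling. This is a schematic stand-in under stated assumptions (random weights, tanh update, cosine cutoff envelope); the actual Deep Potential attention architecture is far more elaborate.

```python
import numpy as np

rng = np.random.default_rng(0)

def mp_layer(h, edges, w_msg, w_upd, cutoff=6.0):
    """One message-passing step: neighbor messages are weighted by a smooth
    envelope that decays to zero at the cutoff, summed, then fed through a
    residual nonlinear update."""
    msgs = np.zeros_like(h)
    for a, b, r in edges:
        envelope = 0.5 * (np.cos(np.pi * r / cutoff) + 1.0)
        msgs[a] += envelope * (h[b] @ w_msg)
    return np.tanh(h + msgs @ w_upd)

def readout(h, w_out):
    """Mean-pool node states, then map to a scalar structure-level property."""
    return float(h.mean(axis=0) @ w_out)

dim = 8
h = rng.normal(size=(2, dim))          # stand-in for learnable element embeddings
edges = [(0, 1, 2.5), (1, 0, 2.5)]     # one bidirectional edge at r = 2.5 A
w_msg, w_upd = rng.normal(size=(dim, dim)), rng.normal(size=(dim, dim))
w_out = rng.normal(size=dim)
energy = readout(mp_layer(h, edges, w_msg, w_upd), w_out)
print(type(energy).__name__)  # float
```

Stacking such layers propagates geometric and atomic-type information across the graph before the pooled readout produces the structure-level prediction.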
5. Evaluation Metrics and Comparative Performance
AtomicTableLLM's efficacy is measured along two principal axes:
- Claim Verification (Tables in Literature) (Zhang et al., 8 Jun 2025):
- Label accuracy: Model’s SUPPORT/REFUTE output against annotated gold labels.
- Chain Quality: Assessed via granularity, redundancy, alignment, interpretability, and stepwise correctness.
- On SciAtomicBench, AtomicTableLLM (Deepseek-Qwen-7B backbone with atomic fine-tuning) outperforms specialized table LLMs such as TableLLaMa across all major domains and matches closed-source GPT-4o with chain-of-thought (CoT) prompting on Finance, with the following representative accuracies:
| Domain | GPT-4o (CoT) | TableLLaMa | Deepseek-Qwen-7B (atomic) |
|-------------------|-------------:|-----------:|--------------------------:|
| Machine Learning | 0.9025 | 0.6263 | 0.8063 |
| Materials Science | 0.8580 | 0.3337 | 0.7593 |
| Medical Science | 0.8152 | 0.5710 | 0.7331 |
| Finance | 0.8570 | 0.5328 | 0.8570 |
- Statistical significance is confirmed over 1,000 randomization trials against GPT-4o CoT.
- Crystal Structure Modeling (OpenLAM) (Peng et al., 20 Jan 2025):
- Energy-above-hull mean absolute error (MAE): 0.0096 eV/atom (OpenLAM vs. Materials Project ground truth); state-of-the-art machine-learning interatomic potentials (MLIPs) typically yield an MAE on the order of 0.02 eV/atom.
- Generative Modeling: Fraction of valid, low-energy crystal candidates and diversity across space groups/chemistries.
- Methods such as ConCDVAE and InvDesFlow, built upon OpenLAM, achieve high rates (>30%) of valid hull-stable submissions in competitive benchmarks.
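The randomization trials behind the claim-verification significance result can be sketched as a paired permutation test over per-example correctness. This is a generic procedure, not the authors' exact protocol, and the toy accuracy vectors below are invented for illustration.

```python
import random

def randomization_test(correct_a, correct_b, trials=1000, seed=0):
    """Paired randomization test: under the null hypothesis the two models'
    correctness outcomes on each example are exchangeable, so randomly
    swapping them should reproduce the observed accuracy gap by chance."""
    rng = random.Random(seed)
    observed = sum(correct_a) - sum(correct_b)
    hits = 0
    for _ in range(trials):
        diff = 0
        for a, b in zip(correct_a, correct_b):
            if rng.random() < 0.5:
                a, b = b, a          # swap this example's outcomes
            diff += a - b
        if abs(diff) >= abs(observed):
            hits += 1
    return (hits + 1) / (trials + 1)  # smoothed two-sided p-value

# Toy data: model A correct on 90/100 examples, model B on 60/100.
a = [1] * 90 + [0] * 10
b = [1] * 60 + [0] * 40
p = randomization_test(a, b)
print(p < 0.05)  # True: a 30-point gap is very unlikely under the null
```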
6. Interpretability, Generalization, and Efficiency
AtomicTableLLM's explicit skill-chain design enables:
- Higher interpretability: Stepwise subplans and invoked atomic skills yield inherently granular and human-interpretable logical traces.
- Reduced error propagation: Snowball error rates decrease from 50% (CoT variants) to 20%.
- Information redundancy and granularity: Human and model-based evaluations indicate lower redundancy and finer granularity compared to monolithic CoT outputs.
- Cognitive load reduction: Isolating subtasks reduces working-memory demands, aligning with Cognitive Load Theory.
- Cross-domain transfer: With minimal retraining, the same set of atomic skills generalizes from machine learning tables to materials science, medicine, and finance, illustrating the modularity and domain-agnosticity of skill chaining.
- Efficiency: Despite longer token chains, inference time is 20–35% faster than CoT baselines due to sparser dependency graphs; each step conditions only on its immediate predecessor.
7. Practical Applications and Availability
AtomicTableLLM serves diverse functions depending on context:
- Scientific Table Claim Verification: Automated fact-checking of claims in scientific publications, particularly for high-information-density tabular data (Zhang et al., 8 Jun 2025).
- Materials Discovery: Large atom modeling, stability and property prediction, and inverse design of novel crystals across all elements, via the OpenLAM foundation model (Peng et al., 20 Jan 2025).
- Generative Modeling: Aiding in the discovery and optimization of high-entropy alloys, ionic crystals, and materials with tailored electronic or mechanical properties.
- API and Data Access: OpenLAM is available via the Crystal Craft App and the OpenLAM repository, along with community benchmarks such as the Crystal Philately competition set.
The explicit modularity, atomic skill inventory, and cross-domain transferability collectively position AtomicTableLLM as a pivotal framework for high-precision table understanding and generative materials modeling in both scientific research and industrial applications.