Mole-Syn: Neuromorphic and Molecular Design
- Mole-Syn is a multidisciplinary framework that unifies neuromorphic MoS₂ floating-gate transistor technology with advanced, synthesis-aware molecular generation models.
- It employs precise control of Fowler–Nordheim tunneling and SE(3)-equivariant flow matching to achieve energy-efficient synaptic emulation and 3D molecular design.
- The approach demonstrates high validity, synthetic feasibility, and robust analog generation, advancing drug discovery and materials innovation.
Mole-Syn refers to a diverse set of concepts and systems unified by the principle of synthesizability, fundamentally anchored in molecular systems. The term encompasses (1) neural-inspired hardware constructed from 2D materials for emulating biological synapses; (2) state-of-the-art generative and planning models for synthesizable small molecule design; and (3) large-scale language modeling frameworks that output executable synthetic routes for drug-like compounds. Across these domains, Mole-Syn advances molecular design and information processing by integrating synthesis constraints and structural realism.
1. MoS₂ Floating-Gate Synaptic Transistor (Neuromorphic Mole-Syn)
Mole-Syn designates a synaptic device based on multilayer two-dimensional molybdenum disulfide (MoS₂) that utilizes a floating-gate transistor architecture to realize solid-state synaptic functionality (Paul et al., 2019). The architecture comprises the following layers:
- p⁺⁺–Si back-gate
- SiO₂ dielectric (285 nm)
- Graphene with Au pad (monolayer + 50 nm Au)
- hBN tunnel barrier (5.8 nm)
- MoS₂ transport channel (2 nm)
- Cr/Au source/drain (5/50 nm)
Electrical operation exploits controlled Fowler–Nordheim tunneling through the hBN barrier for bidirectional charge modulation (hole injection for potentiation, electron injection for depression). Gate coupling is dominated by a femtofarad-scale capacitance, and the band alignment sets the electron and hole barrier heights.
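The field dependence of Fowler–Nordheim tunneling can be illustrated with the textbook triangular-barrier expression. The following is a minimal sketch, not the device model of Paul et al.: the barrier height and effective-mass ratio are placeholder assumptions, and the function name `fn_current_density` is hypothetical.

```python
import math

Q = 1.602176634e-19    # elementary charge (C)
H = 6.62607015e-34     # Planck constant (J*s)
M0 = 9.1093837015e-31  # electron rest mass (kg)


def fn_current_density(field_v_per_m, barrier_ev, m_eff_ratio=0.5):
    """Textbook Fowler-Nordheim current density (A/m^2).

    J = A * E^2 * exp(-B / E) for a triangular barrier of height
    `barrier_ev` (eV). `m_eff_ratio` is the carrier effective mass in the
    barrier divided by the free-electron mass; 0.5 is an assumed
    placeholder, not a value from the source.
    """
    if field_v_per_m <= 0:
        return 0.0
    phi = barrier_ev * Q                 # barrier height in joules
    m_eff = m_eff_ratio * M0
    a = Q**3 / (8 * math.pi * H * phi) * (M0 / m_eff)
    b = 8 * math.pi * math.sqrt(2 * m_eff) * phi**1.5 / (3 * Q * H)
    return a * field_v_per_m**2 * math.exp(-b / field_v_per_m)


# Assumed example: 5 V dropped across a 5.8 nm barrier with a 3 eV barrier height.
example_field = 5.0 / 5.8e-9
example_j = fn_current_density(example_field, barrier_ev=3.0)
```

The exponential term makes the injected current extremely sensitive to the field across the barrier, which is why pulse amplitude gives fine control over how much charge reaches the floating gate.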
The device achieves a near-ideal subthreshold swing over four decades of drain current, together with a high on/off ratio and a tunable threshold voltage. Channel conductance can be modulated by up to 80% per potentiation pulse, with energy expenditure per pulse in the femtojoule range for 100 μs operation, approaching the energy efficiency regime of biological synapses.
Programmed via millisecond-scale gate voltage pulses for potentiation and depression, Mole-Syn emulates biological learning rules, including spike-timing-dependent plasticity (STDP) with two characteristic time constants, and retains nonvolatile multilevel states.
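A common way to describe STDP is an exponential timing window with one time constant for each branch. The sketch below assumes that functional form; the amplitudes and time constants are placeholders rather than the measured device values, and `stdp_weight_change` is a hypothetical name.

```python
import math


def stdp_weight_change(delta_t, a_plus=1.0, a_minus=1.0,
                       tau_plus=1.0, tau_minus=1.0):
    """Exponential STDP window (assumed form, placeholder parameters).

    delta_t = t_post - t_pre in seconds. A presynaptic spike that precedes
    the postsynaptic spike (delta_t > 0) potentiates; the reverse order
    depresses. The magnitude decays exponentially with |delta_t|.
    """
    if delta_t > 0:
        return a_plus * math.exp(-delta_t / tau_plus)
    if delta_t < 0:
        return -a_minus * math.exp(delta_t / tau_minus)
    return 0.0


# Sample the window at a few spike-timing differences (seconds).
window = [(dt, stdp_weight_change(dt)) for dt in (-2.0, -0.5, 0.5, 2.0)]
```

In a device setting, the returned value would be mapped onto a pulse amplitude or pulse count applied to the gate; that mapping is not specified here.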
These properties support application in neuromorphic circuits, enabling three-terminal synaptic arrays compatible with crossbar topologies and hybrid CMOS integration. The large on/off modulation, robust nonvolatility, and low update energy make Mole-Syn a candidate for scalable, energy-efficient neuromorphic hardware (Paul et al., 2019).
2. Synthesizable 3D Molecular Generative Models (SynCoGen-based Mole-Syn)
Contemporary Mole-Syn methodology also covers end-to-end architectures for generating small molecules with guaranteed synthetic tractability and 3D structural fidelity (Rekesh et al., 16 Jul 2025). A paradigmatic example is SynCoGen, which unifies masked graph diffusion (discrete graph denoising) and flow matching (continuous coordinate denoising) in a single generative framework. This enables sampling from the joint distribution over building block identities, reaction annotations, and atomic coordinates.
Molecules are encoded as a triple comprising a node (block) one-hot indicator, an edge (reaction) one-hot indicator, and the atomic coordinates. Discrete noise is injected via an absorbing diffusion kernel, while coordinates are perturbed along a path between a centered Gaussian prior and the ground-truth geometry and trained with a conditional flow matching loss.
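The coordinate branch can be illustrated with a generic conditional flow matching objective. This is a minimal sketch under stated assumptions, not SynCoGen's training code: it assumes a straight-line path between the prior sample and the data, a velocity-regression target, and a hypothetical `predict_velocity` callable standing in for the network. The actual parameterization may differ.

```python
import random


def center(coords):
    """Subtract the centroid so the point cloud has zero mean."""
    n = len(coords)
    mean = [sum(p[k] for p in coords) / n for k in range(3)]
    return [[p[k] - mean[k] for k in range(3)] for p in coords]


def sample_centered_gaussian(n_atoms, rng):
    """Draw a standard Gaussian point cloud and remove its centroid."""
    return center([[rng.gauss(0.0, 1.0) for _ in range(3)]
                   for _ in range(n_atoms)])


def interpolate(x0, x1, t):
    """Assumed straight-line path: x_t = (1 - t) * x0 + t * x1."""
    return [[(1 - t) * a + t * b for a, b in zip(p0, p1)]
            for p0, p1 in zip(x0, x1)]


def cfm_loss(predict_velocity, x1, rng):
    """One-sample conditional flow matching loss for a single molecule.

    The regression target is the constant velocity x1 - x0 of the
    straight-line path; the loss is the squared error averaged over atoms.
    """
    x1 = center(x1)
    x0 = sample_centered_gaussian(len(x1), rng)
    t = rng.random()                      # t in [0, 1)
    xt = interpolate(x0, x1, t)
    target = [[b - a for a, b in zip(p0, p1)] for p0, p1 in zip(x0, x1)]
    pred = predict_velocity(xt, t)
    sq_err = sum((u - v) ** 2
                 for pp, tp in zip(pred, target)
                 for u, v in zip(pp, tp))
    return sq_err / len(x1)


# Usage with a trivial stand-in model that always predicts zero velocity.
toy_geometry = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
zero_model = lambda xt, t: [[0.0, 0.0, 0.0] for _ in xt]
toy_loss = cfm_loss(zero_model, toy_geometry, random.Random(0))
```

Centering both endpoints keeps the path translation-free, which matches the centered prior described above.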
SynCoGen's neural backbone uses a modified SemlaFlow, an SE(3)-equivariant flow-matching network over atom-atom features, with pooling to aggregate atom-level information into block-level predictions. Validity constraints (no self-loops, n−1 edges for n blocks, parent-child block assignments) and compatibility-masked sampling guarantee chemically plausible outputs.
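The structural constraints listed above amount to requiring that the block graph is a rooted tree. A minimal sketch of such a check follows. The edge representation as `(parent, child)` index pairs and the reading of "parent-child block assignments" as "each block has at most one parent" are assumptions made for illustration.

```python
def is_valid_block_graph(n_blocks, edges):
    """Check tree-style validity of a building-block graph.

    `edges` is a list of (parent, child) block-index pairs (assumed
    representation). The graph is accepted when it has exactly
    n_blocks - 1 edges, no self-loops, at most one parent per block,
    and no cycles, which together imply a connected rooted tree.
    """
    if n_blocks < 1 or len(edges) != n_blocks - 1:
        return False

    root_of = list(range(n_blocks))  # union-find forest for cycle detection

    def find(i):
        while root_of[i] != i:
            root_of[i] = root_of[root_of[i]]  # path halving
            i = root_of[i]
        return i

    seen_children = set()
    for parent, child in edges:
        if not (0 <= parent < n_blocks and 0 <= child < n_blocks):
            return False
        if parent == child:            # self-loop
            return False
        if child in seen_children:     # a block with two parents
            return False
        seen_children.add(child)
        ra, rb = find(parent), find(child)
        if ra == rb:                   # edge would close a cycle
            return False
        root_of[ra] = rb
    return True
```

For example, `is_valid_block_graph(3, [(0, 1), (1, 2)])` accepts a linear three-block assembly, while a graph with a repeated or self-referencing edge is rejected. Compatibility masking (which reactions are allowed between which blocks) is a separate check not covered here.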
The SynSpace dataset used for training comprises 93 commercial building blocks, 19 high-yield reaction templates, thousands of reaction graphs (each with 2–4 couplings), and over 3.3 M optimized conformers, curated via algorithmic enumeration and multi-step geometry refinement.
3. LLM Frameworks for Synthesizable Pathway Generation
Mole-Syn also denotes systems that leverage large language models (LLMs) to plan and generate synthesizable molecules and analogs, exemplified by SynLlama (Sun et al., 16 Mar 2025). SynLlama repurposes Meta's Llama-3 (1B/8B parameters) via supervised fine-tuning on a million-scale dataset of bottom-up retrosynthetic pathways (up to 5 steps), sampled from 229,579 Enamine building blocks and 91 validated reaction templates. Each pathway is encoded as a sequence of tuples, with templates in SMARTS notation and building blocks as canonical SMILES.
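To make the pathway encoding concrete, the sketch below serializes a pathway as JSON and validates a parsed response. The field names (`reactions`, `template`, `building_blocks`), the helper names, and the example SMARTS and SMILES strings are all hypothetical illustrations; SynLlama's actual schema is not reproduced here.

```python
import json


def encode_pathway(steps):
    """Serialize (template_smarts, building_block_smiles) steps as JSON."""
    payload = {"reactions": [{"template": template,
                              "building_blocks": list(blocks)}
                             for template, blocks in steps]}
    return json.dumps(payload)


def parse_pathway(text, max_steps=5):
    """Parse model output; return a list of steps, or None if malformed."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return None
    reactions = data.get("reactions") if isinstance(data, dict) else None
    if not isinstance(reactions, list) or not 1 <= len(reactions) <= max_steps:
        return None
    steps = []
    for reaction in reactions:
        if not isinstance(reaction, dict):
            return None
        template = reaction.get("template")
        blocks = reaction.get("building_blocks")
        # A reaction SMARTS separates reactants and products with ">>".
        if not isinstance(template, str) or ">>" not in template:
            return None
        if (not isinstance(blocks, list) or not blocks
                or not all(isinstance(b, str) for b in blocks)):
            return None
        steps.append((template, tuple(blocks)))
    return steps


# Illustrative single-step pathway (amide coupling; strings are examples only).
amide = "[C:1](=[O:2])[OH].[N;H2:3]>>[C:1](=[O:2])[N:3]"
example = [(amide, ("CC(=O)O", "NCc1ccccc1"))]
round_trip = parse_pathway(encode_pathway(example))
```

A structural check like this only confirms that the output is well formed. Confirming that the templates and building blocks are chemically valid would require a cheminformatics toolkit.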
The generation objective maximizes the sequence log-likelihood, factorized into sequential conditional token probabilities.
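In standard autoregressive notation this objective can be written as follows. The symbols are assumptions introduced for illustration: $x$ is the prompt describing the target molecule, $y = (y_1, \dots, y_T)$ is the tokenized pathway, $\mathcal{D}$ is the fine-tuning set, and $\theta$ denotes the model parameters.

```latex
\max_{\theta} \; \sum_{(x,\, y) \in \mathcal{D}} \log p_{\theta}(y \mid x)
  \;=\; \max_{\theta} \; \sum_{(x,\, y) \in \mathcal{D}} \sum_{t=1}^{T}
  \log p_{\theta}\!\left(y_t \mid y_{<t},\, x\right)
```

The equality is the chain rule of probability applied to the token sequence, so maximizing the left-hand side is equivalent to minimizing the usual next-token cross-entropy.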
SynLlama demonstrates the ability to generalize to building blocks withheld from training, producing both exact and close-analog synthetic routes for novel compounds. Analysis across 1,000 test molecules indicated a total reconstruction (Enamine + New BBs) of 642/1,000 (64.2%), with analog similarity assessed by Morgan, scaffold, and pharmacophore fingerprints. In head-to-head comparisons, SynLlama yielded superior reconstruction and analog generation performance compared to SynNet and ChemProjector, despite using a dataset 40–60× smaller (Sun et al., 16 Mar 2025).
4. Comparative Metrics and Benchmarking
Key experimental metrics for Mole-Syn-inspired 3D generative frameworks include validity, synthesizability (AiZyn and Syntheseus solve rates), energetic plausibility (GFN-FF, xTB, PoseBusters), and molecular diversity/novelty (Rekesh et al., 16 Jul 2025). In SynCoGen, unconditional 3D molecule generation achieves 96.7% validity, 50% retrosynthetically solvable by AiZyn, 72% by Syntheseus, GFN-FF energy 3.01 kcal/mol/atom, and 93.9% novelty. In contrast, prior approaches often exhibit <50% synthetic feasibility or lack explicit 3D atomistic coordinates.
For LLM-based frameworks, evaluation encompasses template recall, building block selection, valid SMILES output, and product reconstruction. SynLlama attains 97.3% valid JSON output, 97.9% product reconstruction rate, and high instruction-following fidelity. Similarity analysis further validates analog generation with Tanimoto and Murcko-scaffold metrics exceeding 0.94 (Sun et al., 16 Mar 2025).
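The Tanimoto coefficient used in such similarity analyses is the ratio of shared to total on-bits between two fingerprints. A minimal sketch follows, assuming fingerprints are given as sets of on-bit indices. In practice these would come from a cheminformatics toolkit, and the empty-fingerprint convention below is a choice made here, not a standard.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprints.

    Each fingerprint is an iterable of on-bit indices. Returns
    |A & B| / |A | B|. Two empty fingerprints return 0.0 by the
    convention chosen in this sketch.
    """
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0
    return len(a & b) / union


# Toy fingerprints sharing two of four distinct bits.
similarity = tanimoto({1, 2, 3}, {2, 3, 4})
```

Applying the same function to Murcko-scaffold fingerprints gives the scaffold-level similarity; only the input fingerprints change.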
Specialized tasks, such as zero-shot linker design for protein–ligand complexes, reveal a distinct performance edge: SynCoGen yields retrosynthetically tractable, fully-connected linkers with valid routes in >50% of challenging cases, where DiffLinker and related baselines solve none (Rekesh et al., 16 Jul 2025).
5. Applications and Synthesis-Aware Design Strategies
Mole-Syn platforms underpin multiple applications in synthetic chemistry, drug discovery, and molecular informatics:
- Analog expansion and lead optimization: Direct sampling of synthesizable analogs conditioned on a given scaffold or substructure (Rekesh et al., 16 Jul 2025, Sun et al., 16 Mar 2025).
- Structure-based generation: Conditioning generative models on protein pocket geometry or chemical descriptors to yield binders compatible with specific binding sites (Rekesh et al., 16 Jul 2025).
- Complex molecular assembly: Design of PROTACs, bivalent molecules, or linker constructs under explicit synthetic constraints.
- Materials and framework discovery: Expansion to inorganic or supramolecular systems by block+coordinate co-generation, facilitating processability and structural control.
- High-throughput retrosynthesis planning: Automated enumeration and scoring of reaction routes for virtual libraries or de novo designed molecules (Sun et al., 16 Mar 2025).
6. Prospects and Future Directions
Anticipated developments for Mole-Syn systems include:
- Enabling direct property conditioning: Integration of predictive or external property models (e.g., binding affinity, ADMET) for property-constrained molecular generation (Rekesh et al., 16 Jul 2025).
- Expanding reactive space: Incorporation of broader reaction class vocabularies, advanced protecting-group strategies, and macrocyclization protocols.
- Integration of multimodal descriptors: Combining chemical, structural, and environmental data streams to inform synthesis-aware 3D sampling.
- Bridging with high-fidelity quantum/physical simulation: Further coupling geometry-aware generative frameworks with rigorous energetic and processability evaluation.
These directions highlight the ongoing trajectory toward robust, synthesis-ready, property-driven molecular generation and neuromorphic circuit design.
7. Summary
Mole-Syn, in contemporary research, encompasses both neuromorphic MoS₂ floating-gate hardware for synaptic emulation (Paul et al., 2019) and algorithmic frameworks for synthesizable molecule generation with executable, annotated synthetic routes and spatial structures (Rekesh et al., 16 Jul 2025, Sun et al., 16 Mar 2025). These approaches combine material innovation, machine learning, and cheminformatics, establishing foundational components for scalable neuromorphic systems and synthesis-constrained chemical design.