RuCo-C: Catalysis, Spintronics, and Reinforcement Learning

Updated 22 May 2026
  • RuCo-C involves Ruthenium, Cobalt, and Carbon, pivotal in catalysis, spintronics, and reinforcement learning.
  • Bimetallic Ru–Co nanoparticles in RuCo-C function as tunable catalysts, enhancing carbon nanotube nucleation.
  • RuCo-C's reinforcement learning framework evaluates text-to-SQL conversion with detailed rubric-based critiques.

RuCo-C refers to materials or frameworks where ruthenium (Ru), cobalt (Co), and carbon (C) play a central role, spanning catalysis, spintronic coupling, and reinforcement learning for text-to-SQL. The term encompasses: (1) bimetallic Ru–Co–C systems as heterogeneous catalysts for carbon nanotube nucleation, (2) magnetic alloys relevant for non-collinear spin coupling, and (3) RuCo-C as an acronym for a benchmarked fine-grained reinforcement learning judge for text-to-SQL with rubric-based interpretable critiques. Each context involves Ru, Co, and C at the intersection of structure, function, and algorithmic design.

1. Ru–Co–C Nanoparticles in Catalysis and SWCNT Nucleation

Bimetallic RuCo nanoparticles serve as tunable catalysts for single-walled carbon nanotube (SWCNT) nucleation during chemical vapor deposition (CVD) of methane. Under CVD conditions (1000 K), Co₅₅₋ₓRuₓ particles (x = 0–17 atoms, 0–30 at % Ru) adopt a core–shell or segregated morphology. Radial distribution functions indicate that, even at 30 at % Ru loading, the surface shell remains >95 % Co; Ru is confined to the particle interior. Surface Co sites exhibit a first-shell coordination number (CN) ≈8.5 ± 0.5, while core Ru sites approach CN ≈11–12. Lindemann indices at 1000 K are η_Co ≈ 0.12 (molten-like shell) and η_Ru ≈ 0.08 (quasi-solid core), supporting significant phase segregation (Page et al., 30 Jul 2025).
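The Lindemann indices quoted above measure how strongly interatomic distances fluctuate over a trajectory. The sketch below uses the standard per-atom definition, the root-mean-square fluctuation of each pair distance divided by its mean and averaged over all partners. The function names, the toy trajectory, and the species labels are illustrative assumptions, not data from the cited study.

```python
import math

def per_atom_lindemann(traj):
    """Per-atom Lindemann index.

    traj: list of frames; each frame is a list of (x, y, z) tuples,
    with the same atom ordering in every frame.
    """
    n_frames = len(traj)
    n_atoms = len(traj[0])
    indices = []
    for i in range(n_atoms):
        total = 0.0
        for j in range(n_atoms):
            if i == j:
                continue
            dists = [math.dist(frame[i], frame[j]) for frame in traj]
            mean = sum(dists) / n_frames
            mean_sq = sum(d * d for d in dists) / n_frames
            # Guard against tiny negative values from rounding.
            var = max(mean_sq - mean * mean, 0.0)
            total += math.sqrt(var) / mean
        indices.append(total / (n_atoms - 1))
    return indices

def species_average(indices, labels, species):
    """Average the per-atom index over atoms carrying a given label."""
    picked = [v for v, lab in zip(indices, labels) if lab == species]
    return sum(picked) / len(picked)

# Toy trajectory (hypothetical): two fixed atoms and one that oscillates in y.
TOY_TRAJ = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, y, 0.0)]
    for y in (0.8, 1.0, 1.2, 1.0)
]
TOY_LABELS = ["Ru", "Ru", "Co"]  # illustrative labels only

if __name__ == "__main__":
    idx = per_atom_lindemann(TOY_TRAJ)
    print("per-atom:", idx)
    print("Co average:", species_average(idx, TOY_LABELS, "Co"))
    print("Ru average:", species_average(idx, TOY_LABELS, "Ru"))
```

In this toy case the mobile atom has the larger index, mirroring the qualitative contrast between a molten-like shell and a quasi-solid core.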

Although Ru–C and Ru–H bonds at the surface are negligible, Ru indirectly modulates surface chemistry. The activation barrier for methane dehydrogenation (CH₄ → CH₃ + H) is parameterized as

$$\Delta E_a(x) = 0.46\,\mathrm{eV} + 0.001\,\mathrm{eV}\times x$$

yielding ΔEₐ = 0.46 eV at 0 % Ru and 0.49 eV at 30 % Ru, with x read as the Ru content in at %. This minor but systematic increase impedes C–H activation, decreasing Co–H bond populations and extending the lifetimes of CHₓ intermediates by 30–50 ps at 30 % Ru. The C₂H radical residence time roughly doubles (≈300 ps → >500 ps), shifting carbon-chain chemistry toward longer chains (C₇–C₁₁), which are ≈2× more abundant at late nucleation times (Page et al., 30 Jul 2025).
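To see why a 0.03 eV barrier increase matters at 1000 K, the linear barrier can be combined with a simple Arrhenius factor. This is a rough sketch rather than the cited study's procedure: it assumes x is the Ru content in at %, an unchanged pre-exponential factor, and function names chosen here for illustration.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def activation_barrier_ev(ru_at_percent):
    """Linear barrier from the text: 0.46 eV + 0.001 eV per at % Ru."""
    return 0.46 + 0.001 * ru_at_percent

def relative_rate(ru_at_percent, temperature_k=1000.0):
    """Arrhenius rate relative to pure Co, assuming an unchanged prefactor."""
    delta = activation_barrier_ev(ru_at_percent) - activation_barrier_ev(0.0)
    return math.exp(-delta / (K_B_EV_PER_K * temperature_k))

if __name__ == "__main__":
    for x in (0, 10, 20, 30):
        print(f"{x:2d} at % Ru: Ea = {activation_barrier_ev(x):.2f} eV, "
              f"k/k0 = {relative_rate(x):.3f}")
```

Under these assumptions the rate at 30 at % Ru comes out near 0.7 of the pure-Co value, which is in line with the ≈30 % suppression of CH₄ decomposition reported in Section 5.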

2. Electronic Structure and d-Band Effects in RuCo-C Catalysts

RuCo-C catalytic behavior is governed by electronic structure modifications. Incorporation of Ru shifts the Co 3d band center (ε_d) further below the Fermi level (E_F), and E_F itself decreases by ≈0.25 eV across 0–30 % Ru. Specifically,

$$\begin{gathered}
\varepsilon_d(\mathrm{Co}_{55}) = -1.65\,\mathrm{eV} \\
\varepsilon_d(\mathrm{Co}_{38.5}\mathrm{Ru}_{16.5}) = -1.85\,\mathrm{eV} \\
E_F(\mathrm{Co}_{55}) = -11.00\,\mathrm{eV} \rightarrow E_F(\mathrm{Co}_{38.5}\mathrm{Ru}_{16.5}) = -11.25\,\mathrm{eV}
\end{gathered}$$

Following d-band theory, the lower ε_d and E_F weaken Co–C, Co–H, and Co–CH₃ adsorption (adsorption energies become less negative) by reducing back-donation to adsorbate σ* orbitals:

  • E_ads(C): –1.76 eV (0 % Ru) → –1.54 eV (30 % Ru)
  • E_ads(H): –0.82 eV → –0.66 eV
  • E_ads(CH₃): –1.22 eV → –1.04 eV

This energetics profile leads to attenuated surface reactivity, extended lifetime of key intermediates, and selective promotion of sp² condensation (higher hexagon:pentagon ratio) at the nucleation front, directing SWCNT cap formation (Page et al., 30 Jul 2025).
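The size of the weakening can be read directly from the adsorption energies listed above. The short sketch below only does that arithmetic; the dictionary layout and function name are illustrative choices.

```python
# Adsorption energies in eV from the list above: (0 % Ru, 30 % Ru).
ADSORPTION_EV = {
    "C": (-1.76, -1.54),
    "H": (-0.82, -0.66),
    "CH3": (-1.22, -1.04),
}

def weakening(e_pure_co, e_30_ru):
    """Return (absolute shift in eV, relative weakening in percent)."""
    shift = e_30_ru - e_pure_co            # positive means weaker binding
    return shift, 100.0 * shift / abs(e_pure_co)

if __name__ == "__main__":
    for species, (e0, e30) in ADSORPTION_EV.items():
        shift, percent = weakening(e0, e30)
        print(f"{species:>3}: +{shift:.2f} eV ({percent:.1f} % weaker)")
```

All three adsorbates bind 0.16 to 0.22 eV more weakly at 30 % Ru, with hydrogen showing the largest relative change.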

3. Magnetic Non-Collinearity in RuCo–C Alloy Thin Films

RuCo alloy spacers in Co|Ru₁₀₀₋ₓCoₓ|Co trilayers enable tailored non-collinear alignment of the ferromagnetic layers. The relative magnetization angle θ is controlled by the Co fraction x (at.%) and the spacer thickness d (nm). For d = 0.7 nm, as x increases from Ru-rich values, non-collinearity appears at x_min ≈ 44 %, reaches θ ≈ 120° at x ≈ 50 %, passes through θ = 90° at x ≈ 55 %, and falls to θ ≈ 60° at x ≈ 61 %. Thickness variations shift x_min: at d = 1.4 nm, x_min ≈ 60 % (Nunn et al., 2019).

The coupling energy per area is

$$E_\text{coupling}(\theta) = J_1(x,d) \cos{\theta} + J_2(x,d) \cos^2{\theta}$$

where J₁ and J₂ are bilinear and biquadratic exchange constants. Non-collinearity requires J₂ > |J₁|/2. In the non-collinear window (44 ≲ x ≲ 61 %, 0.4 ≲ d ≲ 1.0 nm), J₂ is maximized when the RuCo spacer acquires ferromagnetic order (M_s jump) and attains values up to 2 mJ/m² at d ≈ 0.7 nm, x ≈ 50–55 % (Nunn et al., 2019).
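The condition J₂ > |J₁|/2 follows from minimizing the coupling energy: dE/dθ = −sin θ (J₁ + 2J₂ cos θ) = 0 has an interior solution cos θ = −J₁/(2J₂) only when |J₁| < 2J₂. The following sketch evaluates that equilibrium angle; the function names and the sample J values are illustrative, not fitted constants from the cited work.

```python
import math

def coupling_energy(theta_rad, j1, j2):
    """E(theta) = J1 cos(theta) + J2 cos^2(theta), same units as J1 and J2."""
    c = math.cos(theta_rad)
    return j1 * c + j2 * c * c

def equilibrium_angle_deg(j1, j2):
    """Angle (degrees) that minimizes the coupling energy on [0, 180]."""
    if j2 > abs(j1) / 2:
        # Non-collinear minimum from dE/dtheta = 0.
        return math.degrees(math.acos(-j1 / (2.0 * j2)))
    # Otherwise the minimum sits at an endpoint: antiparallel for J1 > 0,
    # parallel for J1 < 0 (J1 = 0 with J2 <= 0 is degenerate; 0 is returned).
    return 180.0 if j1 > 0 else 0.0

if __name__ == "__main__":
    # Illustrative values in mJ/m^2.
    for j1, j2 in [(2.0, 2.0), (0.0, 2.0), (-2.0, 2.0), (1.0, 0.1), (-1.0, 0.1)]:
        print(f"J1 = {j1:+.1f}, J2 = {j2:.1f} -> theta = "
              f"{equilibrium_angle_deg(j1, j2):.1f} deg")
```

With J₂ fixed, sweeping J₁ from +J₂ through 0 to −J₂ moves the angle from 120° through 90° to 60°, the same sequence the text reports as the Co fraction increases.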

Phase boundaries and design guidelines for target coupling angles are summarized as follows:

Region (x, at.% Co)   θ                Coupling regime
x ≲ 44                collinear        Pure Ru-like (oscillatory J₁)
44 ≲ x ≲ 61           0° < θ < 180°    Non-collinear (J₂ > |J₁|/2)
x ≳ 61                collinear        Ferromagnetic (J₁ < 0)

Orthogonal coupling (θ ≈ 90°) is obtained at d ≈ 0.7 nm and x ≈ 55 ± 3 %, giving robust non-collinearity for multilayer spintronic devices (Nunn et al., 2019).

4. RuCo-C: Fine-Grained Reinforcement Learning Framework for Text-to-SQL

RuCo-C also denotes a generative judge model and RL training pipeline for fine-grained evaluation of text-to-SQL systems (Wang et al., 27 Nov 2025). Unlike prior models relying on binary execution rewards and expensive gold SQL, RuCo-C performs human-free query-specific evaluation by generating:

  • Rubrics: step-wise QA items targeting specific SQL aspects (SELECT completeness, JOIN correctness, predicate coverage)
  • Critique responses: binary judgments with supporting evidence, formulated via supervised fine-tuning on synthetic multi-agent data

The output schema for sample i is

$$O^i = (s^i,\, \hat{y}^i,\, \tilde{c}^i)$$

with $s^i = \{s_1,\ldots,s_{N_i}\}$, each $s_k = (b_k, a_k)$ representing a rubric question–answer pair, $\hat{y}^i$ a binary classification, and $\tilde{c}^i$ an optional corrected SQL query.
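As a concrete reading of this schema, the output tuple can be modeled with two small data classes. This is only a sketch: the class and field names are hypothetical, b_k is assumed to be the rubric question and a_k its answer, and the sample content is invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class RubricItem:
    question: str  # b_k (assumed: the rubric question)
    answer: str    # a_k (assumed: the judge's answer with evidence)

@dataclass
class JudgeOutput:
    rubric: List[RubricItem]             # s^i = {s_1, ..., s_{N_i}}
    verdict: bool                        # y-hat^i, the binary classification
    corrected_sql: Optional[str] = None  # c-tilde^i, optional corrected SQL

# Invented example for illustration only.
EXAMPLE = JudgeOutput(
    rubric=[
        RubricItem("Does SELECT return every requested column?", "Yes"),
        RubricItem("Is the JOIN condition correct?", "No: joins on the wrong key"),
    ],
    verdict=False,
    corrected_sql="SELECT a.name FROM a JOIN b ON a.id = b.a_id",
)

if __name__ == "__main__":
    print(EXAMPLE)
```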

Training minimizes the negative log-likelihood

$$\mathcal{L}_\mathrm{SFT} = -\sum_{i=1}^M \log P_\theta(O^i \mid X^i)$$

where $X^i = \{q, m, \hat{c}\}$ comprises the NL question, schema, and SQL candidate.
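The loss above can be illustrated numerically. The sketch assumes the usual autoregressive factorization, in which log P_θ(O^i | X^i) is the sum of per-token log-probabilities; the function names and the toy probabilities are illustrative.

```python
import math

def sequence_log_prob(token_probs):
    """log P(O | X) as a sum of per-token log-probabilities (assumed factorization)."""
    return sum(math.log(p) for p in token_probs)

def sft_loss(batch_token_probs):
    """Negative log-likelihood summed over the M samples in the batch."""
    return -sum(sequence_log_prob(probs) for probs in batch_token_probs)

if __name__ == "__main__":
    # Two toy samples with made-up token probabilities.
    batch = [[0.5, 0.5], [0.25]]
    print(f"L_SFT = {sft_loss(batch):.4f}")
```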

Reward decomposition includes:

  • Process (rubric) reward
  • Outcome reward
  • Format reward
  • Total reward, a combination of the three components with adjustable coefficients
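One simple way to picture this decomposition is a weighted sum of the three components. The sketch below is an assumption-laden illustration, not the published formulation: the rubric reward is taken to be the fraction of rubric items judged correctly, and the weight names and default values are hypothetical.

```python
def process_reward(rubric_hits):
    """Assumed definition: fraction of rubric items judged correctly."""
    return sum(rubric_hits) / len(rubric_hits) if rubric_hits else 0.0

def total_reward(r_process, r_outcome, r_format,
                 w_process, w_outcome=1.0, w_format=0.5):
    """Hypothetical weighted sum of process, outcome, and format rewards."""
    return w_process * r_process + w_outcome * r_outcome + w_format * r_format

if __name__ == "__main__":
    r_p = process_reward([True, False, True, True])
    print("process reward:", r_p)
    print("total reward:", total_reward(r_p, r_outcome=1.0, r_format=1.0,
                                        w_process=0.5))
```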

The progressive exploration strategy adjusts the reward coefficients to phase in dense rubric feedback as the RL agent masters basic outcomes and formatting. This curriculum is optimized with Group Relative Policy Optimization (GRPO), whose objective is expressed in terms of r_i, the policy ratio; A_i, the group-relative advantage computed from R_total; and D_KL, the KL penalty.
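The pieces named here can be sketched in a few lines. The advantage normalization and clipped surrogate below follow the commonly used GRPO form and are not taken from the cited paper; the linear ramp for the rubric coefficient, the clipping range, and all function names are hypothetical.

```python
import statistics

def rubric_weight(step, total_steps, max_weight=1.0):
    """Hypothetical linear ramp that phases in the rubric coefficient."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return max_weight * frac

def group_relative_advantages(rewards, eps=1e-8):
    """A_i = (R_i - mean(R)) / (std(R) + eps) within one sampled group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

def clipped_term(ratio, advantage, clip_range=0.2):
    """Per-sample clipped surrogate: min(r * A, clip(r, 1 - c, 1 + c) * A)."""
    clipped = min(max(ratio, 1.0 - clip_range), 1.0 + clip_range)
    return min(ratio * advantage, clipped * advantage)

if __name__ == "__main__":
    group_rewards = [0.2, 1.0, 1.5, 0.3]  # made-up R_total values for one group
    advantages = group_relative_advantages(group_rewards)
    print("advantages:", [round(a, 3) for a in advantages])
    print("rubric weight at 25 % of training:", rubric_weight(25, 100))
    print("clipped term:", clipped_term(1.5, advantages[2]))
```

Because advantages are normalized within each group, only the relative ranking of R_total among sampled critiques drives the update, which is why a better-separated reward signal matters for stable training.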

5. Quantitative Performance and Insights in RuCo-C Applications

Catalysis and Magnetism

  • Catalytic CH₄→CH₃+H decomposition on RuCo is suppressed by ≈30 % as Ru increases from 0–30 at % (k ≈ 0.20 ps⁻¹→0.14 ps⁻¹). Long-chain C₈–C₁₁ populations at 500 ps double compared to pure Co (Page et al., 30 Jul 2025).
  • In thin-film spintronics, non-collinear coupling (0°<θ<180°) is realizable for RuCo spacers with 44–61 at.% Co and thicknesses of 0.4–1.0 nm, with orthogonal coupling (θ = 90°) at d = 0.7 nm, x = 55 %. The biquadratic term J₂ reaches ≈2 mJ/m² (Nunn et al., 2019).

Text-to-SQL RL

On major benchmarks:

  • Spider dev set: RuCo-C (7B) achieves AUC 68.15 (+2.96), ACC 68.07 (+2.01), F1 67.33 (+9.28) over execution-only baselines.
  • BIRD dev set: RuCo-C (7B) yields AUC 72.40 (+5.52), ACC 68.29 (–6.72), F1 54.04 (+5.00).
  • Ablations confirm that static/dynamic rubric rewards yield 1–9 AUC point gains. RuCo-C reduces false positives/negatives and yields better reward separation and stable RL training (Wang et al., 27 Nov 2025).

6. Broader Implications and Future Research Directions

In catalysis, RuCo-C provides a model system illustrating how electronic structure manipulation (via d-band theory) tunes C–H activation and carbon assembly, offering a predictive lever for rational catalyst design. For spintronics, RuCo-C alloy spacers unlock robust, tunable non-collinear couplings, directly linking spacer magnetism to device-level angular control.

Algorithmically, RuCo-C's rubric-based RL paradigm demonstrates that interpretable, fine-grained critiques overcome the scalability and diagnostic bottlenecks of binary execution rewards in semantic tasks such as text-to-SQL. The prospective integration of RuCo-C critiques into generation models, extension to other semantic parsing domains, and the automation of dynamic reward schedules using meta-learning or difficulty estimation are identified future directions (Wang et al., 27 Nov 2025).

The RuCo-C label thus spans a materials context (catalysis and magnetism) and an algorithmic context (reinforcement learning for interpretable evaluation), with the Ru–Co–C motif as the common thread across disciplinary boundaries.
