
RedSage: Open-Source Cybersecurity LLM Framework

Updated 2 February 2026
  • RedSage is an open-source framework designed for training cybersecurity-focused large language models using domain-aware continual pretraining and expert data curation.
  • It employs a rigorous four-stage pipeline including data filtering, expert resource curation, agentic dialogue augmentation, and multi-stage training for robust performance.
  • The framework achieves state-of-the-art accuracy on cybersecurity benchmarks while enabling privacy-preserving, on-premises deployment on consumer-grade GPUs.

RedSage is an open-source, locally deployable framework for training LLMs as cybersecurity generalists. Developed to address privacy concerns associated with proprietary API-based LLM solutions and the limitations of open models lacking robust domain adaptation, RedSage combines domain-aware continual pretraining, curated expert resources, and a novel agentic augmentation pipeline. Its comprehensive methodology spans large-scale data curation, agent-driven dialogue augmentation, rigorous multi-stage training, and multifaceted evaluation, establishing reproducibility and state-of-the-art performance on domain-specific and general benchmarks (Suryanto et al., 29 Jan 2026).

1. End-to-End Pipeline and Workflow

RedSage employs a four-stage architecture, integrating bulk web data filtering, expert document curation, synthetic agentic data generation, and multi-step training. The pipeline is illustrated conceptually in Figures 1 and 2 of the reference paper and summarized as follows:

  1. Data Ingestion and Filtering:
    • FineWeb to CyberFineWeb: A binary classifier (ModernBERT-base) prunes a 17.2 T-token Common Crawl web collection to ≈89.8 B tokens (≈125 M documents), selecting potentially cybersecurity-relevant content.
    • General-Knowledge Replay: 30% of each chunk is sampled from FineWeb-Edu (1.3 T tokens), mitigating catastrophic forgetting of general knowledge during domain adaptation.
    • Deduplication: Global MinHash-LSH deduplication reduces the set to ≈46.8 B tokens.
    • Chunking and Selection: The deduplicated data is split into 20 chronological chunks; only the last 5 chunks (≈11.7 B tokens, 13 M docs) are retained and denoted as CyberFineWeb.
  2. Curated Expert Resources:
    • RedSage-Seed: 28,637 Markdown documents (≈149 M tokens), categorized by general/framework knowledge (e.g., MITRE ATT&CK, CWE, OWASP Top 10), offensive skills (CTF write-ups, HackTricks), CLI tools, and Kali tool docs.
    • RedSage-Dump: 459 K documents (≈700 M tokens) from standards (NVD, NIST, RFCs) and reputable portals. HTML data is transformed to Markdown via ReaderLM-v2; XML sources are parsed directly.
  3. Agentic Augmentation:
    • A planner agent analyzes each seed chunk, proposing augmentation plans over penetration-testing phases, tool usage, scenario-based role-plays, and step-wise reasoning.
    • An augmenter agent instantiates these proposals into persona-grounded, multi-turn conversations following a fixed chat format, subject to validity, technical, and topical consistency checks.
    • The resulting RedSage-Conv corpus comprises 266,180 multi-turn dialogues (≈353 M tokens, avg. 9.7 turns per dialogue).
  4. Model Training, Post-Training, and Deployment:
    • Continued Pretraining: CyberFineWeb followed by RedSage-Seed and RedSage-Dump, yielding RedSage-Base.
    • Supervised Fine-Tuning: RedSage-Conv dialogues plus SmolTalk2 produce RedSage-Ins.
    • Direct Preference Optimization (DPO): Incorporates the Tulu 3 Preference Mixture, yielding RedSage-DPO.
    • The final RedSage-8B-DPO model runs efficiently on a single consumer-grade 8 GB GPU, enabling on-premises inference.
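The general-knowledge replay in step 1 can be sketched as a simple sampling routine. This is an illustrative reconstruction, not the released pipeline code; the function name and the document-list interface are assumptions.

```python
import random

def build_training_chunk(cyber_docs, general_docs, chunk_size,
                         replay_ratio=0.30, seed=0):
    """Mix domain documents with general-knowledge replay.

    Per the pipeline above, 30% of each chunk is drawn from a
    general corpus (FineWeb-Edu) to mitigate catastrophic
    forgetting; the remaining 70% is domain-filtered data.
    """
    rng = random.Random(seed)
    n_general = round(chunk_size * replay_ratio)
    n_cyber = chunk_size - n_general
    chunk = rng.sample(cyber_docs, n_cyber) + rng.sample(general_docs, n_general)
    rng.shuffle(chunk)  # interleave domain and replay documents
    return chunk
```

The replay ratio is the only pipeline-specified parameter here; chunk size and seeding are placeholders.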

2. Domain-Specific Data Curation

RedSage’s data is partitioned into three principal resources:

| Dataset | Tokens (B) | Documents | Description |
|---|---|---|---|
| CyberFineWeb | 11.7 | 13 M | Filtered Common Crawl (last 5 of 20 chronological chunks) |
| RedSage-Seed | 0.15 | 28,637 | Markdown expert resources, category-delineated |
| RedSage-Dump | 0.7 | 459 K | Standards, educational, news, parsed documents |

  • CyberFineWeb: Filtered using a ModernBERT-base classifier trained on 4.6 M samples; incorporates a 30% replay ratio of FineWeb-Edu for generality.
  • RedSage-Seed: Systematically categorized, harvested from sources such as Wikipedia, MITRE frameworks, tool documentation, and targeted security write-ups.
  • RedSage-Dump: Assembles standards (NIST, NVD, RFCs) and broad educational or technical news content pertinent to cybersecurity.

Processing pipelines convert HTML to Markdown, deduplicate at scale, and parse structured formats directly, ensuring data consistency and relevance for supervised and conversational fine-tuning [(Suryanto et al., 29 Jan 2026), Appendix Tables A.1–A.2].
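The deduplication step can be illustrated with a minimal pure-Python MinHash sketch. The actual pipeline applies MinHash-LSH at billion-token scale with an LSH index; the helper names below are hypothetical, and the greedy all-pairs comparison is for exposition only.

```python
import hashlib

def shingle(text, k=5):
    """Word-level k-shingles of a document."""
    toks = text.lower().split()
    if len(toks) < k:
        return {" ".join(toks)}
    return {" ".join(toks[i:i + k]) for i in range(len(toks) - k + 1)}

def minhash_signature(shingles, num_perm=64):
    """One min-hash per seeded hash function approximates a random permutation."""
    sig = []
    for seed in range(num_perm):
        sig.append(min(
            int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
            for s in shingles))
    return tuple(sig)

def estimated_jaccard(sig_a, sig_b):
    """Fraction of agreeing components estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def dedup(docs, threshold=0.8):
    """Greedy near-duplicate removal (LSH banding omitted for brevity)."""
    kept, sigs = [], []
    for d in docs:
        s = minhash_signature(shingle(d))
        if all(estimated_jaccard(s, t) < threshold for t in sigs):
            kept.append(d)
            sigs.append(s)
    return kept
```

At real scale the pairwise loop is replaced by LSH banding so that only candidate collisions are compared.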

3. Agentic Dialogue Augmentation

RedSage leverages a two-stage agentic augmentation regime, modeled after AgentInstruct:

  • Planner Agent: For each document chunk $s$, outputs a structured plan $A(s) = \{(k_1, t_1), \ldots, (k_{m_s}, t_{m_s})\}$ specifying cybersecurity skill sets and augmentation types, such as role-plays or conceptual Q&A, spanning full penetration-testing arcs.
  • Augmenter Agent: Each plan element $\pi$ is instantiated into a set of multi-turn, persona-based conversations $B(s, \pi) = \{c_1, \ldots, c_{r_{s,\pi}}\}$, where each $c$ obeys a strict dialogue format.

The aggregate dialogue set is constructed as

$$\text{RedSage-Conv} = \bigcup_{s \in S} \, \bigcup_{\pi \in A(s)} B(s, \pi),$$

yielding 266,180 distinct dialogues. Quality controls ensure format validity, technical soundness, and topical fidelity. The result is a dataset with broad coverage over skills, frameworks, tools, and scenario-based reasoning [(Suryanto et al., 29 Jan 2026), Table 3 and Figures 5–6].
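The union over seeds and plan elements maps directly onto a nested loop. In this sketch, `planner`, `augmenter`, and `is_valid` are hypothetical stand-ins for the two LLM-backed agents and the quality-control checks described above.

```python
def build_redsage_conv(seeds, planner, augmenter, is_valid):
    """Collect the union of B(s, pi) over every pi in A(s), s in S.

    planner(s)       -> iterable of plan elements (A(s))
    augmenter(s, pi) -> iterable of conversations (B(s, pi))
    is_valid(c)      -> format/technical/topical consistency check
    """
    corpus = []
    for s in seeds:                        # s in S
        for pi in planner(s):              # pi in A(s)
            for conv in augmenter(s, pi):  # c in B(s, pi)
                if is_valid(conv):
                    corpus.append(conv)
    return corpus
```

With the real agents, each call involves LLM inference; here the structure of the aggregation is what matters.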

4. Training Protocols and Optimization

RedSage training is segmented into three phases:

  • Continued Pretraining (CPT): Using Qwen3-8B-Base as the initialization point, CPT proceeds in two stages: a single epoch on five CyberFineWeb chunks, followed by one epoch on RedSage-Seed plus RedSage-Dump. AdamW optimizer is used (lr = 2.5 × 10⁻⁶, batch = 1024, bf16), with 1 K-step warmup and early stopping.

$$\mathcal{L}_{\mathrm{CPT}} = -\sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t})$$

  • Supervised Fine-Tuning (SFT): Conducted over RedSage-Conv and SmolTalk2 (non-reasoning subset), with 2 epochs, cosine decay learning rate (base lr = 2.5 × 10⁻⁵), batch 32 × 32, seq_length 32,768, and cross-entropy loss.
  • Direct Preference Optimization (DPO): Employs Tulu 3 8B Preference Mixture as data, minimizing

$$\mathcal{L}_{\mathrm{DPO}} = -\,\mathbb{E}_{(x, y^+, y^-)} \log \sigma\bigl(\beta \left[ s_\theta(y^+ \mid x) - s_\theta(y^- \mid x) \right]\bigr)$$

where $\beta$ is the temperature parameter, targeting improved preference alignment without sacrificing accuracy. Cumulative training approximates 4,096 GPU-hours [(Suryanto et al., 29 Jan 2026), Table A.4].
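The per-example DPO objective can be written out directly. Standard DPO defines $s_\theta(y \mid x)$ as the log-probability ratio between the policy and a frozen reference model; that reading is assumed in this sketch.

```python
import math

def dpo_loss(policy_logp_pos, policy_logp_neg,
             ref_logp_pos, ref_logp_neg, beta=0.1):
    """Per-example DPO loss: -log sigma(beta * (s+ - s-)),
    where s = policy log-prob minus reference log-prob
    (the standard implicit-reward reading of s_theta)."""
    s_pos = policy_logp_pos - ref_logp_pos
    s_neg = policy_logp_neg - ref_logp_neg
    margin = beta * (s_pos - s_neg)
    # numerically stable -log(sigmoid(margin))
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy assigns the preferred response a larger margin over the reference than the rejected one, the loss falls below $\log 2$; a reversed preference drives it above.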

5. Evaluation Methodology

Evaluation spans both the newly introduced RedSage-Bench and established cybersecurity plus general LLM benchmarks:

  • RedSage-Bench: Composed of 30,000 MCQs (10 K each for Knowledge, Skills, and Tools) and 240 open-ended Q&A. Questions are synthesized with open-source LLMs under strict prompting and validated via two-stage LLM verification, including chain-of-thought scoring and taxonomy-based sampling. Open-ended answers are judged by agentic OpenQA pipelines, LLM-as-judge rubrics, and human verification, with each response scored 0–10 for factual accuracy and qualitative quality.
    • MCQ accuracy:

    $$\mathrm{Acc} = \frac{\#\,\text{correct answers}}{\#\,\text{questions}}$$

  • External Benchmarks:

    • Cybersecurity: CTI-Bench, CyberMetric, SecBench-En, SecEval, MMLU-CSec, SECURE. Base models use 5-shot; instruction-tuned models use 0-shot evaluation protocols.
    • General LLM: ARC-C, HellaSwag, TruthfulQA, MMLU, WinoGrande, GSM8K, IFEval.
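The MCQ accuracy metric is a straightforward ratio; a minimal helper (hypothetical name) makes it explicit:

```python
def mcq_accuracy(predictions, answers):
    """Fraction of multiple-choice questions answered correctly."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must align")
    return sum(p == a for p, a in zip(predictions, answers)) / len(answers)
```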

Results for RedSage-8B-DPO, compared to Qwen3-8B, are summarized in the following table:

| Benchmark | RedSage-8B-DPO | Qwen3-8B | Improvement |
|---|---|---|---|
| RedSage-Bench (MCQ) | 84.83% | 81.85% | +2.98 |
| CTI-Bench (Mean) | 81.10% | 75.71% | +5.39 |
| Open LLM Mean | 74.33% | 65.92% | +8.41 |

Comprehensive results are detailed in Tables 4–6 of the reference.

6. Principal Findings and Implications

Empirical analyses highlight that domain-aware pretraining, combined with agentic post-training, produces pronounced gains in cybersecurity expertise—up to +5.59 points over prevailing open-source 8B models on relevant benchmarks. Tool-proficiency and framework-related capacities benefit considerably from curated RedSage-Seed data and agentic dialogue structures.

DPO-driven preference alignment enhances open-ended answer quality (mean improvement +0.07) while maintaining high factual accuracy. Notably, domain-specific training also generates positive transfer to general reasoning and instruction-following tasks (e.g., GSM8K, ARC-C), yielding a 2–4 point advantage. The entire RedSage pipeline is open source—data, models, and code—and deployable on standard consumer GPUs, facilitating privacy-preserving local cybersecurity assistants (Suryanto et al., 29 Jan 2026).

A plausible implication is that large-scale filtered pretraining, combined with expert-curated resources and agentic dialogue simulation, is a reproducible methodology for domain-adapted generalist LLMs capable of simultaneous domain-specific and broader analytical competencies.
