RedSage: Open-Source Cybersecurity LLM Framework
- RedSage is an open-source framework designed for training cybersecurity-focused large language models using domain-aware continual pretraining and expert data curation.
- It employs a rigorous four-stage pipeline including data filtering, expert resource curation, agentic dialogue augmentation, and multi-stage training for robust performance.
- The framework achieves state-of-the-art accuracy on cybersecurity benchmarks while enabling privacy-preserving, on-premises deployment on consumer-grade GPUs.
RedSage is an open-source, locally deployable framework for training LLMs as cybersecurity generalists. Developed to address privacy concerns associated with proprietary API-based LLM solutions and the limitations of open models lacking robust domain adaptation, RedSage combines domain-aware continual pretraining, curated expert resources, and a novel agentic augmentation pipeline. Its comprehensive methodology spans large-scale data curation, agent-driven dialogue augmentation, rigorous multi-stage training, and multifaceted evaluation, establishing reproducibility and state-of-the-art performance on domain-specific and general benchmarks (Suryanto et al., 29 Jan 2026).
1. End-to-End Pipeline and Workflow
RedSage employs a four-stage architecture, integrating bulk web data filtering, expert document curation, synthetic agentic data generation, and multi-step training. The pipeline is illustrated conceptually in Figures 1 and 2 of the reference paper and summarized as follows:
- Data Ingestion and Filtering:
- FineWeb to CyberFineWeb: A binary classifier (ModernBERT-base) prunes a 17.2 T-token Common Crawl web collection to ≈89.8 B tokens (≈125 M documents), selecting potentially cybersecurity-relevant content.
- General-Knowledge Replay: 30% of each chunk is sampled from FineWeb-Edu (1.3 T tokens), mitigating catastrophic forgetting of general knowledge during domain adaptation.
- Deduplication: Global MinHash-LSH deduplication reduces the set to ≈46.8 B tokens.
- Chunking and Selection: The deduplicated data is split into 20 chronological chunks; only the last 5 chunks (≈11.7 B tokens, 13 M docs) are retained and denoted as CyberFineWeb.
- Curated Expert Resources:
- RedSage-Seed: 28,637 Markdown documents (≈149 M tokens), categorized by general/framework knowledge (e.g., MITRE ATT&CK, CWE, OWASP Top 10), offensive skills (CTF write-ups, HackTricks), CLI tools, and Kali tool docs.
- RedSage-Dump: 459 K documents (≈700 M tokens) from standards (NVD, NIST, RFCs) and reputable portals. HTML data is transformed to Markdown via ReaderLM-v2; XML sources are parsed directly.
- Agentic Augmentation:
- A planner agent analyzes each seed chunk, proposing augmentation plans over penetration-testing phases, tool usage, scenario-based role-plays, and step-wise reasoning.
- An augmenter agent instantiates these proposals into persona-grounded, multi-turn conversations following a fixed chat format, subject to validity, technical, and topical consistency checks.
- The resulting RedSage-Conv corpus comprises 266,180 multi-turn dialogues (≈353 M tokens, avg. 9.7 turns per dialogue).
- Model Training, Post-Training, and Deployment:
- Continued Pretraining: CyberFineWeb followed by RedSage-Seed and RedSage-Dump, yielding RedSage-Base.
- Supervised Fine-Tuning: RedSage-Conv dialogues plus SmolTalk2 produce RedSage-Ins.
- Direct Preference Optimization (DPO): Incorporates the Tulu 3 Preference Mixture, yielding RedSage-DPO.
- The final RedSage-8B-DPO model runs efficiently on a single consumer-grade 8 GB GPU, enabling on-premises inference.
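The MinHash-LSH deduplication step in the ingestion stage can be sketched in miniature in pure Python. This is an illustrative single-machine toy (helper names are ours, not the paper's); the actual pipeline operates at multi-billion-token scale with far larger signature and banding configurations.

```python
import hashlib
from collections import defaultdict

def shingles(text: str, k: int = 5) -> set:
    """Character k-gram shingles of a document."""
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}

def minhash(sh: set, num_perm: int = 64) -> list:
    """MinHash signature: minimum hash value under each simulated permutation."""
    return [
        min(int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16) for s in sh)
        for seed in range(num_perm)
    ]

def lsh_candidate_pairs(docs: dict, num_perm: int = 64, bands: int = 16) -> set:
    """Band each signature; documents sharing any band key are duplicate candidates."""
    rows = num_perm // bands
    buckets = defaultdict(set)
    for doc_id, text in docs.items():
        sig = minhash(shingles(text), num_perm)
        for b in range(bands):
            buckets[(b, tuple(sig[b * rows:(b + 1) * rows]))].add(doc_id)
    pairs = set()
    for ids in buckets.values():
        ordered = sorted(ids)
        for i in range(len(ordered)):
            for j in range(i + 1, len(ordered)):
                pairs.add((ordered[i], ordered[j]))
    return pairs
```

Near-duplicate documents collide in at least one band with high probability, so only candidate pairs (not all document pairs) need exact comparison, which is what makes deduplication tractable at this scale.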
2. Domain-Specific Data Curation
RedSage’s data is partitioned into three principal resources:
| Dataset | Tokens (B) | Documents | Description |
|---|---|---|---|
| CyberFineWeb | 11.7 | 13 M | Filtered Common Crawl (recent 5 chunks)[a] |
| RedSage-Seed | 0.15 | 28,637 | Markdown expert resources, category-delineated |
| RedSage-Dump | 0.7 | 459 K | Standards, educational, news, parsed documents |
[a] Only last 5 of 20 chronological chunks.
- CyberFineWeb: Filtered using a ModernBERT-base classifier trained on 4.6 M samples; incorporates a 30% replay ratio of FineWeb-Edu for generality.
- RedSage-Seed: Systematically categorized, harvested from sources such as Wikipedia, MITRE frameworks, tool documentation, and targeted security write-ups.
- RedSage-Dump: Assembles standards (NIST, NVD, RFCs) and broad educational or technical news content pertinent to cybersecurity.
Processing pipelines convert HTML to Markdown, deduplicate at scale, and parse structured formats directly, ensuring data consistency and relevance for supervised and conversational fine-tuning [(Suryanto et al., 29 Jan 2026), Appendix Tables A.1–A.2].
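The HTML-to-Markdown step can be illustrated with a crude rule-based converter built on Python's stdlib parser. Note that RedSage uses ReaderLM-v2, a learned model, for this conversion; the class below is only a structural stand-in with hypothetical names.

```python
from html.parser import HTMLParser

class HTMLToMarkdown(HTMLParser):
    """Toy rule-based HTML-to-Markdown converter (illustrative stand-in;
    the RedSage pipeline uses the ReaderLM-v2 model for this step)."""

    def __init__(self):
        super().__init__()
        self.out = []      # accumulated Markdown fragments
        self.prefix = ""   # pending line prefix (heading marks, list dash)

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.prefix = "#" * int(tag[1]) + " "
        elif tag == "li":
            self.prefix = "- "
        elif tag == "code":
            self.out.append("`")

    def handle_endtag(self, tag):
        if tag == "code":
            self.out.append("`")
        elif tag in ("h1", "h2", "h3", "p", "li"):
            self.out.append("\n")

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.out.append(self.prefix + text)
            self.prefix = ""

    def convert(self, html: str) -> str:
        self.feed(html)
        return "".join(self.out).strip()
```

A learned converter handles boilerplate removal and layout recovery that rule-based logic like this cannot, which is the motivation for using a model in the actual pipeline.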
3. Agentic Dialogue Augmentation
RedSage leverages a two-stage agentic augmentation regime, modeled after AgentInstruct:
- Planner Agent: For each document chunk, the planner outputs a structured plan specifying cybersecurity skill sets and augmentation types, such as role-plays or conceptual Q&A, spanning full penetration-testing arcs.
- Augmenter Agent: Each plan element is instantiated into a set of multi-turn, persona-based conversations, each obeying a strict dialogue format.
The aggregate dialogue set, formed as the union of all instantiated conversations across chunks and plan elements, comprises 266,180 distinct dialogues. Quality controls ensure format validity, technical soundness, and topical fidelity. The result is a dataset with broad coverage over skills, frameworks, tools, and scenario-based reasoning [(Suryanto et al., 29 Jan 2026), Table 3 and Figures 5–6].
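The planner-augmenter loop and its format-validity check can be sketched as follows. Function names, prompts, and the JSON plan schema are illustrative assumptions rather than the paper's interfaces, and the LLM call is passed in as a callable so it can be stubbed.

```python
import json

def valid_dialogue(turns, min_turns=2):
    """Quality control sketch: non-empty, alternating user/assistant turns, user first."""
    if len(turns) < min_turns:
        return False
    for i, t in enumerate(turns):
        expected = "user" if i % 2 == 0 else "assistant"
        if t.get("role") != expected or not t.get("content", "").strip():
            return False
    return True

def plan(chunk, llm):
    """Planner agent: propose augmentation items (skill, type) for a seed chunk."""
    prompt = ("Given this cybersecurity document, list augmentation plans as JSON "
              '[{"skill": ..., "type": ...}]:\n' + chunk)
    return json.loads(llm(prompt))

def augment(chunk, item, llm):
    """Augmenter agent: instantiate one plan item as a multi-turn dialogue."""
    prompt = (f"Write a multi-turn {item['type']} dialogue exercising the skill "
              f"{item['skill']!r}, grounded in:\n{chunk}")
    return json.loads(llm(prompt))

def build_conversations(chunks, llm):
    """Run planner then augmenter over every chunk; keep only valid dialogues."""
    convs = []
    for chunk in chunks:
        for item in plan(chunk, llm):
            dialogue = augment(chunk, item, llm)
            if valid_dialogue(dialogue):
                convs.append(dialogue)
    return convs
```

Separating planning from instantiation lets a single seed chunk fan out into many dialogues with distinct skills, personas, and formats, while the validity filter discards malformed generations before they reach training.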
4. Training Protocols and Optimization
RedSage training is segmented into three phases:
- Continued Pretraining (CPT): Starting from Qwen3-8B-Base, CPT proceeds in two stages: a single epoch over the five CyberFineWeb chunks, followed by one epoch on RedSage-Seed plus RedSage-Dump. The AdamW optimizer is used (lr = 2.5 × 10⁻⁶, batch = 1024, bf16) with a 1 K-step warmup and early stopping.
- Supervised Fine-Tuning (SFT): Conducted over RedSage-Conv and SmolTalk2 (non-reasoning subset), with 2 epochs, cosine decay learning rate (base lr = 2.5 × 10⁻⁵), batch 32 × 32, seq_length 32,768, and cross-entropy loss.
- Direct Preference Optimization (DPO): Employs the Tulu 3 8B Preference Mixture as data, minimizing the standard DPO objective

$$\mathcal{L}_{\mathrm{DPO}} = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]$$

(where $\beta$ is the temperature parameter), targeting improved preference alignment without sacrificing accuracy. Cumulative training approximates 4,096 GPU-hours [(Suryanto et al., 29 Jan 2026), Table A.4].
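The per-example DPO loss depends only on policy and reference log-probabilities of the chosen and rejected responses; a minimal sketch (function name and the β default are illustrative):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * (chosen log-ratio - rejected log-ratio))."""
    chosen_ratio = logp_chosen - ref_logp_chosen      # log pi/pi_ref for preferred y_w
    rejected_ratio = logp_rejected - ref_logp_rejected  # log pi/pi_ref for dispreferred y_l
    margin = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)
```

When the policy matches the reference model, both log-ratios are zero and the loss sits at log 2; pushing the preferred response's ratio above the dispreferred one's drives the loss toward zero.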
5. Evaluation Methodology
Evaluation spans both the newly introduced RedSage-Bench and established cybersecurity plus general LLM benchmarks:
- RedSage-Bench: Composed of 30,000 MCQs (10 K each for Knowledge, Skills, and Tools) and 240 open-ended Q&A items. Questions are synthesized with open-source LLMs under strict prompting and validated via two-stage LLM verification, including chain-of-thought scoring and taxonomy-based sampling. Open-ended answers are judged via agentic OpenQA pipelines, LLM-as-judge rubrics, and human verification, and are scored for factual accuracy and response quality on a 0–10 scale.
- MCQ accuracy is computed as the fraction of correctly answered questions: $\text{Accuracy} = N_{\text{correct}} / N_{\text{total}}$.
- External Benchmarks:
- Cybersecurity: CTI-Bench, CyberMetric, SecBench-En, SecEval, MMLU-CSec, SECURE. Base models use 5-shot; instruction-tuned models use 0-shot evaluation protocols.
- General LLM: ARC-C, HellaSwag, TruthfulQA, MMLU, WinoGrande, GSM8K, IFEval.
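The MCQ metric above reduces to an exact-match fraction over predicted choice letters; a minimal sketch (hypothetical helper name):

```python
def mcq_accuracy(preds: list, golds: list) -> float:
    """Fraction of MCQ items where the predicted choice matches the answer key."""
    assert len(preds) == len(golds), "prediction/key length mismatch"
    correct = sum(p == g for p, g in zip(preds, golds))
    return correct / len(golds)
```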
Results for RedSage-8B-DPO, compared to Qwen3-8B, are summarized in the following table:
| Benchmark | RedSage-8B-DPO | Qwen3-8B | Improvement |
|---|---|---|---|
| RedSage-Bench (MCQ) | 84.83% | 81.85% | +2.98 |
| CTI-Bench (Mean) | 81.10% | 75.71% | +5.39 |
| Open LLM Mean | 74.33% | 65.92% | +8.41 |
Comprehensive results are detailed in Tables 4–6 of the reference.
6. Principal Findings and Implications
Empirical analyses highlight that domain-aware pretraining, combined with agentic post-training, produces pronounced gains in cybersecurity expertise, up to +5.59 points over prevailing open-source 8B models on relevant benchmarks. Tool proficiency and framework-related capabilities benefit considerably from the curated RedSage-Seed data and agentic dialogue structures.
DPO-driven preference alignment enhances open-ended answer quality (mean improvement +0.07) while maintaining high factual accuracy. Notably, domain-specific training also generates positive transfer to general reasoning and instruction-following tasks (e.g., GSM8K, ARC-C), yielding a 2–4 point advantage. The entire RedSage pipeline is open source—data, models, and code—and deployable on standard consumer GPUs, facilitating privacy-preserving local cybersecurity assistants (Suryanto et al., 29 Jan 2026).
A plausible implication is that large-scale filtered pretraining, combined with expert-curated resources and agentic dialogue simulation, is a reproducible methodology for domain-adapted generalist LLMs capable of simultaneous domain-specific and broader analytical competencies.