SecureCode v2.0: Incident-Grounded Secure Coding Dataset

Updated 27 December 2025
  • SecureCode v2.0 is a production-grade dataset that provides 1,215 rigorously validated coding examples grounded in real-world security incidents.
  • It models realistic four-turn developer–AI dialogues with both vulnerable and secure code snippets annotated with specific CVE identifiers.
  • The dataset spans 11 programming languages and covers OWASP Top 10 and AI/ML security threats, enhancing secure code generation model training.

SecureCode v2.0 is a production-grade dataset purpose-built for the training and evaluation of security-aware code generation models. Addressing the high incidence of vulnerable code produced by AI assistants in security-relevant contexts, it offers 1,215 rigorously validated coding scenarios, each closely grounded in real-world security incidents. Every example is structurally modeled on actual developer–AI assistant workflows and operationalizes contemporary security guidance, making it fundamentally distinct from earlier secure coding resources in both scale and incident fidelity (Thornton, 20 Dec 2025).

1. Scope, Composition, and Vulnerability Coverage

SecureCode v2.0 comprises 1,215 security-focused coding examples, partitioned as follows:

Split        Count
Train          989
Validation     122
Test           104
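
As a quick arithmetic check, the split sizes sum to the stated total of 1,215 and correspond to roughly an 81/10/9 partition. The snippet below only verifies the published counts; the variable names are illustrative.

```python
# Split sizes as reported in the table above.
splits = {"train": 989, "validation": 122, "test": 104}

total = sum(splits.values())

# Share of each split as a percentage, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in splits.items()}

print(total)   # 1215
print(shares)  # {'train': 81.4, 'validation': 10.0, 'test': 8.6}
```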

The dataset spans 11 programming and specification languages—Python, JavaScript (Node.js/Express, NestJS), Java (Spring Boot), Go (Gin), PHP (Laravel/Symfony), C# (ASP.NET Core), TypeScript, Ruby (Rails), Rust, Kotlin, and YAML (infrastructure-as-code)—reflecting their prevalence in production systems and direct mapping to incidents indexed in the CVE database between 2017 and 2025.

Coverage includes the complete OWASP Top 10:2025 categories and extends to AI/ML security threats, totaling eleven vulnerability classes:

  1. Broken Access Control
  2. Security Misconfiguration
  3. Injection
  4. Cryptographic Failures
  5. Insecure Design
  6. Vulnerable & Outdated Components
  7. Identification & Authentication Failures
  8. Software & Data Integrity Failures
  9. Security Logging & Monitoring Failures
  10. Server-Side Request Forgery (SSRF)
  11. AI/ML Security Threats

Each scenario is anchored in a documented security incident via CVE identifiers.

2. Structural Design and Conversational Modality

Every example in SecureCode v2.0 embodies a realistic four-turn developer–AI assistant dialogue, reflecting iterative real-world code development and review:

  1. Developer Request: Context-rich functional or architectural feature request.
  2. AI Assistant Response: Side-by-side vulnerable and secure code solutions (explicitly annotated), referencing a specific CVE, paired with a succinct attack demonstration.
  3. Developer Follow-Up: Advanced or scaled scenario question (e.g., performance under production loads).
  4. AI Assistant Defense-in-Depth Guidance: Operational advice including SIEM (Security Information and Event Management) integration, logging, container security, monitoring, and infrastructure controls.

An excerpted JSON structure for the second turn illustrates the approach:

{
  "turn": 2,
  "speaker": "assistant",
  "cve": "CVE-2019-12345",
  "vulnerable_code": {
    "language": "Python",
    "snippet": "def get_user(email):\n    query = f\"SELECT * FROM users WHERE email = '{email}'\"  # SQL injection\n    return db.execute(query)"
  },
  "secure_code": {
    "language": "Python",
    "snippet": "def get_user(email):\n    stmt = \"SELECT * FROM users WHERE email = %s\"\n    return db.execute(stmt, (email,))  # parameterized"
  },
  "attack_demo": "If email = \"' OR '1'='1\" then all rows returned."
}

The turn-based scaffold ensures that both naive and robust implementations, exploitability, and operational context are delivered per scenario.
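
The attack demonstration in the excerpt can be reproduced end to end. The following is a minimal sketch, assuming an in-memory SQLite database in place of the unspecified `db` object; the table layout, the sample rows, and the function names `get_user_vulnerable` and `get_user_secure` are hypothetical. Note that `sqlite3` uses `?` as its placeholder where the excerpt's driver uses `%s`.

```python
import sqlite3

def get_user_vulnerable(conn, email):
    # String interpolation: the input becomes part of the SQL text.
    query = f"SELECT * FROM users WHERE email = '{email}'"
    return conn.execute(query).fetchall()

def get_user_secure(conn, email):
    # Placeholder binding: the input is passed as data, never parsed as SQL.
    return conn.execute("SELECT * FROM users WHERE email = ?", (email,)).fetchall()

# Hypothetical fixture: two users in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [("alice@example.com",), ("bob@example.com",)])

payload = "' OR '1'='1"
leaked = get_user_vulnerable(conn, payload)  # WHERE clause is always true
safe = get_user_secure(conn, payload)        # no user has this literal email

print(len(leaked), len(safe))  # 2 0
```

With the interpolated query, the payload rewrites the `WHERE` clause so that it matches every row; with the bound parameter, the same payload is compared as an ordinary string and matches nothing.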

3. Incident Grounding and Validation Framework

Every example directly references a real CVE or substantiated security breach (e.g., CVE-2017-5638, Equifax Struts2 OGNL RCE). Structural and compliance integrity is rigorously enforced through an automated framework (validate_contributing_compliance.py), which applies the following checks:

  • Enforces presence of all four conversational turns
  • Validates CVE identifier format (CVE-\d{4}-\d+)
  • Restricts language tags to a validated set
  • Applies minimum content length thresholds (50+ characters in developer turns, 100+ in assistant turns)
  • Guarantees inclusion of both vulnerable and patched code with operational instructions
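
The first four checks can be sketched as follows. This is not the code of validate_contributing_compliance.py, which is not reproduced here; it is an illustrative approximation that assumes each example is a dictionary with hypothetical `cve`, `language`, and `turns` keys, where each turn carries `speaker` and `content`. The language allowlist is likewise an assumption, and the fifth check (presence of vulnerable code, patched code, and operational instructions) is omitted because its criteria are not specified.

```python
import re

CVE_PATTERN = re.compile(r"CVE-\d{4}-\d+")
# Hypothetical allowlist; the real script's set of tags is not given in the text.
ALLOWED_LANGUAGES = {"python", "javascript", "java", "go", "php", "csharp",
                     "typescript", "ruby", "rust", "kotlin", "yaml"}
# Minimum content length per speaker, as stated in the list above.
MIN_LENGTH = {"developer": 50, "assistant": 100}

def validate_example(example):
    """Return a list of problems; an empty list means the example passes."""
    problems = []
    turns = example.get("turns", [])
    if len(turns) != 4:
        problems.append(f"expected 4 turns, found {len(turns)}")
    if not CVE_PATTERN.fullmatch(example.get("cve", "")):
        problems.append("CVE identifier does not match CVE-\\d{4}-\\d+")
    if example.get("language", "").lower() not in ALLOWED_LANGUAGES:
        problems.append("language tag not in the allowed set")
    for index, turn in enumerate(turns, start=1):
        minimum = MIN_LENGTH.get(turn.get("speaker"), 0)
        if len(turn.get("content", "")) < minimum:
            problems.append(f"turn {index} shorter than {minimum} characters")
    return problems
```

Returning a list of problems rather than a boolean makes it straightforward to tally failure types across the dataset, which is the kind of breakdown reported for the compliance effort.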

Formally, dataset-wide incident association is quantified as:

$$\text{CoverageRatio} = \frac{\#\{\text{examples with valid CVE tags}\}}{\text{total examples}} = \frac{1{,}215}{1{,}215} = 1.00$$

Duplication is actively filtered using Jaccard similarity:

$$J(A,B) = \frac{|A \cap B|}{|A \cup B|}, \quad \text{exclude if } J > 0.8$$
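
A minimal sketch of this filter is shown below. The text does not say what the sets A and B contain, so the sketch assumes they are sets of lowercase whitespace-separated tokens; the helper names `tokens` and `deduplicate` are hypothetical, and the greedy keep-first strategy is one possible choice rather than the documented one.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    if not a and not b:
        return 1.0  # convention: two empty sets are treated as identical
    return len(a & b) / len(a | b)

def tokens(text):
    # Assumed representation: the set of lowercase whitespace-separated tokens.
    return set(text.lower().split())

def deduplicate(texts, threshold=0.8):
    """Keep a text unless its similarity to an already kept text exceeds threshold."""
    kept = []
    for text in texts:
        candidate = tokens(text)
        if all(jaccard(candidate, tokens(other)) <= threshold for other in kept):
            kept.append(text)
    return kept

print(jaccard({1, 2, 3}, {2, 3, 4}))  # 0.5
```

The pairwise comparison is quadratic in the number of examples, which is unproblematic at a scale of roughly a thousand entries.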

A documented increase in compliance from 47.2% to 100% was achieved over six weeks, resolving issues such as 452 CVE-format mismatches, 60 language tag errors, 86 missing guidance entries, and 6 incomplete SSTI (Server-Side Template Injection) scenarios.

4. Security Content: Code, Attacks, and Mitigations

Each scenario includes both vulnerable and secure code variants, with explicit exploit demonstrations and mitigation strategies. Examples across languages illustrate depth:

JavaScript (Express): Command Injection

Vulnerable:

app.get('/backup', (req,res) => {
  const cmd = `tar czf /tmp/backup.tgz ${req.query.dir}`;
  require('child_process').exec(cmd, (err,out) => { res.send(out); });
});
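
Secure:

const path = require('path');
const { spawn } = require('child_process');
const ROOT = '/safe/root';
app.get('/backup', (req, res) => {
  // Resolve against the root and reject any path that escapes it
  const dir = path.resolve(ROOT, String(req.query.dir || ''));
  if (dir !== ROOT && !dir.startsWith(ROOT + path.sep)) return res.sendStatus(400);
  // Argument array, no shell: metacharacters in dir are never interpreted
  const tar = spawn('tar', ['czf', '/tmp/backup.tgz', dir], { shell: false });
  tar.on('close', (code) => res.sendStatus(code === 0 ? 200 : 500));
});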

Go (Gin): SSRF

Vulnerable:

url := c.Query("url")
resp, _ := http.Get(url)  // uncontrolled SSRF
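
Secure:

allowed := map[string]bool{"example.com": true}
u, err := url.Parse(c.Query("url"))
// Hostname() drops any port, so "example.com:8080" still matches the allowlist key
if err != nil || (u.Scheme != "http" && u.Scheme != "https") || !allowed[u.Hostname()] {
  c.AbortWithStatus(http.StatusBadRequest)
  return
}
resp, err := http.Get(u.String())
if err != nil {
  c.AbortWithStatus(http.StatusBadGateway)
  return
}
defer resp.Body.Close()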

Attack vectors addressed include authentication bypass, various injection modes (SQL, OS command), deserialization RCE, and SSRF. Turn 4 systematically delivers defense-in-depth measures: detailed logging, proactive WAF (Web Application Firewall) rules matching injection payloads (e.g., (\b(union|select)\b)), container lockdown (Docker non-root, read-only FS), and AppArmor confinement.
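
The behavior of the quoted WAF pattern can be examined directly. The sketch below compiles it with Python's `re` module; case-insensitive matching is an assumption, since the text does not say how the rule treats case, and the function name `is_suspicious` is hypothetical.

```python
import re

# The pattern quoted above, compiled case-insensitively (an assumption).
INJECTION_RULE = re.compile(r"(\b(union|select)\b)", re.IGNORECASE)

def is_suspicious(value):
    """Return True if the value contains UNION or SELECT as a whole word."""
    return INJECTION_RULE.search(value) is not None

print(is_suspicious("1 UNION SELECT password FROM users"))  # True
print(is_suspicious("reunion photos"))                      # False: \b needs a word boundary
print(is_suspicious("please select a plan"))                # True: a false positive
```

The third case shows why keyword rules of this kind serve as one layer among several rather than as a replacement for parameterized queries: ordinary text containing the word "select" also matches.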

5. Operational Security and Testing Practices

Operational guidance is comprehensively embedded, focusing on application monitoring and infrastructure defense:

  • SIEM Integration: Structured logs (e.g., fields user_id, endpoint, error_code) and example Splunk queries (index=app_logs error_code=401 | stats count by source_ip).
  • Infrastructure Hardening: Docker security (e.g., non-root containers), enforced AppArmor profiles, and tailored WAF configurations.
  • Language-Specific Testing: E.g., pytest with pytest-security (Python), Jest plus supertest (JavaScript), and Go’s table-driven SSRF testing.

Each scenario's guidance ensures defense-in-depth across the software–infrastructure boundary.
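
As an illustration of the structured-log guidance, the sketch below emits one JSON object per log record with the fields named above (user_id, endpoint, error_code), using only the Python standard library. The class name `JsonFormatter`, the logger name, and the sample values are hypothetical; the dataset's own logging examples may differ.

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so a SIEM can index its fields."""
    FIELDS = ("user_id", "endpoint", "error_code")

    def format(self, record):
        entry = {"level": record.levelname, "message": record.getMessage()}
        # Values passed through `extra=` become attributes on the record.
        for field in self.FIELDS:
            entry[field] = getattr(record, field, None)
        return json.dumps(entry)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("app_logs")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Hypothetical event: a failed login attempt.
logger.warning("authentication failed",
               extra={"user_id": "u-123", "endpoint": "/login", "error_code": 401})
```

Emitting fixed, named fields is what allows queries such as the Splunk example to filter on error_code without parsing free-form messages.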

6. Benchmarks, Evaluation, and Data Utilization

Evaluation uses vulnerability rate, rather than traditional accuracy, as its central metric. In pilot fine-tuning, the rate of security-relevant vulnerabilities in code generated by models trained on SecureCode v2.0 fell from 45% to under 5%.

The formal evaluation protocol is as follows:

  1. Fine-tune base LLMs on the 989-example training split.
  2. Tune hyperparameters with 122 validation cases.
  3. Quantify outputs’ vulnerability rate across the 104-example test set, using automated static analysis and expert review.
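
The metric in step 3 can be sketched as a simple proportion, assuming each generated sample receives a single boolean verdict from static analysis and expert review. The function name and the verdicts below are hypothetical and purely illustrative; they are not results from the paper.

```python
def vulnerability_rate(flags):
    """Fraction of generated samples flagged as vulnerable (0.0 if there are none)."""
    flags = list(flags)
    if not flags:
        return 0.0
    return sum(1 for flagged in flags if flagged) / len(flags)

# Hypothetical verdicts for a 104-example test set: 5 samples flagged.
verdicts = [True] * 5 + [False] * 99
rate = vulnerability_rate(verdicts)
print(f"{rate:.1%}")  # 4.8%
```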

Artifacts include a public dataset (HuggingFace: scthornton/securecode-v2), the validation compliance script, and benchmarking scripts. Licensing is provided under CC BY-NC-SA 4.0.

SecureCode v2.0 thus provides a uniquely incident-grounded, operationally prescriptive, and conversationally structured resource for advancing security-aware code generation models (Thornton, 20 Dec 2025).

References (1)
