Security Testing with GenAI
Important: This guidance provides examples of how GenAI can augment security testing workflows. It does not replace existing DoD or Department of War security testing guidance, policies, or requirements. All security testing must comply with applicable DoD directives (e.g., DoDI 8500.01, DoDI 8531.01), RMF processes, and organization-specific security requirements.
What is Security Testing?
Security testing validates that systems resist attacks, protect sensitive data, and follow secure coding practices. It identifies vulnerabilities before adversaries can exploit them. GenAI assists with static analysis, dynamic testing scenarios, and fuzzing—but all security findings require expert validation. GenAI supplements, never replaces, professional security assessment.
When Security Testing Occurs
Security testing happens continuously throughout the SDLC and at critical RMF gates:
Common triggers:
- Every code commit - Automated SAST scans identify vulnerabilities in new code
- Pull request review - Security-focused code review with GenAI-assisted analysis
- Pre-deployment gates - DAST and penetration testing before production release
- RMF Assessment & Authorization (A&A) - Comprehensive security testing for ATO
- After security patches - Regression tests ensure vulnerabilities stay fixed
- Continuous monitoring (Con Mon) - Ongoing vulnerability scanning per DoDI 8531.01
- Scheduled security audits - Periodic comprehensive security assessments
- Threat model updates - When new threats emerge or system architecture changes
GenAI helps generate security test scenarios and identify vulnerability patterns, but security experts must validate findings and assess exploitability.
Where GenAI Helps
- SAST vulnerability identification - Finding common vulnerability patterns in source code
- DAST scenario generation - Creating runtime security test cases
- Fuzzing seed creation - Generating diverse malicious inputs to find edge-case vulnerabilities
- Security regression tests - Ensuring fixed vulnerabilities don't reappear
- Mapping to standards - Linking findings to OWASP Top 10, CWE, NIST controls
Example Requirement (SAST)
REQ-SEC-401: "The authentication module shall use parameterized queries for all database operations and shall not construct SQL queries through string concatenation."
Example Prompt (SAST)
"Review this authentication function for security vulnerabilities. Check for: SQL injection, command injection, improper input validation, weak cryptography, hard-coded secrets, and insufficient logging. Reference OWASP Top 10 and CWE-89 (SQL Injection)."
Code being reviewed:
def authenticate_user(username, password):
query = f"SELECT * FROM users WHERE username='{username}' AND password='{password}'"
result = db.execute(query)
return result
Example Output (SAST)
GenAI-generated security findings:
CRITICAL: SQL Injection Vulnerability (CWE-89, OWASP A03:2021)
Location: authenticate_user(), line 2
Issue: Direct string interpolation in SQL query allows injection attacks
Evidence: query = f"SELECT * FROM users WHERE username='{username}'..."
Impact: Attacker can bypass authentication, extract data, or modify database
Recommendation: Use parameterized queries with prepared statements
Example exploit: username="' OR '1'='1" bypasses authentication
CRITICAL: Plaintext Password Handling
Location: authenticate_user(), line 2
Issue: Password compared in plaintext without hashing
Evidence: password='{password}' suggests plaintext comparison
Impact: Passwords exposed in logs, database breach reveals all passwords
Recommendation: Use bcrypt/argon2 for password hashing with unique salt
HIGH: Insufficient Logging
Location: authenticate_user()
Issue: No audit log for authentication attempts (success or failure)
Impact: Cannot detect brute force attacks or account compromises
Recommendation: Log authentication events with timestamp, username, source IP, outcome
Human security expert must:
- Validate exploitability in actual deployment environment
- Assess business risk and mission impact
- Prioritize fixes based on threat model
- Verify recommendations align with security architecture
Example Requirement (DAST)
REQ-SEC-305: "The API shall reject requests with missing or invalid authorization headers and return HTTP 401 with error code AUTH_REQUIRED."
Example Prompt (DAST)
"Generate DAST security test scenarios for requirement REQ-SEC-305. Include tests for: missing auth header, malformed auth header, expired token, tampered token signature, valid token with insufficient privileges. Use Python with requests library."
Example Output (DAST)
def test_missing_auth_header_returns_401():
response = requests.get("https://api.example.mil/protected", headers={})
assert response.status_code == 401
assert response.json()["error"] == "AUTH_REQUIRED"
def test_malformed_auth_header_returns_401():
response = requests.get(
"https://api.example.mil/protected",
headers={"Authorization": "NotAValidFormat"}
)
assert response.status_code == 401
def test_expired_token_returns_401():
expired_token = generate_expired_jwt()
response = requests.get(
"https://api.example.mil/protected",
headers={"Authorization": f"Bearer {expired_token}"}
)
assert response.status_code == 401
assert response.json()["error"] == "TOKEN_EXPIRED"
def test_tampered_token_signature_returns_401():
valid_token = generate_valid_jwt()
tampered_token = valid_token[:-10] + "TAMPERED00" # Corrupt signature
response = requests.get(
"https://api.example.mil/protected",
headers={"Authorization": f"Bearer {tampered_token}"}
)
assert response.status_code == 401
def test_insufficient_privileges_returns_403():
low_privilege_token = generate_jwt(role="read-only")
response = requests.delete(
"https://api.example.mil/protected/resource",
headers={"Authorization": f"Bearer {low_privilege_token}"}
)
assert response.status_code == 403
assert response.json()["error"] == "INSUFFICIENT_PRIVILEGES"
What GenAI-Generated Security Tests Often Miss
GenAI can identify common vulnerability patterns but misses critical security context:
- Threat model alignment - Whether vulnerabilities actually matter for your threat landscape
- Exploitability assessment - If vulnerabilities can be exploited in your environment
- Defense-in-depth - Compensating controls that mitigate risk
- Organizational security context - Classification levels, compartmentalization, network isolation
- False positive identification - Distinguishing real threats from unlikely scenarios
All GenAI security findings must be triaged and validated by security professionals before action.
Governance Checklist
Before accepting GenAI-generated security tests:
- [ ] All security test code reviewed and approved by security team
- [ ] Tests run only in approved test environments (never production)
- [ ] Findings triaged and prioritized by security professionals
- [ ] High/critical findings tracked in official vulnerability management system (e.g., eMASS)
- [ ] Test results logged for RMF artifacts (SAR, SSP, POA&M)
- [ ] Tests aligned with organizational threat model and risk assessment
- [ ] Findings mapped to NIST SP 800-53 controls where applicable
- [ ] Vulnerability remediation timelines comply with DoDI 8531.01 requirements
Integration with Security Testing Tools
GenAI-generated security tests work alongside professional security tools:
SAST Tools:
- SonarQube, Fortify, Checkmarx, Veracode (automated code scanning)
- Semgrep, CodeQL (custom rule-based analysis)
DAST Tools:
- OWASP ZAP, Burp Suite (dynamic runtime testing)
- Acunetix, Nessus (vulnerability scanning)
Secret Scanning:
- GitGuardian, TruffleHog, detect-secrets (credential detection)
Dependency Scanning:
- Dependabot, Snyk, WhiteSource (vulnerable library detection)
Fuzzing:
- AFL, LibFuzzer, Peach (automated input fuzzing)
GenAI helps generate test ideas and scenarios. Security tools provide systematic coverage, validation, and compliance reporting.
Security Testing Workflow
Note: This workflow section is unique to security testing because it involves multiple types of testing (SAST, DAST, penetration testing) that happen at different stages of the SDLC, each with distinct tools, techniques, and timing requirements.
Important: The workflows below are example patterns for integrating GenAI into security testing processes. Organizations must adapt these examples to align with their existing security testing procedures, tooling, and compliance requirements.
1. Continuous SAST (every commit): - GenAI analyzes new code for vulnerability patterns - Automated SAST tools scan for known vulnerabilities - Security team triages findings
2. DAST (pre-deployment gates): - GenAI generates runtime security test scenarios - DAST tools execute tests against running application - Security team validates exploitability
3. Security Regression (after fixes): - GenAI generates regression tests for fixed vulnerabilities - Tests added to CI/CD pipeline to prevent reintroduction - Tracked in security tracking system
4. Penetration Testing (scheduled): - Professional security assessors conduct manual testing - GenAI assists with test case generation - Results feed back into automated test suite
Vulnerability Management (DoDI 8531.01)
GenAI-generated security findings must be managed through the official DoD vulnerability management process defined in DoDI 8531.01. Security teams should:
- Track all findings in the official vulnerability management system
- Follow established processes for assessment, remediation, and reporting
- Document findings and remediation status for RMF compliance
Refer to DoDI 8531.01 for the complete vulnerability management framework and requirements.
DoD and NIST Security Testing Framework
This guidance aligns with DoD cybersecurity and vulnerability management policies:
Key DoD Instructions:
- DoDI 8531.01 - DoD Vulnerability Management: Establishes uniform vulnerability management programs and requires efficient vulnerability assessment techniques throughout the system lifecycle
- DoDI 8510.01 - Risk Management Framework (RMF): Governs security testing and assessment for ATO processes
- DoDI 8500.01 - Cybersecurity: Establishes DoD cybersecurity program and requirements
NIST Standards:
- NIST SP 800-53 - Security and Privacy Controls: Defines security testing controls (CA-8, RA-5, SI-2)
- NIST SP 800-115 - Technical Guide to Information Security Testing and Assessment
GenAI-augmented security testing supports these frameworks by accelerating test generation and vulnerability identification, while maintaining human oversight required for DoD compliance.
References
- U.S. Department of Defense, "DoD Vulnerability Management," DoDI 8531.01, Sept. 2020. [Online]. Available: https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/853101p.pdf
- U.S. Department of Defense, "Risk Management Framework (RMF) for DoD Information Technology (IT)," DoDI 8510.01, July 2023. [Online]. Available: https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/851001p.pdf
- U.S. Department of Defense, "Cybersecurity," DoDI 8500.01, Mar. 2024. [Online]. Available: https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/850001p.PDF
- National Institute of Standards and Technology, "Security and Privacy Controls for Information Systems and Organizations," NIST SP 800-53 Rev. 5, Sept. 2020. [Online]. Available: https://doi.org/10.6028/NIST.SP.800-53r5