Overview
The exploitability validation pipeline filters false positives out of security findings by verifying, across six stages, that each finding is real, reachable, and exploitable.
Why Validation?
Static analysis tools produce many findings, but not all are exploitable:
- Hallucinated findings: File doesn’t exist, code doesn’t match scanner output
- Unreachable code: Dead code, test-only functions
- Protected paths: Effective sanitization, impossible preconditions
- Binary constraints: Mitigations that block exploitation
Validation prevents wasted effort on false positives.
The 6-Stage Pipeline
┌─────────────────────────────────────────────────────┐
│ Stage 0: Inventory │
│ Build ground truth checklist of all functions │
└─────────────┬───────────────────────────────────────┘
│ checklist.json
▼
┌─────────────────────────────────────────────────────┐
│ Stage A: One-Shot Analysis │
│ Quick exploitability check + PoC attempt │
└─────────────┬───────────────────────────────────────┘
│ findings.json (status: pending/not_disproven)
▼
┌─────────────────────────────────────────────────────┐
│ Stage B: Process │
│ Systematic analysis with attack trees │
└─────────────┬───────────────────────────────────────┘
│ 5 working documents
▼
┌─────────────────────────────────────────────────────┐
│ Stage C: Sanity Check │
│ Validate against actual source code │
└─────────────┬───────────────────────────────────────┘
│ sanity_check added to findings
▼
┌─────────────────────────────────────────────────────┐
│ Stage D: Ruling │
│ Filter based on practical exploitation criteria │
└─────────────┬───────────────────────────────────────┘
│ ruling & confirmed findings
▼
┌────┴────┐
│ │
▼ ▼
Memory Web/Injection
Corruption (Done)
│
▼
┌─────────────────────────────────────────────────────┐
│ Stage E: Feasibility │
│ Binary constraint analysis for memory corruption │
└─────────────┬───────────────────────────────────────┘
│ final_status & feasibility
▼
validation-report.md
Stage 0: Inventory
Purpose: Build a complete checklist of all code to be analyzed.
Output
checklist.json - Complete function inventory:
{
"generated_at": "2026-03-04T12:00:00Z",
"target_path": "/path/to/code",
"total_files": 42,
"total_functions": 256,
"files": [
{
"path": "src/parser.c",
"functions": [
{
"name": "parse_header",
"line_start": 120,
"line_end": 145,
"checked": false
}
]
}
]
}
Execution
from packages.exploitability_validation.checklist_builder import build_checklist
checklist = build_checklist(
target_path="/path/to/code",
workdir=".out/validation/",
exclude_patterns=["*_test.*", "test_*"]
)
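As an illustration of how the inventory can serve as a coverage checklist, the sketch below lists every function that is still unchecked. It assumes only the checklist.json structure shown above. The helper name `unchecked_functions` and the second sample function entry are hypothetical and not part of RAPTOR's API.

```python
import json

def unchecked_functions(checklist: dict) -> list[str]:
    """Return 'path:name' for every function whose 'checked' flag is false."""
    pending = []
    for file_entry in checklist.get("files", []):
        for func in file_entry.get("functions", []):
            if not func.get("checked", False):
                pending.append(f"{file_entry['path']}:{func['name']}")
    return pending

# Illustrative data in the shape of checklist.json (values are made up).
sample = json.loads("""
{
  "files": [
    {"path": "src/parser.c",
     "functions": [
       {"name": "parse_header", "line_start": 120, "line_end": 145, "checked": false},
       {"name": "read_header", "line_start": 90, "line_end": 118, "checked": true}
     ]}
  ]
}
""")

print(unchecked_functions(sample))
```

An empty result would indicate that every inventoried function has been marked as checked.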
Stage A: One-Shot Analysis
Purpose: Quick exploitability assessment with PoC attempts.
Gates Applied
- GATE-1 [ASSUME-EXPLOIT]: Assume findings are exploitable until proven otherwise
- GATE-4 [NO-HEDGING]: No “maybe” or “could be” - verify all claims
- GATE-6 [PROOF]: Provide concrete proof and vulnerable code
Output
findings.json - Initial exploitability assessment:
{
"stage": "A",
"timestamp": "2026-03-04T12:30:00Z",
"findings": [
{
"id": "FINDING-0001",
"file": "src/parser.c",
"line": 134,
"function": "parse_header",
"vuln_type": "buffer_overflow",
"status": "not_disproven",
"message": "Unbounded strcpy into fixed buffer",
"proof": "strcpy(buf, header);",
"poc_attempted": true,
"poc_result": "crash with SIGSEGV"
}
]
}
Status Values
poc_success - PoC successfully demonstrated vulnerability
not_disproven - Cannot rule out, needs deeper analysis (Stage B)
disproven - Proven safe, no further analysis needed
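The status values above determine which findings need deeper analysis. A minimal sketch of that routing, assuming findings shaped like the findings.json sample; `needs_stage_b` and `group_by_status` are hypothetical helper names, not pipeline functions.

```python
def needs_stage_b(finding: dict) -> bool:
    """Only findings that could not be ruled out go on to Stage B."""
    return finding.get("status") == "not_disproven"

def group_by_status(findings: list[dict]) -> dict[str, list[str]]:
    """Group finding IDs under the three Stage A status values."""
    groups = {"poc_success": [], "not_disproven": [], "disproven": []}
    for finding in findings:
        # Unknown statuses get their own bucket instead of being dropped.
        groups.setdefault(finding.get("status", "unknown"), []).append(finding["id"])
    return groups

findings = [
    {"id": "FINDING-0001", "status": "not_disproven"},
    {"id": "FINDING-0002", "status": "disproven"},
    {"id": "FINDING-0003", "status": "poc_success"},
]

print(group_by_status(findings))
print([f["id"] for f in findings if needs_stage_b(f)])
```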
Stage B: Process
Purpose: Systematic analysis for “not_disproven” findings using attack trees and knowledge graphs.
Gates Applied
ALL gates (1-6):
- GATE-1: Assume exploitable
- GATE-2: Strictly follow instructions
- GATE-3: Update checklist, collect evidence
- GATE-4: No hedging
- GATE-5: Full code coverage
- GATE-6: Provide proof
Working Documents
Stage B creates 5 specialized documents:
1. attack-tree.json
Knowledge graph of attack paths:
{
"root": "Exploit buffer overflow in parse_header",
"updated_at": "2026-03-04T13:00:00Z",
"nodes": [
{
"id": "node-001",
"type": "goal",
"description": "Control instruction pointer",
"children": ["node-002", "node-003"],
"status": "testing"
},
{
"id": "node-002",
"type": "method",
"description": "Overwrite return address on stack",
"prerequisites": ["Stack overflow possible", "No stack canary"],
"status": "confirmed"
}
]
}
2. hypotheses.json
Testable predictions:
[
{
"id": "hyp-001",
"hypothesis": "Input length controls overflow distance",
"status": "confirmed",
"evidence": [
"Input of 100 bytes overwrites RBP",
"Input of 104 bytes overwrites return address"
],
"tested_at": "2026-03-04T13:15:00Z"
},
{
"id": "hyp-002",
"hypothesis": "Stack canary blocks exploitation",
"status": "disproven",
"evidence": ["Binary compiled without -fstack-protector"],
"tested_at": "2026-03-04T13:20:00Z"
}
]
3. disproven.json
Failed approaches:
[
{
"approach": "ROP chain via libc gadgets",
"why_failed": "ASLR randomizes libc base, no info leak available",
"attempted_at": "2026-03-04T13:30:00Z",
"learnings": "Need info leak primitive before ROP"
}
]
4. attack-paths.json
Attempted exploitation paths with PROXIMITY scoring:
[
{
"path_id": "path-001",
"description": "Direct return address overwrite",
"steps": [
"1. Send 104-byte input",
"2. Overwrite return address with shellcode location",
"3. Return from function to shellcode"
],
"proximity": 8,
"blockers": ["DEP prevents shellcode execution"],
"status": "blocked"
},
{
"path_id": "path-002",
"description": "ROP chain to mprotect()",
"steps": [
"1. Leak stack address",
"2. Build ROP chain calling mprotect()",
"3. Make stack executable",
"4. Jump to shellcode on stack"
],
"proximity": 5,
"blockers": ["No info leak primitive found"],
"status": "investigating"
}
]
PROXIMITY Scale:
10 - Working exploit
8-9 - Very close, minor obstacles
6-7 - Feasible path, some blockers
4-5 - Significant obstacles
1-3 - Far from exploitation
0 - Not viable
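The scale above can be expressed as a small lookup. This is a sketch for illustration only; `proximity_label` is a hypothetical helper, and the labels are copied from the scale.

```python
def proximity_label(score: int) -> str:
    """Map a PROXIMITY score (0-10) to its description from the scale."""
    if not 0 <= score <= 10:
        raise ValueError(f"PROXIMITY must be between 0 and 10, got {score}")
    if score == 10:
        return "Working exploit"
    if score >= 8:
        return "Very close, minor obstacles"
    if score >= 6:
        return "Feasible path, some blockers"
    if score >= 4:
        return "Significant obstacles"
    if score >= 1:
        return "Far from exploitation"
    return "Not viable"

# The two sample paths above carry scores 8 and 5.
print(proximity_label(8))
print(proximity_label(5))
```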
5. attack-surface.json
Sources, sinks, and trust boundaries:
{
"sources": [
{
"type": "user_input",
"location": "src/parser.c:100",
"function": "read_header",
"description": "HTTP header from socket",
"controllable": true
}
],
"sinks": [
{
"type": "memory_operation",
"location": "src/parser.c:134",
"function": "parse_header",
"operation": "strcpy",
"dangerous": true
}
],
"trust_boundaries": [
{
"location": "src/parser.c:105",
"type": "validation",
"description": "Header length check",
"effective": false,
"reason": "Check uses signed comparison, negative values bypass"
}
]
}
Stage C: Sanity Check
Purpose: Verify findings against actual source code.
Gates Applied
- GATE-3 [CHECKLIST]: Update checklist with verification
- GATE-5 [FULL-COVERAGE]: Check all code, no sampling
- GATE-6 [PROOF]: Show actual code verbatim
Verification Checks
- File exists at stated path
- Code matches VERBATIM at stated line (not paraphrased)
- Source→sink flow is real (not hypothetical)
- Code is reachable (function is actually called)
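The first two checks (file exists, code matches verbatim) are mechanical and can be sketched as follows. This is not the pipeline's implementation: `verify_code_at_line` is a hypothetical helper, and ignoring leading and trailing whitespace when comparing is an assumption. Flow and reachability checks need real program analysis and are not covered here.

```python
import tempfile
from pathlib import Path

def verify_code_at_line(root: str, rel_path: str, line: int, expected: str) -> dict:
    """Check that rel_path exists under root and that the 1-indexed line matches."""
    path = Path(root) / rel_path
    result = {"file_exists": path.is_file(), "code_matches": False, "code_verbatim": None}
    if not result["file_exists"]:
        return result
    lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
    if 1 <= line <= len(lines):
        actual = lines[line - 1]
        result["code_verbatim"] = actual
        # Assumption: surrounding whitespace is ignored; everything else must match exactly.
        result["code_matches"] = actual.strip() == expected.strip()
    return result

# Demo against a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    src = Path(root) / "src"
    src.mkdir()
    (src / "parser.c").write_text("int n;\n    strcpy(buf, header);\n", encoding="utf-8")
    ok = verify_code_at_line(root, "src/parser.c", 2, "strcpy(buf, header);")
    stale = verify_code_at_line(root, "src/parser.c", 1, "strcpy(buf, header);")
    missing = verify_code_at_line(root, "src/gone.c", 2, "strcpy(buf, header);")

print(ok)
print(stale)
print(missing)
```

A stale line number (code changed since the scan) shows up as `code_matches: False` with the actual line preserved in `code_verbatim`, which makes the mismatch easy to inspect.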
Output
findings.json with sanity_check field added:
{
"id": "FINDING-0001",
"file": "src/parser.c",
"line": 134,
"sanity_check": {
"passed": true,
"file_exists": true,
"code_matches": true,
"code_verbatim": " strcpy(buf, header);",
"flow_real": true,
"reachable": true,
"verified_at": "2026-03-04T14:00:00Z"
}
}
Stage D: Ruling
Purpose: Make final exploitability determination based on all evidence.
Gates Applied
- GATE-3 [CHECKLIST]: Document ruling decisions
- GATE-5 [FULL-COVERAGE]: Rule on all findings
- GATE-6 [PROOF]: Justify ruling with evidence
Ruling Criteria
Findings are Ruled Out if any of the following holds:
- Failed sanity check
- Requires impossible preconditions
- Protected by effective mitigations
- Attack paths have PROXIMITY ≤ 2
Findings are Confirmed if all of the following hold:
- Passed sanity check
- Realistic exploitation path exists
- No effective protections
- Attack paths have PROXIMITY ≥ 6
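The criteria above can be sketched as a decision function. Several things here are assumptions rather than documented behavior: the function name and parameters are hypothetical, any single ruled-out condition is treated as sufficient, "realistic exploitation path" is approximated by the best PROXIMITY score, and scores of 3 to 5 (which the criteria do not cover) are reported as "Undetermined".

```python
def rule_finding(
    sanity_passed: bool,
    best_proximity: int,
    impossible_preconditions: bool = False,
    effective_mitigation: bool = False,
) -> str:
    """Apply the Stage D criteria to one finding (illustrative sketch only)."""
    if (
        not sanity_passed
        or impossible_preconditions
        or effective_mitigation
        or best_proximity <= 2
    ):
        return "Ruled Out"
    if best_proximity >= 6:
        return "Confirmed"
    # Assumption: the documented criteria leave PROXIMITY 3-5 unspecified.
    return "Undetermined"

print(rule_finding(sanity_passed=True, best_proximity=8))
print(rule_finding(sanity_passed=False, best_proximity=9))
print(rule_finding(sanity_passed=True, best_proximity=4))
```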
Output
findings.json with ruling field:
{
"id": "FINDING-0001",
"ruling": {
"status": "Confirmed",
"reason": "Passed sanity check, direct exploitation path with proximity 8",
"attack_path": "path-001",
"prerequisites": [],
"ruled_at": "2026-03-04T14:30:00Z"
}
}
Status Values
Confirmed - Exploitable; memory corruption findings proceed to Stage E
Ruled Out - Not exploitable, stop here
Stage E: Feasibility
Purpose: Binary constraint analysis for memory corruption vulnerabilities.
Scope: Stage E only applies to memory corruption types (buffer overflow, format string, UAF, etc.). Web/injection vulnerabilities stop at Stage D.
Memory Corruption Types
Stage E applies to:
buffer_overflow
heap_overflow
stack_overflow
format_string
use_after_free
double_free
integer_overflow
out_of_bounds_read
out_of_bounds_write
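A membership test over the list above is enough to decide whether a finding is in scope for Stage E. The constant and function names below are hypothetical; only the type names come from the list.

```python
# Vulnerability types taken from the list above.
MEMORY_CORRUPTION_TYPES = frozenset({
    "buffer_overflow",
    "heap_overflow",
    "stack_overflow",
    "format_string",
    "use_after_free",
    "double_free",
    "integer_overflow",
    "out_of_bounds_read",
    "out_of_bounds_write",
})

def requires_stage_e(vuln_type: str) -> bool:
    """True when the vulnerability type is in scope for feasibility analysis."""
    return vuln_type in MEMORY_CORRUPTION_TYPES

print(requires_stage_e("format_string"))
print(requires_stage_e("sql_injection"))
```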
Binary Analysis
Integrates with packages/exploit_feasibility for:
- Protection detection: ASLR, DEP, RELRO, stack canaries
- Constraint analysis: Bad bytes, null terminators
- Gadget availability: ROP gadgets, syscall availability
- Verdict: Likely / Difficult / Unlikely
Execution
from packages.exploit_feasibility import analyze_binary
result = analyze_binary(
binary_path="/path/to/binary",
vuln_type="buffer_overflow"
)
print(f"Verdict: {result['verdict']}")
print(f"Blockers: {result['blockers']}")
print(f"Suggestions: {result['suggestions']}")
Output
findings.json with feasibility and final_status:
{
"id": "FINDING-0001",
"feasibility": {
"status": "analyzed",
"binary_path": "/path/to/binary",
"verdict": "Difficult",
"chain_breaks": [
"ASLR randomizes code base",
"DEP prevents shellcode execution"
],
"what_would_help": [
"Info leak to defeat ASLR",
"ROP chain for code reuse"
]
},
"final_status": "Confirmed (constrained)"
}
Final Status Mapping
| Ruling Status | Feasibility Verdict | Final Status |
|---|---|---|
| Confirmed | Likely | Exploitable |
| Confirmed | Difficult | Confirmed (constrained) |
| Confirmed | Unlikely | Confirmed (blocked) |
| Confirmed | N/A (web vuln) | Confirmed |
| Ruled Out | - | Ruled Out |
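The mapping table translates directly into a lookup. This is a sketch, not the pipeline's code: `final_status` is a hypothetical name, and a verdict of `None` stands in for the "N/A" row.

```python
from typing import Optional

# Feasibility verdict -> final status, for findings ruled Confirmed.
_CONFIRMED_BY_VERDICT = {
    "Likely": "Exploitable",
    "Difficult": "Confirmed (constrained)",
    "Unlikely": "Confirmed (blocked)",
}

def final_status(ruling: str, verdict: Optional[str] = None) -> str:
    """Combine a Stage D ruling with an optional Stage E verdict."""
    if ruling == "Ruled Out":
        return "Ruled Out"
    if ruling != "Confirmed":
        raise ValueError(f"unknown ruling: {ruling!r}")
    if verdict is None:
        # No feasibility analysis was run (for example, a web vulnerability).
        return "Confirmed"
    try:
        return _CONFIRMED_BY_VERDICT[verdict]
    except KeyError:
        raise ValueError(f"unknown verdict: {verdict!r}") from None

print(final_status("Confirmed", "Difficult"))
print(final_status("Confirmed"))
print(final_status("Ruled Out"))
```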
CLI Usage
Full Pipeline
Run complete validation from scratch:
python3 -m packages.exploitability_validation \
--target /path/to/code \
--vuln-type buffer_overflow
With Pre-existing Findings
Validate findings from scanner output (skips Stages 0 and A):
python3 -m packages.exploitability_validation \
--target /path/to/code \
--findings scan_results.sarif
With Binary for Stage E
python3 -m packages.exploitability_validation \
--target /path/to/code \
--findings findings.json \
--binary /path/to/compiled/binary
Skip Stage E
python3 -m packages.exploitability_validation \
--target /path/to/code \
--skip-feasibility
Custom Working Directory
python3 -m packages.exploitability_validation \
--target /path/to/code \
--workdir /custom/output/path
Python API
Orchestrator
from packages.exploitability_validation import ValidationOrchestrator, PipelineConfig
config = PipelineConfig(
target_path="/path/to/code",
workdir=".out/validation-20260304/",
vuln_type="command_injection",
binary_path=None,
findings_file=None,
skip_feasibility=False
)
orchestrator = ValidationOrchestrator(config)
result = orchestrator.run()
print(f"Completed at: {result.state.completed_at}")
for stage, stage_result in result.state.stage_results.items():
    print(f"{stage.name}: {stage_result.status}")
Convenience Function
from packages.exploitability_validation import run_validation
result = run_validation(
target_path="/path/to/code",
vuln_type="sql_injection",
findings_file="scanner_output.sarif"
)
The validation pipeline automatically converts SARIF input into its internal findings format:
from packages.exploitability_validation import PipelineConfig
# Supported: SARIF 2.0 and 2.1.0
# From tools: Semgrep, CodeQL, others
config = PipelineConfig(
target_path="/path/to/code",
findings_file="semgrep_results.sarif" # Auto-detected format
)
SARIF Conversion
- Rule ID normalization: engine.semgrep.rules.crypto.weak-hash → weak_hash
- CWE mapping: CWE-89 → sql_injection
- Deduplication: By file:line:vuln_type
- Logical locations: Extracts function names
- Severity mapping: SARIF levels → internal severity
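Two of the conversion steps above, rule ID normalization and deduplication, can be sketched as follows. The normalization rule used here (keep the last dotted segment and replace hyphens with underscores) is an assumption inferred from the single example given. Both function names are hypothetical, and the second sample rule ID is made up for the demo.

```python
def normalize_rule_id(rule_id: str) -> str:
    """Assumed rule: last dotted segment, hyphens replaced by underscores."""
    return rule_id.rsplit(".", 1)[-1].replace("-", "_")

def deduplicate(findings: list[dict]) -> list[dict]:
    """Keep the first finding for each (file, line, vuln_type) key."""
    seen = set()
    unique = []
    for finding in findings:
        key = (finding["file"], finding["line"], finding["vuln_type"])
        if key not in seen:
            seen.add(key)
            unique.append(finding)
    return unique

raw = [
    {"file": "src/auth.py", "line": 10,
     "vuln_type": normalize_rule_id("engine.semgrep.rules.crypto.weak-hash")},
    # Hypothetical second rule that normalizes to the same type at the same location.
    {"file": "src/auth.py", "line": 10,
     "vuln_type": normalize_rule_id("other.pack.weak-hash")},
    {"file": "src/auth.py", "line": 42, "vuln_type": "sql_injection"},
]

print(deduplicate(raw))
```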
Validation Report
Final output: validation-report.md
# Exploitability Validation Report
## Summary
- Target: /path/to/code
- Vulnerability Type: buffer_overflow
- Started: 2026-03-04 12:00:00
- Completed: 2026-03-04 14:45:00
## Stage Results
- Stage 0 (Inventory): [OK] (12.3s)
- Stage A (One-Shot): [OK] (45.7s)
- Stage B (Process): [OK] (123.4s)
- Stage C (Sanity): [OK] (23.1s)
- Stage D (Ruling): [OK] (8.9s)
- Stage E (Feasibility): [OK] (15.2s)
## Findings Summary
- Total: 15
- Exploitable: 2
- Confirmed (constrained): 3
- Confirmed (blocked): 1
- Ruled Out: 9
## Confirmed Findings
### FINDING-0001: buffer_overflow in src/parser.c:134
- Function: parse_header
- Final Status: Exploitable
- Feasibility: Likely
- Chain Breaks: None
### FINDING-0003: format_string in src/logger.c:89
- Function: log_message
- Final Status: Confirmed (constrained)
- Feasibility: Difficult
- Chain Breaks: RELRO blocks GOT overwrite, PIE randomizes addresses
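The Findings Summary section of the report is a tally over each finding's final_status. A minimal sketch of producing those lines, assuming findings carry a `final_status` field as in the Stage E output; `summary_lines` is a hypothetical helper and the sample data is illustrative.

```python
from collections import Counter

# Order matches the sample report above.
REPORT_ORDER = (
    "Exploitable",
    "Confirmed (constrained)",
    "Confirmed (blocked)",
    "Ruled Out",
)

def summary_lines(findings: list[dict]) -> list[str]:
    """Build the bullet lines of the Findings Summary section."""
    counts = Counter(f["final_status"] for f in findings)
    lines = [f"- Total: {len(findings)}"]
    for status in REPORT_ORDER:
        lines.append(f"- {status}: {counts.get(status, 0)}")
    return lines

findings = [
    {"id": "FINDING-0001", "final_status": "Exploitable"},
    {"id": "FINDING-0002", "final_status": "Ruled Out"},
    {"id": "FINDING-0003", "final_status": "Confirmed (constrained)"},
    {"id": "FINDING-0004", "final_status": "Ruled Out"},
]

print("\n".join(summary_lines(findings)))
```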
Output Style Guide
Per RAPTOR’s style conventions:
Human-Readable Status
- ✅ Exploitable (not EXPLOITABLE)
- ✅ Confirmed (not CONFIRMED)
- ✅ Ruled Out (not RULED_OUT)
- ✅ Proven / Disproven (not PROVEN / DISPROVEN)
No Colored Indicators
- ❌ Don’t use: 🔴/🟢 (perspective-dependent)
- ✅ Use: Plain text or a heading such as ### Exploitable (7 findings)
- ✅ Other emojis OK: ⚠️, ✓, etc.
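One way to follow the human-readable status convention is to convert machine-style identifiers at display time. The helper below is a hypothetical sketch, not part of RAPTOR, and is meant only for simple identifiers such as RULED_OUT, not for composite labels that contain parentheses.

```python
def display_status(raw: str) -> str:
    """Turn an identifier like 'RULED_OUT' into 'Ruled Out' for display."""
    return raw.replace("_", " ").title()

for raw in ("EXPLOITABLE", "CONFIRMED", "RULED_OUT", "DISPROVEN"):
    print(display_status(raw))
```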
Best Practices
Start with SARIF input: Feed scanner output directly to validation to avoid manual finding transcription. The pipeline auto-converts and deduplicates.
Stage B is intensive: For large codebases with many “not_disproven” findings, Stage B can take hours. Consider filtering to high-severity findings first.
Stage E requires binary: If no compiled binary is available, Stage E is skipped. Memory corruption findings will be marked Confirmed without feasibility analysis.
Troubleshooting
Stage A produces all “not_disproven”
This is normal for complex vulnerabilities. Stage B will analyze them systematically.
Stage C sanity checks fail
Common causes:
- Scanner output has stale file paths
- Code changed since scanning
- Scanner hallucinated the finding
Fix: Re-run scanner on current codebase.
Stage E skipped unexpectedly
Check:
- Binary path is correct: --binary /path/to/binary
- Binary is executable: chmod +x /path/to/binary
- Vulnerability type is memory corruption
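The three checks above can be run as a quick preflight before starting the pipeline. This is a sketch under stated assumptions: `stage_e_skip_reasons` is a hypothetical helper, not a RAPTOR function, and the type list is copied from the Stage E section.

```python
import os
from typing import Optional

MEMORY_CORRUPTION_TYPES = frozenset({
    "buffer_overflow", "heap_overflow", "stack_overflow", "format_string",
    "use_after_free", "double_free", "integer_overflow",
    "out_of_bounds_read", "out_of_bounds_write",
})

def stage_e_skip_reasons(binary_path: Optional[str], vuln_type: str) -> list[str]:
    """Return the reasons Stage E would likely be skipped (empty list if none)."""
    reasons = []
    if not binary_path:
        reasons.append("no binary path given")
    elif not os.path.isfile(binary_path):
        reasons.append(f"binary not found: {binary_path}")
    elif not os.access(binary_path, os.X_OK):
        reasons.append(f"binary is not executable: {binary_path}")
    if vuln_type not in MEMORY_CORRUPTION_TYPES:
        reasons.append(f"{vuln_type} is not a memory corruption type")
    return reasons

print(stage_e_skip_reasons(None, "sql_injection"))
```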
Integration Examples
From Semgrep
# 1. Run Semgrep
python3 packages/static-analysis/scanner.py \
--repo /path/to/code \
--policy_groups all
# 2. Validate findings
python3 -m packages.exploitability_validation \
--target /path/to/code \
--findings out/scan_*/combined.sarif
From CodeQL
# 1. Run CodeQL
python3 raptor_codeql.py \
--repo /path/to/code \
--scan-only
# 2. Validate findings
python3 -m packages.exploitability_validation \
--target /path/to/code \
--findings out/codeql_*/java_results.sarif \
--binary /path/to/binary.jar
From Autonomous Mode
Validation runs automatically in /agentic:
/agentic /path/to/code
# Automatically runs:
# 1. Static analysis (Semgrep/CodeQL)
# 2. Exploitability validation (this pipeline)
# 3. LLM analysis
# 4. Exploit generation