Overview
RAPTOR’s static analysis engine combines local security rules with Semgrep’s community packs for comprehensive code scanning. The scanner executes rules in parallel for improved performance and supports policy-based rule selection.
Architecture
The scanner is located at packages/static-analysis/scanner.py and orchestrates:
- Parallel rule execution with configurable worker pools
- Policy group selection for targeted scanning
- SARIF output format for standardized reporting
- Automatic deduplication across multiple rule sources
- Repository validation with safe git cloning
Policy Groups
Available Groups
RAPTOR organizes rules into policy groups that map to both local rules and Semgrep registry packs:
| Group | Local Rules | Registry Pack | Focus |
|---|---|---|---|
| crypto | Custom cryptography rules | category/crypto | Weak algorithms, key management |
| secrets | Secret detection patterns | p/secrets | API keys, credentials, tokens |
| injection | Injection vulnerability rules | p/command-injection | Command, SQL, LDAP injection |
| auth | Authentication patterns | p/jwt | JWT issues, session handling |
| ssrf | SSRF detection | p/ssrf | Server-side request forgery |
| deserialisation | Unsafe deserialization | p/insecure-deserialization | Pickle, YAML, JSON issues |
| logging | Logging security | p/logging | Log injection, sensitive data |
| filesystem | Path traversal | p/path-traversal | Directory traversal |
| flows | Dataflow analysis | p/default | Taint tracking |
| sinks | Dangerous sinks | p/xss | XSS, dangerous functions |
| all | All groups | All packs | Comprehensive scan |
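The group-to-pack mapping in the table can be sketched as a small lookup. This is an illustration only: the names POLICY_GROUP_PACKS and resolve_packs are hypothetical and are not taken from scanner.py, which may store and resolve groups differently.

```python
# Hypothetical mapping transcribed from the table above; the real
# scanner may represent policy groups in a different way.
POLICY_GROUP_PACKS = {
    "crypto": "category/crypto",
    "secrets": "p/secrets",
    "injection": "p/command-injection",
    "auth": "p/jwt",
    "ssrf": "p/ssrf",
    "deserialisation": "p/insecure-deserialization",
    "logging": "p/logging",
    "filesystem": "p/path-traversal",
    "flows": "p/default",
    "sinks": "p/xss",
}


def resolve_packs(groups):
    """Return registry packs for the requested groups, in order, without repeats."""
    if "all" in groups:
        # "all" expands to every known group.
        groups = list(POLICY_GROUP_PACKS)
    packs = []
    for group in groups:
        if group not in POLICY_GROUP_PACKS:
            raise ValueError(f"unknown policy group: {group}")
        pack = POLICY_GROUP_PACKS[group]
        if pack not in packs:
            packs.append(pack)
    return packs
```

Rejecting unknown group names early, rather than silently skipping them, keeps a typo from turning into a scan that quietly covers less than intended.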
Baseline Packs
These packs are always included regardless of policy group selection:
CLI Usage
Basic Scan
Scan a repository with default crypto rules:
Git Repository Clone
Scan a remote repository (clones automatically):
Multiple Policy Groups
Combine multiple policy groups:
Comprehensive Scan
Run all available policy groups:
Sequential Mode
Disable parallel scanning (useful for debugging):
Preserve Working Directory
Keep temporary clone directory for inspection:
Parallel Execution
Worker Pool Configuration
The scanner uses a configurable thread pool:
Performance Benefits
Parallel execution provides significant speedup:
- 4 workers: 3-4x faster than sequential
- Per-rule timeout: 120 seconds (configurable)
- Total timeout: 900 seconds (15 minutes)
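The worker pool and the two timeouts listed above can be sketched with Python's standard thread pool. This is a minimal sketch, not the scanner's actual code: run_rule and run_all are hypothetical names, and each command is assumed to be one scanner invocation passed as an argument list.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_WORKERS = 4      # worker count from the figures above
RULE_TIMEOUT = 120   # seconds allowed per rule source
TOTAL_TIMEOUT = 900  # seconds allowed for the whole scan


def run_rule(command, timeout=RULE_TIMEOUT):
    """Run one command; report a timeout in the result instead of raising."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
        return {"command": command, "returncode": proc.returncode, "timed_out": False}
    except subprocess.TimeoutExpired:
        return {"command": command, "returncode": None, "timed_out": True}


def run_all(commands, workers=MAX_WORKERS, total_timeout=TOTAL_TIMEOUT):
    """Run commands concurrently and collect results as they finish."""
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_rule, command) for command in commands]
        # as_completed raises TimeoutError if the overall budget is exceeded.
        for future in as_completed(futures, timeout=total_timeout):
            results.append(future.result())
    return results
```

Threads are a reasonable fit here because each worker spends its time waiting on a child process, so the interpreter lock is not a bottleneck.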
SARIF Output Format
Output Structure
Each scan produces multiple SARIF files:
SARIF Schema
RAPTOR validates all SARIF output against the official schema:
Merged Output
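As a rough sketch of what merging and deduplicating SARIF logs can involve, the example below combines several logs into a single run and keeps the first copy of each finding. The identity key (rule ID, file URI, start line) is an assumption chosen for illustration, and finding_key and merge_sarif are hypothetical names; the scanner's own deduplication criteria may differ.

```python
def finding_key(result):
    """Identity of a finding: rule ID, file URI, start line (an assumed choice)."""
    locations = result.get("locations") or [{}]
    physical = locations[0].get("physicalLocation", {})
    uri = physical.get("artifactLocation", {}).get("uri")
    line = physical.get("region", {}).get("startLine")
    return (result.get("ruleId"), uri, line)


def merge_sarif(logs):
    """Merge parsed SARIF logs into one run, dropping repeated findings."""
    seen = set()
    merged_results = []
    for log in logs:
        for run in log.get("runs", []):
            for result in run.get("results") or []:
                key = finding_key(result)
                if key in seen:
                    continue
                seen.add(key)
                merged_results.append(result)
    return {
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": "merged"}}, "results": merged_results}],
    }
```

A key that includes the rule ID only removes exact repeats of the same rule. Two different rules flagging the same line would both survive, which is usually the safer default.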
The scanner automatically merges and deduplicates findings:
Scan Metrics
Generated Metrics
Every scan produces comprehensive metrics:
Accessing Metrics
Repository Validation
URL Validation
Only trusted repository patterns are allowed:
Safe Git Clone
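A minimal sketch of both safeguards, URL allowlisting and a restricted clone, might look like the following. Everything specific here is an assumption made for illustration: the allowlist (HTTPS URLs on github.com and gitlab.com), the 300-second clone timeout, and the function names are not taken from the scanner. The environment variables GIT_TERMINAL_PROMPT and GIT_CONFIG_NOSYSTEM are standard git settings.

```python
import os
import re
import subprocess

# Assumed allowlist: HTTPS owner/repo URLs on github.com or gitlab.com.
TRUSTED_REPO = re.compile(
    r"https://(?:github\.com|gitlab\.com)/[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+?(?:\.git)?/?"
)


def is_trusted_repo_url(url):
    """True only if the whole string matches the allowlist pattern."""
    return TRUSTED_REPO.fullmatch(url) is not None


def build_clone_command(url, dest):
    """Return (argv, env) for a shallow clone with a stripped-down environment."""
    if not is_trusted_repo_url(url):
        raise ValueError(f"untrusted repository URL: {url!r}")
    env = {
        "PATH": os.environ.get("PATH", ""),
        "GIT_TERMINAL_PROMPT": "0",  # never block on a credential prompt
        "GIT_CONFIG_NOSYSTEM": "1",  # ignore system-wide git configuration
    }
    # "--" stops option parsing, so the URL can never be read as a flag.
    argv = ["git", "clone", "--depth", "1", "--", url, dest]
    return argv, env


def clone_repo(url, dest, timeout=300):
    """Clone the repository, raising on failure or timeout."""
    argv, env = build_clone_command(url, dest)
    subprocess.run(argv, env=env, check=True, timeout=timeout, capture_output=True)
```

Passing an argument list instead of a shell string means the URL is never interpreted by a shell, which removes one class of injection independently of the allowlist.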
Cloning uses a restricted environment and timeouts:
Configuration Examples
Custom Rule Directory
Add your own Semgrep rules:
Environment Configuration
Integration with RAPTOR Pipeline
Automatic Invocation
Static analysis runs automatically in /agentic mode:
Phase Integration
The scanner is Phase 1 of the autonomous pipeline:
- Static Analysis (scanner.py) → SARIF findings
- Exploitability Validation → Confirmed vulnerabilities
- LLM Analysis → Root cause analysis
- Exploit Generation → Proof-of-concept code
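The hand-off from Phase 1 to the later phases amounts to reading findings out of a SARIF log. The sketch below shows one way a downstream step could do that using only the standard SARIF 2.1.0 fields; iter_findings is a hypothetical helper, not part of RAPTOR, and the path passed to it is whatever file the scan produced.

```python
import json


def iter_findings(sarif_path):
    """Yield one flat dict per SARIF result in the given file."""
    with open(sarif_path, encoding="utf-8") as handle:
        log = json.load(handle)
    for run in log.get("runs", []):
        for result in run.get("results") or []:
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            yield {
                "rule_id": result.get("ruleId"),
                # Simplification: fall back to "warning" when no level is set.
                "level": result.get("level", "warning"),
                "uri": physical.get("artifactLocation", {}).get("uri"),
                "line": physical.get("region", {}).get("startLine"),
                "message": result.get("message", {}).get("text", ""),
            }
```

Flattening each result into a plain dictionary keeps later stages independent of the nested SARIF layout.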
Output Consumption
SARIF output feeds downstream tools:
Troubleshooting
Empty SARIF Output
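When a SARIF file comes back empty, a quick first check is whether the tool recorded any errors or simply found nothing. The sketch below inspects the standard SARIF 2.1.0 fields for that; diagnose_sarif is a hypothetical helper, and whether a given run populates toolExecutionNotifications depends on the tool and its version.

```python
import json


def diagnose_sarif(sarif_path):
    """Return a short explanation of what a SARIF log contains."""
    with open(sarif_path, encoding="utf-8") as handle:
        log = json.load(handle)
    runs = log.get("runs", [])
    if not runs:
        return "no runs recorded"
    total = sum(len(run.get("results") or []) for run in runs)
    if total:
        return f"{total} findings present"
    # No findings: look for error-level notifications left by the tool.
    errors = []
    for run in runs:
        for invocation in run.get("invocations", []):
            for note in invocation.get("toolExecutionNotifications", []):
                if note.get("level") == "error":
                    errors.append(note.get("message", {}).get("text", ""))
    if errors:
        return "tool errors: " + "; ".join(errors)
    return "scan completed with zero findings"
```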
If a scan produces no results:
Timeout Issues
Increase timeouts for large codebases:
Validation Failures
If SARIF validation fails:
Best Practices
Parallel vs Sequential: Use --sequential only for debugging. Parallel mode is 3-4x faster with no loss of accuracy.
See Also
- CodeQL Analysis - Deep semantic analysis
- Exploitability Validation - Verify findings are exploitable
- Binary Fuzzing - Dynamic testing with AFL++