ARE — Autonomous Remediation Engine
The Autonomous Remediation Engine (ARE) takes a finding, reads the source code, writes a fix, builds it, verifies it against 7 scanners in 3 passes, and creates a Change Request if all checks pass. It uses claude-opus-4-8 for code analysis and patch generation.
What the ARE does and does not do
Does:
- Read actual source code (not just metadata)
- Generate working code patches (unified diff format)
- Build and compile the patched code
- Run 3 scanner passes to verify the fix is effective and introduces no regressions
- Create a Change Request for CCB approval
- Route dual PRs: one to the compliance vault, one to the customer's staging branch
Does not:
- Deploy to production (requires CCB approval and human review)
- Fix all finding types (structural refactors and PS/PE/AT controls require human work)
- Guarantee fixes for proprietary or obfuscated codebases
- Work without write access to the repository
The 11-Step Pipeline
These steps are derived from the orchestrator source. Every step is logged to remediation_job_steps in the database.
| Step | Name | Description |
|---|---|---|
| 1 | fetch_source | Downloads the repository tarball at the base branch commit SHA |
| 2 | detect_build_units | Detects all build systems in the repo (workspace-aware, recursive) |
| 3 | resolve_image_digests | Resolves container image tags to SHA-256 digests from registries |
| 4 | stage1_scan | Runs all 7 scanners against base source to establish finding baseline |
| 5 | claude_analysis | Sends findings + affected file contents to claude-opus-4-8 for analysis and patch generation |
| 6 | apply_fix | Applies the generated unified diff to produce the patched source tree |
| 7 | build_verification | Compiles/builds the patched code; self-heals up to 3 times on build failure |
| 8 | stage2_scan | Runs scanners against patched source to verify the target finding is gone |
| 9 | stage3_scan | Final scanner pass to confirm no regressions introduced by the fix |
| 10 | rampart_gate_check | Runs RAMPART gates against the sandbox branch to verify gate compliance |
| 11 | create_change_request | Creates a CCB Change Request; moves job to awaiting_approval status |
Step 7 (build verification) can repeat as self_heal_1, self_heal_2, self_heal_3 if the initial build fails. The AI is given the build error and generates a corrected patch. After 3 failed attempts, the job fails rather than proceeding with an unverified patch.
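The self-heal loop described above can be sketched as follows. This is a minimal illustration, not the orchestrator's code: `verify_build`, `BuildResult`, and the two callables are hypothetical names, and the sketch assumes one initial build followed by up to three self-heal attempts.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

MAX_SELF_HEAL_ATTEMPTS = 3  # self_heal_1 .. self_heal_3


@dataclass
class BuildResult:
    ok: bool
    log: str = ""


class BuildVerificationFailed(Exception):
    """Raised when no buildable patch was produced; the job must fail."""


def verify_build(
    patch: str,
    build: Callable[[str], BuildResult],
    regenerate_patch: Callable[[str, str], str],
) -> Tuple[str, List[str]]:
    """Return the patch that built, plus the step names that would be logged.

    `build` and `regenerate_patch` are hypothetical stand-ins for the real
    build runner and the model call that receives the build error.
    """
    steps = ["build_verification"]
    result = build(patch)
    attempt = 0
    while not result.ok:
        if attempt == MAX_SELF_HEAL_ATTEMPTS:
            # Fail closed: never continue with a patch that does not build.
            raise BuildVerificationFailed(
                "build verification failed after 3 attempts"
            )
        attempt += 1
        steps.append(f"self_heal_{attempt}")
        patch = regenerate_patch(patch, result.log)
        result = build(patch)
    return patch, steps
```

Raising an exception instead of returning the last patch keeps the caller from accidentally carrying an unbuilt patch into the later scan stages.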
The 3 Verification Stages
The ARE applies the "fail-closed" principle: a fix that cannot be fully verified is not promoted to a Change Request.
| Stage | When | What is checked |
|---|---|---|
| Stage 1 | Before patch | Baseline scan — establishes what findings exist in the original source |
| Stage 2 | After patch | Verifies the target finding is absent from the patched source |
| Stage 3 | Final pass | Confirms no regressions — no new findings introduced by the fix |
If Stage 2 or Stage 3 fails, the job status is set to failed and a failure reason is recorded. No Change Request is created.
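The two post-patch checks can be expressed as set comparisons over normalized findings. The sketch below is an assumption about how that comparison might look: the `Finding` fields and the `verify_stages` function are hypothetical, and real matching would need a key that stays stable when the patch shifts line numbers.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Finding:
    # Hypothetical identity key for a normalized finding.
    scanner: str
    rule_id: str
    location: str


def verify_stages(
    baseline: Set[Finding],
    stage2: Set[Finding],
    stage3: Set[Finding],
    target: Finding,
) -> Optional[str]:
    """Return a failure reason, or None when both checks pass."""
    # Stage 2: the finding being remediated must be gone.
    if target in stage2:
        return "stage2 finding still present"
    # Stage 3: nothing may appear that was absent from the Stage 1 baseline.
    if stage3 - baseline:
        return "stage3 regression detected"
    return None
```

Returning the reason string rather than a boolean lets the caller record it directly as the job's failure_reason.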
The 7 Scanners
All scanners run in parallel within each stage. Scanner output is normalized to a common schema before comparison.
| # | Scanner | Type | NIST Controls |
|---|---|---|---|
| 1 | Grype | SCA / CVE | SI-2, RA-5 |
| 2 | Semgrep | Code security patterns | SI-10, IA-5, SC-28 |
| 3 | OSV-Scanner | Cross-ecosystem vulnerability | SA-12, SR-4 |
| 4 | Trufflehog | Secrets in source | IA-5, CM-6 |
| 5 | Native audit | Language-native (npm audit, pip-audit, govulncheck, cargo audit, bundler-audit) | SI-2, RA-5 |
| 6 | Checkov | IaC misconfiguration | CM-6, CM-7 |
| 7 | Trivy | Container / IaC vuln + config | SI-2, CM-6 |
A scanner that times out or is unavailable logs a warning and yields zero findings; it does not fail the job. Grype always runs and cannot be disabled.
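One way to get this behavior is a thread pool in which each scanner's failure or timeout is downgraded to a warning and an empty result. The sketch below is illustrative only: `run_stage`, the logger name, and the scanner callables are hypothetical, and the timeout is applied per wait rather than as a shared deadline.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Dict, List

log = logging.getLogger("are.scanners")  # hypothetical logger name


def run_stage(
    scanners: Dict[str, Callable[[str], List[dict]]],
    source_dir: str,
    timeout_s: float,
) -> Dict[str, List[dict]]:
    """Run every scanner in parallel; a failing scanner yields no findings."""
    results: Dict[str, List[dict]] = {}
    pool = ThreadPoolExecutor(max_workers=max(1, len(scanners)))
    try:
        futures = {
            name: pool.submit(fn, source_dir) for name, fn in scanners.items()
        }
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout_s)
            except FutureTimeout:
                log.warning("scanner %s timed out; recording zero findings", name)
                results[name] = []
            except Exception as exc:  # scanner unavailable or crashed
                log.warning("scanner %s failed (%s); recording zero findings", name, exc)
                results[name] = []
    finally:
        # Do not block on stragglers; requires Python 3.9+ for cancel_futures.
        pool.shutdown(wait=False, cancel_futures=True)
    return results
```

A thread that has already started cannot be interrupted, so a real implementation would more likely run each scanner as a subprocess that can be killed on timeout.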
Build System Support
The ARE auto-detects build systems using recursive manifest discovery:
| Language | Build System | Verification command |
|---|---|---|
| JavaScript / TypeScript | npm, yarn, pnpm | npm install && npm run build |
| Python | pip | pip install -r requirements.txt |
| Go | go | go build ./... |
| Rust | Cargo | cargo build |
| Ruby | Bundler | bundle install |
| Java | Maven, Gradle | mvn package / gradle build |
| C# | .NET | dotnet build |
Workspace-aware discovery handles monorepos — each build unit is identified and built independently.
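Recursive manifest discovery can be approximated with a directory walk that records one build unit per manifest found. This is a simplified sketch under stated assumptions: the `MANIFESTS` mapping and `SKIP_DIRS` set are illustrative rather than the ARE's actual configuration, and .NET project files are omitted because they are matched by extension rather than by a fixed filename.

```python
import os
from typing import List, Tuple

# Assumed manifest-to-build-system mapping; commands mirror the table above.
MANIFESTS = {
    "package.json": ("npm", "npm install && npm run build"),
    "requirements.txt": ("pip", "pip install -r requirements.txt"),
    "go.mod": ("go", "go build ./..."),
    "Cargo.toml": ("cargo", "cargo build"),
    "Gemfile": ("bundler", "bundle install"),
    "pom.xml": ("maven", "mvn package"),
    "build.gradle": ("gradle", "gradle build"),
}

# Assumed list of directories that never contain first-party build units.
SKIP_DIRS = {".git", "node_modules", "vendor", "target"}


def detect_build_units(root: str) -> List[Tuple[str, str, str]]:
    """Return (relative_dir, build_system, verification_command) per unit."""
    units = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in SKIP_DIRS)
        for name in sorted(filenames):
            if name in MANIFESTS:
                system, command = MANIFESTS[name]
                rel = os.path.relpath(dirpath, root)
                units.append((rel, system, command))
    return units
```

In a monorepo this yields one entry per package directory, so each unit can be built and verified on its own.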
Fix Strategy Types
The AI selects one of six fix strategies based on finding type:
| Strategy | When used |
|---|---|
| version_bump | Dependency with a fixed version available |
| code_fix | Code-level vulnerability (injection, hardcoded secret, etc.) |
| config_change | IaC misconfiguration |
| dependency_removal | Unused or unmaintained dependency |
| secret_rotation | Hardcoded credential (removes from source; rotation still required) |
| structural_refactor | Complex architectural issue (confidence threshold must be met) |
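As an illustration of the simplest strategy, a version_bump patch is just a unified diff over a manifest. The sketch below uses Python's standard `difflib` to produce one; the `version_bump_diff` helper, the `==` pin format, and the package versions in the usage note are hypothetical examples, not the ARE's patch generator.

```python
import difflib


def version_bump_diff(
    path: str, manifest_text: str, package: str, old: str, new: str
) -> str:
    """Build a unified diff that re-pins `package` from `old` to `new`.

    Assumes a requirements.txt-style `name==version` pin on its own line.
    """
    before = manifest_text.splitlines(keepends=True)
    after = [
        line.replace(f"{package}=={old}", f"{package}=={new}")
        for line in before
    ]
    return "".join(
        difflib.unified_diff(
            before, after, fromfile=f"a/{path}", tofile=f"b/{path}"
        )
    )
```

For example, `version_bump_diff("requirements.txt", text, "requests", "2.31.0", "2.32.2")` returns a diff whose changed lines are `-requests==2.31.0` and `+requests==2.32.2`, and an empty string when the pin is not present.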
Job Lifecycle
Job created (status: pending)
│
▼
status: scanning (Steps 1–4)
│
▼
status: analyzing (Step 5)
│
▼
status: patching (Steps 6–7)
│
▼
status: verifying (Steps 8–10)
│
▼
status: awaiting_approval (Step 11 — CCB Change Request created)
│
(CCB approves)
▼
Dual PR routing:
→ Compliance vault internal PR (evidence)
→ Customer staging branch PR (code)
Fail-Closed Behavior
If any step fails, the job moves to failed status with a failure_reason. Failure reasons are logged to both remediation_job_steps and CHRONICLE. No code is pushed to any branch when a job fails.
Common failure reasons:
| Reason | Cause |
|---|---|
| no write access to repository | GitHub App lacks write permissions |
| build verification failed after 3 attempts | The AI could not generate a buildable fix |
| stage2 finding still present | The fix did not actually remediate the target finding |
| stage3 regression detected | The fix introduced a new vulnerability |
| job exceeded 30-minute total timeout | The full pipeline exceeded the 30-minute budget |
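The lifecycle and the fail-closed rule can be modeled as a fixed step list in which any step failure ends the job. This is a minimal sketch, assuming each step maps to the status shown in the lifecycle diagram; `run_job`, `StepFailed`, and the `run_step` callable are hypothetical names.

```python
from typing import Callable, List, Optional, Tuple

# (step name, job status while that step runs), per the lifecycle diagram.
PIPELINE = [
    ("fetch_source", "scanning"),
    ("detect_build_units", "scanning"),
    ("resolve_image_digests", "scanning"),
    ("stage1_scan", "scanning"),
    ("claude_analysis", "analyzing"),
    ("apply_fix", "patching"),
    ("build_verification", "patching"),
    ("stage2_scan", "verifying"),
    ("stage3_scan", "verifying"),
    ("rampart_gate_check", "verifying"),
    ("create_change_request", "awaiting_approval"),
]


class StepFailed(Exception):
    """Carries the failure_reason for the step that failed."""


def run_job(
    run_step: Callable[[str], None],
) -> Tuple[str, Optional[str], List[str]]:
    """Return (final_status, failure_reason, status_history)."""
    history = ["pending"]
    for step, status in PIPELINE:
        if history[-1] != status:
            history.append(status)
        try:
            run_step(step)
        except StepFailed as exc:
            # Fail closed: stop here, so no later step can create a Change Request.
            history.append("failed")
            return "failed", str(exc), history
    return history[-1], None, history
```

Because the Change Request step is last in the list, it is unreachable unless every earlier step returned without raising.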
Change Control Board (CCB) integration
The ARE does not push code directly to any branch after verification. Instead, it creates a CCB Change Request and waits for approval. This is not a safety shortcut — it is the compliance-correct behavior under CM-3.
The Change Request created at Step 11 includes:
- The finding being remediated (scanner, CVE ID or rule ID, severity)
- The diff that will be applied
- Stage 1, 2, and 3 scanner results
- Build verification logs
- Risk assessment generated by the AI
- RAMPART gate check results for the sandbox branch
The CCB reviews these artifacts before approving. On approval, the ARE routes dual PRs.
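The artifact list above maps naturally onto a record type with a completeness check. The sketch below is an assumed shape, not the actual Change Request schema: the `ChangeRequest` class, its field names, and `missing_artifacts` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List

REQUIRED_STAGES = ("stage1", "stage2", "stage3")


@dataclass
class ChangeRequest:
    # Field names are illustrative; the real schema is not shown in this page.
    scanner: str
    finding_id: str  # CVE ID or rule ID
    severity: str
    diff: str
    stage_results: Dict[str, List[dict]] = field(default_factory=dict)
    build_logs: str = ""
    risk_assessment: str = ""
    rampart_gate_results: Dict[str, bool] = field(default_factory=dict)

    def missing_artifacts(self) -> List[str]:
        """List artifacts the CCB would need that are not yet attached."""
        missing = []
        if not self.diff.strip():
            missing.append("diff")
        # A stage with zero findings is still present as an empty list.
        missing += [s for s in REQUIRED_STAGES if s not in self.stage_results]
        if not self.build_logs:
            missing.append("build_logs")
        if not self.risk_assessment:
            missing.append("risk_assessment")
        if not self.rampart_gate_results:
            missing.append("rampart_gate_results")
        return missing
```

Checking that `missing_artifacts()` is empty before submission would keep an incomplete request from reaching reviewers.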
Dual-PR routing and compliance significance
Every approved remediation creates two pull requests:
| PR | Repository | Branch | Purpose |
|---|---|---|---|
| Vault PR | reaegis/compliance-vault | main | Evidence archival; Cosign-signed; permanent |
| Implementation PR | Customer repository | Base branch | Actual code fix for human review and merge |
Why two PRs?
The separation between compliance evidence and customer code is intentional. The vault PR preserves a signed, immutable record of the remediation — what was found, what was fixed, and when — regardless of what happens to the customer's repository. If the customer modifies the fix before merging, or does not merge it at all, the vault PR still documents that a working, verified fix was generated.
This separation satisfies CM-3(b) (which requires maintaining records of change requests and their disposition) independently of whether the change is applied.
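The routing rule can be sketched as a function that yields either no pull requests or exactly two. The repository and branch for the vault PR come from the table above; everything else (`PullRequestPlan`, `route_dual_prs`, and the parameter names) is a hypothetical illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PullRequestPlan:
    kind: str
    repository: str
    branch: str
    purpose: str


def route_dual_prs(
    customer_repo: str, base_branch: str, ccb_approved: bool
) -> List[PullRequestPlan]:
    """Plan both PRs for an approved remediation, or none at all."""
    if not ccb_approved:
        # Nothing is routed before the CCB approves the Change Request.
        return []
    return [
        PullRequestPlan(
            "vault", "reaegis/compliance-vault", "main", "evidence archival"
        ),
        PullRequestPlan(
            "implementation", customer_repo, base_branch, "code fix for review"
        ),
    ]
```

Planning the pair together makes it harder for one PR to be opened without the other, which matches the intent that evidence exists independently of what happens to the code PR.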
Finding types the ARE can and cannot fix
Can fix autonomously:
| Finding type | Example | ARE behavior |
|---|---|---|
| SCA CVE with fixed version | package@vulnerable-version → package@fixed-version | Version bump in manifest; rebuild |
| Hardcoded secret | API key in source file | Removes the secret from source; replaces it with a secrets manager reference |
| IaC misconfiguration | Terraform S3 bucket with public_read = true | Config change to public_read = false |
| Dependency with known vulnerability | requests<2.32.2 | Version pin or removal if unused |
| Insecure code pattern | SQL query built with string concatenation | Parameterized query rewrite |
Cannot fix autonomously (requires human work):
| Finding type | Why |
|---|---|
| PE/AT/PS control gaps | No source code to change; requires org process changes |
| Structural architectural findings | Risk of regression too high; confidence threshold not met |
| Rotation of exposed secrets | The ARE only removes the secret from source; rotating the credential itself is out of scope |
| Proprietary/obfuscated dependencies | No source available to analyze |
| STIG findings that require system-level changes | Cannot modify OS configuration |
Confidence threshold
The ARE uses a confidence score (0.0–1.0) to decide whether to attempt a structural_refactor fix strategy. For all other strategies, it attempts the fix regardless of confidence. For structural_refactor:
- Confidence ≥ 0.85: Attempt the fix
- Confidence < 0.85: Report insufficient_confidence and exit gracefully without creating a PR
This threshold prevents the ARE from making large, complex code changes when it cannot reliably predict the outcome.
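The gating rule is small enough to state directly in code. In this sketch the function name `should_attempt` is hypothetical; the 0.85 threshold and the strategy names come from the text above.

```python
STRUCTURAL_REFACTOR_THRESHOLD = 0.85


def should_attempt(strategy: str, confidence: float) -> bool:
    """Apply the confidence gate, which only affects structural_refactor."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if strategy != "structural_refactor":
        # All other strategies are attempted regardless of confidence.
        return True
    return confidence >= STRUCTURAL_REFACTOR_THRESHOLD
```

A `False` result corresponds to reporting insufficient_confidence and exiting without creating a PR.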
Prerequisites
- REAEGIS GitHub App installed on the repository with write access
- Repository has at least one build manifest (package.json, go.mod, requirements.txt, etc.)
- Finding is of a type that the ARE can remediate (SCA CVEs, code patterns, IaC misconfigs, secrets)
The ARE requires write access to push the sandbox branch and open pull requests. If the GitHub App installation does not have write access, the job fails immediately at Step 1 with "no write access to repository". This is intentional — the ARE does not work around missing permissions.
Monitoring ARE jobs
ARE jobs are visible in the Remediation tab of each finding. Each job shows:
- Current status (pending, scanning, analyzing, patching, verifying, awaiting_approval, completed, failed)
- Step-level logs for each of the 11 pipeline steps
- Scanner output from all 3 stages
- The generated diff (viewable inline before CCB approval)
- Links to the sandbox branch and Change Request
Failed jobs include a failure_reason that explains which step failed and why. The most common failure reasons and their remediation:
| Failure reason | Action |
|---|---|
| no write access to repository | Check GitHub App permissions; ensure Contents: Write is granted |
| build verification failed after 3 attempts | Check whether the repository builds locally; complex builds may need manual intervention |
| stage2 finding still present | The scanner did not confirm the fix; check scanner output and fix manually |
| stage3 regression detected | The fix introduced a new finding; review the diff and fix manually |
| job exceeded 30-minute total timeout | Large repositories or complex builds; contact support |
Related pages
- RAMPART Engine — How findings are created
- GitHub Integration — GitHub App permissions required for ARE
- CHRONICLE Engine — How ARE steps are audit-logged
- Change Control — CCB workflow and Change Request lifecycle