CLAUDE.md

Commands

# Tests (MUST pass before any deploy)
uv run pytest red-teaming/tests/ --ignore=red-teaming/tests/test_integration.py -v
# Integration tests (hits live staging infra)
INTEGRATION=1 TESTBOT_PRIVATE_KEY=prv-9a4915282b7f2c81fef54764 uv run pytest red-teaming/tests/test_integration.py -v

# Website build (MUST rebuild after editing scenario_template.html)
cd red-teaming && uv run build_scenario_ui.py --password mangrove
# Test site build
cd red-teaming && uv run test-site/build_test_site.py --password mangrove
cd red-teaming/test-site && fly deploy --app mangrove-test-site

# Hotpatch (workspace file changes to running bots — NO restart needed)
cd red-teaming/agent_proxy
uv run hotpatch.py --agent alexbot --file SOUL.md       # one file, one bot
uv run hotpatch.py --all --file AGENTS.md                # one file, all bots
uv run hotpatch.py --all --all-files --dry-run           # preview

# Deploy — CI auto-deploys proxy + bot image on push to main (deploy.yml)
# Manual bot redeploy via GitHub Actions: manual-bot-redeploy.yml
# Manual proxy deploy (if needed):
cd red-teaming/agent_proxy
fly deploy --app mangrove-test-proxy --config test-proxy/fly.toml   # staging
fly deploy --app mangrove-agent-proxy                                # prod (NEEDS PERMISSION)

# Agent management
uv run deploy_agents.py setup|deploy|status|teardown --agent NAME|--all
fly ssh console --app mangrove-NAME
fly logs --app mangrove-NAME

Architecture

GPT-5.4 agents on Discord via OpenClaw (count varies — agents can be created/deleted dynamically via proxy API). Each agent = 1 Fly.io app (mangrove-{name}, official ghcr.io/openclaw/openclaw:2026.3.12 base image with Mangrove additions, shared-cpu-2x/2GB, port 3000). Proxy = mangrove-agent-proxy (FastAPI, port 8080) — manages agents via SSH + Fly Machines API. Website = red-teaming/index.html (AES-256-GCM encrypted, pw: “mangrove”, hosted GitHub Pages). State in Firebase RTDB (unauthenticated REST API, https://red-teaming-betrayal-default-rtdb.firebaseio.com). Fly.io org: redteaming, region: ewr.

Data flows: Discord → OpenClaw → GPT-5.4 → Discord. Website → proxy API → Fly Machines API / SSH. Workspace snapshots: gateway hourly → Firebase. Daily logs: discord_daily_log.py cron 6AM → Firebase. Session contexts: gateway data_push.py snapshots sessions.json + active JSONL transcripts into RTDB (session_snapshots/{agent}/sessions) every 5 minutes, and the proxy reads that RTDB snapshot for the website session/activity viewer (gated to kwkaiser).
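
As a minimal sketch of how the session snapshot path above maps onto the RTDB REST API: a node is addressed by appending `.json` to its path under the database URL. The helper name `session_snapshot_url` is hypothetical and nothing here performs a request; it only builds the URL.

```python
from urllib.parse import quote

# Base URL as given in the Architecture notes.
RTDB_BASE = "https://red-teaming-betrayal-default-rtdb.firebaseio.com"


def session_snapshot_url(agent: str) -> str:
    """Return the REST URL for one agent's session snapshot node.

    Hypothetical helper: the path layout comes from the data-flow notes,
    and ".json" is the suffix the RTDB REST API expects on every node path.
    """
    # Percent-encode the agent name so an unusual name cannot change the path.
    return f"{RTDB_BASE}/session_snapshots/{quote(agent, safe='')}/sessions.json"
```

For example, `session_snapshot_url("alexbot")` ends in `/session_snapshots/alexbot/sessions.json`.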

Workspace files (auto-loaded by OpenClaw at session start): AGENTS.md, SOUL.md, IDENTITY.md, USER.md, TOOLS.md, HEARTBEAT.md, MEMORY.md. memory/ subdir for agent notes.

Key file: scenario_template.html → build_scenario_ui.py → index.html. Replaces %%SCENARIOS_JSON%%, %%CATEGORIES_JSON%%, %%FIREBASE_CONFIG_JSON%%, then AES-256-GCM encrypts between <!--%%ENCRYPTED_START%%--> / <!--%%ENCRYPTED_END%%--> markers.
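
The two build steps can be sketched roughly as follows. This is not the code in build_scenario_ui.py: the function names are hypothetical, the encryption step is passed in as a callable rather than implemented, and keeping the markers in the output is an assumption.

```python
import json
from typing import Callable

START = "<!--%%ENCRYPTED_START%%-->"
END = "<!--%%ENCRYPTED_END%%-->"


def render_template(template: str, values: dict[str, object]) -> str:
    """Replace each %%NAME%% placeholder with the JSON encoding of its value."""
    for name, value in values.items():
        template = template.replace(f"%%{name}%%", json.dumps(value))
    return template


def encrypt_between_markers(html: str, encrypt: Callable[[str], str]) -> str:
    """Apply `encrypt` to the text between the two markers.

    Assumption: the markers stay in the output so the page can locate the
    ciphertext. `encrypt` stands in for the AES-256-GCM step.
    """
    head, found_start, rest = html.partition(START)
    body, found_end, tail = rest.partition(END)
    if not found_start or not found_end:
        raise ValueError("encrypted-region markers not found")
    return head + START + encrypt(body) + END + tail
```

Passing the cipher in as a callable keeps the marker handling testable without a crypto dependency.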

🔴 EXPERIMENT IS LIVE (March 9–23, 2026)

Participants actively using bots. All deployed + proxy + Firebase provisioned.

Staging Env

TEST (staging)                          PROD (live experiment)
mangrove-test-proxy.fly.dev             mangrove-agent-proxy.fly.dev
mangrove-test-site.fly.dev              alex-loftus.com/red-teaming/
mangrove-testbot (Testland)             mangrove-{name} bots (Flatland+Spaceland)
Testland guild: 1479170960497316021     Flatland: 1477433806859276475, Spaceland: 1479164061533863949

Mandatory deployment flow: unit tests → deploy staging → integration tests → manual verify → deploy prod (with permission).
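
The ordering above can be read as a chain of gates, where each stage blocks everything after it. A minimal sketch, assuming each stage is a callable that returns True on success; `run_flow` and the stage callables are hypothetical and do not correspond to any script in the repo.

```python
from typing import Callable

# Stage names mirror the mandatory flow; the callables are supplied by the caller.
STAGES = (
    "unit tests",
    "deploy staging",
    "integration tests",
    "manual verify",
    "deploy prod",
)


def run_flow(steps: dict[str, Callable[[], bool]], prod_permission: bool) -> list[str]:
    """Run stages in order and return the names of those that succeeded.

    Stops at the first failing stage, and stops before "deploy prod" unless
    explicit permission was given.
    """
    completed: list[str] = []
    for name in STAGES:
        if name == "deploy prod" and not prod_permission:
            break  # passing tests never implies permission to touch prod
        if not steps[name]():
            break  # a failed stage blocks every later stage
        completed.append(name)
    return completed
```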

RULES:

  • NEVER touch live infrastructure (hotpatch, deploy, fly ssh, restart) without EXPLICIT permission. No exceptions.
  • ALWAYS git commit BEFORE changing other people’s stuff (hotpatch, deploy, file push). Skipping this has already caused participant customizations to be permanently lost.
  • NEVER full-redeploy all bots. If needed: ONE bot first, verify, then proceed.
  • NEVER change USER.md keys/PII on live bots — breaks claim system.
  • NEVER deploy without running tests first. Staging first, then prod.
  • NEVER weaken a test to make it pass. Fix the bug, not the test.
  • Editing local template files = fine. Pushing to live bots = needs permission.
  • If testing needs a Discord conversation, ask Alex to do it. Tell him what to say and which bots to @mention.
  • Questions get answers, not actions. IF I ASK YOU A QUESTION, RESPOND WITH AN ANSWER, NOT BY DOING SOMETHING.

Hotpatch vs Redeploy

  • Hotpatch (no restart): AGENTS.md, IDENTITY.md, SOUL.md, TOOLS.md, USER.md (USER.md gated behind --include-user). Takes effect next session reset.
  • Redeploy (restart): openclaw.json, Dockerfile, entrypoint changes. ~10-30s downtime.
  • Patchable: AGENTS.md, IDENTITY.md, SOUL.md, TOOLS.md. USER.md requires --include-user.
  • Agent-owned (never overwritten): MEMORY.md, HEARTBEAT.md, memory/*.md.
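
The split above amounts to a lookup by file name. A minimal sketch, assuming workspace-relative paths; `classify` and its return labels are hypothetical, and `entrypoint.sh` is an assumed file name for "entrypoint changes". hotpatch.py may organize this differently.

```python
from fnmatch import fnmatchcase

PATCHABLE = {"AGENTS.md", "IDENTITY.md", "SOUL.md", "TOOLS.md"}
GATED = {"USER.md"}  # hotpatchable only with --include-user
AGENT_OWNED = ("MEMORY.md", "HEARTBEAT.md", "memory/*.md")  # never overwritten
REDEPLOY = {"openclaw.json", "Dockerfile", "entrypoint.sh"}  # entrypoint.sh is assumed


def classify(path: str) -> str:
    """Return how a change to `path` reaches a running bot."""
    if path in PATCHABLE:
        return "hotpatch"
    if path in GATED:
        return "hotpatch-gated"
    # fnmatchcase's "*" also matches "/", so nested notes under memory/ count too.
    if any(fnmatchcase(path, pattern) for pattern in AGENT_OWNED):
        return "agent-owned"
    if path in REDEPLOY:
        return "redeploy"
    return "unknown"
```

Returning "unknown" for anything unlisted keeps an unexpected file from being pushed by default.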

Module Coupling Map

High-coupling hubs (changes here ripple widely):

  • config.py → imported by 9 modules. Guild IDs, Firebase URL, Fly org, file lists.
  • generate_workspaces.py (AGENTS list) → 3 direct importers (generate_agent_config, hotpatch, deploy_agents) + provision_agents reads its output file (agent_secrets.json).
  • firebase_client.py → proxy, provisioning, daily logs, backfill (4 modules). But workspace_snapshot.sh bypasses it with raw curl — hidden coupling.

Change impact:

Change                                   Affected systems
config.py constants                      9 modules (nearly everything)
Agent roster (generate_workspaces.py)    3 direct + 1 file dep (provision)
firebase_client.py                       4 modules
discord_daily_log.py                     1 (backfill_highlights)
fly_ssh.py                               2 (hotpatch, proxy)
scenario_template.html                   2 (build scripts; but hardcodes Firebase paths independently)
main.py (proxy API)                      1 (website frontend)
hotpatch.py                              0 (leaf node)
deploy_agents.py                         1 (sync_live_workspaces imports resolve_agents)

Tight coupling chains:

  1. Private keys: generate_workspaces → agent_secrets.json → provision_agents → agents.json + USER.md + Firebase. Break any link → claiming fails.
  2. Firebase paths: hardcoded independently in scenario_template.html (JS) and workspace_snapshot.sh (bash). discord_daily_log.py uses firebase_client (centralized). No shared schema — rename in one → silent mismatch. main.py writes both agents/ and workspace_snapshots/ paths via firebase_client.
  3. Workspace lifecycle: generate → local disk → SSH push → Fly /data/workspace → workspace_snapshot.sh → Firebase → website. 6 hops, no schema validation.

Well-isolated: website build pipeline, hotpatch.py (leaf node).

Key Gotchas

  • fly ssh console -C does NOT run through a shell. Pipes/redirects silently ignored (exit 0). Always wrap: fly ssh console -C "bash -c '...'". Use base64 encoding for non-trivial payloads.
  • openclaw doctor --fix strips unknown fields (including required name). Dynamic agents bypass entrypoint.sh.
  • init.cmd ≠ init.entrypoint in Fly Machines API. Use init.entrypoint to override container startup.
  • Firebase PUT overwrites entire node. Use PATCH for partial updates (push_to_firebase() uses PATCH).
  • renderSingleLog drops _preamble content. Plain-text summaries must bypass renderLogMarkdown.
  • Workspace viewer shows deploy-time snapshot, not live bot state.
  • Workspace file edits take effect at next session reset (/restart in Discord or 4AM auto-reset). Fly machine restart ≠ OpenClaw session reset.
  • mentionPatterns don’t work in preflight gate. Bare name strings cause false matches. Use dot-prefixed patterns only.
  • When fixing config on running Fly machine: update BOTH the file (SSH) AND the env var (Machines API).
  • If I say “fix this” about a bug, also write a test for it so the bug cannot recur.
  • data_push.py must stay permissive with OpenClaw JSONL. Strict schema validation dropped assistant turns when usage.cost became an object instead of a scalar.
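
For the fly ssh console -C gotcha above, the bash wrapping and base64 encoding can be combined as in this sketch. It only builds an argv list and runs nothing, so executing the result against a live app still falls under RULES. The function name `fly_ssh_argv` is hypothetical, and the sketch assumes `bash` and `base64` exist in the bot image.

```python
import base64
import shlex


def fly_ssh_argv(app: str, script: str) -> list[str]:
    """Build an argv for `fly ssh console -C` that runs `script` under bash.

    The script is base64-encoded so its quotes, pipes and redirects cannot be
    mangled on the way; the remote side decodes it and pipes it into bash.
    """
    payload = base64.b64encode(script.encode("utf-8")).decode("ascii")
    # The payload contains only base64 characters, so it needs no quoting of its own.
    remote = f"echo {payload} | base64 -d | bash"
    # shlex.quote wraps the pipeline in single quotes, giving: bash -c '...'
    return ["fly", "ssh", "console", "--app", app, "-C", f"bash -c {shlex.quote(remote)}"]
```

Passing the list to `subprocess.run` without `shell=True` avoids a second round of local shell quoting.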

Tests

  • Tests in red-teaming/tests/ (NOT red-teaming/agent_proxy/tests/)
  • Update tests when editing existing code
  • Fail fast: assertions before expensive operations
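
A minimal illustration of the fail-fast rule, assuming a roster of dicts with a `name` key. `check_agent_roster` and `build_workspace` are hypothetical stand-ins, not functions from the repo.

```python
from typing import Callable


def check_agent_roster(agents: list[dict], build_workspace: Callable[[dict], str]) -> list[str]:
    """Validate the roster cheaply, then run the expensive step per agent."""
    # Cheap assertions first: bad input fails before any expensive work starts.
    assert agents, "roster is empty"
    names = [agent["name"] for agent in agents]
    assert len(names) == len(set(names)), "duplicate agent names"
    # Expensive step last (stand-in for a build, a deploy, or a network call).
    return [build_workspace(agent) for agent in agents]
```

With a duplicate name in the roster, the assertion fires before `build_workspace` is called at all.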

Package Management

  • Python: uv — always uv run <script>.py
  • Jekyll: bundle
  • JS tests: npm