Alex Loftus

Added read-only session context endpoints to the proxy (list sessions, fetch full JSONL context).
Added gated “session contexts” tab for claimed agents (kwkaiser only) with session list + message viewer.
Added tests covering session tab gate and visibility.
Fixed proxy session parsing against live OpenClaw schema (message.role/message.content, sessionId/sessionFile) and made session file resolution robust.
Added kwkaiser read-only session viewer access on all-agents expanded panels (no claim required), including viewer_name passthrough for /activity, /sessions, and /sessions/{ref}.
Added website tests for all-agents session viewer helpers and rebuilt encrypted index.html.
Fixed Fly.io proxy CI deploy build failure by removing Dockerfile hard copies of gitignored agents.json and workspaces/, setting AGENTS_JSON_PATH=/data/agents.json, and creating /app/workspaces fallback for first-boot safety.
Extended GitHub deploy workflow to build/push the bot gateway image on main/master merges using flyctl deploy --build-only --app mangrove-alexbot, so bot container changes are published without automatic full-fleet restarts.
Updated CI deploy flow to use FLYIO_PROXY_TOKEN for the bot-image build job and added a new manual-only GitHub workflow, manual-bot-redeploy, that redeploys mangrove-karlbot from the latest gateway container build context.
Fixed bot-related GitHub workflows after CI failure: both bot image build and manual karlbot redeploy now use a generated minimal ci.fly.toml ([build].dockerfile) to bypass invalid parsing of gateway/fly.toml by current flyctl in Actions.
Hardened manual-bot-redeploy workflow so generated ci.fly.toml now includes bot runtime settings and the persistent volume mount (openclaw_data -> /data), ensuring redeploy keeps using the state volume.
Added bot image version stamping: gateway Dockerfile now writes /VERSION from build arg GIT_SHA (default unknown), and both CI bot build + manual karlbot redeploy workflows now pass --build-arg GIT_SHA=${GITHUB_SHA}.
Fixed manual-bot-redeploy after Fly volume mismatch: switched from fly deploy rollout to two-step flow (fly deploy --build-only --push --image-label ... then fly machine update targeting the existing /data-mounted machine with openclaw_data), preserving vol_r63x7mgww9djqgpr.
Pinned OpenClaw install in bot gateway Dockerfile to openclaw@2026.3.12 and changed manual karlbot redeploy to skip rebuilds entirely (it now updates the existing mounted machine to the prebuilt registry.fly.io/mangrove-karlbot:bot-gateway-latest image from CI bot build).
Fixed manual redeploy image selection: workflow now resolves IMAGE_REF from flyctl releases --json --app mangrove-karlbot (latest complete release ImageRef) with optional workflow_dispatch override input, instead of hardcoding a missing tag.
Created shared Fly app namespace mangrove-openclaw-common and built/pushed a common gateway image tag registry.fly.io/mangrove-openclaw-common:openclaw-common-defaults-20260313-215812 via flyctl deploy --build-only --push; no existing bot app/machine was redeployed or modified.
Updated gateway entrypoint safety semantics: baked /app/openclaw.json now installs only when /data/openclaw.json is missing, and stale workspace subdirectory cleanup was removed so existing /data/workspaces/* content is never deleted by image boot logic.
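The install-only-when-missing rule above can be sketched as follows. This is a minimal illustration in Python rather than the actual entrypoint.sh logic; the function name install_default_config is hypothetical.

```python
import shutil
from pathlib import Path


def install_default_config(baked: Path, live: Path) -> bool:
    """Copy the baked config into place only if no live config exists.

    Returns True when a copy happened, False when the live file was kept.
    """
    if live.exists():
        # Never overwrite state that already lives on the volume.
        return False
    live.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(baked, live)
    return True
```

In the gateway this would correspond to install_default_config(Path("/app/openclaw.json"), Path("/data/openclaw.json")); nothing under /data/workspaces is touched.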
Built/pushed a refreshed shared image after the entrypoint safety fix: registry.fly.io/mangrove-openclaw-common:openclaw-common-defaults-20260313-220334 (digest sha256:0a2908e2d5211a807f99875258a259c764132aa640026ad03ad94590c92e77be), still with zero machines on mangrove-openclaw-common.
Updated GitHub deploy.yml bot-image build job to publish to the shared image app namespace (--app mangrove-openclaw-common) instead of mangrove-alexbot.
Updated manual-bot-redeploy.yml default image resolution for karlbot to use the shared image tag registry.fly.io/mangrove-openclaw-common:bot-gateway-latest (still supports explicit image_ref override).
Added a new Python gateway push daemon skeleton at agent_proxy/gateway/data_push.py that resolves agent ID from Fly app config and emits hello from agent <agent_id> on a 30-minute loop (interval overridable via env vars).
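A rough sketch of the hello loop described above, not the contents of data_push.py. The environment variable name DATA_PUSH_INTERVAL_SECONDS and the function names are assumptions made for illustration; the entry only says the interval is overridable via env vars.

```python
import time
from typing import Callable, Mapping, Optional

DEFAULT_INTERVAL_SECONDS = 30 * 60  # 30-minute loop, per the entry above


def resolve_interval(env: Mapping[str, str]) -> float:
    """Read the loop interval from env, falling back to the default."""
    raw = env.get("DATA_PUSH_INTERVAL_SECONDS")  # hypothetical variable name
    if raw is None:
        return float(DEFAULT_INTERVAL_SECONDS)
    try:
        value = float(raw)
    except ValueError:
        return float(DEFAULT_INTERVAL_SECONDS)
    return value if value > 0 else float(DEFAULT_INTERVAL_SECONDS)


def run_loop(
    agent_id: str,
    interval: float,
    emit: Callable[[str], None] = print,
    sleep: Callable[[float], None] = time.sleep,
    max_iterations: Optional[int] = None,
) -> None:
    """Emit the hello line, then sleep; runs forever unless max_iterations is set."""
    count = 0
    while max_iterations is None or count < max_iterations:
        emit(f"hello from agent {agent_id}")
        count += 1
        sleep(interval)
```

Injecting emit and sleep keeps the loop testable without waiting 30 minutes; a real daemon would call run_loop(agent_id, resolve_interval(os.environ)).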
Wired data_push.py into gateway startup (entrypoint.sh) as a managed background daemon and updated gateway build plumbing (gateway/Dockerfile, deploy_agents.py build-context file copy list) to include the new script.
Updated .github/workflows/deploy.yml so the common bot image build job now also applies registry.fly.io/mangrove-openclaw-common:bot-gateway-latest to mangrove-karlbot as the final step (equivalent to manual-bot-redeploy machine-update flow), while keeping manual-bot-redeploy.yml unchanged for on-demand redeploys.
Added workflow-level GitHub Actions concurrency to .github/workflows/deploy.yml (cancel-in-progress: true) so newer pushes cancel older in-flight runs on the same branch.
Investigated the mangrove-karlbot Fly deployment failure with flyctl: identified two stray, unmounted 256MB machines created by the prior rollout path (843ed3c2474d68, 90800729ad0048) that were failing on the missing /data mount (cp ... /data/openclaw.json: No such file or directory).
Removed both failing unmounted machines from mangrove-karlbot; confirmed only the original mounted machine remains (3d8d5146b12938 with volume vol_r63x7mgww9djqgpr at /data).
Audited mount pattern across core Mangrove bot apps (alexbot, fredbot, bijanbot, barisbot, adityabot, eunjeongbot, jannikbot, woogbot, negevbot, giobot, charlesbot, jasminebot, corleone, tessio) and confirmed standard openclaw_data → /data volume attachment model.
Hardened karlbot rollout logic in .github/workflows/deploy.yml and .github/workflows/manual-bot-redeploy.yml: resolve openclaw_data volume first, target machine by matching mounted volume ID, fall back to volume attachment metadata if needed, and fail closed if no volume-backed machine exists (prevents volume-less redeploys).
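The selection order above (volume first, then mounted machine, then attachment metadata, else fail closed) might look like this in Python. The workflows themselves are shell; the dictionary shapes below (id, name, mounts, volume, attached_machine_id) are simplified assumptions, not the exact flyctl JSON output.

```python
from typing import Any, Dict, List


def pick_volume_backed_machine(
    machines: List[Dict[str, Any]],
    volumes: List[Dict[str, Any]],
    volume_name: str = "openclaw_data",
) -> str:
    """Return the ID of the machine that holds the named volume, or raise."""
    # Step 1: resolve the volume before looking at any machine.
    volume = next((v for v in volumes if v.get("name") == volume_name), None)
    if volume is None:
        raise RuntimeError(f"volume {volume_name!r} not found; refusing to redeploy")
    volume_id = volume["id"]

    # Step 2: prefer a machine whose mounts reference that volume ID.
    for machine in machines:
        for mount in machine.get("mounts", []):
            if mount.get("volume") == volume_id:
                return machine["id"]

    # Step 3: fall back to the volume's own attachment metadata.
    attached = volume.get("attached_machine_id")
    if attached:
        return attached

    # Step 4: fail closed so a volume-less machine is never updated.
    raise RuntimeError("no volume-backed machine found; refusing to redeploy")
```

Raising instead of picking any available machine is the point of the design: an error stops the workflow, whereas a wrong target would start the bot without its state.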
Updated both karlbot machine-update workflow paths to set --vm-memory 1024 (1GB) on redeploy.
Extended agent_proxy/gateway/data_push.py to best-effort parse latest OpenClaw session data on each loop: read sessions.json, select the highest-updatedAt session entry, parse each JSONL line with Pydantic (SessionJsonlRecord via TypeAdapter), and keep running while printing caught exceptions for index/file/record parse failures.
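A best-effort parse along these lines, sketched with the standard json module instead of the Pydantic SessionJsonlRecord/TypeAdapter used in data_push.py. The assumed shape of sessions.json (a mapping of keys to entries carrying a numeric updatedAt) is a guess for illustration only.

```python
import json
from typing import Any, Dict, List, Optional, Tuple


def latest_session(index: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Pick the entry with the highest updatedAt (assumed numeric)."""
    entries = [e for e in index.values() if isinstance(e, dict)]
    if not entries:
        return None
    return max(entries, key=lambda e: e.get("updatedAt", 0))


def parse_jsonl(text: str) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Parse JSONL line by line, collecting errors instead of raising."""
    records: List[Dict[str, Any]] = []
    errors: List[str] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are skipped, not treated as errors
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append(f"line {lineno}: {exc}")
            continue
        if not isinstance(record, dict):
            errors.append(f"line {lineno}: expected a JSON object")
            continue
        records.append(record)
    return records, errors
```

Returning the errors alongside the records lets the caller print them and keep looping, which matches the keep-running behaviour described in the entry.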
Switched karlbot rollout source from mutable tag to immutable refs in both deploy workflows: CI now builds mangrove-openclaw-common as bot-gateway-<sha12>-<run_id>-<attempt>, resolves the latest complete immutable bot-gateway-* image from Fly releases, and uses that exact image ref for mangrove-karlbot machine updates.
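For reference, the immutable tag shape above can be built and recognized as sketched below. It assumes sha12 means the first 12 characters of a lowercase hex commit SHA; the helper names are hypothetical.

```python
import re

# Matches bot-gateway-<sha12>-<run_id>-<attempt>, but not bot-gateway-latest.
IMMUTABLE_TAG = re.compile(r"^bot-gateway-[0-9a-f]{12}-\d+-\d+$")


def immutable_tag(git_sha: str, run_id: int, attempt: int) -> str:
    """Build the per-run tag from the commit SHA and the Actions run identifiers."""
    return f"bot-gateway-{git_sha[:12]}-{run_id}-{attempt}"


def image_ref(app: str, tag: str) -> str:
    """Full registry reference for a tag in a Fly app namespace."""
    return f"registry.fly.io/{app}:{tag}"


def is_immutable(tag: str) -> bool:
    """True only for tags in the per-run immutable format."""
    return IMMUTABLE_TAG.match(tag) is not None
```

Because the run ID and attempt number are part of the tag, a re-run of the same commit produces a new tag rather than overwriting an old one.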
Iterated on the immutable-tag resolver after a CI failure (flyctl releases --app mangrove-openclaw-common --json returns [] because the build-only app has no releases): the deploy workflow now uses the exact immutable image built in the same run, and manual redeploy now resolves the latest deploy run with a successful build via the GitHub Actions API (gh api) to obtain its bot-gateway-<sha12>-<run_id>-<attempt> tag. JSON parsing was hardened by replacing echo "$JSON" with printf '%s\n' "$JSON".
Added manifest-availability retry loops to karlbot machine updates in both CI deploy and manual redeploy workflows: on MANIFEST_UNKNOWN/failed to get manifest, retry up to 10 attempts with 60-second waits; fail fast for non-retryable update errors.
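The retry policy above, sketched in Python under the assumption that the update command is wrapped in a callable returning (succeeded, output). The real loops are shell steps in the workflows; update_with_retry and run_update are hypothetical names.

```python
import re
import time
from typing import Callable, Tuple

# Errors that indicate the image manifest is not yet visible to the registry.
RETRYABLE = re.compile(r"MANIFEST_UNKNOWN|failed to get manifest")


def update_with_retry(
    run_update: Callable[[], Tuple[bool, str]],
    max_attempts: int = 10,
    wait_seconds: float = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run the update until it succeeds; return the attempt number that worked."""
    for attempt in range(1, max_attempts + 1):
        ok, output = run_update()
        if ok:
            return attempt
        if not RETRYABLE.search(output):
            # Anything other than a missing manifest fails immediately.
            raise RuntimeError(f"non-retryable update error: {output}")
        if attempt < max_attempts:
            sleep(wait_seconds)
    raise RuntimeError(f"manifest still unavailable after {max_attempts} attempts")
```

Only the manifest errors are retried, so a bad machine ID or an auth failure surfaces on the first attempt instead of after ten minutes of waiting.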
Fixed workflow dependency regression from retry matcher: replaced rg -q with built-in grep -Eq in both karlbot update loops so retries work on default GitHub runners without requiring ripgrep installation.
Added RTDB propagation to agent_proxy/gateway/data_push.py: after parsing latest session data, daemon now PATCHes payload to session_snapshots/{agentId}/sessions (configurable root via DATA_PUSH_RTDB_ROOT) at FIREBASE_URL, with robust HTTP/JSON/file exception handling so failures log and loop continues.
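A minimal sketch of the PATCH step using only urllib, assuming the Firebase Realtime Database REST convention of appending .json to the path. Authentication is omitted, and the function names and the injectable opener are illustrative choices, not taken from data_push.py.

```python
import json
import urllib.request
from typing import Any, Callable, Dict
from urllib.parse import quote


def build_patch_request(
    firebase_url: str,
    agent_id: str,
    payload: Dict[str, Any],
    root: str = "session_snapshots",  # DATA_PUSH_RTDB_ROOT would override this
) -> urllib.request.Request:
    """Build a PATCH request for <root>/<agentId>/sessions on the RTDB REST API."""
    base = firebase_url.rstrip("/")
    path = f"{root.strip('/')}/{quote(agent_id, safe='')}/sessions.json"
    return urllib.request.Request(
        f"{base}/{path}",
        data=json.dumps(payload).encode("utf-8"),
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


def push(
    request: urllib.request.Request,
    opener: Callable[..., Any] = urllib.request.urlopen,
    timeout: float = 10.0,
) -> bool:
    """Send the request; log and return False on failure so the loop continues."""
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError) as exc:  # URLError and HTTPError subclass OSError
        print(f"data_push: RTDB push failed: {exc}")
        return False
```

PATCH updates only the keys present in the payload, so sessions already stored under the same path are left in place.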
Added DATA_PUSH_SESSIONS_DIR override for session source path (default /data/agents/main/sessions) and validated parser+push path locally with fixture sessions data; observed expected non-fatal RTDB connection error handling in loop.