Stale skillsSnapshot Cache After Install Path Change Broke Agent for 2 Days
Problem / Context
After a power outage, OpenClaw was reinstalled from /tmp, which wiped the /opt/homebrew install path. From then on the agent could not find its GitHub skill even though the skill existed on disk. OpenClaw caches skill paths per session in sessions.json, and the cached snapshot still pointed to /opt/homebrew. Running lossless-claw with 30-day sessions (instead of the 1-day default) kept the stale cache alive across restarts. On top of that, a routing policy change that unified TUI and Telegram sessions routed new sessions onto the old buggy chain.
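One way to confirm this kind of staleness is a read-only scan of the session store for skillsSnapshot values that still mention the old install root. The sketch below is an illustration, not OpenClaw tooling: it assumes sessions.json is plain JSON and that snapshots live under a key named skillsSnapshot, and it walks the structure without assuming anything else about the schema. The function name find_stale_snapshots, the session keys, and the sample paths are all made up for the example.

```python
import json
from typing import Any, Iterator


def find_stale_snapshots(node: Any, stale_root: str, path: str = "$") -> Iterator[str]:
    """Yield JSON paths of skillsSnapshot values that mention stale_root."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            # Serialize the snapshot so nested path strings are searched too.
            if key == "skillsSnapshot" and stale_root in json.dumps(value, ensure_ascii=False):
                yield child
            else:
                yield from find_stale_snapshots(value, stale_root, child)
    elif isinstance(node, list):
        for i, item in enumerate(node):
            yield from find_stale_snapshots(item, stale_root, f"{path}[{i}]")


# Hypothetical data; the real sessions.json layout may differ.
sample = {
    "agent:main": {"skillsSnapshot": {"skills": [{"path": "/opt/homebrew/lib/skills/github"}]}},
    "agent:other": {"skillsSnapshot": {"skills": [{"path": "/tmp/openclaw/skills/github"}]}},
}
print(list(find_stale_snapshots(sample, "/opt/homebrew")))
# prints ['$.agent:main.skillsSnapshot']
```

Because the scan never writes anything, it is safe to run against a live session store before deciding what to remove.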
Solution
Diagnosed the root cause as a cache invalidation problem, not a missing skill: sessions.json contained skillsSnapshot entries with hardcoded paths from the old install root. Fix: surgically removed the stale skillsSnapshot entries containing /opt/homebrew paths from sessions.json, restarted the gateway, and let OpenClaw rebuild fresh snapshots from the current runtime root. The failure was hard to isolate because it cascaded across three layers (install path change, long session TTL, session routing unification), and each layer looked correct in isolation. The agent itself made things worse: under pressure from the cascading failures, it exhibited personality drift, stopped using memory tools, went into firefighting mode, and started inventing workarounds instead of diagnosing the root cause.
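The manual removal described above could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not the procedure actually used: it assumes sessions.json is plain JSON with snapshots stored under a skillsSnapshot key, and the helper names drop_stale_snapshots and clean_sessions_file are hypothetical. It writes a .bak copy before modifying the file; the location of sessions.json is left as a parameter because it is not stated here.

```python
import json
import shutil
from pathlib import Path
from typing import Any


def drop_stale_snapshots(node: Any, stale_root: str) -> int:
    """Delete skillsSnapshot keys whose value mentions stale_root; return the count removed."""
    removed = 0
    if isinstance(node, dict):
        for key in list(node):  # copy the keys so deletion during iteration is safe
            value = node[key]
            if key == "skillsSnapshot" and stale_root in json.dumps(value, ensure_ascii=False):
                del node[key]
                removed += 1
            else:
                removed += drop_stale_snapshots(value, stale_root)
    elif isinstance(node, list):
        for item in node:
            removed += drop_stale_snapshots(item, stale_root)
    return removed


def clean_sessions_file(sessions_path: Path, stale_root: str) -> int:
    """Back up sessions_path, strip stale snapshots, and rewrite the file if anything changed."""
    data = json.loads(sessions_path.read_text(encoding="utf-8"))
    removed = drop_stale_snapshots(data, stale_root)
    if removed:
        backup = sessions_path.with_name(sessions_path.name + ".bak")
        shutil.copy2(sessions_path, backup)
        sessions_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return removed
```

A call such as clean_sessions_file(Path("sessions.json"), "/opt/homebrew") returns how many entries were dropped. Only the snapshot key is deleted, so the rest of each session record is left intact, and the gateway restart is still needed for fresh snapshots to be rebuilt.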
Result
Agent restored after removing 3 stale skillsSnapshot entries from sessions.json. Root cause took 2 days to isolate due to cascading failures masking the real issue. Key lesson: OpenClaw skill paths in sessions.json are NOT invalidated on install root change.