How it works
This page covers the technical details of how SecureClaw protects your data. Click any section to expand it.
Every time SecureClaw starts, it computes the BLAKE3 hash of its own binary. That hash gets sent to two completely independent sources for verification:
Both have to confirm the hash matches a registered release. If either rejects it, the binary won't decrypt the secrets vault. Compromising any single system can't silently serve a tampered binary.
The vault decryption key is derived from a server-held fragment that only gets released after successful dual verification. No attestation, no fragment, no vault. Trying to remove or bypass the attestation code won't help.
The key insight: each source holds a fragment of the vault key. The binary combines both fragments to derive the actual decryption key. Even if an attacker compromises one source, they only get half the key.
TCATs are SecureClaw's answer to a simple question: how do you stop a prompt injection from making your AI do something dangerous?
Every tool call goes through an isolated policy engine that never sees the conversation. It can't be influenced by injected instructions because it doesn't know they exist. If the engine says no, the tool doesn't run. Period.
The overhead is basically zero compared to an LLM call:
| Operation | Latency |
|---|---|
| Static policy match | <1 μs |
| TCAT creation | <50 μs |
| TCAT verification | <50 μs |
| Full pipeline (worst case) | <200 μs |
| Single LLM call (for comparison) | 2-10 s |
Every tool call passes through a layered interceptor before it can execute. These layers run in order, and the first one to reject stops execution:
rm -rf /, sudo, pipe-to-shellexecution.allowlist in your configThe interceptor is the last line of defense. Even if a TCAT is issued, the interceptor still runs. Even if you're in dangerous mode, the hard blocklist still applies.
All API keys and credentials are stored in an age-encrypted vault file
(secrets.age). The vault uses age with scrypt-derived keys,
meaning your password is stretched through scrypt before being used as the encryption key.
At the maximum security level, the vault key includes a server-held fragment that's only available after attestation. The actual decryption key is:
composite = HMAC-SHA256(password, fragment)
This means the vault is literally undecryptable without passing attestation first.
After decryption, the key material is wiped from memory using Go's clear()
builtin for best-effort memory erasure.
SecureClaw uses multiple layers to defend against prompt injection:
Even if an injection reaches the LLM and convinces it to call a dangerous tool, the TCAT engine evaluates the call independently and the tool interceptor requires your confirmation. The LLM can be tricked, but the cryptographic layers can't.
SecureClaw is built to be as secure as we can make it, but it's not magic. Here's what falls outside our control:
If you store passwords, keys, or secrets in plain text files inside the
workspace, the agent can read them. SecureClaw's sensitive path blocklist
catches common locations (~/.ssh, ~/.aws,
.env), but it can't detect secrets hiding in random files
you created yourself. Use the vault for secrets, not text files.
The vault uses scrypt key derivation, which makes brute-forcing expensive.
But if your vault password is password123, that's on you. Pick
a strong password. SecureClaw can't protect a vault with a weak key.
Don't run SecureClaw as root. Chrome's sandbox is disabled
(--no-sandbox) because it conflicts with some container setups.
If you're running as root, a Chrome exploit has unrestricted access. Run it
as a normal user, ideally in a container or VM.
The confirm dialog exists for a reason. If the agent asks to run
curl ... | bash and you click "yes", SecureClaw did its job
by asking. The hard blocklist catches the obvious ones, but it can't block
every possible dangerous command. Read what you're approving.
If your machine is already compromised (keylogger, memory dumper, rootkit), no application-level security can save you. SecureClaw wipes keys from memory after use, but a sufficiently privileged attacker can intercept them before they're wiped. Secure your machine first.
Your conversations are sent to whatever LLM provider you configure (Anthropic, OpenAI, etc). SecureClaw encrypts your secrets locally and never sends them in prompts, but the conversation content itself goes to the provider's API. Review your provider's data retention and privacy policies.
TCATs block prompt injection from influencing tool calls. But the LLM can still be manipulated into saying misleading things in chat. If a tool result contains "tell the user their build succeeded" when it actually failed, the LLM might repeat that. Always verify important claims yourself.
If you set execution.mode: dangerous in your config, the tool
interceptor skips confirmation for everything except the hard blocklist.
This is a development convenience that trades safety for speed. Don't use
it in production. The hard blocklist still applies, but it's a narrow net.
No. SecureClaw is a ground-up reimplementation in Go. While it serves a similar purpose as a self-hosted AI assistant, every component (the gateway, agent loop, tool system, secrets vault, attestation layer) was designed and written from scratch with security as the primary constraint.
We strongly advise against it. Custom forks will fail binary attestation because their hash won't match any registered release. Without attestation, the secrets vault can't be decrypted and the assistant won't start. This is by design. It ensures you're always running verified, untampered code.
At the maximum security level (the default), you can't. The vault decryption key includes a server-held fragment that's only released after dual verification. Without it, the vault is undecryptable. Lower security tiers exist for development use, but require explicit acknowledgment of the risks during onboarding.
Yes. Verification requires internet access to contact both the Cloudflare Worker and the website API. This is intentional. Offline attestation would allow replay attacks with previously captured responses.
TCATs are HMAC-SHA256 tokens issued by an isolated policy engine for every tool call. The engine evaluates tool name, arguments, and behavioral heuristics without seeing the conversation, making it immune to prompt injection. Tool handlers refuse to execute without a valid TCAT.
Multiple layers. Tool results are wrapped with injection mitigation markers. A pre-flight scanner checks LLM responses against Ed25519-signed pattern databases before tool calls are extracted. The TCAT policy engine evaluates authorization independently of conversation context. Even if an injection reaches the LLM, it can't bypass the cryptographic authorization layer.