security model

how Mentat protects what it can't afford to lose.

The threat model is honest about what Mentat solves and what stays on the operator. This page walks through the gates between an LLM that can write code and a wallet that holds capital.

key custody

Wallet private keys and API tokens are encrypted at rest with Fernet (AES-128-CBC with HMAC-SHA256 authentication). The encryption key lives in a single file outside the repository (.encryption_key), mode 600, owned by the bot user.

Keys never appear in logs, in chat history, or in tool responses. Tool calls that need a key load it into memory just-in-time and never echo it back to the model. If the encryption key file is missing, the bot refuses to start — it does not silently fall back to plaintext.
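A minimal sketch of the fail-closed startup check described above. The helper name load_encryption_key, the exception type, and the exact permission test are assumptions for illustration, not Mentat's actual loader.

```python
import stat
from pathlib import Path


class KeyFileError(RuntimeError):
    """Raised when the encryption key file is missing or unsafe."""


def load_encryption_key(path: Path) -> bytes:
    """Return the key bytes, or raise so the caller refuses to start.

    Hypothetical helper: the name and the checks are assumptions.
    """
    if not path.is_file():
        # Fail closed: no key means no startup, never a plaintext fallback.
        raise KeyFileError(f"encryption key file not found: {path}")

    mode = stat.S_IMODE(path.stat().st_mode)
    if mode & 0o077:
        # Any group/other permission bit means the file is looser than 600.
        raise KeyFileError(f"{path} has mode {mode:o}, expected 600")

    key = path.read_bytes().strip()
    if not key:
        raise KeyFileError(f"encryption key file is empty: {path}")
    return key
```

The returned bytes would typically be handed to cryptography.fernet.Fernet(key) at the point of use, so that decrypted secrets exist in memory only for the duration of the tool call.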

confirmation gates

Every action that moves funds, sends a message, posts to a public surface, or writes a file routes through a yes/no confirmation dialog persisted in pending_confirmations. The operator sees a one-line summary in Telegram: "send 5 SOL to ABC… for $725? yes/no". Until the operator replies, the action does not fire.
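One way such a gate could be persisted, sketched with the standard sqlite3 module. Only the table name pending_confirmations comes from the text above; the columns, status values, and function names are assumptions.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the confirmations table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS pending_confirmations (
               id INTEGER PRIMARY KEY,
               summary TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending'
           )"""
    )
    return conn


def request_confirmation(conn: sqlite3.Connection, summary: str) -> int:
    """Persist a pending action and return its id."""
    cur = conn.execute(
        "INSERT INTO pending_confirmations (summary) VALUES (?)", (summary,)
    )
    conn.commit()
    return cur.lastrowid


def record_reply(conn: sqlite3.Connection, confirmation_id: int, reply: str) -> None:
    """Record the operator's answer; anything but yes/no leaves it pending."""
    answer = reply.strip().lower()
    if answer not in ("yes", "no"):
        return
    status = "approved" if answer == "yes" else "rejected"
    # Only a pending row can change state, so a decision is final.
    conn.execute(
        "UPDATE pending_confirmations SET status = ? "
        "WHERE id = ? AND status = 'pending'",
        (status, confirmation_id),
    )
    conn.commit()


def may_execute(conn: sqlite3.Connection, confirmation_id: int) -> bool:
    """True only if the operator explicitly approved this action."""
    row = conn.execute(
        "SELECT status FROM pending_confirmations WHERE id = ?",
        (confirmation_id,),
    ).fetchone()
    return row is not None and row[0] == "approved"
```

Because the row is written before the operator is asked, a restart between the question and the answer leaves the action pending rather than silently executed or lost.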

Phrase-gated actions add a second step. Setting up a new key, lifting a daily cap, or executing a trade above a configured size requires the operator to type a specific phrase (e.g. "I accept the risk"). This blocks any path where the model alone — even if jailbroken — could authorise a destructive action.
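The two-step check can be sketched as a pure function. The names below are hypothetical, and the sketch assumes the typed phrase is taken from the operator's own Telegram message, never from model output.

```python
from typing import Optional

# Example phrase from the text above; real deployments may configure others.
REQUIRED_PHRASE = "I accept the risk"


def authorise(
    yes_no_approved: bool,
    phrase_gated: bool,
    typed_phrase: Optional[str],
    required_phrase: str = REQUIRED_PHRASE,
) -> bool:
    """Return True only when every gate that applies has been passed.

    Hypothetical helper: the name and signature are assumptions.
    """
    if not yes_no_approved:
        # The ordinary yes/no confirmation always comes first.
        return False
    if not phrase_gated:
        return True
    # Phrase-gated actions additionally need the exact phrase.
    return typed_phrase is not None and typed_phrase.strip() == required_phrase
```

The match is exact on purpose: a loose match ("ok", "sure") would be easier to trigger by accident.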

paper-mode default

Trading deputies ship in paper mode by default. Live mode is opt-in per deputy and per asset. In paper mode the trade path still runs (the deputy marks the position, simulates fills, and accrues funding), but no order reaches the venue. Operators move to live only after watching paper behave for a known period.
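A minimal sketch of the paper/live split, assuming a hypothetical OrderRouter that sits between a deputy and the venue client. Funding accrual and realistic fill simulation are omitted; the point is only that live routing is opt-in per (deputy, asset) pair.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple


@dataclass
class Order:
    deputy: str
    asset: str
    size: float  # positive = buy, negative = sell


@dataclass
class OrderRouter:
    # Hypothetical callable that would place a real order on the venue.
    send_to_venue: Callable[[Order], None]
    live_pairs: Set[Tuple[str, str]] = field(default_factory=set)
    paper_positions: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def enable_live(self, deputy: str, asset: str) -> None:
        """Opt one deputy into live trading for one asset."""
        self.live_pairs.add((deputy, asset))

    def submit(self, order: Order) -> str:
        key = (order.deputy, order.asset)
        if key in self.live_pairs:
            self.send_to_venue(order)
            return "live"
        # Paper is the default: track the simulated position locally.
        self.paper_positions[key] = self.paper_positions.get(key, 0.0) + order.size
        return "paper"
```

Keeping the default in the router, rather than in each deputy, means a newly added deputy cannot reach the venue until someone calls enable_live for it.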

sandbox for self-modifying code

When you ask Mentat for a capability it doesn't have, the self-modification deputy researches the API, generates code, and runs the result in a Docker container with no network and no mounted secrets. Tests run inside the sandbox. Only after they pass does the proposal surface as a diff for review.

Once you approve the diff, three rollback layers protect the live system:

  1. git commit (most reversible)
  2. filesystem snapshot of the touched files
  3. schema diff of any DB tables added or altered

A bad apply rolls back automatically on test failure or service crash. A bad apply that passes tests but misbehaves in production rolls back via a single Telegram command.
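As an illustration of the snapshot layer combined with automatic rollback on test failure, here is a sketch using only the standard library. The function name and the injected apply and tests_pass callables are hypothetical; the git and schema layers are not shown.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Dict, Iterable, Optional


def apply_with_rollback(
    files: Iterable[Path],
    apply: Callable[[], None],
    tests_pass: Callable[[], bool],
) -> bool:
    """Snapshot the touched files, apply a change, restore them on failure."""
    with tempfile.TemporaryDirectory() as snap_dir:
        snapshot: Dict[Path, Optional[Path]] = {}
        for i, f in enumerate(files):
            if f.exists():
                copy = Path(snap_dir) / str(i)
                shutil.copy2(f, copy)
                snapshot[f] = copy
            else:
                snapshot[f] = None  # file is new; rollback deletes it

        try:
            apply()
            ok = tests_pass()
        except Exception:
            ok = False  # a crash during apply or tests counts as failure

        if not ok:
            for f, copy in snapshot.items():
                if copy is None:
                    f.unlink(missing_ok=True)
                else:
                    shutil.copy2(copy, f)
        return ok
```

The snapshot is taken before anything is written, so the restore path does not depend on the change having applied cleanly.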

rate + cost gates

The bot enforces a daily USD cap on LLM API cost and refuses to spend past it. The cap is checked before any API call is made, not after. Trading deputies have per-asset position caps and a daily loss floor; tripping either one closes positions and pauses the deputy.
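The check-before-call ordering can be sketched as follows. DailyBudget and guarded_call are hypothetical names, the cost estimate is assumed to come from the caller, and the midnight reset of spent_usd is left out.

```python
from dataclasses import dataclass
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


class BudgetExceeded(RuntimeError):
    """Raised when a call would push spending past the daily cap."""


@dataclass
class DailyBudget:
    cap_usd: float
    spent_usd: float = 0.0

    def guarded_call(
        self,
        estimated_cost_usd: float,
        call: Callable[[], Tuple[T, float]],
    ) -> T:
        """Run call() only if the estimate fits under the cap.

        call() must return (result, actual_cost_usd).
        """
        if self.spent_usd + estimated_cost_usd > self.cap_usd:
            # Short-circuit: the API is never contacted.
            raise BudgetExceeded(
                f"cap {self.cap_usd:.2f} USD, spent {self.spent_usd:.2f} USD"
            )
        result, actual_cost_usd = call()
        self.spent_usd += actual_cost_usd
        return result
```

Checking an estimate up front means the final call of the day can overshoot by the estimation error, but never by a whole unbudgeted request.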

what happens if the host dies

The bot is one Python process and one SQLite file. If the VPS dies mid-trade, open positions stay on the venue (Hyperliquid, Aave) — they don't depend on the bot to remain alive. On restart, the bot reconciles state from the venue and resumes monitoring.
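Reconciliation on restart can be sketched as a diff in which the venue always wins. The function name and the asset-to-size dictionaries are assumptions; fetching positions from Hyperliquid or Aave is not shown.

```python
from typing import Dict, List, Tuple


def reconcile(
    local: Dict[str, float],
    venue: Dict[str, float],
) -> Tuple[Dict[str, float], List[str]]:
    """Return (new local state, human-readable notes on what changed).

    The venue is treated as the source of truth: local state is replaced,
    and every discrepancy is reported so the operator can review it.
    """
    notes: List[str] = []
    for asset in sorted(set(local) | set(venue)):
        before = local.get(asset, 0.0)
        after = venue.get(asset, 0.0)
        if before != after:
            notes.append(f"{asset}: local {before} -> venue {after}")
    # Drop flat positions so the stored state stays minimal.
    reconciled = {asset: size for asset, size in venue.items() if size != 0.0}
    return reconciled, notes
```

For example, a position that was closed on the venue while the host was down shows up as a note and disappears from local state, instead of being monitored as if it were still open.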

Backups: the SQLite file is replicated nightly to encrypted offsite storage. The encryption key is not in the backup. To restore, you supply the encryption key and the latest snapshot — without both, the backup is useless to anyone, including us.

what the gate cannot do

The confirmation gate runs in the same Python process as the bot. If the host machine is compromised (someone has shell access), the gate can be bypassed by editing the bot's state directly. The gate protects against bot mistakes, not against an attacker with root.

The operator-hardening checklist lives in /docs. At minimum: SSH keys only, fail2ban, ufw, no root login, full-disk encryption, and an off-host copy of the encryption key.

reporting a vulnerability

Email hello@mentatai.xyz with the subject "security". We reply within 24 hours. Disclose privately first; if you want to be named, we will credit you in the public advisory once the fix ships.