I got myself banned.
Not from anything nefarious — from a viral open-source Telegram bot that relayed messages to Claude. The project was clever, but it was also a security disaster: public endpoints, leaked API keys, prompt injection vulnerabilities wide open, and Anthropic rightfully shutting down accounts that were being abused through it. I watched the GitHub issues pile up — account bans, credential exposure, runaway API costs — and thought: I want what this project promises, but I want it built right.
So I built Henry.
What Henry Is
Henry is a secure, single-user Telegram bot written in Go that relays your coding questions to multiple AI backends — Claude Code, Grok, and GPT — and returns answers with full conversation persistence. It runs as a local daemon on your machine. No public endpoints. No shared infrastructure. No attack surface beyond the Telegram long-polling connection you already trust with your phone.
The name is not an acronym. It is what it is.
The core idea is simple: I want to message my AI coding assistant from anywhere — my phone on the train, my iPad on the couch, my desktop at 2 AM — and have it maintain context, remember what we were working on, and give me production-quality answers without me ever opening a browser tab or SSH session.
Why Not Just Use the Claude App?
Fair question. Here is what Henry gives me that the standard interfaces do not:
- Multi-model routing. I can switch between Claude, Grok, and GPT mid-conversation with a single command. Different models have different strengths. Claude excels at agentic coding and long-context reasoning. GPT is strong at quick code generation. Grok has a stateful interpreter. Henry lets me reach all of them from one chat thread.
- Claude Code in headless mode. This is the critical differentiator. Henry does not call the Anthropic API directly. It invokes the Claude Code CLI in headless mode (`claude -p --output-format json`). This means Claude has access to the full agent loop — tool use, file system access, multi-turn reasoning — not just a single completion endpoint. When I ask Henry to review code, it is not generating a response from a prompt; it is running an agent.
- MCP server integration. Through the Model Context Protocol, Henry gives Claude access to my GitHub repositories, Google Calendar, Gmail, Things 3 task list, local git repos, and filesystem. I can message Henry “what’s on my calendar today?” or “create a PR for the auth fix on the henry repo” and it just works.
- Persistent sessions across devices. Every conversation is stored in Supabase (PostgreSQL). I can start a debugging session on my laptop, continue it from my phone, and pick it back up at my desk — with full context preserved.
- Security I actually trust. Because I built it, audited it, and run it on my own hardware.
The Architecture
Henry follows standard Go project conventions. The entire application compiles to a single binary with zero runtime dependencies beyond the claude CLI.
henry/
├── cmd/henry/ # Entry point, signal handling, graceful shutdown
├── internal/
│ ├── backend/ # Backend interface + Claude/Grok/GPT implementations
│ ├── config/ # Environment-based configuration
│ ├── router/ # Message routing and inline model selection
│ ├── sanitizer/ # Input validation and sanitization
│ ├── session/ # Supabase PostgreSQL persistence
│ ├── telegram/ # Telegram bot handler
│ ├── voice/ # Gemini transcription + ElevenLabs synthesis
│ ├── phone/ # Twilio call/SMS integration
│ └── embeddings/ # OpenAI text embeddings
└── docs/ # Architecture docs, PRD, task breakdown
The design philosophy is straightforward: every component is behind an interface, every feature is opt-in, and the core path from Telegram message to AI response is as short as possible.
The Backend Interface
The extensibility story is a simple Go interface:
type Backend interface {
    Name() string
    Send(ctx context.Context, sessionID string, message string) (Response, error)
    ClearSession(sessionID string) error
    IsAvailable() bool
}
A registry pattern holds all registered backends. Adding a new AI provider means implementing four methods and registering it. The router resolves which backend to use based on the user’s current model selection or an inline prefix (`claude: explain this code`).
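The registry and inline-prefix routing can be sketched roughly as follows. The `Registry` type, the `Resolve` method, and the `echo` backend are illustrative stand-ins, not Henry's actual code:

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Response and Backend mirror the interface shown above (sketch).
type Response struct{ Text string }

type Backend interface {
	Name() string
	Send(ctx context.Context, sessionID string, message string) (Response, error)
	ClearSession(sessionID string) error
	IsAvailable() bool
}

// Registry holds registered backends keyed by lowercase name.
type Registry struct{ backends map[string]Backend }

func NewRegistry() *Registry { return &Registry{backends: map[string]Backend{}} }

func (r *Registry) Register(b Backend) { r.backends[strings.ToLower(b.Name())] = b }

// Resolve honors an inline prefix like "gpt: explain this" when the prefix
// names a known backend; otherwise it falls back to the current default and
// leaves the message untouched.
func (r *Registry) Resolve(msg, defaultName string) (Backend, string) {
	if name, rest, ok := strings.Cut(msg, ":"); ok {
		if b, found := r.backends[strings.ToLower(strings.TrimSpace(name))]; found {
			return b, strings.TrimSpace(rest)
		}
	}
	return r.backends[defaultName], msg
}

// echo is a trivial backend used only to demonstrate the registry.
type echo struct{ name string }

func (e echo) Name() string                { return e.name }
func (e echo) ClearSession(string) error   { return nil }
func (e echo) IsAvailable() bool           { return true }
func (e echo) Send(ctx context.Context, sid, m string) (Response, error) {
	return Response{Text: e.name + ": " + m}, nil
}

func main() {
	r := NewRegistry()
	r.Register(echo{"claude"})
	r.Register(echo{"gpt"})
	b, msg := r.Resolve("gpt: what do you think?", "claude")
	fmt.Println(b.Name(), "|", msg)
}
```

Note the fallback behavior: a message like "note: remember this" has a colon but no matching backend, so it routes to the default model with the text intact.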
Claude Code: The Headless Agent
This is the heart of Henry. Rather than making raw API calls to Anthropic, Henry shells out to the Claude Code CLI:
args := []string{
    "-p",
    "--output-format", "json",
    "--dangerously-skip-permissions",
}
if c.mcpConfigPath != "" {
    args = append(args, "--mcp-config", c.mcpConfigPath)
}
if hasSession {
    args = append(args, "--resume", claudeSessionID)
}
The --resume flag is what makes multi-turn conversations work. Henry generates a UUID for each new session, passes it to Claude on the first message, and then resumes that session on every subsequent message. The Claude session ID is persisted to Supabase so it survives process restarts.
The JSON response from Claude includes cost tracking, duration metrics, turn counts, and the session ID — all of which Henry logs and stores:
type ClaudeJSONResponse struct {
    Type       string  `json:"type"`
    CostUSD    float64 `json:"cost_usd"`
    IsError    bool    `json:"is_error"`
    DurationMS int     `json:"duration_ms"`
    NumTurns   int     `json:"num_turns"`
    Result     string  `json:"result"`
    SessionID  string  `json:"session_id"`
}
This gives me full visibility into what every conversation costs and how long it takes. When you are running an AI agent that can autonomously execute multi-turn tool chains, knowing the cost per interaction is not optional — it is table stakes.
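Putting the pieces together, the invocation and JSON decoding might look like the following sketch. The `runClaude` and `parseClaudeResponse` helpers are my own illustration of the flow described above; Henry's exact flag handling, prompt passing, and error semantics may differ:

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"os/exec"
)

// ClaudeJSONResponse mirrors the struct from the article.
type ClaudeJSONResponse struct {
	Type       string  `json:"type"`
	CostUSD    float64 `json:"cost_usd"`
	IsError    bool    `json:"is_error"`
	DurationMS int     `json:"duration_ms"`
	NumTurns   int     `json:"num_turns"`
	Result     string  `json:"result"`
	SessionID  string  `json:"session_id"`
}

// runClaude shells out to the Claude Code CLI in headless mode, resuming an
// existing session when one is known. Assumes `claude` is on PATH and that
// the prompt is passed as the argument to -p (an assumption, not verified
// against Henry's source).
func runClaude(ctx context.Context, prompt, sessionID string) ([]byte, error) {
	args := []string{"-p", prompt, "--output-format", "json"}
	if sessionID != "" {
		args = append(args, "--resume", sessionID)
	}
	return exec.CommandContext(ctx, "claude", args...).Output()
}

// parseClaudeResponse decodes the CLI's JSON envelope and surfaces CLI-side
// failures as Go errors.
func parseClaudeResponse(raw []byte) (ClaudeJSONResponse, error) {
	var r ClaudeJSONResponse
	if err := json.Unmarshal(raw, &r); err != nil {
		return r, fmt.Errorf("decode claude output: %w", err)
	}
	if r.IsError {
		return r, fmt.Errorf("claude error: %s", r.Result)
	}
	return r, nil
}

func main() {
	// Decode a sample envelope; no CLI invocation needed for the demo.
	sample := []byte(`{"type":"result","cost_usd":0.0042,"is_error":false,"duration_ms":1234,"num_turns":3,"result":"done","session_id":"abc-123"}`)
	r, err := parseClaudeResponse(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("cost=$%.4f turns=%d session=%s\n", r.CostUSD, r.NumTurns, r.SessionID)
}
```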
The Message Flow
Every message follows a deterministic path:
- Authorization — Telegram user ID checked against the single authorized user. Everyone else gets rejected immediately.
- Sanitization — Input is validated for length (configurable, default 10,000 chars), control characters are stripped, whitespace is normalized. Empty messages are rejected.
- Session resolution — The router looks up or creates a session in Supabase.
- Backend routing — The message is dispatched to the appropriate AI backend. Inline model prefixes are parsed and respected.
- Conversation logging — Both the user message and assistant response are persisted to the `messages` table.
- Response formatting — Claude’s markdown output is cleaned up for Telegram’s subset of markdown support. Messages exceeding Telegram’s 4096-character limit are intelligently split at newline or space boundaries.
No step is skippable. No shortcut path exists for “trusted” input. The sanitizer runs on every message, the auth check runs on every update, and the session store logs every interaction.
Security: Lessons from the Projects That Got It Wrong
I wrote about OpenClaw and the agentic exploitation risks recently. Henry exists in part because those risks are real, and the projects that inspired it handled them poorly.
What the viral relay bots got wrong:
- Public HTTP endpoints. If your bot has a webhook URL, it has an attack surface. Henry uses Telegram long-polling exclusively — it initiates all connections outward. Nothing listens.
- Shared API keys. Some projects had users sharing a single Anthropic key or, worse, submitting their own keys to a public server. Henry uses the Claude CLI’s own authentication. No API keys are stored or transmitted.
- No input sanitization. Prompt injection is trivial when user input flows directly to an LLM. Henry sanitizes every message before it reaches any backend.
- No authorization. Some bots were open to any Telegram user. Henry checks the user ID on every single update. If you are not the configured user, you get a rejection message and a log entry.
- Verbose error messages. Leaking stack traces, file paths, or configuration details in error responses is a classic information disclosure vulnerability. Henry’s error messages are user-facing and sanitized.
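The per-update authorization check is conceptually tiny. A sketch, with a placeholder user ID standing in for the configured value and a hypothetical `authorize` helper:

```go
package main

import (
	"fmt"
	"log"
)

// authorizedUserID would come from configuration (e.g. a TELEGRAM_USER_ID
// variable in .env); the value here is a placeholder.
const authorizedUserID int64 = 123456789

// authorize implements the check described above: every update's sender is
// compared against the single configured user ID; everyone else is rejected
// and the attempt is logged.
func authorize(fromID int64) bool {
	if fromID != authorizedUserID {
		log.Printf("rejected update from unauthorized user %d", fromID)
		return false
	}
	return true
}

func main() {
	fmt.Println(authorize(123456789)) // the configured user
	fmt.Println(authorize(42))        // anyone else: rejected and logged
}
```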
What Henry gets right:
- Single-user by design. This is not a multi-tenant SaaS product. It is a personal tool. The threat model is correspondingly simple: protect the single authorized user’s data and prevent unauthorized access.
- Local-only operation. The binary runs on your machine. There is no cloud deployment, no container orchestration, no ingress controller. The attack surface is the Telegram Bot API polling connection and the Supabase database connection — both outbound, both TLS-encrypted.
- Environment-based secrets. All configuration lives in a `.env` file that never gets committed. The `.env.example` template documents every variable without exposing values.
- Defense in depth. Even if someone somehow bypassed Telegram’s own security and injected a message, they would still hit the user ID check, the sanitizer, and the rate limiter before reaching any AI backend.
The Optional Modules: Voice, Phone, Embeddings
Henry’s core is lean — Telegram in, AI out, sessions persisted. But the modular architecture supports opt-in capabilities that extend it significantly.
Voice transcription uses Google’s Gemini API. Send Henry a voice message on Telegram and it transcribes it to text, then routes the transcription to your selected AI backend. This is surprisingly useful when you are debugging on your phone and want to describe a problem verbally rather than thumb-typing code.
Voice synthesis uses ElevenLabs. Henry can read responses back to you. Configurable voice IDs, stability parameters, and model selection. I use this less frequently, but it is useful for long explanations when I am away from a screen.
Phone integration via Twilio lets Henry make calls and send SMS. The TwiML helper functions support Say, Play, Gather, and Record operations. This is the most experimental module — the use case is receiving urgent alerts or interacting with Henry via phone when Telegram is not available.
Text embeddings via OpenAI enable semantic search across conversation history. This lays the groundwork for a future feature: intelligent context retrieval, where Henry can pull relevant past conversations into the current session’s context window.
Each module is gated by a feature flag in the configuration. If you do not set VOICE_TRANSCRIPTION_ENABLED=true and provide a GEMINI_API_KEY, the voice module never initializes. Zero overhead for features you do not use.
Building It: Go as the Right Choice
I chose Go for Henry for the same reasons I choose it for most infrastructure projects:
- Single binary deployment. `go build -o henry ./cmd/henry` and you are done. No runtime, no dependency hell, no container required (though you can containerize it if you want).
- Cross-compilation. `GOOS=linux GOARCH=amd64 go build` gives me a Linux binary from my Mac. Henry runs on my MacBook locally and on a Linux server for always-on operation.
- Concurrency model. Each Telegram message is handled in its own goroutine. The Claude CLI can take 30+ seconds for complex agentic tasks. Go’s goroutines handle this naturally without blocking other messages.
- Strong standard library. HTTP clients, JSON parsing, context propagation, signal handling — Go’s standard library covers 90% of what Henry needs. The external dependencies are minimal: the Telegram bot library, the PostgreSQL driver, UUID generation, and dotenv loading.
The pgx/v5 driver with connection pooling (min 2, max 10 connections, PgBouncer compatible) handles the Supabase persistence. The schema auto-initializes on first run — no migration tool, no setup step beyond providing the connection string.
What I Actually Use It For
Henry is not a toy. It is my daily driver for coding assistance. Here is what a typical day looks like:
Morning commute: I message Henry from my phone with the previous day’s TODO items. “Review the PR I opened on the infrastructure repo yesterday and summarize the changes.” Henry, via MCP’s GitHub integration, fetches the PR, reads the diff, and gives me a summary.
At my desk: I switch to a multi-turn debugging session. “The AlloyDB migration script is failing on the foreign key constraints for the audit tables. Here’s the error…” Henry maintains context across 15-20 messages as we work through the issue together.
Late evening: An idea hits me. I message Henry from my iPad: “Sketch out a Go function that takes a Kubernetes pod spec and validates it against our security policies. Use the OPA constraint framework.” Henry generates the code with the full agentic loop — researching the OPA API, writing the function, and suggesting test cases.
All of these sessions are persisted. I can run `/status` to see message counts and session history. I can `/clear` to start fresh. I can switch to GPT for a second opinion with `gpt: what do you think of this approach?`.
What Is Next
Henry is an MVP that works well for its intended purpose. The roadmap includes:
- Grok and GPT backend implementations. The stubs are in place. The interface is defined. It is a matter of implementing the API clients.
- Rate limiting. The configuration supports it, but the enforcement layer is not yet wired up.
- Usage analytics. Henry already tracks cost per Claude interaction. Aggregating this into a daily/weekly report is straightforward.
- Automatic model fallback. If Claude is rate-limited or down, fall back to GPT automatically.
- Rich formatting. Collapsible sections and inline diffs in Telegram responses for long code reviews.
Try It Yourself
Henry is MIT licensed and designed for exactly one type of user: a developer who is comfortable with Go, environment variables, and running a daemon. If that is you, the setup takes less than 20 minutes:
- Clone the repo and `go build -o henry ./cmd/henry`
- Create a Telegram bot via @BotFather
- Get your Telegram user ID via @userinfobot
- Set up a Supabase project (free tier works)
- Copy `.env.example` to `.env`, fill in your values
- Run `./henry`
That is it. One binary. One .env file. Your own private AI coding relay that nobody else can access, nobody else can abuse, and nobody can shut down.
I named it Henry because every good tool deserves a name, and every engineer deserves an assistant they can trust. Henry is mine. Build yours.