Building Quevin with OpenClaw: Notes on Running a Personal AI Agent
Quevin has been my handle since 2007. I ran Quevin LLC — a small web and interactive agency — for seven years before closing it to join UCSF in 2014. The name stuck. It’s on the domain. It’s how people find me online.
So when OpenClaw walked me through the bootstrap ritual — a short exchange to establish the agent’s name and personality — naming it Quevin wasn’t a joke. It was the obvious choice. The handle was already there. Might as well use it.
This is a weekend experiment. Not a product. Not a pitch. Just something I’m building carefully to see what’s actually possible when you run your own agent, on your own infrastructure, with real access to real things.
What This Actually Is
OpenClaw is a self-hosted AI agent platform. You run it yourself, connect it to whatever LLM provider you’re using, and it gives the agent memory, tools, and access to external services — messaging channels, file systems, the web. It’s not a chatbot wrapper. It’s closer to a runtime for an agent that’s supposed to persist.
I’m running it in Docker on an older MacBook Pro, with the agent connected to Claude Sonnet as the underlying model. Quevin lives in a workspace directory, maintains memory files across sessions, and now — after some fumbling — responds on Telegram.
The use case isn’t productivity theater. I want something I can actually ask questions to, that already knows my context, without having to re-explain myself every time.
Why Not Just Use Claude Code
I already have a working AI setup. Claude Code in the terminal, a desktop tool called Cowork with a sandboxed VM, custom skills for Jira ticket writing, code review patterns, and internal comms editing. MCP servers wired to Jira, GitHub, and Asana. It’s capable.
The gap isn’t output quality. It’s operational mode.
What I can do today: I sit down, open a session, ask Claude to review a PR or write a ticket. It does. Session ends.
What I can’t do: have something that watches for a new PR and reviews it without me asking. That monitors Asana intake and creates Jira tickets on its own. That I can reach from my phone when I’m away from the terminal.
That’s the specific problem OpenClaw solves. Persistence. Autonomy. Event-driven execution. Not a smarter model — a different mode of operation.
The Setup
The Docker path is not the friction-free path. The docs are written for people running the native binary. I’m not. So the first thing I learned is that `openclaw gateway restart` does nothing useful in a containerized environment — it tries to call systemd, fails, and returns a not-very-helpful error.

The actual restart is:

```shell
cd ~/openclaw-lab/openclaw
docker compose -f docker-compose.yml -f docker-compose.override.yml down
docker compose -f docker-compose.yml -f docker-compose.override.yml up -d
```
Once that was clear, everything moved faster.
Telegram setup was straightforward: BotFather, token, config entry, restart. The tricky part was `dmPolicy`. I had it set to `"open"` while testing, which works, but means anyone who finds your bot can talk to your agent. Setting it to `"pairing"` and removing the `allowFrom: ["*"]` wildcard locks it down properly. In hindsight, obvious. In practice, the kind of thing you only notice after wondering why nothing’s working.
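For reference, the locked-down fragment ends up looking roughly like this. The `dmPolicy` and `allowFrom` keys are straight from my working setup; the surrounding nesting and key names (`channels`, `botToken`) are illustrative, so check your own OpenClaw config for the exact shape:

```json
{
  "channels": {
    "telegram": {
      "botToken": "<from BotFather>",
      "dmPolicy": "pairing"
    }
  }
}
```

Note the absence of an `allowFrom: ["*"]` wildcard; pairing mode handles who is allowed to talk to the bot.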
What the Agent Actually Has Access To
Right now: this repository.
I cloned quevin.com into the agent’s workspace. Quevin can read the existing posts, understand the site structure, create branches, commit, and push. This post was drafted on a branch it created.
That’s a meaningful shift from using Claude in a chat window. The agent isn’t just responding — it’s operating. It made the branch. It read my other posts before writing anything. It knew to look at the frontmatter schema before proposing a structure.
Whether that changes the quality of the output is still an open question. It changes the workflow, definitely.
On Security
This is where I’ll be deliberate, and where most “I set up an AI agent” writeups skip too fast.
An agent that can read your files, message you on your behalf, and push to your repositories is not a toy. The threat surface is real: prompt injection through messaging channels, credential exfiltration, sandbox escape, supply chain risk from community-built skills. These aren’t hypothetical concerns. They’re the first things you should think through before connecting anything.
My starting posture:
- Built from source. Cloned the OpenClaw repo, audited the Dockerfile and setup script, built locally. Never used a pre-built third-party image.
- Network: local only. Gateway binds to loopback. No public ports.
- Telegram: `dmPolicy: "pairing"`, locked to my account, no wildcard allowlist.
- GitHub: scoped to one repository.
- Tools: workspace-only filesystem access, deny by default.
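The loopback-only binding is the part most worth writing down, since Docker Compose publishes ports on 0.0.0.0 by default. A sketch of the override, assuming a service named `openclaw-gateway` and a placeholder port (both hypothetical; use whatever your generated compose file actually declares):

```yaml
# docker-compose.override.yml (sketch; service name and port are placeholders)
services:
  openclaw-gateway:
    ports:
      - "127.0.0.1:8080:8080"   # bind to loopback only, not 0.0.0.0
```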
The principle is defense in depth. Each layer assumes the one above it will eventually fail. And the operating rule is simple: expand access only after the system has proven it won’t misuse what it already has.
I’ll expose more as trust is earned. Not before.
What’s Not Working Yet
Memory is the obvious gap. Each session, Quevin wakes up fresh and reads a set of files to reconstruct context. The files are there. The reconstruction is imperfect. I know this because I’ve repeated myself across sessions — explained the same thing about the Docker setup, the same thing about the repo — and the agent didn’t connect it to previous conversations the way a human colleague would.
The memory system exists. Daily logs, long-term notes, a heartbeat mechanism for proactive check-ins. But writing to those files consistently, in a way that actually compresses and retains the right things, is work. Either I have to be deliberate about telling it to remember things, or it has to be good enough at deciding what matters to capture it unprompted.
Neither is quite there yet.
The Thing I Keep Thinking About
There’s a version of this that’s just a fancier chat interface. I access it on Telegram instead of a browser tab. It has some file access. It knows my name.
That’s not what I’m trying to build. What I’m actually after is an agent that earns trust over time — that accumulates enough context about how I work, what I care about, and what I’ve already figured out, that interacting with it is meaningfully different from starting from scratch every time.
That version doesn’t exist yet. The infrastructure is in place. The gap is in the quality of memory and in how well I’ve defined what I want the agent to actually know about me.
That’s probably a me problem more than a platform problem.
What’s Next
The roadmap I’m working toward, in order:
Jira ticket quality agent. I already have a skill that drafts tickets. The next step is one that reads an existing ticket, evaluates it against definition-of-ready criteria, and suggests specific improvements. Not a vibe check — a structured evaluation against documented standards.
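To make “not a vibe check” concrete, the evaluation step can be a plain checklist over the ticket’s fields. A minimal sketch, with placeholder criteria and field names (the real list comes from our documented definition-of-ready standards, not from this code):

```python
# Placeholder definition-of-ready criteria; the real ones live in our
# documented standards, and the field names here are hypothetical.
DOR_CRITERIA = {
    "has_acceptance_criteria": lambda t: bool(t.get("acceptance_criteria")),
    "has_estimate": lambda t: t.get("story_points") is not None,
    "description_long_enough": lambda t: len(t.get("description", "")) >= 50,
}


def evaluate_ticket(ticket: dict) -> dict:
    """Return pass/fail per criterion plus the list of failures."""
    results = {name: check(ticket) for name, check in DOR_CRITERIA.items()}
    failures = [name for name, ok in results.items() if not ok]
    return {"ready": not failures, "results": results, "failures": failures}
```

The agent’s job is then to turn each failure into a specific suggested improvement, rather than a generic “needs work.”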
Documentation-driven code review. The premise: load the project’s CLAUDE.md, architecture docs, and coding standards into context, then apply them when reviewing a PR. The value isn’t that Claude is a better reviewer than a human. It’s that it applies the documented standards consistently, every time, without tribal knowledge leaking away when someone leaves the team.
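The mechanism here is unglamorous: the value comes from assembling the same documented context into every review request. A sketch of that assembly step, pure string-building with no real API calls (the prompt wording is illustrative):

```python
def build_review_prompt(claude_md: str, standards: str, diff: str) -> str:
    """Assemble a review prompt that grounds the model in project docs.

    Pure string assembly; the point is that the documented standards
    ride along with every review, so they get applied consistently.
    """
    return (
        "You are reviewing a pull request for this project.\n\n"
        "## Project conventions (CLAUDE.md)\n" + claude_md.strip() + "\n\n"
        "## Coding standards\n" + standards.strip() + "\n\n"
        "## Diff under review\n" + diff.strip() + "\n\n"
        "Review the diff strictly against the conventions and standards "
        "above. Cite the specific documented rule for each issue you raise."
    )
```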
Asana → Jira intake bridge. This is the first genuinely autonomous workflow: agent watches an Asana project, reads new intake tasks, maps them to Jira fields, creates tickets, links back. The manual version of this already works in Claude Code. The point of OpenClaw is to run it without me sitting there.
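The heart of that bridge is the mapping step between the two schemas. A sketch with hypothetical field names on both sides (real Asana and Jira payloads have their own shapes, and the watching and ticket creation would happen around this function):

```python
# Hypothetical mapping step for the intake bridge. Field names are
# illustrative; real Asana/Jira payloads have their own schemas.
def asana_task_to_jira_fields(task: dict, project_key: str) -> dict:
    """Map an Asana intake task to a Jira issue-creation payload."""
    return {
        "project": {"key": project_key},
        "issuetype": {"name": "Task"},
        "summary": task["name"][:255],      # Jira caps summary at 255 chars
        "description": task.get("notes", ""),
        "labels": ["asana-intake"],         # marks provenance for the link-back
    }
```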
QA automation. Furthest out. Browser automation, structured test plans, result evaluation against acceptance criteria, bug creation with screenshots. Complex enough that I expect to spend more time defining what “done” looks like than building the automation itself.
There’s also a longer arc here. The skills I build, the deployment methodology, the security approach — these are transferable. Every team deploying AI agents faces the same questions about isolation, credential management, and prompt injection. Most skip them. The consultant who can answer those questions, with documented gates and verification steps rather than intuition, has something useful to sell.
That’s not why I’m doing this. But it’s a reasonable side effect of doing it carefully.
About the Author
Kevin is a Senior Technical Lead at Fluke Corporation, where he manages web infrastructure and AI-augmented development workflows.
Kevin P. Davison has over 20 years of experience building websites and figuring out how to make large-scale web projects actually work. He writes about technology, AI, leadership lessons learned the hard way, and whatever else catches his attention—travel stories, weekend adventures in the Pacific Northwest like snorkeling in Puget Sound, or the occasional rabbit hole he couldn't resist.