Making our AI coding agent the only way we build our product
AnyFrame built an AI coding agent named Gilfoyle that ships code to production autonomously. The agent runs in a sandbox, is triggered via Discord, and can open PRs, run tests, and deploy. The company plans to make Gilfoyle the sole developer of its product, creating a self-improving loop.
Our newest team member is named Gilfoyle. He's an intern, and this week he shipped more code to production than most of the humans on the team. One of his pull requests added an entire feature, database migration and all. We merged it without anyone else touching a line. He built the page you're reading, too. And he did all of it under his own commit identity.
"Gilfoyle" isn't a person: he's our internal developer agent, an AI in a cloud sandbox with our codebase checked out and a shell open. We named him after the Silicon Valley character, the most arrogant, deadpan systems engineer on TV, demoted to intern and somehow still our most productive headcount.
The punchline: we built AnyFrame, a control plane for sandboxed AI agents, then pointed AnyFrame at AnyFrame. Gilfoyle runs on the product he ships.
What AnyFrame is
You define an agent: a repo, an install command, a system prompt, skills (reusable playbooks), and connections to tools like Slack, Linear, or GitHub. Then you boot it into an isolated cloud sandbox that can actually do the work. Not a chatbot that drafts suggestions: a real machine with your code and a terminal, that opens PRs, runs your tests, and ships. If you've wished Claude Code or Codex lived on a server your team could @mention and trust with private repos, that's Gilfoyle.
Hiring him took ten minutes
We onboarded him the way any customer would, in three steps.
- Create a template. A template is the blueprint: the repo (our monorepo), the install command, the skills, and a system prompt carrying our conventions and his personality. Anything spun up from it inherits the lot.
The template: define the blueprint once, reuse it for every agent.
- Create the agent. Spin an agent off that template, pick the Claude runtime (the model that actually drives him), and connect GitHub so he can clone and push under his own commit identity.
Gilfoyle's config: runtime, repo, skills, triggers, and permissions in one place.
- Point Discord at him. Switch on AnyFrame's Discord integration and aim it at the agent. No glue code, no service to babysit.
The whole wiring: a Discord server, pointed at the Gilfoyle agent.
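To make the three steps concrete, here is a minimal sketch of the data involved. It is not AnyFrame's SDK or config format: the class names, field names, and example values (`example/monorepo`, `make install`) are all assumptions chosen for illustration. It only shows the relationship described above, where an agent inherits everything from its template and adds a runtime, connections, and triggers.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Template:
    """Blueprint shared by every agent spun up from it (hypothetical shape)."""
    repo: str
    install_command: str
    system_prompt: str
    skills: tuple[str, ...] = ()


@dataclass(frozen=True)
class Agent:
    """An agent keeps a reference to its template and adds its own wiring."""
    name: str
    template: Template
    runtime: str
    connections: tuple[str, ...] = ()
    triggers: tuple[str, ...] = ()


def create_agent(template, name, runtime, connections=()):
    """Step 2: spin an agent off a template and pick its runtime."""
    return Agent(name=name, template=template, runtime=runtime,
                 connections=tuple(connections))


def add_trigger(agent, trigger):
    """Step 3: point a chat integration at the agent."""
    return replace(agent, triggers=agent.triggers + (trigger,))


# Step 1: define the blueprint once (all values are placeholders).
template = Template(
    repo="example/monorepo",
    install_command="make install",
    system_prompt="Follow our conventions. Stay deadpan.",
    skills=("proof-of-work",),
)
gilfoyle = add_trigger(
    create_agent(template, "gilfoyle", runtime="claude", connections=["github"]),
    "discord",
)
```

Because the agent holds the template rather than copying its fields, a second agent created from the same template would share the repo, install command, and skills without restating them.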
Now the workflow is simple: @mention him in any channel. He spins up a thread, boots a fresh sandbox with the repo cloned, and streams the work back. Follow-up messages go to the same sandbox; if it's evicted between turns, the next message silently resumes from a snapshot. One thread = one sandbox = one unit of work, so three people can put him on three tasks at once. He's an intern who clones himself.
A typical brief: @ him, describe the change in plain English, get back to your day.
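The "one thread = one sandbox" rule, including the silent resume after an eviction, can be sketched as a small in-memory model. This is an illustration under stated assumptions, not how AnyFrame implements it: `SandboxPool`, `sandbox_for`, and `evict` are hypothetical names, and a plain dict stands in for the sandbox's disk.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Sandbox:
    sandbox_id: int
    thread_id: str
    files: dict = field(default_factory=dict)  # stands in for the disk


class SandboxPool:
    """Maps each chat thread to exactly one sandbox (illustrative only)."""

    def __init__(self):
        self._ids = count(1)
        self._live = {}       # thread_id -> running Sandbox
        self._snapshots = {}  # thread_id -> saved disk contents

    def sandbox_for(self, thread_id):
        """Return the thread's sandbox, booting or resuming it if needed."""
        if thread_id in self._live:
            return self._live[thread_id]
        # A first boot starts with an empty disk; a resume restores the snapshot.
        disk = dict(self._snapshots.get(thread_id, {}))
        sandbox = Sandbox(next(self._ids), thread_id, disk)
        self._live[thread_id] = sandbox
        return sandbox

    def evict(self, thread_id):
        """Snapshot the disk and release the machine."""
        sandbox = self._live.pop(thread_id)
        self._snapshots[thread_id] = dict(sandbox.files)
```

Follow-ups in a live thread get the same object back, separate threads get separate sandboxes, and a thread that was evicted gets a new sandbox whose disk matches the snapshot. Only the disk is modeled here, so nothing in the sketch carries running processes across an eviction.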
Giving him the whole company in one checkout
Two deliberate choices gave us the leverage: what he works on, and how he proves it.
AnyFrame isn't one repo: it's a backend, a dashboard, an SDK, and a few integrations evolving together. Point an agent at one and it's blind to the rest, so we stitched everything into a single monorepo: one checkout that holds every repo as a submodule, each still pushing to its own remote. A task spanning the API, its client, and the SDK becomes one coherent piece of work, not three context switches he can't make.
Every repo as a submodule, one place to work from.
Then there are skills. The one that earned our trust is the proof-of-work skill: every sandbox ships with Chromium and Playwright baked in, so when Gilfoyle changes something you can see, he boots the dev server, drives a real browser to the page, and staples a screenshot to the work. He doesn't say the banner's fixed; he shows it fixed, on a phone-width viewport. When a picture isn't enough, AnyFrame tunnels his sandbox to the public internet and he drops a live URL to his running branch, before a line has merged.
A real PR comment proving a change with a screenshot of this very post.
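For readers who want to try the screenshot half of this on their own project, here is a rough sketch using Playwright's Python sync API. It is not Gilfoyle's actual skill: the function names, the `proof` output directory, and the 390x844 viewport are assumptions, and it presumes a dev server is already running at `base_url`.

```python
from pathlib import Path

# Phone-width viewport; the exact dimensions are an assumption.
PHONE_VIEWPORT = {"width": 390, "height": 844}


def screenshot_path(out_dir, route):
    """Build a stable file name for a route: '/blog/x' becomes 'blog-x.png'."""
    slug = route.strip("/").replace("/", "-") or "index"
    return Path(out_dir) / f"{slug}.png"


def capture(base_url, route, out_dir="proof"):
    """Open the page in headless Chromium and save a full-page screenshot."""
    # Imported lazily so screenshot_path works without Playwright installed.
    from playwright.sync_api import sync_playwright

    target = screenshot_path(out_dir, route)
    target.parent.mkdir(parents=True, exist_ok=True)
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(viewport=PHONE_VIEWPORT)
        page.goto(base_url + route)
        page.screenshot(path=str(target), full_page=True)
        browser.close()
    return target
```

With a dev server on a local port, `capture("http://localhost:3000", "/blog/gilfoyle")` would write `proof/blog-gilfoyle.png`; the port and route here are placeholders. Attaching the file to a PR comment is left out because that step depends on how your GitHub connection is set up.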
That collapses the review loop: an intern who announces he's done is a liability; one who hands you a screenshot and a link to try is a colleague.
What he actually does
He ships code, from one-line fixes to whole features. Small end: "the first-sandbox banner wraps to three lines on mobile, fix it" (commit b1a4382), or correcting a wrong TTL in the docs. At the other end, PR #128, recurring agent session scheduling, was opened autonomously under his own identity and merged: a database table, a background scheduler, a migration, and the UI below. Nobody else wrote a line.
A feature he shipped: any agent can now carry a cron schedule. The first one we set up wakes every weekday at 9am, hunts dead code, and opens a draft PR, no human in the loop.
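"Every weekday at 9am" is conventionally written as the five-field cron expression `0 9 * * 1-5`. Whether AnyFrame accepts exactly this syntax, and which time zone it applies, is not stated here, so treat the expression as an assumption. The sketch below is a tiny matcher, written only to show what the expression means; it handles `*`, single numbers, comma lists, and ranges, and nothing else.

```python
from datetime import datetime


def _field_matches(spec, value):
    """Match one cron field: '*', a number, a comma list, or a range."""
    if spec == "*":
        return True
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr, when):
    """Return True if `when` falls on the five-field cron expression `expr`.

    Simplification: all five fields must match. Real cron treats a
    restricted day-of-month and day-of-week as alternatives.
    """
    minute, hour, day, month, weekday = expr.split()
    # cron counts Sunday as 0; Python's weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(day, when.day)
        and _field_matches(month, when.month)
        and _field_matches(weekday, cron_weekday)
    )


WEEKDAYS_AT_9AM = "0 9 * * 1-5"
```

Under this reading, 09:00 on Monday 1 January 2024 matches, while 09:00 on the following Saturday or Sunday does not.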
He answers from context, too: "where do we set the idle-reaper timeout?" comes back with a file path and a commit hash, not vibes.
The plan, and what's next
This is more ambitious than "we have a neat bot": we want Gilfoyle to be the only way we develop AnyFrame. Every feature, every fix, routed through the intern who lives in the product. That makes us the most demanding user of our own product, continuously: the instant he can't do something a developer needs, it's a bug report written in our own blood, landing the same week a customer would feel it. So the loop reinforces itself: we use him to build AnyFrame, that surfaces what he's missing, and we have him build that too. Features come from this loop, not a whiteboard. (Scheduling, above, was on this list until he shipped it.) Next up:
- Take tickets, not just messages. Connect Linear so he picks up issues and moves cards himself; the backlog becomes his queue.
- Keep his checkout fresh. Pull and rebase on a schedule so he's never building on a week-old tree.
- Survive a pause with processes intact. A resumed sandbox restores the disk, but anything he had running is gone; today he relaunches it by hand.
- Remember across tasks. Persistent memory so each task starts smarter: an intern who improves every week instead of resetting every morning.
None of this is glamorous, and that's the point. An agent that can talk about your code is a parlor trick; one you'd make your only path to production has to clear a hundred boring bars: identity, auth, isolation, snapshots, permissions, trust. Living inside that constraint is how we find them. Gilfoyle isn't our mascot. He's our forcing function.
So: who would your Gilfoyle be, and what would you point him at?
Want to hire your own intern? At anyframe.dev, create a template, spin an agent off it, and switch on the Discord integration. You'll have a sandboxed dev agent in your team chat before your coffee's cold. Naming him Gilfoyle is optional but encouraged.