Over the past few months it's become increasingly clear to me that every team will be running on agents. The hard part, of course, is deciding how deep the agents should reach inside a team, so this was one of the biggest questions I asked in my interviews. Across the 23 YC teams I interviewed, a consistent pattern emerged: founders use AI “employees” to write code, triage ops, enrich leads, monitor models, and even draft strategy—while keeping humans squarely in the loop for what truly matters. The definition of busy work keeps shifting. Interestingly, the most efficient users of AI employees draw crisp boundaries around where agents run free versus where human judgment stays close, much as they would when managing human employees. The nice thing about agents, however, is that they aren't seeking more enriching projects, better experience on their resume, or any "human" needs (as far as we know), so confining them to one task works much better than it does with traditional employees. It's also worth noting that the boundary where agents can operate seems to move each week as humans build trust and the technology improves.
One thing that surprised me in these interviews is how quickly founders stopped talking about AI as a tool and started talking about it more like staff. Not in a hypey “replace everyone” sense, but in a very practical sense: this agent handles that job, this one owns this workflow, this one runs overnight.
The teams getting the most leverage here are not just prompting better. They’re designing internal systems. They give agents a narrow mandate, the right context, and access to the tools they actually need, then they let them run in sequence the same way you’d structure work across people on a team. Some are using agents to validate leads, research accounts, and draft outreach on dedicated machines. Others are pushing the same model into research, with AI systems that review papers, identify gaps, and kick off experiments. And inside engineering, recurring tasks are being turned into reusable skills instead of re-solving them from scratch every time. What stood out to me is that these are not one-off demos. They’re stable internal roles. Once founders see a task repeated enough times, the instinct is increasingly: why is a human still doing this manually?
“Our agents use internal endpoints to scrape docs and onboard new APIs. What used to take a day—or a couple of days—now takes an hour or even minutes. We’ve already converted that into a reusable skill.”
Founders who get leverage from AI aren’t outsourcing thinking—they’re redesigning their organizations around continuous, supervised automation. The common moves: draw bright lines between core and non-core work, treat agents like hires with living documentation, wire always-on loops, and pick tools that fit your workflow rather than chasing hype. The outcome is not fewer decisions; it’s better ones, made with more surface area covered by agents and more time reserved for the judgment calls only founders can make. Do that, and AI stops being a tool—and starts becoming your operating system.
The shift here is subtle but important. The best founders are not asking where AI can assist them in random moments. They’re asking which parts of the company can be operationalized into repeatable agent-owned work. That framing feels a lot more useful — and a lot more durable.
The next thing founders want is pretty obvious: they want the company to keep moving when they're not actively at their desk. A lot of the best setups I saw were really about rhythm — giving agents enough context and structure that useful work keeps happening overnight, between meetings, or whenever the team is offline.
This is where documentation starts to matter a lot more than most people expect. The founders getting good results are usually not relying on raw model intelligence alone; they’re building a persistent layer of context around it. Claude.md files, architecture docs, product notes, checklists, examples of good decisions and bad ones — all of that becomes part of the operating system. Then they split work across sub-agents with narrower responsibilities and cleaner handoffs. Some teams want to wake up to a queue of PRs. Others have specialized agents for different parts of the stack. Some run heartbeat workflows throughout the night. And many have the simple but important habit of turning every repeated mistake into an instruction the system remembers next time. That pattern came up a lot in interviews: if an agent fails in a predictable way, the answer is not to complain about AI. The answer is to update the training ground it operates inside.
“We try to max Claude Code as much as possible. I want a backlog the agent can run in a loop overnight so we wake up and review five PRs every morning. We’re not there yet, but that’s the goal.”
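The overnight-backlog idea can be sketched as a small loop. This is a minimal illustration, not any team's actual setup: the backlog format, the `run_agent` interface, and the one-PR-per-task model are all assumptions, and the agent invocation is stubbed out as a callable (in practice it would shell out to a coding agent in headless mode).

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class TaskResult:
    task: str
    ok: bool
    notes: str = ""

def run_backlog(tasks: List[str],
                run_agent: Callable[[str], TaskResult],
                max_tasks: int = 5) -> List[TaskResult]:
    """Run up to max_tasks backlog items through an agent, one at a time.

    Results are collected for human review in the morning; a failure stops
    the loop rather than letting the agent compound mistakes unattended.
    """
    results = []
    for task in tasks[:max_tasks]:
        result = run_agent(task)
        results.append(result)
        if not result.ok:
            break  # leave the rest of the backlog for a human to triage
    return results
```

The cap and the stop-on-failure rule are what make the loop reviewable: you wake up to a bounded queue of results instead of an unbounded chain of agent decisions.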
“What I love with Claude is sub‑agents. We’ve programmatically trained them on our product, security, compliance, auth—so anyone can use Claude and start shipping. Think of them as a team that understands the product better than we do.”
“We keep a shared Claude.md—everything we’re working on, plus an architecture file. Before work, we know exactly what to do. It’s like having specialized engineers we continuously train and update.”
“Create a Claude.md with instructions sent on every request—design patterns, anti‑patterns, company context. When the AI makes a mistake, add it so it doesn’t happen again. It’s like onboarding an intern.”
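The "add the mistake so it doesn't happen again" habit is easy to mechanize. A minimal sketch, assuming a CLAUDE.md-style context file with a dedicated lessons section; the file layout and section name here are my assumptions, not a convention from the interviews:

```python
from pathlib import Path

LESSONS_HEADER = "## Lessons learned"  # assumed section name

def record_lesson(context_file: Path, lesson: str) -> bool:
    """Add a one-line lesson under the lessons section of a context file.

    Returns False if the lesson is already recorded, so repeated failures
    don't bloat the file with duplicates.
    """
    text = context_file.read_text() if context_file.exists() else ""
    lines = text.splitlines()
    entry = f"- {lesson.strip()}"
    if entry in lines:
        return False
    if LESSONS_HEADER not in lines:
        lines += ([""] if lines else []) + [LESSONS_HEADER]
    idx = lines.index(LESSONS_HEADER) + 1
    while idx < len(lines) and lines[idx].startswith("- "):
        idx += 1  # keep existing lessons in order; append after them
    lines.insert(idx, entry)
    context_file.write_text("\n".join(lines) + "\n")
    return True
```

The deduplication is the point: the file stays a short, readable set of standing instructions rather than a log of every failure.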
What this changes is the cadence of the company. Instead of AI being something you occasionally reach for, it becomes part of the background process of shipping. The teams that seem furthest along are the ones that have made context, feedback, and overnight execution feel routine.
Probably the strongest pattern across all these conversations was not blind optimism. It was restraint. The founders using agents most aggressively usually had the clearest sense of where they did not want them operating unchecked. But the more interesting nuance is that this boundary is not fixed. It moves as agents get better, as workflows become more observable, and as founders build trust through repetition. The line is still real—it just isn’t permanent.
That boundary looked pretty consistent in the present tense. Agents are great for scaffolding, research, front-end work, repetitive implementation, and getting a first pass done quickly. But when something is core to the product, difficult to debug, or risky enough that failure would be painful in front of a customer, founders tend to pull humans back in close. One especially useful version of the current rule is simple: keep AI away from the live production pipeline, use it more freely on the edges. Another common compromise is to make the agent think before it builds. First generate the plan, answer the architecture questions, make sure a human actually understands what is about to happen—then let the model write code.

What I’d add now is that this is less a static philosophy than a moving frontier. Founders are not drawing one permanent line between “AI work” and “human work.” They’re running a continuous process: let agents handle a bounded task, observe where they fail, add guardrails, and then expand the surface area. Even within the same company, different people delegate to different degrees based on their own trust in the system. And across teams, there’s a clear expectation that as agents become more reliable, they’ll be trusted with more free rein. That feels like a much more useful framing than “use AI” or “don’t use AI.” The real question is not only where speed is worth the added surface area of mistakes today, but how to set up your organization so that boundary can safely move tomorrow.
“We use Cursor build mode with a plan doc—answer all the questions first: how are you parsing, chunking, retrieving, where does it live, is it secure? Once that’s clear, we say, ‘Okay, go build it.’ We audit the code, but not a lot. We still look at the code so when something breaks, we can immediately fix it.”
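The plan-first compromise amounts to a simple gate: the agent must produce a plan that answers the architecture questions, and a human signs off before any code generation runs. A sketch under assumed interfaces; the question list and the `approve` callback are illustrative, not part of any team's tooling:

```python
from typing import Callable, Dict, List

# Illustrative checklist, paraphrasing the questions from the quote above.
REQUIRED_QUESTIONS: List[str] = [
    "How are you parsing?",
    "How are you chunking and retrieving?",
    "Where does the data live?",
    "Is it secure?",
]

def gate_build(plan: Dict[str, str],
               approve: Callable[[Dict[str, str]], bool]) -> bool:
    """Allow the build step only if the plan answers every required
    question and a human explicitly approves it."""
    unanswered = [q for q in REQUIRED_QUESTIONS if not plan.get(q, "").strip()]
    if unanswered:
        raise ValueError(f"Plan incomplete, missing answers: {unanswered}")
    return approve(plan)  # human in the loop before any code is written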
“For our prod pipeline, it’s all handwritten. Any velocity gains would be eaten by risk of breakage, and that’s something I never want. For front end and utilities that aren’t critical to our live bidding pipeline, a lot of it is AI‑assisted.”
“I think different members of our team leverage AI to different degrees. I think it’s a matter of personal preference. For me, I still have trust issues and I need to see the code and I need to know exactly what’s going on.”
The founders getting the most out of agents are not the ones delegating the most. They’re the ones drawing the line clearly—and then revisiting it as the tools improve. In practice that means letting agents absorb more and more of the busy work over time, while humans stay closest to the decisions and systems they may have to defend when something breaks.