Every experienced developer carries knowledge about their codebase that lives nowhere else. Not in the documentation. Not in the architecture diagrams. In the space between decisions — why the connection pooling works this way, why that migration strategy exists, what happened the last time someone tried the obvious approach.
This knowledge is institutional. It accumulates through incidents, refactors, and hard-won lessons. It’s the reason a senior engineer can glance at a pull request and say “this will break under concurrent writes” while a new hire sees nothing wrong. It’s the difference between code that works and code that works correctly in this specific system, with its specific history and constraints.
When that engineer leaves, the knowledge walks out with them. It doesn’t live in the README. It doesn’t live in the wiki last updated eighteen months ago. It lives in Slack threads nobody will search, in code review comments attached to long-merged PRs, in the oral tradition of a standup where someone once said “we tried that, here’s why it broke.”
This has always been expensive. It’s about to become something worse, because we’re handing the keyboard to agents that have none of it.
The permanent new hire
Every developer who has joined a new project knows the feeling. You clone the repository, read the README, open the code and start forming a mental model. For the first weeks you’re operating on general knowledge and pattern matching. You write code that is plausible but subtly wrong — not syntactically, not logically, but contextually. You use the ORM in a way that works but violates the team’s transaction boundaries. You add a test that passes but follows a pattern the team abandoned six months ago. You solve the problem correctly in isolation and incorrectly in this system.
Good teams have mechanisms for this. Code review catches the mistakes. A patient colleague explains the history. Over time you absorb the institutional knowledge and start making decisions that reflect the system’s actual constraints.
AI agents are permanent new developers. They join fresh every session. They have expansive general knowledge and zero institutional knowledge. They don’t know that your SQLite wrapper requires single-writer semantics because you hit a deadlock in production. They don’t know that a convenience constructor was removed in v0.17.0. They don’t know that migration tests must use BEGIN IMMEDIATE because two developers discovered the race condition independently, three months apart.
Without that knowledge, agents make the same mistakes every new developer makes. Faster, more confidently, and at greater scale. They produce code that looks right, passes the linter, and breaks in the specific ways that only someone who has been on the project would anticipate.
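One of those hard-won lessons is concrete enough to sketch. The snippet below is a hypothetical SQLite migration runner, not taken from any real project, showing why a team might mandate BEGIN IMMEDIATE: a plain deferred BEGIN takes the write lock only at the first write, so two migrators can both read the schema version and then race to apply the same migration, while BEGIN IMMEDIATE serializes them up front.

```python
import sqlite3

def migrate(db_path: str, target: int) -> bool:
    """Apply the migration only if the schema is still behind `target`."""
    conn = sqlite3.connect(db_path, timeout=30)
    try:
        conn.isolation_level = None  # autocommit: we manage transactions ourselves
        # BEGIN IMMEDIATE takes the write lock up front. A plain BEGIN defers
        # it, so two migrators could both read user_version and then race.
        conn.execute("BEGIN IMMEDIATE")
        (version,) = conn.execute("PRAGMA user_version").fetchone()
        if version >= target:
            conn.execute("ROLLBACK")  # another migrator got here first
            return False
        conn.execute("CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY)")
        conn.execute(f"PRAGMA user_version = {target}")  # transactional in SQLite
        conn.execute("COMMIT")
        return True
    finally:
        conn.close()

print(migrate("app.db", 1))  # first run applies the migration: True
print(migrate("app.db", 1))  # second run sees user_version >= 1: False
```

Nothing in the code itself explains why the immediate transaction matters. That reasoning is exactly the kind of knowledge that otherwise survives only in an incident postmortem or a long-merged review thread.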
The role shift
Something else is happening simultaneously. Developers aren’t just getting AI assistants. They’re becoming technical leads.
The ratio is shifting. Where a developer once wrote code directly, they increasingly describe intent and review output. They define what should happen and evaluate whether it did. They’re architects and code reviewers, the people who set direction and maintain quality, not the people who type the implementation.
This is the natural trajectory of a profession that has always moved toward higher abstraction. We stopped writing machine code. We stopped managing memory manually. We stopped writing boilerplate. Each transition freed developers to think at a higher level. Agents are the next transition.
But a tech lead is only as effective as their team’s understanding of the codebase.
If you’re leading developers who understand the system deeply, who know the history, the constraints, the failure modes, you can operate at the level of intent. You say “add rate limiting to the API” and trust the implementation will respect the existing middleware chain, use the right storage backend, follow the error-handling patterns the team has established.
If you’re leading developers who are seeing the codebase for the first time, every session, you’re not a tech lead. You’re a micromanager. Specifying implementation details that should be obvious to anyone who knows the system. Catching mistakes in review that should never have been made. Spending your time on the gap between what the agent knows and what it needs to know.
That gap is institutional knowledge. The question is where it lives.
What the embedded developer actually means
Instead of asking how we document the codebase for agents, ask: what if the codebase could teach agents how to work on it?
Not a README that describes the system from the outside. Not a flat instruction file compressing everything into a single context load. The accumulated expertise of everyone who has ever worked on the project, structured so the right knowledge surfaces at the right time, scoped to the right context, available to any agent that opens the repository.
This is what the embedded developer actually means. Not a person. Not an LLM. Everything necessary to make any capable model behave as a practitioner who knows this system, shares its values, and acts on behalf of the humans behind it. When you activate it, the agent doesn’t start from zero. It starts with the institutional knowledge that took years to accumulate, and it behaves accordingly.
An agent opens a project and queries for guidance scoped to the file it’s about to modify. It learns that this directory uses a specific testing pattern. It learns the database layer has concurrency constraints requiring a particular migration strategy. It learns that a dependency removed a convenience function two versions ago and here is the replacement. It learns what a senior developer would tell a junior developer sitting next to them — except the senior developer recorded it once and it’s available to every agent, every session, from that point forward.
This isn’t documentation. Documentation describes a system for human readers navigating at their own pace. The embedded developer is behavioral, operational knowledge, structured for machine consumption, scoped to specific contexts, and delivered on demand. It doesn’t describe what the system is. It shapes how the agent acts within it.
Three properties that make it work
The embedded developer has three properties that distinguish it from documentation, wikis, or flat instruction files.
The first is scoping. The knowledge isn’t global. A developer working in the database layer gets database-layer guidance. A developer working on the API gets API guidance. The project doesn’t dump everything into the context window and hope the agent determines what’s relevant. The right knowledge surfaces because the query includes the context — the file being edited, the topic being addressed, the package being modified. Scoping is what makes institutional knowledge scale. Without it, every new piece of guidance dilutes everything else.
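A minimal sketch of what path-scoped lookup could mean in practice. The data layout, glob convention, and guidance strings are all invented for illustration; this is not Skillex’s actual API.

```python
from fnmatch import fnmatch

# Hypothetical skill registry: each entry declares the paths it applies to.
# Note that fnmatch's "*" matches across "/" (unlike shell globs), which is
# convenient here: "db/*" covers the whole db/ subtree.
SKILLS = [
    {"scope": "db/*", "guidance": "Run migrations under BEGIN IMMEDIATE."},
    {"scope": "api/*", "guidance": "Wrap handlers in the shared error middleware."},
    {"scope": "*", "guidance": "Run the linter before committing."},
]

def skills_for(path: str) -> list[str]:
    """Return only the guidance whose scope matches the file being edited."""
    return [s["guidance"] for s in SKILLS if fnmatch(path, s["scope"])]

print(skills_for("db/migrations.py"))
```

An agent editing db/migrations.py receives the database guidance plus the project-wide rule, and nothing about the API layer. Each new skill added to another scope leaves this query’s results untouched.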
The second is co-location. The knowledge lives next to the code it describes. When someone changes the migration strategy, the skill describing it is right there, in the same directory, visible in the same pull request. A reviewer can say “you changed the concurrency model but didn’t update the skill” the same way they’d say “you changed the API but didn’t update the test.” The knowledge is part of the definition of done.
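The “definition of done” framing suggests a simple automated check a reviewer or CI job could run. This sketch is hypothetical — the SKILL.md filename and the check itself are invented — and a real version would also verify that the skill file actually exists in the repository.

```python
import os

# Flag a change set that touches code in a directory but not the skill
# file sitting beside it (filename "SKILL.md" is an assumption).
SKILL_FILE = "SKILL.md"

def stale_skills(changed_files: list[str]) -> list[str]:
    """Skill files that live beside changed code but were not updated."""
    changed = set(changed_files)
    flagged = set()
    for path in changed:
        if path.endswith(SKILL_FILE):
            continue  # the skill file itself was changed
        skill = os.path.join(os.path.dirname(path), SKILL_FILE)
        if skill not in changed:
            flagged.add(skill)
    return sorted(flagged)

print(stale_skills(["db/migrate.py"]))                 # ['db/SKILL.md']
print(stale_skills(["db/migrate.py", "db/SKILL.md"]))  # []
```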
The third is version control. The knowledge has history: git blame tells you who taught the project this lesson and when. You can see the moment someone added “this constructor was removed in v0.17.0” and trace it to the pull request where they spent an hour discovering that. The knowledge isn’t a static artifact. It’s a living record that moves with the codebase through branches, merges, and releases.
No external wiki, no shared knowledge base, no flat instruction file gives you all three. The embedded developer does because the knowledge isn’t about the code. It’s part of the code.
Skills that travel
There’s a fourth property that extends the embedded developer beyond a single repository: skills that travel with dependencies.
When your team installs a library, two things would ideally arrive. The code itself, and the library author’s operational expertise about how to use it correctly. Not reference documentation — the practical, opinionated guidance a maintainer would give you if they were pairing with you. “Don’t use connection pooling defaults with this driver.” “This constructor was removed in v0.17.0, construct the result struct directly.” “If you’re migrating from v2, here is the pattern that replaces the old middleware chain.”
This guidance currently lives in blog posts that go stale, Stack Overflow answers pinned to the wrong version, and GitHub issues that only surface if you know to search for them. It’s too operational for reference documentation and too library-specific for a project’s own skills. It falls through the gap.
If the library ships these skills alongside its code, they arrive when the library is installed and update when it’s upgraded. In a monorepo where different services depend on different versions of the same library, each service gets skills matched to its resolved version. The guidance is never copied. It’s never stale. It’s part of the dependency graph.
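As a sketch of what version-matched skills could look like in a monorepo, where each service’s resolved dependency version selects the guidance it receives. The library name, versions, service paths, and data shapes are all invented for illustration.

```python
# Hypothetical store of skills shipped by a dependency, keyed by version.
DEP_SKILLS = {
    ("somelib", "0.16.2"): ["Use the convenience constructor for results."],
    ("somelib", "0.17.0"): ["The convenience constructor was removed in v0.17.0; construct the result struct directly."],
}

# Two services in the same monorepo resolving different versions.
LOCKFILES = {
    "services/billing": {"somelib": "0.16.2"},
    "services/search": {"somelib": "0.17.0"},
}

def dep_skills(service: str) -> list[str]:
    """Collect skills for every dependency at the version the service resolved."""
    out: list[str] = []
    for dep, version in LOCKFILES[service].items():
        out.extend(DEP_SKILLS.get((dep, version), []))
    return out

print(dep_skills("services/billing"))
print(dep_skills("services/search"))
```

Each service sees only the guidance that matches the code it actually runs, and upgrading the dependency swaps the guidance in the same change.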
For library authors, this changes the relationship with consumers entirely. Today you write documentation and hope someone reads it. With skills, you write behavioral guidance and know it will be delivered directly into the agent’s context, scoped to the exact code that uses your library, at the exact version installed.
You’re not just shipping code. You’re shipping the developer who knows how to work with it.
The interrogation model
This changes how you interact with your codebase.
Today you open a project and start reading. You form a mental model, make assumptions, ask a colleague when you get stuck. Knowledge flows from people to you, mediated by conversation and shared context.
With the embedded developer, you interrogate the project directly. “What do you know about how testing works in this area?” is not a question directed at a person. It’s a query against the project’s accumulated behavioral knowledge, and the answer is grounded in the project’s actual practices rather than generic best practices from training data.
The project has opinions. It has learned from its mistakes. It knows which approaches have been tried and abandoned. It can articulate all of this to any agent, any developer, at any time, without depending on a specific person being available or remembering the details correctly.
This is what it means to operate as a technical lead rather than an implementer. You’re not micromanaging an agent that knows nothing about your system. You’re directing an agent that has internalized the same judgment your best developer carries — and that acts, at every step, as a representative of the humans behind it.
Building it
This is not a theoretical framework. It’s the direction Skillex is taking.
The skill files Skillex manages — scoped, structured, indexed into a queryable registry, served through MCP or CLI — are the substrate for the embedded developer. The scoping rules that determine which skills apply to which paths are what makes the knowledge contextual. The dependency scanning that discovers skills in installed packages is what makes it travel. The version control that tracks every change is what makes it trustworthy.
I’m building Skillex with Skillex. The repository carries its own skills — how to work on the database layer, how the query engine’s response contract works, what patterns the MCP server follows, how testing is structured. An agent that opens the repository queries those skills and immediately operates with the institutional knowledge of the project. It’s both the reference implementation and the proof of concept.
The skills will evolve. Teams will discover what knowledge is most valuable to encode. Library authors will learn what guidance their consumers’ agents need most. The patterns for writing effective skills will become clearer as more teams adopt the practice.
But the core idea is simple and durable: a codebase should carry its own expertise. Not as documentation for humans to browse, but as behavioral knowledge for agents to embody. Every lesson learned, every pattern established, every mistake avoided — captured in version control, scoped to where it matters, evolving with the code.
The embedded developer isn’t a person. It’s what the project knows about how to work on itself. And any capable model that queries it doesn’t just know your system. It acts like someone who has always been on your team.