Making your company agent-readable

Knowledge engines and building traces

Jun 02, 2026

Meta made headlines a few weeks back by publicly acknowledging that it was using monitoring software (spyware, in effect) to track its employees’ mouse movements, clicks, keystrokes, and screens to train AI agents.

The “Model Capability Initiative,” rolled out amid significant heat, is their slightly dystopian attempt to build something specific that agents currently can’t see: the traces of how work actually happens.

Frontier models can already do a lot of knowledge work, and they keep getting better. But the knowledge they have to plug into inside a company is broken: scattered APIs, half-broken integrations, undocumented exceptions, knowledge stranded in chat threads and inboxes. That gap is the biggest blocker to AI generating real economic value today.

I wrote a few weeks ago about diffusion, coordination, and traces being the real drags on enterprise AI value.

Some traces currently live only in people’s heads, individually or collectively: the memory of precedents, exceptions, overrides, organizational understanding, and cross-system context that a group of people shares but that isn’t documented anywhere.

These traces are a hard problem, but probably the second half of a broader one.

The first half of the problem is how agents can access, curate, retrieve, and ingest the data that already lives in a company’s existing systems, consistently and at scale.

The problem is relevant enough that Y Combinator has made it a core theme of its next batch. It also explains why roughly 85% of an agent’s effort today goes into retrieving knowledge, not reasoning over it.

The brute-force tax

A typical mid-sized enterprise has its data scattered across many systems. Salesforce and HubSpot for customers. NetSuite for finance. Workday for people. Slack and Teams for chat. Outlook and Gmail for email. Notion and Confluence for knowledge. SharePoint and Google Drive for files. Jira for projects and tickets. Snowflake and Postgres for warehouses. Code repos. Local spreadsheets. It goes on and on.

Some of these expose APIs. Many don’t, or expose half-broken ones designed for humans rather than agents.

What an agent actually does today, given a task, is brute force its way through. It breaks the query apart, pulls chunks from wherever it can reach, tries to make sense of conflicting fragments, fails, retrieves more, retries.

The result is slow (turns take minutes), expensive (token cost compounds with every retrieval), unreliable (same question, different answer; cross-document references fail silently), and ungoverned (no audit trail back to source, nothing for compliance to inspect).
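To see why the token cost compounds, here is a minimal sketch of the retrieve-and-retry loop. The chunk size, prompt size, and round count are illustrative assumptions, not measurements. The only point is that a naive loop re-sends everything retrieved so far on each retry, so total tokens grow faster than the number of rounds.

```python
def brute_force_token_cost(rounds: int, chunk_tokens: int = 1_000,
                           chunks_per_round: int = 5,
                           prompt_tokens: int = 500) -> list[int]:
    """Cumulative input tokens billed after each retrieval round.

    Assumption: every round re-sends the prompt plus all chunks
    retrieved so far, which is how a naive agent loop behaves.
    """
    context = prompt_tokens
    billed = 0
    history = []
    for _ in range(rounds):
        context += chunks_per_round * chunk_tokens  # new chunks join the context
        billed += context                           # the whole context is re-read
        history.append(billed)
    return history


costs = brute_force_token_cost(rounds=4)
print(costs)  # [5500, 16000, 31500, 52000]
```

Under these assumptions, four rounds bill more than nine times what the first round did, not four times.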

A new layer in the AI stack

Andrej Karpathy frames the underlying mental model: he calls LLMs a new kind of operating system, with the context window playing the role of RAM. If that’s right, then most of the AI conversation today is happening at the application layer (which model, which prompt, which agent framework) while a substrate layer, making sense of data, is missing.

The reasoning layer that sits above the database, and that increasingly treats the database as infrastructure, is where a new generation of companies is being built.

That is a16z’s framing of the system of intelligence thesis. Pinecone calls the same layer the knowledge engine. YC calls it the company brain. Different names, same idea: a surface that sits between the operating system (the LLM) and the disk (whatever combination of vector DB, warehouse, and operational store holds the data), translating one into something the other can use.

Whatever the name, it has five jobs. They are the capabilities a knowledge engine needs for a company to be readable to agents at scale.

Accessibility

Tiered, programmatic, permission-aware access to every system that holds business data. Most enterprise systems were designed with the human as the central interactor and validator. Their APIs reflect that bias (when they exist at all).

An agent-readable layer inverts the assumption: programmatic access is the default, the human interface is the secondary surface.
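A minimal sketch of the permission-aware part, assuming a hypothetical `Connector` wrapper per source system and scope strings such as `crm:read`. None of these names come from a real product. The idea is that the access check lives in the layer itself, so the agent cannot skip it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Principal:
    """An agent identity plus the scopes it has been granted."""
    name: str
    scopes: frozenset[str]


@dataclass
class Connector:
    """Hypothetical wrapper around one source system."""
    system: str
    required_scope: str
    records: dict[str, dict] = field(default_factory=dict)

    def read(self, principal: Principal, record_id: str) -> dict:
        # The permission check happens inside the access layer, not in the agent.
        if self.required_scope not in principal.scopes:
            raise PermissionError(
                f"{principal.name} lacks scope {self.required_scope!r}")
        return self.records[record_id]


# Toy data, invented for illustration.
crm = Connector("crm", "crm:read", {"acct-1": {"name": "Acme", "tier": "gold"}})
finance = Connector("finance", "finance:read", {"inv-9": {"amount": 1200}})

sales_agent = Principal("sales-agent", frozenset({"crm:read"}))

print(crm.read(sales_agent, "acct-1"))
try:
    finance.read(sales_agent, "inv-9")
except PermissionError as exc:
    print("denied:", exc)
```

The same agent reads the CRM record and is refused the finance record, without any permission logic in the agent’s own code.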

Integration

A consistent, AI-readable, continuously updated representation of what the underlying systems contain. A live layer that re-indexes when source data shifts, normalizes formats across systems, and encodes the relationships between entities (customer to deal, deal to contract, contract to billing record, billing record to revenue line).

One-time ETL jobs into a vector DB go stale the moment a source schema changes. Indexing is the easy part.
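A rough sketch of what “live” means here, assuming the source systems can emit change events. `LiveIndex` and its methods are hypothetical names for illustration. The index re-indexes an entity only when its content hash changes, and it stores the links between entities so the customer-to-revenue chain can be walked.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class LiveIndex:
    """Hypothetical in-memory index keyed by entity id."""
    docs: dict[str, dict] = field(default_factory=dict)
    hashes: dict[str, str] = field(default_factory=dict)
    edges: dict[str, tuple[str, ...]] = field(default_factory=dict)
    reindexed: int = 0

    def on_change(self, entity_id: str, payload: dict,
                  links_to: tuple[str, ...] = ()) -> None:
        """Handle a change event from a source system."""
        fingerprint = repr((sorted(payload.items()), tuple(links_to)))
        digest = hashlib.sha256(fingerprint.encode()).hexdigest()
        if self.hashes.get(entity_id) == digest:
            return  # nothing changed, so skip the re-index
        self.docs[entity_id] = payload
        self.hashes[entity_id] = digest
        self.edges[entity_id] = tuple(links_to)
        self.reindexed += 1

    def chain(self, start: str) -> list[str]:
        """Follow the first link of each entity until the chain ends."""
        path, seen = [start], {start}
        while self.edges.get(path[-1]):
            nxt = self.edges[path[-1]][0]
            if nxt in seen:  # guard against cycles
                break
            path.append(nxt)
            seen.add(nxt)
        return path


index = LiveIndex()
index.on_change("customer:acme", {"name": "Acme"}, ("deal:42",))
index.on_change("deal:42", {"stage": "won"}, ("contract:7",))
index.on_change("contract:7", {"term_months": 12}, ("billing:3",))
index.on_change("billing:3", {"amount": 1200}, ("revenue:2026-Q2",))
index.on_change("deal:42", {"stage": "won"}, ("contract:7",))  # duplicate event

print(index.chain("customer:acme"))
print(index.reindexed)  # 4: the duplicate event did not trigger a re-index
```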

Curation

The knowledge layer hands the agent a typed artifact shaped for its task. The same 10-K filing should compile into a financial-metrics artifact for the market-intelligence agent and a risk-disclosure artifact for the compliance agent, not because a human pre-tagged it for both, but because the knowledge layer knows what each agent’s task is and constructs accordingly.

Curation moves the reasoning work from retrieval-time, where it is slow and expensive, to compile-time, where it is amortized across every query that follows. This is the most important piece of this layer.
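A minimal sketch of compile-time curation, assuming a toy parsed filing and two hand-written compilers. In a real knowledge layer the compilers would be derived from each agent’s task rather than written by hand; the names `ArtifactStore`, `compile_financial`, and `compile_risk` are hypothetical, and the filing values are invented.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class FinancialMetrics:
    source_id: str
    revenue_usd_m: float
    net_income_usd_m: float


@dataclass(frozen=True)
class RiskDisclosure:
    source_id: str
    risks: tuple[str, ...]


# Toy stand-in for a parsed 10-K; all values are invented for illustration.
FILING = {
    "id": "10-K/example-corp/2025",
    "financials": {"revenue_usd_m": 900.0, "net_income_usd_m": 75.0},
    "risk_factors": ["supplier concentration", "fx exposure"],
}


def compile_financial(doc: dict) -> FinancialMetrics:
    f = doc["financials"]
    return FinancialMetrics(doc["id"], f["revenue_usd_m"], f["net_income_usd_m"])


def compile_risk(doc: dict) -> RiskDisclosure:
    return RiskDisclosure(doc["id"], tuple(doc["risk_factors"]))


@dataclass
class ArtifactStore:
    """Compiles one artifact per (task, document) and reuses it afterwards."""
    compilers: dict[str, Callable[[dict], object]]
    cache: dict[tuple[str, str], object] = field(default_factory=dict)
    compile_calls: int = 0

    def artifact_for(self, task: str, doc: dict) -> object:
        key = (task, doc["id"])
        if key not in self.cache:
            self.cache[key] = self.compilers[task](doc)  # compile once
            self.compile_calls += 1
        return self.cache[key]                            # reuse on later queries


store = ArtifactStore({"market-intelligence": compile_financial,
                       "compliance": compile_risk})
metrics = store.artifact_for("market-intelligence", FILING)
risks = store.artifact_for("compliance", FILING)
again = store.artifact_for("market-intelligence", FILING)

print(metrics)
print(risks)
print(store.compile_calls)  # 2: the third request was served from the cache
```

The same filing yields two differently shaped artifacts, and the compile cost is paid once per task rather than once per query.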

Retrieval

A declarative query language between the agent and the knowledge substrate. Today, an agent asking for information has no vocabulary to specify what it actually wants: the answer rather than twenty chunks, with field-level citation, with confidence scores, under a 500-millisecond budget.

SQL gave that vocabulary to relational databases. GraphQL gave it to connected APIs. The agentic stack needs the equivalent. Pinecone’s KnowQL is one attempt at the spec, with six declared primitives (intent, filter, output shape, provenance, confidence, budget). Whether KnowQL becomes the standard or something else does, the category is real.
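To make the six primitives concrete, here is a hypothetical query object with one field per primitive, run against a toy fact list. This is not KnowQL syntax, which this post does not show; `KnowledgeQuery`, `run`, and the facts are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeQuery:
    intent: str                    # what the agent wants answered
    filter: dict                   # constraints on which facts qualify
    output_shape: tuple[str, ...]  # fields the answer must contain
    provenance: bool               # attach source citations?
    min_confidence: float          # drop facts scored below this
    budget_ms: int                 # declared latency budget


# Toy facts, invented for illustration.
FACTS = [
    {"customer": "Acme", "value": 120_000, "confidence": 0.97,
     "source": "finance:inv-9#amount"},
    {"customer": "Acme", "value": 95_000, "confidence": 0.41,
     "source": "chat:thread-88"},
    {"customer": "Globex", "value": 40_000, "confidence": 0.99,
     "source": "finance:inv-12#amount"},
]


def run(query: KnowledgeQuery, facts: list[dict]) -> list[dict]:
    """Toy executor. budget_ms is carried on the query but not enforced here."""
    rows = []
    for fact in facts:
        if any(fact.get(key) != want for key, want in query.filter.items()):
            continue
        if fact["confidence"] < query.min_confidence:
            continue
        row = {name: fact[name] for name in query.output_shape}
        if query.provenance:
            row["source"] = fact["source"]
        rows.append(row)
    return rows


query = KnowledgeQuery(
    intent="current contract value for Acme",
    filter={"customer": "Acme"},
    output_shape=("value", "confidence"),
    provenance=True,
    min_confidence=0.8,
    budget_ms=500,
)
answer = run(query, FACTS)
print(answer)
```

The agent states what it wants (one answer, cited, above a confidence floor) and gets back one row instead of a pile of chunks to sift through.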

Ingestion

Structured, cited, auditable handoff from knowledge engine into agent. The agent receives typed data with provenance attached, ready to act on, instead of a wall of text it has to re-parse and pretend to trust.

Citation by construction means a regulator can ask “why did your agent say that?” and the answer is mechanical, not narrative. In financial services, healthcare, and legal, this is the precondition for any deployment that touches a customer record.
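A small sketch of what “mechanical, not narrative” could look like, assuming hypothetical `Citation`, `CitedValue`, and `Handoff` types. Every value the agent receives carries a pointer to the system, record, and field it came from, so the explanation is a lookup rather than a story.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    system: str
    record_id: str
    field: str


@dataclass(frozen=True)
class CitedValue:
    name: str
    value: float
    citation: Citation


@dataclass(frozen=True)
class Handoff:
    """Typed payload handed from the knowledge engine to the agent."""
    question: str
    values: tuple[CitedValue, ...]

    def explain(self) -> list[str]:
        """Answer 'why did your agent say that?' by listing each source."""
        return [
            f"{v.name}={v.value} from "
            f"{v.citation.system}/{v.citation.record_id}#{v.citation.field}"
            for v in self.values
        ]


# Toy data, invented for illustration.
handoff = Handoff(
    question="What does Acme owe on its latest invoice?",
    values=(
        CitedValue("invoice_amount", 1200.0,
                   Citation("finance-system", "inv-9", "amount")),
    ),
)

for line in handoff.explain():
    print(line)  # invoice_amount=1200.0 from finance-system/inv-9#amount
```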

Six places this is being built

Naming the layer is one thing; the harder question is who is building it. Six emerging approaches, each at a different scale and with a different theory of who curates.

Proto. Garry Tan’s GBrain is the smallest viable version of the idea: a private repository large enough to be useful, structured enough to be ingested, private enough to feel like a brain. No retrieval pipeline, no embeddings, no graph. Just a corpus a long-context model can read end to end. It works at one-person scale because the curator and the consumer are the same human.

Second brain. The Obsidian-style approach extends the same instinct one level up, with cross-linked notes maintained by a single curator. It works for individuals and for startups small enough to share one cognitive frame. Once the curation cost distributes across thousands of people, the graph degrades into a folder hierarchy with hyperlinks. The instinct is right. The mechanism does not survive the transition from one curator to many.

Hyperscaler. Microsoft is building this in plain sight, with names that reveal the aspiration. Work IQ explicitly positions itself as “the intelligence layer that personalizes Microsoft 365 Copilot,” combining three sub-layers (data, context, skills/tools) on top of Microsoft Graph, with federated MCP connectors that read live data without indexing.

Google’s answer started as Agentspace and is now part of Gemini Enterprise. Both bet that customers already paying for the productivity suite will accept the intelligence layer as a natural extension. The bet assumes the customer is willing to consolidate, which most large enterprises may not want to do.

Operational incumbent. Palantir’s Foundry Ontology is the version of this layer that has been in production longest, in the customers who needed it earliest. Defense, pharma, banking, manufacturing, energy.

Palantir frames the layer as a decision engine, modeling the enterprise as four connected primitives: data, logic, action, and security. Agents navigate semantic objects, invoke forecasting and optimization models as governed tools, stage proposed writes back to source systems as scenarios, and inherit the policies that govern human users.

Two implications. The first: the layer worth building is wider than retrieval; logic and action belong in it too. The second: someone has been quietly shipping it for a while now.

Vendor. The most explicit attempt to make this a category is Pinecone Nexus, launched in May. Pinecone inverts how retrieval has been done since RAG became standard: instead of running reasoning at retrieval time, a Context Compiler (an autonomous coding agent) writes the curate() and query() code that turns raw documents into typed artifacts at build time. The agent talks to the engine through KnowQL.

Pinecone reports tokens-per-task collapsing from ~40,000 to ~2,000 and latency from minutes to under 500 milliseconds on its own benchmark (numbers are self-reported and worth discounting). The structural argument: move the work from runtime to build time, return cited typed answers.
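Taking the self-reported token figures at face value, the arithmetic works out as follows. This is just the ratio of the two numbers quoted above, not an independent measurement.

```python
before_tokens = 40_000  # approximate tokens per task before, as reported
after_tokens = 2_000    # approximate tokens per task after, as reported

reduction_factor = before_tokens / after_tokens
savings_pct = (before_tokens - after_tokens) * 100 / before_tokens

print(reduction_factor)  # 20.0
print(savings_pct)       # 95.0
```

That is a 20x reduction, or 95% fewer tokens per task, if the benchmark holds up outside Pinecone’s own setup.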

Custom. Some institutions refuse to wait for the knowledge layer to standardize.

McKinsey’s Lilli indexes 40+ curated knowledge sources, 100,000+ documents and interview transcripts, and a network of experts across 70 countries.

Morgan Stanley was OpenAI’s only strategic client in wealth management at the GPT-4 launch, building an internal LLM that returns answers “with links to the source documents” (citation by construction, two years before the category got a name).

Bloomberg trained BloombergGPT, a 50-billion-parameter model, on a 363-billion-token corpus of financial documents Bloomberg had been curating for forty years.

The pattern across all three: humans had already done the curation work over decades, and the AI layer harvests that prior investment. That is why custom builds work, and why enterprises that haven’t done the curation cannot replicate the result.

Closing the loop

The biggest blocker to AI generating real economic value in the enterprise lives downstream of the model, in a substrate that was built for humans and not yet for anything else.

Most companies have never been queried before by anything other than a human, and the result (incomplete APIs, scattered systems, undocumented exceptions, knowledge stranded in inboxes) is the broken knowledge layer an agent has to work with.

A company that becomes agent-readable is a company that has built (or bought, or rented) a layer with all five capabilities: accessibility across heterogeneous systems, continuously normalized integration, task-aware curation, declarative retrieval, and structured ingestion. Without that layer, every agent your company runs pays the brute-force tax on every query, forever.

This is the first half of the agent problem. Make your company readable first. The rest follows.

In our AI Labs at Visa Consulting, we can help your business become agent-readable and AI-ready. Get in touch if interested.
