Knowledge Bases

A knowledge base is the set of markdown files the AI loads on every interaction in a workspace or project. Where a skill is conditional (the AI invokes it when the description matches), a knowledge base is always-on context; it's the layer that gives the AI a baseline understanding of your stack, conventions, and domain.

What's in a knowledge base?

Each workspace and project ships with a small, opinionated set of well-known files. The list below comes from workspace-knowledge.tsx and the defaults in services/api/src/context/defaults-core.ts:

File              What it carries
----------------  -------------------------------------
knowledge.md      Tech stack, conventions, domain terms
instructions.md   Rules for the AI to follow
identity.md       Brand voice and personality
soul.md           Core values and mission
memory.md         Persistent facts and preferences
user.md           User context and background
plan.md           Project roadmap and milestones

You can also add custom files: any filename that ends in .md and whose base name uses only lowercase letters, digits, dots, hyphens, and underscores (e.g. style-guide.md, api-docs.md).

How to create a knowledge base

Workspace-level

  1. Open Workspace Settings → Knowledge.
  2. The default files appear, each with a description. Click any file to open it in the editor.
  3. Type your content. Auto-save fires 2.5 seconds after you stop typing.
  4. Click + Add Knowledge File to create a new well-known or custom file.
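The auto-save in step 3 behaves like a debounce: every keystroke restarts a timer, and the save runs only once typing pauses. A minimal sketch of that pattern follows. Only the 2.5-second delay comes from this page; `createAutoSaver`, `Scheduler`, and the `save` callback are hypothetical names, not Doable's editor code.

```typescript
// Hypothetical sketch of debounced auto-save; not Doable's actual editor code.
// Only the 2.5 s delay is taken from the documentation.
const AUTO_SAVE_DELAY_MS = 2500;

type TimerHandle = unknown;

// Timer functions are injectable so the logic can be exercised without real waiting.
interface Scheduler {
  set: (fn: () => void, ms: number) => TimerHandle;
  clear: (handle: TimerHandle) => void;
}

const realScheduler: Scheduler = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

function createAutoSaver(
  save: (content: string) => void,
  delayMs: number = AUTO_SAVE_DELAY_MS,
  scheduler: Scheduler = realScheduler,
): (content: string) => void {
  let pending: TimerHandle;
  let hasPending = false;
  return (content: string) => {
    // A new edit cancels the previous timer, so only the latest content is saved.
    if (hasPending) scheduler.clear(pending);
    hasPending = true;
    pending = scheduler.set(() => {
      hasPending = false;
      save(content);
    }, delayMs);
  };
}
```

Calling the returned function on every editor change means a burst of keystrokes produces a single save carrying the final content.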

Project-level

  1. Open the project editor.
  2. In the sidebar, switch to the Knowledge tab (see apps/web/src/modules/editor/sidebar/knowledge-tab.tsx).
  3. Edit files in the same auto-saving editor, scoped to this project.

Behind the scenes, the workspace UI talks to /workspaces/:id/context/... and the project UI talks to /projects/:id/context/... (see services/api/src/routes/context.ts). All content is stored in PostgreSQL as plain markdown.

Ingest format

Today, Doable's knowledge bases are plain markdown only.

  • Filename rules: lowercase letters, digits, and the characters . - _ only; must end with .md; at most 64 characters. Enforced by the filenameSchema Zod schema in routes/context.ts.
  • Size cap: 50,000 characters per file (contentSchema).
  • Encoding: UTF-8.
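To check files locally before uploading, the rules above can be approximated without any dependencies. This is a sketch, not the real schemas: the exact definitions of filenameSchema and contentSchema are not shown on this page, so the regular expression, the function names, and the assumption that the 64-character limit includes the .md extension are all illustrative.

```typescript
// Dependency-free approximation of the documented rules.
// The authoritative checks are the Zod schemas in routes/context.ts; this regex is an assumption.
const MAX_FILENAME_LENGTH = 64; // assumed to include the ".md" extension
const MAX_CONTENT_CHARS = 50000;
const FILENAME_PATTERN = /^[a-z0-9._-]+\.md$/;

function isValidKnowledgeFilename(name: string): boolean {
  return name.length <= MAX_FILENAME_LENGTH && FILENAME_PATTERN.test(name);
}

function isValidKnowledgeContent(content: string): boolean {
  // String.length counts UTF-16 code units; whether the server counts the same way is an assumption.
  return content.length <= MAX_CONTENT_CHARS;
}
```

Under these assumptions, `style-guide.md` and `api-docs.md` pass, while `Style-Guide.md` (uppercase) and `notes.txt` (wrong extension) are rejected.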

There's no automatic conversion for PDF, DOCX, or HTML files; convert them to markdown yourself and paste in the result. If you have a long reference, link out to it from the markdown and the AI can fetch the link with its built-in fetch tool when needed.

Coming soon: vector retrieval

Doable's PostgreSQL setup includes the pgvector extension (see CLAUDE.md), but at the time of writing knowledge bases are injected directly into the prompt rather than retrieved by embedding similarity. Every file is concatenated into the system prompt at session start, with each file capped at 50,000 characters by contentSchema. For large bodies of reference text, split them into multiple smaller .md files and lean on the AI to call read_file when it needs depth. Embedding-based retrieval is on the roadmap; this page will be updated when it lands.
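One way to split a long reference is to cut it at heading boundaries so each piece stays under the per-file cap. The helper below is an illustrative sketch, not a Doable utility: `splitMarkdown` is a hypothetical name, and it treats any line starting with one to six `#` characters followed by a space as a heading, which also matches such lines inside fenced code blocks.

```typescript
// Hypothetical helper: split markdown into chunks of at most maxChars characters,
// preferring to break just before heading lines.
function splitMarkdown(text: string, maxChars: number = 50000): string[] {
  // Break before every heading line so a heading stays with the text under it.
  const sections = text.split(/\n(?=#{1,6} )/);
  const chunks: string[] = [];
  let current = "";
  for (const section of sections) {
    const candidate = current === "" ? section : current + "\n" + section;
    if (candidate.length <= maxChars) {
      current = candidate;
      continue;
    }
    if (current !== "") chunks.push(current);
    // A single oversized section is cut at the cap as a last resort.
    let rest = section;
    while (rest.length > maxChars) {
      chunks.push(rest.slice(0, maxChars));
      rest = rest.slice(maxChars);
    }
    current = rest;
  }
  if (current !== "") chunks.push(current);
  return chunks;
}
```

Each returned chunk can then be saved as its own custom file, for example api-docs-1.md and api-docs-2.md (the naming is up to you).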

Retrieval mechanism (today)

When a chat session starts, the context injector (services/api/src/context/injector.ts and its sibling in ai/context/injector.ts) reads the relevant files for the workspace/project and wraps them into sections of the system prompt:

[PROJECT IDENTITY]   # identity.md
[PROJECT SOUL]       # soul.md
[USER CONTEXT]       # user.md
[INSTRUCTIONS]       # instructions.md
[PROJECT KNOWLEDGE]  # knowledge.md
[MEMORY]             # memory.md
[PLAN]               # plan.md
[+ any custom files]

Empty files are skipped. The injector also honors the current AI mode (build, plan, etc.) and may emphasize different files. For example, plan mode foregrounds plan.md and knowledge.md.
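The wrapping step can be pictured with a small sketch. This is not the code in injector.ts: `buildContextSections` is a hypothetical function, the label used for custom files (their filename) is an assumption, and mode-specific emphasis is left out entirely. Only the section order, the section labels, and the skipping of empty files come from this page.

```typescript
// Illustrative sketch only; the real logic lives in the injector files named above.
const SECTION_LABELS: Array<[string, string]> = [
  ["identity.md", "PROJECT IDENTITY"],
  ["soul.md", "PROJECT SOUL"],
  ["user.md", "USER CONTEXT"],
  ["instructions.md", "INSTRUCTIONS"],
  ["knowledge.md", "PROJECT KNOWLEDGE"],
  ["memory.md", "MEMORY"],
  ["plan.md", "PLAN"],
];

function buildContextSections(files: Record<string, string>): string {
  const parts: string[] = [];
  const wellKnown = new Set(SECTION_LABELS.map(([name]) => name));
  // Well-known files first, in the documented order; empty files are skipped.
  for (const [name, label] of SECTION_LABELS) {
    const body = (files[name] ?? "").trim();
    if (body !== "") parts.push(`[${label}]\n${body}`);
  }
  // Custom files follow. Labeling them by filename is an assumption.
  const custom = Object.entries(files)
    .filter(([name]) => !wellKnown.has(name))
    .sort(([a], [b]) => a.localeCompare(b));
  for (const [name, content] of custom) {
    const body = content.trim();
    if (body !== "") parts.push(`[${name}]\n${body}`);
  }
  return parts.join("\n\n");
}
```

A file containing only whitespace produces no section at all, which is why an untouched default file costs nothing in the prompt.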

For the user, the practical implication is that anything in these files sits in the AI's context on every turn. Keep them tight. A single knowledge.md at the 50,000-character cap is fine; six files at that size are enough to crowd out the conversation itself.
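To see how much always-on context a knowledge base adds up to, a quick audit of file sizes helps. The sketch below is a hypothetical helper (`contextFootprint` is not part of Doable) that totals character counts and lists the largest files first, so it is obvious where to trim.

```typescript
// Hypothetical audit helper: measures how many characters each file contributes.
interface FileFootprint {
  name: string;
  chars: number;
}

function contextFootprint(
  files: Record<string, string>,
): { total: number; largestFirst: FileFootprint[] } {
  const largestFirst = Object.entries(files)
    .map(([name, content]) => ({ name, chars: content.length }))
    .sort((a, b) => b.chars - a.chars);
  const total = largestFirst.reduce((sum, f) => sum + f.chars, 0);
  return { total, largestFirst };
}
```

Six files at the 50,000-character cap would report a total of 300,000 characters, all of which is sent before the conversation itself.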

How skills and MCP servers reference a knowledge base

The most common pattern: a skill body references a knowledge file by name and tells the AI to read it for details.

---
name: "API design review"
description: "Use when reviewing a new API endpoint."
---

When reviewing a new endpoint:

1. Check it against the conventions in `knowledge.md` (auth, error shapes, naming).
2. Cross-reference `api-docs.md` for the canonical list of existing endpoints.
3. Flag any deviation in your review.

Because knowledge.md is already in the system prompt, the AI doesn't need to call read_file; it just consults the section. For custom files, it can call read_file if they're attached to the project.

MCP servers don't read knowledge files directly. If you want an MCP tool to act on knowledge, expose a tool that takes the knowledge as an argument and let the AI feed it in.
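As a rough illustration of that pattern, the sketch below describes a tool whose input includes the knowledge text itself. The tool name, its fields, and the handler are hypothetical; the only structure borrowed from MCP is that a tool is described by a name, a description, and a JSON Schema for its input. Wiring it into an actual MCP server is omitted.

```typescript
// Hypothetical tool definition; the name, fields, and behavior are illustrative.
const lookupKnowledgeTool = {
  name: "lookup_in_knowledge",
  description: "Return the lines of the supplied knowledge text that mention a term.",
  inputSchema: {
    type: "object",
    properties: {
      knowledge: {
        type: "string",
        description: "Contents of a knowledge file, passed in by the AI.",
      },
      term: { type: "string", description: "Term to search for." },
    },
    required: ["knowledge", "term"],
  },
};

// Handler for the tool above: a case-insensitive line filter over the supplied text.
function handleLookup(args: { knowledge: string; term: string }): string[] {
  const needle = args.term.toLowerCase();
  return args.knowledge
    .split("\n")
    .filter((line) => line.toLowerCase().includes(needle));
}
```

The server never touches the knowledge base. The AI, which already has knowledge.md in its system prompt, copies the relevant text into the `knowledge` argument when it calls the tool.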

Best practices

  • One canonical fact per place. Don't duplicate the tech stack in knowledge.md, instructions.md, and a custom file. The AI gets confused, and you get drift.
  • Markdown lists beat prose. Take "We use Hono. Tests live in __tests__. CI runs on push to main." Written as three bullet points, those facts are easier for the AI to retrieve than when buried in a paragraph.
  • Don't put secrets in memory.md. It's plain text in the database, readable by every member of the workspace. Use secrets storage instead.
  • Use plan.md as the single source of truth for the current roadmap. Update it as you ship. The AI will follow it.
  • identity.md and soul.md move the dial on tone. If the AI's writing voice is off, edit identity.md first.

Custom files vs skills

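Question                                                       Use a knowledge file   Use a skill
-------------------------------------------------------------  ---------------------  ----------------------------
Is this true always, in every conversation?                    Yes                    --
Should the AI consult it only for specific tasks?              --                     Yes
Is the content > 5 kB?                                         --                     Prefer skill (saves context)
Do I need to attach companion files (code samples, schemas)?   --                     Yes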

When in doubt: start with a knowledge file. Promote to a skill when the file becomes too big or too specific.